美文网首页
自动登陆并跳转到相应页面爬取数据,将数据保存为excel表格,并

自动登陆并跳转到相应页面爬取数据,将数据保存为excel表格,并

作者: 墨色尘埃 | 来源:发表于2020-05-20 11:21 被阅读0次

    现有一个功能,需要自动登陆到系统,登陆时有验证码验证,自动登陆后查询某个接口,将请求到数据保存到表格2中,并将表格2以附件的形式批量发送邮件,收件人信息从目录下的表格1中读取。

    这里有几个技术关键点:
    ①读取目录下的表格1
    ②多次识别验证码,直到正确为止
    ③将数据保存到表格2
    ④批量发送带表格2附件的邮件

    关于登陆系统就不说了,这里主要探讨以上提出的四个问题

    ①读取目录下的表格1

    
                List<Map<String, String>> list = new ArrayList<>();
                //当前用户桌面
                File desktopDir = FileSystemView.getFileSystemView().getHomeDirectory();
                String desktopPath = desktopDir.getAbsolutePath();
                //表格路径集合
                List<String> filePathList = new ArrayList<>();
                //表格路径
                filePathList.add(claimantsExcel1);
                filePathList.add(claimantsExcel2);
    
                int count = 0;
                for (String filePath : filePathList) {
                    File file = new File(filePath);
                    if (!file.exists() || file.isDirectory()) {
                        count++;
                        continue;
                    } else {
                        Workbook workBook = null;
                        InputStream fis = new FileInputStream(file);
                        if (filePath.endsWith("xls")) {
                            workBook = new HSSFWorkbook(fis);
                        } else if (filePath.endsWith("xlsx")) {
                            workBook = new XSSFWorkbook(fis);
                        } else {
    
                        }
                        Sheet sheet = null;
                        Row row = null;
                        String cellData = null;
    
                        if (workBook != null) {
                            //获取第一个sheet
                            sheet = workBook.getSheetAt(0);
                            //获取最大行数
                            int rownum = sheet.getPhysicalNumberOfRows();
                            //获取第一行
                            row = sheet.getRow(0);
                            //获取最大列数
                            int colnum = row.getLastCellNum();
    //                    int colnum = row.getPhysicalNumberOfCells();
    
                            String columns1[] = new String[colnum];
    
                            for (int i = 1; i < rownum; i++) {
                                Map<String, String> map = new LinkedHashMap<>();
                                row = sheet.getRow(i);
                                if (row != null) {
                                    for (int j = 0; j < colnum; j++) {
                                        cellData = ExcelUtil.getStringCellValue(row.getCell(j));
                                        columns1[j] = "a" + (j + 1);
                                        if (!org.springframework.util.StringUtils.isEmpty(cellData)) {
                                            map.put(columns1[j], cellData);
                                        }
                                    }
                                } else {
                                    break;
                                }
                                list.add(map);
                            }
                        }
                        break;
                    }
                }
    
                if (count == filePathList.size()) {
                    logger.info("认领员名单表格不存在");
                    throw new BusinessException("认领员名单表格不存在");
                }
    

    ②多次识别验证码,直到正确为止

    这个方法使用了while循环和递归,方法中嵌套自己的方法。其中while循环以自定义变量为标识,满足条件退出。

        /**
         * 验证码,经过百度文字识别api之后,识别出图片中的文字
         * 递归调用,貌似有问题,返回多次数据
         *
         * @param userId
         * @return
         * @throws IOException
         */
        public String checkHandleCode(String userId) throws Exception {
    
            JSONArray wordsResult = checkWordsResult(userId);
            boolean jsonArrayEmptyFlag = true;
            while (jsonArrayEmptyFlag) {
                if (wordsResult.size() == 0 || org.springframework.util.StringUtils.isEmpty(wordsResult)) {
                    wordsResult = null;
                    jsonArrayEmptyFlag = false;
                    wordsResult = checkWordsResult(userId);
                } else {
                    jsonArrayEmptyFlag = false;
                }
            }
    
            com.alibaba.fastjson.JSONObject o = (com.alibaba.fastjson.JSONObject) wordsResult.get(0);
            String words = (String) o.get("words");
    
            //取出百度文字识别,未经处理的文字,判断是否为空
            //为空,获取下一张验证码继续识别
            //不为空,跳出循环,执行下一步
    
    //        // 模拟出错状态
    //        String[] randomArray = {words, null, ""};
    //        int i = (int) (Math.random() * randomArray.length);
    //        words = randomArray[i];
    
            boolean emptyFlag = true;
            while (emptyFlag) {
                if (org.springframework.util.StringUtils.isEmpty(words)) {
                    words = null;
                    emptyFlag = false;
                    words = checkHandleCode(userId);
                } else {
                    emptyFlag = false;
                    break;
                }
            }
    
            //最原始的未经处理的文字,不为空之后,去除空格后的新String判断是否为空
            //为空,获取下一张验证码继续识别
            //不为空,跳出循环,执行下一步
            words = words.replace(" ", "");
            boolean replaceFlag = true;
            while (replaceFlag) {
                if (org.springframework.util.StringUtils.isEmpty(words)) {
                    words = null;
                    replaceFlag = false;
                    words = checkHandleCode(userId);
                } else {
                    replaceFlag = false;
                    break;
                }
            }
    
            //判断新String是否是4位
            //不等于4,获取下一张验证码继续识别
            //等于4,跳出循环,执行下一步
            boolean fourCharFlag = true;
            while (fourCharFlag) {
                if (words.length() != 4) {
                    //递归调用中,如果当前验证码不符合四位,又会调用自己的方法
                    //但是上一个自己的方法还没有结束,变量生命周期没有结束(见引申),所以必须对该变量修改,恢复原值
                    //引申:虚拟机栈描述的是JAVA方法执行的内存模型:每个方法在执行的同时都会创建一个栈帧(STACKFRAME
                    // )用于存储局部变量表、操作数栈、动态链接、方法出口等信息。每个方法从调用直至执行完成的过程,就对应着一个栈帧在虚拟机栈中入栈和出栈的过程。
                    words = null;
                    fourCharFlag = false;
                    words = checkHandleCode(userId);
                } else {
                    fourCharFlag = false;
                    break;
                }
            }
            return words;
        }
    

    ③将数据保存到表格2

                //预收账款查询
                DepositReceived deposit = getDepositReceived(fmCookie, driver);
                //预收账款列表,要保存到表格2的数据
                List<DepositDataReduce> depositDataList = deposit.getData();
    
                //文件名
                String originalName = "预收账款-" + DateUtil.getShortStringDate();
                //sheet名
                String sheetName = "报名情况sheet";
    
                String[] titleList = {"单据号", "行号", "客户编号", "客户名称", "项目编号", "项目名称", "部门", "申请日期",
                        "申请人", "过账日期", "凭证编号", "专业类型", "金额", "已核销金额", "可核销金额", "摘要"};
    
                String[] titleEgList = {"busRecordNo", "busLineNumber", "customerId", "customerName", "projId", "projName",
                        "budOrgName", "busAccountDate", "createName", "busAccountBudat", "accountCode", "budTypeName",
                        "budAmount", "payAmount", "noPayAmount", "budRemark"};
    
    
                HSSFWorkbook workbook = ExcelUtil.createExcel(sheetName, titleList, titleEgList, depositDataList);
    
    
                //当前用户桌面
                File desktopDir = FileSystemView.getFileSystemView().getHomeDirectory();
                String desktopPath = desktopDir.getAbsolutePath();
                //删除文件
                String fileName = cliamPath + "\\" + originalName + ".xls";
                boolean b = FileUtils.deleteFile(fileName);
    
                //输出Excel文件
                File toFile = new File(fileName);//自定义输出流
                OutputStream output = new FileOutputStream(toFile);
                workbook.write(output);
                output.flush();
                output.close();
    

    其中ExcelUtil.createExcel方法

        /**
         * 生成Excel表格
         *
         * @param sheetName sheet名称
         * @param titleList 表头列表
         * @param dataList  数据列表
         * @return HSSFWorkbook对象
         */
        public static HSSFWorkbook createExcel(String sheetName, String[] titleList, String[] titleEgList, List dataList) throws
                IllegalAccessException {
    
            //创建HSSFWorkbook对象(excel的文档对象)
            HSSFWorkbook wb = new HSSFWorkbook();
            //创建sheet对象(excel的表单)
            HSSFSheet sheet = wb.createSheet(sheetName);
            //在sheet里创建第一行,这里即是表头
            HSSFRow rowTitle = sheet.createRow(0);
    
    
            //写入表头的每一个列
            for (int i = 0; i < titleList.length; i++) {
                //创建单元格
                rowTitle.createCell(i).setCellValue(titleList[i]);
            }
    
    
            // TODO: 2020/4/13
            String[] titleEgList1 = {"busRecordNo", "busLineNumber", "customerId", "customerName", "projId", "projName",
                    "budOrgName", "busAccountDate", "createName", "busAccountBudat", "accountCode", "budTypeName",
                    "budAmount", "payAmount", "noPayAmount", "budRemark"};
    
            int count = 0;
            //写入每一行的记录
            if (dataList != null) {
                for (int i = 0; i < dataList.size(); i++) {
                    count++;
                    //创建新的一行,递增
                    HSSFRow rowData = sheet.createRow(i + 1);
    
                    //通过反射,获取POJO对象
                    Class cl = dataList.get(i).getClass();
                    //获取类的所有字段
                    Field[] fields = cl.getDeclaredFields();
                    for (int j = 0; j < titleEgList.length; j++) {
                        String egName = titleEgList[j];
                        //设置字段可见,否则会报错,禁止访问
                        fields[j].setAccessible(true);
                        //创建单元格
                        if (egName.equals("busAccountDate")) {  //日期类型
                            Date date = (Date) fields[j].get(dataList.get(i));
                            String stringDate = DateUtil.getStringDate(date);
                            rowData.createCell(j).setCellValue(stringDate);
                        } else if (egName.equals("budAmount") || egName.equals("payAmount") || egName.equals("noPayAmount")) {
                            rowData.createCell(j).setCellValue((Double) fields[j].get(dataList.get(i)));
                        } else {
                            rowData.createCell(j).setCellValue((String) fields[j].get(dataList.get(i)));
                        }
                    }
                }
            }
    
    
    
            return wb;
        }
    

    ④批量发送带表格2附件的邮件

                //https://blog.csdn.net/qq_41937685/article/details/82831450
                logger.info("开始发送邮件,请稍等...");
                String subject = "截止到" + DateUtil.getSlashStringDate() + "日未被认领的到款清单";
                String msg = "各位同事:\n" +
                        "        附件为截止到" + DateUtil.getSlashStringDate() + "日未被认领的到款清单,请各部门积极认领,谢谢。\n";
    
                try {
                    mailAcceUtil.sendMail(list, subject, msg, fileName);
                } catch (GeneralSecurityException e) {
                    e.printStackTrace();
                }
    
    
    @Component
    public class SendMailAcceUtil {
    
    
        @Autowired
        MailConfig mailConfig;
    
        /**
         * 发送带附件的邮件
         *
         * @param subject  邮件主题
         * @param msg      邮件内容
         * @param fileName 附件地址
         * @return
         * @throws GeneralSecurityException
         */
        public boolean sendMail(/*String receive, */List<Map<String, String>> list,
                                String subject, String msg, String fileName) throws GeneralSecurityException {
    
    
            if (list == null || list.size() == 0) {
                return false;
            }
    
            // 获取系统属性
            Properties properties = System.getProperties();
    
            properties.put("mail.transport.protocol", "smtp"); // 使用的协议(JavaMail规范要求)
            // 设置邮件服务器
            properties.put("mail.smtp.host", mailConfig.getMyEmailSMTPHost());
            properties.put("mail.smtp.auth", "true");// 需要请求认证
            properties.put("mail.smtp.port", "465");// SMTP服务器的端口
    //        properties.put("mail.debug", "true"); //打印发送详细信息
    
            MailSSLSocketFactory sf = new MailSSLSocketFactory();
            sf.setTrustAllHosts(true);
            properties.put("mail.smtp.ssl.enable", "true");
            properties.put("mail.smtp.ssl.socketFactory", sf);
    
            // 获取默认session对象
            Session session = Session.getDefaultInstance(properties, new Authenticator() {
                public PasswordAuthentication getPasswordAuthentication() { // qq邮箱服务器账户、第三方登录授权码
                    // 发件人邮件用户名、密码
                    return new PasswordAuthentication(mailConfig.getMyEmailAccount(), mailConfig.getMyEmailPassword());
                }
            });
    
            try {
                // 创建默认的 MimeMessage 对象
                MimeMessage message = new MimeMessage(session);
    
                // Set From: 头部头字段
                message.setFrom(new InternetAddress(mailConfig.getMyEmailAccount()));
    
                // Set To: 头部头字段
    //            message.addRecipient(Message.RecipientType.TO, new InternetAddress(receive));
                InternetAddress[] internetAddresses = new InternetAddress[list.size()];
                for (int i = 0; i < list.size(); i++) {
                    //取第三列邮箱
                    String a3 = list.get(i).get("a3").replace(" ", "").trim().toString();
                    if (!StringUtils.isEmpty(a3)) {
                        System.out.println("邮箱地址是:" + a3);
                        internetAddresses[i] = new InternetAddress(a3);
                    }
                }
                message.addRecipients(Message.RecipientType.TO, internetAddresses);
    
                // Set Subject: 主题文字
                message.setSubject(subject);
    
                // 创建消息部分
                BodyPart messageBodyPart = new javax.mail.internet.MimeBodyPart();
    
                // 消息
                messageBodyPart.setText(msg);
    
                // 创建多重消息
                Multipart multipart = new MimeMultipart();
    
                // 设置文本消息部分
                multipart.addBodyPart(messageBodyPart);
    
                // 附件部分
                messageBodyPart = new javax.mail.internet.MimeBodyPart();
                // 设置要发送附件的文件路径
                DataSource source = new FileDataSource(fileName);
                messageBodyPart.setDataHandler(new DataHandler(source));
    
                // messageBodyPart.setFileName(filename);
                // 处理附件名称中文(附带文件路径)乱码问题
                int indexOf = fileName.lastIndexOf("\\");
                String substring = fileName.substring(indexOf + 1, fileName.length());
                messageBodyPart.setFileName(MimeUtility.encodeText(substring));
                multipart.addBodyPart(messageBodyPart);
    
                // 发送完整消息
                message.setContent(multipart);
    
                // 发送消息
                Transport.send(message);
                // System.out.println("Sent message successfully....");
                return true;
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            } catch (AddressException e) {
                e.printStackTrace();
            } catch (javax.mail.MessagingException e) {
                e.printStackTrace();
            }
            return false;
    
    
        }
    }
    

    相关文章

      网友评论

          本文标题:自动登陆并跳转到相应页面爬取数据,将数据保存为excel表格,并

          本文链接:https://www.haomeiwen.com/subject/btwvohtx.html