
Elastic Job execution flow: source code analysis

Author: pcgreat | Published 2018-08-15 12:02

    AbstractElasticJobExecutor is the core class of Elastic Job's execution flow, and its execute() method is the entry point for running a job. The misfire compensation mechanism is also covered here.

        /**
         * Execute the job.
         */
        public final void execute() {
            try {
                // If the gap between the ZooKeeper server time and the local time exceeds maxTimeDiffSeconds, an exception is thrown; the default of -1 disables the check. What happens when it is disabled?
                jobFacade.checkJobExecutionEnvironment();
            } catch (final JobExecutionEnvironmentException cause) {
                jobExceptionHandler.handleException(jobName, cause);
            }
            // Returns the sharding contexts, covered in the sharding article: https://www.jianshu.com/p/44893e3c216d
            ShardingContexts shardingContexts = jobFacade.getShardingContexts();
            if (shardingContexts.isAllowSendJobEvent()) {
                // isAllowSendJobEvent() returns true by default, so a trace event is posted: job start
                jobFacade.postJobStatusTraceEvent(shardingContexts.getTaskId(), State.TASK_STAGING, String.format("Job '%s' execute begin.", jobName));
            }
            // If any assigned sharding item is still running, create sharding/{item}/misfire nodes with an empty value, post a trace event, and end the current trigger
            if (jobFacade.misfireIfRunning(shardingContexts.getShardingItemParameters().keySet())) {
                if (shardingContexts.isAllowSendJobEvent()) {
                    jobFacade.postJobStatusTraceEvent(shardingContexts.getTaskId(), State.TASK_FINISHED, String.format(
                            "Previous job '%s' - shardingItems '%s' is still running, misfired job will start after previous job completed.", jobName, 
                            shardingContexts.getShardingItemParameters().keySet()));
                }
                return;
            }
            try {
                // Counterpart of jobFacade.afterJobExecuted; analyzed together with doAfterJobExecutedAtLastCompleted
                jobFacade.beforeJobExecuted(shardingContexts);
                //CHECKSTYLE:OFF
            } catch (final Throwable cause) {
                //CHECKSTYLE:ON
                jobExceptionHandler.handleException(jobName, cause);
            }
            execute(shardingContexts, JobExecutionEvent.ExecutionSource.NORMAL_TRIGGER);
            // If there are misfired sharding items, clear sharding/{item}/misfire and call execute again for them
            while (jobFacade.isExecuteMisfired(shardingContexts.getShardingItemParameters().keySet())) {
                jobFacade.clearMisfire(shardingContexts.getShardingItemParameters().keySet());
                execute(shardingContexts, JobExecutionEvent.ExecutionSource.MISFIRE);
            }
            // Failover for failed sharding items; disabled by default and not analyzed here
            jobFacade.failoverIfNecessary();
            try {
                jobFacade.afterJobExecuted(shardingContexts);
                //CHECKSTYLE:OFF
            } catch (final Throwable cause) {
                //CHECKSTYLE:ON
                jobExceptionHandler.handleException(jobName, cause);
            }
        }
    
    

    jobFacade.checkJobExecutionEnvironment(): throws an exception if the gap between the ZooKeeper server time and the local time exceeds maxTimeDiffSeconds; the default of -1 disables the check (a conceptual sketch of this check follows the list).
    jobFacade.getShardingContexts(): returns the sharding contexts, covered in the sharding article.
    shardingContexts.isAllowSendJobEvent(): returns true by default, so a trace event is posted for job start.
    jobFacade.misfireIfRunning: if any assigned sharding item is still running, a sharding/{item}/misfire node with an empty value is created, a trace event is posted, and the current trigger ends.
    jobFacade.beforeJobExecuted: the counterpart of jobFacade.afterJobExecuted, analyzed together with doAfterJobExecutedAtLastCompleted.
    The private execute method invoked here runs the sharding items; afterwards, if there are misfired items, sharding/{item}/misfire is cleared and execute is called again for them.
    jobFacade.failoverIfNecessary(): handles failover for failed sharding items; disabled by default and not analyzed here.
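
    The intent behind checkJobExecutionEnvironment() can be captured in a small conceptual sketch. This is not the real Elastic Job implementation (the real method throws JobExecutionEnvironmentException and obtains the registry-center time from ZooKeeper); the class name and parameters below are made up for illustration.

        import java.util.concurrent.TimeUnit;

        /** Conceptual sketch of the maxTimeDiffSeconds check; not the actual Elastic Job code. */
        public final class TimeDiffCheckSketch {

            /**
             * @param registryCenterTimeMillis a timestamp obtained from the ZooKeeper server (hypothetical parameter)
             * @param maxTimeDiffSeconds       the configured threshold; -1 disables the check
             */
            public static void checkMaxTimeDiff(final long registryCenterTimeMillis, final int maxTimeDiffSeconds) {
                if (-1 == maxTimeDiffSeconds) {
                    // Default: the check is switched off, so clock drift between nodes goes unnoticed.
                    return;
                }
                long timeDiffMillis = Math.abs(System.currentTimeMillis() - registryCenterTimeMillis);
                if (timeDiffMillis > TimeUnit.SECONDS.toMillis(maxTimeDiffSeconds)) {
                    // The real code throws JobExecutionEnvironmentException here.
                    throw new IllegalStateException(String.format(
                            "Time diff %d ms exceeds allowed %d seconds.", timeDiffMillis, maxTimeDiffSeconds));
                }
            }
        }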

    The private execute method

     private void execute(final ShardingContexts shardingContexts, final JobExecutionEvent.ExecutionSource executionSource) {
            // If no sharding items are assigned to this instance
            if (shardingContexts.getShardingItemParameters().isEmpty()) {
                if (shardingContexts.isAllowSendJobEvent()) {
                    // No sharding items: report the task as finished
                    jobFacade.postJobStatusTraceEvent(shardingContexts.getTaskId(), State.TASK_FINISHED, String.format("Sharding item for job '%s' is empty.", jobName));
                }
                return;
            }
            jobFacade.registerJobBegin(shardingContexts);
            String taskId = shardingContexts.getTaskId();
            if (shardingContexts.isAllowSendJobEvent()) {
                jobFacade.postJobStatusTraceEvent(taskId, State.TASK_RUNNING, "");
            }
            try {
                process(shardingContexts, executionSource);
            } finally {
                // TODO Consider adding a job-failure state and how to close the loop on overall job failure
                jobFacade.registerJobCompleted(shardingContexts);
                if (itemErrorMessages.isEmpty()) {
                    if (shardingContexts.isAllowSendJobEvent()) {
                        jobFacade.postJobStatusTraceEvent(taskId, State.TASK_FINISHED, "");
                    }
                } else {
                    if (shardingContexts.isAllowSendJobEvent()) {
                        jobFacade.postJobStatusTraceEvent(taskId, State.TASK_ERROR, itemErrorMessages.toString());
                    }
                }
            }
        }
    

    If the sharding items are empty, a "no shard, task finished" trace event is posted and the method returns. Otherwise jobFacade.registerJobBegin(shardingContexts) is called; when monitorExecution is true it creates an ephemeral sharding/{item}/running node for each assigned item, and a TASK_RUNNING trace event is posted. A sketch of this node layout follows.
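
    The node layout described here (running and misfire children under sharding/{item}) can be illustrated with a small Curator-based helper. This is a hypothetical sketch, not Elastic Job's ExecutionService; the class name, the jobRootPath parameter, and the method names are made up.

        import org.apache.curator.framework.CuratorFramework;
        import org.apache.zookeeper.CreateMode;

        /** Hypothetical sketch of the sharding/{item}/running and sharding/{item}/misfire nodes. */
        public final class ShardingNodeSketch {

            private final CuratorFramework client;

            private final String jobRootPath;   // e.g. "/myDemoJob"; made-up parameter

            public ShardingNodeSketch(final CuratorFramework client, final String jobRootPath) {
                this.client = client;
                this.jobRootPath = jobRootPath;
            }

            /** What registerJobBegin does conceptually when monitorExecution is true: mark the item as running. */
            public void registerRunning(final int item) throws Exception {
                client.create().creatingParentsIfNeeded().withMode(CreateMode.EPHEMERAL)
                        .forPath(jobRootPath + "/sharding/" + item + "/running");
            }

            /** What misfireIfRunning does conceptually: if the item is still running, flag a misfire and report it. */
            public boolean misfireIfRunning(final int item) throws Exception {
                if (null == client.checkExists().forPath(jobRootPath + "/sharding/" + item + "/running")) {
                    return false;
                }
                client.create().creatingParentsIfNeeded().withMode(CreateMode.PERSISTENT)
                        .forPath(jobRootPath + "/sharding/" + item + "/misfire", new byte[0]);
                return true;
            }
        }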

    It then calls process. If there is only one sharding item, the other process overload runs on the current thread; if there is more than one, each item is submitted to the executor service (thread pool, see https://www.jianshu.com/p/0d0e7339c9b0) and the calling thread waits on a CountDownLatch until all items finish.

     private void process(final ShardingContexts shardingContexts, final JobExecutionEvent.ExecutionSource executionSource) {
            Collection<Integer> items = shardingContexts.getShardingItemParameters().keySet();
            if (1 == items.size()) {
                int item = shardingContexts.getShardingItemParameters().keySet().iterator().next();
                JobExecutionEvent jobExecutionEvent =  new JobExecutionEvent(shardingContexts.getTaskId(), jobName, executionSource, item);
                process(shardingContexts, item, jobExecutionEvent);
                return;
            }
            final CountDownLatch latch = new CountDownLatch(items.size());
            for (final int each : items) {
                final JobExecutionEvent jobExecutionEvent = new JobExecutionEvent(shardingContexts.getTaskId(), jobName, executionSource, each);
                if (executorService.isShutdown()) {
                    return;
                }
                executorService.submit(new Runnable() {
                    
                    @Override
                    public void run() {
                        try {
                            process(shardingContexts, each, jobExecutionEvent);
                        } finally {
                            latch.countDown();
                        }
                    }
                });
            }
            try {
                latch.await();
            } catch (final InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }
    

    The innermost process method

     private void process(final ShardingContexts shardingContexts, final int item, final JobExecutionEvent startEvent) {
            if (shardingContexts.isAllowSendJobEvent()) {
                jobFacade.postJobExecutionEvent(startEvent);
            }
            log.trace("Job '{}' executing, item is: '{}'.", jobName, item);
            JobExecutionEvent completeEvent;
            try {
                process(new ShardingContext(shardingContexts, item));
                completeEvent = startEvent.executionSuccess();
                log.trace("Job '{}' executed, item is: '{}'.", jobName, item);
                if (shardingContexts.isAllowSendJobEvent()) {
                    jobFacade.postJobExecutionEvent(completeEvent);
                }
                // CHECKSTYLE:OFF
            } catch (final Throwable cause) {
                // CHECKSTYLE:ON
                completeEvent = startEvent.executionFailure(cause);
                jobFacade.postJobExecutionEvent(completeEvent);
                itemErrorMessages.put(item, ExceptionUtil.transform(cause));
                jobExceptionHandler.handleException(jobName, cause);
            }
        }
    

    It posts the startEvent to the jobEventBus, calls the process method of your job implementation, and then posts a success completeEvent. If an exception is thrown, it posts an error completeEvent, records the item's error message, and hands the exception to the jobExceptionHandler. An example job implementation is sketched below.
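
    For a SIMPLE job type, that innermost process(ShardingContext) call ends up in the execute method of your SimpleJob implementation, roughly like this (package names assume Elastic-Job-Lite 2.x; the class name and body are made up):

        import com.dangdang.ddframe.job.api.ShardingContext;
        import com.dangdang.ddframe.job.api.simple.SimpleJob;

        /** Example user job: each invocation handles exactly one sharding item. */
        public class MyDemoJob implements SimpleJob {

            @Override
            public void execute(final ShardingContext shardingContext) {
                // getShardingItem()/getShardingParameter() identify the slice this call should process.
                System.out.println(String.format("item: %d, parameter: %s",
                        shardingContext.getShardingItem(), shardingContext.getShardingParameter()));
            }
        }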

    Now back to the compensation mechanism. Suppose the job is triggered every 1s but each run actually takes 3s: what happens? Judging from the source, misfire detection is based on the sharding/{item}/running node, so the second and third triggers collapse into a single misfire execution; in other words, one trigger is effectively lost. Keep this in mind. A configuration that reproduces the scenario is sketched below.
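
    A minimal configuration for that scenario might look like the following (assuming the Elastic-Job-Lite 2.x configuration API; registry-center setup and scheduler bootstrap are omitted, and MyDemoJob is the example class above):

        import com.dangdang.ddframe.job.config.JobCoreConfiguration;
        import com.dangdang.ddframe.job.config.simple.SimpleJobConfiguration;
        import com.dangdang.ddframe.job.lite.config.LiteJobConfiguration;

        public final class MisfireScenarioConfig {

            public static LiteJobConfiguration build() {
                // Fires every second, while MyDemoJob may take ~3s per run.
                JobCoreConfiguration coreConfig = JobCoreConfiguration
                        .newBuilder("myDemoJob", "0/1 * * * * ?", 2)
                        .misfire(true)              // enable the misfire compensation discussed above
                        .build();
                return LiteJobConfiguration
                        .newBuilder(new SimpleJobConfiguration(coreConfig, MyDemoJob.class.getCanonicalName()))
                        .monitorExecution(true)     // needed for the sharding/{item}/running nodes
                        .maxTimeDiffSeconds(-1)     // keep the time-diff check disabled (default)
                        .build();
            }
        }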

    Elastic Job configures Quartz with a SimpleThreadPool of a single thread and the misfire policy MISFIRE_INSTRUCTION_DO_NOTHING. With only one scheduler thread the Quartz misfire policy seems moot, and the execute method above would never detect a misfire by itself; yet misfires do occur. How? JobTriggerListener has the answer. A plain-Quartz sketch of this setup follows.
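
    The single-thread-plus-DO_NOTHING setup can be sketched with plain Quartz APIs (a simplified sketch, not Elastic Job's actual JobScheduler; the property keys are standard Quartz settings and the class/trigger names are made up):

        import java.util.Properties;

        import org.quartz.CronScheduleBuilder;
        import org.quartz.CronTrigger;
        import org.quartz.Scheduler;
        import org.quartz.TriggerBuilder;
        import org.quartz.impl.StdSchedulerFactory;
        import org.quartz.simpl.SimpleThreadPool;

        public final class SingleThreadSchedulerSketch {

            public static Scheduler build(final String cron) throws Exception {
                Properties props = new Properties();
                props.put("org.quartz.scheduler.instanceName", "myDemoJob_Scheduler");
                props.put("org.quartz.threadPool.class", SimpleThreadPool.class.getName());
                props.put("org.quartz.threadPool.threadCount", "1");    // single scheduler thread, as in Elastic Job
                Scheduler scheduler = new StdSchedulerFactory(props).getScheduler();

                // withMisfireHandlingInstructionDoNothing(): a misfired trigger simply waits for its next fire time.
                CronTrigger trigger = TriggerBuilder.newTrigger()
                        .withIdentity("myDemoJob_Trigger")
                        .withSchedule(CronScheduleBuilder.cronSchedule(cron).withMisfireHandlingInstructionDoNothing())
                        .build();
                // scheduler.scheduleJob(jobDetail, trigger) would follow in real code.
                return scheduler;
            }
        }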

    
    /**
     * Job trigger listener.
     * 
     * @author zhangliang
     */
    @RequiredArgsConstructor
    public final class JobTriggerListener extends TriggerListenerSupport {
        
        private final ExecutionService executionService;
        
        private final ShardingService shardingService;
        
        @Override
        public String getName() {
            return "JobTriggerListener";
        }
        
        @Override
        public void triggerMisfired(final Trigger trigger) {
            if (null != trigger.getPreviousFireTime()) {
                executionService.setMisfire(shardingService.getLocalShardingItems());
            }
        }
    }
    
    
    

    In JobTriggerListener, when the trigger's previousFireTime is not null, the misfire flag is set in ZooKeeper for this instance's local sharding items. The listener only takes effect once it is registered with the Quartz scheduler, as sketched below.
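
    A sketch of that registration (in the real code Elastic Job's JobScheduler wires this up; executionService and shardingService are just parameters here, and the import package names for these internals are assumed from Elastic-Job-Lite 2.x):

        import com.dangdang.ddframe.job.lite.internal.execution.ExecutionService;
        import com.dangdang.ddframe.job.lite.internal.sharding.ShardingService;
        import org.quartz.Scheduler;
        import org.quartz.SchedulerException;

        public final class TriggerListenerRegistration {

            // Quartz invokes triggerMisfired(...) on the listener whenever a trigger misfires,
            // which is how sharding/{item}/misfire gets flagged even with a single scheduler thread.
            public static void register(final Scheduler scheduler, final ExecutionService executionService,
                    final ShardingService shardingService) throws SchedulerException {
                scheduler.getListenerManager().addTriggerListener(new JobTriggerListener(executionService, shardingService));
            }
        }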
