Multithreading: A Source-Code Study of the Thread Pool Executor ThreadPoolExecutor


Author: 小天使999999 | Published: 2020-09-13 22:01

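    Concurrent task processing is unavoidable in day-to-day mobile development. As the mobile internet era has advanced, users have come to expect an ever smoother, higher-quality experience from their apps, to the point that Google forbids network requests on Android's UI thread.
    Fortunately, thanks to Android's strong ecosystem, developers have a set of asynchronous tools such as AsyncTask, Executors, and RxJava, and have become highly proficient with them.
    The good times did not last, though. In recent years Alibaba published its own Java development manual, and the whole industry followed suit: creating threads directly with Thread is discouraged, and even the four thread pool factory methods provided by Executors are restricted. Yes, this is the rule shown in the figure below: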


    Figure: Thread pool executors provided by the system

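    Why does Alibaba disallow building thread pools with the factory methods the system provides? The answer lies in how a thread pool executes tasks.
    Let us first walk through the thread pool's execution logic with a diagram. The four common factory methods in Executors all ultimately create their pools through ThreadPoolExecutor (directly or via a subclass), and tasks are run through its execute method, so we analyze ThreadPoolExecutor's execution logic directly.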
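    To make the delegation concrete, here is a minimal sketch. It checks that the pool returned by Executors.newFixedThreadPool is a ThreadPoolExecutor, and then builds what is, as far as I know, the equivalent pool by hand. The class name ExecutorsDelegationSketch and its method names are made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ExecutorsDelegationSketch {

    /** Returns true if the pool built by Executors.newFixedThreadPool is a ThreadPoolExecutor. */
    static boolean fixedPoolIsThreadPoolExecutor() {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            return pool instanceof ThreadPoolExecutor;
        } finally {
            pool.shutdown();
        }
    }

    /** Hand-built pool with the same shape as a fixed pool (assumption: mirrors the factory). */
    static ThreadPoolExecutor handBuiltFixedPool(int nThreads) {
        return new ThreadPoolExecutor(
                nThreads, nThreads,            // corePoolSize == maximumPoolSize
                0L, TimeUnit.MILLISECONDS,     // keepAliveTime
                new LinkedBlockingQueue<>());  // unbounded queue (capacity Integer.MAX_VALUE)
    }

    public static void main(String[] args) {
        System.out.println("fixed pool is a ThreadPoolExecutor: " + fixedPoolIsThreadPoolExecutor());
        ThreadPoolExecutor pool = handBuiltFixedPool(2);
        System.out.println("core=" + pool.getCorePoolSize() + ", max=" + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

    Writing the constructor call out by hand makes every parameter visible, including the unbounded queue that the factory method hides.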


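    Figure: Thread pool execution logic

    I. Execution strategy

    As the diagram shows, when execute is called the thread pool has three options: reject the task, run the task directly, or buffer the task in a queue. Each option depends on different conditions, so let us look at the source code to see what those conditions are.
    The official documentation on execute is quite detailed: the given task will be executed at some point in the future, either in a new thread or in an existing pooled thread. If the task cannot be accepted, because the pool has been shut down or its capacity has been reached, it is handed to the RejectedExecutionHandler, which applies the rejection policy. The given task is the Runnable parameter passed to execute: command.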

        /**
         * Executes the given task sometime in the future.  The task
         * may execute in a new thread or in an existing pooled thread.
         *
         * If the task cannot be submitted for execution, either because this
         * executor has been shutdown or because its capacity has been reached,
         * the task is handled by the current {@code RejectedExecutionHandler}.
         *
         * @param command the task to execute
         * @throws RejectedExecutionException at discretion of
         *         {@code RejectedExecutionHandler}, if the task
         *         cannot be accepted for execution
         * @throws NullPointerException if {@code command} is null
         */
        public void execute(Runnable command) {
            if (command == null)
                throw new NullPointerException();
            /*
             * Proceed in 3 steps:
             *
             * 1. If fewer than corePoolSize threads are running, try to
             * start a new thread with the given command as its first
             * task.  The call to addWorker atomically checks runState and
             * workerCount, and so prevents false alarms that would add
             * threads when it shouldn't, by returning false.
             *
             * 2. If a task can be successfully queued, then we still need
             * to double-check whether we should have added a thread
             * (because existing ones died since last checking) or that
             * the pool shut down since entry into this method. So we
             * recheck state and if necessary roll back the enqueuing if
             * stopped, or start a new thread if there are none.
             *
             * 3. If we cannot queue task, then we try to add a new
             * thread.  If it fails, we know we are shut down or saturated
             * and so reject the task.
             */
            int c = ctl.get();
            if (workerCountOf(c) < corePoolSize) { // Code 1
                if (addWorker(command, true))
                    return;
                c = ctl.get();
            }
            if (isRunning(c) && workQueue.offer(command)) { // Code 2
                int recheck = ctl.get();
                if (! isRunning(recheck) && remove(command)) // Code 2.1
                    reject(command);
                else if (workerCountOf(recheck) == 0) // Code 2.2
                    addWorker(null, false);
            }
            else if (!addWorker(command, false)) // Code 3
                reject(command);
        }
    

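    To make this easier to follow, the comment inside the method breaks the logic into three steps:
    step1: If fewer than corePoolSize [the core pool size] threads are running, the pool tries to start a new thread with command as its first task [more precisely, it constructs a Worker, which creates the thread]. This is done by calling addWorker, which atomically checks the run state and the worker count; when a thread should not be added, it returns false instead of adding one. For example, once the pool is in the STOP state, no new worker may be started.
    This step corresponds to the condition at Code 1.
    step2: If the task is successfully added to the blocking queue, the state must be checked a second time, for two reasons: the pool may no longer be in the RUNNING state (Code 2.1), or the existing threads may have died since the last check, so that a new thread should have been added (Code 2.2). Depending on the recheck, if the pool has stopped, the enqueue is rolled back by removing the task from the queue; if there are no worker threads left, a new worker is started so that it can take tasks from the blocking queue and keep work flowing.
    Code 2 corresponds to this step. The rollback at Code 2.1 removes the task and then applies the rejection policy, while Code 2.2 starts a new worker thread through addWorker.
    step3: If the task cannot be queued [the queue is full or the pool is no longer running], the pool tries to start a new worker thread through addWorker. If that fails, the pool is either shut down or saturated, so the task is rejected.
    The overall execution strategy is summarized in the flow chart below: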


    Figure: Thread pool execution strategy flow chart
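
    The three steps can be observed from the outside with a small experiment. The sketch below uses made-up settings (corePoolSize 1, maximumPoolSize 2, queue capacity 1) and four tasks that block on a latch, so each execute call takes a different branch. ExecuteBranchesSketch is a hypothetical class name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class ExecuteBranchesSketch {

    /** Submits four blocking tasks and records what the pool did with each one. */
    static List<String> run() {
        // Hypothetical sizing: 1 core thread, at most 2 threads, room for 1 queued task.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await();               // keep the worker busy until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<String> log = new ArrayList<>();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocking);
                    log.add("task" + i + ": threads=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    log.add("task" + i + ": rejected"); // default AbortPolicy throws
                }
            }
        } finally {
            release.countDown();               // let the blocked tasks finish
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

    Task 1 starts the core thread (step1), task 2 waits in the queue (step2), task 3 finds the queue full and starts a second thread (step3), and task 4 is rejected because the pool is saturated.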

    II. Task handling logic

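    The steps above keep mentioning addWorker; whether the rejection policy is applied depends on the boolean it returns. Its source comment says the following.
    Translation of the comment: (the method) checks whether a new worker [worker thread] can be added, given the current pool state and the given bound (core or maximum pool size). If so, the worker count is adjusted accordingly and, if possible, a new worker is created and started with firstTask as its first task. The method returns false if the pool is stopped or eligible to shut down. It also returns false if the thread factory fails to create a thread. If thread creation fails, either because the factory returns null or because of an exception (typically an OutOfMemoryError in Thread.start()), the method rolls back cleanly.
    The firstTask parameter is the command passed to execute. When the worker count is below the core pool size, or when the queue is full and the worker count is below the maximum pool size, addWorker creates a new worker with it as the first task, bypassing the queue.
    Initially idle threads are usually created via prestartCoreThread, or to replace other dying workers.
    firstTask can also be null. For example, if execute finds that the worker count is 0 after putting command into the blocking queue, it starts a worker with no first task, because a queued task needs at least one live worker thread to take it from the queue.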
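
    A worker without a first task can also be produced on purpose. The following sketch, with made-up pool settings and a hypothetical class name PrestartSketch, calls prestartCoreThread and compares the pool size before and after.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PrestartSketch {

    /** Returns {pool size before prestart, pool size after prestart}. */
    static int[] poolSizeBeforeAndAfterPrestart() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 30L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        try {
            int before = pool.getPoolSize();   // no worker exists until one is requested
            pool.prestartCoreThread();         // starts one idle core worker (no first task)
            int after = pool.getPoolSize();
            return new int[] {before, after};
        } finally {
            pool.shutdown();                   // the idle worker is interrupted and exits
        }
    }

    public static void main(String[] args) {
        int[] sizes = poolSizeBeforeAndAfterPrestart();
        System.out.println("before=" + sizes[0] + ", after=" + sizes[1]);
    }
}
```

    The prestarted worker has nothing to run yet, so it goes straight to waiting on the queue.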

        /**
         * Checks if a new worker can be added with respect to current
         * pool state and the given bound (either core or maximum). If so,
         * the worker count is adjusted accordingly, and, if possible, a
         * new worker is created and started, running firstTask as its
         * first task. This method returns false if the pool is stopped or
         * eligible to shut down. It also returns false if the thread
         * factory fails to create a thread when asked.  If the thread
         * creation fails, either due to the thread factory returning
         * null, or due to an exception (typically OutOfMemoryError in
         * Thread.start()), we roll back cleanly.
         *
         * @param firstTask the task the new thread should run first (or
         * null if none). Workers are created with an initial first task
         * (in method execute()) to bypass queuing when there are fewer
         * than corePoolSize threads (in which case we always start one),
         * or when the queue is full (in which case we must bypass queue).
         * Initially idle threads are usually created via
         * prestartCoreThread or to replace other dying workers.
         *
         * @param core if true use corePoolSize as bound, else
         * maximumPoolSize. (A boolean indicator is used here rather than a
         * value to ensure reads of fresh values after checking other pool
         * state).
         * @return true if successful
         */
        private boolean addWorker(Runnable firstTask, boolean core) {
            retry:  // label: "break retry" leaves both loops at once
            for (;;) {
                
                int c = ctl.get();
                int rs = runStateOf(c);
    
                // Check if queue empty only if necessary.
                if (rs >= SHUTDOWN &&   
                    ! (rs == SHUTDOWN &&
                       firstTask == null &&
                       ! workQueue.isEmpty()))
                    return false;
    
                for (;;) {
                    int wc = workerCountOf(c);
                    if (wc >= CAPACITY ||
                        wc >= (core ? corePoolSize : maximumPoolSize))
                        return false;
                    if (compareAndIncrementWorkerCount(c))
                        break retry; // leave both the inner and the outer loop
                    c = ctl.get();  // Re-read ctl
                    if (runStateOf(c) != rs)
                        continue retry; // abandon the inner loop and start the next outer iteration
                    // else CAS failed due to workerCount change; retry inner loop
                }
            }
    
            // Code 1
            boolean workerStarted = false;
            boolean workerAdded = false;
            Worker w = null;
            try {  // Code 2
                w = new Worker(firstTask);
                final Thread t = w.thread;
                if (t != null) {
                    // Code 3
                    final ReentrantLock mainLock = this.mainLock;
                    mainLock.lock();
                    try {
                        // Recheck while holding lock.
                        // Back out on ThreadFactory failure or if
                        // shut down before lock acquired.
                        int rs = runStateOf(ctl.get());
    
                        // Code 4
                        if (rs < SHUTDOWN ||
                            (rs == SHUTDOWN && firstTask == null)) {
                            if (t.isAlive()) // precheck that t is startable
                                throw new IllegalThreadStateException();
                            workers.add(w);
                            int s = workers.size();
                            if (s > largestPoolSize)
                                largestPoolSize = s;
                            workerAdded = true;
                        }
                    } finally {
                        mainLock.unlock();
                    }
                    if (workerAdded) { // Code 5
                        t.start(); 
                        workerStarted = true;
                    }
                }
            } finally {
                if (! workerStarted)
                    addWorkerFailed(w);
            }
            return workerStarted;
        }
    
    

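    Knowing what addWorker does, we can step through its implementation. The method has roughly two steps:
    step1: Two nested for loops check the state of the pool: whether it is RUNNING [negative], SHUTDOWN [0], or a later state [positive]. A new worker may be created only in the RUNNING state, or in the SHUTDOWN state when firstTask is null and the blocking queue is not empty [so that the tasks still queued can be drained].
    The inner for loop checks the bound on the worker count: a new worker may be added only if the worker count is below the core pool size (core is true) or below the maximum pool size (core is false, for example when the queue is full). compareAndIncrementWorkerCount then increments the worker count with a CAS.
    step2: Construct the Worker and start running the task.
    At Code 1, two local variables, workerStarted and workerAdded, record whether the worker was added and whether its thread was started;
    At Code 2, the Worker is constructed with the pending task firstTask. Its constructor is as follows: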

            /**
             * Creates with given first task and thread from ThreadFactory.
             * @param firstTask the first task (null if none)
             */
            Worker(Runnable firstTask) {
                setState(-1); // inhibit interrupts until runWorker
                this.firstTask = firstTask;
                this.thread = getThreadFactory().newThread(this);
            }
    

    So whether or not there is a first task, the thread factory is asked to create a new thread. Here is how this thread field is declared:

            /** Thread this worker is running in.  Null if factory fails. */
            final Thread thread;
    

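    The Worker runs in this thread. In other words, tasks are executed on this thread.
    Note: the argument this passed to newThread() shows that Worker is itself a Runnable.
    Next,
    at Code 3, the method takes a local reference to mainLock, a ReentrantLock, and holds it around the following try block, because an important operation happens there:
    workers.add(w);
    Here workers is a HashSet, declared as follows: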

    /**
    * Set containing all worker threads in pool. Accessed only when
    * holding mainLock.
    */
    // Android-added: @ReachabilitySensitive
    @ReachabilitySensitive
    private final HashSet<Worker> workers = new HashSet<>();

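    It holds all worker threads in the pool. Because several threads may access it at the same time, it has to be protected, so only a thread holding mainLock may modify it. Once a worker has been added and workerAdded is set to true, its thread can be started.
    At Code 5, t.start() starts the worker thread. t is the thread created by the ThreadFactory when the Worker was constructed at Code 2.
    Note: workerStarted records whether the thread actually started, because starting can fail. If it did not start, addWorkerFailed cleans up.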
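
    The locking pattern around workers can be shown in isolation. This is a minimal sketch of a plain HashSet guarded by a ReentrantLock with lock/try/finally, plus a high-water mark in the spirit of largestPoolSize. GuardedSetSketch and its methods are hypothetical and are not part of ThreadPoolExecutor.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical registry that mimics how the pool guards its workers set. */
final class GuardedSetSketch {
    private final ReentrantLock mainLock = new ReentrantLock();
    private final Set<Object> workers = new HashSet<>(); // accessed only while holding mainLock
    private int largestSize;                              // high-water mark, same lock

    void add(Object worker) {
        final ReentrantLock lock = this.mainLock;
        lock.lock();
        try {
            workers.add(worker);
            if (workers.size() > largestSize) {
                largestSize = workers.size();
            }
        } finally {
            lock.unlock();                                // released even if add() throws
        }
    }

    int largestSize() {
        mainLock.lock();
        try {
            return largestSize;
        } finally {
            mainLock.unlock();
        }
    }

    /** Adds threads * perThread distinct objects from several threads at once. */
    static int concurrentAdds(int threads, int perThread) {
        GuardedSetSketch registry = new GuardedSetSketch();
        Thread[] adders = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            adders[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    registry.add(new Object());           // a distinct object per call
                }
            });
            adders[i].start();
        }
        for (Thread adder : adders) {
            try {
                adder.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return registry.largestSize();
    }

    public static void main(String[] args) {
        System.out.println("largest size: " + concurrentAdds(4, 250));
    }
}
```

    Without the lock, concurrent add calls on a HashSet could lose entries or corrupt the set; with it, every add is counted.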
    The overall logic is fairly simple, as the figure below shows:


    Figure: addWorker execution logic
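
    The state checks in addWorker rely on ctl packing the run state and the worker count into a single int. The article does not show those definitions, so the sketch below is a re-creation from my reading of the JDK 8 source and should be treated as an assumption; CtlSketch is a hypothetical class name.

```java
/** Re-creation (assumed, not copied from the article) of how ctl packs two values into one int. */
final class CtlSketch {
    static final int COUNT_BITS = Integer.SIZE - 3;       // low 29 bits hold the worker count
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;  // mask for those 29 bits

    // Run states live in the high 3 bits, so they compare in lifecycle order.
    static final int RUNNING    = -1 << COUNT_BITS;       // negative
    static final int SHUTDOWN   =  0 << COUNT_BITS;       // zero
    static final int STOP       =  1 << COUNT_BITS;       // positive
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    static int runStateOf(int c)     { return c & ~CAPACITY; }
    static int workerCountOf(int c)  { return c & CAPACITY; }
    static int ctlOf(int rs, int wc) { return rs | wc; }
    static boolean isRunning(int c)  { return c < SHUTDOWN; }

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);
        System.out.println("running=" + isRunning(c) + ", workers=" + workerCountOf(c));
    }
}
```

    Because both values share one int, a single CAS can change the worker count only if the run state has not changed in the meantime, which is what the retry loop depends on.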

    III. Task execution logic

    Remember the Thread that was just started? The Worker instance was passed in when it was created:

    this.thread = getThreadFactory().newThread(this);

    Here is the declaration of Worker:

        /**
         * Class Worker mainly maintains interrupt control state for
         * threads running tasks, along with other minor bookkeeping.
         * This class opportunistically extends AbstractQueuedSynchronizer
         * to simplify acquiring and releasing a lock surrounding each
         * task execution.  This protects against interrupts that are
         * intended to wake up a worker thread waiting for a task from
         * instead interrupting a task being run.  We implement a simple
         * non-reentrant mutual exclusion lock rather than use
         * ReentrantLock because we do not want worker tasks to be able to
         * reacquire the lock when they invoke pool control methods like
         * setCorePoolSize.  Additionally, to suppress interrupts until
         * the thread actually starts running tasks, we initialize lock
         * state to a negative value, and clear it upon start (in
         * runWorker).
         */
        private final class Worker
            extends AbstractQueuedSynchronizer
            implements Runnable
    

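    In other words, Worker implements Runnable, so thread.start() ends up running Worker's run method. Note: the class mainly maintains the interrupt control state of threads that are running tasks. It also opportunistically extends AbstractQueuedSynchronizer, which simplifies acquiring and releasing the lock around each task execution.
    Here is Worker's run method: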

            /** Delegates main run loop to outer runWorker. */
            public void run() {
                runWorker(this);
            }
    

    It delegates the run loop to the outer runWorker method, which also comes with a fairly detailed comment:

        /**
         * Main worker run loop.  Repeatedly gets tasks from queue and
         * executes them, while coping with a number of issues:
         *
         * 1. We may start out with an initial task, in which case we
         * don't need to get the first one. Otherwise, as long as pool is
         * running, we get tasks from getTask. If it returns null then the
         * worker exits due to changed pool state or configuration
         * parameters.  Other exits result from exception throws in
         * external code, in which case completedAbruptly holds, which
         * usually leads processWorkerExit to replace this thread.
         *
         * 2. Before running any task, the lock is acquired to prevent
         * other pool interrupts while the task is executing, and then we
         * ensure that unless pool is stopping, this thread does not have
         * its interrupt set.
         *
         * 3. Each task run is preceded by a call to beforeExecute, which
         * might throw an exception, in which case we cause thread to die
         * (breaking loop with completedAbruptly true) without processing
         * the task.
         *
         * 4. Assuming beforeExecute completes normally, we run the task,
         * gathering any of its thrown exceptions to send to afterExecute.
         * We separately handle RuntimeException, Error (both of which the
         * specs guarantee that we trap) and arbitrary Throwables.
         * Because we cannot rethrow Throwables within Runnable.run, we
         * wrap them within Errors on the way out (to the thread's
         * UncaughtExceptionHandler).  Any thrown exception also
         * conservatively causes thread to die.
         *
         * 5. After task.run completes, we call afterExecute, which may
         * also throw an exception, which will also cause thread to
         * die. According to JLS Sec 14.20, this exception is the one that
         * will be in effect even if task.run throws.
         *
         * The net effect of the exception mechanics is that afterExecute
         * and the thread's UncaughtExceptionHandler have as accurate
         * information as we can provide about any problems encountered by
         * user code.
         *
         * @param w the worker
         */
        final void runWorker(Worker w) {
            Thread wt = Thread.currentThread();
            Runnable task = w.firstTask;
            w.firstTask = null;
            w.unlock(); // allow interrupts
            boolean completedAbruptly = true;
            try {
                while (task != null || (task = getTask()) != null) {
                    w.lock();
                    // If pool is stopping, ensure thread is interrupted;
                    // if not, ensure thread is not interrupted.  This
                    // requires a recheck in second case to deal with
                    // shutdownNow race while clearing interrupt
                    if ((runStateAtLeast(ctl.get(), STOP) ||
                         (Thread.interrupted() &&
                          runStateAtLeast(ctl.get(), STOP))) &&
                        !wt.isInterrupted())
                        wt.interrupt();
                    try {
                        beforeExecute(wt, task);
                        Throwable thrown = null;
                        try {
                            task.run(); 
                        } catch (RuntimeException x) {
                            thrown = x; throw x;
                        } catch (Error x) {
                            thrown = x; throw x;
                        } catch (Throwable x) {
                            thrown = x; throw new Error(x);
                        } finally {
                            afterExecute(task, thrown);
                        }
                    } finally {
                        task = null;
                        w.completedTasks++;
                        w.unlock();
                    }
                }
                completedAbruptly = false;
            } finally {
                processWorkerExit(w, completedAbruptly);
            }
        }
    

    For all its lengthy commentary, the method boils down to: get a task, run it with task.run, control interrupts, and run the hooks before and after execution.
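
    Stripped of locking, interrupt handling, and exception bookkeeping, the loop has a very small core. The sketch below is a heavily simplified, hypothetical MiniWorkerSketch, not the JDK code: it runs its first task, then drains a queue, and exits when its stand-in getTask returns null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical, heavily simplified worker loop; not the JDK implementation. */
final class MiniWorkerSketch implements Runnable {
    private final BlockingQueue<Runnable> queue;
    private final List<String> events;     // written only on the worker thread
    private Runnable firstTask;

    MiniWorkerSketch(Runnable firstTask, BlockingQueue<Runnable> queue, List<String> events) {
        this.firstTask = firstTask;
        this.queue = queue;
        this.events = events;
    }

    /** Stand-in for getTask(): returning null tells the worker to exit. */
    private Runnable getTask() {
        return queue.poll();               // non-blocking; the real getTask() blocks or waits with a timeout
    }

    @Override
    public void run() {
        Runnable task = firstTask;
        firstTask = null;
        while (task != null || (task = getTask()) != null) {
            events.add("before");          // where beforeExecute would run
            try {
                task.run();
            } finally {
                events.add("after");       // where afterExecute would run
                task = null;
            }
        }
        events.add("exit");                // where processWorkerExit would run
    }

    static List<String> demo() {
        List<String> events = new ArrayList<>();
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        queue.add(() -> events.add("run queued"));
        Thread t = new Thread(new MiniWorkerSketch(() -> events.add("run first"), queue, events));
        t.start();
        try {
            t.join();                      // join makes the worker's writes visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    The while condition is the same shape as in runWorker: use the first task if there is one, otherwise ask the queue.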

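    1. Getting a task

    1.1 firstTask is not null
    This is the case where the worker count was below the core pool size, or the queue was full and the worker count was below the maximum pool size. The worker runs firstTask directly.
    1.2 firstTask is null
    The worker has to take a waiting task from the blocking queue, which is done by getTask.
    The source is as follows: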

        /**
         * Performs blocking or timed wait for a task, depending on
         * current configuration settings, or returns null if this worker
         * must exit because of any of:
         * 1. There are more than maximumPoolSize workers (due to
         *    a call to setMaximumPoolSize).
         * 2. The pool is stopped.
         * 3. The pool is shutdown and the queue is empty.
         * 4. This worker timed out waiting for a task, and timed-out
         *    workers are subject to termination (that is,
         *    {@code allowCoreThreadTimeOut || workerCount > corePoolSize})
         *    both before and after the timed wait, and if the queue is
         *    non-empty, this worker is not the last thread in the pool.
         *
         * @return task, or null if the worker must exit, in which case
         *         workerCount is decremented
         */
        private Runnable getTask() {
            boolean timedOut = false; // Did the last poll() time out?
    
            for (;;) {
                int c = ctl.get();
                int rs = runStateOf(c);
    
                // Check if queue empty only if necessary.
                if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
                    decrementWorkerCount();
                    return null;
                }
    
                int wc = workerCountOf(c);
    
                // Are workers subject to culling?
                boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
    
                if ((wc > maximumPoolSize || (timed && timedOut))
                    && (wc > 1 || workQueue.isEmpty())) {
                    if (compareAndDecrementWorkerCount(c))
                        return null;
                    continue;
                }
    
                try {
                    Runnable r = timed ?
                        workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                        workQueue.take();
                    if (r != null)
                        return r;
                    timedOut = true;
                } catch (InterruptedException retry) {
                    timedOut = false;
                }
            }
        }
    

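    In an endless loop, getTask checks the pool state and the queue, and if nothing requires the worker to exit, it takes a waiting task from the queue. When it returns null instead, the worker exits.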
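
    The timed branch of getTask can be observed through the pool size. In this sketch (hypothetical class KeepAliveSketch, made-up 50 ms keep-alive) one pool enables allowCoreThreadTimeOut, so its idle worker should time out in poll and exit, while the other keeps its core worker blocked in take.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class KeepAliveSketch {

    private static ThreadPoolExecutor singleWorkerPool() {
        return new ThreadPoolExecutor(
                1, 1, 50L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    /** With allowCoreThreadTimeOut(true) the idle worker uses the timed poll and exits. */
    static boolean idleWorkerExitsWithTimeout() {
        ThreadPoolExecutor pool = singleWorkerPool();
        pool.allowCoreThreadTimeOut(true);
        try {
            pool.execute(() -> { });
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                Thread.sleep(10);          // poll until the worker has gone
            }
            return pool.getPoolSize() == 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdown();
        }
    }

    /** By default a core worker blocks in take() and stays alive while idle. */
    static boolean coreWorkerStaysByDefault() {
        ThreadPoolExecutor pool = singleWorkerPool();
        try {
            pool.execute(() -> { });
            Thread.sleep(200);             // four times the 50 ms keep-alive
            return pool.getPoolSize() == 1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("exits with timeout: " + idleWorkerExitsWithTimeout());
        System.out.println("stays by default:   " + coreWorkerStaysByDefault());
    }
}
```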
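    1.3 Interrupt control
    A worker thread's interrupt status must be consistent with the pool state: if the pool is stopping, the worker thread must be interrupted; otherwise, its interrupt flag must be cleared before the task runs.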
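
    From the caller's side, this shows up when shutdownNow is used. The sketch below (hypothetical class InterruptSketch) starts a long sleep inside a task, calls shutdownNow, and reports whether the task saw an InterruptedException.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class InterruptSketch {

    /** Returns true if a running task was interrupted by shutdownNow(). */
    static boolean runningTaskSeesInterrupt() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean(false);
        pool.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000);      // stands in for long-running work
            } catch (InterruptedException e) {
                interrupted.set(true);     // the worker thread was interrupted
            }
        });
        try {
            started.await();               // make sure the task is already running
            pool.shutdownNow();            // moves the pool to STOP and interrupts workers
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();            // harmless if the pool is already stopped
        }
        return interrupted.get();
    }

    public static void main(String[] args) {
        System.out.println("task interrupted: " + runningTaskSeesInterrupt());
    }
}
```

    A plain shutdown would not interrupt the running task; it only stops new tasks from being accepted and lets queued work finish.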
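    1.4 Pre-execution hook
    Before running the task, runWorker calls beforeExecute(wt, task) with the worker thread and the task about to run, which lets a subclass intercept them, for example to reinitialize ThreadLocals or write a log entry.
    1.5 Post-execution hook
    After the task has run, runWorker calls afterExecute(task, thrown) with the task and any exception it threw. In addition, submitting a task through submit makes it possible to observe the result of each individual task.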
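
    Both hooks are protected methods meant to be overridden. Here is a minimal sketch of a subclass, with the hypothetical name HookedPoolSketch, that records the order in which the hooks and the task run.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical pool that records the order of hook and task execution. */
final class HookedPoolSketch extends ThreadPoolExecutor {
    final List<String> events = Collections.synchronizedList(new ArrayList<>());

    HookedPoolSketch() {
        super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        events.add("before");              // runs on the worker thread, before task.run()
        super.beforeExecute(t, r);
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        events.add(t == null ? "after" : "after: " + t); // t is what the task threw, if anything
    }

    static List<String> demo() {
        HookedPoolSketch pool = new HookedPoolSketch();
        pool.execute(() -> pool.events.add("run"));
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);  // wait until the worker has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(pool.events);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```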
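    Note that running the task, together with the interrupt check and both hooks, happens while the Worker's lock is held (between w.lock() and w.unlock()); only fetching the task happens before the lock is taken. Remember Worker's declaration? It is a subclass of AbstractQueuedSynchronizer and uses AQS's exclusive mode to protect task execution. ReentrantLock is not used because a reentrant lock would let a worker task reacquire the lock when it calls pool control methods such as setCorePoolSize.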

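    Question 1. What is a reentrant lock? ThreadPoolExecutor uses the reentrant lock ReentrantLock when adding a worker to workers. Would synchronized work as well?
    ReentrantLock is built on AQS (its internal Sync class extends AbstractQueuedSynchronizer). It offers the basic locking semantics of synchronized and extends them: lock acquisition can be interrupted, can time out, and can be attempted without blocking via tryLock. synchronized is less flexible, since it is based on the monitor model, and under heavy contention its performance was historically worse than ReentrantLock's, although modern JVMs have narrowed the gap.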
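    Question 2. What is AQS? Describe your understanding of it.
    AQS is AbstractQueuedSynchronizer, a framework that provides atomic management of synchronization state and the blocking and waking of threads. Worker, ReentrantLock, and CountDownLatch are all built on it. It keeps the synchronization state in a volatile int field, state, and how a subclass interprets that value determines whether the lock is reentrant. For locking, Worker uses only the state values 0 and 1 (plus an initial -1 that inhibits interrupts until runWorker starts), so it is not reentrant. Reentrant means that a thread can lock a critical resource it already holds again, incrementing the hold count by 1 each time, and decrementing it by 1 on each release.
    To manage the state atomically, AQS updates state with CAS. Internally it uses a doubly linked queue of waiting threads to coordinate acquire and release, and it supports both exclusive and shared modes.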
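
    The difference between a 0/1 state and a hold count is easy to see in code. The following is a minimal sketch of a non-reentrant mutex on top of AQS, in the spirit of Worker's lock, compared with ReentrantLock. NonReentrantMutexSketch and its helper methods are hypothetical names.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical non-reentrant mutex: state 0 means free, state 1 means held. */
final class NonReentrantMutexSketch extends AbstractQueuedSynchronizer {

    @Override
    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {            // CAS succeeds only if the lock is free
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false;                              // already held, even if by this thread
    }

    @Override
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    void lock()       { acquire(1); }
    boolean tryLock() { return tryAcquire(1); }
    void unlock()     { release(1); }

    /** Tries to take the mutex a second time from the thread that already holds it. */
    static boolean mutexAllowsReentry() {
        NonReentrantMutexSketch mutex = new NonReentrantMutexSketch();
        mutex.lock();
        try {
            return mutex.tryLock();
        } finally {
            mutex.unlock();
        }
    }

    /** Same experiment with ReentrantLock, whose state counts holds. */
    static boolean reentrantLockAllowsReentry() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        try {
            boolean again = lock.tryLock();        // same thread: hold count goes to 2
            if (again) {
                lock.unlock();                     // back to 1
            }
            return again;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("mutex reentry:         " + mutexAllowsReentry());
        System.out.println("ReentrantLock reentry: " + reentrantLockAllowsReentry());
    }
}
```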
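    Question 3. How can the results of tasks executed in the thread pool be observed?
    ThreadPoolExecutor extends AbstractExecutorService, which provides the submit method. submit wraps the task in a FutureTask, executes it, and returns it as a Future; the result can then be read from that Future or observed in afterExecute.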
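
    Both ways of observing a result can be combined in one sketch. The hypothetical SubmitResultSketch below reads the value from the returned Future and also picks it up in afterExecute, where the Runnable passed in is the Future wrapper created by submit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical pool that collects task results in afterExecute. */
final class SubmitResultSketch extends ThreadPoolExecutor {
    final List<Object> observed = Collections.synchronizedList(new ArrayList<>());

    SubmitResultSketch() {
        super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        // For submit(), r is the Future wrapper and is already completed here.
        if (t == null && r instanceof Future<?>) {
            try {
                observed.add(((Future<?>) r).get());
            } catch (CancellationException e) {
                observed.add(e);
            } catch (ExecutionException e) {
                observed.add(e.getCause());        // what the task itself threw
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Returns [value read from the Future, value seen in afterExecute]. */
    static List<Object> demo() {
        SubmitResultSketch pool = new SubmitResultSketch();
        List<Object> out = new ArrayList<>();
        try {
            Future<Integer> future = pool.submit(() -> 6 * 7); // a Callable<Integer>
            out.add(future.get());
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);        // afterExecute has run by now
            out.addAll(pool.observed);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    With submit, an exception thrown by the task is captured inside the Future, so the Throwable argument of afterExecute is null and the failure has to be read from the Future as shown.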
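    Question 4. Why does Alibaba disallow building thread pools with the factory methods the system provides?
    Take the single-thread pool (newSingleThreadExecutor) as an example. Its blocking queue is a LinkedBlockingQueue, which allows up to Integer.MAX_VALUE tasks to wait. If tasks are processed slowly while the workload is heavy, the backlog can grow until an OOM occurs. The parameters of the four standard thread pools are fixed, which leaves little control, so building a custom ThreadPoolExecutor is a good choice for controlling the pool flexibly.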
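
    One possible shape for such a custom pool is sketched below: a bounded queue, named threads, and CallerRunsPolicy so that overflow slows the submitter down instead of piling up in memory. The class BoundedPoolSketch, the thread name prefix demo-pool-, and all sizes are illustrative assumptions, not recommendations from the manual.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedPoolSketch {

    /** Builds a pool whose backlog is capped at queueCapacity tasks. */
    static ThreadPoolExecutor newBoundedPool(int core, int max, int queueCapacity) {
        AtomicInteger seq = new AtomicInteger(1);
        ThreadFactory named = r -> new Thread(r, "demo-pool-" + seq.getAndIncrement());
        return new ThreadPoolExecutor(
                core, max, 30L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity),     // bounded queue
                named,                                       // readable thread names
                new ThreadPoolExecutor.CallerRunsPolicy());  // overflow runs on the submitter
    }

    /** Returns the names of the threads that ran three tasks on a 1-thread, 1-slot pool. */
    static List<String> demo() {
        ThreadPoolExecutor pool = newBoundedPool(1, 1, 1);
        CountDownLatch release = new CountDownLatch(1);
        List<String> ranOn = Collections.synchronizedList(new ArrayList<>());
        try {
            pool.execute(() -> {                             // occupies the only worker
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                ranOn.add(Thread.currentThread().getName());
            });
            pool.execute(() -> ranOn.add(Thread.currentThread().getName())); // queued
            pool.execute(() -> ranOn.add(Thread.currentThread().getName())); // overflow
        } finally {
            release.countDown();
            pool.shutdown();
        }
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(ranOn);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    The third task runs on the submitting thread because both the worker and the queue slot are taken, which is the back-pressure an unbounded LinkedBlockingQueue never applies.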


          Article link: https://www.haomeiwen.com/subject/embwektx.html