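Symptoms
Our production scheduled jobs use crontab to launch a console application at fixed times. Recently the application stopped exiting on its own after finishing its work; because the process never stops, it ties up a large amount of memory.
Inspecting the threads with jstack showed several non-daemon threads in the WAITING state.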
$ jstack pid
"background_executor-3-thread-7" #87 prio=5 os_prio=0 tid=0x00007fdfcc17d000 nid=0x6423 waiting on condition [0x00007fdf566ed000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000f1ee9850> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
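Analysis
Searching the code for the keyword "background_executor" led to a thread pool with a custom ThreadFactory that names each thread as it is created and marks it as a non-daemon thread with t.setDaemon(false).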
The thread pool is also initialized with:
corePoolSize = 8
maximumPoolSize = 20
keepAliveTime = 30 seconds
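As a result, once the pool has finished its tasks and gone idle, it still keeps 8 threads parked in the WAITING state, waiting for new tasks, unless allowCoreThreadTimeOut = true is set.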
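The behaviour can be reproduced with a minimal sketch. The class name `IdleCoreThreadsDemo` and the helper `namedFactory` are hypothetical stand-ins for the application's own code, which is not shown in this post; only the pool parameters and the thread-name prefix come from the analysis above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class IdleCoreThreadsDemo {

    // Hypothetical stand-in for the custom factory: names each thread and keeps it non-daemon.
    static ThreadFactory namedFactory(String prefix) {
        AtomicInteger seq = new AtomicInteger(1);
        return runnable -> {
            Thread t = new Thread(runnable, prefix + "-thread-" + seq.getAndIncrement());
            t.setDaemon(false);
            return t;
        };
    }

    // Same sizing as in the analysis: 8 core threads, 20 max, 30 s keep-alive.
    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(8, 20, 30, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(), namedFactory("background_executor"));
    }

    // Runs taskCount trivial tasks to completion and reports how many workers are still alive.
    static int poolSizeAfterTasks(ThreadPoolExecutor pool, int taskCount) {
        CountDownLatch done = new CountDownLatch(taskCount);
        for (int i = 0; i < taskCount; i++) {
            pool.execute(done::countDown);
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return pool.getPoolSize();
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool();
        System.out.println("idle threads still alive: " + poolSizeAfterTasks(pool, 8));
        // Without this call the parked core threads keep the JVM alive.
        pool.shutdown();
    }
}
```

Removing the final `pool.shutdown()` from `main` should reproduce the hang: all work is done, yet the JVM does not exit because the core threads are non-daemon and still parked in `LinkedBlockingQueue.take`.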
For reference, the Javadoc for the thread pool constructor:
public ThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
BlockingQueue<Runnable> workQueue,
RejectedExecutionHandler handler)
Creates a new ThreadPoolExecutor with the given initial parameters and default thread factory.
Parameters:
corePoolSize - the number of threads to keep in the pool, even if they are idle, unless allowCoreThreadTimeOut is set
maximumPoolSize - the maximum number of threads to allow in the pool
keepAliveTime - when the number of threads is greater than the core, this is the maximum time that excess idle threads will wait for new tasks before terminating.
unit - the time unit for the keepAliveTime argument
workQueue - the queue to use for holding tasks before they are executed. This queue will hold only the Runnable tasks submitted by the execute method.
handler - the handler to use when execution is blocked because the thread bounds and queue capacities are reached
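Solution
Making the pool's threads daemon threads is not an option, because the JVM could then exit before the tasks have finished. Instead, set allowCoreThreadTimeOut = true: once the tasks are done and the keep-alive time has elapsed, the pool releases all of its idle threads, core threads included, and the process can exit.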
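A minimal sketch of this fix follows. The class `CoreTimeoutDemo` and its helpers are illustrative names, and the keep-alive is shortened to a few hundred milliseconds only so the effect is quick to observe; the real pool would keep its 30 seconds.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreTimeoutDemo {

    // Builds an 8/20 pool whose core threads are also subject to the keep-alive timeout.
    static ThreadPoolExecutor newPool(long keepAliveMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(8, 20, keepAliveMillis,
                TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        // Requires a keep-alive greater than zero, otherwise IllegalArgumentException is thrown.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    // Polls until the pool has released every worker, or until maxWaitMillis has passed.
    static boolean drainsToZero(ThreadPoolExecutor pool, long maxWaitMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        while (System.nanoTime() < deadline) {
            if (pool.getPoolSize() == 0) {
                return true;
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return pool.getPoolSize() == 0;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool(200);
        for (int i = 0; i < 8; i++) {
            pool.execute(() -> { });
        }
        System.out.println("all idle threads released: " + drainsToZero(pool, 5000));
        // No shutdown() here: the JVM can exit once the idle workers have timed out.
    }
}
```

The trade-off is that the process lingers for one keep-alive period after the last task, and a pool that is reused later has to recreate its threads.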
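Alternative approach
Is there another way? Yes: the thread pool also provides a shutdown() method. Call shutdown() once all tasks have been submitted; the pool finishes every task, its worker threads exit, and the pool can be reclaimed.
After shutdown() is called, the pool moves from RUNNING to SHUTDOWN. It accepts no new tasks but still runs the tasks already in the queue, then enters TIDYING, and after the terminated() hook has run it becomes TERMINATED, at which point the pool can be reclaimed.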
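The shutdown path can be sketched as follows. `ShutdownDemo` and its helper names are hypothetical, and the one-minute wait in `awaitTermination` is an arbitrary choice for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ShutdownDemo {

    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(8, 20, 30, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    }

    // Submits taskCount tasks, shuts the pool down, waits for termination,
    // and returns how many tasks actually ran.
    static int runAndShutdown(ThreadPoolExecutor pool, int taskCount) {
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(completed::incrementAndGet);
        }
        pool.shutdown(); // RUNNING -> SHUTDOWN: queued tasks still run, new ones are rejected
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow(); // give up waiting and interrupt whatever is still running
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    // With the default AbortPolicy, submitting after shutdown() throws RejectedExecutionException.
    static boolean rejectsNewTasks(ThreadPoolExecutor pool) {
        try {
            pool.execute(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool();
        System.out.println("tasks completed: " + runAndShutdown(pool, 100));
        System.out.println("terminated: " + pool.isTerminated());
        System.out.println("rejects new tasks: " + rejectsNewTasks(pool));
    }
}
```

Compared with allowCoreThreadTimeOut, this lets the process exit as soon as the last task finishes, but it only fits jobs where the point at which all tasks have been submitted is known.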
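The thread pool packs both values into a single int: the high 3 bits hold the pool state and the low 29 bits hold the worker thread count. Because both live in one atomic value, the state and the thread count can be updated together without taking a lock.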
private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
State | Value (high 3 bits) |
---|---|
RUNNING | 111 |
SHUTDOWN | 000 |
STOP | 001 |
TIDYING | 010 |
TERMINATED | 011 |
/*
* The runState provides the main lifecycle control, taking on values:
*
* RUNNING: Accept new tasks and process queued tasks
* SHUTDOWN: Don't accept new tasks, but process queued tasks
* STOP: Don't accept new tasks, don't process queued tasks,
* and interrupt in-progress tasks
* TIDYING: All tasks have terminated, workerCount is zero,
* the thread transitioning to state TIDYING
* will run the terminated() hook method
* TERMINATED: terminated() has completed
*
* The numerical order among these values matters, to allow
* ordered comparisons. The runState monotonically increases over
* time, but need not hit each state. The transitions are:
*
* RUNNING -> SHUTDOWN
* On invocation of shutdown(), perhaps implicitly in finalize()
* (RUNNING or SHUTDOWN) -> STOP
* On invocation of shutdownNow()
* SHUTDOWN -> TIDYING
* When both queue and pool are empty
* STOP -> TIDYING
* When pool is empty
* TIDYING -> TERMINATED
* When the terminated() hook method has completed
*
* Threads waiting in awaitTermination() will return when the
* state reaches TERMINATED.
*
* Detecting the transition from SHUTDOWN to TIDYING is less
* straightforward than you'd like because the queue may become
* empty after non-empty and vice versa during SHUTDOWN state, but
* we can only terminate if, after seeing that it is empty, we see
* that workerCount is 0 (which sometimes entails a recheck -- see
* below).
*/
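To make the bit layout concrete, here is a small standalone sketch modeled on the private constants and helpers in the JDK 8 ThreadPoolExecutor source. The class `CtlBits` and the `highBits` helper are illustrative additions, not part of the JDK API.

```java
public class CtlBits {
    static final int COUNT_BITS = Integer.SIZE - 3;      // 29 bits for the worker count
    static final int CAPACITY = (1 << COUNT_BITS) - 1;   // mask for the low 29 bits

    // Run states live in the high 3 bits; RUNNING is the only negative value.
    static final int RUNNING = -1 << COUNT_BITS;
    static final int SHUTDOWN = 0 << COUNT_BITS;
    static final int STOP = 1 << COUNT_BITS;
    static final int TIDYING = 2 << COUNT_BITS;
    static final int TERMINATED = 3 << COUNT_BITS;

    static int runStateOf(int c) { return c & ~CAPACITY; }
    static int workerCountOf(int c) { return c & CAPACITY; }
    static int ctlOf(int rs, int wc) { return rs | wc; }

    // Illustrative helper: renders the high 3 bits of a ctl value as a string.
    static String highBits(int c) {
        String bits = Integer.toBinaryString(c >>> COUNT_BITS);
        return String.format("%3s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        int ctl = ctlOf(RUNNING, 5);
        System.out.println("state bits: " + highBits(ctl));
        System.out.println("worker count: " + workerCountOf(ctl));
        // The numeric ordering is what allows checks such as "state >= SHUTDOWN".
        System.out.println("ordered: " + (RUNNING < SHUTDOWN && SHUTDOWN < STOP
                && STOP < TIDYING && TIDYING < TERMINATED));
    }
}
```

Because RUNNING is the only negative state, the signed ordering RUNNING < SHUTDOWN < STOP < TIDYING < TERMINATED holds, which is what the comment above means by ordered comparisons.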