从Daemons到finalize timed out afte

作者: Jamin_正宗红罐辣酱 | 来源:发表于2017-12-13 16:52 被阅读0次

    〇.序

    起因,在Fabric遇到了很多BUG..出于想对于BUG的来龙去脉做进一步的了解.做了一番调查..本文也是Android Crash的防护与追踪的后续研究.

    一.浅读Daemons

    1.Daemons的由来:

    代码位置:java.lang.Daemons.

    Daemons开始于Zygote进程创建子进程.

    三件事情:

    1. VM_HOOKS.preFork(),做准备工作。
    2. nativeForkAndSpecialize,创建子进程
    3. VM_HOOKS.postForkCommon();启动Zygote的4个Daemon线程,java堆整理,引用队列,以及析构线程。
        //第一步:
        //停止四个线程:Daemon线程,java堆整理,引用队列,析构线程 
        //也就是创建子进程的时候,不能有这几个线程搅和。 
        public void preFork() {
            Daemons.stop();
            waitUntilAllThreadsStopped();
            token = nativePreFork();
        }
        
        ....
        
        /**
         * Called by the zygote in both the parent and child processes after
         * every fork. In the child process, this method is called after
         * {@code postForkChild}.
         */
         //第三步,启动Daemons
        public void postForkCommon() {
            Daemons.start();
        }
    

    2.代码开篇:

    
    /**
     * Calls Object.finalize() on objects in the finalizer reference queue. The VM
     * will abort if any finalize() call takes more than the maximum finalize time
     * to complete.
     *
     * @hide
     */
    public final class Daemons {
    
        private static final int NANOS_PER_MILLI = 1000 * 1000;
        private static final int NANOS_PER_SECOND = NANOS_PER_MILLI * 1000;
        private static final long MAX_FINALIZE_NANOS = 10L * NANOS_PER_SECOND;
    
        public static void start() {
            ReferenceQueueDaemon.INSTANCE.start();
            FinalizerDaemon.INSTANCE.start();
            FinalizerWatchdogDaemon.INSTANCE.start();
            HeapTaskDaemon.INSTANCE.start();
        }
    
        public static void stop() {
            HeapTaskDaemon.INSTANCE.stop();
            ReferenceQueueDaemon.INSTANCE.stop();
            FinalizerDaemon.INSTANCE.stop();
            FinalizerWatchdogDaemon.INSTANCE.stop();
        }
        
        ...
    }
    

    Daemons类的开头写下了关键的注释,当调用finalizer reference queue(终结引用队列)中的对象的Object.finalize()方法的时候.假如有任何调用finalize() 方法时.超出了最大终结时间(一般为10秒).VM就会中止.这个最大终结时间就是MAX_FINALIZE_NANOS

    2. 1四个守护线程

    1. ReferenceQueueDaemon:引用队列守护线程。我们知道,在创建引用对象的时候,可以关联一个队列。当被引用对象引用的对象被GC回收的时候,被引用对象就会被加入到其创建时关联的队列去。这个加入队列的操作就是由ReferenceQueueDaemon守护线程来完成的。这样应用程序就可以知道哪些被引用的对象已经被回收了。
    2. FinalizerDaemon:析构守护线程。对于重写了成员函数finalize的对象,它们被GC决定回收时,并没有马上被回收,而是被放入到一个队列中,等待FinalizerDaemon守护线程去调用它们的成员函数finalize,然后再被回收。
    3. FinalizerWatchdogDaemon:析构监护守护线程。用来监控FinalizerDaemon线程的执行。一旦检测那些重写了finalize的对象在执行成员函数finalize时超出一定时间,那么就会退出VM。
    4. HeapTaskDaemon : 堆裁剪守护线程。用来执行裁剪堆的操作,也就是用来将那些空闲的堆内存归还给系统。

    2.2 重要属性解析

    • MAX_FINALIZE_NANOS:此类开始就定义了一个MAX_FINALIZE_NANOS.可能有类似经历的同学,看了之后就能明白.这个是finalize方法执行的超时时间.(经常会遇到XXX.finalize() timed out after 10 seconds)

    • Daemons提供了start(),stop()方法,分别开启和停止四个守护进程.

    3.四大守护线程的抽象类

      /**
         * A background task that provides runtime support to the application.
         * Daemons can be stopped and started, but only so that the zygote can be a
         * single-threaded process when it forks.
         */
        private static abstract class Daemon implements Runnable {
        
            private Thread thread;
            private String name;
    
            protected Daemon(String name) {
                this.name = name;
            }
    
            public synchronized void start() {
                if (thread != null) {
                    throw new IllegalStateException("already running");
                }
                thread = new Thread(ThreadGroup.systemThreadGroup, this, name);
                thread.setDaemon(true);
                thread.start();
            }
    
            public abstract void run();
            
            ...
            
    }
    

    Daemon线程是四大守护线程的抽象类.核心是在start()方法的时候,将自己放到ThreadGroup.systemThreadGroup中.并设置为守护线程.这部分不过分解读.

    二. 本文重点:FinalizerWatchdogDaemon

    本文重点是因为遇到finalize() timed out after 10 seconds.这样的异常,想要对此调研一番.

    1.初识FinalizerWatchdogDaemon

    1.1 类注释解读

    FinalizerWatchdogDaemon的类注释写的很清楚.当一个实例的finalizer函数,花费了超过MAX_FINALIZATION_MILLIS时间(默认为10秒).将被认定finalizer卡死,此时,FinalizerWatchdogDaemon将会让VM退出.

        /**
         * The watchdog exits the VM if the finalizer ever gets stuck. We consider
         * the finalizer to be stuck if it spends more than MAX_FINALIZATION_MILLIS
         * on one instance.
         */
        private static class FinalizerWatchdogDaemon extends Daemon {
            private static final FinalizerWatchdogDaemon INSTANCE = new FinalizerWatchdogDaemon();
    
            FinalizerWatchdogDaemon() {
                super("FinalizerWatchdogDaemon");
            }
    
            @Override public void run() {
            ...
            }
    }
    

    1.2 错误日志元凶

    • 找你一万年.当读到这的时候,大家一定发现了.这就是我们看到的那个TimeoutException.拼装.finalize() timed out after 10 seconds.的整个作案现场.
    • h.uncaughtException(Thread.currentThread(), syntheticException);上面的注释也特别关键,虽然这个TimeoutException是我抛出的.但是并不是我这个线程超时,我只是静静的看着它.

    当WatchDog线程检测finalizing!=null的时候.会提示TimeOutException.

      @Override public void run() {
                while (isRunning()) {
                    if (!sleepUntilNeeded()) {
                        // We have been interrupted, need to see if this daemon has been stopped.
                        continue;
                    }
                    final Object finalizing = waitForFinalization();
                    if (finalizing != null && !VMRuntime.getRuntime().isDebuggerActive()) {
                        finalizerTimedOut(finalizing);
                        break;
                    }
                }
            }
    

    TimeOutException的生成.

       private static void finalizerTimedOut(Object object) {
                // The current object has exceeded the finalization deadline; abort!
                String message = object.getClass().getName() + ".finalize() timed out after "
                        + (MAX_FINALIZE_NANOS / NANOS_PER_SECOND) + " seconds";
                Exception syntheticException = new TimeoutException(message);
                // We use the stack from where finalize() was running to show where it was stuck.
                syntheticException.setStackTrace(FinalizerDaemon.INSTANCE.getStackTrace());
                Thread.UncaughtExceptionHandler h = Thread.getDefaultUncaughtExceptionHandler();
                // Send SIGQUIT to get native stack traces.
                try {
                    Os.kill(Os.getpid(), OsConstants.SIGQUIT);
                    // Sleep a few seconds to let the stack traces print.
                    Thread.sleep(5000);
                } catch (Exception e) {
                    System.logE("failed to send SIGQUIT", e);
                } catch (OutOfMemoryError ignored) {
                    // May occur while trying to allocate the exception.
                }
                if (h == null) {
                    // If we have no handler, log and exit.
                    System.logE(message, syntheticException);
                    System.exit(2);
                }
                // Otherwise call the handler to do crash reporting.
                // We don't just throw because we're not the thread that
                // timed out; we're the thread that detected it.
                h.uncaughtException(Thread.currentThread(), syntheticException);
            }
            
    

    观察FinalizerDaemons的回收.

    
    
            /**
             * Return an object that took too long to finalize or return null.
             * Wait MAX_FINALIZE_NANOS.  If the FinalizerDaemon took essentially the whole time
             * processing a single reference, return that reference.  Otherwise return null.
             */
            private Object waitForFinalization() {
                long startCount = FinalizerDaemon.INSTANCE.progressCounter.get();
                // Avoid remembering object being finalized, so as not to keep it alive.
                if (!sleepFor(MAX_FINALIZE_NANOS)) {
                    // Don't report possibly spurious timeout if we are interrupted.
                    return null;
                }
                if (getNeedToWork() && FinalizerDaemon.INSTANCE.progressCounter.get() == startCount) {
                    // We assume that only remove() and doFinalize() may take time comparable to
                    // MAX_FINALIZE_NANOS.
                    // We observed neither the effect of the gotoSleep() nor the increment preceding a
                    // later wakeUp. Any remove() call by the FinalizerDaemon during our sleep
                    // interval must have been followed by a wakeUp call before we checked needToWork.
                    // But then we would have seen the counter increment.  Thus there cannot have
                    // been such a remove() call.
                    // The FinalizerDaemon must not have progressed (from either the beginning or the
                    // last progressCounter increment) to either the next increment or gotoSleep()
                    // call.  Thus we must have taken essentially the whole MAX_FINALIZE_NANOS in a
                    // single doFinalize() call.  Thus it's OK to time out.  finalizingObject was set
                    // just before the counter increment, which preceded the doFinalize call.  Thus we
                    // are guaranteed to get the correct finalizing value below, unless doFinalize()
                    // just finished as we were timing out, in which case we may get null or a later
                    // one.  In this last case, we are very likely to discard it below.
                    Object finalizing = FinalizerDaemon.INSTANCE.finalizingObject;
                    sleepFor(NANOS_PER_SECOND / 2);
                    // Recheck to make it even less likely we report the wrong finalizing object in
                    // the case which a very slow finalization just finished as we were timing out.
                    if (getNeedToWork()
                            && FinalizerDaemon.INSTANCE.progressCounter.get() == startCount) {
                        return finalizing;
                    }
                }
                return null;
            }
    

    顺便学习下 System.exit(2);的数字的意思.

     System.exit(2);
     //0表示正常退出,0以上表示非正常退出.
     private static final int RUNNING = 0;
     private static final int HOOKS = 1;
     private static final int FINALIZERS = 2;
    

    传送门 : StackOverflow传送门,System.exit()扩展阅读

    1.3 不得不提的FinalizerDaemon

    WatchDog顾名思义,观察的就是这个FinalizerDaemon.一旦FinalizerDaemon在处理一个对象的时候超过10s中,那么就会结束进程,导致崩溃。附上FinalizerDaemon的源码.源码中的注释很详尽.可以自行阅读.

       private static class FinalizerDaemon extends Daemon {
            private static final FinalizerDaemon INSTANCE = new FinalizerDaemon();
            private final ReferenceQueue<Object> queue = FinalizerReference.queue;
            private final AtomicInteger progressCounter = new AtomicInteger(0);
            // Object (not reference!) being finalized. Accesses may race!
            private Object finalizingObject = null;
    
            FinalizerDaemon() {
                super("FinalizerDaemon");
            }
    
            @Override public void run() {
                // This loop may be performance critical, since we need to keep up with mutator
                // generation of finalizable objects.
                // We minimize the amount of work we do per finalizable object. For example, we avoid
                // reading the current time here, since that involves a kernel call per object.  We
                // limit fast path communication with FinalizerWatchDogDaemon to what's unavoidable: A
                // non-volatile store to communicate the current finalizable object, e.g. for
                // reporting, and a release store (lazySet) to a counter.
                // We do stop the  FinalizerWatchDogDaemon if we have nothing to do for a
                // potentially extended period.  This prevents the device from waking up regularly
                // during idle times.
    
                // Local copy of progressCounter; saves a fence per increment on ARM and MIPS.
                int localProgressCounter = progressCounter.get();
    
                while (isRunning()) {
                    try {
                        // Use non-blocking poll to avoid FinalizerWatchdogDaemon communication
                        // when busy.
                        FinalizerReference<?> finalizingReference = (FinalizerReference<?>)queue.poll();
                        if (finalizingReference != null) {
                            finalizingObject = finalizingReference.get();
                            progressCounter.lazySet(++localProgressCounter);
                        } else {
                            finalizingObject = null;
                            progressCounter.lazySet(++localProgressCounter);
                            // Slow path; block.
                            FinalizerWatchdogDaemon.INSTANCE.goToSleep();
                            finalizingReference = (FinalizerReference<?>)queue.remove();
                            finalizingObject = finalizingReference.get();
                            progressCounter.set(++localProgressCounter);
                            FinalizerWatchdogDaemon.INSTANCE.wakeUp();
                        }
                        doFinalize(finalizingReference);
                    } catch (InterruptedException ignored) {
                    } catch (OutOfMemoryError ignored) {
                    }
                }
            }
    
            @FindBugsSuppressWarnings("FI_EXPLICIT_INVOCATION")
            private void doFinalize(FinalizerReference<?> reference) {
                FinalizerReference.remove(reference);
                Object object = reference.get();
                reference.clear();
                try {
                    object.finalize();
                } catch (Throwable ex) {
                    // The RI silently swallows these, but Android has always logged.
                    System.logE("Uncaught exception thrown by finalizer", ex);
                } finally {
                    // Done finalizing, stop holding the object as live.
                    finalizingObject = null;
                }
            }
        }
    

    三.问题分析:

    从GC来看.JAVA在内存回收的时候.并不是马上将对象销毁,而是放入到一个队列中,由FinalizerDaemon(析构守护线程)调用他们的finalize方法,再被回收,所以当FinalizerDaemon回收过程中.超过10秒,FinalizerWatchdogDaemon线程就会抛出TimeoutException XXX.finalize() timed out after 10 seconds.结尾附赠了两篇文章,有提到析构相关的性能开销和可以去追踪问题的办法.供大家扩展阅读.

    四.聊聊FinalizerReference

    内存占用大户(FinalizerReference)
    [待续]换个认知角度再来看看这个问题.

    结尾:

    两篇不错的文章.
    传送门 :
    提升Android下内存的使用意识和排查能力
    再谈Finalizer对象--大型App中内存与性能的隐性杀手

    相关文章

      网友评论

        本文标题:从Daemons到finalize timed out afte

        本文链接:https://www.haomeiwen.com/subject/rzkzixtx.html