美文网首页Android开发Android技术知识Android开发
JVM垃圾回收之finalize执行时引起timed out 闪

JVM垃圾回收之finalize执行时引起timed out 闪

作者: LooperJing | 来源:发表于2019-05-10 17:14 被阅读28次

    一、背景

    java.util.concurrent.TimeoutException: android.content.res.AssetManager$AssetInputStream.finalize() timed out after 10 seconds
     at android.content.res.AssetManager$AssetInputStream.close(AssetManager.java:812)
     at android.content.res.AssetManager$AssetInputStream.finalize(AssetManager.java:845)
     at java.lang.Daemons$FinalizerDaemon.doFinalize(Daemons.java:202)
     at java.lang.Daemons$FinalizerDaemon.run(Daemons.java:185)
     at java.lang.Thread.run(Thread.java:833)
    

    这是一个高频问题,必须要诊治一下了,去查看一下这个crash在我们发出去的最新版本上的上报情况,看到所有重写finalize方法的对象,都有可能发生这个异常,堆栈有几十种,其中AssetManager是个大户,发生了289次,ThreadedRenderer发生了33次。其余的类大部分都是我们自己的,比如CursorWindow,SQliteCursor等

    二、初步分析

    android.content.res.AssetManager$AssetInputStream.finalize() timed out after 10 seconds,从报错的直观意思上来看,是由于finalize方法超时了,下面来模拟一下。

    class MyObject extends View {
           public MyObject(Context context) {
               super(context);
           }
           @Override
           protected void finalize() throws Throwable {
               System.out.println("WANG  finalize begain");
               try{
                   Thread.sleep(2000);
               }finally {
                   super.finalize();
                   System.out.println("WANG  finalize end");
               }
           }
       }
    
    
    view.setOnClickListener(new View.OnClickListener() {
               @Override
               public void onClick(View v) {
                   for (int i = 0; i < 1000; i++) {
                      MyObject object= new MyObject(MainActivity.this);
                   }
               }
           });
    

    当每次被GC的时候,MyObject可以被正常的回收,如下图。


    image.png

    看一下FinalizerWatchdogDaemon线程的trace,Sleeping状态,可能抓的时候,正处于sleep。

    "FinalizerWatchdogDaemon" daemon prio=5 tid=5 Sleeping
     | group="system" sCount=1 dsCount=0 flags=1 obj=0x12f40350 self=0x6fad268c00
     | sysTid=1347 nice=4 cgrp=default sched=0/0 handle=0x6f95d264f0
     | state=S schedstat=( 118415628 218405939 1140 ) utm=7 stm=4 core=6 HZ=100
     | stack=0x6f95c23000-0x6f95c25000 stackSize=1041KB
     | held mutexes=
     at java.lang.Thread.sleep(Native method)
     - sleeping on <0x0edad885> (a java.lang.Object)
     at java.lang.Thread.sleep(Thread.java:373)
     - locked <0x0edad885> (a java.lang.Object)
     at java.lang.Thread.sleep(Thread.java:314)
     at java.lang.Daemons$FinalizerWatchdogDaemon.sleepFor(Daemons.java:344)
     at java.lang.Daemons$FinalizerWatchdogDaemon.waitForFinalization(Daemons.java:366)
     at java.lang.Daemons$FinalizerWatchdogDaemon.runInternal(Daemons.java:283)
     at java.lang.Daemons$Daemon.run(Daemons.java:105)
     at java.lang.Thread.run(Thread.java:764)
    

    现在修改一下程序,把睡眠2s改成睡眠20s,前台运行一段时间之后,突然崩溃了(和机型相关,有一些新出来的机型,系统已经优化,这个问题就不会再出现),FinalizerWatchdogDaemon线程报错了。

    04-23 15:27:57.153 26407-26416/com.example.myapplication E/AndroidRuntime: FATAL EXCEPTION: FinalizerWatchdogDaemon
       Process: com.example.myapplication, PID: 26407
       java.util.concurrent.TimeoutException: com.example.myapplication.MainActivity$MyObject.finalize() timed out after 10 seconds
           at java.lang.Thread.sleep(Native Method)
           at java.lang.Thread.sleep(Thread.java:1031)
           at java.lang.Thread.sleep(Thread.java:985)
           at com.example.myapplication.MainActivity$MyObject.finalize(MainActivity.java:36)
           at java.lang.Daemons$FinalizerDaemon.doFinalize(Daemons.java:202)
           at java.lang.Daemons$FinalizerDaemon.run(Daemons.java:185)
           at java.lang.Thread.run(Thread.java:818)
    

    所以我们知道,这个crash和finalize的耗时有一定关系,那我们在看看我们的代码是怎么写的?

    256      /**
    257       * Release the native resources, if they haven't been released yet.
    258       */
    259      @Override
    260      protected void finalize() {
    261          try {
    262              // if the cursor hasn't been closed yet, close it first
    263              if (mWindow != null) {
    264                  close();
    265              }
    266          } finally {
    267              super.finalize();
    268          }
    269      }
    
    138      @Override
    139      protected void finalize() throws Throwable {
    140          try {
    141              dispose();
    142          } finally {
    143              super.finalize();
    144          }
    145      }
    146  
    147      private void dispose() {
    148          if (mWindowPtr != 0) {
    149              recordClosingOfWindow(mWindowPtr);
    150              nativeDispose(mWindowPtr);
    151              mWindowPtr = 0;
    152          }
    153      }
    154  
    

    由此可见,finalize中都做一些额外的工作,正是这些额外的任务,在系统资源紧张,GC的时候很繁忙导致Finalizer对象回收超时,触发了App的crash。

    三、深入分析

    FinalizerWatchdogDaemon线程顾名思义,带有一个watchdog,说明和一个看门狗的性质是一样的,超过一定的时候不喂狗,就会被狗咬,看看源码是不是这个样子的。

    http://androidxref.com/9.0.0_r3/xref/libcore/libart/src/main/java/java/lang/Daemons.java#37

    30/**
    31 * Calls Object.finalize() on objects in the finalizer reference queue. The VM
    32 * will abort if any finalize() call takes more than the maximum finalize time
    33 * to complete.
    34 *
    35 * @hide
    36 */
    37public final class Daemons {
    38    private static final int NANOS_PER_MILLI = 1000 * 1000;
    39    private static final int NANOS_PER_SECOND = NANOS_PER_MILLI * 1000;
    40    private static final long MAX_FINALIZE_NANOS = 10L * NANOS_PER_SECOND;
    41
           //开启四个线程
    42    public static void start() {
    43        ReferenceQueueDaemon.INSTANCE.start();
    44        FinalizerDaemon.INSTANCE.start();
    45        FinalizerWatchdogDaemon.INSTANCE.start();
    46        HeapTaskDaemon.INSTANCE.start();
    47    }
    48
    55    //停止4个线程
    56    public static void stop() {
    57        HeapTaskDaemon.INSTANCE.stop();
    58        ReferenceQueueDaemon.INSTANCE.stop();
    59        FinalizerDaemon.INSTANCE.stop();
    60        FinalizerWatchdogDaemon.INSTANCE.stop();
    61    }
       ......
    62}
    

    Daemons类的开头注释很清楚,从finalizer reference queue(终结引用队列)中取出对象,调用对象的finalize()方法的时候.如果超出了最大终结时间(一般为MAX_FINALIZE_NANOS,值是10秒),JVM就会中止。这个类中开启了4个线程。

    FinalizerDaemon:对于重写了成员函数finalize的对象,它们被GC决定回收时,并没有马上被回收,而是被放入到一个队列中,等待FinalizerDaemon守护线程去调用它们的成员函数finalize,然后再被回收。

    FinalizerWatchdogDaemon:与上面的线程对应,用来监控FinalizerDaemon线程的执行。一旦检测那些重写了finalize的对象在执行成员函数finalize时超出一定时间,那么就会退出JVM。

    我们来看看FinalizerDaemon线程的实现。

    193    private static class FinalizerDaemon extends Daemon {
    194        private static final FinalizerDaemon INSTANCE = new FinalizerDaemon();
    195        private final ReferenceQueue<Object> queue = FinalizerReference.queue;
    196        private final AtomicInteger progressCounter = new AtomicInteger(0);
    197        // Object (not reference!) being finalized. Accesses may race!
    198        private Object finalizingObject = null;
    199
    200        FinalizerDaemon() {
    201            super("FinalizerDaemon");
    202        }
    203
    204        @Override public void runInternal() {
    205            // This loop may be performance critical, since we need to keep up with mutator
    206            // generation of finalizable objects.
    207            // We minimize the amount of work we do per finalizable object. For example, we avoid
    208            // reading the current time here, since that involves a kernel call per object.  We
    209            // limit fast path communication with FinalizerWatchDogDaemon to what's unavoidable: A
    210            // non-volatile store to communicate the current finalizable object, e.g. for
    211            // reporting, and a release store (lazySet) to a counter.
    212            // We do stop the  FinalizerWatchDogDaemon if we have nothing to do for a
    213            // potentially extended period.  This prevents the device from waking up regularly
    214            // during idle times.
    215
    216            // Local copy of progressCounter; saves a fence per increment on ARM and MIPS.
    217            int localProgressCounter = progressCounter.get();
    218
    219            while (isRunning()) {
    220                try {
    221                    // Use non-blocking poll to avoid FinalizerWatchdogDaemon communication
    222                    // when busy.
    223                    FinalizerReference<?> finalizingReference = (FinalizerReference<?>)queue.poll();
    224                    if (finalizingReference != null) {
    225                        finalizingObject = finalizingReference.get();
    226                        progressCounter.lazySet(++localProgressCounter);
    227                    } else {
    228                        finalizingObject = null;
    229                        progressCounter.lazySet(++localProgressCounter);
    230                        // Slow path; block.
    231                        FinalizerWatchdogDaemon.INSTANCE.goToSleep();
    232                        finalizingReference = (FinalizerReference<?>)queue.remove();
    233                        finalizingObject = finalizingReference.get();
    234                        progressCounter.set(++localProgressCounter);
    235                        FinalizerWatchdogDaemon.INSTANCE.wakeUp();
    236                    }
    237                    doFinalize(finalizingReference);
    238                } catch (InterruptedException ignored) {
    239                } catch (OutOfMemoryError ignored) {
    240                }
    241            }
    242        }
    243
    244        @FindBugsSuppressWarnings("FI_EXPLICIT_INVOCATION")
    245        private void doFinalize(FinalizerReference<?> reference) {
    246            FinalizerReference.remove(reference);
    247            Object object = reference.get();
    248            reference.clear();
    249            try {
                       //上面从reference取出对象,调用它的finalize
    250                object.finalize();
    251            } catch (Throwable ex) {
    252                // The RI silently swallows these, but Android has always logged.
    253                System.logE("Uncaught exception thrown by finalizer", ex);
    254            } finally {
    255                // Done finalizing, stop holding the object as live.
    256                finalizingObject = null;
    257            }
    258        }
    

    在看看FinalizerWatchdogDaemon线程的实现

    261    /**
    262     * The watchdog exits the VM if the finalizer ever gets stuck. We consider
    263     * the finalizer to be stuck if it spends more than MAX_FINALIZATION_MILLIS
    264     * on one instance.
    265     */
    266    private static class FinalizerWatchdogDaemon extends Daemon {
    267        private static final FinalizerWatchdogDaemon INSTANCE = new FinalizerWatchdogDaemon();
    268
    269        private boolean needToWork = true;  // Only accessed in synchronized methods.
    270
    271        FinalizerWatchdogDaemon() {
    272            super("FinalizerWatchdogDaemon");
    273        }
    274
    275        @Override public void runInternal() {
    276            while (isRunning()) {
    277                if (!sleepUntilNeeded()) {
    278                    // We have been interrupted, need to see if this daemon has been stopped.
    279                    continue;
    280                }
    281                final Object finalizing = waitForFinalization();
    282                if (finalizing != null && !VMRuntime.getRuntime().isDebuggerActive()) {
    283                    finalizerTimedOut(finalizing);
    284                    break;
    285                }
    286            }
    287        }
    
    

    waitForFinalization方法用来监控FinalizerDaemon线程的回收,如果超时waitForFinalization方法返回值就不为空,触发了finalizerTimedOut方法的执行,进程被杀死,JVM退出。

    396        private static void finalizerTimedOut(Object object) {
    397            // The current object has exceeded the finalization deadline; abort!
    398            String message = object.getClass().getName() + ".finalize() timed out after "
    399                    + (MAX_FINALIZE_NANOS / NANOS_PER_SECOND) + " seconds";
    400            Exception syntheticException = new TimeoutException(message);
    401            // We use the stack from where finalize() was running to show where it was stuck.
    402            syntheticException.setStackTrace(FinalizerDaemon.INSTANCE.getStackTrace());
    403
    404            // Send SIGQUIT to get native stack traces.
    405            try {
                      //杀死进程
    406                Os.kill(Os.getpid(), OsConstants.SIGQUIT);
    407                // Sleep a few seconds to let the stack traces print.
    408                Thread.sleep(5000);
    409            } catch (Exception e) {
    410                System.logE("failed to send SIGQUIT", e);
    411            } catch (OutOfMemoryError ignored) {
    412                // May occur while trying to allocate the exception.
    413            }
    426            if (Thread.getUncaughtExceptionPreHandler() == null &&
    427                    Thread.getDefaultUncaughtExceptionHandler() == null) {
    428                // If we have no handler, log and exit.
    429                System.logE(message, syntheticException);
    430                System.exit(2);
    431            }
    436            Thread.currentThread().dispatchUncaughtException(syntheticException);
    437        }
    438    }
    

    从Android 5.0开始,每个View都包含了一个或者多个的Finalizer对象,如果页面的View比较多,甚至还有Activity等对象泄漏的话,那么FinalizerDaemon线程需要检测的对象越来越多,负担可想而知,在低端设备上,有可能就会来不及回收而引起性能和稳定性问题,给了FinalizerWatchdogDaemon线程可乘之机.

    四、解决方案

    从第三部分我们知道这个问题的RootCasue是对象回收超时,那么修复的方法基本上有两种,一种是破坏FinalizerWatchdogDaemon线程,使之stop。

      public void fix1() {
            try {
                Class clazz = Class.forName("java.lang.Daemons$FinalizerWatchdogDaemon");
    
                Method method = clazz.getSuperclass().getDeclaredMethod("stop");
                method.setAccessible(true);
    
                Field field = clazz.getDeclaredField("INSTANCE");
                field.setAccessible(true);
    
                method.invoke(field.get(null));
    
            } catch (Throwable e) {
                e.printStackTrace();
            }
        }
    

    这种方案从DEMO中确实可以解决timeout的异常问题,但是风险未知。

    另外想法一种是加长时间到60s,如下

     private static final int NANOS_PER_MILLI = 1000 * 1000;
        private static final int NANOS_PER_SECOND = NANOS_PER_MILLI * 1000;
        private static final long MAX_FINALIZE_NANOS = 10L * NANOS_PER_SECOND;
    
        public void fix2() {
            try {
                Class<?> c = Class.forName("java.lang.Daemons");
                Field maxField = c.getDeclaredField("MAX_FINALIZE_NANOS");
                maxField.setAccessible(true);
                maxField.set(null, MAX_FINALIZE_NANOS * 6);
    
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            } catch (NoSuchFieldException e) {
                e.printStackTrace();
            } catch (IllegalAccessException e) {
                e.printStackTrace();
            }
        }
    

    第二种方案我在做ROM的时候,改过Daemons.java的源码,直接改成了60秒,全机型,未发现风险,这可能也是为什么出来的错误有的是timed out after 30 seconds的原因,但是由于MAX_FINALIZE_NANOS是final的string常量,并不能用反射,这种思路在App中行不通。

    综上,我修复方案是,对于已经国内ROM已经把MAX_FINALIZE_NANOS改变大于30s的的情况,暂时不动。对于仍然是10s的,stop掉FinalizerWatchdogDaemon线程,为了风险可控,加上了云控开关,可以随时控制关闭FinalizerWatchdogDaemon线程这个功能是否打开。

    最后给出个人的几个建议

    1、不到万不得已,不要重写finalize方法

    2、做好内存优化,减少GC的频繁调用

    相关文章

      网友评论

        本文标题:JVM垃圾回收之finalize执行时引起timed out 闪

        本文链接:https://www.haomeiwen.com/subject/hfiaoqtx.html