Android-Stability【OOM】: Android

Author: 马小藤 | Published 2019-02-20 09:40

    OOM Hprof Capture

    1. The am dumpheap method

    When you need an hprof file from a Java process, the following command captures one:

    adb shell am dumpheap [pid] savePath
    

    1.1 The flow:

    1. The am command side receives the instruction, dispatches it to the corresponding method runDumpHeap, and hands the task to AMS over the matching AIDL interface:

    frameworks/base/services/core/java/com/android/server/am/ActivityManagerShellCommand.java
        int runDumpHeap(PrintWriter pw) throws RemoteException {
            if (!mInterface.dumpHeap(process, userId, managed, mallocInfo, runGc, heapFile, fd)) {
                err.println("HEAP DUMP FAILED on process " + process);
                return -1;
            }
            return 0;
        }
    

    2. AMS's dumpHeap method: after AMS runs its series of permission checks, it dispatches the request to the target process, which does the real hprof dump:

    frameworks/base/services/core/java/com/android/server/am/ActivityManagerService.java
        public boolean dumpHeap(String process, int userId, boolean managed, boolean mallocInfo,
                boolean runGc, String path, ParcelFileDescriptor fd) throws RemoteException {
            try {
                ProcessRecord proc = findProcessLocked(process, userId, "dumpHeap");
                proc.thread.dumpHeap(managed, mallocInfo, runGc, path, fd);
                return true;
            } catch (RemoteException e) {
                // (error handling elided in this excerpt)
            }
        }
    

    3. The target process's ApplicationThread.dumpHeap hands the task to ActivityThread's inner class H, which ends up running handleDumpHeap:

    frameworks/base/core/java/android/app/ActivityThread.java
     static void handleDumpHeap(DumpHeapData dhd) {
        try {
            Debug.dumpHprofData(dhd.path, dhd.fd.getFileDescriptor());
        } catch (IOException e) {
            Slog.w(TAG, "Managed heap dump failed on path " + dhd.path
                + " -- can the process access this path?");
        } finally {}
     }
    

    4. That in turn calls the implementation in Debug.java:

    frameworks/base/core/java/android/os/Debug.java
        /**
         * Like dumpHprofData(String), but takes an already-opened
         * FileDescriptor to which the trace is written.  The file name is also
         * supplied simply for logging.  Makes a dup of the file descriptor.
         *
         * Primarily for use by the "am" shell command.
         *
         * @hide
         */
        public static void dumpHprofData(String fileName, FileDescriptor fd)
                throws IOException {
            VMDebug.dumpHprofData(fileName, fd);
        }
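
    For reference, the fd-based overload above is marked @hide; app or ROM code would normally use the public one-argument overload instead. A minimal usage sketch (the class name and output path are illustrative, not from the article):

    import android.os.Debug;
    import java.io.IOException;

    final class HprofDumpExample {
        // Minimal usage of the public Debug.dumpHprofData(String) overload;
        // the path must be writable by the calling process.
        static void dump() {
            try {
                Debug.dumpHprofData("/data/local/tmp/sample.hprof"); // illustrative path
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }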
    

    5. Which lands in the VM's dump method:

    art/runtime/native/dalvik_system_VMDebug.cc
    /*
     * static void dumpHprofData(String fileName, FileDescriptor fd)
     *
     * Cause "hprof" data to be dumped.  We can throw an IOException if an
     * error occurs during file handling.
     */
    static void VMDebug_dumpHprofData(JNIEnv* env, jclass, jstring javaFilename, jint javaFd) {
      // Only one of these may be null.
      if (javaFilename == nullptr && javaFd < 0) {
        ScopedObjectAccess soa(env);
        ThrowNullPointerException("fileName == null && fd == null");
        return;
      }
    
      std::string filename;
      if (javaFilename != nullptr) {
        ScopedUtfChars chars(env, javaFilename);
        if (env->ExceptionCheck()) {
          return;
        }
        filename = chars.c_str();
      } else {
        filename = "[fd]";
      }
    
      int fd = javaFd;
    
      hprof::DumpHeap(filename.c_str(), fd, false);
    }
    

    6. And finally the dump method in hprof.cc:

    art/runtime/hprof/hprof.cc
    void DumpHeap(const char* filename, int fd, bool direct_to_ddms) {
      CHECK(filename != nullptr);
      Thread* self = Thread::Current();
      // Need to take a heap dump while GC isn't running. See the comment in Heap::VisitObjects().
      // Also we need the critical section to avoid visiting the same object twice. See b/34967844
      gc::ScopedGCCriticalSection gcs(self,
                                      gc::kGcCauseHprof,
                                      gc::kCollectorTypeHprof);
      ScopedSuspendAll ssa(__FUNCTION__, true /* long suspend */);
      Hprof hprof(filename, fd, direct_to_ddms);
      hprof.Dump();  // ultimately writes the dump out via DumpToFile
    }
    

    So it is essentially the target process's own VM that dumps the hprof file. The rough flow:


    (figure: am dumpheap call chain: am shell command → AMS.dumpHeap → ApplicationThread.dumpHeap → ActivityThread.handleDumpHeap → Debug.dumpHprofData → VMDebug_dumpHprofData → hprof::DumpHeap)

    1.2 Capturing the hprof when a Java process OOMs

    ROM developers can call the system API Debug.dumpHprofData() directly wherever a heap dump is needed.
    For example, in RuntimeInit.java we can capture the current process's hprof whenever a Java process throws:

    frameworks/base/core/java/com/android/internal/os/RuntimeInit.java
        private static class LoggingHandler implements Thread.UncaughtExceptionHandler {
            @Override
            public void uncaughtException(Thread t, Throwable e) {
                // SELF-ADD begin
                ApplicationErrorReport.CrashInfo crashInfo = new ApplicationErrorReport.CrashInfo(e);
                if ("java.lang.OutOfMemoryError".equals(crashInfo.exceptionClassName)) {
                    try {
                        Debug.dumpHprofData(outPutFileName); // outPutFileName: dump path chosen by the ROM
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
                // END
            }
        }
    

    The above is of course only a sketch; whether to dump at all should be decided from the concrete scenario, software version, and so on.
    Because the dumpheap runs on the application's main thread, inside that process's VM, the process is blocked for the duration. If the heap file is large, the UI freezes easily, and taps or swipes during the freeze are very likely to pop ANR dialogs on top, which is a terrible user experience.
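
    One way to bound that risk is to gate the dump. A minimal guard sketch, assuming internal builds only plus a free-space check (the class name, threshold, and build-type test are illustrative, not from the article):

    import android.os.Build;
    import android.os.Debug;
    import android.os.StatFs;
    import java.io.IOException;

    final class HprofGuard {
        // Assume a ~512MB heap cap, so require a bit more free space than that.
        private static final long MIN_FREE_BYTES = 600L * 1024 * 1024;

        static void dumpIfSafe(String path, String dir) {
            if (!"userdebug".equals(Build.TYPE) && !"eng".equals(Build.TYPE)) {
                return; // never block user builds on a heap dump
            }
            if (new StatFs(dir).getAvailableBytes() < MIN_FREE_BYTES) {
                return; // not enough space for a full dump
            }
            try {
                Debug.dumpHprofData(path);
            } catch (IOException ignored) {
                // best effort: the process is crashing anyway
            }
        }
    }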

    system_server goes through this same path when it throws an OOM, so the same trick captures system_server's hprof at OOM time.
    Running the shell command also works here; it just crosses a number of processes, whereas this way is more direct.

    2. Capturing the hprof in abnormal cases

    There is also an abnormal case: by the time the OOM fires and you dump the heap, memory has already fallen back to a low value, and the pre-OOM scene is gone.

    Case study:

    The OOMing process was system_server, so the reboot rate spiked: backend monitoring showed this model's reboot probability had jumped more than tenfold, which is serious, yet user complaints were scarce. Stranger still, with nothing changed, two versions later the reboot probability from this issue dropped back by a factor of ten-odd.

    In the server-collected logs the heap climbed within seconds: from 08-10 00:04:44.167 to 08-10 00:04:57.825, a bit over 10 seconds, it shot from tens of MB to a peak of 500+ MB. A log like this is frightening:

    2.1 The log:

    08-09 23:59:31.395  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 7270(1366KB) AllocSpace objects, 84(1680KB) LOS objects, 4% free, 57MB/60MB, paused 5.453ms total 53.036ms
    08-09 23:59:54.735  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 11130(978KB) AllocSpace objects, 39(1300KB) LOS objects, 4% free, 58MB/60MB, paused 5.354ms total 40.800ms
    08-10 00:00:00.226  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 13139(917KB) AllocSpace objects, 18(1336KB) LOS objects, 3% free, 58MB/60MB, paused 5.051ms total 50.740ms
    08-10 00:00:01.685  1193  1202 I art     : Background partial concurrent mark sweep GC freed 54242(8MB) AllocSpace objects, 3(1004KB) LOS objects, 6% free, 61MB/65MB, paused 2.338ms total 196.838ms
    08-10 00:00:02.091  1193  1202 I art     : Background partial concurrent mark sweep GC freed 22138(8MB) AllocSpace objects, 0(0B) LOS objects, 5% free, 63MB/67MB, paused 1.922ms total 242.373ms
    08-10 00:00:03.009  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 4959(3MB) AllocSpace objects, 30(480KB) LOS objects, 5% free, 63MB/67MB, paused 5.672ms total 42.871ms
    08-10 00:00:06.326  5659  5698 I art     : Waiting for a blocking GC DisableMovingGc
    08-10 00:00:06.328  5659  5698 I art     : Starting a blocking GC DisableMovingGc
    08-10 00:00:10.989  3138  3150 I art     : Background sticky concurrent mark sweep GC freed 3252(3MB) AllocSpace objects, 30(480KB) LOS objects, 24% free, 12MB/16MB, paused 5.453ms total 31.291ms
    08-10 00:00:12.127  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 1177(2028KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 66MB/67MB, paused 5.414ms total 31.304ms
    08-10 00:00:13.469  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 345(753KB) AllocSpace objects, 0(0B) LOS objects, 0% free, 66MB/67MB, paused 7.552ms total 22.277ms
    08-10 00:00:13.989  1193  1202 I art     : Background partial concurrent mark sweep GC freed 12567(6MB) AllocSpace objects, 59(1048KB) LOS objects, 5% free, 63MB/67MB, paused 2.038ms total 184.519ms
    08-10 00:00:21.986  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 2471(2MB) AllocSpace objects, 0(0B) LOS objects, 3% free, 64MB/67MB, paused 5.281ms total 31.327ms
    08-10 00:00:28.702  1193  1202 I art     : Background partial concurrent mark sweep GC freed 4937(4MB) AllocSpace objects, 120(2MB) LOS objects, 5% free, 63MB/67MB, paused 3.661ms total 364.700ms
    08-10 00:00:30.352  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 2912(4MB) AllocSpace objects, 0(0B) LOS objects, 5% free, 63MB/67MB, paused 5.093ms total 40.789ms
    08-10 00:00:33.555  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 3201(3MB) AllocSpace objects, 0(0B) LOS objects, 4% free, 64MB/67MB, paused 5.120ms total 41.522ms
    08-10 00:00:35.927  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 4108(2MB) AllocSpace objects, 6(96KB) LOS objects, 4% free, 64MB/67MB, paused 5.201ms total 39.964ms
    08-10 00:00:36.504  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 3715(2MB) AllocSpace objects, 0(0B) LOS objects, 4% free, 64MB/67MB, paused 5.005ms total 39.263ms
    08-10 00:00:37.439  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 2013(2MB) AllocSpace objects, 0(0B) LOS objects, 3% free, 64MB/67MB, paused 5.401ms total 42.217ms
    08-10 00:00:37.742  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 2327(2MB) AllocSpace objects, 0(0B) LOS objects, 3% free, 64MB/67MB, paused 5.479ms total 40.687ms
    08-10 00:00:42.013  2648  3700 I art     : Starting a blocking GC Explicit
    08-10 00:00:42.049  2648  3700 I art     : Explicit concurrent mark sweep GC freed 13530(1605KB) AllocSpace objects, 0(0B) LOS objects, 28% free, 10MB/14MB, paused 393us total 35.471ms
    08-10 00:00:42.053  2648  3700 I art     : Starting a blocking GC Explicit
    08-10 00:00:42.072  2648  3700 I art     : Explicit concurrent mark sweep GC freed 462(16KB) AllocSpace objects, 0(0B) LOS objects, 28% free, 10MB/14MB, paused 376us total 18.540ms
    08-10 00:00:42.315  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 19742(1979KB) AllocSpace objects, 7(156KB) LOS objects, 2% free, 65MB/67MB, paused 5.721ms total 84.216ms
    08-10 00:00:43.328  1193  1202 I art     : Background partial concurrent mark sweep GC freed 60214(3MB) AllocSpace objects, 64(1208KB) LOS objects, 5% free, 63MB/67MB, paused 1.594ms total 216.641ms
    08-10 00:00:52.118  1193  1202 I art     : Background partial concurrent mark sweep GC freed 38624(3MB) AllocSpace objects, 21(420KB) LOS objects, 5% free, 63MB/67MB, paused 1.699ms total 212.617ms
    08-10 00:00:55.946  2004  2014 I art     : Background sticky concurrent mark sweep GC freed 8019(353KB) AllocSpace objects, 0(0B) LOS objects, 0% free, 39MB/39MB, paused 5.213ms total 40.251ms
    08-10 00:02:15.572  1193  1202 I art     : Background partial concurrent mark sweep GC freed 47034(8MB) AllocSpace objects, 53(1364KB) LOS objects, 6% free, 56MB/60MB, paused 1.790ms total 238.573ms
    08-10 00:02:18.896  1193  1202 I art     : Background partial concurrent mark sweep GC freed 52779(3MB) AllocSpace objects, 8(160KB) LOS objects, 6% free, 56MB/60MB, paused 3.178ms total 382.199ms
    08-10 00:04:00.117  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 32710(2MB) AllocSpace objects, 33(1220KB) LOS objects, 6% free, 56MB/60MB, paused 5.415ms total 62.678ms
    08-10 00:04:44.167  1193  1202 I art     : Background partial concurrent mark sweep GC freed 11012(632KB) AllocSpace objects, 8(332KB) LOS objects, 5% free, 67MB/71MB, paused 1.991ms total 241.279ms
    08-10 00:04:44.519  1193  1202 I art     : Background partial concurrent mark sweep GC freed 3652(391KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 77MB/81MB, paused 1.854ms total 228.374ms
    08-10 00:04:44.876  1193  1202 I art     : Background partial concurrent mark sweep GC freed 68(7KB) AllocSpace objects, 0(0B) LOS objects, 4% free, 88MB/92MB, paused 1.921ms total 237.600ms
    08-10 00:04:45.158  1193  1202 I art     : Background partial concurrent mark sweep GC freed 6(192B) AllocSpace objects, 0(0B) LOS objects, 3% free, 99MB/103MB, paused 2.042ms total 176.341ms
    08-10 00:04:45.460  1193  1202 I art     : Background partial concurrent mark sweep GC freed 18(8KB) AllocSpace objects, 0(0B) LOS objects, 3% free, 111MB/115MB, paused 2.190ms total 190.518ms
    08-10 00:04:45.586  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 115MB/115MB, paused 5.780ms total 37.561ms
    08-10 00:04:45.798  1193  1202 I art     : Background partial concurrent mark sweep GC freed 36(1648B) AllocSpace objects, 0(0B) LOS objects, 3% free, 122MB/126MB, paused 2.088ms total 211.466ms
    08-10 00:04:45.931  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 226(18KB) AllocSpace objects, 0(0B) LOS objects, 0% free, 126MB/126MB, paused 11.727ms total 44.764ms
    08-10 00:04:46.123  1193  1202 I art     : Background partial concurrent mark sweep GC freed 453(24KB) AllocSpace objects, 0(0B) LOS objects, 2% free, 134MB/138MB, paused 2.029ms total 190.864ms
    08-10 00:04:46.395  1193  1202 I art     : Background partial concurrent mark sweep GC freed 134(7KB) AllocSpace objects, 0(0B) LOS objects, 2% free, 145MB/149MB, paused 1.995ms total 169.401ms
    08-10 00:04:46.770  1193  1202 I art     : Background partial concurrent mark sweep GC freed 109(5KB) AllocSpace objects, 0(0B) LOS objects, 2% free, 156MB/160MB, paused 2.027ms total 253.147ms
    08-10 00:04:47.147  1193  1202 I art     : Background partial concurrent mark sweep GC freed 215(22KB) AllocSpace objects, 0(0B) LOS objects, 2% free, 171MB/175MB, paused 1.984ms total 252.140ms
    08-10 00:04:47.557  1193  1202 I art     : Background partial concurrent mark sweep GC freed 515(65KB) AllocSpace objects, 0(0B) LOS objects, 2% free, 182MB/186MB, paused 2.035ms total 294.901ms
    08-10 00:04:47.897  1193  1202 I art     : Background partial concurrent mark sweep GC freed 358(31KB) AllocSpace objects, 1(20KB) LOS objects, 2% free, 191MB/195MB, paused 2.134ms total 224.444ms
    08-10 00:04:48.191  1193  1202 I art     : Background partial concurrent mark sweep GC freed 621(63KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 203MB/207MB, paused 2.055ms total 187.326ms
    08-10 00:04:48.514  1193  1202 I art     : Background partial concurrent mark sweep GC freed 110(8KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 215MB/219MB, paused 2.196ms total 215.961ms
    08-10 00:04:48.790  1193  1202 I art     : Background partial concurrent mark sweep GC freed 6(192B) AllocSpace objects, 0(0B) LOS objects, 1% free, 226MB/230MB, paused 2.149ms total 172.089ms
    08-10 00:04:49.090  1193  1202 I art     : Background partial concurrent mark sweep GC freed 6(192B) AllocSpace objects, 0(0B) LOS objects, 1% free, 237MB/241MB, paused 2.228ms total 173.851ms
    08-10 00:04:49.375  1193  1202 I art     : Background partial concurrent mark sweep GC freed 104(5KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 248MB/252MB, paused 2.253ms total 173.835ms
    08-10 00:04:49.673  1193  1202 I art     : Background partial concurrent mark sweep GC freed 13(448B) AllocSpace objects, 0(0B) LOS objects, 1% free, 260MB/264MB, paused 2.259ms total 193.370ms
    08-10 00:04:49.970  1193  1202 I art     : Background partial concurrent mark sweep GC freed 1104(87KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 271MB/275MB, paused 2.486ms total 186.926ms
    08-10 00:04:50.291  1193  1202 I art     : Background partial concurrent mark sweep GC freed 259(39KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 283MB/287MB, paused 2.229ms total 207.021ms
    08-10 00:04:50.596  1193  1202 I art     : Background partial concurrent mark sweep GC freed 35(2048B) AllocSpace objects, 0(0B) LOS objects, 1% free, 294MB/298MB, paused 3.200ms total 192.377ms
    08-10 00:04:50.709  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 70(3KB) AllocSpace objects, 0(0B) LOS objects, 0% free, 299MB/299MB, paused 5.388ms total 26.256ms
    08-10 00:04:50.898  1193  1202 I art     : Background partial concurrent mark sweep GC freed 1338(64KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 306MB/310MB, paused 2.209ms total 188.606ms
    08-10 00:04:51.200  1193  1202 I art     : Background partial concurrent mark sweep GC freed 331(17KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 317MB/321MB, paused 2.329ms total 184.804ms
    08-10 00:04:51.323  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 1592(139KB) AllocSpace objects, 0(0B) LOS objects, 0% free, 322MB/322MB, paused 8.261ms total 36.921ms
    08-10 00:04:51.520  1193  1202 I art     : Background partial concurrent mark sweep GC freed 1960(161KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 329MB/333MB, paused 2.538ms total 196.193ms
    08-10 00:04:51.883  1193  1202 I art     : Background partial concurrent mark sweep GC freed 464(18KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 344MB/348MB, paused 2.162ms total 254.264ms
    08-10 00:04:52.207  1193  1202 I art     : Background partial concurrent mark sweep GC freed 341(19KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 354MB/358MB, paused 2.173ms total 217.485ms
    08-10 00:04:52.579  1193  1202 I art     : Background partial concurrent mark sweep GC freed 406(16KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 367MB/371MB, paused 2.258ms total 262.958ms
    08-10 00:04:53.007  1193  1202 I art     : Background partial concurrent mark sweep GC freed 102(4KB) AllocSpace objects, 0(0B) LOS objects, 1% free, 378MB/382MB, paused 2.208ms total 310.370ms
    08-10 00:04:53.327  1193  1202 I art     : Background partial concurrent mark sweep GC freed 38(1424B) AllocSpace objects, 0(0B) LOS objects, 1% free, 388MB/392MB, paused 2.212ms total 210.770ms
    08-10 00:04:53.767  1193  1202 I art     : Background partial concurrent mark sweep GC freed 6(192B) AllocSpace objects, 0(0B) LOS objects, 0% free, 402MB/406MB, paused 2.285ms total 321.039ms
    08-10 00:04:53.897  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 406MB/406MB, paused 5.966ms total 41.389ms
    08-10 00:04:54.129  1193  1202 I art     : Background partial concurrent mark sweep GC freed 6(192B) AllocSpace objects, 0(0B) LOS objects, 0% free, 413MB/417MB, paused 2.322ms total 230.110ms
    08-10 00:04:54.261  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 417MB/417MB, paused 7.865ms total 43.687ms
    08-10 00:04:54.511  1193  1202 I art     : Background partial concurrent mark sweep GC freed 298(21KB) AllocSpace objects, 0(0B) LOS objects, 0% free, 424MB/428MB, paused 2.090ms total 249.911ms
    08-10 00:04:54.887  1193  1202 I art     : Background partial concurrent mark sweep GC freed 48(2624B) AllocSpace objects, 0(0B) LOS objects, 0% free, 435MB/439MB, paused 2.242ms total 258.446ms
    08-10 00:04:55.017  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 10(752B) AllocSpace objects, 0(0B) LOS objects, 0% free, 440MB/440MB, paused 5.978ms total 40.969ms
    08-10 00:04:55.221  1193  1202 I art     : Background partial concurrent mark sweep GC freed 6(192B) AllocSpace objects, 0(0B) LOS objects, 0% free, 447MB/451MB, paused 2.312ms total 203.380ms
    08-10 00:04:55.366  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 451MB/451MB, paused 6.244ms total 37.508ms
    08-10 00:04:55.627  1193  1202 I art     : Background partial concurrent mark sweep GC freed 163(8KB) AllocSpace objects, 0(0B) LOS objects, 0% free, 459MB/463MB, paused 2.290ms total 259.702ms
    08-10 00:04:55.759  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 464MB/464MB, paused 7.253ms total 44.006ms
    08-10 00:04:56.123  1193  1202 I art     : Background partial concurrent mark sweep GC freed 19(592B) AllocSpace objects, 0(0B) LOS objects, 0% free, 471MB/475MB, paused 5.091ms total 363.359ms
    08-10 00:04:56.320  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 475MB/475MB, paused 5.201ms total 28.116ms
    08-10 00:04:56.620  1193  1202 I art     : Background partial concurrent mark sweep GC freed 68(4KB) AllocSpace objects, 0(0B) LOS objects, 0% free, 483MB/487MB, paused 4.594ms total 299.701ms
    08-10 00:04:56.776  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 486MB/487MB, paused 5.572ms total 27.013ms
    08-10 00:04:56.984  1193  1202 I art     : Background partial concurrent mark sweep GC freed 19(60KB) AllocSpace objects, 1(20KB) LOS objects, 0% free, 494MB/498MB, paused 2.273ms total 207.931ms
    08-10 00:04:57.117  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 1(16B) AllocSpace objects, 0(0B) LOS objects, 0% free, 498MB/498MB, paused 6.765ms total 44.880ms
    08-10 00:04:57.335  1193  1202 I art     : Background partial concurrent mark sweep GC freed 20(704B) AllocSpace objects, 0(0B) LOS objects, 0% free, 505MB/509MB, paused 3.179ms total 217.057ms
    08-10 00:04:57.467  1193  1202 I art     : Background sticky concurrent mark sweep GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 509MB/509MB, paused 8.363ms total 36.710ms
    08-10 00:04:57.628  1193  2617 I art     : Waiting for a blocking GC Alloc
    08-10 00:04:57.825  1193  1202 I art     : Clamp target GC heap from 515MB to 512MB
    08-10 00:04:57.825  1193  1202 I art     : Background partial concurrent mark sweep GC freed 87(6KB) AllocSpace objects, 0(0B) LOS objects, 0% free, 511MB/512MB, paused 3.637ms total 357.088ms
    08-10 00:04:57.826  1193  2617 I art     : Starting a blocking GC Alloc
    08-10 00:04:57.826  1193  2617 I art     : Starting a blocking GC Alloc
    08-10 00:04:57.854  1193  2617 I art     : Alloc sticky concurrent mark sweep GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 511MB/512MB, paused 3.953ms total 27.850ms
    08-10 00:04:57.854  1193  2617 I art     : Starting a blocking GC Alloc
    08-10 00:04:57.859  1193  2601 I art     : Waiting for a blocking GC Alloc
    08-10 00:04:58.278  1193  2617 I art     : Clamp target GC heap from 515MB to 512MB
    08-10 00:04:58.279  1193  2617 I art     : Alloc partial concurrent mark sweep GC freed 88(5KB) AllocSpace objects, 0(0B) LOS objects, 0% free, 511MB/512MB, paused 5.126ms total 424.244ms
    08-10 00:04:58.279  1193  2617 I art     : Starting a blocking GC Alloc
    08-10 00:04:58.279  1193  2601 I art     : Waiting for a blocking GC Alloc
    08-10 00:04:58.634  1193  2617 I art     : Clamp target GC heap from 515MB to 512MB
    08-10 00:04:58.634  1193  2617 I art     : Alloc concurrent mark sweep GC freed 6(208B) AllocSpace objects, 0(0B) LOS objects, 0% free, 511MB/512MB, paused 3.198ms total 354.919ms
    ...
    

    2.2 Where the log is printed:

    art/runtime/gc/heap.cc
    void Heap::LogGC(GcCause gc_cause, collector::GarbageCollector* collector) {
      const size_t duration = GetCurrentGcIteration()->GetDurationNs();
      const std::vector<uint64_t>& pause_times = GetCurrentGcIteration()->GetPauseTimes();
      // Print the GC if it is an explicit GC (e.g. Runtime.gc()) or a slow GC
      // (mutator time blocked >= long_pause_log_threshold_).
      bool log_gc = kLogAllGCs || gc_cause == kGcCauseExplicit;
      if (!log_gc && CareAboutPauseTimes()) {
        // GC for alloc pauses the allocating thread, so consider it as a pause.
        log_gc = duration > long_gc_log_threshold_ ||
            (gc_cause == kGcCauseForAlloc && duration > long_pause_log_threshold_);
        for (uint64_t pause : pause_times) {
          log_gc = log_gc || pause >= long_pause_log_threshold_;
        }
      }
      if (log_gc) {
        const size_t percent_free = GetPercentFree();
        const size_t current_heap_size = GetBytesAllocated();
        const size_t total_memory = GetTotalMemory();
        std::ostringstream pause_string;
        for (size_t i = 0; i < pause_times.size(); ++i) {
          pause_string << PrettyDuration((pause_times[i] / 1000) * 1000)
                       << ((i != pause_times.size() - 1) ? "," : "");
        }
        LOG(INFO) << gc_cause << " " << collector->GetName()
                  << " GC freed "  << current_gc_iteration_.GetFreedObjects() << "("
                  << PrettySize(current_gc_iteration_.GetFreedBytes()) << ") AllocSpace objects, "
                  << current_gc_iteration_.GetFreedLargeObjects() << "("
                  << PrettySize(current_gc_iteration_.GetFreedLargeObjectBytes()) << ") LOS objects, "
                  << percent_free << "% free, " << PrettySize(current_heap_size) << "/"
                  << PrettySize(total_memory) << ", " << "paused " << pause_string.str()
                  << " total " << PrettyDuration((duration / 1000) * 1000);
        VLOG(heap) << Dumpable<TimingLogger>(*current_gc_iteration_.GetTimings());
      }
    }
    

    The code makes each field's meaning clear; current_heap_size is the heap the process currently has allocated, and indeed it climbed within a very short window.
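
    As an aside, these lines are easy to mine offline. A quick sketch that pulls the allocated/total heap field out of each line, assuming exactly the MB/MB format shown above (sizes can also print as KB or GB, which this toy regex ignores):

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class GcLogHeapSize {
        // Matches the "<allocated>MB/<total>MB" field of an ART GC log line.
        private static final Pattern HEAP = Pattern.compile("(\\d+)MB/(\\d+)MB");

        public static void main(String[] args) {
            String line = "08-10 00:04:57.825  1193  1202 I art     : Background partial "
                    + "concurrent mark sweep GC freed 87(6KB) AllocSpace objects, 0(0B) "
                    + "LOS objects, 0% free, 511MB/512MB, paused 3.637ms total 357.088ms";
            Matcher m = HEAP.matcher(line);
            if (m.find()) {
                System.out.println("allocated=" + m.group(1) + "MB total=" + m.group(2) + "MB");
            }
        }
    }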

    Next: why the sudden growth? That calls for a system_server hprof, to pin down which object is oversized and where all that memory is suddenly being allocated.

    So we used the RuntimeInit.java hook above to capture system_server's hprof when it threw the OOM. But every file phones uploaded to the server was only a few tens of MB; the heap had already returned to normal.

    Presumably the big objects get cleaned up again before system_server actually crashes with the OOM. Headache: not a shadow of the culprit to be caught.

    2.3 Reproducing locally and capturing the hprof:

    So, on to local reproduction: run monkey and keep dumping system_server's hprof from a script.

    We built a small native binary to monitor system_server's heap size. The idea was simply to capture a large system_server hprof on the running phone so we could find the big object. (It doesn't have to be this elaborate: a Python script on a connected host, polling and dumping over adb, works too.) But monkey never reproduced the issue, so the plan became using the phone as a daily driver, which means the monitor has to run on the device itself. The source:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/time.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <string.h>
    #include <errno.h>
    #include <time.h>
    #include <assert.h>
    #include <dirent.h>
    #define BUF_SIZE 1024
    
    void modifyMyAdj()
    {
        char text[100];
        sprintf(text, "/proc/%d/oom_score_adj", getpid());
        int fd = open(text, O_WRONLY);
        if (fd >= 0) {
            sprintf(text, "%d", -1000);  // pin our adj so we never get killed
            write(fd, text, strlen(text));
            printf("modify adj success!!\n");
            close(fd);
        } else {
            printf("modify adj fail!!\n");
        }
    }
    
    char* getPidByName(char* task_name, char* pid) {
        DIR *dir;
        struct dirent *ptr;
        FILE *fp;
        char filepath[50];      // big enough for a /proc/<pid>/status path
        char cur_task_name[50]; // big enough for the task name being matched
        char buf[BUF_SIZE];
        dir = opendir("/proc"); // walk every entry under /proc
        if (NULL != dir) {
            while ((ptr = readdir(dir)) != NULL) {
                // Skip "." and "..", and anything that is not a directory.
                if ((strcmp(ptr->d_name, ".") == 0) || (strcmp(ptr->d_name, "..") == 0))
                    continue;
                if (DT_DIR != ptr->d_type)
                    continue;

                sprintf(filepath, "/proc/%s/status", ptr->d_name); // build the status path
                fp = fopen(filepath, "r");
                if (NULL != fp) {
                    if (fgets(buf, BUF_SIZE-1, fp) == NULL) {
                        fclose(fp);
                        continue;
                    }
                    sscanf(buf, "%*s %s", cur_task_name);

                    // The first line of status is "Name: <task>"; on a match,
                    // the directory name is the PID we are after.
                    if (!strcmp(task_name, cur_task_name)) {
                        printf("PID:  %s\n", ptr->d_name);
                        strcpy(pid, ptr->d_name);
                    }
                    fclose(fp);
                }
            }
            closedir(dir);
        }
        return pid;
    }
    
    void getNameByPid(pid_t pid, char *task_name) {
        char proc_pid_path[BUF_SIZE];
        char buf[BUF_SIZE];
        sprintf(proc_pid_path, "/proc/%d/status", pid);
        FILE* fp = fopen(proc_pid_path, "r");
        if (NULL != fp) {
            if (fgets(buf, BUF_SIZE-1, fp) == NULL) {
                fclose(fp);
                return;
            }
            fclose(fp);
            sscanf(buf, "%*s %s", task_name);
        }
    }
    
    unsigned getRssByPid(pid_t pid) {
        char proc_pid_stat[BUF_SIZE];
        char buf[BUF_SIZE];
        sprintf(proc_pid_stat, "/proc/%d/stat", pid);
        FILE* fp = fopen(proc_pid_stat, "r");
        char *temp;
        int i = 0;

        if (NULL != fp) {
            if (fgets(buf, BUF_SIZE-1, fp) == NULL) {
                fclose(fp);
                return 0;
            }
            fclose(fp);
            temp = strtok(buf, " ");
            while (temp) {
                i++;
                printf("%d :%s \n", i, temp);
                if (i == 24) { // field 24 of /proc/<pid>/stat is rss, in pages
                    return strtoul(temp, 0, 10) * 4; // 4KB pages -> KB
                }
                temp = strtok(NULL, " ");
            }
        }
        return 0;
    }
    
    #define RSS_STEP 50*1024 //50M
    #define HPROF_FILE_DIR  "/data/local/tmp/"
    int main(int argc, char *argv[]) {
        modifyMyAdj();
    
        unsigned rss_threshold = 200 * 1024;
        char task_name[50];
        char *pid = (char*)calloc(10,sizeof(char));
        strcpy(task_name, "system_server");
        getPidByName(task_name, pid);
        char target_file[128]; 
        char dumpheap_cmd[256];
        system("mkdir -p " HPROF_FILE_DIR);
        printf("OOMMonitor start monitor system_server rss!\n");
        while (true ){          
            unsigned rss = getRssByPid(atoi(pid));
            printf("OOMMonitor unsigned rss = %u !\n",rss);
            if (rss==0)
            {
                printf("OOMMonitor Error read system_server pid , system_server may reboot!\n");
                getPidByName(task_name, pid);
                continue;
            }
    
            if (rss>rss_threshold){
                sprintf(target_file, "%sss_oom_%u.hprof",HPROF_FILE_DIR,rss);
                sprintf(dumpheap_cmd, "am dumpheap %s %s",pid,target_file);
                system(dumpheap_cmd);
                printf("OOMMonitor dumpheap finished: %s\n",target_file);
                rss_threshold += RSS_STEP;
            }
    
            if (rss>500 * 1024)
            {
                printf("%s\n","OOMMonitor system_server oom!");
                sprintf(target_file, "%sss_oom_%u.hprof",HPROF_FILE_DIR,rss);
                sprintf(dumpheap_cmd, "am dumpheap %s %s",pid,target_file);
                system(dumpheap_cmd);
                system("bugreport > " HPROF_FILE_DIR "bug.txt");
                system("am broadcast -a com.XXX.bugreport.service.action.CONFIRM_DIALOG");
                break;
            }
            sleep(2);
        }
        return 0;
    }
    

    The .mk file, used to build the binary, OOMMonitor, in the source tree:

    LOCAL_PATH:= $(call my-dir)
    include $(CLEAR_VARS)
    LOCAL_SRC_FILES:= \
        monitor.cpp
    LOCAL_MODULE:= OOMMonitor
    include $(BUILD_EXECUTABLE)
    

    Push it to /system/bin/, give OOMMonitor execute permission, and run it. You can also edit an rc file so it starts automatically at boot.

    adb shell OOMMonitor &
    

    Then run monkey. A high-probability OOM usually reproduces quickly this way, but this one would not, no matter how long we ran, and we only had two or three local phones.

    So in the end it came back to capturing the hprof from users' devices.

    2.4 Capturing the hprof online, from users' devices:

    Since the big objects are gone by the time the OOM crash happens, dump system_server's heap while it is still growing; then it can't dodge us.

    The plan:
    Capture the hprof by running a shell script.
    Put system_server's pid into a system property, so the script can read the property value to get the pid.

    #!/system/bin/sh
    ss_pid=`getprop sys.XXX.dumpheap`
    mkdir -p /data/local/hprof/exceed/
    am dumpheap $ss_pid /data/local/hprof/exceed/system.hprof
    echo "oom sh dumpheap done!"
    chown -R system:system /data/local/hprof/
    

    Change the rc file so the shell script runs as soon as the property is set:

    on property:sys.XXX.dumpheap=*
        exec u:r:dumpstate:s0 -- /system/bin/sh /init.XXX.dumpheap.sh
    

    The property is set right at the GC-log site shown above: when current_heap_size exceeds 450MB, set the property, which kicks off the shell script and captures the hprof:

    diff --git a/runtime/gc/heap.cc b/runtime/gc/heap.cc
    index 82c160f..053f459 100644
    --- a/runtime/gc/heap.cc
    +++ b/runtime/gc/heap.cc
    @@ -23,6 +23,10 @@
     #include <memory>
     #include <unwind.h>  // For GC verification.
     #include <vector>
    +// XXX Add : Start
    +#include <sys/types.h>
    +#include <unistd.h>
    +// End
     
     #include "art_field-inl.h"
     #include "base/allocator.h"
    @@ -74,6 +78,12 @@
     #include "handle_scope-inl.h"
     #include "thread_list.h"
     #include "well_known_classes.h"
    +
    +#ifdef HAVE_ANDROID_OS
    +extern "C" {
    +  int property_set(const char* name, const char* value);
    +}
    +#endif
     
     namespace art {
     
    @@ -215,7 +225,8 @@
                                             kGcCountRateMaxBucketCount),
           backtrace_lock_(nullptr),
           seen_backtrace_count_(0u),
    -      unique_backtrace_count_(0u) {
    +      unique_backtrace_count_(0u),
    +      need_dump_heap_(0u) {// XXX Add
       if (VLOG_IS_ON(heap) || VLOG_IS_ON(startup)) {
         LOG(INFO) << "Heap() entering";
       }
    @@ -2522,6 +2533,9 @@
                   << percent_free << "% free, " << PrettySize(current_heap_size) << "/"
                   << PrettySize(total_memory) << ", " << "paused " << pause_string.str()
                   << " total " << PrettyDuration((duration / 1000) * 1000);
    +    // XXX Add : Start
    +    CheckConcurrentHeapSize(current_heap_size);
    +    // End
         VLOG(heap) << Dumpable<TimingLogger>(*current_gc_iteration_.GetTimings());
       }
     }
    @@ -3776,6 +3790,30 @@
         }
       }
     }
    -
    +// XXX Add : Start
    +#ifdef HAVE_ANDROID_OS
    +void Heap::CheckConcurrentHeapSize(size_t current_heap_size) {
    +    if(current_heap_size < 450 *1024 *1024) {
    +        return;
    +    }
    +    if(!need_dump_heap_.CompareExchangeStrongSequentiallyConsistent(0,1)) {
    +        return;
    +    }
    +    std::string current_cmd_line;
    +    std::string  propertyName = "sys.XXX.dumpheap";
    +    if(ReadFileToString("/proc/self/cmdline", &current_cmd_line)) {
    +        int ret = strcmp(current_cmd_line.c_str(), "system_server");
    +        if(ret == 0 ) {
    +            ret = property_set(propertyName.c_str(),std::to_string(getpid()).c_str());
    +            LOG(WARNING) << "system_server oomevent ,set prop ret: " <<  ret;
    +        }
    +    }
    +}
    +#else
    +void Heap::CheckConcurrentHeapSize(__attribute__((unused)) size_t new_num_bytes_allocated) {
    +    return;
    +}
    +#endif
    +// End
     }  // namespace gc
     }  // namespace art
    diff --git a/runtime/gc/heap.h b/runtime/gc/heap.h
    index 026795c..75c97c7 100644
    --- a/runtime/gc/heap.h
    +++ b/runtime/gc/heap.h
    @@ -891,6 +891,9 @@
       // GC stress mode attempts to do one GC per unique backtrace.
       void CheckGcStressMode(Thread* self, mirror::Object** obj)
           SHARED_LOCKS_REQUIRED(Locks::mutator_lock_);
    +  // XXX Add : Start
    +  void CheckConcurrentHeapSize(size_t current_heapsize);
    +  // End
     
       // All-known continuous spaces, where objects lie within fixed bounds.
       std::vector<space::ContinuousSpace*> continuous_spaces_;
    @@ -1205,7 +1208,9 @@
       Atomic<uint64_t> unique_backtrace_count_;
       // Stack trace hashes that we already saw,
       std::unordered_set<uint64_t> seen_backtraces_ GUARDED_BY(backtrace_lock_);
    -
    +  // XXX Add : Start
    +  AtomicInteger need_dump_heap_;
    +  // End
       friend class CollectorTransitionTask;
       friend class collector::GarbageCollector;
       friend class collector::MarkCompact;
    

    The log and hprof then come back through our in-house log collection system; the hprof is compressed before upload. All of this sits on top of the existing log-collection pipeline. The hprof is close to 500MB and compresses down to roughly 30-50MB.
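
    For a sense of that compress-before-upload step, a minimal sketch assuming plain gzip (the in-house pipeline may use a different codec; class name and paths are illustrative):

    import java.io.*;
    import java.util.zip.GZIPOutputStream;

    final class HprofCompressor {
        // Stream-copy the hprof through gzip; hprof files are highly redundant,
        // which is why ~500MB shrinks to a few tens of MB.
        static void gzip(File src, File dst) throws IOException {
            byte[] buf = new byte[64 * 1024];
            try (InputStream in = new BufferedInputStream(new FileInputStream(src));
                 OutputStream out = new GZIPOutputStream(new FileOutputStream(dst))) {
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            }
        }
    }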

    It should be possible to call hprof::DumpHeap() directly at the ART logging site and skip this multi-process detour. But when I tried the direct dump, the system sometimes locked up, and a change inside the VM has a large blast radius, so I settled on the approach above: a little roundabout, but complete and mature.

    Because dumping a large hprof hurts user experience, we shipped the change in a single release: phones that took that OTA would capture and upload the hprof when the problem hit, and the next release dropped the change, confining the impact to that one version.

    And this time, finally, we caught it...

    2.5 Analyzing the OOM

    With the captured hprof the problem could finally be located:


    (figure: hprof analysis: long[] data held by ProcessStats dominates the heap)

    In the hprof, the long[] data of the ProcessStats object accounts for 400+ MB. A huge share, spread over an enormous number of arrays, every one of them 32768 bytes. Where does this thing come from?

    The abnormal logcat contains some related errors:

    09-08 22:14:53.058 1169 2675 E ProcessStatsService: java.lang.OutOfMemoryError: Failed to allocate a 32784 byte allocation with 16256 free bytes and 11KB until OOM
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at com.android.internal.app.ProcessStats.readFromParcel(ProcessStats.java:1751)
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at com.android.internal.app.ProcessStats.read(ProcessStats.java:1696)
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at com.android.server.am.ProcessStatsService.readLocked(ProcessStatsService.java:313)
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at com.android.server.am.ProcessStatsService.getStatsOverTime(ProcessStatsService.java:530)
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at com.android.server.am.ProcessStatsService.dumpAggregatedStats(ProcessStatsService.java:583)
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at com.android.server.am.ProcessStatsService.dumpInner(ProcessStatsService.java:909)
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at com.android.server.am.ProcessStatsService.dump(ProcessStatsService.java:651)
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at android.os.Binder.dump(Binder.java:331)
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at android.os.Binder.onTransact(Binder.java:297)
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at com.android.internal.app.IProcessStats$Stub.onTransact(IProcessStats.java:81)
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at com.android.server.am.ProcessStatsService.onTransact(ProcessStatsService.java:120)
    09-08 22:14:53.058 1169 2675 E ProcessStatsService: at android.os.Binder.execTransact(Binder.java:460) 
    

    That stack is the dumpsys procstats service path. Searching the whole codebase, only our own performance-monitoring app periodically dumps the procstats service to collect process run data.

    The uploaded logs show the binder transaction info for the thread in the stack (1169:2675); the call came in from process 31045:

    proc 31045
    thread 31045: l 10
    outgoing transaction 9933905: 0000000000000000 from 31045:31045 to 1169:2675 code 5f444d50 flags 10 pri 0 r1 node 4001 size 72:8 data 0000000000000000
    

    And the ps output shows the caller is ultimately our com.XXX.XXXdaemon performance monitor. It collects performance data at night, unnoticed by users, which is exactly why user complaints were rare.

    ------ ps(-t,-p,-P) ------
    USER PID PPID VSIZE RSS PRIO NICE RTPRI SCHED PCY WCHAN PC NAME
    system    31045 2610  16616  1736  20    0     0     0     fg  binder_thr 0000000000 S dumpsys // parent is 2610
    
    system    2610  533   1685660 50384 20    0     0     0     fg  SyS_epoll_ 0000000000 S com.XXX.XXXdaemon // 2610 is our performance monitor
    

    But the monitor's call frequency is low, once a day. Looking into how dumpsys procstats actually works: over the lifecycle of app processes, AMS continually writes process run data out under /data/system/procstats/:


    (figure: AMS periodically committing process stats to files under /data/system/procstats/)

    A dumpsys procstats is then just the process of reading the data back from that directory.

    Checking the code: this code exists only on M devices; it is gone on releases after M:

    frameworks/base/core/java/com/android/internal/app/ProcessStats.java
    final int NLONGS = in.readInt();
    final int NEXTLONG = in.readInt();
    mLongs.clear();
    // NLONGS is not printed in the log, so we cannot tell how large it was,
    // but it must be huge for the new long[] allocations below to OOM.
    for (int i=0; i<(NLONGS-1); i++) {
        while (i >= mLongs.size()) {
            mLongs.add(new long[LONGS_SIZE]);
        }
        readCompactedLongArray(in, version, mLongs.get(i), LONGS_SIZE);
    }
    long[] longs = new long[LONGS_SIZE];
    mNextLong = NEXTLONG;
    // The log does show NEXTLONG being too large and throwing, failing the
    // procstats dump; dump() try/catches that, so system_server survives it.
    // The OOM comes from the sheer number of new long[LONGS_SIZE] allocations.
    readCompactedLongArray(in, version, longs, NEXTLONG);
    

    LONGS_SIZE here is 4096 and a long is 8 bytes, so each element of mLongs is a long[LONGS_SIZE] of exactly 4096 × 8 = 32768 bytes (the 32784-byte failed allocation in the crash log is presumably that payload plus the array object header).
    So presumably a file under /data/system/procstats/ was corrupt. Back in the hprof, mLongs's size is indeed enormous:
    hprof1:


    (figure: hprof 1: mLongs with an abnormally large size)

    hprof2:


    (figure: hprof 2: the same oversized mLongs)
    NLONGS comes from the size of the ProcessStats member mLongs, and judging by the hprof it must be a very large value. With a value that big, the for loop above allocates away and the heap rockets up in no time. That is the root cause.

    The files under /data/system/procstats are continually rewritten, so once the abnormal file got overwritten the OOM stopped appearing. No wonder the OOM probability dropped back again two versions later.

    Reading the code, we could not find what produced the abnormal procstats write, and by then the OOM probability had already dropped because the bad file had been rewritten. So we took a workaround to survive any future bad file: sanity-check NLONGS, and if it is too large, skip the allocation and give up on that dumpsys procstats pass.

            if (NLONGS > 4000) {
                Slog.i(TAG, "Record size may be too large, discard to avoid OOM! NLONGS: " + NLONGS);
                mReadError = "bad NLONGS count: " + NLONGS;
                return;
            }
    

    3. Takeaways from OOM analysis

    Analyzing an OOM is mostly about finding the leak. At OOM time, whatever other process or thread happens to be allocating easily takes the bullet, so the crashing stack is usually an innocent bystander, not the root cause. The single best weapon is the hprof file: capture one at the scene and you are already most of the way there.

    1. Easy to reproduce locally:
      Use a script to watch the process's memory and dump by hand, or automatically, once it crosses a threshold; a Python or shell script both work.
      For apps, LeakCanary makes reproduction simple and clear. You can also install your own UncaughtExceptionHandler and dump an hprof before the exception propagates, as sketched below.
    2. Hard to reproduce locally:
      Capture online: at the spot worth dumping, either call the Debug API directly or drive a property-triggered script to monitor and capture, then upload for analysis.
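
    For point 1, a minimal app-side sketch of the handler approach (class name and path handling are illustrative, not from the article): dump only on OutOfMemoryError, then hand off to the previous handler so the normal crash flow still runs.

    import android.os.Debug;
    import java.io.IOException;

    final class OomHprofDumper implements Thread.UncaughtExceptionHandler {
        private final Thread.UncaughtExceptionHandler previous;
        private final String hprofPath;

        private OomHprofDumper(Thread.UncaughtExceptionHandler previous, String hprofPath) {
            this.previous = previous;
            this.hprofPath = hprofPath;
        }

        // Call once, e.g. from Application.onCreate(), with a writable path.
        static void install(String hprofPath) {
            Thread.setDefaultUncaughtExceptionHandler(
                    new OomHprofDumper(Thread.getDefaultUncaughtExceptionHandler(), hprofPath));
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            if (e instanceof OutOfMemoryError) {
                try {
                    Debug.dumpHprofData(hprofPath); // blocks this thread while dumping
                } catch (IOException io) {
                    io.printStackTrace();
                }
            }
            if (previous != null) {
                previous.uncaughtException(t, e); // keep the normal crash handling
            }
        }
    }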
