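Understanding ANR
What is ANR
ANR stands for Application Not Responding. It usually occurs when the main thread performs a time-consuming operation or is blocked, so the app cannot respond to user input such as touch events.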
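Causes
- A time-consuming operation on the main thread, such as I/O
- A Binder call that blocks the main thread
- The main thread blocked waiting on a synchronized lock
- The process cannot get system resources such as CPU time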
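ANR categories
- Activity: onCreate or an Input event does not finish within 5s
- BroadcastReceiver: 10s in the foreground, 60s in the background
- ContentProvider: the publish step times out after 10s
- Service: 20s in the foreground, 200s in the background
Note:
- For Service, Broadcast, and Input ANRs, the call path always ends in AMS.appNotResponding.
- For a provider, an ANR can occur during publish while its process is starting. In that case the system kills the process directly and cleans up the related state, without showing the ANR dialog.
- For an input-event ANR, InputMonitor.notifyANR is called first, and it also ends in AMS.appNotResponding.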
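The limits listed above can be collected into a small lookup, which is handy when writing tooling or tests around them. This is a minimal sketch: the enum AnrTimeout and its method names are hypothetical, the numbers are copied from the list above, and the actual limits may differ between Android versions and vendor builds.

```java
import java.time.Duration;

// Hypothetical lookup table (not an Android API). The values are the ones
// quoted in the list above; they are not read from the platform at runtime.
enum AnrTimeout {
    // The list gives a single limit for input and provider publish,
    // so the same value is used for foreground and background here.
    INPUT_DISPATCH(5, 5),
    BROADCAST(10, 60),
    PROVIDER_PUBLISH(10, 10),
    SERVICE(20, 200);

    private final Duration foreground;
    private final Duration background;

    AnrTimeout(long foregroundSeconds, long backgroundSeconds) {
        this.foreground = Duration.ofSeconds(foregroundSeconds);
        this.background = Duration.ofSeconds(backgroundSeconds);
    }

    // Returns the limit for a foreground or background component.
    Duration limit(boolean isForeground) {
        return isForeground ? foreground : background;
    }

    // True when the elapsed time is strictly greater than the limit.
    boolean exceeded(Duration elapsed, boolean isForeground) {
        return elapsed.compareTo(limit(isForeground)) > 0;
    }
}
```

For example, AnrTimeout.SERVICE.limit(false) yields the 200s background limit, and AnrTimeout.BROADCAST.exceeded(elapsed, true) checks an elapsed time against the 10s foreground limit.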
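Debugging ANR offline
LOG: Cause reason
When an ANR occurs, logcat prints a block of log output similar to the one below.
- First, it gives the name and PID of the process where the ANR occurred, and the component at fault;
- Reason describes the specific cause or category of the ANR;
- CPU usage...ago records CPU usage before the ANR occurred;
- CPU usage...later records CPU usage after the ANR occurred.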
E/ActivityManager: ANR in com.example.testapp (com.example.testapp/.CrashTestActivity)
PID: 2480
Reason: Input dispatching timed out (Waiting because the touched window has not finished processing the input events that were previously delivered to it.)
Load: 0.06 / 0.08 / 0.05
CPU usage from 9865ms to 0ms ago:
0.7% 1558/system_server: 0% user + 0.7% kernel / faults: 39 minor
0.4% 1143/adbd: 0% user + 0.4% kernel / faults: 117 minor
0.4% 1796/com.estrongs.android.pop: 0.2% user + 0.2% kernel / faults: 95 minor
0.2% 1132/surfaceflinger: 0% user + 0.2% kernel
0.1% 1114/kworker/0:1H: 0% user + 0.1% kernel
0.1% 1131/rild: 0% user + 0.1% kernel
0.1% 1682/com.android.phone: 0% user + 0.1% kernel
+0% 2510/logcat: 0% user + 0% kernel
0.8% TOTAL: 0.1% user + 0.7% kernel + 0% iowait
CPU usage from 1098ms to 1603ms later:
2% 1558/system_server: 2% user + 0% kernel
2% 1573/ActivityManager: 2% user + 0% kernel
0% TOTAL: 0% user + 0% kernel
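When many such logs need triaging, the header fields described above (process name, component, PID, Reason) can be extracted mechanically. The following is a minimal sketch, assuming the log has the same line layout as the excerpt above; AnrLogParser and AnrHeader are hypothetical names, not part of any Android API.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the header fields out of an "ANR in ..." logcat
// excerpt. It assumes the line layout of the sample log in this article.
final class AnrLogParser {

    // Simple value holder; the field names are illustrative.
    static final class AnrHeader {
        final String process;   // process name after "ANR in"
        final String component; // text in parentheses, or "" if absent
        final int pid;
        final String reason;

        AnrHeader(String process, String component, int pid, String reason) {
            this.process = process;
            this.component = component;
            this.pid = pid;
            this.reason = reason;
        }
    }

    private static final Pattern ANR_IN =
            Pattern.compile("ANR in (\\S+)(?: \\(([^)]*)\\))?");
    private static final Pattern PID = Pattern.compile("PID: (\\d+)");
    // '.' does not match line terminators, so this captures one line only.
    private static final Pattern REASON = Pattern.compile("Reason: (.*)");

    // Returns the parsed header, or empty if any of the three lines is missing.
    static Optional<AnrHeader> parse(String log) {
        Matcher anr = ANR_IN.matcher(log);
        Matcher pid = PID.matcher(log);
        Matcher reason = REASON.matcher(log);
        if (!anr.find() || !pid.find() || !reason.find()) {
            return Optional.empty();
        }
        String component = anr.group(2) == null ? "" : anr.group(2);
        return Optional.of(new AnrHeader(
                anr.group(1),
                component,
                Integer.parseInt(pid.group(1)),
                reason.group(1).trim()));
    }
}
```

Passing the sample log to AnrLogParser.parse would give the process com.example.testapp, PID 2480, and a reason beginning with "Input dispatching timed out".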
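/data/anr/traces.txt
Every time an ANR occurs, the system writes new data to /data/anr/traces.txt.
- Everything between ----- pid 0000 xxx ----- and ----- end 0000 ----- is the stack information for all threads of process 0000. In general, the process where the ANR occurred appears at the top of the file; the AMS source analysis explains why.
- In "main" prio=5 tid=1 TIMED_WAIT, the fields are, in order, the thread name, thread priority (default 5), thread ID, and thread state. The other threads in the process are printed after the main thread and are not reproduced here.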
----- pid 2480 at 2017-04-06 08:48:58 -----
Cmd line: com.example.testapp
JNI: CheckJNI is on; workarounds are off; pins=0; globals=263
DALVIK THREADS:
(mutexes: tll=0 tsl=0 tscl=0 ghl=0)
"main" prio=5 tid=1 TIMED_WAIT
| group="main" sCount=1 dsCount=0 obj=0xaccdcbd8 self=0xb80204a0
| sysTid=2480 nice=0 sched=0/0 cgrp=[fopen-error:2] handle=-1217093536
| state=S schedstat=( 0 0 0 ) utm=2 stm=1 core=1
at java.lang.VMThread.sleep(Native Method)
at java.lang.Thread.sleep(Thread.java:1013)
at java.lang.Thread.sleep(Thread.java:995)
at com.tencent.bugly.proguard.ag.a(BUGLY:897)
at com.tencent.bugly.crashreport.crash.anr.BuglyTestANR_Reciver.onReceive(BUGLY:30)
at android.app.LoadedApk$ReceiverDispatcher$Args.run(LoadedApk.java:768)
at android.os.Handler.handleCallback(Handler.java:733)
at android.os.Handler.dispatchMessage(Handler.java:95)
at android.os.Looper.loop(Looper.java:136)
at android.app.ActivityThread.main(ActivityThread.java:5017)
at java.lang.reflect.Method.invokeNative(Native Method)
at java.lang.reflect.Method.invoke(Method.java:515)
at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:779)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:595)
at dalvik.system.NativeStart.main(Native Method)
"AnotherThread1" xxx
xxx
xxx
xxx
"AnotherThread2" xxx
xxx
xxx
xxx
----- end 2480 -----
----- pid 1558 at 2017-04-06 08:48:58 -----
Cmd line: another_process_name
...
...
...
----- end 1558 -----
...
...
...
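Since the ANR process usually comes first in the file, a quick way to triage a traces.txt is to read the first "main" thread's state and stack frames. The following is a minimal sketch, assuming the Dalvik-style layout shown above; AnrTraceParser and MainThread are hypothetical names, and other runtime versions may format thread headers differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds the first "main" thread in a traces.txt dump
// and returns its owning pid, its state, and its stack frames.
final class AnrTraceParser {

    static final class MainThread {
        final int pid;
        final String state;
        final List<String> frames;

        MainThread(int pid, String state, List<String> frames) {
            this.pid = pid;
            this.state = state;
            this.frames = frames;
        }
    }

    private static final Pattern PID_HEADER =
            Pattern.compile("^----- pid (\\d+) at .* -----$");
    private static final Pattern THREAD_HEADER =
            Pattern.compile("^\"([^\"]+)\" prio=(\\d+) tid=(\\d+) (\\S+)$");

    static Optional<MainThread> firstMainThread(String traces) {
        int pid = -1;
        String state = null;
        List<String> frames = new ArrayList<>();
        for (String raw : traces.split("\\R")) {
            String line = raw.trim();
            Matcher header = PID_HEADER.matcher(line);
            if (header.matches()) {
                if (state != null) break;          // main thread already read
                pid = Integer.parseInt(header.group(1));
            } else if (line.startsWith("----- end ")) {
                if (state != null) break;          // process block is over
            } else if (line.startsWith("\"")) {
                if (state != null) break;          // next thread starts here
                Matcher thread = THREAD_HEADER.matcher(line);
                if (pid >= 0 && thread.matches() && thread.group(1).equals("main")) {
                    state = thread.group(4);
                }
            } else if (state != null && line.startsWith("at ")) {
                frames.add(line.substring(3));     // drop the leading "at "
            }
        }
        return state == null
                ? Optional.empty()
                : Optional.of(new MainThread(pid, state, frames));
    }
}
```

On the sample dump above, this would report pid 2480 with state TIMED_WAIT, and the first frames point at Thread.sleep called from the Bugly test receiver's onReceive.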
Thread states
[Figure: thread states]
/data/system/dropbox
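traces.txt: keeps only the information from the most recent ANR. Location: /data/anr/traces.txt
DropBox: added in Android 2.2, it keeps the logs of all past ANRs. Location: /data/system/dropbox, with a retention period of 3 days. See ActivityManagerService.addErrorToDropBox() for details.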
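Monitoring ANR in production
Borrowing Bugly's approach: use FileObserver to watch the /data/anr directory for a trace file written for the current process. If one appears, call ActivityManager.getProcessesInErrorState() to obtain information about all processes in the system that are in an error state.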
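The last step, picking the current process out of the error-state list, can be sketched in plain Java so it runs off-device. Everything here is an assumption for illustration: ErrorState is a hypothetical stand-in that mirrors only a few fields of ActivityManager.ProcessErrorStateInfo (pid, condition, processName, shortMsg), and its NOT_RESPONDING constant is a stand-in for the Android one. On a device, the list would come from ActivityManager.getProcessesInErrorState(), called from the FileObserver callback, and the pid from android.os.Process.myPid().

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the matching step only; it does not touch Android
// APIs, so the FileObserver wiring is left out.
final class AnrStateMatcher {

    // Stand-in for ActivityManager.ProcessErrorStateInfo (subset of fields).
    static final class ErrorState {
        static final int NOT_RESPONDING = 2; // stand-in constant

        final int pid;
        final int condition;
        final String processName;
        final String shortMsg;

        ErrorState(int pid, int condition, String processName, String shortMsg) {
            this.pid = pid;
            this.condition = condition;
            this.processName = processName;
            this.shortMsg = shortMsg;
        }
    }

    // Returns the not-responding entry for myPid, if the list contains one.
    static Optional<ErrorState> findOwnAnr(List<ErrorState> states, int myPid) {
        if (states == null) {
            // The real API can return null when no process is in an error state.
            return Optional.empty();
        }
        for (ErrorState s : states) {
            if (s.pid == myPid && s.condition == ErrorState.NOT_RESPONDING) {
                return Optional.of(s);
            }
        }
        return Optional.empty();
    }
}
```

Filtering by pid matters because the list covers every process in an error state, not only the one doing the monitoring.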