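What is ANR
ANR (Application Not Responding) means that an application has failed to respond, within a prescribed time, to user input or to a request from another application or a system service.
Scenarios in which ANR occurs
- Service timeout
A Service ANR generally arises as follows: AMS (ActivityManagerService) issues a Service command to the Service's host process over Binder IPC, and if AMS does not receive the host process's notification that the command has finished within the prescribed time, a Service ANR is triggered. Service commands here include CREATE_SERVICE, BIND_SERVICE, UNBIND_SERVICE, STOP_SERVICE, and so on.
The figure shows the Service timeout ANR flow, using startService as the example; the analysis below starts from startService.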
public ComponentName startService(IApplicationThread caller, Intent service,
String resolvedType, String callingPackage, int userId)
throws TransactionTooLargeException {
...
synchronized(this) {
final int callingPid = Binder.getCallingPid();
final int callingUid = Binder.getCallingUid();
final long origId = Binder.clearCallingIdentity();
ComponentName res = mServices.startServiceLocked(caller, service,
resolvedType, callingPid, callingUid, callingPackage, userId);
Binder.restoreCallingIdentity(origId);
return res;
}
}
After some simple checks, startService calls startServiceLocked on the ActiveServices object, which in turn calls bringUpServiceLocked to start the Service.
private String bringUpServiceLocked(ServiceRecord r, int intentFlags, boolean execInFg,
boolean whileRestarting, boolean permissionsReviewRequired)
throws TransactionTooLargeException {
//Slog.i(TAG, "Bring up service:");
//r.dump(" ");
// Check whether the service is already running
if (r.app != null && r.app.thread != null) {
sendServiceArgsLocked(r, execInFg, false);
return null;
}
...
if (!isolated) {
// The service runs in its host application's process
app = mAm.getProcessRecordLocked(procName, r.appInfo.uid, false);
if (DEBUG_MU) Slog.v(TAG_MU, "bringUpServiceLocked: appInfo.uid=" + r.appInfo.uid
+ " app=" + app);
if (app != null && app.thread != null) {
try {
app.addPackage(r.appInfo.packageName, r.appInfo.versionCode, mAm.mProcessStats);
// Start the service
realStartServiceLocked(r, app, execInFg);
return null;
} catch (TransactionTooLargeException e) {
throw e;
} catch (RemoteException e) {
Slog.w(TAG, "Exception when starting service " + r.shortName, e);
}
// If a dead object exception was thrown -- fall through to
// restart the application.
}
} else {
// The service runs in a separate (isolated) process
// If this service runs in an isolated process, then each time
// we call startProcessLocked() we will get a new isolated
// process, starting another process if we are currently waiting
// for a previous process to come up. To deal with this, we store
// in the service any current isolated process it is running in or
// waiting to have come up.
app = r.isolatedProc;
}
...
}
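bringUpServiceLocked first performs some checks. If the Service runs in its host application's process, it calls realStartServiceLocked to start it; if the Service runs in a separate process, that process must be created first and the Service is started afterwards. Only the former case is considered here.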
private final void realStartServiceLocked(ServiceRecord r,
ProcessRecord app, boolean execInFg) throws RemoteException {
if (app.thread == null) {
throw new RemoteException();
}
...
// Post the timeout Message according to how the service is executing
bumpServiceExecutingLocked(r, execInFg, "create");
// Update the LRU list
mAm.updateLruProcessLocked(app, false, null);
// Update the OomAdj of the affected processes
mAm.updateOomAdjLocked();
...
try {
if (LOG_SERVICE_START_STOP) {
String nameTerm;
int lastPeriod = r.shortName.lastIndexOf('.');
nameTerm = lastPeriod >= 0 ? r.shortName.substring(lastPeriod) : r.shortName;
EventLogTags.writeAmCreateService(
r.userId, System.identityHashCode(r), nameTerm, r.app.uid, r.app.pid);
}
synchronized (r.stats.getBatteryStats()) {
r.stats.startLaunchedLocked();
}
mAm.notifyPackageUse(r.serviceInfo.packageName,
PackageManager.NOTIFY_PACKAGE_USE_SERVICE);
app.forceProcessStateUpTo(ActivityManager.PROCESS_STATE_SERVICE);
// Start the service via Binder IPC
app.thread.scheduleCreateService(r, r.serviceInfo,
mAm.compatibilityInfoForPackageLocked(r.serviceInfo.applicationInfo),
app.repProcState);
r.postNotification();
created = true;
}
...
}
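The execInFg parameter of realStartServiceLocked indicates whether the service is executing in the foreground or in the background, and bumpServiceExecutingLocked sets a different timeout accordingly: SERVICE_TIMEOUT (20 s) for foreground execution and SERVICE_BACKGROUND_TIMEOUT (10 * SERVICE_TIMEOUT, i.e. 200 s) for background execution. If the timeout message has still not been removed when the timeout expires, a Service has taken too long to execute. serviceTimeout, analyzed below, handles that case.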
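The arm-and-cancel idea behind bumpServiceExecutingLocked can be sketched in plain Java. This is a minimal model under stated assumptions, not AOSP code: the class ServiceTimeoutArming, its methods, and the per-process deadline map are hypothetical stand-ins for the delayed timeout message, and the real implementation tracks a set of executing services per process rather than a single deadline.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the arm/cancel pattern; not AOSP code.
final class ServiceTimeoutArming {
    // Values quoted in the text: 20 s in the foreground, 10x that in the background.
    static final long SERVICE_TIMEOUT_MS = 20_000L;
    static final long SERVICE_BACKGROUND_TIMEOUT_MS = 10 * SERVICE_TIMEOUT_MS;

    // Pending deadline per process; stands in for the delayed timeout message.
    private final Map<String, Long> pendingDeadlines = new HashMap<>();

    static long timeoutFor(boolean execInFg) {
        return execInFg ? SERVICE_TIMEOUT_MS : SERVICE_BACKGROUND_TIMEOUT_MS;
    }

    // Called when a service command is sent to the host process.
    void arm(String process, long nowMs, boolean execInFg) {
        pendingDeadlines.put(process, nowMs + timeoutFor(execInFg));
    }

    // Called when the host process reports that the command has finished.
    void cancel(String process) {
        pendingDeadlines.remove(process);
    }

    // True if a deadline is still pending and has passed, i.e. the ANR condition.
    boolean isTimedOut(String process, long nowMs) {
        Long deadline = pendingDeadlines.get(process);
        return deadline != null && nowMs >= deadline;
    }

    public static void main(String[] args) {
        ServiceTimeoutArming timeouts = new ServiceTimeoutArming();
        timeouts.arm("com.example.app", 0L, true);
        System.out.println(timeouts.isTimedOut("com.example.app", 19_999L)); // false
        System.out.println(timeouts.isTimedOut("com.example.app", 20_000L)); // true
        timeouts.cancel("com.example.app");
        System.out.println(timeouts.isTimedOut("com.example.app", 30_000L)); // false
    }
}
```

The point of the pattern is that the host process never has to report a failure: silence past the deadline is itself the signal.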
void serviceTimeout(ProcessRecord proc) {
String anrMessage = null;
synchronized(mAm) {
if (proc.executingServices.size() == 0 || proc.thread == null) {
return;
}
final long now = SystemClock.uptimeMillis();
final long maxTime = now -
(proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
ServiceRecord timeout = null;
long nextTime = 0;
for (int i=proc.executingServices.size()-1; i>=0; i--) {
// Walk the executing services and find one that has timed out
ServiceRecord sr = proc.executingServices.valueAt(i);
if (sr.executingStart < maxTime) {
timeout = sr;
break;
}
if (sr.executingStart > nextTime) {
nextTime = sr.executingStart;
}
}
if (timeout != null && mAm.mLruProcesses.contains(proc)) {
// Found a service that has timed out
Slog.w(TAG, "Timeout executing service: " + timeout);
StringWriter sw = new StringWriter();
PrintWriter pw = new FastPrintWriter(sw, false, 1024);
pw.println(timeout);
timeout.dump(pw, " ");
pw.close();
mLastAnrDump = sw.toString();
mAm.mHandler.removeCallbacks(mLastAnrDumpClearer);
mAm.mHandler.postDelayed(mLastAnrDumpClearer, LAST_ANR_LIFETIME_DURATION_MSECS);
anrMessage = "executing service " + timeout.shortName;
} else {
// No single service has timed out; re-post the timeout Message
Message msg = mAm.mHandler.obtainMessage(
ActivityManagerService.SERVICE_TIMEOUT_MSG);
msg.obj = proc;
mAm.mHandler.sendMessageAtTime(msg, proc.execServicesFg
? (nextTime+SERVICE_TIMEOUT) : (nextTime + SERVICE_BACKGROUND_TIMEOUT));
}
}
if (anrMessage != null) {
// Handle the Service ANR
mAm.mAppErrors.appNotResponding(proc, null, null, false, anrMessage);
}
}
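serviceTimeout finds the service that has timed out and then calls appNotResponding on AppErrors to handle it.
The whole flow shown above is how appNotResponding handles the ANR; the details are left to the section on analyzing ANRs. Note that the part of the flow highlighted in red changed in Android N (to improve system performance); see the link below for details.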
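The scan inside serviceTimeout can be reduced to a small pure function. The sketch below is a simplified model, not AOSP code: ServiceTimeoutScan, Result, and scan are hypothetical names, the executing services are represented only by their start times, and the mLruProcesses check is left out.

```java
// Hypothetical, simplified model of the scan in serviceTimeout; not AOSP code.
final class ServiceTimeoutScan {
    static final long SERVICE_TIMEOUT_MS = 20_000L;
    static final long SERVICE_BACKGROUND_TIMEOUT_MS = 10 * SERVICE_TIMEOUT_MS;

    static final class Result {
        final int timedOutIndex; // index of the timed-out service, or -1 if none
        final long recheckAtMs;  // when to look again; only valid if timedOutIndex == -1

        Result(int timedOutIndex, long recheckAtMs) {
            this.timedOutIndex = timedOutIndex;
            this.recheckAtMs = recheckAtMs;
        }
    }

    static Result scan(long nowMs, boolean execInFg, long[] executingStartMs) {
        long timeout = execInFg ? SERVICE_TIMEOUT_MS : SERVICE_BACKGROUND_TIMEOUT_MS;
        long maxTime = nowMs - timeout; // anything started before this has timed out
        long nextTime = 0;
        for (int i = executingStartMs.length - 1; i >= 0; i--) {
            if (executingStartMs[i] < maxTime) {
                return new Result(i, -1L);
            }
            if (executingStartMs[i] > nextTime) {
                nextTime = executingStartMs[i];
            }
        }
        // Nothing has timed out yet: re-arm relative to the most recent start.
        return new Result(-1, nextTime + timeout);
    }

    public static void main(String[] args) {
        Result late = scan(30_000L, true, new long[] {5_000L, 25_000L});
        System.out.println(late.timedOutIndex); // 0
        Result early = scan(30_000L, true, new long[] {15_000L, 25_000L});
        System.out.println(early.timedOutIndex + " " + early.recheckAtMs); // -1 45000
    }
}
```

The second call shows why the timeout message firing does not always mean an ANR: a service that started later pushes the next check out to its own start time plus the timeout.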
- Broadcast timeout
Before analyzing Broadcast ANR, a brief look at Broadcasts themselves.
Broadcasts generally fall into two classes:
- Normal broadcasts (sent with Context.sendBroadcast) are completely asynchronous. All receivers of the broadcast are run in an undefined order, often at the same time. This is more efficient, but means that receivers cannot use the result or abort APIs included here.
- Ordered broadcasts (sent with Context.sendOrderedBroadcast) are delivered to one receiver at a time. As each receiver executes in turn, it can propagate a result to the next receiver, or it can completely abort the broadcast so that it won't be passed to other receivers. The order receivers run in can be controlled with the android:priority attribute of the matching intent-filter; receivers with the same priority will be run in an arbitrary order.
Even in the case of normal broadcasts, the system may in some situations revert to delivering the broadcast one receiver at a time. In particular, for receivers that may require the creation of a process, only one will be run at a time to avoid overloading the system with new processes. In this situation, however, the non-ordered semantics hold: these receivers still cannot return results or abort their broadcast.
A BroadcastReceiver can be registered in two ways:
You can either dynamically register an instance of this class with Context.registerReceiver() or statically publish an implementation through the receiver tag in your AndroidManifest.xml.
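The Broadcast ANR flow is analyzed next, as shown in the figure:
If AMS delivers a Broadcast to a receiver and does not receive the receiver's finishReceiver message within the prescribed time, a BroadcastTimeout ANR is triggered. The analysis starts from broadcastIntentLocked.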
final int broadcastIntentLocked(ProcessRecord callerApp,
String callerPackage, Intent intent, String resolvedType,
IIntentReceiver resultTo, int resultCode, String resultData,
Bundle resultExtras, String[] requiredPermissions, int appOp, Bundle bOptions,
boolean ordered, boolean sticky, int callingPid, int callingUid, int userId) {
...
// Figure out who all will receive this broadcast.
List receivers = null;
List<BroadcastFilter> registeredReceivers = null;
// Need to resolve the intent to interested receivers...
if ((intent.getFlags()&Intent.FLAG_RECEIVER_REGISTERED_ONLY)
== 0) {
// Collect the statically registered (manifest) receivers
receivers = collectReceiverComponents(intent, resolvedType, callingUid, users);
}
if (intent.getComponent() == null) {
if (userId == UserHandle.USER_ALL && callingUid == Process.SHELL_UID) {
// Query one target user at a time, excluding shell-restricted users
for (int i = 0; i < users.length; i++) {
if (mUserController.hasUserRestriction(
UserManager.DISALLOW_DEBUGGING_FEATURES, users[i])) {
continue;
}
List<BroadcastFilter> registeredReceiversForUser =
mReceiverResolver.queryIntent(intent,
resolvedType, false, users[i]);
if (registeredReceivers == null) {
registeredReceivers = registeredReceiversForUser;
} else if (registeredReceiversForUser != null) {
registeredReceivers.addAll(registeredReceiversForUser);
}
}
} else {
// Look up the dynamically registered receivers
registeredReceivers = mReceiverResolver.queryIntent(intent,
resolvedType, false, userId);
}
}
...
int NR = registeredReceivers != null ? registeredReceivers.size() : 0;
if (!ordered && NR > 0) {
// Send the normal broadcast to the dynamically registered receivers
// If we are not serializing this broadcast, then send the
// registered receivers separately so they don't wait for the
// components to be launched.
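// Choose the queue by broadcast type: foreground broadcasts are handled by the
// foreground broadcast queue, background broadcasts by the background queue
final BroadcastQueue queue = broadcastQueueForIntent(intent);
// Create the broadcast record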
BroadcastRecord r = new BroadcastRecord(queue, intent, callerApp,
callerPackage, callingPid, callingUid, resolvedType, requiredPermissions,
appOp, brOptions, registeredReceivers, resultTo, resultCode, resultData,
resultExtras, ordered, sticky, false, userId);
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Enqueueing parallel broadcast " + r);
final boolean replaced = replacePending && queue.replaceParallelBroadcastLocked(r);
if (!replaced) {
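// Put the (foreground/background) normal broadcast into the (foreground/background) parallel broadcast list
queue.enqueueParallelBroadcastLocked(r);
// Process the broadcasts in the (foreground/background) parallel broadcast list
queue.scheduleBroadcastsLocked();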
}
registeredReceivers = null;
NR = 0;
}
// Merge into one list.
// Merge the dynamically and statically registered receivers by priority
// (high -> low) into receivers
int ir = 0;
if (receivers != null) {
...
int NT = receivers != null ? receivers.size() : 0;
int it = 0;
ResolveInfo curt = null;
BroadcastFilter curr = null;
while (it < NT && ir < NR) {
if (curt == null) {
curt = (ResolveInfo)receivers.get(it);
}
if (curr == null) {
curr = registeredReceivers.get(ir);
}
if (curr.getPriority() >= curt.priority) {
// Insert this broadcast record into the final list.
receivers.add(it, curr);
ir++;
curr = null;
it++;
NT++;
} else {
// Skip to the next ResolveInfo in the final list.
it++;
curt = null;
}
}
}
while (ir < NR) {
if (receivers == null) {
receivers = new ArrayList();
}
receivers.add(registeredReceivers.get(ir));
ir++;
}
if ((receivers != null && receivers.size() > 0)
|| resultTo != null) {
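// Choose the queue by broadcast type: foreground broadcasts are handled by the
// foreground broadcast queue, background broadcasts by the background queue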
BroadcastQueue queue = broadcastQueueForIntent(intent);
BroadcastRecord r = new BroadcastRecord(queue, intent, callerApp,
callerPackage, callingPid, callingUid, resolvedType,
requiredPermissions, appOp, brOptions, receivers, resultTo, resultCode,
resultData, resultExtras, ordered, sticky, false, userId);
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Enqueueing ordered broadcast " + r
+ ": prev had " + queue.mOrderedBroadcasts.size());
if (DEBUG_BROADCAST) Slog.i(TAG_BROADCAST,
"Enqueueing broadcast " + r.intent.getAction());
boolean replaced = replacePending && queue.replaceOrderedBroadcastLocked(r);
if (!replaced) {
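// Put the broadcast into the (foreground/background) ordered broadcast list
queue.enqueueOrderedBroadcastLocked(r);
// Process the broadcasts in the (foreground/background) ordered broadcast list
queue.scheduleBroadcastsLocked();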
}
}
...
}
After a series of checks and special-case handling, broadcastIntentLocked dispatches the broadcast according to its type and the type of the matching receivers.
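The "Merge into one list" step in broadcastIntentLocked is the least obvious part of the excerpt, so here is a plain-Java sketch of it. It is a model under assumptions, not AOSP code: ReceiverMerge, Receiver, merge, and names are hypothetical, receivers are reduced to a name and a priority, and both inputs are assumed to be sorted by descending priority already.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the priority merge in broadcastIntentLocked; not AOSP code.
final class ReceiverMerge {
    static final class Receiver {
        final String name;
        final int priority;

        Receiver(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    // Both inputs are assumed to be sorted by descending priority already.
    static List<Receiver> merge(List<Receiver> manifest, List<Receiver> registered) {
        List<Receiver> merged = new ArrayList<>(manifest);
        int it = 0; // position in the merged list
        int ir = 0; // position in the registered list
        while (it < merged.size() && ir < registered.size()) {
            Receiver curr = registered.get(ir);
            if (curr.priority >= merged.get(it).priority) {
                merged.add(it, curr); // ties go to the registered receiver
                ir++;
            }
            it++;
        }
        // Whatever is left has lower priority than every manifest receiver.
        while (ir < registered.size()) {
            merged.add(registered.get(ir++));
        }
        return merged;
    }

    static String names(List<Receiver> receivers) {
        StringBuilder sb = new StringBuilder();
        for (Receiver r : receivers) {
            if (sb.length() > 0) sb.append(',');
            sb.append(r.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Receiver> manifest = Arrays.asList(new Receiver("A", 10), new Receiver("B", 0));
        List<Receiver> registered = Arrays.asList(
                new Receiver("X", 10), new Receiver("Y", 5), new Receiver("Z", -1));
        System.out.println(names(merge(manifest, registered))); // X,A,Y,B,Z
    }
}
```

Because the comparison is `>=`, a dynamically registered receiver is placed ahead of a manifest receiver with the same priority.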
Next comes the dispatch function scheduleBroadcastsLocked.
public void scheduleBroadcastsLocked() {
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Schedule broadcasts ["
+ mQueueName + "]: current="
+ mBroadcastsScheduled);
if (mBroadcastsScheduled) {
return;
}
// Send the Message that triggers broadcast processing
mHandler.sendMessage(mHandler.obtainMessage(BROADCAST_INTENT_MSG, this));
mBroadcastsScheduled = true;
}
scheduleBroadcastsLocked simply sends a BROADCAST_INTENT_MSG message; the handler for that message calls processNextBroadcast to do the dispatching.
final void processNextBroadcast(boolean fromMsg) {
synchronized(mService) {
BroadcastRecord r;
...
mService.updateCpuStats();
...
// First, deliver any non-serialized broadcasts right away.
while (mParallelBroadcasts.size() > 0) {
r = mParallelBroadcasts.remove(0);
r.dispatchTime = SystemClock.uptimeMillis();
r.dispatchClockTime = System.currentTimeMillis();
final int N = r.receivers.size();
if (DEBUG_BROADCAST_LIGHT) Slog.v(TAG_BROADCAST, "Processing parallel broadcast ["
+ mQueueName + "] " + r);
for (int i=0; i<N; i++) {
Object target = r.receivers.get(i);
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST,
"Delivering non-ordered on [" + mQueueName + "] to registered "
+ target + ": " + r);
// Deliver the broadcast to all receivers in the parallel list at once
deliverToRegisteredReceiverLocked(r, (BroadcastFilter)target, false, i);
}
addBroadcastToHistoryLocked(r);
if (DEBUG_BROADCAST_LIGHT) Slog.v(TAG_BROADCAST, "Done with parallel broadcast ["
+ mQueueName + "] " + r);
}
...
do {
if (mOrderedBroadcasts.size() == 0) {
// No more broadcasts pending, so all done!
mService.scheduleAppGcsLocked();
if (looped) {
// If we had finished the last ordered broadcast, then
// make sure all processes have correct oom and sched
// adjustments.
mService.updateOomAdjLocked();
}
return;
}
// Deliver the broadcast to the receivers in the ordered list one at a time, by priority
r = mOrderedBroadcasts.get(0);
boolean forceReceive = false;
...
int numReceivers = (r.receivers != null) ? r.receivers.size() : 0;
if (mService.mProcessesReady && r.dispatchTime > 0) {
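// 1) Timeout check for a broadcast whose delivery began before SystemReady and
// ended after it. Delivery before SystemReady can be slow and is not checked,
// so the limit is 2 * mTimeoutPeriod * numReceivers.
// 2) dex2oat ran while the broadcast was being delivered.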
long now = SystemClock.uptimeMillis();
if ((numReceivers > 0) &&
(now > r.dispatchTime + (2*mTimeoutPeriod*numReceivers))) {
...
broadcastTimeoutLocked(false); // forcibly finish this broadcast
forceReceive = true;
r.state = BroadcastRecord.IDLE;
}
}
...
if (r.receivers == null || r.nextReceiver >= numReceivers
|| r.resultAbort || forceReceive) {
// No more receivers for this broadcast! Send the final
// result if requested...
if (r.resultTo != null) {
// Delivery is complete; if the sender asked for a result, report it back to the sender.
try {
if (DEBUG_BROADCAST) Slog.i(TAG_BROADCAST,
"Finishing broadcast [" + mQueueName + "] "
+ r.intent.getAction() + " app=" + r.callerApp);
performReceiveLocked(r.callerApp, r.resultTo,
new Intent(r.intent), r.resultCode,
r.resultData, r.resultExtras, false, false, r.userId);
// Set this to null so that the reference
// (local and remote) isn't kept in the mBroadcastHistory.
r.resultTo = null;
} catch (RemoteException e) {
r.resultTo = null;
Slog.w(TAG, "Failure ["
+ mQueueName + "] sending broadcast result of "
+ r.intent, e);
}
}
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Cancelling BROADCAST_TIMEOUT_MSG");
// All receivers of this broadcast are done; cancel the timeout message.
cancelBroadcastTimeoutLocked();
if (DEBUG_BROADCAST_LIGHT) Slog.v(TAG_BROADCAST,
"Finished with ordered broadcast " + r);
// ... and on to the next...
addBroadcastToHistoryLocked(r);
if (r.intent.getComponent() == null && r.intent.getPackage() == null
&& (r.intent.getFlags()&Intent.FLAG_RECEIVER_REGISTERED_ONLY) == 0) {
// This was an implicit broadcast... let's record it for posterity.
mService.addBroadcastStatLocked(r.intent.getAction(), r.callerPackage,
r.manifestCount, r.manifestSkipCount, r.finishTime-r.dispatchTime);
}
// Remove the completed broadcast from the ordered queue
mOrderedBroadcasts.remove(0);
r = null;
looped = true;
continue;
}
} while (r == null);
// Get the next receiver...
// Get the broadcast's next receiver (there may be several) and deliver to it
int recIdx = r.nextReceiver++;
// Keep track of when this receiver started, and make sure there
// is a timeout message pending to kill it if need be.
r.receiverTime = SystemClock.uptimeMillis();
if (recIdx == 0) {
// First of the broadcast's receivers: record the dispatch time
r.dispatchTime = r.receiverTime;
r.dispatchClockTime = System.currentTimeMillis();
if (DEBUG_BROADCAST_LIGHT) Slog.v(TAG_BROADCAST, "Processing ordered broadcast ["
+ mQueueName + "] " + r);
}
if (! mPendingBroadcastTimeoutMessage) {
long timeoutTime = r.receiverTime + mTimeoutPeriod;
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST,
"Submitting BROADCAST_TIMEOUT_MSG ["
+ mQueueName + "] for " + r + " at " + timeoutTime);
// Arm the broadcast timeout here if it has not been armed yet
setBroadcastTimeoutLocked(timeoutTime);
}
...
final Object nextReceiver = r.receivers.get(recIdx);
if (nextReceiver instanceof BroadcastFilter) {
// Simple case: this is a registered receiver who gets
// a direct call.
BroadcastFilter filter = (BroadcastFilter)nextReceiver;
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST,
"Delivering ordered ["
+ mQueueName + "] to registered "
+ filter + ": " + r);
// A dynamically registered receiver: deliver directly
deliverToRegisteredReceiverLocked(r, filter, r.ordered, recIdx);
// Author's note: my understanding is that r.ordered == true here (unconfirmed)
if (r.receiver == null || !r.ordered) {
// The receiver has already finished, so schedule to
// process the next one.
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Quick finishing ["
+ mQueueName + "]: ordered="
+ r.ordered + " receiver=" + r.receiver);
r.state = BroadcastRecord.IDLE;
scheduleBroadcastsLocked();
} else {
if (brOptions != null && brOptions.getTemporaryAppWhitelistDuration() > 0) {
scheduleTempWhitelistLocked(filter.owningUid,
brOptions.getTemporaryAppWhitelistDuration(), r);
}
}
return;
}
...
// Is this receiver's application already running?
if (app != null && app.thread != null) {
// The receiver's host process is already running; deliver the broadcast
try {
app.addPackage(info.activityInfo.packageName,
info.activityInfo.applicationInfo.versionCode, mService.mProcessStats);
// Ultimately runs the receiver via Binder IPC
processCurBroadcastLocked(r, app);
return;
}
}
...
// Create the receiver's host process
if ((r.curApp=mService.startProcessLocked(targetProcess,
info.activityInfo.applicationInfo, true,
r.intent.getFlags() | Intent.FLAG_FROM_BACKGROUND,
"broadcast", r.curComponent,
(r.intent.getFlags()&Intent.FLAG_RECEIVER_BOOT_UPGRADE) != 0, false, false))
== null) {
// Ah, this recipient is unavailable. Finish it if necessary,
// and mark the broadcast record as ready for the next.
Slog.w(TAG, "Unable to launch app "
+ info.activityInfo.applicationInfo.packageName + "/"
+ info.activityInfo.applicationInfo.uid + " for broadcast "
+ r.intent + ": process is bad");
logBroadcastReceiverDiscardLocked(r);
finishReceiverLocked(r, r.resultCode, r.resultData,
r.resultExtras, r.resultAbort, false);
scheduleBroadcastsLocked();
r.state = BroadcastRecord.IDLE;
return;
}
mPendingBroadcast = r;
}
}
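A brief summary of the broadcast delivery timeout flow:
As with the Service timeout ANR, if the broadcast timeout message is not cancelled within the prescribed time, an ANR is triggered. Next, how the broadcast timeout ANR is handled.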
final void broadcastTimeoutLocked(boolean fromMsg) {
// fromMsg identifies what triggered the timeout: true means the timeout message fired,
// false means the timeout handler was called directly
if (fromMsg) {
mPendingBroadcastTimeoutMessage = false;
}
if (mOrderedBroadcasts.size() == 0) {
return;
}
long now = SystemClock.uptimeMillis();
BroadcastRecord r = mOrderedBroadcasts.get(0);
if (fromMsg) {
if (mService.mDidDexOpt) {
// Delay timeouts until dexopt finishes.
mService.mDidDexOpt = false;
long timeoutTime = SystemClock.uptimeMillis() + mTimeoutPeriod;
setBroadcastTimeoutLocked(timeoutTime);
return;
}
if (!mService.mProcessesReady) {
// Only process broadcast timeouts if the system is ready. That way
// PRE_BOOT_COMPLETED broadcasts can't timeout as they are intended
// to do heavy lifting for system up.
return;
}
long timeoutTime = r.receiverTime + mTimeoutPeriod;
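// If delivery to the current receiver has not timed out yet, re-arm the timeout
// message. This shows that the timeout applies to a single receiver: the cumulative
// time of several receivers exceeding the period does not by itself trigger an ANR here.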
if (timeoutTime > now) {
// We can observe premature timeouts because we do not cancel and reset the
// broadcast timeout message after each receiver finishes. Instead, we set up
// an initial timeout then kick it down the road a little further as needed
// when it expires.
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST,
"Premature timeout ["
+ mQueueName + "] @ " + now + ": resetting BROADCAST_TIMEOUT_MSG for "
+ timeoutTime);
setBroadcastTimeoutLocked(timeoutTime);
return;
}
}
...
// Trigger the broadcast timeout ANR
if (anrMessage != null) {
// Post the ANR to the handler since we do not want to process ANRs while
// potentially holding our lock.
mHandler.post(new AppNotResponding(app, anrMessage));
}
}
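broadcastTimeoutLocked uses the fromMsg parameter to decide whether a broadcast timeout ANR has really occurred. Note that if dex2oat has run, the broadcast timeout is postponed, and if the system has not finished booting, broadcast timeouts are not checked. The handling of a broadcast timeout ANR is similar to that of a Service timeout ANR and is not repeated here.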
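The two timing checks in the broadcast path, the per-receiver check in broadcastTimeoutLocked and the whole-broadcast limit in processNextBroadcast, can be written as pure functions. The sketch below is a model under assumptions, not AOSP code: BroadcastTimeoutCheck and its members are hypothetical names, the 10 s period in main is only an illustrative value for mTimeoutPeriod, and the dexopt and mProcessesReady branches are omitted.

```java
// Hypothetical model of the checks around the broadcast timeout; not AOSP code.
final class BroadcastTimeoutCheck {
    enum Action { REARM, ANR }

    static final class Decision {
        final Action action;
        final long rearmAtMs; // only meaningful when action == REARM

        Decision(Action action, long rearmAtMs) {
            this.action = action;
            this.rearmAtMs = rearmAtMs;
        }
    }

    // Per-receiver check made when the timeout message fires (fromMsg == true).
    static Decision onTimeoutMessage(long nowMs, long receiverTimeMs, long timeoutPeriodMs) {
        long timeoutTime = receiverTimeMs + timeoutPeriodMs;
        if (timeoutTime > nowMs) {
            // "Premature" timeout: the current receiver started later than the
            // one the message was armed for, so push the deadline out.
            return new Decision(Action.REARM, timeoutTime);
        }
        return new Decision(Action.ANR, -1L);
    }

    // Whole-broadcast limit checked before delivering to the next receiver.
    static boolean exceedsOverallLimit(long nowMs, long dispatchTimeMs,
            long timeoutPeriodMs, int numReceivers) {
        return numReceivers > 0
                && nowMs > dispatchTimeMs + 2 * timeoutPeriodMs * numReceivers;
    }

    public static void main(String[] args) {
        long period = 10_000L; // illustrative value for mTimeoutPeriod
        // Message armed at t=0 fires at t=10000, but the current receiver began at t=8000.
        Decision d = onTimeoutMessage(10_000L, 8_000L, period);
        System.out.println(d.action + " " + d.rearmAtMs); // REARM 18000
        System.out.println(onTimeoutMessage(18_000L, 8_000L, period).action); // ANR
        System.out.println(exceedsOverallLimit(60_001L, 0L, period, 3)); // true
    }
}
```

Re-arming on expiry instead of cancelling and re-posting after every receiver keeps the common case, where receivers finish quickly, free of extra handler traffic.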
- ContentProvider timeout
Content providers are one of the primary building blocks of Android applications, providing content to applications. They encapsulate data and provide it to applications through the single ContentResolver interface.
A content provider is only required if you need to share data between multiple applications. For example, the contacts data is used by multiple applications and must be stored in a content provider.
If you don't need to share data amongst multiple applications you can use a database directly via SQLiteDatabase.
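The diagram above illustrates the ContentProvider timeout. If we only use the object returned by getContentResolver (an ApplicationContentResolver) to access data exposed by a ContentProvider, no timeout ANR detection takes place. Timeout ANR detection applies only when the data is accessed through a ContentProviderClient. When the ContentProviderClient detects that a data operation has timed out, it ultimately notifies AMS over Binder IPC; AMS handles it much as discussed earlier, so the details are not repeated.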
- Input Dispatching timeout
We use touch events to illustrate the Input Dispatching timeout mechanism. Only the key functions are analyzed here, starting with checkWindowReadyForMoreInputLocked.
String8 InputDispatcher::checkWindowReadyForMoreInputLocked(nsecs_t currentTime,
const sp<InputWindowHandle>& windowHandle, const EventEntry* eventEntry,
const char* targetType) {
...
// Ensure that the dispatch queues aren't too far backed up for this event.
if (eventEntry->type == EventEntry::TYPE_KEY) {
// Key events: if either outboundQueue or waitQueue is non-empty, the event
// cannot be sent yet.
// If the event is a key event, then we must wait for all previous events to
// complete before delivering it because previous events may have the
// side-effect of transferring focus to a different window and we want to
// ensure that the following keys are sent to the new window.
//
// Suppose the user touches a button in a window then immediately presses "A".
// If the button causes a pop-up window to appear then we want to ensure that
// the "A" key is delivered to the new pop-up window. This is because users
// often anticipate pending UI changes when typing on a keyboard.
// To obtain this behavior, we must serialize key events with respect to all
// prior input events.
if (!connection->outboundQueue.isEmpty() || !connection->waitQueue.isEmpty()) {
return String8::format("Waiting to send key event because the %s window has not "
"finished processing all of the input events that were previously "
"delivered to it. Outbound queue length: %d. Wait queue length: %d.",
targetType, connection->outboundQueue.count(), connection->waitQueue.count());
}
} else {
// Touch (non-key) events: if waitQueue is non-empty and its head event was delivered
// at least STREAM_AHEAD_EVENT_TIMEOUT ago, the event cannot be sent yet.
// Touch events can always be sent to a window immediately because the user intended
// to touch whatever was visible at the time. Even if focus changes or a new
// window appears moments later, the touch event was meant to be delivered to
// whatever window happened to be on screen at the time.
//
// Generic motion events, such as trackball or joystick events are a little trickier.
// Like key events, generic motion events are delivered to the focused window.
// Unlike key events, generic motion events don't tend to transfer focus to other
// windows and it is not important for them to be serialized. So we prefer to deliver
// generic motion events as soon as possible to improve efficiency and reduce lag
// through batching.
//
// The one case where we pause input event delivery is when the wait queue is piling
// up with lots of events because the application is not responding.
// This condition ensures that ANRs are detected reliably.
if (!connection->waitQueue.isEmpty()
&& currentTime >= connection->waitQueue.head->deliveryTime
+ STREAM_AHEAD_EVENT_TIMEOUT) {
return String8::format("Waiting to send non-key event because the %s window has not "
"finished processing certain input events that were delivered to it over "
"%0.1fms ago. Wait queue length: %d. Wait queue head age: %0.1fms.",
targetType, STREAM_AHEAD_EVENT_TIMEOUT * 0.000001f,
connection->waitQueue.count(),
(currentTime - connection->waitQueue.head->deliveryTime) * 0.000001f);
}
}
...
}
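checkWindowReadyForMoreInputLocked checks whether the target window is ready to receive more input events. If it is not, the reason must be examined further, which is the job of handleTargetsNotReadyLocked, analyzed below.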
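The readiness rule in checkWindowReadyForMoreInputLocked reduces to a small predicate, sketched here in Java to match the rest of the examples. It is a model, not the native implementation: WindowReadyCheck and mustWait are hypothetical names, times are in milliseconds rather than nanoseconds, and the 500 ms value for STREAM_AHEAD_EVENT_TIMEOUT is taken from the "over 500.0ms ago" text in the log excerpt later in this article.

```java
// Hypothetical Java model of the native readiness check; not the real InputDispatcher.
final class WindowReadyCheck {
    // 500 ms, as reported by the log excerpt quoted in this article.
    static final long STREAM_AHEAD_EVENT_TIMEOUT_MS = 500L;

    // Returns true if the new event must wait instead of being dispatched now.
    static boolean mustWait(boolean isKeyEvent, int outboundQueueCount, int waitQueueCount,
            long nowMs, long waitQueueHeadDeliveryMs) {
        if (isKeyEvent) {
            // Key events are serialized behind every earlier event for the window.
            return outboundQueueCount > 0 || waitQueueCount > 0;
        }
        // Non-key events only wait when the oldest unacknowledged event is stale.
        return waitQueueCount > 0
                && nowMs >= waitQueueHeadDeliveryMs + STREAM_AHEAD_EVENT_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        System.out.println(mustWait(true, 0, 1, 0L, 0L));        // true
        System.out.println(mustWait(false, 3, 4, 1_000L, 600L)); // false
        System.out.println(mustWait(false, 0, 4, 1_100L, 600L)); // true
    }
}
```

The asymmetry is deliberate: a key press may be meant for a window that an earlier event is about to open, while a touch is always meant for whatever was on screen when it happened.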
int32_t InputDispatcher::handleTargetsNotReadyLocked(nsecs_t currentTime,
const EventEntry* entry,
const sp<InputApplicationHandle>& applicationHandle,
const sp<InputWindowHandle>& windowHandle,
nsecs_t* nextWakeupTime, const char* reason) {
...
if (applicationHandle == NULL && windowHandle == NULL) {
...
} else {
// Record when the target window was first found not ready, and compute the deadline mInputTargetWaitTimeoutTime
if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY) {
#if DEBUG_FOCUS
ALOGD("Waiting for application to become ready for input: %s. Reason: %s",
getApplicationWindowLabelLocked(applicationHandle, windowHandle).string(),
reason);
#endif
nsecs_t timeout;
if (windowHandle != NULL) {
timeout = windowHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);
} else if (applicationHandle != NULL) {
timeout = applicationHandle->getDispatchingTimeout(
DEFAULT_INPUT_DISPATCHING_TIMEOUT);
} else {
timeout = DEFAULT_INPUT_DISPATCHING_TIMEOUT;
}
mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY;
mInputTargetWaitStartTime = currentTime;
mInputTargetWaitTimeoutTime = currentTime + timeout;
mInputTargetWaitTimeoutExpired = false;
mInputTargetWaitApplicationHandle.clear();
...
}
}
...
if (currentTime >= mInputTargetWaitTimeoutTime) {
// Handle the Input Dispatching ANR
onANRLocked(currentTime, applicationHandle, windowHandle,
entry->eventTime, mInputTargetWaitStartTime, reason);
// Force poll loop to wake up immediately on next iteration once we get the
// ANR response back from the policy.
*nextWakeupTime = LONG_LONG_MIN;
return INPUT_EVENT_INJECTION_PENDING;
}
...
}
AMS handles an Input Dispatching timeout ANR much as in the earlier cases, so the details are not repeated.
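The wait bookkeeping in handleTargetsNotReadyLocked follows a "start the clock once, then compare" pattern, sketched below in Java. This is a model under assumptions, not the native code: InputTargetWait and its methods are hypothetical names, and the 5 s default is an assumed value for DEFAULT_INPUT_DISPATCHING_TIMEOUT that is consistent with the "5004.9ms since wait started" line in the log excerpt in this article.

```java
// Hypothetical Java model of the wait timer in handleTargetsNotReadyLocked.
final class InputTargetWait {
    // Assumed default; consistent with the roughly 5 s wait seen in the log excerpt.
    static final long DEFAULT_INPUT_DISPATCHING_TIMEOUT_MS = 5_000L;

    private boolean waiting;
    private long waitStartMs;
    private long waitTimeoutAtMs;

    // Called each time the target is found not ready. Returns true when the
    // deadline computed on the first call has passed, i.e. an ANR should be raised.
    boolean onTargetNotReady(long nowMs, long timeoutMs) {
        if (!waiting) {
            // The clock starts only on the first "not ready" observation.
            waiting = true;
            waitStartMs = nowMs;
            waitTimeoutAtMs = nowMs + timeoutMs;
        }
        return nowMs >= waitTimeoutAtMs;
    }

    // Called when the target becomes ready again or focus changes.
    void reset() {
        waiting = false;
    }

    long waitedMs(long nowMs) {
        return waiting ? nowMs - waitStartMs : 0L;
    }

    public static void main(String[] args) {
        InputTargetWait wait = new InputTargetWait();
        long timeout = DEFAULT_INPUT_DISPATCHING_TIMEOUT_MS;
        System.out.println(wait.onTargetNotReady(1_000L, timeout)); // false
        System.out.println(wait.onTargetNotReady(5_999L, timeout)); // false
        System.out.println(wait.onTargetNotReady(6_000L, timeout)); // true
    }
}
```

Later calls do not move the deadline, which is why the log reports the time "since wait started" separately from the age of the event itself.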
How to analyze an ANR
- Problem description
ANR in com.samsung.android.email.provider during CAS Test
- Log analysis
- Exact time and cause of the ANR
01-03 03:47:07.488 1149 1313 I am_anr : [0,4910,com.samsung.android.email.provider,953695813,Input dispatching timed out (Waiting to send non-key event because the focused window has not finished processing certain input events that were delivered to it over 500.0ms ago. Wait queue length: 4. Wait queue head age: 5615.7ms.)]
- Backtrace of the Email process (/data/anr/traces.txt or dropbox)
The main thread and the binder threads usually deserve the most attention.
"main" prio=5 tid=1 Native
  | group="main" sCount=1 dsCount=0 obj=0x768f3fb8 self=0x557be4ec40
  | sysTid=4910 nice=0 cgrp=default sched=0/0 handle=0x7f8bb4ffd0
  | state=S schedstat=( 15510426384 7139276978 32108 ) utm=1214 stm=337 core=4 HZ=100
  | stack=0x7fdda0b000-0x7fdda0d000 stackSize=8MB
  | held mutexes=
  kernel: __switch_to+0x7c/0x88
  kernel: SyS_epoll_wait+0x2cc/0x394
  kernel: SyS_epoll_pwait+0x9c/0x114
  kernel: __sys_trace+0x48/0x4c
  native: #00 pc 0000000000069d94 /system/lib64/libc.so (__epoll_pwait+8)
  native: #01 pc 000000000001ce64 /system/lib64/libc.so (epoll_pwait+32)
  native: #02 pc 000000000001be88 /system/lib64/libutils.so (_ZN7android6Looper9pollInnerEi+144)
  native: #03 pc 000000000001c268 /system/lib64/libutils.so (_ZN7android6Looper8pollOnceEiPiS1_PPv+80)
  native: #04 pc 00000000000d89dc /system/lib64/libandroid_runtime.so (_ZN7android18NativeMessageQueue8pollOnceEP7_JNIEnvP8_jobjecti+48)
  native: #05 pc 000000000000087c /system/framework/arm64/boot.oat (Java_android_os_MessageQueue_nativePollOnce__JI+144)
  at android.os.MessageQueue.nativePollOnce(Native method)
  at android.os.MessageQueue.next(MessageQueue.java:323)
  at android.os.Looper.loop(Looper.java:135)
  at android.app.ActivityThread.main(ActivityThread.java:7421)
  at java.lang.reflect.Method.invoke!(Native method)
  at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1230)
  at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1120)

"Binder_1" prio=5 tid=8 Native
  | group="main" sCount=1 dsCount=0 obj=0x32c072e0 self=0x557c129ec0
  | sysTid=4921 nice=0 cgrp=default sched=0/0 handle=0x7f86c55450
  | state=S schedstat=( 518922106 635149861 2589 ) utm=25 stm=26 core=4 HZ=100
  | stack=0x7f86b59000-0x7f86b5b000 stackSize=1013KB
  | held mutexes=
  kernel: __switch_to+0x7c/0x88
  kernel: binder_thread_read+0xf60/0x10d8
  kernel: binder_ioctl_write_read+0x1cc/0x304
  kernel: binder_ioctl+0x348/0x738
  kernel: do_vfs_ioctl+0x4e0/0x5c0
  kernel: SyS_ioctl+0x5c/0x88
  kernel: __sys_trace+0x48/0x4c
  native: #00 pc 0000000000069e80 /system/lib64/libc.so (__ioctl+4)
  native: #01 pc 0000000000073ea4 /system/lib64/libc.so (ioctl+100)
  native: #02 pc 000000000002d584 /system/lib64/libbinder.so (_ZN7android14IPCThreadState14talkWithDriverEb+164)
  native: #03 pc 000000000002de00 /system/lib64/libbinder.so (_ZN7android14IPCThreadState20getAndExecuteCommandEv+24)
  native: #04 pc 000000000002df1c /system/lib64/libbinder.so (_ZN7android14IPCThreadState14joinThreadPoolEb+76)
  native: #05 pc 0000000000036a10 /system/lib64/libbinder.so (???)
  native: #06 pc 00000000000167b4 /system/lib64/libutils.so (_ZN7android6Thread11_threadLoopEPv+208)
  native: #07 pc 00000000000948d0 /system/lib64/libandroid_runtime.so (_ZN7android14AndroidRuntime15javaThreadShellEPv+96)
  native: #08 pc 0000000000016004 /system/lib64/libutils.so (???)
  native: #09 pc 0000000000067904 /system/lib64/libc.so (_ZL15__pthread_startPv+52)
  native: #10 pc 000000000001c804 /system/lib64/libc.so (__start_thread+16)
...
- System CPU usage at the time of the ANR
01-03 03:47:28.928 1149 1313 E android.os.Debug: ro.product_ship = false
01-03 03:47:28.928 1149 1313 E android.os.Debug: ro.debug_level = 0x494d
01-03 03:47:28.928 1149 1313 E android.os.Debug: Failed open /proc/schedinfo
01-03 03:47:28.928 1149 1313 E ActivityManager: ANR in com.samsung.android.email.provider (com.samsung.android.email.provider/com.samsung.android.email.ui.activity.MessageListXL)
01-03 03:47:28.928 1149 1313 E ActivityManager: PID: 4910
01-03 03:47:28.928 1149 1313 E ActivityManager: Reason: Input dispatching timed out (Waiting to send non-key event because the focused window has not finished processing certain input events that were delivered to it over 500.0ms ago. Wait queue length: 4. Wait queue head age: 5615.7ms.)
01-03 03:47:28.928 1149 1313 E ActivityManager: Load: 0.0 / 0.0 / 0.0
01-03 03:47:28.928 1149 1313 E ActivityManager: ------ Current CPU Core Info ------
01-03 03:47:28.928 1149 1313 E ActivityManager: - offline :
01-03 03:47:28.928 1149 1313 E ActivityManager: - online : 0-7
01-03 03:47:28.928 1149 1313 E ActivityManager: - cpu_normalized_load : -
01-03 03:47:28.928 1149 1313 E ActivityManager: - run_queue_avg : 2.9
01-03 03:47:28.928 1149 1313 E ActivityManager: - AP Temp = 475
01-03 03:47:28.928 1149 1313 E ActivityManager: 0 1 2 3 4 5 6 7
01-03 03:47:28.928 1149 1313 E ActivityManager: ------------------------------------------------------------------------------------------------------------------
01-03 03:47:28.928 1149 1313 E ActivityManager: scaling_cur_freq 1689600 1689600 1689600 1689600 1689600 1689600 1689600 1689600
01-03 03:47:28.928 1149 1313 E ActivityManager: scaling_governor interactive interactive interactive interactive interactive interactive interactive interactive
01-03 03:47:28.928 1149 1313 E ActivityManager: scaling_max_freq 1689600 1689600 1689600 1689600 1689600 1689600 1689600 1689600
01-03 03:47:28.928 1149 1313 E ActivityManager: ------------------------------------------------------------------------------------------------------------------
01-03 03:47:28.928 1149 1313 E ActivityManager: CPU usage from 1277ms to -51ms ago:
01-03 03:47:28.928 1149 1313 E ActivityManager: 100% 7807/procrank: 10% user + 89% kernel / faults: 6598 minor
01-03 03:47:28.928 1149 1313 E ActivityManager: 36% 466/surfaceflinger: 9% user + 27% kernel / faults: 596 minor
01-03 03:47:28.928 1149 1313 E ActivityManager: 25% 1149/system_server: 4.4% user + 20% kernel / faults: 139 minor
01-03 03:47:28.928 1149 1313 E ActivityManager: 21% 259/spi3: 0% user + 21% kernel
01-03 03:47:28.928 1149 1313 E ActivityManager: 10% 4910/com.samsung.android.email.provider: 7.2% user + 3.6% kernel / faults: 39 minor
01-03 03:47:28.928 1149 1313 E ActivityManager: 8.6% 2837/com.samsung.android.providers.context: 3.6% user + 5% kernel / faults: 28 minor
01-03 03:47:28.928 1149 1313 E ActivityManager: 3.4% 7806/dumpstate: 0.2% user + 3.1% kernel / faults: 26 minor
01-03 03:47:28.928 1149 1313 E ActivityManager: 8.3% 506/mediaserver: 2.2% user + 6% kernel
01-03 03:47:28.928 1149 1313 E ActivityManager: 6.8% 320/mmcqd/0: 0% user + 6.8% kernel
01-03 03:47:28.928 1149 1313 E ActivityManager: 1.5% 6263/com.sec.spp.push:RemoteDlcProcess: 1.2% user + 0.3% kernel / faults: 547 minor
01-03 03:47:28.928 1149 1313 E ActivityManager: 5% 5677/com.samsung.cas: 1.4% user + 3.6% kernel / faults: 463 minor
01-03 03:47:28.928 1149 1313 E ActivityManager: 5% 7751/screenrecord: 2.1% user + 2.8% kernel
01-03 03:47:28.928 1149 1313 E ActivityManager: 4.5% 29/ksoftirqd/4: 0% user + 4.5% kernel
01-03 03:47:28.928 1149 1313 E ActivityManager: 4.4% 913/kworker/u16:9: 0% user + 4.4% kernel
01-03 03:47:28.928 1149 1313 E ActivityManager: 3.7% 34/ksoftirqd/5: 0% user + 3.7% kernel
01-03 03:47:28.928 1149 1313 E ActivityManager: 3.7% 73/kworker/u16:3: 0% user + 3.7% kernel
01-03 03:47:28.928 1149 1313 E ActivityManager: 3.7% 262/irq/69-madera: 0% user + 3.7% kernel
...
01-03 03:47:28.928 1149 1313 E ActivityManager: 62% TOTAL: 9.5% user + 50% kernel + 1.3% iowait + 0.5% softirq
01-03 03:47:28.928 1149 1313 E ActivityManager: CPU usage from 189522ms to 189522ms ago with 0% awake:
01-03 03:47:28.928 1149 1313 E ActivityManager: 0% TOTAL: 0% user + 0% kernel
- Logs from the ANR timeout window (plus Service/Broadcast history and similar records)
...
Line 33430: 01-03 03:47:01.858 1149 1550 I InputDispatcher: Delivering touch to (7647): x: 520.000, y: 1562.000, flags=0x0, action: 0x0, channel '8d82752 ScrollCaptureUiService (server)', toolType: 0
...
Line 33432: 01-03 03:47:01.858 1149 1550 I InputDispatcher: Delivering touch to (7647): x: 520.456, y: 1560.577, flags=0x0, action: 0x1, channel '8d82752 ScrollCaptureUiService (server)', toolType: 0
...
Line 33447: 01-03 03:47:02.158 1149 1550 I InputDispatcher: Delivering touch to (7647): x: 369.000, y: 1089.000, flags=0x0, action: 0x0, channel '8d82752 ScrollCaptureUiService (server)', toolType: 0
...
Line 33449: 01-03 03:47:02.158 1149 1550 I InputDispatcher: Delivering touch to (7647): x: 375.404, y: 1090.664, flags=0x0, action: 0x1, channel '8d82752 ScrollCaptureUiService (server)', toolType: 0
Line 33481: 01-03 03:47:02.468 1149 1550 D InputDispatcher: Waiting for application to become ready for input: AppWindowToken{d0aeba2b3 token=Token{7242822 ActivityRecord{b24b1ed u0 com.samsung.android.email.provider/com.samsung.android.email.ui.activity.MessageListXL t1044}}} - Window{8d82752 u0 d0 p7647 ScrollCaptureUiService}. Reason: Waiting to send non-key event because the focused window has not finished processing certain input events that were delivered to it over 500.0ms ago. Wait queue length: 4. Wait queue head age: 610.8ms.
Line 33825: 01-03 03:47:07.468 1149 1550 I InputDispatcher: Application is not responding: AppWindowToken{d0aeba2b3 token=Token{7242822 ActivityRecord{b24b1ed u0 com.samsung.android.email.provider/com.samsung.android.email.ui.activity.MessageListXL t1044}}} - Window{8d82752 u0 d0 p7647 ScrollCaptureUiService}. It has been 5007.8ms since event, 5004.9ms since wait started. Reason: Waiting to send non-key event because the focused window has not finished processing certain input events that were delivered to it over 500.0ms ago. Wait queue length: 4. Wait queue head age: 5615.7ms.
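First, look at the input dispatch log for the 5 to 6 seconds before the Email ANR. In that window InputDispatcher delivered no input events to the Email window at all; every input event went to ScrollCaptureUiService. Reading on, it is actually the ScrollCaptureUiService window that is not responding to input, which can be judged from the number of events in its waitQueue and the age of the oldest event. Why does ScrollCaptureUiService not respond? The log shows that com.samsung.android.app.scrollcapture crashed around the time the input events arrived, which explains why the ScrollCaptureUiService window stopped responding. (PS: the details are left for a later discussion of debuggerd64.)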
...
01-03 03:47:01.618 7647 7647 F libc : heap corruption detected by dlmalloc
01-03 03:47:01.618 7647 7647 F libc : Fatal signal 6 (SIGABRT), code -6 in tid 7647 (p.scrollcapture)
01-03 03:47:01.628 7647 7777 D SC_ScrollCapture_JNI: getRgbSumTable : W=1080 H=1920 SumW=1027 SumH=1920 Elapsed=10ms
01-03 03:47:01.698 501 501 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01-03 03:47:01.698 501 501 F DEBUG : Build fingerprint: 'Android/c5proltezc/c5proltechn:6.0.1/MMB29M/C5010ZCE0APJ1c:eng/test-keys'
01-03 03:47:01.698 501 501 F DEBUG : Revision: '0'
01-03 03:47:01.698 501 501 F DEBUG : ABI: 'arm64'
01-03 03:47:01.698 501 501 F DEBUG : pid: 7647, tid: 7647, name: p.scrollcapture >>> com.samsung.android.app.scrollcapture
...
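The waitQueue judgement mentioned above can be automated when scanning many logs. The following is a minimal sketch, not framework code: `WaitQueueInfo`, `parse`, and `looksStuck` are hypothetical names, and the 5000 ms threshold in `main` is only an assumption taken from the roughly 5 s wait visible in this log.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Android: pulls the wait-queue numbers
// out of an InputDispatcher "Reason:" string.
class WaitQueueInfo {
    // Matches e.g. "Wait queue length: 4. Wait queue head age: 5615.7ms."
    private static final Pattern WAIT_QUEUE = Pattern.compile(
            "Wait queue length: (\\d+)\\. Wait queue head age: ([0-9]+(?:\\.[0-9]+)?)ms");

    final int length;        // number of events still waiting for a reply
    final double headAgeMs;  // how long the oldest event has been waiting

    WaitQueueInfo(int length, double headAgeMs) {
        this.length = length;
        this.headAgeMs = headAgeMs;
    }

    // Returns null when the reason string carries no wait-queue details.
    static WaitQueueInfo parse(String reason) {
        Matcher m = WAIT_QUEUE.matcher(reason);
        if (!m.find()) {
            return null;
        }
        return new WaitQueueInfo(
                Integer.parseInt(m.group(1)),
                Double.parseDouble(m.group(2)));
    }

    // Assumption: a window counts as stuck when events are pending and the
    // oldest one has waited at least thresholdMs.
    boolean looksStuck(double thresholdMs) {
        return length > 0 && headAgeMs >= thresholdMs;
    }

    public static void main(String[] args) {
        String reason = "Waiting to send non-key event because the focused window "
                + "has not finished processing certain input events that were "
                + "delivered to it over 500.0ms ago. Wait queue length: 4. "
                + "Wait queue head age: 5615.7ms.";
        WaitQueueInfo info = WaitQueueInfo.parse(reason);
        System.out.println("length=" + info.length
                + " headAgeMs=" + info.headAgeMs
                + " stuck=" + info.looksStuck(5000.0));
    }
}
```

Applied to the two Reason strings in the log, the head age grows from 610.8 ms to 5615.7 ms while the queue length stays at 4, which is the pattern that points at a window that never consumes its events.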
If it is the ScrollCaptureUiService window that fails to respond to input, why is the ANR reported against Email?
InputDispatcher: Waiting for application to become ready for input: AppWindowToken{d0aeba2b3 token=Token{7242822 ActivityRecord{b24b1ed u0 com.samsung.android.email.provider/com.samsung.android.email.ui.activity.MessageListXL t1044}}} - Window{8d82752 u0 d0 p7647 ScrollCaptureUiService}
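From the log we can see that the InputApplicationHandle is AppWindowToken{d0aeba2b3 token=Token{7242822 ActivityRecord{b24b1ed u0 com.samsung.android.email.provider/com.samsung.android.email.ui.activity.MessageListXL t1044}}}, while the InputWindowHandle is Window{8d82752 u0 d0 p7647 ScrollCaptureUiService}. In other words, the window is layered on top of the application. When the window handle and the application handle do not match and an ANR occurs, the system first looks up the application that owns the window handle, here Window{8d82752 u0 d0 p7647 ScrollCaptureUiService}. If that lookup returns null, the system attributes the ANR to the application behind the application handle, here com.samsung.android.email.provider. That is why the ANR is reported against Email.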
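The fallback rule described above can be sketched as follows. This is an illustration of the decision only, not the framework's implementation: `AnrTargetResolver`, `WindowHandle`, `ApplicationHandle`, and `resolveAnrProcess` are hypothetical stand-ins, and a null `owningProcess` models the case where the window's application can no longer be found.

```java
// Hypothetical model of the "who gets the ANR" decision described in the text.
class AnrTargetResolver {

    // Stand-in for the input window handle; owningProcess is null when the
    // application behind the window cannot be resolved.
    static final class WindowHandle {
        final String name;
        final String owningProcess;

        WindowHandle(String name, String owningProcess) {
            this.name = name;
            this.owningProcess = owningProcess;
        }
    }

    // Stand-in for the input application handle.
    static final class ApplicationHandle {
        final String name;
        final String process;

        ApplicationHandle(String name, String process) {
            this.name = name;
            this.process = process;
        }
    }

    // Prefer the application that owns the window; if it cannot be found,
    // fall back to the application behind the application handle.
    static String resolveAnrProcess(ApplicationHandle app, WindowHandle window) {
        if (window != null && window.owningProcess != null) {
            return window.owningProcess;
        }
        if (app != null) {
            return app.process;
        }
        return null;
    }

    public static void main(String[] args) {
        ApplicationHandle email = new ApplicationHandle(
                "MessageListXL", "com.samsung.android.email.provider");
        // The window's application lookup is empty, as in the case analysed here.
        WindowHandle scrollCapture = new WindowHandle("ScrollCaptureUiService", null);
        System.out.println(resolveAnrProcess(email, scrollCapture));
    }
}
```

With the window's owner unresolved, the sketch blames the Email process, matching the "ANR in com.samsung.android.email.provider" line in the log.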
How to avoid ANR
For this, refer to the recommendations on the Android Developers site. Once you understand how an ANR is triggered, avoiding one follows naturally.
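Since every timeout discussed in this article fires because the thread that should answer is busy, the common remedy is to keep slow work off that thread. The sketch below shows the idea in plain Java so it runs outside Android; `BackgroundWork`, `slowSum`, and `sumAsync` are hypothetical names, and `slowSum` merely stands in for real blocking work such as disk or network access. On Android the result would normally be posted back to the main thread rather than printed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example: run slow work on a worker thread so the thread that
// handles input stays free to respond.
class BackgroundWork {
    // Daemon worker thread so a forgotten shutdown() cannot keep the JVM alive.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "background-worker");
        t.setDaemon(true);
        return t;
    });

    // Placeholder for a slow operation: sums 1..n.
    static int slowSum(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Returns immediately; the computation happens on the worker thread.
    CompletableFuture<Integer> sumAsync(int n) {
        return CompletableFuture.supplyAsync(() -> slowSum(n), worker);
    }

    void shutdown() {
        worker.shutdown();
    }

    public static void main(String[] args) {
        BackgroundWork bg = new BackgroundWork();
        // The caller only registers a callback; join() is used here solely so
        // the demo waits for the result before exiting.
        bg.sumAsync(100)
          .thenAccept(result -> System.out.println("sum = " + result))
          .join();
        bg.shutdown();
    }
}
```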