Using a ContentProvider on Android Can Get Your Process Killed

Author: 大大大大大先生 | Published: 2018-06-30 17:39

    The problem found when using a ContentProvider

    • Suppose an Android device has two apps. app1 exposes a ContentProvider (the server side), and app2 uses it (the client side). When app1's process is killed, app2's process is killed as well. At first this seemed thoroughly "unreasonable", so I went to the source code to find the reasoning behind it.

    A brief introduction to using a ContentProvider

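    • ContentProvider usage is split into a server and a client: the server is the data provider and the client is the data consumer. On the server side, define a class that extends ContentProvider and override the methods it declares, such as query, delete, and insert; remember to declare the custom ContentProvider in AndroidManifest.xml. On the client side, call context.getContentResolver() to get a ContentResolver and invoke its query, delete, update, and similar methods. How does such a call find the right ContentProvider? Through the uri argument; the source analysis that follows has the details.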
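
    As a rough plain-Java model of this client/server split (not the Android API: ToyProvider, ToyResolver, and the authority com.example.app1.provider are all hypothetical names chosen for illustration), the sketch below shows a resolver routing a call to whichever provider was registered under the URI's authority:

```java
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a ContentProvider subclass on the server side.
interface ToyProvider {
    List<String> query(URI uri);
}

// Hypothetical stand-in for ContentResolver on the client side.
class ToyResolver {
    // Plays the role of the providers declared in AndroidManifest.xml.
    private final Map<String, ToyProvider> byAuthority = new HashMap<>();

    void register(String authority, ToyProvider provider) {
        byAuthority.put(authority, provider);
    }

    // Routes the call by the URI's authority. Returns null when the scheme
    // is not "content" or when no provider matches the authority.
    List<String> query(URI uri) {
        if (!"content".equals(uri.getScheme())) {
            return null;
        }
        ToyProvider provider = byAuthority.get(uri.getAuthority());
        return provider == null ? null : provider.query(uri);
    }
}

class ToyResolverDemo {
    public static void main(String[] args) {
        ToyResolver resolver = new ToyResolver();
        resolver.register("com.example.app1.provider",
                uri -> Collections.singletonList("row for " + uri.getPath()));
        System.out.println(resolver.query(
                URI.create("content://com.example.app1.provider/items")));
    }
}
```

    On a real device the registry lives in the system rather than in the client, which is exactly why the lookup becomes a cross-process call.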

    The ContentProvider call flow

    • Take any ContentResolver method, for example query:
    public final @Nullable Cursor query(final @RequiresPermission.Read @NonNull Uri uri,
                @Nullable String[] projection, @Nullable String selection,
                @Nullable String[] selectionArgs, @Nullable String sortOrder,
                @Nullable CancellationSignal cancellationSignal) {
            Preconditions.checkNotNull(uri, "uri");
            IContentProvider unstableProvider = acquireUnstableProvider(uri);
            if (unstableProvider == null) {
                return null;
            }
            IContentProvider stableProvider = null;
            Cursor qCursor = null;
            try {
                long startTime = SystemClock.uptimeMillis();
    
                ICancellationSignal remoteCancellationSignal = null;
                if (cancellationSignal != null) {
                    cancellationSignal.throwIfCanceled();
                    remoteCancellationSignal = unstableProvider.createCancellationSignal();
                    cancellationSignal.setRemote(remoteCancellationSignal);
                }
                try {
                    qCursor = unstableProvider.query(mPackageName, uri, projection,
                            selection, selectionArgs, sortOrder, remoteCancellationSignal);
                } catch (DeadObjectException e) {
                    // The remote process has died...  but we only hold an unstable
                    // reference though, so we might recover!!!  Let's try!!!!
                    // This is exciting!!1!!1!!!!1
                    unstableProviderDied(unstableProvider);
                    stableProvider = acquireProvider(uri);
                    if (stableProvider == null) {
                        return null;
                    }
                    qCursor = stableProvider.query(mPackageName, uri, projection,
                            selection, selectionArgs, sortOrder, remoteCancellationSignal);
                }
                if (qCursor == null) {
                    return null;
                }
    
                // Force query execution.  Might fail and throw a runtime exception here.
                qCursor.getCount();
                long durationMillis = SystemClock.uptimeMillis() - startTime;
                maybeLogQueryToEventLog(durationMillis, uri, projection, selection, sortOrder);
    
                // Wrap the cursor object into CursorWrapperInner object.
                final IContentProvider provider = (stableProvider != null) ? stableProvider
                        : acquireProvider(uri);
                final CursorWrapperInner wrapper = new CursorWrapperInner(qCursor, provider);
                stableProvider = null;
                qCursor = null;
                return wrapper;
            } catch (RemoteException e) {
                // Arbitrary and not worth documenting, as Activity
                // Manager will kill this process shortly anyway.
                return null;
            } finally {
                if (qCursor != null) {
                    qCursor.close();
                }
                if (cancellationSignal != null) {
                    cancellationSignal.setRemote(null);
                }
                if (unstableProvider != null) {
                    releaseUnstableProvider(unstableProvider);
                }
                if (stableProvider != null) {
                    releaseProvider(stableProvider);
                }
            }
        }
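
    The retry logic in query can be hard to see through all the bookkeeping. Below is a minimal plain-Java sketch of just that pattern: try an unstable reference first and, if the remote side turns out to be dead, fall back to a stable reference once. ProviderDiedException and ToyRemoteProvider are hypothetical stand-ins for DeadObjectException and IContentProvider so that the snippet runs on a plain JVM.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for android.os.DeadObjectException.
class ProviderDiedException extends Exception {
}

// Hypothetical stand-in for IContentProvider, reduced to a single call.
interface ToyRemoteProvider {
    String query(String uri) throws ProviderDiedException;
}

class UnstableThenStableSketch {
    // Try the unstable reference first; if the remote process is dead,
    // acquire a stable reference and retry exactly once.
    static String query(String uri,
                        Supplier<ToyRemoteProvider> acquireUnstable,
                        Supplier<ToyRemoteProvider> acquireStable) {
        ToyRemoteProvider unstable = acquireUnstable.get();
        if (unstable == null) {
            return null;
        }
        try {
            return unstable.query(uri);
        } catch (ProviderDiedException e) {
            ToyRemoteProvider stable = acquireStable.get();
            if (stable == null) {
                return null;
            }
            try {
                return stable.query(uri);
            } catch (ProviderDiedException again) {
                // The quoted method also gives up and returns null here.
                return null;
            }
        }
    }

    public static void main(String[] args) {
        ToyRemoteProvider dead = uri -> { throw new ProviderDiedException(); };
        ToyRemoteProvider alive = uri -> "rows for " + uri;
        System.out.println(query("content://demo/items", () -> dead, () -> alive));
    }
}
```

    The point to notice is that the first attempt deliberately uses the weaker (unstable) reference, so a dead provider is survivable at that stage.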
    

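    ContentProvider is one of the four major Android components and supports cross-process calls, so it necessarily relies on the Binder IPC mechanism; at the application layer this takes the form of an AIDL-style interface. The first thing query does is call acquireUnstableProvider: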

    public final IContentProvider acquireUnstableProvider(Uri uri) {
            if (!SCHEME_CONTENT.equals(uri.getScheme())) {
                return null;
            }
            String auth = uri.getAuthority();
            if (auth != null) {
                return acquireUnstableProvider(mContext, uri.getAuthority());
            }
            return null;
        }
    

    acquireUnstableProvider uses the URI to obtain an IContentProvider, an AIDL-style Binder interface. Following the call further leads to ApplicationContentResolver, an inner class of ContextImpl. It extends ContentResolver and implements the acquireUnstableProvider(Context, String) method, so the call ends up there:

    ContextImpl.java
    
    mContentResolver = new ApplicationContentResolver(this, mainThread, user);
    
    public ContentResolver getContentResolver() {
            return mContentResolver;
        }
    
    ApplicationContentResolver class
    
    @Override
            protected IContentProvider acquireUnstableProvider(Context c, String auth) {
                return mMainThread.acquireProvider(c,
                        ContentProvider.getAuthorityWithoutUserId(auth),
                        resolveUserIdFromAuthority(auth), false);
            }
    

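    At this point we have reached ActivityThread.acquireProvider. ActivityThread lives in the app's own process and is created when that process starts. It defines the interfaces through which the four major components talk to system_process; in effect it wraps the bridge between the components and system_process, and that bridge is Binder IPC through AIDL-style interfaces: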

    public final IContentProvider acquireProvider(
                Context c, String auth, int userId, boolean stable) {
            final IContentProvider provider = acquireExistingProvider(c, auth, userId, stable);
            if (provider != null) {
                return provider;
            }
    
            // There is a possible race here.  Another thread may try to acquire
            // the same provider at the same time.  When this happens, we want to ensure
            // that the first one wins.
            // Note that we cannot hold the lock while acquiring and installing the
            // provider since it might take a long time to run and it could also potentially
            // be re-entrant in the case where the provider is in the same process.
            IActivityManager.ContentProviderHolder holder = null;
            try {
                holder = ActivityManagerNative.getDefault().getContentProvider(
                        getApplicationThread(), auth, userId, stable);
            } catch (RemoteException ex) {
                throw ex.rethrowFromSystemServer();
            }
            if (holder == null) {
                Slog.e(TAG, "Failed to find provider info for " + auth);
                return null;
            }
    
            // Install provider will increment the reference count for us, and break
            // any ties in the race.
            holder = installProvider(c, holder, holder.info,
                    true /*noisy*/, holder.noReleaseNeeded, stable);
            return holder.provider;
        }
    

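    This is the call to ActivityManagerNative's getContentProvider. ActivityManagerNative.getDefault() returns the app-process Binder proxy for ActivityManagerService, so this is already a cross-process call. Before making it, though, acquireExistingProvider checks whether a matching ContentProvider is already cached: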

    public final IContentProvider acquireExistingProvider(
                Context c, String auth, int userId, boolean stable) {
            synchronized (mProviderMap) {
                final ProviderKey key = new ProviderKey(auth, userId);
                final ProviderClientRecord pr = mProviderMap.get(key);
                if (pr == null) {
                    return null;
                }
    
                IContentProvider provider = pr.mProvider;
                IBinder jBinder = provider.asBinder();
                if (!jBinder.isBinderAlive()) {
                    // The hosting process of the provider has died; we can't
                    // use this one.
                    Log.i(TAG, "Acquiring provider " + auth + " for user " + userId
                            + ": existing object's process dead");
                    handleUnstableProviderDiedLocked(jBinder, true);
                    return null;
                }
    
                // Only increment the ref count if we have one.  If we don't then the
                // provider is not reference counted and never needs to be released.
                ProviderRefCount prc = mProviderRefCountMap.get(jBinder);
                if (prc != null) {
                    incProviderRefLocked(prc, stable);
                }
                return provider;
            }
        }
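
    The cache check can be summarized as "look up by (authority, userId), refuse dead entries, otherwise bump a reference count". Here is a plain-Java sketch of that idea; ProviderCacheSketch and Entry are hypothetical names, the alive flag stands in for IBinder.isBinderAlive(), and dropping the dead entry is a simplification of the cleanup the real code delegates to handleUnstableProviderDiedLocked.

```java
import java.util.HashMap;
import java.util.Map;

class ProviderCacheSketch {
    // Hypothetical stand-in for a cached provider and its binder state.
    static final class Entry {
        boolean alive = true;   // plays the role of IBinder.isBinderAlive()
        int stableCount;
        int unstableCount;
    }

    private final Map<String, Entry> cache = new HashMap<>();

    private static String keyOf(String auth, int userId) {
        return userId + "/" + auth;
    }

    synchronized void put(String auth, int userId, Entry entry) {
        cache.put(keyOf(auth, userId), entry);
    }

    // Returns the cached entry and increments the matching ref count, or
    // null when nothing is cached or the hosting process has died (in which
    // case the stale entry is removed).
    synchronized Entry acquireExisting(String auth, int userId, boolean stable) {
        String key = keyOf(auth, userId);
        Entry entry = cache.get(key);
        if (entry == null) {
            return null;
        }
        if (!entry.alive) {
            cache.remove(key);
            return null;
        }
        if (stable) {
            entry.stableCount++;
        } else {
            entry.unstableCount++;
        }
        return entry;
    }

    public static void main(String[] args) {
        ProviderCacheSketch cache = new ProviderCacheSketch();
        Entry entry = new Entry();
        cache.put("com.example.app1.provider", 0, entry);
        cache.acquireExisting("com.example.app1.provider", 0, true);
        System.out.println("stable=" + entry.stableCount
                + " unstable=" + entry.unstableCount);
    }
}
```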
    

    One point first: matching a URI to its ContentProvider uses the value returned by the URI's getAuthority() method. What does it return? See the doc comment:

    /**
         * Gets the decoded authority part of this URI. For
         * server addresses, the authority is structured as follows:
         * {@code [ userinfo '@' ] host [ ':' port ]}
         *
         * <p>Examples: "google.com", "bob@google.com:80"
         *
         * @return the authority for this URI or null if not present
         */
        public abstract String getAuthority();
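
    To see what an authority looks like in practice, the following sketch uses java.net.URI as a stand-in for android.net.Uri (an assumption made so the snippet runs on a plain JVM; the provider class name and the authorities are hypothetical). It also shows one provider being indexed under several semicolon-separated authorities, which is the form android:authorities accepts:

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class AuthoritySketch {
    // Extracts the authority component: [userinfo@]host[:port].
    static String authorityOf(String uri) {
        return URI.create(uri).getAuthority();
    }

    // Registers one provider class under every authority in a
    // semicolon-separated list.
    static Map<String, String> index(String authorities, String providerClass) {
        Map<String, String> byAuthority = new HashMap<>();
        for (String authority : authorities.split(";")) {
            byAuthority.put(authority, providerClass);
        }
        return byAuthority;
    }

    public static void main(String[] args) {
        // For a server-style URL the port is part of the authority.
        System.out.println(authorityOf("http://domain:8080/path"));
        // For a content URI the authority is the provider's registered name.
        System.out.println(authorityOf("content://com.example.app1.provider/items/1"));

        Map<String, String> providers =
                index("com.example.a;com.example.b", "com.example.MyProvider");
        System.out.println(providers.get("com.example.b"));
    }
}
```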
    

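    Put plainly, it returns the "host" part of the URI: for an HTTP URL such as http://domain:port/ the authority is domain:port (the host plus the optional port, per the doc comment). For a content:// URI, this value is matched against the android:authorities attribute that each ContentProvider declares in the manifest, and that is how the right ContentProvider is found. Back to the main thread: if acquireExistingProvider returns null, the client calls getContentProvider through ActivityManagerNative over Binder, which returns an IActivityManager.ContentProviderHolder. From here we are in ActivityManagerService. getContentProvider eventually calls getContentProviderImpl; that method is long, so only its core part is shown: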

    ComponentName comp = new ComponentName(cpi.packageName, cpi.name);
                    checkTime(startTime, "getContentProviderImpl: before getProviderByClass");
                    cpr = mProviderMap.getProviderByClass(comp, userId);
                    checkTime(startTime, "getContentProviderImpl: after getProviderByClass");
                    final boolean firstClass = cpr == null;
                    if (firstClass) {
                        final long ident = Binder.clearCallingIdentity();
    
                        // If permissions need a review before any of the app components can run,
                        // we return no provider and launch a review activity if the calling app
                        // is in the foreground.
                        if (Build.PERMISSIONS_REVIEW_REQUIRED) {
                            if (!requestTargetProviderPermissionsReviewIfNeededLocked(cpi, r, userId)) {
                                return null;
                            }
                        }
    
                        try {
                            checkTime(startTime, "getContentProviderImpl: before getApplicationInfo");
                            ApplicationInfo ai =
                                AppGlobals.getPackageManager().
                                    getApplicationInfo(
                                            cpi.applicationInfo.packageName,
                                            STOCK_PM_FLAGS, userId);
                            checkTime(startTime, "getContentProviderImpl: after getApplicationInfo");
                            if (ai == null) {
                                Slog.w(TAG, "No package info for content provider "
                                        + cpi.name);
                                return null;
                            }
                            ai = getAppInfoForUser(ai, userId);
                            cpr = new ContentProviderRecord(this, cpi, ai, comp, singleton);
                        } catch (RemoteException ex) {
                            // pm is in same process, this will never happen.
                        } finally {
                            Binder.restoreCallingIdentity(ident);
                        }
                    }
    
                    checkTime(startTime, "getContentProviderImpl: now have ContentProviderRecord");
    
                    if (r != null && cpr.canRunHere(r)) {
                        // If this is a multiprocess provider, then just return its
                        // info and allow the caller to instantiate it.  Only do
                        // this if the provider is the same user as the caller's
                        // process, or can run as root (so can be in any process).
                        return cpr.newHolder(null);
                    }
    
                    if (DEBUG_PROVIDER) Slog.w(TAG_PROVIDER, "LAUNCHING REMOTE PROVIDER (myuid "
                                + (r != null ? r.uid : null) + " pruid " + cpr.appInfo.uid + "): "
                                + cpr.info.name + " callers=" + Debug.getCallers(6));
    
                    // This is single process, and our app is now connecting to it.
                    // See if we are already in the process of launching this
                    // provider.
                    final int N = mLaunchingProviders.size();
                    int i;
                    for (i = 0; i < N; i++) {
                        if (mLaunchingProviders.get(i) == cpr) {
                            break;
                        }
                    }
    
                    // If the provider is not already being launched, then get it
                    // started.
                    if (i >= N) {
                        final long origId = Binder.clearCallingIdentity();
    
                        try {
                            // Content provider is now in use, its package can't be stopped.
                            try {
                                checkTime(startTime, "getContentProviderImpl: before set stopped state");
                                AppGlobals.getPackageManager().setPackageStoppedState(
                                        cpr.appInfo.packageName, false, userId);
                                checkTime(startTime, "getContentProviderImpl: after set stopped state");
                            } catch (RemoteException e) {
                            } catch (IllegalArgumentException e) {
                                Slog.w(TAG, "Failed trying to unstop package "
                                        + cpr.appInfo.packageName + ": " + e);
                            }
    
                            // Use existing process if already started
                            checkTime(startTime, "getContentProviderImpl: looking for process record");
                            ProcessRecord proc = getProcessRecordLocked(
                                    cpi.processName, cpr.appInfo.uid, false);
                            if (proc != null && proc.thread != null && !proc.killed) {
                                if (DEBUG_PROVIDER) Slog.d(TAG_PROVIDER,
                                        "Installing in existing process " + proc);
                                if (!proc.pubProviders.containsKey(cpi.name)) {
                                    checkTime(startTime, "getContentProviderImpl: scheduling install");
                                    proc.pubProviders.put(cpi.name, cpr);
                                    try {
                                        proc.thread.scheduleInstallProvider(cpi);
                                    } catch (RemoteException e) {
                                    }
                                }
                            } else {
                                checkTime(startTime, "getContentProviderImpl: before start process");
                                proc = startProcessLocked(cpi.processName,
                                        cpr.appInfo, false, 0, "content provider",
                                        new ComponentName(cpi.applicationInfo.packageName,
                                                cpi.name), false, false, false);
                                checkTime(startTime, "getContentProviderImpl: after start process");
                                if (proc == null) {
                                    Slog.w(TAG, "Unable to launch app "
                                            + cpi.applicationInfo.packageName + "/"
                                            + cpi.applicationInfo.uid + " for provider "
                                            + name + ": process is bad");
                                    return null;
                                }
                            }
                            cpr.launchingApp = proc;
                            mLaunchingProviders.add(cpr);
                        } finally {
                            Binder.restoreCallingIdentity(origId);
                        }
                    }
    
                    checkTime(startTime, "getContentProviderImpl: updating data structures");
    
                    // Make sure the provider is published (the same provider class
                    // may be published under multiple names).
                    if (firstClass) {
                        mProviderMap.putProviderByClass(comp, cpr);
                    }
    
                    mProviderMap.putProviderByName(name, cpr);
                    conn = incProviderCountLocked(r, cpr, token, stable);
                    if (conn != null) {
                        conn.waiting = true;
                    }
                }
                checkTime(startTime, "getContentProviderImpl: done!");
    

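    Here, a ContentProviderRecord is created from the ProviderInfo object (cpi) that PackageManagerService built earlier from the ContentProvider declaration in the manifest. Next, cpr.canRunHere(r) decides whether the provider may simply be instantiated inside the calling process (as the code comment says, a multiprocess provider that belongs to the same user as the caller); if so, the holder is returned immediately. Otherwise the code checks whether the server-side process is already running and starts it if it is not, then caches the ContentProviderRecord. That completes the creation of the ContentProvider, but the end of the method contains a rather odd piece of code: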

    // Wait for the provider to be published...
            synchronized (cpr) {
                while (cpr.provider == null) {
                    if (cpr.launchingApp == null) {
                        Slog.w(TAG, "Unable to launch app "
                                + cpi.applicationInfo.packageName + "/"
                                + cpi.applicationInfo.uid + " for provider "
                                + name + ": launching app became null");
                        EventLog.writeEvent(EventLogTags.AM_PROVIDER_LOST_PROCESS,
                                UserHandle.getUserId(cpi.applicationInfo.uid),
                                cpi.applicationInfo.packageName,
                                cpi.applicationInfo.uid, name);
                        return null;
                    }
                    try {
                        if (DEBUG_MU) Slog.v(TAG_MU,
                                "Waiting to start provider " + cpr
                                + " launchingApp=" + cpr.launchingApp);
                        if (conn != null) {
                            conn.waiting = true;
                        }
                        cpr.wait();
                    } catch (InterruptedException ex) {
                    } finally {
                        if (conn != null) {
                            conn.waiting = false;
                        }
                    }
                }
            }
    

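    The comment says this waits for the ContentProvider to be published, so let's see how publishing works. cpr.provider == null evidently means "not yet published". Looking back up at the earlier code, I had skipped an important detail right after the ContentProviderRecord is created: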

    ProcessRecord proc = getProcessRecordLocked(
                                    cpi.processName, cpr.appInfo.uid, false);
                            if (proc != null && proc.thread != null && !proc.killed) {
                                if (DEBUG_PROVIDER) Slog.d(TAG_PROVIDER,
                                        "Installing in existing process " + proc);
                                if (!proc.pubProviders.containsKey(cpi.name)) {
                                    checkTime(startTime, "getContentProviderImpl: scheduling install");
                                    proc.pubProviders.put(cpi.name, cpr);
                                    try {
                                        proc.thread.scheduleInstallProvider(cpi);
                                    } catch (RemoteException e) {
                                    }
                                }
                            }
    

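    If the server process is already running, the code checks whether the provider is already in that process's map of published providers (pubProviders); if it is not, it calls scheduleInstallProvider, which leads back into ActivityThread: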

    public void scheduleInstallProvider(ProviderInfo provider) {
                sendMessage(H.INSTALL_PROVIDER, provider);
            }
    

    Tracing H.INSTALL_PROVIDER shows that the Handler runs handleInstallProvider, which in turn calls installContentProviders:

    public void handleInstallProvider(ProviderInfo info) {
            final StrictMode.ThreadPolicy oldPolicy = StrictMode.allowThreadDiskWrites();
            try {
                installContentProviders(mInitialApplication, Lists.newArrayList(info));
            } finally {
                StrictMode.setThreadPolicy(oldPolicy);
            }
        }
    private void installContentProviders(
                Context context, List<ProviderInfo> providers) {
            final ArrayList<IActivityManager.ContentProviderHolder> results =
                new ArrayList<IActivityManager.ContentProviderHolder>();
    
            for (ProviderInfo cpi : providers) {
                if (DEBUG_PROVIDER) {
                    StringBuilder buf = new StringBuilder(128);
                    buf.append("Pub ");
                    buf.append(cpi.authority);
                    buf.append(": ");
                    buf.append(cpi.name);
                    Log.i(TAG, buf.toString());
                }
                IActivityManager.ContentProviderHolder cph = installProvider(context, null, cpi,
                        false /*noisy*/, true /*noReleaseNeeded*/, true /*stable*/);
                if (cph != null) {
                    cph.noReleaseNeeded = true;
                    results.add(cph);
                }
            }
    
            try {
                ActivityManagerNative.getDefault().publishContentProviders(
                    getApplicationThread(), results);
            } catch (RemoteException ex) {
                throw ex.rethrowFromSystemServer();
            }
        }
    

    Here we see the publishing method, publishContentProviders, which takes us back to ActivityManagerService:

    public final void publishContentProviders(IApplicationThread caller,
                List<ContentProviderHolder> providers) {
            if (providers == null) {
                return;
            }
    
            enforceNotIsolatedCaller("publishContentProviders");
            synchronized (this) {
                final ProcessRecord r = getRecordForAppLocked(caller);
                if (DEBUG_MU) Slog.v(TAG_MU, "ProcessRecord uid = " + r.uid);
                if (r == null) {
                    throw new SecurityException(
                            "Unable to find app for caller " + caller
                          + " (pid=" + Binder.getCallingPid()
                          + ") when publishing content providers");
                }
    
                final long origId = Binder.clearCallingIdentity();
    
                final int N = providers.size();
                for (int i = 0; i < N; i++) {
                    ContentProviderHolder src = providers.get(i);
                    if (src == null || src.info == null || src.provider == null) {
                        continue;
                    }
                    ContentProviderRecord dst = r.pubProviders.get(src.info.name);
                    if (DEBUG_MU) Slog.v(TAG_MU, "ContentProviderRecord uid = " + dst.uid);
                    if (dst != null) {
                        ComponentName comp = new ComponentName(dst.info.packageName, dst.info.name);
                        mProviderMap.putProviderByClass(comp, dst);
                        String names[] = dst.info.authority.split(";");
                        for (int j = 0; j < names.length; j++) {
                            mProviderMap.putProviderByName(names[j], dst);
                        }
    
                        int launchingCount = mLaunchingProviders.size();
                        int j;
                        boolean wasInLaunchingProviders = false;
                        for (j = 0; j < launchingCount; j++) {
                            if (mLaunchingProviders.get(j) == dst) {
                                mLaunchingProviders.remove(j);
                                wasInLaunchingProviders = true;
                                j--;
                                launchingCount--;
                            }
                        }
                        if (wasInLaunchingProviders) {
                            mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG, r);
                        }
                        synchronized (dst) {
                            dst.provider = src.provider;
                            dst.proc = r;
                            dst.notifyAll();
                        }
                        updateOomAdjLocked(r);
                        maybeUpdateProviderUsageStatsLocked(r, src.info.packageName,
                                src.info.authority);
                    }
                }
    
                Binder.restoreCallingIdentity(origId);
            }
        }
    

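    This method is fairly simple. It assigns the provider field of the ContentProviderRecord and then calls notifyAll, which wakes any thread blocked in cpr.wait() in getContentProviderImpl: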

    synchronized (dst) {
                            dst.provider = src.provider;
                            dst.proc = r;
                            dst.notifyAll();
                        }
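
    The waiting loop in getContentProviderImpl and this notifyAll form a classic guarded-wait handshake on one monitor. The plain-Java sketch below reproduces only that pattern; Record, launchFailed, and the method names are hypothetical, with launchFailed standing in for the launchingApp == null check in the quoted code.

```java
class PublishHandshakeSketch {
    // Hypothetical stand-in for ContentProviderRecord.
    static final class Record {
        Object provider;        // null until published
        boolean launchFailed;   // plays the role of launchingApp == null
    }

    // Caller side: block until the provider is published or the launch fails.
    static Object awaitPublished(Record cpr) {
        synchronized (cpr) {
            while (cpr.provider == null) {
                if (cpr.launchFailed) {
                    return null;
                }
                try {
                    cpr.wait();
                } catch (InterruptedException e) {
                    // Unlike the quoted code, give up when interrupted.
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            return cpr.provider;
        }
    }

    // Publisher side: set the field under the same lock and wake all waiters.
    static void publish(Record cpr, Object provider) {
        synchronized (cpr) {
            cpr.provider = provider;
            cpr.notifyAll();
        }
    }

    // Failure side: mark the launch as failed and wake all waiters.
    static void fail(Record cpr) {
        synchronized (cpr) {
            cpr.launchFailed = true;
            cpr.notifyAll();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Record cpr = new Record();
        Thread publisher = new Thread(() -> publish(cpr, "binder-for-demo"));
        publisher.start();
        System.out.println(awaitPublished(cpr));
        publisher.join();
    }
}
```

    Re-checking the condition in a while loop, rather than an if, is what makes the wait safe against spurious wakeups and against the publisher running first.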
    

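    So publishing is done. What is it actually for? Publishing a ContentProvider amounts to initializing the provider field of its ContentProviderRecord. That field is of type IContentProvider, the Java interface that declares the cross-process methods, and the class that implements it lives inside ContentProvider. The object is obtained while the provider is being installed: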

    try {
                    final java.lang.ClassLoader cl = c.getClassLoader();
                    localProvider = (ContentProvider)cl.
                        loadClass(info.name).newInstance();
                    provider = localProvider.getIContentProvider();
                    if (provider == null) {
                        Slog.e(TAG, "Failed to instantiate class " +
                              info.name + " from sourceDir " +
                              info.applicationInfo.sourceDir);
                        return null;
                    }
                    if (DEBUG_PROVIDER) Slog.v(
                        TAG, "Instantiating local provider " + info.name);
                    // XXX Need to create the correct context for this provider.
                    localProvider.attachInfo(c, info);
                } catch (java.lang.Exception e) {
                    if (!mInstrumentation.onException(null, e)) {
                        throw new RuntimeException(
                                "Unable to get provider " + info.name
                                + ": " + e.toString(), e);
                    }
                    return null;
                }
    

    This runs in the ContentProvider's server process: a ContentProvider instance is created via reflection, and the value returned by getIContentProvider is assigned to provider:

    private Transport mTransport = new Transport();
    
    public IContentProvider getIContentProvider() {
            return mTransport;
        }
    

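    Transport extends ContentProviderNative, and ContentProviderNative implements IContentProvider, so ContentProviderNative is the local implementation of the remote IContentProvider interface, and all of ContentProvider's cross-process communication goes through it. That completes the initialization of provider.

    Now let's sort out what provider is for. The server side exposes a set of interfaces to client processes, and calls to them travel over Binder. If the server process calls its own ContentProvider, it has localProvider at hand and uses it directly; caller and provider are in the same process, so no cross-process call is needed. If the caller is in a different process, the provider object initialized on the server side has already been cached in ActivityManagerService (in the system_process process), inside the ContentProviderRecord, when the provider was published. ActivityManagerService.getContentProvider then returns a ContentProviderHolder to the client, and that holder carries the provider object. In other words, the server initializes provider, and the object is passed across processes to each client process, which uses it to make cross-process calls. That seems to wrap things up, so back to the actual topic.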

    The actual topic

    Earlier I said that when the ContentProvider server process dies, the client process is killed too. The logs contained this line:

    depends on provider...in dying proc...
    

    Simple enough: find where this message is logged, which is ActivityManagerService.removeDyingProviderLocked:

    if (conn.stableCount > 0) {
                    if (!capp.persistent && capp.thread != null
                            && capp.pid != 0
                            && capp.pid != MY_PID) {
                        capp.kill("depends on provider "
                                + cpr.name.flattenToShortString()
                                + " in dying proc " + (proc != null ? proc.processName : "??")
                                + " (adj " + (proc != null ? proc.setAdj : "??") + ")", true);
                    }
                }
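
    Stripped of process bookkeeping, the rule in this snippet is: a client is killed only if it holds at least one stable reference to the dying provider and is not a persistent process. The plain-Java sketch below models just that decision; Connection and clientsToKill are hypothetical names, and the thread/pid checks from the quoted condition are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DyingProviderSketch {
    // Hypothetical stand-in for one client connection to the provider.
    static final class Connection {
        final String clientProcess;
        final int stableCount;
        final int unstableCount;
        final boolean clientPersistent;

        Connection(String clientProcess, int stableCount, int unstableCount,
                   boolean clientPersistent) {
            this.clientProcess = clientProcess;
            this.stableCount = stableCount;
            this.unstableCount = unstableCount;
            this.clientPersistent = clientPersistent;
        }
    }

    // Returns the client processes that would be killed when the provider's
    // host process dies. Unstable references never cause a kill.
    static List<String> clientsToKill(List<Connection> connections) {
        List<String> toKill = new ArrayList<>();
        for (Connection conn : connections) {
            if (conn.stableCount > 0 && !conn.clientPersistent) {
                toKill.add(conn.clientProcess);
            }
        }
        return toKill;
    }

    public static void main(String[] args) {
        List<Connection> connections = Arrays.asList(
                new Connection("app2", 1, 0, false),        // stable ref
                new Connection("app3", 0, 2, false),        // unstable refs only
                new Connection("persistentApp", 1, 0, true) // persistent client
        );
        System.out.println(clientsToKill(connections));
    }
}
```

    In this example only app2 is selected: app3 holds nothing but unstable references, and persistentApp is exempt.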
    

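    In my code, the ContentProvider server process was killed with ActivityManager.forceStopPackage (invoked through reflection). I won't trace the whole kill path; it ends up in ActivityManagerService.cleanUpApplicationRecordLocked, which calls removeDyingProviderLocked. At the moment the server process is killed, the connection's stableCount is greater than 0, so the corresponding client process is killed. Why is stableCount greater than 0? To find out, let's look at how stableCount is initialized and updated, starting with ContentResolver.update, which the project does call: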

    public final int update(@RequiresPermission.Write @NonNull Uri uri,
                @Nullable ContentValues values, @Nullable String where,
                @Nullable String[] selectionArgs) {
            Preconditions.checkNotNull(uri, "uri");
            IContentProvider provider = acquireProvider(uri);
            if (provider == null) {
                throw new IllegalArgumentException("Unknown URI " + uri);
            }
            try {
                long startTime = SystemClock.uptimeMillis();
                int rowsUpdated = provider.update(mPackageName, uri, values, where, selectionArgs);
                long durationMillis = SystemClock.uptimeMillis() - startTime;
                maybeLogUpdateToEventLog(durationMillis, uri, "update", where);
                return rowsUpdated;
            } catch (RemoteException e) {
                // Arbitrary and not worth documenting, as Activity
                // Manager will kill this process shortly anyway.
                return -1;
            } finally {
                releaseProvider(provider);
            }
        }
    

    There is an acquireProvider and a matching releaseProvider. releaseProvider is the simpler of the two, so let's see what it does first. It ends up in ActivityThread.releaseProvider; only the relevant part is shown. For update, the stable argument is true:

    if (stable) {
        if (prc.stableCount == 0) {
            if (DEBUG_PROVIDER) Slog.v(TAG,
                    "releaseProvider: stable ref count already 0, how?");
            return false;
        }
        prc.stableCount -= 1;
        if (prc.stableCount == 0) {
            // What we do at this point depends on whether there are
            // any unstable refs left: if there are, we just tell the
            // activity manager to decrement its stable count; if there
            // aren't, we need to enqueue this provider to be removed,
            // and convert to holding a single unstable ref while
            // doing so.
            lastRef = prc.unstableCount == 0;
            try {
                if (DEBUG_PROVIDER) {
                    Slog.v(TAG, "releaseProvider: No longer stable w/lastRef="
                            + lastRef + " - " + prc.holder.info.name);
                }
                ActivityManagerNative.getDefault().refContentProvider(
                        prc.holder.connection, -1, lastRef ? 1 : 0);
            } catch (RemoteException e) {
                //do nothing content provider object is dead any way
            }
        }
    }
    

    This first decrements stableCount and then, once stableCount has reached 0, calls ActivityManagerService's refContentProvider method, passing -1 as the stable argument:

    public boolean refContentProvider(IBinder connection, int stable, int unstable) {
            ContentProviderConnection conn;
            try {
                conn = (ContentProviderConnection)connection;
            } catch (ClassCastException e) {
                String msg ="refContentProvider: " + connection
                        + " not a ContentProviderConnection";
                Slog.w(TAG, msg);
                throw new IllegalArgumentException(msg);
            }
            if (conn == null) {
                throw new NullPointerException("connection is null");
            }
    
            synchronized (this) {
                if (stable > 0) {
                    conn.numStableIncs += stable;
                }
                stable = conn.stableCount + stable;
                if (stable < 0) {
                    throw new IllegalStateException("stableCount < 0: " + stable);
                }
    
                if (unstable > 0) {
                    conn.numUnstableIncs += unstable;
                }
                unstable = conn.unstableCount + unstable;
                if (unstable < 0) {
                    throw new IllegalStateException("unstableCount < 0: " + unstable);
                }
    
                if ((stable+unstable) <= 0) {
                    throw new IllegalStateException("ref counts can't go to zero here: stable="
                            + stable + " unstable=" + unstable);
                }
                conn.stableCount = stable;
                conn.unstableCount = unstable;
                return !conn.dead;
            }
        }
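
    The count arithmetic in refContentProvider above can be sketched in isolation. This is a minimal illustration, not the framework code: ConnSketch and ref are hypothetical names, and the numStableIncs/numUnstableIncs bookkeeping and the dead flag are left out.

```java
// Hypothetical stand-in for the connection record kept in system_process.
final class ConnSketch {
    int stableCount;
    int unstableCount;

    ConnSketch(int stableCount, int unstableCount) {
        this.stableCount = stableCount;
        this.unstableCount = unstableCount;
    }

    // Applies the two deltas the way refContentProvider does: neither count
    // may go negative, and the two counts may not both end up at zero.
    void ref(int stableDelta, int unstableDelta) {
        int stable = stableCount + stableDelta;
        if (stable < 0) {
            throw new IllegalStateException("stableCount < 0: " + stable);
        }
        int unstable = unstableCount + unstableDelta;
        if (unstable < 0) {
            throw new IllegalStateException("unstableCount < 0: " + unstable);
        }
        if (stable + unstable <= 0) {
            throw new IllegalStateException("ref counts can't go to zero here");
        }
        stableCount = stable;
        unstableCount = unstable;
    }
}
```

    With a connection that holds one stable reference, the call made by releaseProvider when lastRef is true, ref(-1, 1), moves the counts from (1, 0) to (0, 1): the stable reference is traded for a single unstable one.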
    

    The core operation here is decrementing stableCount in the ContentProviderConnection. Since releaseProvider decrements stableCount, acquireProvider most likely increments it. As analyzed earlier, acquireProvider first looks for the required provider in its cache; on a miss it calls ActivityManagerService's getContentProvider method to obtain a ContentProviderHolder, and then runs installProvider. installProvider was already covered in the analysis of provider publishing, where it instantiated the ContentProvider by reflection and used it as localProvider. When it is called from acquireProvider it takes a different path, because the provider has already been published, so:

    if (holder == null || holder.provider == null) {
    ...
    }
    

    The condition above is false, so the else branch runs. Because the if branch is skipped, localProvider is never initialized, so:

    if (localProvider != null) {
        ComponentName cname = new ComponentName(info.packageName, info.name);
        ProviderClientRecord pr = mLocalProvidersByName.get(cname);
        if (pr != null) {
            if (DEBUG_PROVIDER) {
                Slog.v(TAG, "installProvider: lost the race, "
                        + "using existing local provider");
            }
            provider = pr.mProvider;
        } else {
            holder = new IActivityManager.ContentProviderHolder(info);
            holder.provider = provider;
            holder.noReleaseNeeded = true;
            pr = installProviderAuthoritiesLocked(provider, localProvider, holder);
            mLocalProviders.put(jBinder, pr);
            mLocalProvidersByName.put(cname, pr);
        }
        retHolder = pr.mHolder;
    } else {
        ProviderRefCount prc = mProviderRefCountMap.get(jBinder);
        if (prc != null) {
            if (DEBUG_PROVIDER) {
                Slog.v(TAG, "installProvider: lost the race, updating ref count");
            }
            // We need to transfer our new reference to the existing
            // ref count, releasing the old one...  but only if
            // release is needed (that is, it is not running in the
            // system process).
            if (!noReleaseNeeded) {
                incProviderRefLocked(prc, stable);
                try {
                    ActivityManagerNative.getDefault().removeContentProvider(
                            holder.connection, stable);
                } catch (RemoteException e) {
                    //do nothing content provider object is dead any way
                }
            }
        } else {
            ProviderClientRecord client = installProviderAuthoritiesLocked(
                    provider, localProvider, holder);
            if (noReleaseNeeded) {
                prc = new ProviderRefCount(holder, client, 1000, 1000);
            } else {
                prc = stable
                        ? new ProviderRefCount(holder, client, 1, 0)
                        : new ProviderRefCount(holder, client, 0, 1);
            }
            mProviderRefCountMap.put(jBinder, prc);
        }
        retHolder = prc.holder;
    }
    

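    This also takes the else branch. mProviderRefCountMap.get(jBinder) returns null too, because we are analyzing the first call to ContentResolver's update and nothing has been cached yet. The following code therefore runs to populate the cache: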

    ProviderClientRecord client = installProviderAuthoritiesLocked(
            provider, localProvider, holder);
    if (noReleaseNeeded) {
        prc = new ProviderRefCount(holder, client, 1000, 1000);
    } else {
        prc = stable
                ? new ProviderRefCount(holder, client, 1, 0)
                : new ProviderRefCount(holder, client, 0, 1);
    }
    mProviderRefCountMap.put(jBinder, prc);
    
    ProviderRefCount(IActivityManager.ContentProviderHolder inHolder,
            ProviderClientRecord inClient, int sCount, int uCount) {
        holder = inHolder;
        client = inClient;
        stableCount = sCount;
        unstableCount = uCount;
    }
    

    So when noReleaseNeeded is true, the ProviderRefCount is created with a stableCount of 1000; only when noReleaseNeeded is false is stableCount 1. When is noReleaseNeeded true? Look at how the ContentProviderHolder is created. The code is in ActivityManagerService's getContentProviderImpl method: the ContentProviderHolder comes from ContentProviderRecord's newHolder method, which copies the ContentProviderRecord's noReleaseNeeded into the ContentProviderHolder object:

    public ContentProviderHolder newHolder(ContentProviderConnection conn) {
            ContentProviderHolder holder = new ContentProviderHolder(info);
            holder.provider = provider;
            holder.noReleaseNeeded = noReleaseNeeded;
            holder.connection = conn;
            return holder;
        }
    

    Now look at how the ContentProviderRecord is created:

    public ContentProviderRecord(ActivityManagerService _service, ProviderInfo _info,
                ApplicationInfo ai, ComponentName _name, boolean _singleton) {
            service = _service;
            info = _info;
            uid = ai.uid;
            appInfo = ai;
            name = _name;
            singleton = _singleton;
            noReleaseNeeded = uid == 0 || uid == Process.SYSTEM_UID;
        }
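
    Taken together, the constructor above and the ProviderRefCount creation shown earlier decide the client-side starting counts. A minimal sketch of that decision follows; RefCountInit and its methods are hypothetical names, and 1000 is the value of android.os.Process.SYSTEM_UID.

```java
// Hypothetical helper that restates two decisions from the framework code.
final class RefCountInit {
    static final int ROOT_UID = 0;      // the `uid == 0` case
    static final int SYSTEM_UID = 1000; // value of android.os.Process.SYSTEM_UID

    // Same condition as in the ContentProviderRecord constructor.
    static boolean noReleaseNeeded(int uid) {
        return uid == ROOT_UID || uid == SYSTEM_UID;
    }

    // Returns {stableCount, unstableCount} as passed to ProviderRefCount
    // the first time a provider is installed on the client side.
    static int[] initialCounts(boolean noReleaseNeeded, boolean stable) {
        if (noReleaseNeeded) {
            return new int[] {1000, 1000};
        }
        return stable ? new int[] {1, 0} : new int[] {0, 1};
    }
}
```

    For example, uid 1000 gives noReleaseNeeded == true and starting counts of {1000, 1000}, while an ordinary app uid such as 10057 (an arbitrary example value) gives false and {1, 0} for a stable acquire.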
    

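    This is where noReleaseNeeded is initialized. uid == 0 is the root uid and uid == Process.SYSTEM_UID is the system uid; the uid itself is read from the ApplicationInfo passed to the constructor. Normally only system apps satisfy this condition. For a third-party app calling a ContentProvider's update method, noReleaseNeeded is therefore usually false and stableCount == 1; for a system app, noReleaseNeeded is true and stableCount == 1000. The scenario in which I hit this problem was exactly a system app whose process has root privileges, so stableCount == 1000 as expected. Stepping through with a debugger confirmed it: ProviderRefCount was created with stableCount 1000, and after releaseProvider ran the value was 999.

    So is stableCount == 999 the culprit that gets the ContentProvider's client process killed? That conclusion would be too hasty, and it is wrong as stated. ProviderRefCount is initialized in ActivityThread, which lives in the app process, so 999 is a value held in the app process. The killing, however, happens in ActivityManagerService, a system service that runs in the system_process process. What is stableCount on the system_process side? Go back to ActivityManagerService's getContentProviderImpl method and follow it into incProviderCountLocked: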

    ContentProviderConnection incProviderCountLocked(ProcessRecord r,
                final ContentProviderRecord cpr, IBinder externalProcessToken, boolean stable) {
            if (r != null) {
                for (int i=0; i<r.conProviders.size(); i++) {
                    ContentProviderConnection conn = r.conProviders.get(i);
                    if (conn.provider == cpr) {
                        if (DEBUG_PROVIDER) Slog.v(TAG_PROVIDER,
                                "Adding provider requested by "
                                + r.processName + " from process "
                                + cpr.info.processName + ": " + cpr.name.flattenToShortString()
                                + " scnt=" + conn.stableCount + " uscnt=" + conn.unstableCount);
                        if (stable) {
                            conn.stableCount++;
                            conn.numStableIncs++;
                        } else {
                            conn.unstableCount++;
                            conn.numUnstableIncs++;
                        }
                        return conn;
                    }
                }
                ContentProviderConnection conn = new ContentProviderConnection(cpr, r);
                if (stable) {
                    conn.stableCount = 1;
                    conn.numStableIncs = 1;
                } else {
                    conn.unstableCount = 1;
                    conn.numUnstableIncs = 1;
                }
                cpr.connections.add(conn);
                r.conProviders.add(conn);
                startAssociationLocked(r.uid, r.processName, r.curProcState,
                        cpr.uid, cpr.name, cpr.info.processName);
                return conn;
            }
            cpr.addExternalProcessHandleLocked(externalProcessToken);
            return null;
        }
    

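    That settles it: when stable is true, a newly created ContentProviderConnection has stableCount = 1, so on the system_process side stableCount is still 1. That looks even more puzzling: if stableCount in system_process starts at 1, releasing should bring it to 0, and there would be no problem. Read on. We just saw that the ProviderRefCount in the app process starts with stableCount 1000 when the app is a system process or has root privileges. With that in mind, look at ActivityThread's releaseProvider implementation again: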

    if (stable) {
        if (prc.stableCount == 0) {
            if (DEBUG_PROVIDER) Slog.v(TAG,
                    "releaseProvider: stable ref count already 0, how?");
            return false;
        }
        prc.stableCount -= 1;
        if (prc.stableCount == 0) {
            // What we do at this point depends on whether there are
            // any unstable refs left: if there are, we just tell the
            // activity manager to decrement its stable count; if there
            // aren't, we need to enqueue this provider to be removed,
            // and convert to holding a single unstable ref while
            // doing so.
            lastRef = prc.unstableCount == 0;
            try {
                if (DEBUG_PROVIDER) {
                    Slog.v(TAG, "releaseProvider: No longer stable w/lastRef="
                            + lastRef + " - " + prc.holder.info.name);
                }
                ActivityManagerNative.getDefault().refContentProvider(
                        prc.holder.connection, -1, lastRef ? 1 : 0);
            } catch (RemoteException e) {
                //do nothing content provider object is dead any way
            }
        }
    }
    

    When stable is true, prc.stableCount is decremented, and only when prc.stableCount == 0 is ActivityManagerService's refContentProvider method called. refContentProvider is what decrements stableCount in the ContentProviderConnection held by ActivityManagerService. So when the app-side stableCount starts at 1000, releaseProvider brings it to 999, refContentProvider is never called, and stableCount in ActivityManagerService stays at 1. After ContentResolver's update has finished, the stableCount in ActivityManagerService is still non-zero, which means the client is still treated as holding a stable reference. That is why killing the ContentProvider's server process also kills the client process.

    How can the client process be kept alive when the server process is killed? Make sure noReleaseNeeded is false, that is, uid != 0 && uid != Process.SYSTEM_UID. For an ordinary third-party app process noReleaseNeeded is false, and killing the server process does not take the client process down with it. I wrote a demo to test this and it behaved as expected: the server and client processes did not affect each other. For processes with system privileges, such as system processes, the problem does occur.
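
    The mismatch between the two counters can be reproduced with a small simulation. This is a sketch under stated assumptions, not framework code: ServerConn, ClientRef and clientKilledOnProviderDeath are hypothetical names, the IPC call is replaced by a direct field update, and a non-zero stable count in system_process when the provider dies is taken, as this analysis concludes, to be what gets the client killed.

```java
final class RefCountDemo {
    // Stand-in for the ContentProviderConnection kept in system_process.
    static final class ServerConn {
        int stableCount;
        int unstableCount;
    }

    // Stand-in for the ProviderRefCount kept in the app process.
    static final class ClientRef {
        int stableCount;
        int unstableCount;
        final ServerConn conn;

        ClientRef(ServerConn conn, boolean noReleaseNeeded) {
            this.conn = conn;
            // Same starting values as installProvider uses for a stable acquire.
            this.stableCount = noReleaseNeeded ? 1000 : 1;
            this.unstableCount = noReleaseNeeded ? 1000 : 0;
        }

        // Mirrors the stable branch of releaseProvider: system_process is
        // only told about the release when the local count reaches zero.
        void releaseStable() {
            if (stableCount == 0) {
                return;
            }
            stableCount -= 1;
            if (stableCount == 0) {
                boolean lastRef = unstableCount == 0;
                // Direct update standing in for refContentProvider(conn, -1, lastRef ? 1 : 0).
                conn.stableCount -= 1;
                conn.unstableCount += lastRef ? 1 : 0;
            }
        }
    }

    // Runs one acquire/release cycle (as in update) and reports whether
    // system_process still sees a stable reference afterwards.
    static boolean clientKilledOnProviderDeath(boolean noReleaseNeeded) {
        ServerConn conn = new ServerConn();
        conn.stableCount = 1; // what incProviderCountLocked sets for stable == true
        ClientRef ref = new ClientRef(conn, noReleaseNeeded);
        ref.releaseStable();
        return conn.stableCount > 0;
    }
}
```

    With noReleaseNeeded == true the local count only drops from 1000 to 999, the count in system_process stays at 1, and the method returns true. With noReleaseNeeded == false the local count reaches 0, the release is forwarded, and the method returns false.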

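    Notes

    • This analysis covers only the scenario I ran into: an app running as a system process with root privileges (uid == 1000). From the source code I could not find anything that would cause the problem for third-party apps, but that does not mean third-party apps using a ContentProvider are immune. Other scenarios may trigger the same problem when a third-party app uses a custom ContentProvider.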
