Understanding SystemServer

Author: Jenchar | Published: 2018-03-19 10:01

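    SystemServer's creation can be divided into two parts: first, the Zygote process forks and initializes the SystemServer process; second, the main method of the SystemServer class runs and starts the system services.

    1. The SystemServer creation process

    1.1 Creating the SystemServer process
    init.rc contains the statement import /init.${ro.zygote}.rc, which uses the ro.zygote property to decide which file to import and therefore which variant of the Zygote process to start. ro.zygote can be zygote32, zygote32_64, zygote64, or zygote64_32, so the directory containing init.rc holds four Zygote-related files: init.zygote32.rc, init.zygote64.rc, init.zygote32_64.rc, and init.zygote64_32.rc. For example: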
    init.zygote64.rc

    service zygote /system/bin/app_process64 -Xzygote /system/bin --zygote --start-system-server
        class main
        socket zygote stream 660 root system
        onrestart write /sys/android_power/request_state wake
        onrestart write /sys/power/state on
        onrestart restart media
        onrestart restart netd
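
    The property substitution in the import statement can be sketched in a few lines of Java. This is only an illustration: init performs the expansion itself in native code, and the class and method names here (ZygoteRcResolver, rcFileFor) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class ZygoteRcResolver {
    // The four ro.zygote values named in the text.
    static final List<String> KNOWN_VARIANTS =
            Arrays.asList("zygote32", "zygote32_64", "zygote64", "zygote64_32");

    // Expands "init.${ro.zygote}.rc" for a given property value.
    static String rcFileFor(String roZygote) {
        if (!KNOWN_VARIANTS.contains(roZygote)) {
            throw new IllegalArgumentException("unknown ro.zygote value: " + roZygote);
        }
        return "init." + roZygote + ".rc";
    }

    public static void main(String[] args) {
        for (String variant : KNOWN_VARIANTS) {
            System.out.println(variant + " -> /" + rcFileFor(variant));
        }
    }
}
```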
    

    As the definition in init.zygote64.rc shows, Zygote's startup arguments include "--start-system-server", so the main method of the ZygoteInit class calls startSystemServer to start SystemServer. The startSystemServer method looks like this:
    ZygoteInit.java

     private static boolean startSystemServer(String abiList, String socketName)
                throws MethodAndArgsCaller, RuntimeException {
           ......
            /* Hardcoded command line to start the system server */
            String args[] = {
                "--setuid=1000",
                "--setgid=1000",
                "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1032,3001,3002,3003,3006,3007",
                "--capabilities=" + capabilities + "," + capabilities,
                "--runtime-init",
                "--nice-name=system_server",
                "com.android.server.SystemServer",
            };
            ZygoteConnection.Arguments parsedArgs = null;
    
            int pid;
    
            try {
                 ......
                /* Request to fork the system server process */
                pid = Zygote.forkSystemServer(
                        parsedArgs.uid, parsedArgs.gid,
                        parsedArgs.gids,
                        parsedArgs.debugFlags,
                        null,
                        parsedArgs.permittedCapabilities,
                        parsedArgs.effectiveCapabilities);
            } catch (IllegalArgumentException ex) {
                throw new RuntimeException(ex);
            }
    
            /* For child process */
            if (pid == 0) {
                if (hasSecondZygote(abiList)) {
                    waitForSecondaryZygote(socketName);
                }
    
                handleSystemServerProcess(parsedArgs);
            }
    
            return true;
        }
    

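    As the code shows, this method mainly does three things:

    • Prepares the startup arguments for SystemServer: its user ID (uid) and group ID (gid) are both 1000, and the class to execute is com.android.server.SystemServer.
    • Calls Zygote.forkSystemServer() to fork the SystemServer child process; the actual work is done by a native function: forkSystemServer -> nativeForkSystemServer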
      Zygote.java
    public static int forkSystemServer(int uid, int gid, int[] gids, int debugFlags,
                int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
            VM_HOOKS.preFork();
            int pid = nativeForkSystemServer(
                    uid, gid, gids, debugFlags, rlimits, permittedCapabilities, effectiveCapabilities);
            VM_HOOKS.postForkCommon();
            return pid;
        }
    
        native private static int nativeForkSystemServer(int uid, int gid, int[] gids, int debugFlags,
                int[][] rlimits, long permittedCapabilities, long effectiveCapabilities);
    

    framework/base/core/jni/com_android_internal_os_Zygote.cpp

    static jint com_android_internal_os_Zygote_nativeForkSystemServer(
            JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
            jint debug_flags, jobjectArray rlimits, jlong permittedCapabilities,
            jlong effectiveCapabilities) {
      pid_t pid = ForkAndSpecializeCommon(env, uid, gid, gids,
                                          debug_flags, rlimits,
                                          permittedCapabilities, effectiveCapabilities,
                                          MOUNT_EXTERNAL_NONE, NULL, NULL, true, NULL,
                                          NULL, NULL);
      if (pid > 0) {
          // The zygote process checks whether the child process has died or not.
          ALOGI("System server process %d has been created", pid);
          gSystemServerPid = pid;
          // There is a slight window that the system server process has crashed
          // but it went unnoticed because we haven't published its pid yet. So
          // we recheck here just to make sure that all is well.
          int status;
          if (waitpid(pid, &status, WNOHANG) == pid) {
              ALOGE("System server process %d has died. Restarting Zygote!", pid);
              RuntimeAbort(env); // SystemServer died right after the fork; abort Zygote so that init restarts it
          }
      }
      return pid;
    }
    
    

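    After the native layer calls ForkAndSpecializeCommon, if the process being started is SystemServer, Zygote checks whether it started successfully; if it did not, Zygote aborts so that it can be restarted. Inside ForkAndSpecializeCommon, SetSigChldHandler() installs SigChldHandler() as the handler for the SIGCHLD signal: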

    static void SigChldHandler(int /*signal_number*/) {
      pid_t pid;
      int status;
    
      while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
         // Log process-death status that we care about.  In general it is
         // not safe to call LOG(...) from a signal handler because of
         // possible reentrancy.  However, we know a priori that the
         // current implementation of LOG() is safe to call from a SIGCHLD
         // handler in the zygote process.  If the LOG() implementation
         // changes its locking strategy or its use of syscalls within the
         // lazy-init critical section, its use here may become unsafe.
        if (WIFEXITED(status)) {
          if (WEXITSTATUS(status)) {
            ALOGI("Process %d exited cleanly (%d)", pid, WEXITSTATUS(status));
          }
        } else if (WIFSIGNALED(status)) {
          if (WTERMSIG(status) != SIGKILL) {
            ALOGI("Process %d exited due to signal (%d)", pid, WTERMSIG(status));
          }
          if (WCOREDUMP(status)) {
            ALOGI("Process %d dumped core.", pid);
          }
        }
    
        // If the just-crashed process is the system_server, bring down zygote
        // so that it is restarted by init and system server will be restarted
        // from there.
        if (pid == gSystemServerPid) {
          // If the process that died is SystemServer, Zygote exits
          ALOGE("Exit zygote because system server (%d) has terminated", pid);
          kill(getpid(), SIGKILL);
        }
      }
    
      // Note that we shouldn't consider ECHILD an error because
      // the secondary zygote might have no children left to wait for.
      if (pid < 0 && errno != ECHILD) {
        ALOGW("Zygote SIGCHLD error in waitpid: %s", strerror(errno));
      }
    }
    

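    The handler calls waitpid() to keep child processes from becoming zombies and checks whether the exited child is the SystemServer process. If it is, Zygote kills itself, which causes init to restart Zygote together with the services tied to it (see the onrestart lines in the rc file).

    • After SystemServer has been forked, the child process calls handleSystemServerProcess(parsedArgs):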
    private static void handleSystemServerProcess(
                ZygoteConnection.Arguments parsedArgs)
                throws ZygoteInit.MethodAndArgsCaller {
    
            closeServerSocket();
    
            // set umask to 0077 so new files and directories will default to owner-only permissions.
            Os.umask(S_IRWXG | S_IRWXO);
    
            if (parsedArgs.niceName != null) {
                Process.setArgV0(parsedArgs.niceName);
            }
    
            final String systemServerClasspath = Os.getenv("SYSTEMSERVERCLASSPATH");
            if (systemServerClasspath != null) {
                performSystemServerDexOpt(systemServerClasspath);
            }
    
            if (parsedArgs.invokeWith != null) {
                String[] args = parsedArgs.remainingArgs;
                // If we have a non-null system server class path, we'll have to duplicate the
                // existing arguments and append the classpath to it. ART will handle the classpath
                // correctly when we exec a new process.
                if (systemServerClasspath != null) {
                    String[] amendedArgs = new String[args.length + 2];
                    amendedArgs[0] = "-cp";
                    amendedArgs[1] = systemServerClasspath;
                    System.arraycopy(parsedArgs.remainingArgs, 0, amendedArgs, 2, parsedArgs.remainingArgs.length);
                }
    
                WrapperInit.execApplication(parsedArgs.invokeWith,
                        parsedArgs.niceName, parsedArgs.targetSdkVersion,
                        null, args);
            } else {
                ClassLoader cl = null;
                if (systemServerClasspath != null) {
                    cl = new PathClassLoader(systemServerClasspath, ClassLoader.getSystemClassLoader());
                    Thread.currentThread().setContextClassLoader(cl);
                }
    
                /*
                 * Pass the remaining arguments to SystemServer.
                 */
                RuntimeInit.zygoteInit(parsedArgs.targetSdkVersion, parsedArgs.remainingArgs, cl);
            }
    
            /* should never reach here */
        }
    

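    The method first closes the socket inherited from Zygote, then sets the SystemServer process's umask to 0077 with Os.umask(S_IRWXG | S_IRWXO) (R = read, W = write, X = execute, G = group, O = others). The umask lists the permission bits to clear, that is, the complement of what chmod would grant, so anything SystemServer creates has at most mode 0700 and is accessible only to its owner, the SystemServer process. Next, Process.setArgV0(parsedArgs.niceName) changes the process name.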
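
    The umask arithmetic can be checked with a short Java sketch. The constants carry the standard POSIX octal values; the class and method names (UmaskDemo, effectiveMode) are made up for illustration and are not part of the Android source.

```java
final class UmaskDemo {
    // Standard POSIX values for the group and other rwx permission bits.
    static final int S_IRWXG = 0070;
    static final int S_IRWXO = 0007;

    // The kernel clears every bit that is set in the umask.
    static int effectiveMode(int requestedMode, int umask) {
        return requestedMode & ~umask;
    }

    public static void main(String[] args) {
        int umask = S_IRWXG | S_IRWXO; // 0077
        // A directory requested with 0777 ends up as 0700.
        System.out.println(Integer.toOctalString(effectiveMode(0777, umask))); // prints 700
        // A regular file requested with 0666 ends up as 0600.
        System.out.println(Integer.toOctalString(effectiveMode(0666, umask))); // prints 600
    }
}
```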
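    The invokeWith argument is normally null, so the else branch is taken: a ClassLoader is set up for loading the SystemServer class and RuntimeInit.zygoteInit() is called. That call ends by throwing a MethodAndArgsCaller exception, which clears the call frames built up so far, and only then is SystemServer.main actually invoked.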
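
    The "throw an exception to unwind, then call main" trick can be sketched in plain Java as follows. This is a simplified model under stated assumptions, not the framework code: the MethodAndArgsCaller class here is a hypothetical stand-in for ZygoteInit.MethodAndArgsCaller, and FakeServer plays the role of SystemServer.

```java
import java.lang.reflect.Method;

final class TrampolineDemo {
    /** Hypothetical stand-in for ZygoteInit.MethodAndArgsCaller. */
    static final class MethodAndArgsCaller extends Exception implements Runnable {
        private final Method method;
        private final String[] args;

        MethodAndArgsCaller(Method method, String[] args) {
            this.method = method;
            this.args = args;
        }

        @Override
        public void run() {
            try {
                // Static method, so the receiver is null; the cast passes args as one parameter.
                method.invoke(null, (Object) args);
            } catch (ReflectiveOperationException e) {
                throw new RuntimeException(e);
            }
        }
    }

    /** Hypothetical target class playing the role of SystemServer. */
    static final class FakeServer {
        static boolean started;

        public static void main(String[] args) {
            started = true;
        }
    }

    // Deep in the init path: locate main() and throw instead of calling it directly.
    static void initAndThrow() throws MethodAndArgsCaller {
        Method main;
        try {
            main = FakeServer.class.getMethod("main", String[].class);
        } catch (NoSuchMethodException e) {
            throw new RuntimeException(e);
        }
        throw new MethodAndArgsCaller(main, new String[0]);
    }

    // Near the bottom of the stack: catch the exception and run the target from here.
    static void bootstrap() {
        try {
            initAndThrow();
        } catch (MethodAndArgsCaller caller) {
            caller.run();
        }
    }

    public static void main(String[] args) {
        bootstrap();
        System.out.println("FakeServer started: " + FakeServer.started); // prints "FakeServer started: true"
    }
}
```

    Because the target main runs from the catch block, none of the frames that were active when the exception was thrown remain beneath it.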

    2. SystemServer initialization

    public static void main(String[] args) {
            new SystemServer().run();
        }
    
    private void run() {
            // If a device's clock is before 1970 (before 0), a lot of
            // APIs crash dealing with negative numbers, notably
            // java.io.File#setLastModified, so instead we fake it and
            // hope that time from cell towers or NTP fixes it shortly.
            if (System.currentTimeMillis() < EARLIEST_SUPPORTED_TIME) {
                Slog.w(TAG, "System clock is before 1970; setting to 1970.");
                SystemClock.setCurrentTimeMillis(EARLIEST_SUPPORTED_TIME);
            }
    
            // Here we go!
            Slog.i(TAG, "Entered the Android system server!");
            EventLog.writeEvent(EventLogTags.BOOT_PROGRESS_SYSTEM_RUN, SystemClock.uptimeMillis());
    
            // In case the runtime switched since last boot (such as when
            // the old runtime was removed in an OTA), set the system
            // property so that it is in sync. We can't do this in
            // libnativehelper's JniInvocation::Init code where we already
            // had to fallback to a different runtime because it is
            // running as root and we need to be the system user to set
            // the property. http://b/11463182
            SystemProperties.set("persist.sys.dalvik.vm.lib.2", VMRuntime.getRuntime().vmLibrary());
    
            // Enable the sampling profiler.
            if (SamplingProfilerIntegration.isEnabled()) {
                SamplingProfilerIntegration.start();
                mProfilerSnapshotTimer = new Timer();
                mProfilerSnapshotTimer.schedule(new TimerTask() {
                    @Override
                    public void run() {
                        SamplingProfilerIntegration.writeSnapshot("system_server", null);
                    }
                }, SNAPSHOT_INTERVAL, SNAPSHOT_INTERVAL);
            }
    
            // Mmmmmm... more memory!
            VMRuntime.getRuntime().clearGrowthLimit();
    
            // The system server has to run all of the time, so it needs to be
            // as efficient as possible with its memory usage.
            VMRuntime.getRuntime().setTargetHeapUtilization(0.8f);
    
            // Some devices rely on runtime fingerprint generation, so make sure
            // we've defined it before booting further.
            Build.ensureFingerprintProperty();
    
            // Within the system server, it is an error to access Environment paths without
            // explicitly specifying a user.
            Environment.setUserRequired(true);
    
            // Ensure binder calls into the system always run at foreground priority.
            BinderInternal.disableBackgroundScheduling(true);
    
            // Prepare the main looper thread (this thread).
            android.os.Process.setThreadPriority(
                    android.os.Process.THREAD_PRIORITY_FOREGROUND);
            android.os.Process.setCanSelfBackground(false);
            Looper.prepareMainLooper();
    
            // Initialize native services.
            System.loadLibrary("android_servers");
            nativeInit();
    
            // Check whether we failed to shut down last time we tried.
            // This call may not return.
            performPendingShutdown();
    
            // Initialize the system context.
            createSystemContext();
    
            // Create the system service manager.
            mSystemServiceManager = new SystemServiceManager(mSystemContext);
            LocalServices.addService(SystemServiceManager.class, mSystemServiceManager);
    
            // Start services.
            try {
                startBootstrapServices();
                startCoreServices();
                startOtherServices();
            } catch (Throwable ex) {
                Slog.e("System", "******************************************");
                Slog.e("System", "************ Failure starting system services", ex);
                throw ex;
            }
    
            // For debug builds, log event loop stalls to dropbox for analysis.
            if (StrictMode.conditionallyEnableDebugLogging()) {
                Slog.i(TAG, "Enabled StrictMode for system server main thread.");
            }
    
            // Loop forever.
            Looper.loop();
            throw new RuntimeException("Main thread loop unexpectedly exited");
        }
    

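    To summarize what the code above does:

    • 1. Adjusts the clock: if the system time is earlier than 1970, it is set to 1970.
    • 2. SystemProperties.set("persist.sys.dalvik.vm.lib.2", VMRuntime.getRuntime().vmLibrary()) sets this property to the runtime library of the virtual machine.
    • 3. VMRuntime.getRuntime().setTargetHeapUtilization(0.8f) sets the VM's target heap utilization to 0.8. When actual utilization drifts from that target, the VM resizes the heap during garbage collection to bring utilization back toward it.
    • 4. System.loadLibrary("android_servers") loads the library libandroid_servers.so.
    • 5. nativeInit() starts the native SensorService, which provides the sensor services.
    • 6. Calls createSystemContext to obtain a Context.
    • 7. Creates the SystemServiceManager object, which is responsible for starting system services.
    • 8. startBootstrapServices(); startCoreServices(); startOtherServices(); create and run all the Java services.
    • 9. Calls Looper.loop() to enter the message loop.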
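
    The effect of the target heap utilization in item 3 can be illustrated with a little arithmetic. This is a deliberately simplified model that ignores the minimum and maximum free-space limits a real runtime also applies; the class and method names are hypothetical.

```java
final class HeapUtilizationDemo {
    // Heap size at which liveBytes would occupy exactly targetUtilization of the heap.
    static long targetHeapSize(long liveBytes, double targetUtilization) {
        if (targetUtilization <= 0.0 || targetUtilization > 1.0) {
            throw new IllegalArgumentException("utilization must be in (0, 1]");
        }
        return Math.round(liveBytes / targetUtilization);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long live = 80 * mib; // bytes still reachable after a collection (example value)
        long target = targetHeapSize(live, 0.8);
        System.out.println("target heap: " + (target / mib) + " MiB"); // prints "target heap: 100 MiB"
    }
}
```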
    Now let's look at createSystemContext():
     private void createSystemContext() {
            ActivityThread activityThread = ActivityThread.systemMain();
            mSystemContext = activityThread.getSystemContext();
            mSystemContext.setTheme(android.R.style.Theme_DeviceDefault_Light_DarkActionBar);
        }
    
    public static ActivityThread systemMain() {
            // The system process on low-memory devices do not get to use hardware
            // accelerated drawing, since this can add too much overhead to the
            // process.
            if (!ActivityManager.isHighEndGfx()) {
                HardwareRenderer.disable(true);
            } else {
                HardwareRenderer.enableForegroundTrimming();
            }
            ActivityThread thread = new ActivityThread();
            thread.attach(true);
            return thread;
        }
    

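    SystemServer itself also needs a context similar to that of an APK application, so it creates an ActivityThread and then calls its attach method; the true argument marks it as the system process: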

     private void attach(boolean system) {
            sCurrentActivityThread = this;
            mSystemThread = system;
            if (!system) {
                ......
            } else {
                // Don't set application object here -- if the system crashes,
                // we can't display an alert, we just want to die die die.
                android.ddm.DdmHandleAppName.setAppName("system_process",
                        UserHandle.myUserId());
                try {
                    mInstrumentation = new Instrumentation();
                  // Create the context object
                    ContextImpl context = ContextImpl.createAppContext(
                            this, getSystemContext().mPackageInfo);
                    mInitialApplication = context.mPackageInfo.makeApplication(true, null);
                    mInitialApplication.onCreate();
                } catch (Exception e) {
                    throw new RuntimeException(
                            "Unable to instantiate Application():" + e.toString(), e);
                }
             .....
            }
        }
    

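    When attach is called with true, it creates an application-like environment: it creates a ContextImpl and an Application, and finally calls the Application's onCreate method, fully simulating the creation of an app. However, every context needs a corresponding APK file. Look at getSystemContext: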

     public ContextImpl getSystemContext() {
            synchronized (this) {
                if (mSystemContext == null) {
                    mSystemContext = ContextImpl.createSystemContext(this);
                }
                return mSystemContext;
            }
        }
    

    ->createSystemContext

    static ContextImpl createSystemContext(ActivityThread mainThread) {
            LoadedApk packageInfo = new LoadedApk(mainThread);
            ContextImpl context = new ContextImpl(null, mainThread,
                    packageInfo, null, null, false, null, null);
            context.mResources.updateConfiguration(context.mResourcesManager.getConfiguration(),
                    context.mResourcesManager.getDisplayMetricsLocked(Display.DEFAULT_DISPLAY));
            return context;
        }
    

    ->LoadedApk()

    LoadedApk(ActivityThread activityThread) {
            mActivityThread = activityThread;
            mApplicationInfo = new ApplicationInfo();
            mApplicationInfo.packageName = "android";
            mPackageName = "android";
            ......
        }
    

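    A LoadedApk object holds the information of one APK file. This constructor sets the package name to "android", and "android" is exactly the package name of framework-res.apk. In other words, the APK that the mSystemContext object returned by getSystemContext corresponds to is actually framework-res.apk.
    With that, attach is easy to understand: the new ContextImpl that attach creates is effectively a copy of the mSystemContext object, and the Application object created next actually represents framework-res.apk.
    So ActivityThread.systemMain in effect creates the context environment of framework-res.apk.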

    3. SystemServer's Watchdog

    In SystemServer's startOtherServices() method:

    private void startOtherServices() {
    ...
                Slog.i(TAG, "Init Watchdog");
                final Watchdog watchdog = Watchdog.getInstance();
                watchdog.init(context, mActivityManagerService);
    ...
    }
    

    Watchdog follows the singleton pattern; the instance is obtained through getInstance(). Its constructor:

    private Watchdog() {
            super("watchdog");
            // Initialize handler checkers for each common thread we want to check.  Note
            // that we are not currently checking the background thread, since it can
            // potentially hold longer running operations with no guarantees about the timeliness
            // of operations there.
    
            // The shared foreground thread is the main checker.  It is where we
            // will also dispatch monitor checks and do other work.
            mMonitorChecker = new HandlerChecker(FgThread.getHandler(),
                    "foreground thread", DEFAULT_TIMEOUT);
            mHandlerCheckers.add(mMonitorChecker);
            // Add checker for main thread.  We only do a quick check since there
            // can be UI running on the thread.
            mHandlerCheckers.add(new HandlerChecker(new Handler(Looper.getMainLooper()),
                    "main thread", DEFAULT_TIMEOUT));
            // Add checker for shared UI thread.
            mHandlerCheckers.add(new HandlerChecker(UiThread.getHandler(),
                    "ui thread", DEFAULT_TIMEOUT));
            // And also check IO thread.
            mHandlerCheckers.add(new HandlerChecker(IoThread.getHandler(),
                    "i/o thread", DEFAULT_TIMEOUT));
            // And the display thread.
            mHandlerCheckers.add(new HandlerChecker(DisplayThread.getHandler(),
                    "display thread", DEFAULT_TIMEOUT));
        }
    

    The constructor mainly creates several HandlerChecker objects:

    • main thread
    • FgThread
    • UiThread
    • IoThread
    • DisplayThread

    and stores them in the mHandlerCheckers list; each HandlerChecker corresponds to one monitored thread.
    ->watchdog.init();

    public void init(Context context, ActivityManagerService activity) {
            mResolver = context.getContentResolver();
            mActivity = activity;
    
            context.registerReceiver(new RebootRequestReceiver(),
                    new IntentFilter(Intent.ACTION_REBOOT),
                    android.Manifest.permission.REBOOT, null);
        }
    

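    init registers a RebootRequestReceiver, which listens for the reboot intent ACTION_REBOOT.
    3.1
    Watchdog monitors threads and services inside the SystemServer process.
    Watchdog provides two methods, addThread and addMonitor, for adding threads and services to be monitored, respectively. addThread creates a HandlerChecker object associated with the monitored thread: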

    public void addThread(Handler thread) {
            addThread(thread, DEFAULT_TIMEOUT);
        }
    
        public void addThread(Handler thread, long timeoutMillis) {
            synchronized (this) {
                if (isAlive()) {
                    throw new RuntimeException("Threads can't be added once the Watchdog is running");
                }
                final String name = thread.getLooper().getThread().getName();
                mHandlerCheckers.add(new HandlerChecker(thread, name, timeoutMillis));
            }
        }
    

    Services are also monitored through a HandlerChecker object. A single HandlerChecker can check all registered services, and mMonitorChecker is the one that does this job:

    public void addMonitor(Monitor monitor) {
            synchronized (this) {
                if (isAlive()) {
                    throw new RuntimeException("Monitors can't be added once the Watchdog is running");
                }
                mMonitorChecker.addMonitor(monitor);
            }
        }
    

    This mMonitorChecker is the one created in the Watchdog constructor. The HandlerChecker class has a member private final ArrayList<Monitor> mMonitors = new ArrayList<Monitor>(), and its addMonitor method adds the service to this mMonitors list:

    public final class HandlerChecker implements Runnable {
            private final Handler mHandler;
            private final String mName;
            private final long mWaitMax;
            private final ArrayList<Monitor> mMonitors = new ArrayList<Monitor>();
            private boolean mCompleted;
            private Monitor mCurrentMonitor;
            private long mStartTime;
    
            HandlerChecker(Handler handler, String name, long waitMaxMillis) {
                mHandler = handler;
                mName = name;
                mWaitMax = waitMaxMillis;
                mCompleted = true;
            }
    
            public void addMonitor(Monitor monitor) {
                mMonitors.add(monitor);
            }
    

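    For example, services such as ActivityManagerService, PackageManagerService, and PowerManagerService, together with their threads, are added to the watch list. A service that wants to be monitored by Watchdog must implement Watchdog's Monitor interface: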

    public interface Monitor {
            void monitor();
        }
    

    and call addMonitor to add itself to Watchdog's list of monitored services.
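
    Registering a service for monitoring can be sketched in plain Java as below. This is a minimal model, not the framework code: MiniWatchdog and FakeService are hypothetical classes, and only the Monitor interface and the addMonitor name mirror the source above.

```java
import java.util.ArrayList;
import java.util.List;

final class MonitorDemo {
    /** Same shape as Watchdog.Monitor in the text. */
    interface Monitor {
        void monitor();
    }

    /** Hypothetical, heavily reduced watchdog that only keeps a monitor list. */
    static final class MiniWatchdog {
        private final List<Monitor> monitors = new ArrayList<>();

        synchronized void addMonitor(Monitor monitor) {
            monitors.add(monitor);
        }

        synchronized int monitorCount() {
            return monitors.size();
        }

        // Calls monitor() on each registered service; blocks if a service lock is stuck.
        void checkAll() {
            List<Monitor> snapshot;
            synchronized (this) {
                snapshot = new ArrayList<>(monitors);
            }
            for (Monitor m : snapshot) {
                m.monitor();
            }
        }
    }

    /** Hypothetical service guarded by a single lock. */
    static final class FakeService implements Monitor {
        private final Object lock = new Object();
        private int counter;

        void doWork() {
            synchronized (lock) {
                counter++;
            }
        }

        @Override
        public void monitor() {
            // Acquiring and releasing the lock proves nobody is holding it forever.
            synchronized (lock) { }
        }
    }

    public static void main(String[] args) {
        MiniWatchdog watchdog = new MiniWatchdog();
        FakeService service = new FakeService();
        watchdog.addMonitor(service);
        service.doWork();
        watchdog.checkAll();
        System.out.println("monitors registered: " + watchdog.monitorCount()); // prints "monitors registered: 1"
    }
}
```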
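    3.2 How monitoring works
    Watchdog decides whether a service is healthy by checking whether a lock inside it (for example mLock) has been held for too long.
    The Watchdog.run method: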

     public void run() {
            boolean waitedHalf = false;
            while (true) {
                final ArrayList<HandlerChecker> blockedCheckers;
                final String subject;
                final boolean allowRestart;
                int debuggerWasConnected = 0;
                synchronized (this) {
                    long timeout = CHECK_INTERVAL;
                    // Make sure we (re)spin the checkers that have become idle within
                    // this wait-and-check interval
                    for (int i=0; i<mHandlerCheckers.size(); i++) {
                        HandlerChecker hc = mHandlerCheckers.get(i);
                        hc.scheduleCheckLocked();
                    }
    
                    if (debuggerWasConnected > 0) {
                        debuggerWasConnected--;
                    }
    
                    // NOTE: We use uptimeMillis() here because we do not want to increment the time we
                    // wait while asleep. If the device is asleep then the thing that we are waiting
                    // to timeout on is asleep as well and won't have a chance to run, causing a false
                    // positive on when to kill things.
                    long start = SystemClock.uptimeMillis();
                    while (timeout > 0) {
                        if (Debug.isDebuggerConnected()) {
                            debuggerWasConnected = 2;
                        }
                        try {
                            wait(timeout);
                        } catch (InterruptedException e) {
                            Log.wtf(TAG, e);
                        }
                        if (Debug.isDebuggerConnected()) {
                            debuggerWasConnected = 2;
                        }
                        timeout = CHECK_INTERVAL - (SystemClock.uptimeMillis() - start);
                    }
    
                    final int waitState = evaluateCheckerCompletionLocked();
                    if (waitState == COMPLETED) {
                        // The monitors have returned; reset
                        waitedHalf = false;
                        continue;
                    } else if (waitState == WAITING) {
                        // still waiting but within their configured intervals; back off and recheck
                        continue;
                    } else if (waitState == WAITED_HALF) {
                        if (!waitedHalf) {
                            // We've waited half the deadlock-detection interval.  Pull a stack
                            // trace and wait another half.
                            ArrayList<Integer> pids = new ArrayList<Integer>();
                            pids.add(Process.myPid());
                            ActivityManagerService.dumpStackTraces(true, pids, null, null,
                                    NATIVE_STACKS_OF_INTEREST);
                            waitedHalf = true;
                        }
                        continue;
                    }
                     ......
                    // Kill SystemServer
                    Slog.w(TAG, "*** GOODBYE!");
                    Process.killProcess(Process.myPid());
                    System.exit(10);
                }
    
                waitedHalf = false;
            }
        }
    

    It does three things:
    1. Calls scheduleCheckLocked() on every HandlerChecker:

     public void scheduleCheckLocked() {
                if (mMonitors.size() == 0 && mHandler.getLooper().isIdling()) {
                    // If the target looper is or just recently was idling, then
                    // there is no reason to enqueue our checker on it since that
                    // is as good as it not being deadlocked.  This avoid having
                    // to do a context switch to check the thread.  Note that we
                    // only do this if mCheckReboot is false and we have no
                    // monitors, since those would need to be executed at this point.
                    mCompleted = true;
                    return;
                }
    
                if (!mCompleted) {
                    // we already have a check in flight, so no need
                    return;
                }
    
                mCompleted = false;
                mCurrentMonitor = null;
                mStartTime = SystemClock.uptimeMillis();
                mHandler.postAtFrontOfQueue(this);
            }
    

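    A HandlerChecker has to monitor both services and a thread:
    mMonitors.size() == 0 means this HandlerChecker has no services to monitor.
    mHandler.getLooper().isIdling() == true means the monitored thread is idle and therefore healthy. When both hold, mCompleted is set to true and the method returns. Otherwise, unless a check is already in flight, mCompleted is set to false, the send time is recorded in mStartTime, and a message is posted to the front of the monitored thread's queue (note that HandlerChecker implements Runnable). Its run method: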

    public void run() {
                final int size = mMonitors.size();
                for (int i = 0 ; i < size ; i++) {
                    synchronized (Watchdog.this) {
                        mCurrentMonitor = mMonitors.get(i);
                    }
                    mCurrentMonitor.monitor();
                }
    
                synchronized (Watchdog.this) {
                    mCompleted = true;
                    mCurrentMonitor = null;
                }
            }
    

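    If mCurrentMonitor.monitor() cannot acquire the lock inside the service, the thread blocks there, the code after it never runs, and mCompleted is never set to true. If the thread does not block, mCompleted becomes true, which means the thread or service is healthy. Otherwise something may be wrong, and whether it really is gets decided by whether the wait has exceeded the allowed time.
    A monitored service usually implements the Monitor interface's method like this: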

     public void monitor() {
            synchronized (this) { }
        }
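
    The lock probe can be reproduced outside Android with two threads. The sketch below is an illustration under stated assumptions, not the Watchdog implementation: LockProbeDemo, Checker, and probe are hypothetical names, and a plain Thread stands in for the Handler that runs HandlerChecker.

```java
import java.util.concurrent.CountDownLatch;

final class LockProbeDemo {
    /** Plays the role of HandlerChecker.run(): probe the lock, then mark completion. */
    static final class Checker implements Runnable {
        private final Object serviceLock;
        private volatile boolean completed;

        Checker(Object serviceLock) {
            this.serviceLock = serviceLock;
        }

        @Override
        public void run() {
            synchronized (serviceLock) { } // same idea as monitor()
            completed = true;
        }

        boolean isCompleted() {
            return completed;
        }
    }

    // Returns {completed while the lock was held, completed after it was released}.
    static boolean[] probe(long waitMillis) {
        final Object lock = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (lock) {
                held.countDown();
                try {
                    release.await(); // simulate a service stuck while holding its lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        try {
            holder.start();
            held.await();
            Checker checker = new Checker(lock);
            Thread checkerThread = new Thread(checker);
            checkerThread.start();
            checkerThread.join(waitMillis); // bounded wait, like the watchdog's timeout
            boolean whileHeld = checker.isCompleted();
            release.countDown();
            checkerThread.join();
            holder.join();
            return new boolean[] {whileHeld, checker.isCompleted()};
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = probe(200);
        System.out.println("completed while lock held: " + result[0]);
        System.out.println("completed after release:   " + result[1]);
    }
}
```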
    

    2. After posting messages to the monitored threads, calls wait so that the Watchdog thread sleeps for a while (CHECK_INTERVAL, 30 seconds):

    static final boolean DB = false;
    
    // Set this to true to have the watchdog record kernel thread stacks when it fires
    static final boolean RECORD_KERNEL_THREADS = true;
    
    static final long DEFAULT_TIMEOUT = DB ? 10*1000 : 60*1000;
    static final long CHECK_INTERVAL = DEFAULT_TIMEOUT / 2;
    

    3. Calls evaluateCheckerCompletionLocked() to check, based on how long each checker has been waiting, whether any thread or service is in trouble:

    // These are temporally ordered: larger values as lateness increases
    static final int COMPLETED = 0;
    static final int WAITING = 1;
    static final int WAITED_HALF = 2;
    static final int OVERDUE = 3;
    
    private int evaluateCheckerCompletionLocked() {
            int state = COMPLETED;
            for (int i=0; i<mHandlerCheckers.size(); i++) {
                HandlerChecker hc = mHandlerCheckers.get(i);
                state = Math.max(state, hc.getCompletionStateLocked());
            }
            return state;
        }
    

    ->HandlerChecker.java->getCompletionStateLocked

    public int getCompletionStateLocked() {
                if (mCompleted) {
                    return COMPLETED;
                } else {
                    long latency = SystemClock.uptimeMillis() - mStartTime;
                    if (latency < mWaitMax/2) {
                        return WAITING;
                    } else if (latency < mWaitMax) {
                        return WAITED_HALF;
                    }
                }
                return OVERDUE;
            }
    

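    mStartTime is the time at which scheduleCheckLocked() posted the message to the monitored thread in step 1. If mCompleted is true (the thread is healthy and did not block), getCompletionStateLocked returns COMPLETED directly; if it is false, the waiting time is computed. The states are:
    COMPLETED: value 0, the checker is in good shape;
    WAITING: value 1, still waiting for the message to be processed, within normal limits;
    WAITED_HALF: value 2, still waiting, and the wait has exceeded half the allowed time;
    OVERDUE: value 3, the wait has exceeded the allowed time.

    Finally, as long as the state is not OVERDUE, run() continues with the next iteration; otherwise SystemServer is killed.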
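
    The state calculation is easy to exercise in isolation. The sketch below copies the thresholds from getCompletionStateLocked and the max-aggregation from evaluateCheckerCompletionLocked, but takes the latency as a parameter instead of reading the clock; the class and method names are hypothetical.

```java
final class CompletionStateDemo {
    static final int COMPLETED = 0;
    static final int WAITING = 1;
    static final int WAITED_HALF = 2;
    static final int OVERDUE = 3;

    // Same thresholds as getCompletionStateLocked(), with the latency passed in.
    static int completionState(boolean completed, long latencyMillis, long waitMaxMillis) {
        if (completed) {
            return COMPLETED;
        }
        if (latencyMillis < waitMaxMillis / 2) {
            return WAITING;
        }
        if (latencyMillis < waitMaxMillis) {
            return WAITED_HALF;
        }
        return OVERDUE;
    }

    // The watchdog reports the worst state among all checkers.
    static int evaluate(int... states) {
        int worst = COMPLETED;
        for (int state : states) {
            worst = Math.max(worst, state);
        }
        return worst;
    }

    public static void main(String[] args) {
        long waitMax = 60_000; // DEFAULT_TIMEOUT when DB is false
        System.out.println(completionState(false, 10_000, waitMax)); // prints 1 (WAITING)
        System.out.println(completionState(false, 45_000, waitMax)); // prints 2 (WAITED_HALF)
        System.out.println(completionState(false, 60_000, waitMax)); // prints 3 (OVERDUE)
        System.out.println(evaluate(COMPLETED, WAITING, WAITED_HALF)); // prints 2
    }
}
```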
    Reference: 《深入解析Android 5.0系统》
