Android Process Priority and the LowMemoryKiller Mechanism

Author: mao眼 | Published 2016-12-07 16:01 | Read 3528 times

    Continued from the previous part.

    3 Low Memory Killer

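    Android's Low Memory Killer (LMK) is a memory-management mechanism adapted from the OOM killer in the standard Linux kernel. When the system runs short of memory, it kills expendable processes to reclaim their memory. Two criteria decide which process is expendable: its oom_adj and the amount of memory it occupies. oom_adj represents the process's priority: the higher the value, the lower the priority and the more readily the process is killed. Each oom_adj level can be paired with a free-memory threshold. The Android kernel periodically checks whether free memory has fallen below one of these thresholds. If it has, it kills the expendable process with the largest oom_adj; if several processes share that value, it picks the one using the most memory, and it keeps going until free memory is back above the threshold.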

    The LowMemoryKiller thresholds are stored mainly in two files:

    • /sys/module/lowmemorykiller/parameters/adj
    • /sys/module/lowmemorykiller/parameters/minfree

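    adj holds the oom_score_adj levels at which the system kills processes, and minfree holds the corresponding free-memory thresholds, counted in pages.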

    Settings on a Nexus 6 running Android 7.0 (an OS built from source, so they may differ from a retail device):

    shamu:/ # cat /sys/module/lowmemorykiller/parameters/adj
    0,100,200,300,900,906
    shamu:/ # cat /sys/module/lowmemorykiller/parameters/minfree
    18432,23040,27648,32256,36864,46080
    

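    Example: write 1,6 to the node /sys/module/lowmemorykiller/parameters/adj and 1024,8192 to the node /sys/module/lowmemorykiller/parameters/minfree.

    Policy: when available memory falls below 8192 pages, processes with oom_score_adj >= 6 are killed; when it falls below 1024 pages, processes with oom_score_adj >= 1 are killed.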
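The lookup that this policy implies can be sketched in a few lines of C. This is a minimal user-space illustration, not the driver code: the function name `lmk_min_score_adj` and the `LMK_NO_KILL` sentinel are made up for the example, and it looks at a single free-page count only, whereas the driver's lowmem_shrink also compares the file-cache page count against the same threshold.

```c
#include <stddef.h>

/* Hypothetical sentinel: no threshold crossed, nothing should be killed. */
#define LMK_NO_KILL 1001

/*
 * Return the lowest oom_score_adj that is eligible for killing, given the
 * number of free pages. adj[] and minfree[] are parallel arrays sorted in
 * ascending order, in the same layout as the two sysfs files.
 */
static int lmk_min_score_adj(long free_pages, const int *adj,
                             const long *minfree, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (free_pages < minfree[i])
            return adj[i];   /* first (strictest) threshold crossed */
    }
    return LMK_NO_KILL;
}
```

With adj = {1, 6} and minfree = {1024, 8192}, a free-page count of 5000 selects 6, a count of 500 selects 1, and a count of 9000 crosses no threshold. Assuming 4 KB pages, 8192 pages correspond to 32 MB.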

    3.1 lmkd 守护进程

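    In user space, LMK runs as the lmkd daemon, which is started when the system boots. Its source is in system/core/lmkd/lmkd.c.

    lmkd creates a socket named lmkd, with its node at /dev/socket/lmkd. The socket is used to communicate with the upper-layer framework.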

    service lmkd /system/bin/lmkd
        class core
        critical
        socket lmkd seqpacket 0660 system system
        writepid /dev/cpuset/system-background/tasks 
    

    lmkd receives commands from the framework and acts on them:

    Command          Function                                       Framework method
    LMK_PROCPRIO     Set a process's adj                            PL.setOomAdj()
    LMK_TARGET       Update the oom_adj thresholds (adj, minfree)   PL.updateOomLevels()
    LMK_PROCREMOVE   Remove a process                               PL.remove()

    lmkd socket command handling

    static void ctrl_command_handler(void) {
        int ibuf[CTRL_PACKET_MAX / sizeof(int)];
        int len;
        int cmd = -1;
        int nargs;
        int targets;
        len = ctrl_data_read((char *)ibuf, CTRL_PACKET_MAX);
        if (len <= 0)
            return;
        nargs = len / sizeof(int) - 1;
        if (nargs < 0)
            goto wronglen;
        // convert from network byte order to host byte order
        cmd = ntohl(ibuf[0]);
        switch(cmd) {
        case LMK_TARGET:
            targets = nargs / 2;
            if (nargs & 0x1 || targets > (int)ARRAY_SIZE(lowmem_adj))
                goto wronglen;
            cmd_target(targets, &ibuf[1]);
            break;
        case LMK_PROCPRIO:
            if (nargs != 3)
                goto wronglen;
            // set the process's adj
            cmd_procprio(ntohl(ibuf[1]), ntohl(ibuf[2]), ntohl(ibuf[3]));
            break;
        case LMK_PROCREMOVE:
            if (nargs != 1)
                goto wronglen;
            cmd_procremove(ntohl(ibuf[1]));
            break;
        default:
            ALOGE("Received unknown command code %d", cmd);
            return;
        }
        return;
    wronglen:
        ALOGE("Wrong control socket read length cmd=%d len=%d", cmd, len);
    }
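To make the wire format that this handler parses concrete, here is a hedged sketch of how a sender could build an LMK_PROCPRIO packet. The helper name `lmk_pack_procprio` is hypothetical, and the numeric command codes are an assumption that follows the enum order LMK_TARGET, LMK_PROCPRIO, LMK_PROCREMOVE. The only facts taken from the handler are that every field is a 32-bit integer in network byte order and that LMK_PROCPRIO carries exactly three arguments.

```c
#include <arpa/inet.h>  /* htonl, ntohl */
#include <stddef.h>
#include <stdint.h>

/* Command codes, assumed to follow the enum order used in lmkd.c. */
enum lmk_cmd { LMK_TARGET, LMK_PROCPRIO, LMK_PROCREMOVE };

/*
 * Fill buf with an LMK_PROCPRIO packet: the command word followed by
 * pid, uid and oomadj, each as a 32-bit integer in network byte order.
 * Returns the packet length in bytes, or 0 if buf is too small.
 */
static size_t lmk_pack_procprio(uint32_t *buf, size_t buf_words,
                                int pid, int uid, int oomadj)
{
    if (buf_words < 4)
        return 0;
    buf[0] = htonl(LMK_PROCPRIO);
    buf[1] = htonl((uint32_t)pid);
    buf[2] = htonl((uint32_t)uid);
    buf[3] = htonl((uint32_t)oomadj);
    return 4 * sizeof(uint32_t);
}
```

A 16-byte packet gives nargs = 16 / 4 - 1 = 3 on the receiving side, which is what the LMK_PROCPRIO branch checks for. Writing the buffer to the seqpacket socket at /dev/socket/lmkd is left out of the sketch.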
    

    Setting a process's adj

    static void cmd_procprio(int pid, int uid, int oomadj) {
        struct proc *procp;
        char path[80];
        char val[20];
        ...
        snprintf(path, sizeof(path), "/proc/%d/oom_score_adj", pid);
        snprintf(val, sizeof(val), "%d", oomadj);
        // write oomadj to the node /proc/<pid>/oom_score_adj
        writefilestring(path, val);
    
        // when the in-kernel interface is used, return right away
        if (use_inkernel_interface)
            return;
        procp = pid_lookup(pid);
        if (!procp) {
                procp = malloc(sizeof(struct proc));
                if (!procp) {
                    // Oh, the irony.  May need to rebuild our state.
                    return;
                }
                procp->pid = pid;
                procp->uid = uid;
                procp->oomadj = oomadj;
                proc_insert(procp);
        } else {
            proc_unslot(procp);
            procp->oomadj = oomadj;
            proc_slot(procp);
        }
    }
    

    This writes oomadj to the node /proc/<pid>/oom_score_adj. Because use_inkernel_interface=1, the next step is to look at what happens in the kernel.
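One way to check the effect of this write is to read the node back. The sketch below is an illustration under assumptions: the helper names `oom_score_adj_path` and `read_oom_score_adj` are invented for the example and are not part of lmkd, and the code expects a Linux-style procfs.

```c
#include <stdio.h>

/* Build "/proc/<pid>/oom_score_adj" into path; returns snprintf's result. */
static int oom_score_adj_path(char *path, size_t size, int pid)
{
    return snprintf(path, size, "/proc/%d/oom_score_adj", pid);
}

/*
 * Read the current oom_score_adj of a process into *value.
 * Returns 0 on success, -1 if the file cannot be opened or parsed
 * (for example on a system without procfs).
 */
static int read_oom_score_adj(int pid, int *value)
{
    char path[80];
    FILE *fp;
    int ok;

    oom_score_adj_path(path, sizeof(path), pid);
    fp = fopen(path, "r");
    if (!fp)
        return -1;
    ok = (fscanf(fp, "%d", value) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

On a device, the value read back for an app process should match the adj that the framework last sent through LMK_PROCPRIO.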

    Summary:

    In later releases use_inkernel_interface will probably give way to the user-space policy. For now it is still 1, so:

    • LMK_TARGET: while AMS.updateConfiguration() runs, updateOomLevels() is called, and the values are written to the minfree and adj nodes under /sys/module/lowmemorykiller/parameters;
    • LMK_PROCPRIO: while AMS.applyOomAdjLocked() runs, setOomAdj() is called; lmkd writes oomadj to /proc/<pid>/oom_score_adj and returns right away;
    • LMK_PROCREMOVE: during AMS.handleAppDiedLocked() or AMS.cleanUpApplicationRecordLocked(), remove() is called; at present it does nothing and returns immediately.

    3.2 LowMemoryKiller Kernel driver

    The lowmemorykiller driver is in drivers/staging/android/lowmemorykiller.c.

    lowmemorykiller

    static struct shrinker lowmem_shrinker = {
        .shrink = lowmem_shrink,
        .seeks = DEFAULT_SEEKS * 16
    };
    
    static int __init lowmem_init(void)
    {
        register_shrinker(&lowmem_shrinker);
        vmpressure_notifier_register(&lmk_vmpr_nb);
        return 0;
    }
    
    static void __exit lowmem_exit(void)
    {
        unregister_shrinker(&lowmem_shrinker);
    }
    

    register_shrinker and unregister_shrinker are called at initialization and exit, respectively.

    shrinker

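    The LMK driver is implemented by registering a shrinker. The shrinker is the Linux kernel's standard mechanism for reclaiming memory pages, and it is driven by the kernel thread kswapd.

    When memory runs low, kswapd walks the list of shrinkers and calls back each registered shrinker function to reclaim pages; kswapd also wakes up periodically to perform memory work. Each zone maintains an active_list and an inactive_list, and the kernel moves pages between the two according to how actively they are used. Pages are finally reclaimed through shrink_slab and shrink_zone.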

    lowmem_shrink

    Triggering the shrink operation:

    static int lowmem_shrink(struct shrinker *s, struct shrink_control *sc)
    {
        struct task_struct *tsk;
        struct task_struct *selected = NULL;
        int rem = 0;
        int tasksize;
        int i;
        int ret = 0;
        short min_score_adj = OOM_SCORE_ADJ_MAX + 1; //1001
        int minfree = 0;
        int selected_tasksize = 0;
        int selected_oom_score_adj;
        int array_size = ARRAY_SIZE(lowmem_adj);
        int other_free;
        int other_file;
        unsigned long nr_to_scan = sc->nr_to_scan;
    
        if (nr_to_scan > 0) {
            if (mutex_lock_interruptible(&scan_mutex) < 0)
                return 0;
        }
    
        // free memory remaining, in pages
        other_free = global_page_state(NR_FREE_PAGES);
    
        if (global_page_state(NR_SHMEM) + total_swapcache_pages <
            global_page_state(NR_FILE_PAGES))
            other_file = global_page_state(NR_FILE_PAGES) -
                            global_page_state(NR_SHMEM) -
                            total_swapcache_pages;
        else
            other_file = 0;
    
        tune_lmk_param(&other_free, &other_file, sc);
    
        if (lowmem_adj_size < array_size)
            array_size = lowmem_adj_size;
        if (lowmem_minfree_size < array_size)
            array_size = lowmem_minfree_size;
        for (i = 0; i < array_size; i++) {
            minfree = lowmem_minfree[i];
            if (other_free < minfree && other_file < minfree) {
                min_score_adj = lowmem_adj[i];
                break;
            }
        }
        if (nr_to_scan > 0) {
            ret = adjust_minadj(&min_score_adj);
            lowmem_print(3, "lowmem_shrink %lu, %x, ofree %d %d, ma %hd\n",
                    nr_to_scan, sc->gfp_mask, other_free,
                    other_file, min_score_adj);
        }
    
        rem = global_page_state(NR_ACTIVE_ANON) +
            global_page_state(NR_ACTIVE_FILE) +
            global_page_state(NR_INACTIVE_ANON) +
            global_page_state(NR_INACTIVE_FILE);
        if (nr_to_scan <= 0 || min_score_adj == OOM_SCORE_ADJ_MAX + 1) {
            lowmem_print(5, "lowmem_shrink %lu, %x, return %d\n",
                     nr_to_scan, sc->gfp_mask, rem);
    
            if (nr_to_scan > 0)
                mutex_unlock(&scan_mutex);
    
            if ((min_score_adj == OOM_SCORE_ADJ_MAX + 1) &&
                (nr_to_scan > 0))
                trace_almk_shrink(0, ret, other_free, other_file, 0);
    
            return rem;
        }
        selected_oom_score_adj = min_score_adj;
    
        rcu_read_lock();
        for_each_process(tsk) {
            struct task_struct *p;
            int oom_score_adj;
    
            if (tsk->flags & PF_KTHREAD)
                continue;
    
            /* if task no longer has any memory ignore it */
            if (test_task_flag(tsk, TIF_MM_RELEASED))
                continue;
    
            if (time_before_eq(jiffies, lowmem_deathpending_timeout)) {
                if (test_task_flag(tsk, TIF_MEMDIE)) {
                    rcu_read_unlock();
                    /* give the system time to free up the memory */
                    msleep_interruptible(20);
                    mutex_unlock(&scan_mutex);
                    return 0;
                }
            }
    
            p = find_lock_task_mm(tsk);
            if (!p)
                continue;
    
            oom_score_adj = p->signal->oom_score_adj;
            // skip the process: its oom_score_adj is below the minimum
            if (oom_score_adj < min_score_adj) {
                task_unlock(p);
                continue;
            }
            // resident set size (RSS) of the process
            tasksize = get_mm_rss(p->mm);
            task_unlock(p);
            if (tasksize <= 0)
                continue;
            if (selected) {
                if (oom_score_adj < selected_oom_score_adj)
                    continue;
                if (oom_score_adj == selected_oom_score_adj &&
                    tasksize <= selected_tasksize)
                    continue;
            }
            selected = p;
            selected_tasksize = tasksize;
            selected_oom_score_adj = oom_score_adj;
            lowmem_print(3, "select '%s' (%d), adj %hd, size %d, to kill\n",
                     p->comm, p->pid, oom_score_adj, tasksize);
        }
        if (selected) {
            lowmem_print(1, "Killing '%s' (%d), adj %d,\n" \
                    "   to free %ldkB on behalf of '%s' (%d) because\n" \
                    "   cache %ldkB is below limit %ldkB for oom_score_adj %hd\n" \
                    "   Free memory is %ldkB above reserved.\n" \
                    "   Free CMA is %ldkB\n" \
                    "   Total reserve is %ldkB\n" \
                    "   Total free pages is %ldkB\n" \
                    "   Total file cache is %ldkB\n" \
                    "   Slab Reclaimable is %ldkB\n" \
                    "   Slab UnReclaimable is %ldkB\n" \
                    "   Total Slab is %ldkB\n" \
                    "   GFP mask is 0x%x\n",
                     selected->comm, selected->pid,
                     selected_oom_score_adj,
                     selected_tasksize * (long)(PAGE_SIZE / 1024),
                     current->comm, current->pid,
                     other_file * (long)(PAGE_SIZE / 1024),
                     minfree * (long)(PAGE_SIZE / 1024),
                     min_score_adj,
                     other_free * (long)(PAGE_SIZE / 1024),
                     global_page_state(NR_FREE_CMA_PAGES) *
                    (long)(PAGE_SIZE / 1024),
                     totalreserve_pages * (long)(PAGE_SIZE / 1024),
                     global_page_state(NR_FREE_PAGES) *
                    (long)(PAGE_SIZE / 1024),
                     global_page_state(NR_FILE_PAGES) *
                    (long)(PAGE_SIZE / 1024),
                     global_page_state(NR_SLAB_RECLAIMABLE) *
                    (long)(PAGE_SIZE / 1024),
                     global_page_state(NR_SLAB_UNRECLAIMABLE) *
                    (long)(PAGE_SIZE / 1024),
                     global_page_state(NR_SLAB_RECLAIMABLE) *
                    (long)(PAGE_SIZE / 1024) +
                     global_page_state(NR_SLAB_UNRECLAIMABLE) *
                    (long)(PAGE_SIZE / 1024),
                     sc->gfp_mask);
    
            if (lowmem_debug_level >= 2 && selected_oom_score_adj == 0) {
                show_mem(SHOW_MEM_FILTER_NODES);
                dump_tasks(NULL, NULL);
                show_mem_call_notifiers();
            }
    
            lowmem_deathpending_timeout = jiffies + HZ;
            send_sig(SIGKILL, selected, 0);
            set_tsk_thread_flag(selected, TIF_MEMDIE);
            rem -= selected_tasksize;
            rcu_read_unlock();
            /* give the system time to free up the memory */
            msleep_interruptible(20);
            trace_almk_shrink(selected_tasksize, ret,
                other_free, other_file, selected_oom_score_adj);
        } else {
            trace_almk_shrink(1, ret, other_free, other_file, 0);
            rcu_read_unlock();
        }
    
        lowmem_print(4, "lowmem_shrink %lu, %x, return %d\n",
                 nr_to_scan, sc->gfp_mask, rem);
        mutex_unlock(&scan_mutex);
        return rem;
    }
    
    
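    • The process chosen to be killed is the one with the largest oom_score_adj and, among processes that share that value, the largest RSS.
    • Kill method: send_sig(SIGKILL, selected, 0) sends signal 9 to the selected process to kill it.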
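The selection rule can be isolated from the kernel plumbing in a small user-space sketch. The `struct proc_info` snapshot and the function `lmk_select_victim` are hypothetical names introduced for illustration; only the comparison order (oom_score_adj first, then RSS) mirrors the loop in lowmem_shrink.

```c
#include <stddef.h>

/* Hypothetical snapshot of the per-process fields the selection needs. */
struct proc_info {
    int pid;
    int oom_score_adj;
    long rss_pages;
};

/*
 * Return the entry to kill, or NULL if none qualifies: the highest
 * oom_score_adj at or above min_score_adj, ties broken by larger RSS.
 */
static const struct proc_info *
lmk_select_victim(const struct proc_info *procs, size_t n, int min_score_adj)
{
    const struct proc_info *selected = NULL;

    for (size_t i = 0; i < n; i++) {
        const struct proc_info *p = &procs[i];

        /* Below the threshold, or no resident memory: not a candidate. */
        if (p->oom_score_adj < min_score_adj || p->rss_pages <= 0)
            continue;
        if (selected) {
            if (p->oom_score_adj < selected->oom_score_adj)
                continue;
            if (p->oom_score_adj == selected->oom_score_adj &&
                p->rss_pages <= selected->rss_pages)
                continue;
        }
        selected = p;
    }
    return selected;
}
```

Note that a large process with a low oom_score_adj is never chosen ahead of a small process with a higher one; memory size only breaks ties.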

    lmkd parameters

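    • oom_adj: the process's priority; the larger the value, the lower the priority and the more readily the process is killed. Range [-16, 15]; the special value -17 (OOM_DISABLE) exempts the process from being killed.
    • oom_score_adj: range [-1000, 1000].
    • oom_score: does not appear to be used anywhere in the LMK policy; it is probably used only by the kernel OOM killer.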

    How lowmem_oom_adj_to_oom_score_adj computes the value:

    static int lowmem_oom_adj_to_oom_score_adj(int oom_adj)
    {
        if (oom_adj == OOM_ADJUST_MAX)
            return OOM_SCORE_ADJ_MAX;
        else
            return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
    }
    
    • When oom_adj = 15, oom_score_adj = 1000;
    • When oom_adj < 15, oom_score_adj = oom_adj * 1000 / 17.
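For a few worked numbers, the same conversion can be compiled outside the kernel. The constant values below (15, -17, 1000) are assumed to match the kernel's oom.h, and the function name is shortened for this stand-alone sketch.

```c
/* Values assumed to match the kernel's oom.h, for a stand-alone build. */
#define OOM_DISABLE       (-17)
#define OOM_ADJUST_MAX    15
#define OOM_SCORE_ADJ_MAX 1000

static int oom_adj_to_oom_score_adj(int oom_adj)
{
    if (oom_adj == OOM_ADJUST_MAX)
        return OOM_SCORE_ADJ_MAX;
    /* Integer division truncates toward zero: 6 * 1000 / 17 == 352. */
    return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}
```

Under these assumptions the legacy pair 1,6 from the earlier example maps to 58 and 352. The special case for 15 matters because the plain formula would give 15 * 1000 / 17 = 882 and never reach 1000.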

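    4 Summary

    The whole process can be summarized as follows:

    • The framework layer assigns adj values dynamically according to the lifecycle state of each type of process, and at certain points it updates the adj of all processes;
    • When adj is updated, the framework talks to the lmkd daemon, which updates the parameters of the LMK driver and sets /proc/<pid>/oom_score_adj;
    • The lowmemorykiller driver is invoked by the Linux kernel's memory shrinker mechanism. During a shrink operation it evaluates each process's adj and RSS and, according to the driver's adj and minfree settings, kills processes.

    So, for the problem of background apps being reclaimed, two things deserve extra attention:

    • The process lifecycle and the five priority classes
    • Reducing memory usage, and releasing memory promptly when a trim-memory callback arrives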

    References:

    1. API Guide: Processes and Threads
    2. ActivityManagerService.java
    3. ProcessList.java
    4. lmkd.c
    5. lowmemorykiller.c
