iOS Launch Speed

Author: coder_feng | Published 2021-07-13 10:31

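    1. Background

    Launch time is one of the standard measures of an app's user experience. As a rule of thumb:

    Within 1 second: fast
    1 to 3 seconds: acceptable
    More than 3 seconds: slow
    More than 5 seconds: very slow
    For users, faster is always better. Past 5 seconds, users may well be tempted to uninstall the app, and our 画啦啦 app currently does take more than 5 seconds to launch, so optimization is a must.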

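    2. App launch concepts

    iOS distinguishes three kinds of launch: Cold Launch, Warm Launch, and Resume Launch.

    • Cold Launch: the app is launched while the system holds no information about it, for example after a reboot, or when the app was killed some time ago and is opened again;
    • Warm Launch: the app was terminated recently, so part of it is still in memory but no process exists, and the user taps to open it;
    • Resume Launch: the user pressed the Home button to send the app to the background and then taps it again, or switches back to it from another app.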

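    3. Definition of app launch time (for the project I worked on)

    Launch time is usually split into the part before main and the part after main, and the part after main is usually measured up to the point where didFinishLaunchingWithOptions returns, as shown in the figure below.


    启动优化1.png

    For this round of optimization we define launch time as running from the tap on the app icon until the home page has loaded its first screen of data. To compare metrics before and after, we record the timestamps listed below so that they can guide later optimization.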

    4. Instrumentation points for each launch phase (OS, device model, and app version included)

    Field                      Type     Description
    event                      String   Always appStartUp (app launch)
    processStartTime           double   Timestamp at which the process was created
    mainTime                   double   Timestamp on entering main
    didFinishStartTime         double   Timestamp on entering didFinishLaunchingWithOptions
    didFinishEndTime           double   Timestamp when didFinishLaunchingWithOptions returns
    didBecomeActiveStartTime   double   Timestamp on entering applicationDidBecomeActive
    didBecomeActiveEndTime     double   Timestamp when applicationDidBecomeActive returns
    viewDidLoadStartTime       double   Timestamp on entering viewDidLoad of the first-screen ViewController
    viewDidLoadEndTime         double   Timestamp when that viewDidLoad returns
    viewDidAppearStartTime     double   Timestamp on entering viewDidAppear of the first-screen ViewController
    viewDidAppearEndTime       double   Timestamp when that viewDidAppear returns
    requestStartTime           double   Timestamp at which the first-screen ViewController starts its API request
    requestEndTime             double   Timestamp at which the first-screen API response arrives
    deviceId                   String   Unique device identifier
    appVersion                 String   App version
    deviceType                 String   Device model
    os                         String   OS version
    pageType                   long     PageType_Visitor = 1 (guest page), PageType_Home = 2 (home page)
    requestStatus              long     RequestStatus_Success = 1 (request succeeded), RequestStatus_Fail = 2 (request failed), RequestStatus_Error = 3 (no network)

    Reporting: upload a single record once the last checkpoint has been recorded, and upload only once per app lifecycle.
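The reporting rule (collect every checkpoint, then upload exactly once) can be sketched as follows. This is only an illustrative sketch: the `LaunchMetrics` class, its `mark` and `elapsed` methods, and the uploader callback are hypothetical names that are not part of the project, and the real app would feed in timestamps from its own clock and send the record to its log service.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical recorder: collects named timestamps (in seconds) and hands the
// complete set to an uploader exactly once per process lifetime.
class LaunchMetrics {
public:
    using Uploader = std::function<void(const std::map<std::string, double>&)>;

    LaunchMetrics(std::string lastCheckpoint, Uploader uploader)
        : lastCheckpoint_(std::move(lastCheckpoint)), uploader_(std::move(uploader)) {
        assert(uploader_ && "an uploader callback is required");
    }

    // Record a checkpoint; upload as soon as the final checkpoint is seen.
    void mark(const std::string& name, double timestamp) {
        if (uploaded_) return;              // only one report per lifetime
        points_[name] = timestamp;
        if (name == lastCheckpoint_) {
            uploaded_ = true;
            uploader_(points_);
        }
    }

    // Duration between two recorded checkpoints, or -1 if either is missing.
    double elapsed(const std::string& from, const std::string& to) const {
        auto a = points_.find(from);
        auto b = points_.find(to);
        if (a == points_.end() || b == points_.end()) return -1.0;
        return b->second - a->second;
    }

private:
    std::string lastCheckpoint_;
    Uploader uploader_;
    std::map<std::string, double> points_;
    bool uploaded_ = false;
};
```

With `requestEndTime` as the last checkpoint, `elapsed("processStartTime", "requestEndTime")` gives the launch time as defined in section 3, and later calls to `mark` are ignored so the record cannot be uploaded twice.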

    endpoint: cn-shenzhen.log.aliyuncs.com
     
    project: live-logs
     
    logstore: live-business-log (production), live-monitor-test (testing), live-monitor-dev (development)
     
    AccessKeyID: xxxxxxxx
    AccessKeySecret: yyyyyyy
    

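    5. Test environment

    5.1 Environment setup
    • Reboot the phone and wait 2 to 3 minutes
    • Enable Airplane Mode
    • Sign out of the iCloud account
    • Test a release build
    • Measure warm launch time
    5.2 Test devices

    Test on as many of the devices at hand as possible, since users run all kinds of hardware. Here I used an iPhone 6s Plus and an iPhone X.
    Measured data: the times here are longer than the pre-main time, for the reason shown in the screenshots below (pre-main time runs from the loading of dyld to the end of the initializers).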


    exec1.png exec2.png

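    6. App launch flow

    This part touches on some source code but does not analyze it in depth; the goal is only to understand the process. Any discussion of app launch has to mention dyld, the dynamic linker, which does almost all of the work in the pre-main phase. First, a look at dyld's history.

    6.1 History of dyld

    dyld itself has kept working to speed up app launch, as its version history shows. First generation: dyld 1.0 (1996-2004), which predates the widespread use of C++ in dynamic libraries. Second generation: dyld 2.0 (2004-2007), which implemented correct C++ initializer semantics; its later 2.x updates strengthened security checks and replaced prebinding entirely with the shared cache. Third generation: dyld 3 (2017-2021), which is safer and launches apps faster.

    6.2 dyld version comparison

    The WWDC session covers only dyld 2 and dyld 3 in detail, so only those two versions are compared here.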

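    • dyld 2 versus dyld 3

      dyld2和dyld3对比.png
    • What dyld 2 does

      dyld2.png
    • What dyld 3 does

      dyld3.png
    • The upper part of the dyld 3 diagram runs out of process


      dyld3_1.png
    • The lower part of the dyld 3 diagram runs in process


      dyld3_2.png
      dyld3_3.png

    As the figures above show, dyld 2 is purely in-process: it runs inside the app's process, which means it cannot start working until the app is launched. dyld 3 does part of the work ahead of time, which is why it performs better.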

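    Main workflow of dyld 2

    ① Parse the Mach-O file to find the libraries it depends on, then recursively find all of their dependencies, building a dependency graph of dynamic libraries. Most iOS apps depend on several hundred dynamic libraries (mostly system ones), so this step is a significant amount of work.
     
    ② Map the Mach-O files into the process's address space.
     
    ③ Perform symbol lookups: if the app calls printf, for example, dyld has to find the address of printf in the system libraries and copy that address into the function pointer in the app.
     
    ④ Rebase and bind: because the app is loaded at a randomized address (ASLR), every pointer has to be adjusted by the base address.
     
    ⑤ Run the initializers, then call main().
     
    Problems: performance, security, and testability all fall short (dyld 3 addresses these).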
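Step ① of the dyld 2 workflow, the recursive dependency walk, can be illustrated with a small sketch. It is not dyld's implementation: `DependencyTable`, `collectDependencies`, and `loadOrder` are hypothetical names, and the table is a stand-in for the dependency information a real loader reads from each Mach-O file's load commands.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for "which libraries does this image link against?".
using DependencyTable = std::map<std::string, std::vector<std::string>>;

// Depth-first walk that appends each image after its dependencies, so the
// resulting order lists dependencies before the images that need them.
static void collectDependencies(const std::string& image, const DependencyTable& table,
                                std::set<std::string>& seen, std::vector<std::string>& order) {
    if (!seen.insert(image).second) return;   // already visited (also breaks cycles)
    auto it = table.find(image);
    if (it != table.end()) {
        for (const std::string& dep : it->second) {
            collectDependencies(dep, table, seen, order);
        }
    }
    order.push_back(image);
}

// Returns every image reachable from the main executable, dependencies first.
std::vector<std::string> loadOrder(const std::string& mainExecutable, const DependencyTable& table) {
    std::set<std::string> seen;
    std::vector<std::string> order;
    collectDependencies(mainExecutable, table, seen, order);
    assert(!order.empty() && order.back() == mainExecutable);
    return order;
}
```

With several hundred libraries in the table, every launch repeats this walk plus the file parsing behind it, which is the cost the text attributes to this step.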
    

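    Main workflow of dyld 3

    ① An out-of-process Mach-O parser/compiler.
    In dyld 2's loading flow, "parse Mach-O headers" and "find dependencies" carry security risks (an attacker can tamper with the Mach-O header or add a malicious @rpath), and "perform symbol lookups" burns a lot of CPU time even though, as long as a library file is unchanged, its symbols always sit at the same offsets in that library. dyld 3 does both parts ahead of time and caches the results in a file called a "launch closure" (think of it as a cache file). This component resolves all the search paths, @rpaths, and environment variables that can affect launch; it parses the Mach-O binaries, works out the dynamic libraries they depend on, and performs all symbol lookups; finally it packages the results as a launch closure and writes it to the cache, so that at launch the data can be read straight from the cache, which speeds up loading.
     
    It is an ordinary daemon process that can be tested with standard test infrastructure.
    Because this out-of-process part is an ordinary background daemon separated from the individual app processes, it makes dyld 3 more testable.
     
    ② An in-process engine that executes the launch closure. It validates the launch closure, maps the dylibs into the app's address space, and then jumps to main. It no longer has to parse Mach-O headers or perform symbol lookups, which saves a good deal of time.
     
    ③ A cache for launch closures.
     
    Launch closures for the apps built into iOS live directly in the shared cache, so no separate file even has to be opened.
     
    For third-party apps, the launch closure is generated when the app is installed or updated (and possibly when the OS is upgraded, since the system libraries change at that point). This ensures that the launch closure is always ready before the app is opened. The closure is written to a file, and later launches simply read and validate that file.
     
    On iOS, tvOS, and watchOS, all of this (building the launch closure) is done before the app launches. On macOS, where apps can be sideloaded, the in-process engine can start a daemon on first launch if needed, after which the launch closure is available. Either way, in most cases the work is finished before the app launches.
     
    Most launches never need to invoke the out-of-process Mach-O parser. A launch closure is also much simpler than a Mach-O file: it is a memory-mapped file that is easy to parse and validate and has been well optimized. That is why dyld 3 gives app launch a noticeable speedup.
    In short, dyld 3 moves many expensive operations ahead of time, which greatly improves launch speed.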
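The idea of "build once ahead of time, then validate and reuse at launch" can be sketched as a tiny cache. Everything here is an assumption made for illustration: `LaunchClosure`, `ClosureCache`, and the `binaryVersion` field are hypothetical and far simpler than dyld 3's real closure format, which this article does not show.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified "launch closure": the pre-computed result
// of dependency analysis and symbol lookup for one version of one binary.
struct LaunchClosure {
    std::uint64_t binaryVersion;                          // stand-in for a UUID or content hash
    std::vector<std::string> dylibs;                      // resolved dependency list
    std::map<std::string, std::uint64_t> symbolOffsets;   // symbol -> offset in its library
};

class ClosureCache {
public:
    // Ahead-of-time step: done at install or update time, not at launch.
    void store(const std::string& appPath, LaunchClosure closure) {
        assert(!appPath.empty());
        closures_[appPath] = std::move(closure);
    }

    // Launch-time step: return the closure only if it still matches the binary.
    // A nullptr result means the slow path (full re-analysis) is needed.
    const LaunchClosure* lookup(const std::string& appPath, std::uint64_t currentVersion) const {
        auto it = closures_.find(appPath);
        if (it == closures_.end()) return nullptr;
        if (it->second.binaryVersion != currentVersion) return nullptr;   // stale closure
        return &it->second;
    }

private:
    std::map<std::string, LaunchClosure> closures_;
};
```

The launch-time path is a lookup plus one comparison, which mirrors the point made above: the expensive parsing and symbol resolution happen before the user ever taps the icon.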
    
    6.3 How dyld works (dyld_start)

    Launching an app calls the __dyld_start routine, so tracing the calls made from it gives a good picture of the process.

    dyld source analysis

    #if __arm64__
        .data
        .align 3
    __dso_static:
        .quad   ___dso_handle
     
        .text
        .align 2
        .globl __dyld_start
    __dyld_start:
        mov     x28, sp
        and     sp, x28, #~15       // force 16-byte alignment of stack
        mov x0, #0
        mov x1, #0
        stp x1, x0, [sp, #-16]! // make aligned terminating frame
        mov fp, sp          // set up fp to point to terminating frame
        sub sp, sp, #16             // make room for local variables
        ldr     x0, [x28]       // get app's mh into x0
        ldr     x1, [x28, #8]           // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
        add     x2, x28, #16        // get argv into x2
        adrp    x4,___dso_handle@page
        add     x4,x4,___dso_handle@pageoff // get dyld's mh in to x4
        adrp    x3,__dso_static@page
        ldr     x3,[x3,__dso_static@pageoff] // get unslid start of dyld
        sub     x3,x4,x3        // x3 now has slide of dyld
        mov x5,sp                   // x5 has &startGlue
         
        // call dyldbootstrap::start(app_mh, argc, argv, slide, dyld_mh, &startGlue)
        bl  __ZN13dyldbootstrap5startEPK12macho_headeriPPKclS2_Pm
        mov x16,x0                  // save entry point address in x16
        ldr     x1, [sp]
        cmp x1, #0
        b.ne    Lnew
     
        // LC_UNIXTHREAD way, clean up stack and jump to result
        add sp, x28, #8     // restore unaligned stack pointer without app mh
        br  x16         // jump to the program's entry point
     
        // LC_MAIN case, set up stack for call to main()
    Lnew:   mov lr, x1          // simulate return address into _start in libdyld.dylib
        ldr     x0, [x28, #8]       // main param1 = argc
        add     x1, x28, #16        // main param2 = argv
        add x2, x1, x0, lsl #3 
        add x2, x2, #8      // main param3 = &env[0]
        mov x3, x2
    Lapple: ldr x4, [x3]
        add x3, x3, #8
        cmp x4, #0
        b.ne    Lapple          // main param4 = apple
        br  x16
     
    #endif // __arm64__
     
     
    

    dyldbootstrap::start source analysis

    //
    //  This is code to bootstrap dyld.  This work in normally done for a program by dyld and crt.
    //  In dyld we have to do this manually.
    //
    uintptr_t start(const struct macho_header* appsMachHeader, int argc, const char* argv[],
                    intptr_t slide, const struct macho_header* dyldsMachHeader,
                    uintptr_t* startGlue)
    {
        // if kernel had to slide dyld, we need to fix up load sensitive locations
        // we have to do this before using any global variables
        if ( slide != 0 ) {
            rebaseDyld(dyldsMachHeader, slide);
        }
     
        // allow dyld to use mach messaging
        mach_init();
     
        // kernel sets up env pointer to be just past end of agv array
        const char** envp = &argv[argc+1];
         
        // kernel sets up apple pointer to be just past end of envp array
        const char** apple = envp;
        while(*apple != NULL) { ++apple; }
        ++apple;
     
        // set up random value for stack canary
        __guard_setup(apple);
     
    #if DYLD_INITIALIZER_SUPPORT
        // run all C++ initializers inside dyld
        runDyldInitializers(dyldsMachHeader, slide, argc, argv, envp, apple);
    #endif
     
        // now that we are done bootstrapping dyld, call dyld's main
        uintptr_t appsSlide = slideOfMainExecutable(appsMachHeader); // get the ASLR slide for this launch
        return dyld::_main(appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
    }
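The pointer arithmetic that `start` uses to find `envp` and `apple` relies on the layout the comments describe: the `argv` array, a NULL, the environment strings, another NULL, then the `apple` strings. The sketch below isolates that walk; `ArgumentBlock` and `locateEnvAndApple` are hypothetical names introduced for illustration, not dyld symbols.

```cpp
#include <cassert>
#include <cstddef>

// Result of walking the argument block laid out as:
//   argv[0..argc-1], NULL, envp[...], NULL, apple[...], NULL
struct ArgumentBlock {
    const char** envp;
    const char** apple;
};

ArgumentBlock locateEnvAndApple(int argc, const char** argv) {
    assert(argc >= 0 && argv != nullptr);
    ArgumentBlock block;
    // envp begins just past the NULL that terminates argv.
    block.envp = &argv[argc + 1];
    // apple begins just past the NULL that terminates envp.
    const char** p = block.envp;
    while (*p != nullptr) ++p;
    block.apple = p + 1;
    return block;
}
```

This is also where `dyld::_main` later gets `executable_path` from: it is one of the strings in the `apple` array.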
    

    slideOfMainExecutable source analysis

    static uintptr_t slideOfMainExecutable(const struct macho_header* mh)
    {
        const uint32_t cmd_count = mh->ncmds;
        const struct load_command* const cmds = (struct load_command*)(((char*)mh)+sizeof(macho_header));
        const struct load_command* cmd = cmds;
        for (uint32_t i = 0; i < cmd_count; ++i) {
            if ( cmd->cmd == LC_SEGMENT_COMMAND ) {
                const struct macho_segment_command* segCmd = (struct macho_segment_command*)cmd;
                if ( (segCmd->fileoff == 0) && (segCmd->filesize != 0)) {
                    return (uintptr_t)mh - segCmd->vmaddr;
                }
            }
            // no match: move on to the next load command (__PAGEZERO -> __TEXT -> __DATA -> __LINKEDIT)
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
        return 0;
    }
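The slide calculation reduces to one subtraction once the right segment is found. The sketch below uses a simplified, hypothetical `Segment` record and a `computeSlide` helper in place of the real Mach-O load-command structures, and it assumes a non-negative slide, so it shows the arithmetic only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified segment record standing in for the fields that the
// real code reads from each segment load command.
struct Segment {
    std::uint64_t vmaddr;    // address the linker assigned at build time
    std::uint64_t fileoff;   // offset of the segment within the file
    std::uint64_t filesize;  // bytes of the file backing the segment
};

// The slide is the difference between where the header actually landed and
// where the first file-backed segment was linked to live.
std::uint64_t computeSlide(std::uint64_t loadedHeaderAddress, const std::vector<Segment>& segments) {
    for (const Segment& seg : segments) {
        // A zero-filesize segment such as __PAGEZERO is skipped; the segment
        // that starts at file offset 0 contains the Mach-O header itself.
        if (seg.fileoff == 0 && seg.filesize != 0) {
            assert(loadedHeaderAddress >= seg.vmaddr);   // sketch assumes a non-negative slide
            return loadedHeaderAddress - seg.vmaddr;
        }
    }
    return 0;   // no matching segment: treat the image as not slid
}
```

For example, an image linked at `0x100000000` whose header is found at `0x100ab0000` has a slide of `0xab0000`.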
    

    rebaseDyld

    static void rebaseDyld(const struct macho_header* mh, intptr_t slide)
    {
        // rebase non-lazy pointers (which all point internal to dyld, since dyld uses no shared libraries)
        // and get interesting pointers into dyld
        const uint32_t cmd_count = mh->ncmds;
        const struct load_command* const cmds = (struct load_command*)(((char*)mh)+sizeof(macho_header));
        const struct load_command* cmd = cmds;
        const struct macho_segment_command* linkEditSeg = NULL;
    #if __x86_64__
        const struct macho_segment_command* firstWritableSeg = NULL;
    #endif
        const struct dysymtab_command* dynamicSymbolTable = NULL;
        for (uint32_t i = 0; i < cmd_count; ++i) {
            switch (cmd->cmd) {
                case LC_SEGMENT_COMMAND:
                    {
                        const struct macho_segment_command* seg = (struct macho_segment_command*)cmd;
                        if ( strcmp(seg->segname, "__LINKEDIT") == 0 )
                            linkEditSeg = seg;
                        const struct macho_section* const sectionsStart = (struct macho_section*)((char*)seg + sizeof(struct macho_segment_command));
                        const struct macho_section* const sectionsEnd = &sectionsStart[seg->nsects];
                        for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
                            const uint8_t type = sect->flags & SECTION_TYPE;
                            if ( type == S_NON_LAZY_SYMBOL_POINTERS ) {
                                // rebase non-lazy pointers (which all point internal to dyld, since dyld uses no shared libraries)
                                const uint32_t pointerCount = (uint32_t)(sect->size / sizeof(uintptr_t));
                                uintptr_t* const symbolPointers = (uintptr_t*)(sect->addr + slide);
                                for (uint32_t j=0; j < pointerCount; ++j) {
                                    symbolPointers[j] += slide;
                                }
                            }
                        }
    #if __x86_64__
                        if ( (firstWritableSeg == NULL) && (seg->initprot & VM_PROT_WRITE) )
                            firstWritableSeg = seg;
    #endif
                    }
                    break;
                case LC_DYSYMTAB:
                    dynamicSymbolTable = (struct dysymtab_command *)cmd;
                    break;
            }
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
         
        // use reloc's to rebase all random data pointers
    #if __x86_64__
        const uintptr_t relocBase = firstWritableSeg->vmaddr + slide;
    #else
        const uintptr_t relocBase = (uintptr_t)mh;
    #endif
        const relocation_info* const relocsStart = (struct relocation_info*)(linkEditSeg->vmaddr + slide + dynamicSymbolTable->locreloff - linkEditSeg->fileoff);
        const relocation_info* const relocsEnd = &relocsStart[dynamicSymbolTable->nlocrel];
        for (const relocation_info* reloc=relocsStart; reloc < relocsEnd; ++reloc) {
            if ( reloc->r_length != RELOC_SIZE )
                throw "relocation in dyld has wrong size";
     
            if ( reloc->r_type != POINTER_RELOC )
                throw "relocation in dyld has wrong type";
             
            // update pointer by amount dyld slid
            *((uintptr_t*)(reloc->r_address + relocBase)) += slide;
        }
    }
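Stripped of the Mach-O parsing, `rebaseDyld` does one thing: it adds the slide to every slot that holds an absolute address. The sketch below shows only that core step on a toy image; `applySlide` and the slot-index representation are assumptions made for illustration, whereas the real code derives the slot addresses from section headers and relocation entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified rebase: `image` stands in for the mapped image as an array of
// pointer-sized slots, and `relocationSlots` lists the indexes of the slots
// that store absolute addresses computed at link time.
void applySlide(std::vector<std::uint64_t>& image,
                const std::vector<std::size_t>& relocationSlots,
                std::int64_t slide) {
    for (std::size_t slot : relocationSlots) {
        assert(slot < image.size());
        // Only listed slots are pointers; everything else is left untouched.
        image[slot] += static_cast<std::uint64_t>(slide);
    }
}
```

Slots that are not listed, such as plain integers or strings, must stay unchanged, which is why the loader needs relocation information rather than adjusting every word in the image.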
    

    dyld::_main

    uintptr_t
    _main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide,
            int argc, const char* argv[], const char* envp[], const char* apple[],
            uintptr_t* startGlue)
    {
        uintptr_t result = 0;
        // save the main executable's Mach-O header so later code can reach other information through it
        sMainExecutableMachHeader = mainExecutableMH;
    #if __MAC_OS_X_VERSION_MIN_REQUIRED
        // if this is host dyld, check to see if iOS simulator is being run
        const char* rootPath = _simple_getenv(envp, "DYLD_ROOT_PATH");
        if ( rootPath != NULL ) {
            // Add dyld to the kernel image info before we jump to the sim
            notifyKernelAboutDyld();
     
            // look to see if simulator has its own dyld
            char simDyldPath[PATH_MAX];
            strlcpy(simDyldPath, rootPath, PATH_MAX);
            strlcat(simDyldPath, "/usr/lib/dyld_sim", PATH_MAX);
            int fd = my_open(simDyldPath, O_RDONLY, 0);
            if ( fd != -1 ) {
                const char* errMessage = useSimulatorDyld(fd, mainExecutableMH, simDyldPath, argc, argv, envp, apple, startGlue, &result);
                if ( errMessage != NULL )
                    halt(errMessage);
                return result;
            }
        }
    #endif
     
        CRSetCrashLogMessage("dyld: launch started");
         
        // set up the link context
        setContext(mainExecutableMH, argc, argv, envp, apple);
     
        // Pickup the pointer to the exec path.
        // get the path of the executable
        sExecPath = _simple_getenv(apple, "executable_path");
     
        // <rdar://problem/13868260> Remove interim apple[0] transition code from dyld
        if (!sExecPath) sExecPath = apple[0];
         
        // convert a relative path into an absolute one
        if ( sExecPath[0] != '/' ) {
            // have relative path, use cwd to make absolute
            char cwdbuff[MAXPATHLEN];
            if ( getcwd(cwdbuff, MAXPATHLEN) != NULL ) {
                // maybe use static buffer to avoid calling malloc so early...
                char* s = new char[strlen(cwdbuff) + strlen(sExecPath) + 2];
                strcpy(s, cwdbuff);
                strcat(s, "/");
                strcat(s, sExecPath);
                sExecPath = s;
            }
        }
        // Remember short name of process for later logging
        // get the file name
        sExecShortName = ::strrchr(sExecPath, '/');
        if ( sExecShortName != NULL )
            ++sExecShortName;
        else
            sExecShortName = sExecPath;
         
        // determine whether the process is restricted
        configureProcessRestrictions(mainExecutableMH);
     
    #if __MAC_OS_X_VERSION_MIN_REQUIRED
        if ( gLinkContext.processIsRestricted ) {
            // strip the DYLD_* and LD_LIBRARY_PATH environment variables
            pruneEnvironmentVariables(envp, &apple);
            // set again because envp and apple may have changed or moved
            // set the context up again
            setContext(mainExecutableMH, argc, argv, envp, apple);
        }
        else
    #endif
        {
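            // check and apply the environment variables
            checkEnvironmentVariables(envp);
            // if the DYLD_FALLBACK paths are unset, use the defaults
            defaultUninitializedFallbackPaths(envp);
        }
        // print the arguments if DYLD_PRINT_OPTS is set
        if ( sEnv.DYLD_PRINT_OPTS )
            printOptions(argv);
        // print the environment variables if DYLD_PRINT_ENV is set
        if ( sEnv.DYLD_PRINT_ENV )
            printEnvironmentVariables(envp);
        // get information about the current host architecture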
        getHostInfo(mainExecutableMH, mainExecutableSlide);
        // install gdb notifier
        stateToHandlers(dyld_image_state_dependents_mapped, sBatchHandlers)->push_back(notifyGDB);
        stateToHandlers(dyld_image_state_mapped, sSingleHandlers)->push_back(updateAllImages);
        // make initial allocations large enough that it is unlikely to need to be re-alloced
        sImageRoots.reserve(16);
        sAddImageCallbacks.reserve(4);
        sRemoveImageCallbacks.reserve(4);
        sImageFilesNeedingTermination.reserve(16);
        sImageFilesNeedingDOFUnregistration.reserve(8);
     
    #if !TARGET_IPHONE_SIMULATOR
    #ifdef WAIT_FOR_SYSTEM_ORDER_HANDSHAKE
        // <rdar://problem/6849505> Add gating mechanism to dyld support system order file generation process
        WAIT_FOR_SYSTEM_ORDER_HANDSHAKE(dyld::gProcessInfo->systemOrderFlag);
    #endif
    #endif
     
     
        try {
            // add dyld itself to UUID list
            // add dyld itself to the UUID list
            addDyldImageToUUIDList();
            // notify the kernel
            notifyKernelAboutDyld();
     
    #if SUPPORT_ACCELERATE_TABLES
            bool mainExcutableAlreadyRebased = false;
     
    reloadAllImages:
    #endif
     
            CRSetCrashLogMessage(sLoadingCrashMessage);
            // instantiate ImageLoader for main executable
            // load the executable and create an ImageLoader instance for it
            sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
            gLinkContext.mainExecutable = sMainExecutable;
            gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);
     
    #if TARGET_IPHONE_SIMULATOR
            // check main executable is not too new for this OS
            {
                if ( ! isSimulatorBinary((uint8_t*)mainExecutableMH, sExecPath) ) {
                    throwf("program was built for a platform that is not supported by this runtime");
                }
                uint32_t mainMinOS = sMainExecutable->minOSVersion();
     
                // dyld is always built for the current OS, so we can get the current OS version
                // from the load command in dyld itself.
                uint32_t dyldMinOS = ImageLoaderMachO::minOSVersion((const mach_header*)&__dso_handle);
                if ( mainMinOS > dyldMinOS ) {
        #if TARGET_OS_WATCH
                    throwf("app was built for watchOS %d.%d which is newer than this simulator %d.%d",
                            mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
                            dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
        #elif TARGET_OS_TV
                    throwf("app was built for tvOS %d.%d which is newer than this simulator %d.%d",
                            mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
                            dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
        #else
                    throwf("app was built for iOS %d.%d which is newer than this simulator %d.%d",
                            mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
                            dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
        #endif
                }
            }
    #endif
     
     
        #if __MAC_OS_X_VERSION_MIN_REQUIRED
            // <rdar://problem/22805519> be less strict about old mach-o binaries
            uint32_t mainSDK = sMainExecutable->sdkVersion();
            gLinkContext.strictMachORequired = (mainSDK >= DYLD_MACOSX_VERSION_10_12) || gLinkContext.processUsingLibraryValidation;
        #else
            // simulators, iOS, tvOS, and watchOS are always strict
            gLinkContext.strictMachORequired = true;
        #endif
     
            // load shared cache
            // check whether the shared cache is disabled; on iOS it must be enabled
            checkSharedRegionDisable();
        #if DYLD_SHARED_CACHE_SUPPORT
            if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
                // make sure the shared cache is mapped into the shared region
                mapSharedCache();
            } else {
                dyld_kernel_image_info_t kernelCacheInfo;
                bzero(&kernelCacheInfo.uuid[0], sizeof(uuid_t));
                kernelCacheInfo.load_addr = 0;
                kernelCacheInfo.fsobjid.fid_objno = 0;
                kernelCacheInfo.fsobjid.fid_generation = 0;
                kernelCacheInfo.fsid.val[0] = 0;
                kernelCacheInfo.fsid.val[0] = 0;
                task_register_dyld_shared_cache_image_info(mach_task_self(), kernelCacheInfo, true, false);
            }
        #endif
     
        #if SUPPORT_ACCELERATE_TABLES
            sAllImages.reserve((sAllCacheImagesProxy != NULL) ? 16 : INITIAL_IMAGE_COUNT);
        #else
            sAllImages.reserve(INITIAL_IMAGE_COUNT);
        #endif
     
            // Now that shared cache is loaded, setup an versioned dylib overrides
        #if SUPPORT_VERSIONED_PATHS
            // check whether a newer version of any library exists; if so, it overrides the original
            checkVersionedPaths();
        #endif
     
     
            // dyld_all_image_infos image list does not contain dyld
            // add it as dyldPath field in dyld_all_image_infos
            // for simulator, dyld_sim is in image list, need host dyld added
    #if TARGET_IPHONE_SIMULATOR
            // get path of host dyld from table of syscall vectors in host dyld
            void* addressInDyld = gSyscallHelpers;
    #else
            // get path of dyld itself
            void*  addressInDyld = (void*)&__dso_handle;
    #endif
            char dyldPathBuffer[MAXPATHLEN+1];
            int len = proc_regionfilename(getpid(), (uint64_t)(long)addressInDyld, dyldPathBuffer, MAXPATHLEN);
            if ( len > 0 ) {
                dyldPathBuffer[len] = '\0'; // proc_regionfilename() does not zero terminate returned string
                if ( strcmp(dyldPathBuffer, gProcessInfo->dyldPath) != 0 )
                    gProcessInfo->dyldPath = strdup(dyldPathBuffer);
            }
             
            // load every library listed in DYLD_INSERT_LIBRARIES
            // load any inserted libraries
            if  ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
                for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib)
                    loadInsertedDylib(*lib);
            }
            // record count of inserted libraries so that a flat search will look at
            // inserted libraries, then main, then others.
            sInsertedDylibCount = sAllImages.size()-1;
     
            // link main executable
            gLinkContext.linkingMainExecutable = true;
    #if SUPPORT_ACCELERATE_TABLES
            if ( mainExcutableAlreadyRebased ) {
                // previous link() on main executable has already adjusted its internal pointers for ASLR
                // work around that by rebasing by inverse amount
                sMainExecutable->rebase(gLinkContext, -mainExecutableSlide);
            }
    #endif
            // link the main executable
            link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
            sMainExecutable->setNeverUnloadRecursive();
            if ( sMainExecutable->forceFlat() ) {
                gLinkContext.bindFlat = true;
                gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
            }
             
            // link the inserted dynamic libraries
            // link any inserted libraries
            // do this after linking main executable so that any dylibs pulled in by inserted
            // dylibs (e.g. libSystem) will not be in front of dylibs the program uses
            if ( sInsertedDylibCount > 0 ) {
                for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
                    ImageLoader* image = sAllImages[i+1];
                    link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
                    image->setNeverUnloadRecursive();
                }
                // only INSERTED libraries can interpose
                // register interposing info after all inserted libraries are bound so chaining works
                for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
                    ImageLoader* image = sAllImages[i+1];
                    // register interposing
                    image->registerInterposing();
                }
            }
     
            // <rdar://problem/19315404> dyld should support interposition even without DYLD_INSERT_LIBRARIES
            for (long i=sInsertedDylibCount+1; i < sAllImages.size(); ++i) {
                ImageLoader* image = sAllImages[i];
                if ( image->inSharedCache() )
                    continue;
                image->registerInterposing();
            }
        #if SUPPORT_ACCELERATE_TABLES
            if ( (sAllCacheImagesProxy != NULL) && ImageLoader::haveInterposingTuples() ) {
                // Accelerator tables cannot be used with implicit interposing, so relaunch with accelerator tables disabled
                ImageLoader::clearInterposingTuples();
                // unmap all loaded dylibs (but not main executable)
                for (long i=1; i < sAllImages.size(); ++i) {
                    ImageLoader* image = sAllImages[i];
                    if ( image == sMainExecutable )
                        continue;
                    if ( image == sAllCacheImagesProxy )
                        continue;
                    image->setCanUnload();
                    ImageLoader::deleteImage(image);
                }
                // note: we don't need to worry about inserted images because if DYLD_INSERT_LIBRARIES was set we would not be using the accelerator table
                sAllImages.clear();
                sImageRoots.clear();
                sImageFilesNeedingTermination.clear();
                sImageFilesNeedingDOFUnregistration.clear();
                sAddImageCallbacks.clear();
                sRemoveImageCallbacks.clear();
                sDisableAcceleratorTables = true;
                sAllCacheImagesProxy = NULL;
                sMappedRangesStart = NULL;
                mainExcutableAlreadyRebased = true;
                gLinkContext.linkingMainExecutable = false;
                resetAllImages();
                goto reloadAllImages;
            }
        #endif
     
            // apply interposing to initial set of images
            for(int i=0; i < sImageRoots.size(); ++i) {
                // apply interposing to this root image
                sImageRoots[i]->applyInterposing(gLinkContext);
            }
            gLinkContext.linkingMainExecutable = false;
             
            // <rdar://problem/12186933> do weak binding only after all inserted images linked
            // weak symbol binding
            sMainExecutable->weakBind(gLinkContext);
     
        #if DYLD_SHARED_CACHE_SUPPORT
            // If cache has branch island dylibs, tell debugger about them
            if ( (sSharedCache != NULL) && (sSharedCache->mappingOffset >= 0x78) && (sSharedCache->branchPoolsOffset != 0) ) {
                uint32_t count = sSharedCache->branchPoolsCount;
                dyld_image_info info[count];
                const uint64_t* poolAddress = (uint64_t*)((char*)sSharedCache + sSharedCache->branchPoolsOffset);
                // <rdar://problem/20799203> empty branch pools can be in development cache
                if ( ((mach_header*)poolAddress)->magic == sMainExecutableMachHeader->magic ) {
                    for (int poolIndex=0; poolIndex < count; ++poolIndex) {
                        uint64_t poolAddr = poolAddress[poolIndex] + sSharedCacheSlide;
                        info[poolIndex].imageLoadAddress = (mach_header*)(long)poolAddr;
                        info[poolIndex].imageFilePath = "dyld_shared_cache_branch_islands";
                        info[poolIndex].imageFileModDate = 0;
                    }
                    // add to all_images list
                    addImagesToAllImages(count, info);
                    // tell gdb about new branch island images
                    gProcessInfo->notification(dyld_image_adding, count, info);
                }
            }
        #endif
     
            CRSetCrashLogMessage("dyld: launch, running initializers");
        #if SUPPORT_OLD_CRT_INITIALIZATION
            // Old way is to run initializers via a callback from crt1.o
            if ( ! gRunInitializersOldWay )
                initializeMainExecutable();
        #else
            // run all initializers
            // run the initializers
            initializeMainExecutable();
        #endif
     
            // notify any montoring proccesses that this process is about to enter main()
            notifyMonitoringDyldMain();
     
            // find entry point for main executable
            //LC_MAIN
            result = (uintptr_t)sMainExecutable->getThreadPC();
            if ( result != 0 ) {
                // main executable uses LC_MAIN, needs to return to glue in libdyld.dylib
                if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
                    *startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
                else
                    halt("libdyld.dylib support not present for LC_MAIN");
            }
            else {
                // main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
                //LC_UNIXTHREAD
                result = (uintptr_t)sMainExecutable->getMain();
                *startGlue = 0;
            }
        }
        catch(const char* message) {
            syncAllImages();
            halt(message);
        }
        catch(...) {
            dyld::log("dyld: launch failed\n");
        }
     
        CRSetCrashLogMessage(NULL);
        return result;
    }
    

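    7. App optimization directions

    [Figure: complete launch flow]
    7.1 Pre-main optimization directions

    Measuring pre-main time
    Add the DYLD_PRINT_STATISTICS environment variable to the project scheme
    Run the app and check the console output (here the time before main() totals 689.74 milliseconds):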


    [Figure: Xcode scheme setting]
    Total pre-main time: 689.74 milliseconds (100.0%)
             dylib loading time: 123.58 milliseconds (17.9%)
            rebase/binding time:  43.52 milliseconds (6.3%)
                ObjC setup time:  39.95 milliseconds (5.7%)
               initializer time: 482.51 milliseconds (69.9%)
               slowest intializers :
                 libSystem.B.dylib :   5.56 milliseconds (0.8%)
              libglInterpose.dylib :  99.52 milliseconds (14.4%)
             libMTLInterpose.dylib :  28.90 milliseconds (4.1%)
                  AgoraMediaPlayer :  16.61 milliseconds (2.4%)
                HLLCourseLive_test : 628.58 milliseconds (91.1%)
    

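    WWDC 2016 Session 406 gives tips for improving each of these steps. A brief summary follows.

    • dylib loading time: the dynamic loader finds and reads the dynamic libraries the app depends on. Each library may have dependencies of its own. Loading Apple's system frameworks is highly optimized, but loading embedded frameworks can be slow. To speed up dylib loading, Apple recommends using fewer dynamic libraries or merging them. (We have already converted all third-party libraries pulled in through pods to static libraries; this can be checked in the Pods project.)


      [Figure: inspecting the dynamic libraries]
    • Rebase/binding time: after the dylibs are loaded they are still independent of one another and have to be bound together. For security, the system applies ASLR (Address Space Layout Randomization) and code signing while loading dylibs. Because of ASLR, an image (an executable, dylib, or bundle) is loaded at a random address, which differs from the address its pointers were built against (preferred_address) by an offset (slide). dyld has to correct for this slide so that pointers refer to the right addresses. Rebase runs first, then Bind. Rebase reads the image into memory and fixes up pointers that point inside the image, so its cost is mainly I/O. Bind looks up the symbol table and sets pointers that point outside the image, so its cost is mainly CPU.
    • ObjC setup time: the Objective-C runtime has to set up classes and categories and register selectors. Any improvement to rebase/binding time also reduces this setup time.
      The ObjC runtime maintains a global table that maps class names to classes and their method lists.
      This step does the following:
      registers every declared ObjC class in the global table (class registration)
      inserts category methods into the class method lists (category registration)
      checks that every selector is unique (selector uniquing)
    • initializer time: the time spent running initializers. Recommendations:
      - Use +initialize instead of +load
      - Reduce use of C/C++ __attribute__((constructor)); prefer dispatch_once(), pthread_once(), or std::call_once()
      - Prefer Swift
      - Do not call dlopen() in an initializer: loading at launch is single-threaded and needs no locks, whereas calling dlopen() makes it multi-threaded, which adds locking overhead and can deadlock
      - Do not create threads in an initializer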
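    In summary, the pre-main actions are:
    - Convert the dynamic libraries in pods to static libraries
    - Merge dynamic libraries
    - Merge classes and methods with similar functionality
    - Remove unused classes, methods, and image resources. Use fui (https://github.com/dblock/fui) to find unused classes and remove them. The scan is not 100% accurate, although its precision is fairly high, so to be safe check each reported class by hand before removing it
    - Reduce C++ static objects and constructor functions (__attribute__((constructor)))
    - Use static_initializer_trace to find which initializer takes too long
    - Use timer_profile to find methods that take too long
    - Reduce use of +load; prefer +initialize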
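    The advice to replace constructor functions with pthread_once() can be sketched in C as follows. This is only an illustration under assumptions: load_config, config_value, config_init_calls, and the value 42 are hypothetical stand-ins, not code from the project.

```c
#include <pthread.h>

/* Hypothetical expensive setup that previously ran in a
   __attribute__((constructor)) function, i.e. before main(). */
static int g_config_value;   /* result of the setup */
static int g_init_calls;     /* how many times the setup actually ran */
static pthread_once_t g_once = PTHREAD_ONCE_INIT;

static void load_config(void) {
    g_init_calls += 1;
    g_config_value = 42;     /* stands in for the real work */
}

/* Accessor: the setup runs on first use and only once,
   even if several threads call this at the same time. */
int config_value(void) {
    pthread_once(&g_once, load_config);
    return g_config_value;
}

/* Exposed only so the example can show that the setup ran once. */
int config_init_calls(void) {
    return g_init_calls;
}
```

    With this shape the cost moves out of the initializer phase and is paid the first time config_value() is called, which may be after the first screen is already visible.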
    
    
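    7.2 main() stage optimization

    For the main() stage, we mainly measure the time from the start of main() to the end of didFinishLaunchingWithOptions. This requires adding timing code to the project by hand.

    Add the timing code to main.m:
    CFAbsoluteTime StartTime;
    int main(int argc, char * argv[]) {
        // Record the time as soon as main() is entered
        StartTime = CFAbsoluteTimeGetCurrent();
        // ... the existing UIApplicationMain call follows
    }
    
    In AppDelegate.m, declare the global variable StartTime with extern:
    extern CFAbsoluteTime StartTime;
    Then, at the end of didFinishLaunchingWithOptions, read the current time again; the difference from StartTime is the time spent in the main() stage:
    double launchTime = (CFAbsoluteTimeGetCurrent() - StartTime);
    
    Optimization advice: do less work in main() (there is usually little room for change here, because main() rarely contains custom logic)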
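    CFAbsoluteTimeGetCurrent() reads wall-clock time, which can move when the system clock is adjusted. As a hedged alternative, the same measurement can be sketched with a monotonic clock in plain C. The function names here are hypothetical, and the sketch assumes a platform that provides clock_gettime with CLOCK_MONOTONIC.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* request clock_gettime on strict compilers */
#endif
#include <time.h>

/* Seconds from a monotonic clock; it does not jump when the
   system time is changed. */
double monotonic_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

static double g_main_start;       /* set at the top of main() */

/* Call as the first statement of main(). */
void mark_main_start(void) {
    g_main_start = monotonic_seconds();
}

/* Call at the end of didFinishLaunchingWithOptions (or any later stage)
   to get the elapsed time since main() started. */
double seconds_since_main(void) {
    return monotonic_seconds() - g_main_start;
}
```

    The value returned is only meaningful as a difference between two readings, so it suits interval measurement but should not be uploaded as an absolute timestamp.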
    
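    7.3 didFinishLaunch to first screen optimization

    Timing is measured in a similar way, for example: didFinishLaunchingWithOptions finish: 0.303182 s, homePage viewDidLoad: 5.418390 s

    • Run logic asynchronously
    • Defer logic
    • Optimize caching (network latency is outside our control)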
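    The "defer logic" item can be illustrated with a small task queue in C: non-critical setup is queued during launch and run only after the first screen is shown. All names below (defer_until_first_screen, run_deferred_tasks, init_analytics) are hypothetical, and the sketch assumes everything runs on the main thread.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*LaunchTask)(void);

#define MAX_DEFERRED 32                 /* hypothetical capacity */

static LaunchTask g_deferred[MAX_DEFERRED];
static size_t g_deferred_count;

/* Queue a task instead of running it inside didFinishLaunchingWithOptions.
   Returns false if the task is NULL or the queue is full. */
bool defer_until_first_screen(LaunchTask task) {
    if (task == NULL || g_deferred_count == MAX_DEFERRED) return false;
    g_deferred[g_deferred_count++] = task;
    return true;
}

/* Call once the first screen is visible. Runs the queued tasks in
   registration order and returns how many ran. */
size_t run_deferred_tasks(void) {
    size_t ran = g_deferred_count;
    for (size_t i = 0; i < ran; ++i) g_deferred[i]();
    g_deferred_count = 0;
    return ran;
}

/* Hypothetical non-critical task, present only to illustrate the API. */
int g_analytics_ready;
void init_analytics(void) { g_analytics_ready = 1; }
```

    In an app, defer_until_first_screen(init_analytics) would be called during launch and run_deferred_tasks() from the first screen's viewDidAppear.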
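    7.4 Binary reordering

    Binary reordering is mainly an optimization that reduces page faults; that is its core idea.

    By default, functions are laid out in the binary in compile (link) order, which you can confirm in the LinkMap file. Because system memory is managed in pages, the methods can be thought of as laid out like this:


    [Figure: binary reordering 1]

    Now suppose launch uses method1 in Page1 and method3 in Page2. That causes two page faults. If we reorder the methods so that method1 and method3 sit in the same page, we save one page fault. With more methods, in theory more page faults can be saved, which speeds up launch.


    [Figure: binary reordering 2]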
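    The two-page example above can be worked through in C by counting how many distinct pages a set of function offsets falls into. This is a simplified model: the offsets and the 16 KB page size are assumed values for illustration, and pages_touched is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the distinct pages covered by a list of function start offsets.
   The offsets must be sorted in ascending order. */
size_t pages_touched(const uint64_t *offsets, size_t n, uint64_t page_size) {
    size_t pages = 0;
    uint64_t last_page = 0;
    for (size_t i = 0; i < n; ++i) {
        uint64_t page = offsets[i] / page_size;
        if (i == 0 || page != last_page) {
            pages += 1;          /* first offset seen in a new page */
            last_page = page;
        }
    }
    return pages;
}
```

    For example, with a 16384-byte page, offsets 0x0100 (method1) and 0x4100 (method3) fall in two pages, while after reordering to 0x0100 and 0x0200 they share one page, so one page fault is saved.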

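    How do we measure and verify the effect of reordering?

    Check whether the number of page faults decreases
    Inspect the LinkMap file, an intermediate build product, to confirm the new order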

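    7.4.1 System Trace

    Restart the device. Press Command + I to open Instruments, choose the System Trace tool, and click the record button. Once the first screen appears, click the stop button. Filter to the Main Thread and select Summary: Virtual Memory.


    [Figures: profile1.png, profile2.png, profile3.png]
    Next, look at a warm launch: kill the app and repeat the steps above (without restarting the device).
    [Figures: profile4.png, profile5.png]

    Comparing the File Backed Page In counts for cold launch and warm launch shows that a warm launch triggers far fewer page faults.

    After binary reordering:

    [Figure: profile6.png]

    Page fault time dropped from about 324.94 ms to about 216.54 ms.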

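    7.4.2 Collecting the methods called at launch: Clang instrumentation

    This is essentially a code coverage tool (Clang SanitizerCoverage); see the official documentation for details. In Build Settings, add -fsanitize-coverage=func,trace-pc-guard to Other C Flags, then add two functions to the code (the __sanitizer_cov_trace_pc_guard_init and __sanitizer_cov_trace_pc_guard callbacks) to collect the methods the app calls during launch.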
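    A minimal sketch of the two callbacks in C is shown below. The callback names and signatures are the ones defined by Clang SanitizerCoverage; the fixed-size buffer, MAX_RECORDED, and recorded_count are assumptions made for this illustration, and the original project may store the addresses differently.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RECORDED 65536              /* hypothetical capacity */

static void *g_pcs[MAX_RECORDED];       /* caller addresses in first-call order */
static atomic_size_t g_count;           /* slots claimed so far */

/* Called once per module: give every guard a nonzero id. */
void __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop) {
    static uint32_t next_id;
    if (start == stop || *start) return;    /* empty or already initialized */
    for (uint32_t *g = start; g < stop; ++g) *g = ++next_id;
}

/* With =func this is called at the entry of each instrumented function. */
void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
    if (!*guard) return;        /* guard not initialized yet, or already recorded */
    *guard = 0;                 /* record each function only on its first call */
    void *pc = __builtin_return_address(0);   /* address inside the caller */
    size_t slot = atomic_fetch_add(&g_count, 1);
    if (slot < MAX_RECORDED) g_pcs[slot] = pc;
}

/* Number of addresses stored so far. */
size_t recorded_count(void) {
    size_t n = atomic_load(&g_count);
    return n < MAX_RECORDED ? n : MAX_RECORDED;
}
```

    The recorded addresses still have to be mapped back to symbol names (for example with dladdr) and written out as an order file for the linker. Note that the guard check skips any call made before the guards are initialized; if two threads enter the same function at the same moment, it may be recorded twice, which is harmless for building an order file.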

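    8. References

    9. Pitfalls encountered

    Problem 1: The Aliyun logs showed that viewDidAppearStartTime and viewDidAppearEndTime were sometimes 0, even though the project only sets them to 0 at initialization. At first I suspected the object was being initialized more than once, but the class is a singleton, so that theory did not hold. Logging eventually showed that the network request callback sometimes completes before viewDidAppear has started to run. There are two ways to solve this:

    • Start the network request in viewDidAppear
    • Have both viewDidAppear and the network callback check whether the corresponding timestamps are greater than 0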
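    The second approach can be sketched in C as follows: whichever of the two events finishes last triggers the single upload. The struct and function names are hypothetical, and the sketch assumes both callbacks run on the main thread.

```c
#include <stdbool.h>

typedef struct {
    double view_did_appear_end;   /* 0 until viewDidAppear has finished */
    double request_end;           /* 0 until the network callback has fired */
    bool   uploaded;              /* the record is uploaded at most once */
} LaunchRecord;

/* Returns true exactly once: when both timestamps are known. */
static bool try_upload(LaunchRecord *r) {
    if (r->uploaded || r->view_did_appear_end <= 0 || r->request_end <= 0)
        return false;
    r->uploaded = true;           /* the real upload call would go here */
    return true;
}

/* Call at the end of the first screen's viewDidAppear. */
bool on_view_did_appear_end(LaunchRecord *r, double now) {
    r->view_did_appear_end = now;
    return try_upload(r);
}

/* Call from the network completion callback. */
bool on_request_end(LaunchRecord *r, double now) {
    r->request_end = now;
    return try_upload(r);
}
```

    Because the upload waits for both values, the order in which the network callback and viewDidAppear complete no longer matters.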

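    Problem 2: While reviewing the launch optimization data recently, I found anomalies in the tracking data: the gap between some pairs of timestamps was far too large, which made the analysis difficult. The causes were:

    • Cause 1: the user sent the app to the background during launch, so the first screen did not follow the normal launch flow;
    • Cause 2: the user tapped the ad page during launch and entered it, so the first screen did not follow the normal launch flow;
    Solution:
    Filter out the dirty data produced by these abnormal flows by adding two variables:
    
    enterBackgroundBeforeFinish: whether the app went to the background (0 = no, 1 = yes)
    clickAd: whether the user tapped into the ad page (0 = no, 1 = yes)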
    

    Query SQL

     event: appStartUp and (deviceType: iPhone or deviceType: iPad) and appVersion >= "5.0.0" |
     SELECT round(avg((requestEndTime - mainTime) / 1000), 2) AS "requestEndTime", deviceType, requestStatus, pageType, count(*) AS total, date_format(date_trunc('day', __time__), '%m-%d') AS date
     WHERE (requestEndTime - mainTime) > 0
     GROUP BY deviceType, requestStatus, pageType, date
     ORDER BY date
    

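    Problem 3: The network API is unstable, which makes the total launch time unstable

    Suggested solutions:
    Add a cache for the home page to improve the user experience
    Handle the real-time freshness requirement of the live class schedule
    

    Problem 4: The ad page is used as the rootViewController

    Solution:
    Show the ad in a window layered on top, so that it runs concurrently with the home page code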
    

          Article title: iOS Launch Speed

          Article link: https://www.haomeiwen.com/subject/zfpjpltx.html