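1. Background
App launch time is a well-known measure of user experience. As a rule of thumb:
Within 1 second: fast
1 to 3 seconds: acceptable
More than 3 seconds: slow
More than 5 seconds: very slow
For users, faster is always better. Past 5 seconds they are probably tempted to uninstall the app, and our 画啦啦 app currently does take more than 5 seconds to launch, so optimization is unavoidable.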
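2. App launch concepts
iOS distinguishes three kinds of launch: Cold Launch, Warm Launch, and Resume Launch.
- Cold Launch: the app is launched while the system holds no information about it, for example after a reboot, or when the app is reopened some time after being killed.
- Warm Launch: the app was terminated recently, so part of its memory is still resident but no process exists, and the user taps the icon to enter.
- Resume Launch: the user presses the Home button to send the app to the background and then taps the app again, or switches between apps.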
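3. Definition of app launch time (for the project I work on)
Launch time is usually split into the part before main and the part after main, and the part after main is usually measured up to the point where didFinishLaunching finishes executing, as shown in the figure below.
启动优化1.png
In this optimization, we define launch time as the span from tapping the app icon until the home page has loaded its first screen of data. To compare metrics before and after, we record the timestamps below so that they can guide later optimization.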
4. Instrumentation points for each launch phase (including OS, device model, and app version)
Field name | Data type | Description |
---|---|---|
event | String | Value: appStartUp (app launch) |
processStartTime | double | Timestamp of process creation |
mainTime | double | Timestamp of entering the main function |
didFinishStartTime | double | Timestamp of entering didFinishLaunchingWithOptions |
didFinishEndTime | double | Timestamp of didFinishLaunchingWithOptions returning |
didBecomeActiveStartTime | double | Timestamp of entering applicationDidBecomeActive |
didBecomeActiveEndTime | double | Timestamp of applicationDidBecomeActive returning |
viewDidLoadStartTime | double | Timestamp of entering viewDidLoad of the first-screen ViewController |
viewDidLoadEndTime | double | Timestamp of viewDidLoad of the first-screen ViewController returning |
viewDidAppearStartTime | double | Timestamp of entering viewDidAppear of the first-screen ViewController |
viewDidAppearEndTime | double | Timestamp of viewDidAppear of the first-screen ViewController returning |
requestStartTime | double | Timestamp when the first-screen ViewController starts its API request |
requestEndTime | double | Timestamp when the first-screen API response arrives |
deviceId | String | Unique device identifier |
appVersion | String | App version |
deviceType | String | Device model |
os | String | OS version |
pageType | long | PageType_Visitor = 1 (guest page); PageType_Home = 2 (home page) |
requestStatus | long | RequestStatus_Success = 1 (request succeeded); RequestStatus_Fail = 2 (request failed); RequestStatus_Error = 3 (no network) |
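Upload timing: one record is uploaded after the last instrumentation point has fired, and only once per process lifetime.
endpoint: cn-shenzhen.log.aliyuncs.com
project: live-logs
logstore: live-business-log (production), live-monitor-test (testing), live-monitor-dev (development)
AccessKeyID: xxxxxxxx
AccessKeySecret: yyyyyyy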
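The "upload once, after the last point has fired" rule can be sketched in C as below. This is only an illustration under assumptions: the struct holds a subset of the fields from the table, and `launch_record_upload_once`, `count_send`, and `g_sent_count` are hypothetical names, with `count_send` standing in for the real log upload call. The sketch is not thread-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record holding a subset of the timestamp fields from the table. */
typedef struct {
    double processStartTime;
    double mainTime;
    double didFinishStartTime;
    double didFinishEndTime;
    double viewDidAppearEndTime;
    double requestEndTime;
    bool   uploaded;              /* set once the record has been sent */
} LaunchRecord;

/* A record is complete when every timestamp has been filled in (> 0). */
static bool launch_record_is_complete(const LaunchRecord *r) {
    assert(r != NULL);
    return r->processStartTime > 0 && r->mainTime > 0 &&
           r->didFinishStartTime > 0 && r->didFinishEndTime > 0 &&
           r->viewDidAppearEndTime > 0 && r->requestEndTime > 0;
}

/* Calls `send` at most once, and only when the record is complete.
   Returns true if this call performed the upload. */
static bool launch_record_upload_once(LaunchRecord *r,
                                      void (*send)(const LaunchRecord *)) {
    if (r->uploaded || !launch_record_is_complete(r)) {
        return false;
    }
    r->uploaded = true;
    send(r);
    return true;
}

/* Stand-in for the real upload: it only counts how often it was called. */
static int g_sent_count = 0;
static void count_send(const LaunchRecord *r) {
    (void)r;
    g_sent_count++;
}
```

Calling `launch_record_upload_once` from every instrumentation point keeps the call sites simple: whichever point happens to fire last triggers the single upload.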
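5. Test environment
5.1 Environment setup
- Reboot the phone and wait 2-3 minutes
- Enable Airplane Mode
- Sign out of the iCloud account
- Test a Release build
- Measure warm launch time
5.2 Test devices
Test on as many of the devices at hand as possible, because users run every kind of device. Here I used an iPhone 6s Plus and an iPhone X.
Measured data: the times here are longer than the pre-main time, for the reasons shown below (pre-main time runs from the loading of dyld to the end of the initializers).
exec1.png exec2.png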
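6. App launch flow
This section touches on some source code but does not analyze it in depth; the goal is only to understand the process. Any discussion of app launch has to cover dyld, the dynamic linker, which does almost all of the work in the pre-main phase. First, a look at dyld's history.
6.1 History of dyld
dyld itself has been working to speed up app launch all along, as its version history shows. First generation: dyld 1.0 (1996-2004), written before large C++ dynamic libraries were in common use. Second generation: dyld 2.0 (2004-2007), which corrected C++ initializer semantics and, in later revisions, tightened security checks and replaced prebinding entirely with the shared cache. Third generation: dyld 3 (2017-2021), which is more secure and launches faster.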
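6.2 Comparing dyld versions
WWDC only covered dyld 2 and dyld 3, so only those two versions are compared here.
- dyld 2 versus dyld 3: dyld2和dyld3对比.png
- What dyld 2 does: dyld2.png
- What dyld 3 does: dyld3.png
- The upper part of the dyld 3 diagram is handled out of process: dyld3_1.png
- The lower part of the dyld 3 diagram is handled in process: dyld3_2.png dyld3_3.png
As the figures show, dyld 2 runs purely in-process, that is, inside the app's own process, which means it can only start working once the app is launched. dyld 3 does part of that work ahead of time, which is why it performs better.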
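Main workflow of dyld 2
① Parse the Mach-O file, find the libraries it depends on, and recursively find all of their dependencies to build a dependency graph of dynamic libraries. Most iOS apps depend on several hundred dynamic libraries (mostly system libraries), so this step is a substantial amount of work.
② Map the Mach-O files into the process's own address space.
③ Perform symbol lookups: if the app calls printf, for example, dyld has to look up the address of printf in the system libraries and copy that address into the function pointer in the app.
④ Rebase and bind: because the app is loaded at a randomized address (ASLR), every pointer has to have a base address added to it.
⑤ Run the initializers, then call main().
Problems: performance, security, and testability fall short (addressed by dyld 3).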
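Steps ③ and ④ can be pictured with a small C sketch. It is a simplified model rather than dyld's actual implementation: `ExportedSymbol`, `rebase_pointers`, and `lookup_symbol` are hypothetical names, and the export table is a plain array instead of a real Mach-O export trie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, flattened view of a library's exported symbols. */
typedef struct {
    const char *name;
    uintptr_t   address;
} ExportedSymbol;

/* Rebase: pointers inside the image were written for the preferred load
   address, so each one is shifted by the slide chosen for this launch. */
static void rebase_pointers(uintptr_t *slots, size_t count, intptr_t slide) {
    assert(slots != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        slots[i] += (uintptr_t)slide;
    }
}

/* Bind: a pointer to something outside the image is found by name.
   Returns 0 when the symbol is not exported. */
static uintptr_t lookup_symbol(const ExportedSymbol *exports, size_t count,
                               const char *name) {
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(exports[i].name, name) == 0) {
            return exports[i].address;
        }
    }
    return 0;
}
```

The two loops hint at where the cost goes: rebasing touches memory for every pointer slot, while binding performs a string comparison for every lookup.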
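Main workflow of dyld 3
① An out-of-process Mach-O parser/compiler.
In dyld 2's loading flow, "Parse mach-o headers" and "Find dependencies" carry security risks (attacks that tamper with the Mach-O header or add illegitimate @rpath entries), and "Perform symbol lookups" consumes a lot of CPU time. Because a symbol stays at the same offset within a library as long as the library file does not change, dyld 3 performs these steps ahead of time and caches the results in a file as a "launch closure". The parser resolves every search path, @rpath, and environment variable that could affect launch; it parses the Mach-O binaries, analyzes their dynamic library dependencies, and completes all symbol lookups; finally it writes the results into the cache as a launch closure, so that at launch the data can be read straight from the cache.
The out-of-process part is an ordinary daemon that has been split out of the individual app processes, so it can use normal testing infrastructure, which improves dyld 3's testability.
② An in-process engine that executes the launch closure. It validates the launch closure, maps the dylibs into the app's address space, and jumps to main. It no longer needs to parse Mach-O headers or perform symbol lookups, which saves a good deal of time.
③ A launch closure cache.
Launch closures for the apps built into iOS are stored directly in the shared cache, so no separate file even has to be opened.
For third-party apps, the launch closure is generated when the app is installed or updated (or possibly when the OS is upgraded?), since the system libraries have changed at that point. This guarantees that the launch closure is always ready before the app is opened. The closure is written to a file, and the next launch simply reads and validates that file.
On iOS, tvOS, and watchOS, everything (building the launch closure) is done before the app launches. On macOS, because apps can be sideloaded, the in-process engine starts a daemon on first launch, after which the launch closure can be used. In most scenarios, then, this work is finished before the app launches.
Most launches never need to invoke the out-of-process Mach-O parser. A launch closure is also much simpler than a Mach-O file: it is a memory-mapped file that is easy to parse and validate and is well optimized for performance. The introduction of dyld 3 therefore gives app launch a noticeable speedup.
Overall, dyld 3 moves many time-consuming operations ahead of launch, which greatly improves launch speed.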
6.3 How dyld works (dyld_start)
When an app launches, dyld_start is called, so analyzing the calls made from this function covers most of the process.
dyld source analysis
#if __arm64__
.data
.align 3
__dso_static:
.quad ___dso_handle
.text
.align 2
.globl __dyld_start
__dyld_start:
mov x28, sp
and sp, x28, #~15 // force 16-byte alignment of stack
mov x0, #0
mov x1, #0
stp x1, x0, [sp, #-16]! // make aligned terminating frame
mov fp, sp // set up fp to point to terminating frame
sub sp, sp, #16 // make room for local variables
ldr x0, [x28] // get app's mh into x0
ldr x1, [x28, #8] // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
add x2, x28, #16 // get argv into x2
adrp x4,___dso_handle@page
add x4,x4,___dso_handle@pageoff // get dyld's mh in to x4
adrp x3,__dso_static@page
ldr x3,[x3,__dso_static@pageoff] // get unslid start of dyld
sub x3,x4,x3 // x3 now has slide of dyld
mov x5,sp // x5 has &startGlue
// call dyldbootstrap::start(app_mh, argc, argv, slide, dyld_mh, &startGlue)
bl __ZN13dyldbootstrap5startEPK12macho_headeriPPKclS2_Pm
mov x16,x0 // save entry point address in x16
ldr x1, [sp]
cmp x1, #0
b.ne Lnew
// LC_UNIXTHREAD way, clean up stack and jump to result
add sp, x28, #8 // restore unaligned stack pointer without app mh
br x16 // jump to the program's entry point
// LC_MAIN case, set up stack for call to main()
Lnew: mov lr, x1 // simulate return address into _start in libdyld.dylib
ldr x0, [x28, #8] // main param1 = argc
add x1, x28, #16 // main param2 = argv
add x2, x1, x0, lsl #3
add x2, x2, #8 // main param3 = &env[0]
mov x3, x2
Lapple: ldr x4, [x3]
add x3, x3, #8
cmp x4, #0
b.ne Lapple // main param4 = apple
br x16
#endif // __arm64__
dyldbootstrap::start source analysis
//
// This is code to bootstrap dyld. This work in normally done for a program by dyld and crt.
// In dyld we have to do this manually.
//
uintptr_t start(const struct macho_header* appsMachHeader, int argc, const char* argv[],
intptr_t slide, const struct macho_header* dyldsMachHeader,
uintptr_t* startGlue)
{
// if kernel had to slide dyld, we need to fix up load sensitive locations
// we have to do this before using any global variables
if ( slide != 0 ) {
rebaseDyld(dyldsMachHeader, slide);
}
// allow dyld to use mach messaging
mach_init();
// kernel sets up env pointer to be just past end of agv array
const char** envp = &argv[argc+1];
// kernel sets up apple pointer to be just past end of envp array
const char** apple = envp;
while(*apple != NULL) { ++apple; }
++apple;
// set up random value for stack canary
__guard_setup(apple);
#if DYLD_INITIALIZER_SUPPORT
// run all C++ initializers inside dyld
runDyldInitializers(dyldsMachHeader, slide, argc, argv, envp, apple);
#endif
// now that we are done bootstrapping dyld, call dyld's main
uintptr_t appsSlide = slideOfMainExecutable(appsMachHeader); // get the ASLR slide for this launch
return dyld::_main(appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
slideOfMainExecutable source analysis
static uintptr_t slideOfMainExecutable(const struct macho_header* mh)
{
const uint32_t cmd_count = mh->ncmds;
const struct load_command* const cmds = (struct load_command*)(((char*)mh)+sizeof(macho_header));
const struct load_command* cmd = cmds;
for (uint32_t i = 0; i < cmd_count; ++i) {
if ( cmd->cmd == LC_SEGMENT_COMMAND ) {
const struct macho_segment_command* segCmd = (struct macho_segment_command*)cmd;
if ( (segCmd->fileoff == 0) && (segCmd->filesize != 0)) {
return (uintptr_t)mh - segCmd->vmaddr;
}
}
// if the condition is not met, move on to the next load command (__PAGEZERO -> __TEXT -> __DATA -> __LINKEDIT)
cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
}
return 0;
}
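The subtraction that slideOfMainExecutable returns can be checked with a small worked example. The helper names and the addresses below are made up for illustration; 64-bit addresses are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* slide = address where the Mach-O header was actually loaded
           minus the vmaddr the segment was linked at. */
static uint64_t compute_slide(uint64_t loaded_header_address, uint64_t linked_vmaddr) {
    assert(loaded_header_address >= linked_vmaddr);  /* assumption of this sketch */
    return loaded_header_address - linked_vmaddr;
}

/* Converts an address taken from the link map (unslid) into its runtime address. */
static uint64_t runtime_address(uint64_t linked_address, uint64_t slide) {
    return linked_address + slide;
}
```

For instance, if the header ends up at 0x104a3c000 and the segment was linked at 0x100000000, the slide is 0x4a3c000, and a function linked at 0x100003f50 runs at 0x104a3ff50.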
rebaseDyld
static void rebaseDyld(const struct macho_header* mh, intptr_t slide)
{
// rebase non-lazy pointers (which all point internal to dyld, since dyld uses no shared libraries)
// and get interesting pointers into dyld
const uint32_t cmd_count = mh->ncmds;
const struct load_command* const cmds = (struct load_command*)(((char*)mh)+sizeof(macho_header));
const struct load_command* cmd = cmds;
const struct macho_segment_command* linkEditSeg = NULL;
#if __x86_64__
const struct macho_segment_command* firstWritableSeg = NULL;
#endif
const struct dysymtab_command* dynamicSymbolTable = NULL;
for (uint32_t i = 0; i < cmd_count; ++i) {
switch (cmd->cmd) {
case LC_SEGMENT_COMMAND:
{
const struct macho_segment_command* seg = (struct macho_segment_command*)cmd;
if ( strcmp(seg->segname, "__LINKEDIT") == 0 )
linkEditSeg = seg;
const struct macho_section* const sectionsStart = (struct macho_section*)((char*)seg + sizeof(struct macho_segment_command));
const struct macho_section* const sectionsEnd = &sectionsStart[seg->nsects];
for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
const uint8_t type = sect->flags & SECTION_TYPE;
if ( type == S_NON_LAZY_SYMBOL_POINTERS ) {
// rebase non-lazy pointers (which all point internal to dyld, since dyld uses no shared libraries)
const uint32_t pointerCount = (uint32_t)(sect->size / sizeof(uintptr_t));
uintptr_t* const symbolPointers = (uintptr_t*)(sect->addr + slide);
for (uint32_t j=0; j < pointerCount; ++j) {
symbolPointers[j] += slide;
}
}
}
#if __x86_64__
if ( (firstWritableSeg == NULL) && (seg->initprot & VM_PROT_WRITE) )
firstWritableSeg = seg;
#endif
}
break;
case LC_DYSYMTAB:
dynamicSymbolTable = (struct dysymtab_command *)cmd;
break;
}
cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
}
// use reloc's to rebase all random data pointers
#if __x86_64__
const uintptr_t relocBase = firstWritableSeg->vmaddr + slide;
#else
const uintptr_t relocBase = (uintptr_t)mh;
#endif
const relocation_info* const relocsStart = (struct relocation_info*)(linkEditSeg->vmaddr + slide + dynamicSymbolTable->locreloff - linkEditSeg->fileoff);
const relocation_info* const relocsEnd = &relocsStart[dynamicSymbolTable->nlocrel];
for (const relocation_info* reloc=relocsStart; reloc < relocsEnd; ++reloc) {
if ( reloc->r_length != RELOC_SIZE )
throw "relocation in dyld has wrong size";
if ( reloc->r_type != POINTER_RELOC )
throw "relocation in dyld has wrong type";
// update pointer by amount dyld slid
*((uintptr_t*)(reloc->r_address + relocBase)) += slide;
}
}
dyld::_main source analysis
uintptr_t
_main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide,
int argc, const char* argv[], const char* envp[], const char* apple[],
uintptr_t* startGlue)
{
uintptr_t result = 0;
// save the main executable's header so that other information can be reached through it later
sMainExecutableMachHeader = mainExecutableMH;
#if __MAC_OS_X_VERSION_MIN_REQUIRED
// if this is host dyld, check to see if iOS simulator is being run
const char* rootPath = _simple_getenv(envp, "DYLD_ROOT_PATH");
if ( rootPath != NULL ) {
// Add dyld to the kernel image info before we jump to the sim
notifyKernelAboutDyld();
// look to see if simulator has its own dyld
char simDyldPath[PATH_MAX];
strlcpy(simDyldPath, rootPath, PATH_MAX);
strlcat(simDyldPath, "/usr/lib/dyld_sim", PATH_MAX);
int fd = my_open(simDyldPath, O_RDONLY, 0);
if ( fd != -1 ) {
const char* errMessage = useSimulatorDyld(fd, mainExecutableMH, simDyldPath, argc, argv, envp, apple, startGlue, &result);
if ( errMessage != NULL )
halt(errMessage);
return result;
}
}
#endif
CRSetCrashLogMessage("dyld: launch started");
// set up the context
setContext(mainExecutableMH, argc, argv, envp, apple);
// Pickup the pointer to the exec path.
// get the path of the executable
sExecPath = _simple_getenv(apple, "executable_path");
// <rdar://problem/13868260> Remove interim apple[0] transition code from dyld
if (!sExecPath) sExecPath = apple[0];
// convert a relative path into an absolute path
if ( sExecPath[0] != '/' ) {
// have relative path, use cwd to make absolute
char cwdbuff[MAXPATHLEN];
if ( getcwd(cwdbuff, MAXPATHLEN) != NULL ) {
// maybe use static buffer to avoid calling malloc so early...
char* s = new char[strlen(cwdbuff) + strlen(sExecPath) + 2];
strcpy(s, cwdbuff);
strcat(s, "/");
strcat(s, sExecPath);
sExecPath = s;
}
}
// Remember short name of process for later logging
// get the file name
sExecShortName = ::strrchr(sExecPath, '/');
if ( sExecShortName != NULL )
++sExecShortName;
else
sExecShortName = sExecPath;
// configure whether the process is restricted
configureProcessRestrictions(mainExecutableMH);
#if __MAC_OS_X_VERSION_MIN_REQUIRED
if ( gLinkContext.processIsRestricted ) {
// strip the DYLD_* and LD_LIBRARY_PATH environment variables
pruneEnvironmentVariables(envp, &apple);
// set again because envp and apple may have changed or moved
// set up the context again
setContext(mainExecutableMH, argc, argv, envp, apple);
}
else
#endif
{
// check the environment variables
checkEnvironmentVariables(envp);
// if the DYLD_FALLBACK paths are unset, use the defaults
defaultUninitializedFallbackPaths(envp);
}
// print the arguments if the DYLD_PRINT_OPTS environment variable is set
if ( sEnv.DYLD_PRINT_OPTS )
printOptions(argv);
// print the environment variables if the DYLD_PRINT_ENV environment variable is set
if ( sEnv.DYLD_PRINT_ENV )
printEnvironmentVariables(envp);
// get information about the current host architecture
getHostInfo(mainExecutableMH, mainExecutableSlide);
// install gdb notifier
stateToHandlers(dyld_image_state_dependents_mapped, sBatchHandlers)->push_back(notifyGDB);
stateToHandlers(dyld_image_state_mapped, sSingleHandlers)->push_back(updateAllImages);
// make initial allocations large enough that it is unlikely to need to be re-alloced
sImageRoots.reserve(16);
sAddImageCallbacks.reserve(4);
sRemoveImageCallbacks.reserve(4);
sImageFilesNeedingTermination.reserve(16);
sImageFilesNeedingDOFUnregistration.reserve(8);
#if !TARGET_IPHONE_SIMULATOR
#ifdef WAIT_FOR_SYSTEM_ORDER_HANDSHAKE
// <rdar://problem/6849505> Add gating mechanism to dyld support system order file generation process
WAIT_FOR_SYSTEM_ORDER_HANDSHAKE(dyld::gProcessInfo->systemOrderFlag);
#endif
#endif
try {
// add dyld itself to UUID list
addDyldImageToUUIDList();
// notify the kernel
notifyKernelAboutDyld();
#if SUPPORT_ACCELERATE_TABLES
bool mainExcutableAlreadyRebased = false;
reloadAllImages:
#endif
CRSetCrashLogMessage(sLoadingCrashMessage);
// instantiate ImageLoader for main executable
// load the main executable and create an ImageLoader instance for it
sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
gLinkContext.mainExecutable = sMainExecutable;
gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);
#if TARGET_IPHONE_SIMULATOR
// check main executable is not too new for this OS
{
if ( ! isSimulatorBinary((uint8_t*)mainExecutableMH, sExecPath) ) {
throwf("program was built for a platform that is not supported by this runtime");
}
uint32_t mainMinOS = sMainExecutable->minOSVersion();
// dyld is always built for the current OS, so we can get the current OS version
// from the load command in dyld itself.
uint32_t dyldMinOS = ImageLoaderMachO::minOSVersion((const mach_header*)&__dso_handle);
if ( mainMinOS > dyldMinOS ) {
#if TARGET_OS_WATCH
throwf("app was built for watchOS %d.%d which is newer than this simulator %d.%d",
mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
#elif TARGET_OS_TV
throwf("app was built for tvOS %d.%d which is newer than this simulator %d.%d",
mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
#else
throwf("app was built for iOS %d.%d which is newer than this simulator %d.%d",
mainMinOS >> 16, ((mainMinOS >> 8) & 0xFF),
dyldMinOS >> 16, ((dyldMinOS >> 8) & 0xFF));
#endif
}
}
#endif
#if __MAC_OS_X_VERSION_MIN_REQUIRED
// <rdar://problem/22805519> be less strict about old mach-o binaries
uint32_t mainSDK = sMainExecutable->sdkVersion();
gLinkContext.strictMachORequired = (mainSDK >= DYLD_MACOSX_VERSION_10_12) || gLinkContext.processUsingLibraryValidation;
#else
// simulators, iOS, tvOS, and watchOS are always strict
gLinkContext.strictMachORequired = true;
#endif
// load shared cache
// check whether the shared cache is enabled; it must be enabled on iOS
checkSharedRegionDisable();
#if DYLD_SHARED_CACHE_SUPPORT
if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
// make sure the shared cache is mapped into the shared region
mapSharedCache();
} else {
dyld_kernel_image_info_t kernelCacheInfo;
bzero(&kernelCacheInfo.uuid[0], sizeof(uuid_t));
kernelCacheInfo.load_addr = 0;
kernelCacheInfo.fsobjid.fid_objno = 0;
kernelCacheInfo.fsobjid.fid_generation = 0;
kernelCacheInfo.fsid.val[0] = 0;
kernelCacheInfo.fsid.val[0] = 0;
task_register_dyld_shared_cache_image_info(mach_task_self(), kernelCacheInfo, true, false);
}
#endif
#if SUPPORT_ACCELERATE_TABLES
sAllImages.reserve((sAllCacheImagesProxy != NULL) ? 16 : INITIAL_IMAGE_COUNT);
#else
sAllImages.reserve(INITIAL_IMAGE_COUNT);
#endif
// Now that shared cache is loaded, setup an versioned dylib overrides
#if SUPPORT_VERSIONED_PATHS
// check whether a newer version of any library exists and, if so, override the original
checkVersionedPaths();
#endif
// dyld_all_image_infos image list does not contain dyld
// add it as dyldPath field in dyld_all_image_infos
// for simulator, dyld_sim is in image list, need host dyld added
#if TARGET_IPHONE_SIMULATOR
// get path of host dyld from table of syscall vectors in host dyld
void* addressInDyld = gSyscallHelpers;
#else
// get path of dyld itself
void* addressInDyld = (void*)&__dso_handle;
#endif
char dyldPathBuffer[MAXPATHLEN+1];
int len = proc_regionfilename(getpid(), (uint64_t)(long)addressInDyld, dyldPathBuffer, MAXPATHLEN);
if ( len > 0 ) {
dyldPathBuffer[len] = '\0'; // proc_regionfilename() does not zero terminate returned string
if ( strcmp(dyldPathBuffer, gProcessInfo->dyldPath) != 0 )
gProcessInfo->dyldPath = strdup(dyldPathBuffer);
}
// load any libraries named in DYLD_INSERT_LIBRARIES
if ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib)
loadInsertedDylib(*lib);
}
// record count of inserted libraries so that a flat search will look at
// inserted libraries, then main, then others.
sInsertedDylibCount = sAllImages.size()-1;
// link main executable
gLinkContext.linkingMainExecutable = true;
#if SUPPORT_ACCELERATE_TABLES
if ( mainExcutableAlreadyRebased ) {
// previous link() on main executable has already adjusted its internal pointers for ASLR
// work around that by rebasing by inverse amount
sMainExecutable->rebase(gLinkContext, -mainExecutableSlide);
}
#endif
// link the main executable
link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
sMainExecutable->setNeverUnloadRecursive();
if ( sMainExecutable->forceFlat() ) {
gLinkContext.bindFlat = true;
gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
}
// link any inserted libraries
// do this after linking main executable so that any dylibs pulled in by inserted
// dylibs (e.g. libSystem) will not be in front of dylibs the program uses
if ( sInsertedDylibCount > 0 ) {
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
image->setNeverUnloadRecursive();
}
// only INSERTED libraries can interpose
// register interposing info after all inserted libraries are bound so chaining works
for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
ImageLoader* image = sAllImages[i+1];
// register interposing
image->registerInterposing();
}
}
// <rdar://problem/19315404> dyld should support interposition even without DYLD_INSERT_LIBRARIES
for (long i=sInsertedDylibCount+1; i < sAllImages.size(); ++i) {
ImageLoader* image = sAllImages[i];
if ( image->inSharedCache() )
continue;
image->registerInterposing();
}
#if SUPPORT_ACCELERATE_TABLES
if ( (sAllCacheImagesProxy != NULL) && ImageLoader::haveInterposingTuples() ) {
// Accelerator tables cannot be used with implicit interposing, so relaunch with accelerator tables disabled
ImageLoader::clearInterposingTuples();
// unmap all loaded dylibs (but not main executable)
for (long i=1; i < sAllImages.size(); ++i) {
ImageLoader* image = sAllImages[i];
if ( image == sMainExecutable )
continue;
if ( image == sAllCacheImagesProxy )
continue;
image->setCanUnload();
ImageLoader::deleteImage(image);
}
// note: we don't need to worry about inserted images because if DYLD_INSERT_LIBRARIES was set we would not be using the accelerator table
sAllImages.clear();
sImageRoots.clear();
sImageFilesNeedingTermination.clear();
sImageFilesNeedingDOFUnregistration.clear();
sAddImageCallbacks.clear();
sRemoveImageCallbacks.clear();
sDisableAcceleratorTables = true;
sAllCacheImagesProxy = NULL;
sMappedRangesStart = NULL;
mainExcutableAlreadyRebased = true;
gLinkContext.linkingMainExecutable = false;
resetAllImages();
goto reloadAllImages;
}
#endif
// apply interposing to initial set of images
for(int i=0; i < sImageRoots.size(); ++i) {
// apply interposing
sImageRoots[i]->applyInterposing(gLinkContext);
}
gLinkContext.linkingMainExecutable = false;
// <rdar://problem/12186933> do weak binding only after all inserted images linked
// bind weak symbols
sMainExecutable->weakBind(gLinkContext);
#if DYLD_SHARED_CACHE_SUPPORT
// If cache has branch island dylibs, tell debugger about them
if ( (sSharedCache != NULL) && (sSharedCache->mappingOffset >= 0x78) && (sSharedCache->branchPoolsOffset != 0) ) {
uint32_t count = sSharedCache->branchPoolsCount;
dyld_image_info info[count];
const uint64_t* poolAddress = (uint64_t*)((char*)sSharedCache + sSharedCache->branchPoolsOffset);
// <rdar://problem/20799203> empty branch pools can be in development cache
if ( ((mach_header*)poolAddress)->magic == sMainExecutableMachHeader->magic ) {
for (int poolIndex=0; poolIndex < count; ++poolIndex) {
uint64_t poolAddr = poolAddress[poolIndex] + sSharedCacheSlide;
info[poolIndex].imageLoadAddress = (mach_header*)(long)poolAddr;
info[poolIndex].imageFilePath = "dyld_shared_cache_branch_islands";
info[poolIndex].imageFileModDate = 0;
}
// add to all_images list
addImagesToAllImages(count, info);
// tell gdb about new branch island images
gProcessInfo->notification(dyld_image_adding, count, info);
}
}
#endif
CRSetCrashLogMessage("dyld: launch, running initializers");
#if SUPPORT_OLD_CRT_INITIALIZATION
// Old way is to run initializers via a callback from crt1.o
if ( ! gRunInitializersOldWay )
initializeMainExecutable();
#else
// run all initializers
// run the initializers
initializeMainExecutable();
#endif
// notify any montoring proccesses that this process is about to enter main()
notifyMonitoringDyldMain();
// find entry point for main executable
//LC_MAIN
result = (uintptr_t)sMainExecutable->getThreadPC();
if ( result != 0 ) {
// main executable uses LC_MAIN, needs to return to glue in libdyld.dylib
if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
*startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
else
halt("libdyld.dylib support not present for LC_MAIN");
}
else {
// main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
//LC_UNIXTHREAD
result = (uintptr_t)sMainExecutable->getMain();
*startGlue = 0;
}
}
catch(const char* message) {
syncAllImages();
halt(message);
}
catch(...) {
dyld::log("dyld: launch failed\n");
}
CRSetCrashLogMessage(NULL);
// sNotifyObjCInit
return result;
}
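7. Directions for optimization
启动完整图.png
7.1 Optimizing pre-main
Measuring pre-main time
Add the DYLD_PRINT_STATISTICS environment variable to the project scheme.
Run the app and check the console output (689.74 milliseconds in total were spent before main):
xcode设置.png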
Total pre-main time: 689.74 milliseconds (100.0%)
dylib loading time: 123.58 milliseconds (17.9%)
rebase/binding time: 43.52 milliseconds (6.3%)
ObjC setup time: 39.95 milliseconds (5.7%)
initializer time: 482.51 milliseconds (69.9%)
slowest intializers :
libSystem.B.dylib : 5.56 milliseconds (0.8%)
libglInterpose.dylib : 99.52 milliseconds (14.4%)
libMTLInterpose.dylib : 28.90 milliseconds (4.1%)
AgoraMediaPlayer : 16.61 milliseconds (2.4%)
HLLCourseLive_test : 628.58 milliseconds (91.1%)
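WWDC 2016 Session 406 gives tips for improving each of these steps; a brief summary follows.
- dylib loading time: the dynamic loader finds and reads the dynamic libraries the app depends on, and each library may have dependencies of its own. Loading Apple's system frameworks is highly optimized, but loading embedded frameworks can be slow. To speed up dylib loading, Apple recommends using fewer dynamic libraries or merging them. (We have already converted all third-party libraries pulled in through pods into static libraries; this can be checked in the Pods project.)
查看动态库.png
- Rebase/binding time: once the dylibs are loaded they are still independent of one another and have to be bound together. For security, the system uses ASLR (Address Space Layout Randomization) and code signing when loading dylibs. Because of ASLR, an image (an executable, dylib, or bundle) is loaded at a random address, which differs from the address its pointers were built for (the preferred address) by an offset (the slide). dyld has to correct for this offset so that the pointers refer to the right addresses. Rebase comes first and Bind second. Rebase reads the image into memory and fixes up the pointers inside the image, so its cost is mainly I/O. Bind looks up the symbol table and sets the pointers that refer outside the image, so its cost is mainly CPU.
- ObjC setup time: the Objective-C runtime has to set up classes and categories and register selectors. Any improvement to rebase/binding time also reduces this setup time. The runtime maintains a global table that maps class names to classes and their method lists, and the following work is done:
  - every declared Objective-C class is registered in this global table (class registration)
  - category methods are inserted into the class's method list (category registration)
  - every selector is checked for uniqueness (selector uniquing)
- Initializer time: running the initializers. Use initialize instead of load. Reduce the use of C/C++ __attribute__((constructor)) and prefer dispatch_once(), pthread_once(), or std::call_once(). Prefer Swift. Do not call dlopen() in an initializer: loading is single-threaded and lock-free, and calling dlopen() makes it multi-threaded, which adds locking overhead and can deadlock. Do not create threads in an initializer.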
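- Convert the dynamic libraries in pods into static libraries
- Merge dynamic libraries
- Merge classes and methods with similar functionality
- Remove unused classes, methods, and image assets. The fui tool (https://github.com/dblock/fui) can find unused classes. Its results are quite accurate but not 100% correct, so to be safe, check each reported class by hand before removing it
- Reduce C++ static objects and constructor functions (__attribute__((constructor)))
- Use static_initializer_trace to find initializers that take too long
- Use timer_profile to find methods that take too long
- Reduce the use of load; prefer the initialize method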
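The advice to replace constructor-time work with pthread_once() can be sketched as follows. This is a minimal illustration in C, assuming a POSIX threads environment; `init_config`, `config_value`, and the value 42 are placeholders for real setup work, not code from the project.

```c
#include <assert.h>
#include <pthread.h>

static pthread_once_t g_config_once = PTHREAD_ONCE_INIT;
static int g_config_init_count = 0;  /* how many times the setup body ran */
static int g_config_value = 0;

/* Setup that would otherwise sit in +load or an __attribute__((constructor))
   function and therefore run before main(). */
static void init_config(void) {
    g_config_init_count++;
    g_config_value = 42;  /* stand-in for expensive setup work */
}

/* Accessor that runs the setup on first use, exactly once, even when several
   threads call it at the same time. */
static int config_value(void) {
    int rc = pthread_once(&g_config_once, init_config);
    assert(rc == 0);
    (void)rc;
    return g_config_value;
}
```

The cost of the setup moves from the pre-main phase to the first call of `config_value`, which is only a win if that first call happens after the first screen is shown.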
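7.2 Optimizing main
Measuring the main() phase means measuring the time from the start of main() to the end of didFinishLaunchingWithOptions. This requires adding timing code to the project by hand.
Add the test code to main.m:
CFAbsoluteTime StartTime;
int main(int argc, char * argv[]) {
StartTime = CFAbsoluteTimeGetCurrent();
In AppDelegate.m, declare the global variable StartTime with extern:
extern CFAbsoluteTime StartTime;
Then, in didFinishLaunchingWithOptions, read the current time again; its difference from StartTime is the time spent in the main() phase:
double launchTime = (CFAbsoluteTimeGetCurrent() - StartTime);
Optimization advice: do less work in the main function (there is usually little room here, because main rarely contains custom logic).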
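CFAbsoluteTimeGetCurrent() reads the wall clock, which can jump when the system time is synchronized or changed by the user. As an alternative, the same measurement can be sketched with a monotonic clock. The C code below assumes a POSIX clock_gettime(); `PhaseTimer` and its functions are hypothetical helpers, not part of the project.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Seconds from a clock that never goes backwards. */
static double monotonic_seconds(void) {
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

typedef struct {
    double start;  /* value of monotonic_seconds() when the phase began */
} PhaseTimer;

/* Call at the start of the phase, e.g. on the first line of main(). */
static void phase_timer_start(PhaseTimer *t) {
    t->start = monotonic_seconds();
}

/* Call at the end of the phase, e.g. at the end of didFinishLaunchingWithOptions. */
static double phase_timer_elapsed_ms(const PhaseTimer *t) {
    return (monotonic_seconds() - t->start) * 1000.0;
}
```

Monotonic readings are only meaningful as differences within one process, so absolute timestamps in the uploaded record would still need a wall-clock source.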
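7.3 From didFinishLaunch to the first screen
Measured in a similar way: didFinishLaunchingWithOptions finish:0.303182 homePage viewDidLoad:5.418390s
- Run logic asynchronously
- Defer logic
- Optimize caching (network latency is beyond our control)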
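7.4 Binary reordering
Binary reordering is about reducing page faults; that is its whole point.
By default, functions are laid out in the binary in the order in which they are compiled, which you can confirm from the link map. Because system memory is managed in pages, the layout can be pictured as follows:
二进制重排1.png
Suppose the methods needed at launch are method1 in Page1 and method3 in Page2. Launch then triggers two page faults. If we rearrange the methods so that method1 and method3 sit in the same page, one page fault is saved. With more methods, even more page faults can in theory be avoided, which speeds up launch.
二进制重排2.png
How can the effect of reordering be measured and verified?
Check whether the number of page faults has dropped
Inspect the LinkMap file, an intermediate build product, to confirm the new order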
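The page arithmetic behind this argument can be sketched in C. The function below counts how many distinct pages a set of function offsets falls in; the 16 KB page size and the offsets used with it are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 16384u  /* assumption: 16 KB pages */

/* Counts the distinct pages that the given function offsets fall in.
   Quadratic for simplicity; fine for a handful of offsets. */
static size_t count_pages_touched(const uint64_t *offsets, size_t count) {
    assert(offsets != NULL || count == 0);
    size_t distinct = 0;
    for (size_t i = 0; i < count; ++i) {
        uint64_t page = offsets[i] / PAGE_SIZE_BYTES;
        int seen = 0;
        for (size_t j = 0; j < i; ++j) {
            if (offsets[j] / PAGE_SIZE_BYTES == page) {
                seen = 1;
                break;
            }
        }
        if (!seen) {
            distinct++;
        }
    }
    return distinct;
}
```

With method1 at offset 0x0100 and method3 at offset 0x4200, two pages are touched; after moving method3 to offset 0x0300, both methods share one page.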
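7.4.1 System Trace
Reboot the device. Press Command + I to open Instruments, choose the System Trace tool, click the record button, and click stop once the first page appears. Filter on Main Thread and choose Summary: Virtual Memory.
profile1.png profile2.png profile3.png
Next, look at the warm launch case: kill the app and repeat the steps above (without rebooting).
profile4.png profile5.png
Comparing the File Backed Page In counts for cold and warm launch shows that far fewer page faults are triggered on a warm launch.
profile6.png
After the binary optimization, page fault time dropped from about 324.94 ms to about 216.54 ms.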
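7.4.2 Collecting launch-time methods with Clang instrumentation
This is really a code coverage tool; see the official Clang documentation for details. In Build Settings, add -fsanitize-coverage=func,trace-pc-guard to Other C Flags, then add two functions to the code to collect the methods the app calls during launch.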
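The two functions are the callbacks that Clang's SanitizerCoverage expects when trace-pc-guard is enabled. A minimal C sketch follows; the callback names and signatures come from the Clang documentation, while the fixed-size buffer, `MAX_RECORDED_PCS`, and `recorded_pc_count` are assumptions of this sketch rather than the project's actual code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RECORDED_PCS 65536  /* assumption: enough for one launch */

static void *g_recorded_pcs[MAX_RECORDED_PCS];
static atomic_size_t g_recorded_count;

/* Called once per module with the range of its guard variables.
   Every guard receives a non-zero id. */
void __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop) {
    static uint32_t next_id;
    assert(start <= stop);
    if (start == stop || *start) {
        return;  /* empty range, or already initialized */
    }
    for (uint32_t *guard = start; guard < stop; ++guard) {
        *guard = ++next_id;
    }
}

/* Called on entry to every instrumented function (with the func option).
   Records the caller's address the first time each guard is hit. A duplicate
   entry is possible if two threads race on the same guard. */
void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
    if (*guard == 0) {
        return;  /* already recorded, or never initialized */
    }
    *guard = 0;
    size_t index = atomic_fetch_add(&g_recorded_count, 1);
    if (index < MAX_RECORDED_PCS) {
        g_recorded_pcs[index] = __builtin_return_address(0);
    }
}

/* Number of addresses stored so far, capped at the buffer size. */
static size_t recorded_pc_count(void) {
    size_t n = atomic_load(&g_recorded_count);
    return n < MAX_RECORDED_PCS ? n : MAX_RECORDED_PCS;
}
```

The recorded addresses still have to be turned into symbol names (for example with dladdr) and written out in call order as an order file for the linker; that part is not shown here.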
8. References
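9. Pitfalls
Problem 1: The Aliyun logs showed viewDidAppearStartTime and viewDidAppearEndTime sometimes being 0, even though the project only assigns 0 at initialization. I first suspected repeated initialization, but the class is a singleton, so that could not be the cause. Adding more logging finally showed that the network request callback sometimes completed before viewDidAppear had started executing. There are two ways to handle this:
- Start the network request in viewDidAppear
- In both viewDidAppear and the network callback, check whether the corresponding timestamps are greater than 0
Problem 2: While reviewing the launch optimization data recently, I found anomalies in the instrumentation data: the gap between two timestamps was sometimes far too large, which made the analysis difficult. The causes were:
- Cause 1: the user sent the app to the background during launch, so the first screen did not go through the normal launch flow;
- Cause 2: the user tapped the ad page during launch and entered it, so the first screen did not go through the normal launch flow;
Solution:
Filter out the dirty data produced by these abnormal flows by adding two variables:
enterBackgroundBeforeFinish: whether the app entered the background (0 = no, 1 = yes)
clickAd: whether the user tapped through to the ad page (0 = no, 1 = yes)
Query SQL: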
and event:appStartUp and deviceType: iPhone or deviceType: iPad and appVersion>= "5.0.0" |SELECT round(avg((requestEndTime-mainTime)/1000),2) as "requestEndTime",deviceType,requestStatus,pageType ,count(*) as total,date_format(date_trunc('day', __time__), '%m-%d') as date where (requestEndTime - mainTime) > 0 group by deviceType ,requestStatus,pageType,date order by date
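The same filtering can be pictured on the client or in an offline script. The C sketch below is an illustration only: `LaunchSample` holds a subset of the fields, timestamps are assumed to be in milliseconds (as the division by 1000 in the query suggests), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    double mainTime;
    double viewDidAppearStartTime;
    double viewDidAppearEndTime;
    double requestEndTime;
    int    enterBackgroundBeforeFinish;  /* 0 = no, 1 = yes */
    int    clickAd;                      /* 0 = no, 1 = yes */
} LaunchSample;

/* A sample is usable when the launch was not interrupted, the viewDidAppear
   timestamps were actually recorded, and the total duration is positive. */
static bool launch_sample_is_valid(const LaunchSample *s) {
    assert(s != NULL);
    if (s->enterBackgroundBeforeFinish != 0 || s->clickAd != 0) {
        return false;
    }
    if (s->viewDidAppearStartTime <= 0 || s->viewDidAppearEndTime <= 0) {
        return false;
    }
    return (s->requestEndTime - s->mainTime) > 0;
}

/* Average of (requestEndTime - mainTime) in seconds over the valid samples.
   Returns 0 when no sample is valid. */
static double average_launch_seconds(const LaunchSample *samples, size_t count) {
    double total_ms = 0.0;
    size_t valid = 0;
    for (size_t i = 0; i < count; ++i) {
        if (launch_sample_is_valid(&samples[i])) {
            total_ms += samples[i].requestEndTime - samples[i].mainTime;
            valid++;
        }
    }
    return valid == 0 ? 0.0 : (total_ms / 1000.0) / (double)valid;
}
```

Dropping a sample entirely, rather than clamping its values, keeps interrupted launches from skewing the average in either direction.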
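Problem 3: The network API is unstable, which makes the total time unstable
Suggested solution:
Add a home page cache to improve the user experience
Address the real-time requirements of the live class schedule
Problem 4: The ad page is used as the rootViewController
Solution:
Show the ad in a window layered on top, so that it runs concurrently with the home page code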