Exploring iOS Internals: Program Loading Before main (pre-main)

Author: 可可先生_3083 | Published: 2021-07-12 18:15

    Preface

    This article covers a lot of ground, and many details still need further study. Corrections and criticism are welcome.

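    How XNU loads an app

    References:
    iOS 系统内核 XNU:App 如何加载? (The iOS kernel XNU: how is an app loaded?)
    XNU source code

    1. fork a new process
    2. Allocate memory for the Mach-O
    3. Parse the Mach-O
    4. Read the Mach-O header
    5. Walk the load commands, map the Mach-O into memory, and set the app's entry point.
    6. Start dyld

    In short, XNU creates a new process for the Mach-O, sets up its virtual address space, parses the Mach-O file, maps it into that address space, and sets the entry point of the app.

    Once the entry point has been set, the load_dylinker() function parses and loads dyld, and the entry point address is replaced with dyld's entry address. At that point the kernel's part of loading the Mach-O file is done. What remains is dyld loading the app in user space.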
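To make steps 4 and 5 more concrete, here is a minimal sketch of walking the load commands of a 64-bit Mach-O image held in a byte buffer. It is not the kernel's code. The structs are hand-rolled stand-ins for the layouts in <mach-o/loader.h> so that the example compiles on any platform, and scanLoadCommands, ScanResult and makeFakeImage are hypothetical names invented for this illustration. It also assumes a little-endian host and a thin (non-fat) file.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-ins for mach_header_64 and load_command from <mach-o/loader.h>.
struct MachHeader64 {
    uint32_t magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved;
};
struct LoadCommand { uint32_t cmd, cmdsize; };

const uint32_t kMagic64        = 0xfeedfacf; // MH_MAGIC_64
const uint32_t kLcSegment64    = 0x19;       // LC_SEGMENT_64: a segment to map
const uint32_t kLcLoadDylinker = 0x0e;       // LC_LOAD_DYLINKER: path of dyld
const uint32_t kLcMain         = 0x80000028; // LC_MAIN: entry point offset

struct ScanResult {
    bool valid;
    int  segments;
    bool needsDylinker;
    bool hasEntryPoint;
};

// Read the header, then step from one load command to the next via cmdsize.
ScanResult scanLoadCommands(const std::vector<uint8_t>& image) {
    ScanResult r = {false, 0, false, false};
    MachHeader64 hdr;
    if (image.size() < sizeof(hdr)) return r;
    std::memcpy(&hdr, image.data(), sizeof(hdr));
    if (hdr.magic != kMagic64) return r;

    std::size_t offset = sizeof(hdr);
    for (uint32_t i = 0; i < hdr.ncmds; ++i) {
        LoadCommand lc;
        if (offset + sizeof(lc) > image.size()) return r;
        std::memcpy(&lc, image.data() + offset, sizeof(lc));
        if (lc.cmdsize < sizeof(lc) || offset + lc.cmdsize > image.size()) return r;
        if (lc.cmd == kLcSegment64)         ++r.segments;        // would be mapped
        else if (lc.cmd == kLcLoadDylinker) r.needsDylinker = true;
        else if (lc.cmd == kLcMain)         r.hasEntryPoint = true;
        offset += lc.cmdsize;
    }
    r.valid = true;
    return r;
}

// Builds a synthetic image whose load commands carry no payload (demo only).
std::vector<uint8_t> makeFakeImage(const std::vector<uint32_t>& cmds) {
    MachHeader64 hdr = MachHeader64();
    hdr.magic = kMagic64;
    hdr.ncmds = static_cast<uint32_t>(cmds.size());
    hdr.sizeofcmds = static_cast<uint32_t>(cmds.size() * sizeof(LoadCommand));
    std::vector<uint8_t> image(sizeof(hdr) + hdr.sizeofcmds);
    std::memcpy(image.data(), &hdr, sizeof(hdr));
    std::size_t offset = sizeof(hdr);
    for (std::size_t i = 0; i < cmds.size(); ++i) {
        LoadCommand lc = {cmds[i], static_cast<uint32_t>(sizeof(LoadCommand))};
        std::memcpy(image.data() + offset, &lc, sizeof(lc));
        offset += sizeof(lc);
    }
    return image;
}
```

The bounds checks matter more than the dispatch: every field here comes from an untrusted file, so each cmdsize is validated before it is used to advance.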

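    dyld

    References:
    dyld: Dynamic Linking On OS X
    链接器:符号是怎么绑定到地址上的 (The linker: how symbols are bound to addresses)
    dyld3-wwdc
    iOS应用的启动流程和优化详解 (The iOS app launch process and its optimization, in detail)

    dyld is short for "the dynamic link editor", that is, the dynamic linker, and it is an essential part of Apple's operating systems. On iOS and macOS only a handful of processes can be loaded by the kernel alone; almost every process needs dynamic linking.
    A Mach-O image contains many references to external libraries and symbols, but these references cannot be used as they are (symbols that live in dylibs are marked undefined). Filling them in is the job of the dynamic linker, dyld.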

    (undefined) external _NSLog (from Foundation) 
    (undefined) external _OBJC_CLASS_$_NSObject (from CoreFoundation)
    

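    Broadly speaking, dyld does the following:

    1. loading

    dyld starts from the main Mach-O executable and, based on its undefined symbols, loads the dylibs it depends on and maps them into the process's memory. Dependencies are resolved recursively. The system keeps a shared cache of system dylibs, and the load function contains logic that keeps the same library from being loaded twice.

    2. linking

    • rebase: fix up pointers that refer to locations inside the current image
    ASLR stands for Address Space Layout Randomization. When an app is launched, the executable and its dylibs are mapped into the process's virtual address space. That mapping has a start address, and ASLR makes it random, which is why internal pointers need rebasing. If the address were fixed, an attacker could easily locate any function as start address + function offset.
    
    Code Sign is Apple's code signing mechanism. When code is signed, the cryptographic hash is not computed over the whole file but over each page separately, so that each page can be verified independently as it is loaded.
    
    • bind: bind symbols to their addresses in the dylibs. Binding fixes up pointers that refer to locations outside the image (across images)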
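The difference between rebase and bind can be sketched on a toy model. This is not dyld's data structure: ToyImage, rebaseImage and bindSymbols are hypothetical names, and the "image" is just a vector of pointer slots plus two fixup lists. The addresses used in the usage note are made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct ToyImage {
    uint64_t preferredBase;                        // base the static linker assumed
    std::vector<uint64_t> pointers;                // pointer slots in the data segment
    std::vector<std::size_t> rebaseSlots;          // slots that point inside this image
    std::map<std::size_t, std::string> bindSlots;  // slot -> imported symbol name
};

// Rebase: every internal pointer moves by the same amount, the ASLR slide.
void rebaseImage(ToyImage& img, uint64_t actualBase) {
    const uint64_t slide = actualBase - img.preferredBase;
    for (std::size_t i = 0; i < img.rebaseSlots.size(); ++i)
        img.pointers[img.rebaseSlots[i]] += slide;
}

// Bind: each external pointer is looked up by name in what other images export.
// Returns false if a symbol cannot be found.
bool bindSymbols(ToyImage& img, const std::map<std::string, uint64_t>& exports) {
    std::map<std::size_t, std::string>::const_iterator it;
    for (it = img.bindSlots.begin(); it != img.bindSlots.end(); ++it) {
        std::map<std::string, uint64_t>::const_iterator sym = exports.find(it->second);
        if (sym == exports.end()) return false;
        img.pointers[it->first] = sym->second;
    }
    return true;
}
```

With a preferred base of 0x100000000 and an actual base of 0x104a30000, a slot holding 0x100004000 becomes 0x104a34000 after rebaseImage. A slot bound to _NSLog simply receives whatever address the exporting image provides. Rebase is pure arithmetic, while bind needs a string lookup, which is one reason binding tends to be the more expensive of the two.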

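    3. static initializers

    • Once the dylibs have been mapped into process memory and linked, they need whatever initialization they require. On iOS, libSystem, libdispatch and libobjc have higher priority and are guaranteed to be initialized first.
    • libobjc registers callbacks with dyld. Through these callbacks every loaded image, including our main executable, hands its Objective-C classes, symbols and so on (for example the classes and selectors we declare ourselves) over to libobjc to manage. This is the main topic of the next article.
    _dyld_objc_notify_register(&map_images, load_images, unmap_image)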
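The registration pattern behind that call can be sketched as a small observer. This is a toy, not dyld's implementation: ToyLoader and its methods are hypothetical, and the sketch assumes that images loaded before registration are replayed to the newly registered callback.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

class ToyLoader {
public:
    typedef std::function<void(const std::vector<std::string>&)> MappedFn;
    typedef std::function<void(const std::string&)> InitFn;

    // The runtime registers once; images that are already loaded are replayed.
    void registerNotifiers(MappedFn mapped, InitFn init) {
        mapped_ = std::move(mapped);
        init_ = std::move(init);
        if (mapped_ && !images_.empty()) mapped_(images_);
    }

    // Every image mapped later is announced through the "mapped" callback.
    void addImage(const std::string& path) {
        images_.push_back(path);
        if (mapped_) mapped_(std::vector<std::string>(1, path));
    }

    // Called when an image is about to run its initializers.
    void runInitializers(const std::string& path) {
        if (init_) init_(path);
    }

private:
    std::vector<std::string> images_;
    MappedFn mapped_;
    InitFn init_;
};
```

The point of the design is inversion of control: the loader knows nothing about Objective-C, it only promises to tell the registered runtime about every image, so that the runtime can pick out the classes and selectors it cares about.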

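    4. jump to main

    At this point the executable and the dylibs are all in memory, every pointer has been fixed up to the right address, and the necessary initialization is done. dyld jumps to main and runs it.

    dyld code flow (screenshot: 截屏2021-07-14 下午1.25.31.png)

    This was traced on the simulator; the flow on a real device is different.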

    dyldbootstrap::start

    uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
                    const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
    {
    
        // Emit kdebug tracepoint to indicate dyld bootstrap has started <rdar://46878536>
        dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);
    
        // if kernel had to slide dyld, we need to fix up load sensitive locations
        // we have to do this before using any global variables
        /**
            1. rebase
         */
        rebaseDyld(dyldsMachHeader);
    
        // kernel sets up env pointer to be just past end of agv array
        const char** envp = &argv[argc+1];
        
        // kernel sets up apple pointer to be just past end of envp array
        const char** apple = envp;
        while(*apple != NULL) { ++apple; }
        ++apple;
        
        // set up random value for stack canary
        __guard_setup(apple);
    
    #if DYLD_INITIALIZER_SUPPORT
        // run all C++ initializers inside dyld
        runDyldInitializers(argc, argv, envp, apple);
    #endif
    
        _subsystem_init(apple);
    
        // now that we are done bootstrapping dyld, call dyld's main
        uintptr_t appsSlide = appsMachHeader->getSlide();
        return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
    }
    
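    • rebaseDyld
    • runDyldInitializers
    • call dyld::_main
      Notice the pattern: for a library to be usable by a process, only a few things have to happen:
      load: map it into memory,
      rebase/bind: fix up the pointers that need it,
      set up some environment variables,
      initializer: run the necessary initialization.
      After that the library can be used. dyld itself is no exception, and once this is done dyld can happily go on to load the other dylibs.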
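The pointer arithmetic that start() uses to find envp and apple relies on the kernel placing three NULL-terminated arrays back to back: argv, then envp, then apple. A minimal sketch of just that step follows; StartupVectors and locateVectors are hypothetical names, and the array in the usage note is hand-built rather than a real process stack.

```cpp
#include <cstring>

struct StartupVectors {
    const char** envp;
    const char** apple;
};

// Layout assumed: argv[0..argc-1], NULL, envp..., NULL, apple..., NULL
StartupVectors locateVectors(int argc, const char** argv) {
    StartupVectors v;
    v.envp = &argv[argc + 1];            // skip argv and its NULL terminator
    const char** apple = v.envp;
    while (*apple != nullptr) ++apple;   // walk to envp's NULL terminator
    ++apple;                             // apple starts right after it
    v.apple = apple;
    return v;
}
```

For the array {"app", "-v", NULL, "HOME=/", NULL, "executable_path=/bin/app", NULL} and argc == 2, envp points at "HOME=/" and apple points at "executable_path=/bin/app". The apple array is where dyld::_main later reads keys such as executable_path and dyld_flags.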

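    dyld::_main

    The code is very long, so only a few important fragments are shown here. Download the source if you are interested.

    Configuring environment variables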
    _main(const macho_header* mainExecutableMH, uintptr_t mainExecutableSlide, 
            int argc, const char* argv[], const char* envp[], const char* apple[], 
            uintptr_t* startGlue)
    {
        if (dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE)) {
            launchTraceID = dyld3::kdebug_trace_dyld_duration_start(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, (uint64_t)mainExecutableMH, 0, 0);
        }
    
        //Check and see if there are any kernel flags
        dyld3::BootArgs::setFlags(hexToUInt64(_simple_getenv(apple, "dyld_flags"), nullptr));
    
    #if __has_feature(ptrauth_calls)
        // Check and see if kernel disabled JOP pointer signing (which lets us load plain arm64 binaries)
        if ( const char* disableStr = _simple_getenv(apple, "ptrauth_disabled") ) {
            if ( strcmp(disableStr, "1") == 0 )
                sKeysDisabled = true;
        }
        else {
            // needed until kernel passes ptrauth_disabled for arm64 main executables
            if ( (mainExecutableMH->cpusubtype == CPU_SUBTYPE_ARM64_V8) || (mainExecutableMH->cpusubtype == CPU_SUBTYPE_ARM64_ALL) )
                sKeysDisabled = true;
        }
    #endif
    
        // Grab the cdHash of the main executable from the environment
        uint8_t mainExecutableCDHashBuffer[20];
        const uint8_t* mainExecutableCDHash = nullptr;
        if ( const char* mainExeCdHashStr = _simple_getenv(apple, "executable_cdhash") ) {
            unsigned bufferLenUsed;
            if ( hexStringToBytes(mainExeCdHashStr, mainExecutableCDHashBuffer, sizeof(mainExecutableCDHashBuffer), bufferLenUsed) )
                mainExecutableCDHash = mainExecutableCDHashBuffer;
        }
    
        getHostInfo(mainExecutableMH, mainExecutableSlide);
    
    #if !TARGET_OS_SIMULATOR
        // Trace dyld's load
        notifyKernelAboutImage((macho_header*)&__dso_handle, _simple_getenv(apple, "dyld_file"));
        // Trace the main executable's load
        notifyKernelAboutImage(mainExecutableMH, _simple_getenv(apple, "executable_file"));
    #endif
    
        uintptr_t result = 0;
        sMainExecutableMachHeader = mainExecutableMH;
        sMainExecutableSlide = mainExecutableSlide;
    
    
        // Set the platform ID in the all image infos so debuggers can tell the process type
        // FIXME: This can all be removed once we make the kernel handle it in rdar://43369446
        // The host may not have the platform field in its struct, but there's space for it in the padding, so always set it
        {
            __block bool platformFound = false;
            ((dyld3::MachOFile*)mainExecutableMH)->forEachSupportedPlatform(^(dyld3::Platform platform, uint32_t minOS, uint32_t sdk) {
                if (platformFound) {
                    halt("MH_EXECUTE binaries may only specify one platform");
                }
                gProcessInfo->platform = (uint32_t)platform;
                platformFound = true;
            });
            if (gProcessInfo->platform == (uint32_t)dyld3::Platform::unknown) {
                // There were no platforms found in the binary. This may occur on macOS for alternate toolchains and old binaries.
                // It should never occur on any of our embedded platforms.
    #if TARGET_OS_OSX
                gProcessInfo->platform = (uint32_t)dyld3::Platform::macOS;
    #else
                halt("MH_EXECUTE binaries must specify a minimum supported OS version");
    #endif
            }
        }
    
    #if TARGET_OS_OSX
        // Check to see if we need to override the platform
        const char* forcedPlatform = _simple_getenv(envp, "DYLD_FORCE_PLATFORM");
        if (forcedPlatform) {
            dyld_platform_t forcedPlatformType = 0;
            if (strncmp(forcedPlatform, "6", 1) == 0) {
                forcedPlatformType = PLATFORM_MACCATALYST;
            } else if (strncmp(forcedPlatform, "2", 1) == 0) {
                forcedPlatformType = PLATFORM_IOS;
            } else  {
                halt("DYLD_FORCE_PLATFORM is only supported for platform 2 or 6.");
            }
            const dyld3::MachOFile* mf = (dyld3::MachOFile*)sMainExecutableMachHeader;
            if (mf->allowsAlternatePlatform()) {
                gProcessInfo->platform = forcedPlatformType;
            }
        }
    
        // if this is host dyld, check to see if iOS simulator is being run
        const char* rootPath = _simple_getenv(envp, "DYLD_ROOT_PATH");
        if ( (rootPath != NULL) ) {
            // look to see if simulator has its own dyld
            char simDyldPath[PATH_MAX]; 
            strlcpy(simDyldPath, rootPath, PATH_MAX);
            strlcat(simDyldPath, "/usr/lib/dyld_sim", PATH_MAX);
            int fd = dyld3::open(simDyldPath, O_RDONLY, 0);
            if ( fd != -1 ) {
                // TODO: the simulator branch returns here
                const char* errMessage = useSimulatorDyld(fd, mainExecutableMH, simDyldPath, argc, argv, envp, apple, startGlue, &result);
                if ( errMessage != NULL )
                    halt(errMessage);
                return result;
            }
        }
        else {
            ((dyld3::MachOFile*)mainExecutableMH)->forEachSupportedPlatform(^(dyld3::Platform platform, uint32_t minOS, uint32_t sdk) {
                if ( dyld3::MachOFile::isSimulatorPlatform(platform) )
                    halt("attempt to run simulator program outside simulator (DYLD_ROOT_PATH not set)");
            });
        }
    #endif
    
        CRSetCrashLogMessage("dyld: launch started");
        // TODO: set up the context
        setContext(mainExecutableMH, argc, argv, envp, apple);
    
        // Pickup the pointer to the exec path.
        // TODO: get the path of the executable
        sExecPath = _simple_getenv(apple, "executable_path");
    
        // <rdar://problem/13868260> Remove interim apple[0] transition code from dyld
        if (!sExecPath) sExecPath = apple[0];
    
    #if TARGET_OS_IPHONE && !TARGET_OS_SIMULATOR
        // <rdar://54095622> kernel is not passing a real path for main executable
        if ( strncmp(sExecPath, "/var/containers/Bundle/Application/", 35) == 0 ) {
            if ( char* newPath = (char*)malloc(strlen(sExecPath)+10) ) {
                strcpy(newPath, "/private");
                strcat(newPath, sExecPath);
                sExecPath = newPath;
            }
        }
    #endif
    
        if ( sExecPath[0] != '/' ) {
            // have relative path, use cwd to make absolute
            char cwdbuff[MAXPATHLEN];
            if ( getcwd(cwdbuff, MAXPATHLEN) != NULL ) {
                // maybe use static buffer to avoid calling malloc so early...
                char* s = new char[strlen(cwdbuff) + strlen(sExecPath) + 2];
                strcpy(s, cwdbuff);
                strcat(s, "/");
                strcat(s, sExecPath);
                sExecPath = s;
            }
        }
    
        // Remember short name of process for later logging
        sExecShortName = ::strrchr(sExecPath, '/');
        if ( sExecShortName != NULL )
            ++sExecShortName;
        else
            sExecShortName = sExecPath;
    
    #if TARGET_OS_OSX && __has_feature(ptrauth_calls)
        // on Apple Silicon macOS, only Apple signed ("platform binary") arm64e can be loaded
        sOnlyPlatformArm64e = true;
    
        // internal builds, or if boot-arg is set, then non-platform-binary arm64e slices can be run
        if ( const char* abiMode = _simple_getenv(apple, "arm64e_abi") ) {
            if ( strcmp(abiMode, "all") == 0 )
                sOnlyPlatformArm64e = false;
        }
    #endif
        // configure the process restrictions
        configureProcessRestrictions(mainExecutableMH, envp);
    

    There are a lot of check/compare/set/get operations here: environment variables are read, validated and set.
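Three of those small operations are easy to show in isolation: looking up a key in a "key=value" array (the role _simple_getenv plays above), turning a relative executable path into an absolute one, and extracting the short process name. The helpers below are hypothetical re-implementations written for this illustration, using std::string instead of the raw buffers dyld uses.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Linear search over a NULL-terminated array of "key=value" strings.
// Returns a pointer to the value, or nullptr when the key is absent.
const char* lookupVar(const char* const* vars, const char* key) {
    const std::size_t keyLen = std::strlen(key);
    for (const char* const* p = vars; *p != nullptr; ++p) {
        if (std::strncmp(*p, key, keyLen) == 0 && (*p)[keyLen] == '=')
            return *p + keyLen + 1;
    }
    return nullptr;
}

// A path that does not start with '/' is resolved against the working directory.
std::string makeAbsolute(const std::string& path, const std::string& cwd) {
    if (!path.empty() && path[0] == '/') return path;
    return cwd + "/" + path;
}

// The short name is whatever follows the last '/', or the whole path if none.
std::string shortName(const std::string& path) {
    const std::size_t slash = path.rfind('/');
    return slash == std::string::npos ? path : path.substr(slash + 1);
}
```

Checking for '=' right after the key matters: without it, a lookup for "executable" would wrongly match "executable_path=...".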

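    Loading the shared cache

    Since iOS 3.1, to improve performance, the vast majority of system dylibs have been packed into a single cache file. The shared cache only holds system-level dylibs; dylibs you build yourself and third-party dylibs are not placed in it.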

        // load shared cache
        checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide);
        if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
    #if TARGET_OS_SIMULATOR
            if ( sSharedCacheOverrideDir)
                mapSharedCache(mainExecutableSlide);
    #else
            mapSharedCache(mainExecutableSlide);
    #endif
    
            // If this process wants a different __DATA_CONST state from the shared region, then override that now
            if ( (sSharedCacheLoadInfo.loadAddress != nullptr) && (gEnableSharedCacheDataConst != sharedCacheDataConstIsEnabled) ) {
                uint32_t permissions = gEnableSharedCacheDataConst ? VM_PROT_READ : (VM_PROT_READ | VM_PROT_WRITE);
                sSharedCacheLoadInfo.loadAddress->changeDataConstPermissions(mach_task_self(), permissions,
                                                                             (gLinkContext.verboseMapping ? &dyld::log : nullptr));
            }
        }
    
    • The core function is mapSharedCache(mainExecutableSlide)
    • mapSharedCache calls loadDyldCache to load the shared cache
    bool loadDyldCache(const SharedCacheOptions& options, SharedCacheLoadInfo* results)
    {
        results->loadAddress        = 0;
        results->slide              = 0;
        results->errorMessage       = nullptr;
    
    #if TARGET_OS_SIMULATOR
        // simulator only supports mmap()ing cache privately into process
        return mapCachePrivate(options, results);
    #else
        if ( options.forcePrivate ) {
            // mmap cache into this process only
            return mapCachePrivate(options, results);
        }
        else {
            // fast path: when cache is already mapped into shared region
            bool hasError = false;
            if ( reuseExistingCache(options, results) ) {
                hasError = (results->errorMessage != nullptr);
            } else {
                // slow path: this is first process to load cache
                hasError = mapCacheSystemWide(options, results);
            }
            return hasError;
        }
    #endif
    }
    
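    • Forced private mapping (mapCachePrivate)
    • The shared cache is already mapped (reuseExistingCache)
    • First load (mapCacheSystemWide); this path leads into the functions that the dyld source declares under the comment // should be in mach/shared_region.h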
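The three cases can be condensed into a small decision function. This only restates the branching of loadDyldCache as shown above. CacheMapping, CacheOptions and chooseMapping are hypothetical names, and the alreadyMapped flag stands in for the result of reuseExistingCache.

```cpp
enum class CacheMapping { Private, ReuseExisting, SystemWide };

struct CacheOptions {
    bool isSimulator;    // the simulator can only map the cache privately
    bool forcePrivate;   // caller asked for a private mapping
    bool alreadyMapped;  // stand-in: the shared region already holds the cache
};

CacheMapping chooseMapping(const CacheOptions& o) {
    if (o.isSimulator || o.forcePrivate)
        return CacheMapping::Private;        // mmap into this process only
    if (o.alreadyMapped)
        return CacheMapping::ReuseExisting;  // fast path
    return CacheMapping::SystemWide;         // slow path: first process to load it
}
```

Only the first process after boot pays for the slow path; every later process takes the fast path, which is a large part of why system libraries are cheap to load.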
    dyld3 or dyld2
      // Closure mode is the dyld3 way of launching: ClosureMode::Off falls back to dyld2, anything else uses dyld3
      if ( sClosureMode == ClosureMode::Off ) {
        //dyld2
        if ( gLinkContext.verboseWarnings )
                dyld::log("dyld: not using closures\n");
      } else {
        // dyld3: DYLD_LAUNCH_MODE_USING_CLOSURE, launch in closure mode
        sLaunchModeUsed = DYLD_LAUNCH_MODE_USING_CLOSURE;
        const dyld3::closure::LaunchClosure* mainClosure = nullptr;
        dyld3::closure::LoadedFileInfo mainFileInfo;
        mainFileInfo.fileContent = mainExecutableMH;
        mainFileInfo.path = sExecPath;
        ...
        // first look in the shared cache for a dyld3 mainClosure
        if ( sSharedCacheLoadInfo.loadAddress != nullptr ) {
                mainClosure = sSharedCacheLoadInfo.loadAddress->findClosure(sExecPath);
                ...
        }
     
       ...
        // if the shared cache has one, check that the closure is still valid
        if ( (mainClosure != nullptr) && !closureValid(mainClosure, mainFileInfo, mainExecutableCDHash, true, envp) ) {
                mainClosure = nullptr;
                sLaunchModeUsed &= ~DYLD_LAUNCH_MODE_CLOSURE_FROM_OS;
        }
        
        bool allowClosureRebuilds = false;
        if ( sClosureMode == ClosureMode::On ) {
                allowClosureRebuilds = true;
        } 
        ...
        
        // no valid closure was found in the shared cache, so build one now
        if ( (mainClosure == nullptr) && allowClosureRebuilds ) {
            ...
            if ( mainClosure == nullptr ) {
                // build a mainClosure
                mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp,
                                                 bootToken);
                if ( mainClosure != nullptr )
                    sLaunchModeUsed |= DYLD_LAUNCH_MODE_BUILT_CLOSURE_AT_LAUNCH;
            }
        }
       
        // try using launch closure
        // the dyld3 launch starts here
        if ( mainClosure != nullptr ) {
            CRSetCrashLogMessage("dyld3: launch started");
            ...
            // launch via launchWithClosure
            bool launched = launchWithClosure(mainClosure,
                                              sSharedCacheLoadInfo.loadAddress,(dyld3::MachOLoaded*)mainExecutableMH,...);
            // the launch failed
            if ( !launched && closureOutOfDate && allowClosureRebuilds ) {
                // closure is out of date, build new one
                // the launch failed, so rebuild mainClosure
                mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo,
                                                 envp, bootToken);
                if ( mainClosure != nullptr ) {
                    ...
                    // try the dyld3 launch again
                    launched = launchWithClosure(mainClosure,  sSharedCacheLoadInfo.loadAddress,
                                                 (dyld3::MachOLoaded*)mainExecutableMH,...);
                }
            }
            if ( launched ) {
                gLinkContext.startedInitializingMainExecutable = true;
                // the launch succeeded: return the entry address (main) directly
                if (sSkipMain)
                    result = (uintptr_t)&fake_main;
                return result;
            }
            else {
                // the launch failed
            }
        }
    }
    
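    • dyld3 optimizations
    1. Launch speed
      1.1. A parser running in a daemon process preprocesses everything that could slow launch down: search paths, @rpath and environment variables
      1.2. It then analyzes the Mach-O headers and dependencies and performs all of the symbol lookups
      1.3. The results are packaged into a launch closure. Launch closures for system apps are built into the shared cache; for third-party apps the closure is built when the app is installed or updated. Either way the work is done before the app launches
      1.4. For closures built into the shared cache there is not even a separate file to open, so loading is fast
    2. Security
      The launch closure is loaded and validated. Before dyld3, dyld analyzed the dependencies in the Mach-O headers recursively at launch time, which left room for dependent libraries to be modified or injected
    3. Apple's official comparison diagram (image: 851626253562_.pic.jpg)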
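The "look up, validate, rebuild only if needed" idea behind launch closures can be sketched as a small cache keyed by executable path. This is only an analogy: ToyClosure and ToyClosureStore are hypothetical, and a real closure records far more than a load order. The sketch assumes that a closure is considered valid when the cdHash it was built against matches the current one.

```cpp
#include <map>
#include <string>
#include <vector>

struct ToyClosure {
    std::string cdHash;                  // code directory hash it was built against
    std::vector<std::string> loadOrder;  // precomputed result of the expensive work
};

class ToyClosureStore {
public:
    ToyClosureStore() : rebuilds_(0) {}

    // Returns a closure valid for cdHash, rebuilding when missing or stale.
    const ToyClosure& obtain(const std::string& path, const std::string& cdHash,
                             const std::vector<std::string>& deps) {
        std::map<std::string, ToyClosure>::iterator it = closures_.find(path);
        if (it == closures_.end() || it->second.cdHash != cdHash) {
            ++rebuilds_;                          // the slow path
            ToyClosure fresh = {cdHash, deps};
            closures_[path] = fresh;
            it = closures_.find(path);
        }
        return it->second;                        // the fast path reuses stored work
    }

    int rebuilds() const { return rebuilds_; }

private:
    std::map<std::string, ToyClosure> closures_;
    int rebuilds_;
};
```

Launching the same binary twice rebuilds once; changing the hash (for example after an app update) forces one more rebuild. That mirrors the flow in the excerpt: closureValid rejects a stale closure, and buildLaunchClosure creates a fresh one.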
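    Instantiating the main executable

    dyld3 and dyld2 follow much the same flow; dyld3 uses closure mode, which is faster and safer. An image is a Mach-O file that has been mapped from disk into memory. In other words, any Mach-O file loaded into memory is called an image.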

            // TODO: instantiate the main executable; this returns an ImageLoader object that dyld manages
            sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
            // store the main executable in gLinkContext
            gLinkContext.mainExecutable = sMainExecutable;
            // whether the main executable is code signed
            gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);
    

    For each executable file (dynamic shared object) in use, an ImageLoader is instantiated.

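    • The kernel has already mapped the main executable into process memory before dyld runs. As the comment from Apple's source quoted above says, every executable file that is in use gets its own ImageLoader instance.
    • Instantiating the main executable means creating an ImageLoaderMachO object for it. You can think of it as wrapping the main executable in an ImageLoaderMachO instance, which dyld manages and the program uses.
    • The instance is added to the MappedRanges master list that dyld manages: addImage()
    Instantiating dylibs: loading the inserted dylibs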
            // load any inserted libraries
            // TODO: load the inserted libraries
            if  ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
                for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
                    loadInsertedDylib(*lib);
            }
            // record count of inserted libraries so that a flat search will look at 
            // inserted libraries, then main, then others.
            sInsertedDylibCount = sAllImages.size()-1;
    // core function: load
    ImageLoader* load(const char* path, const LoadContext& context, unsigned& cacheIndex)
    // instantiate an ImageLoader
    // map in file and instantiate an ImageLoader
    static ImageLoader* loadPhase6(int fd, const struct stat& stat_buf, const char* path, const LoadContext& context)
    // create image by mapping in a mach-o file
    ImageLoader* ImageLoaderMachO::instantiateFromFile
    // add it to the master list managed by dyld
    static ImageLoader* checkandAddImage(ImageLoader* image, const LoadContext& context)
    
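    • This is much the same as instantiating the main executable: each inserted dylib gets an ImageLoader instance, which is added to the master list managed by dyld
    • load implements logic that avoids loading the same library file twice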
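The "do not load the same library twice" idea can be sketched with a path-indexed image list. This is a simplification under a stated assumption: the sketch deduplicates by path string only, whereas the real load code is considerably more involved. ToyImageList is a hypothetical name.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

class ToyImageList {
public:
    // Returns the index of the image for this path, adding it only when
    // it is not present yet.
    std::size_t load(const std::string& path) {
        std::map<std::string, std::size_t>::const_iterator it = indexByPath_.find(path);
        if (it != indexByPath_.end())
            return it->second;                    // already loaded: reuse it
        images_.push_back(path);                  // stand-in for map + instantiate
        indexByPath_[path] = images_.size() - 1;
        return images_.size() - 1;
    }

    std::size_t count() const { return images_.size(); }

private:
    std::vector<std::string> images_;
    std::map<std::string, std::size_t> indexByPath_;
};
```

If the main executable is loaded first it gets index 0 and the inserted dylibs follow it, which is the same layout the excerpt relies on when it computes sInsertedDylibCount as sAllImages.size()-1.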
    Linking the main executable
            // TODO: link the main executable
            gLinkContext.linkingMainExecutable = true;
    #if SUPPORT_ACCELERATE_TABLES
            if ( mainExcutableAlreadyRebased ) {
                // previous link() on main executable has already adjusted its internal pointers for ASLR
                // work around that by rebasing by inverse amount
                sMainExecutable->rebase(gLinkContext, -mainExecutableSlide);
            }
    #endif
            link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
            sMainExecutable->setNeverUnloadRecursive();
            if ( sMainExecutable->forceFlat() ) {
                gLinkContext.bindFlat = true;
                gLinkContext.prebindUsage = ImageLoader::kUseNoPrebinding;
            }
    
    Linking the dylibs
            // link any inserted libraries
            // do this after linking main executable so that any dylibs pulled in by inserted 
            // dylibs (e.g. libSystem) will not be in front of dylibs the program uses
            // TODO: link the dylibs, in a loop
            if ( sInsertedDylibCount > 0 ) {
                for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
                    ImageLoader* image = sAllImages[i+1];
                    link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
                    image->setNeverUnloadRecursive();
                }
                if ( gLinkContext.allowInterposing ) {
                    // only INSERTED libraries can interpose
                    // register interposing info after all inserted libraries are bound so chaining works
                    for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
                        ImageLoader* image = sAllImages[i+1];
                        image->registerInterposing(gLinkContext);
                    }
                }
            }
    
    

    Linking the main executable and linking the dylibs follow essentially the same logic; the core function is link()

    void ImageLoader::link(const LinkContext& context, bool forceLazysBound, bool preflightOnly, bool neverUnload, const RPathChain& loaderRPaths, const char* imagePath)
    {
        //dyld::log("ImageLoader::link(%s) refCount=%d, neverUnload=%d\n", imagePath, fDlopenReferenceCount, fNeverUnload);
        
        // clear error strings
        (*context.setErrorStrings)(0, NULL, NULL, NULL);
    
        uint64_t t0 = mach_absolute_time();
        // load the dependent libraries recursively
        this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths, imagePath);
        context.notifyBatch(dyld_image_state_dependents_mapped, preflightOnly);
    
        // we only do the loading step for preflights
        if ( preflightOnly )
            return;
    
        uint64_t t1 = mach_absolute_time();
        context.clearAllDepths();
        this->updateDepth(context.imageCount());
    
        __block uint64_t t2, t3, t4, t5;
        {
            dyld3::ScopedTimer(DBG_DYLD_TIMING_APPLY_FIXUPS, 0, 0, 0);
            t2 = mach_absolute_time();
            // rebase recursively
            this->recursiveRebaseWithAccounting(context);
            context.notifyBatch(dyld_image_state_rebased, false);
    
            t3 = mach_absolute_time();
            // linkingMainExecutable is true while the main executable is being linked, so binding is skipped here
            if ( !context.linkingMainExecutable )
                this->recursiveBindWithAccounting(context, forceLazysBound, neverUnload);
    
            t4 = mach_absolute_time();
            if ( !context.linkingMainExecutable )
                this->weakBind(context);
            t5 = mach_absolute_time();
        }
    
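    • What link() mainly does
    1. recursiveLoadLibraries(): loads the dependent libraries and keeps them in an array, so that an image can find its dependencies in memory. The same array is used later during symbol binding.
    2. recursiveRebaseWithAccounting(context): rebases the image and its dependencies. The bind calls that follow are skipped while linkingMainExecutable is true.
    Binding the main executable and the dylibs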
            // Bind and notify for the main executable now that interposing has been registered
            uint64_t bindMainExecutableStartTime = mach_absolute_time();
            sMainExecutable->recursiveBindWithAccounting(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true);
            uint64_t bindMainExecutableEndTime = mach_absolute_time();
            ImageLoaderMachO::fgTotalBindTime += bindMainExecutableEndTime - bindMainExecutableStartTime;
            gLinkContext.notifyBatch(dyld_image_state_bound, false);
    
            // Bind and notify for the inserted images now interposing has been registered
            if ( sInsertedDylibCount > 0 ) {
                for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
                    ImageLoader* image = sAllImages[i+1];
                    image->recursiveBind(gLinkContext, sEnv.DYLD_BIND_AT_LAUNCH, true, nullptr);
                }
            }
            
            // <rdar://problem/12186933> do weak binding only after all inserted images linked
            // TODO: weak-bind the main executable (after all inserted images linked)
            sMainExecutable->weakBind(gLinkContext);
            gLinkContext.linkingMainExecutable = false;
    
            sMainExecutable->recursiveMakeDataReadOnly(gLinkContext);
    
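    • recursiveBind()
    • Performs symbol binding for the main executable and the dylibs, using the saved libImages array
    • Once the fixups are done, some data segments are made read-only
    Running the initializers

    All images have been loaded and all pointers have been fixed up, so the necessary initialization can now run.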

            // TODO: initialize the main executable
            initializeMainExecutable(); 
    
    void initializeMainExecutable()
    {
        // record that we've reached this step
        gLinkContext.startedInitializingMainExecutable = true;
    
        // run initializers for any inserted dylibs
        // run the initializers in all of the dylibs
        ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
        initializerTimes[0].count = 0;
        const size_t rootCount = sImageRoots.size();
        // run the dylibs' initializers first
        if ( rootCount > 1 ) {
                for(size_t i=1; i < rootCount; ++i) {
                   sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
                }
        }
    
        // run initializers for main executable and everything it brings up 
        // run the main executable's initializers
        sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
    
        ...
    }
    
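    • This is the part we care about most
    • _objc_init is called here. It registers callbacks with dyld, and through those callbacks the Objective-C classes, protocols, methods, symbols and so on of every image, including our main executable (the Objective-C code we wrote ourselves), are handed to libobjc. Space is limited, so the next article covers this in detail.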
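Why a low-level library ends up being initialized before the images that depend on it can be sketched with a depth-first walk over a dependency graph: an image's dependencies are initialized before the image itself. This is a toy model, not dyld's algorithm, and ToyImageGraph and the dependency lists in the usage note are made up for illustration.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

struct ToyImageGraph {
    std::map<std::string, std::vector<std::string> > deps;  // image -> what it links
    std::set<std::string> visited;                           // done or in progress
    std::vector<std::string> order;                          // order initializers ran

    void runInitializers(const std::string& image) {
        if (!visited.insert(image).second) return;   // also breaks dependency cycles
        std::map<std::string, std::vector<std::string> >::const_iterator it = deps.find(image);
        if (it != deps.end()) {
            for (std::size_t i = 0; i < it->second.size(); ++i)
                runInitializers(it->second[i]);      // dependencies go first
        }
        order.push_back(image);                      // then the image itself
    }
};
```

With MyApp depending on Foundation and libobjc, Foundation on libobjc and libSystem, and libobjc on libSystem, calling runInitializers("MyApp") produces the order libSystem, libobjc, Foundation, MyApp. The library at the bottom of the graph comes out first without any special casing.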
    Returning to main
        if (sSkipMain) {
             notifyMonitoringDyldMain();
            if (dyld3::kdebug_trace_dyld_enabled(DBG_DYLD_TIMING_LAUNCH_EXECUTABLE)) {
                dyld3::kdebug_trace_dyld_duration_end(launchTraceID, DBG_DYLD_TIMING_LAUNCH_EXECUTABLE, 0, 0, 2);
            }
            ARIADNEDBG_CODE(220, 1);
            result = (uintptr_t)&fake_main;
            *startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
        }
    
        return result;
    

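    Summary:

    We have walked through how an iOS program is loaded before main:

    1. The kernel forks a process, allocates its memory, maps the main executable into memory and starts dyld. This step is the transition from kernel mode to user mode.
    2. dyld first rebases itself and sets up the environment it needs, then examines mainExecutableMH, looks for a usable shared cache and loads it. Here we met dyld2 and dyld3, and the launch-speed and security improvements that dyld3 makes over dyld2.
    3. dyld's main job is to analyze the main executable's Mach-O file, dynamically load the libraries it depends on, map them into memory and manage them. Because of ASLR and code signing, images need rebase and bind operations so that the pointers in the program refer to the right memory addresses. Once all fixups are done, the initializers of the main executable run. In short: loading, rebase/binding, initializers.
    4. During initialization, dyld makes sure that the lowest-level library, libSystem, is initialized first; libdispatch and libobjc are also initialized very early. When libobjc initializes, it registers callbacks with dyld, which it uses to manage the Objective-C parts of every image. That is the subject of the next article.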

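    Question:

    Now that we know roughly how a program starts up before main, where can we look to make launch faster?

    The following comes from WWDC:

    1. Fewer dylibs (load fewer libraries; merge libraries where necessary)
    2. Fewer classes and methods (each has to be registered and have its pointers fixed up at load time)
    3. Fewer initializers (initializing the main executable runs the initializers and C++ constructors of every dylib)
    4. More Swift (no initializers; certain kinds of misaligned data are not allowed)
    5. Fewer +load methods
      In short, the less code you write, the faster your app launches 😁😁😁. This article turned out quite long and my knowledge is limited, so some of it may be wrong or imprecise. Please leave comments so that we can discuss.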
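Point 3 can be illustrated in plain C++. A global object with a constructor is a static initializer: it runs before main, during the initializer phase described earlier, whether or not the program ever uses it. A function-local static is constructed on first use instead. The names below (EagerTable, LazyTable, lazyTable and the two counters) are hypothetical and exist only to make the timing observable.

```cpp
int gEagerInitCount = 0;
int gLazyInitCount = 0;

// Static initializer: the constructor runs before main, at launch time.
struct EagerTable {
    EagerTable() { ++gEagerInitCount; }
};
static EagerTable gEagerTable;

// Lazy alternative: nothing runs at launch; the first caller pays the cost.
struct LazyTable {
    LazyTable() { ++gLazyInitCount; }
};
LazyTable& lazyTable() {
    static LazyTable table;   // constructed on the first call only
    return table;
}
```

By the time main starts, gEagerInitCount is already 1 while gLazyInitCount is still 0. It becomes 1 after the first call to lazyTable() and stays there. Moving work from the first form to the second is one way to act on "fewer initializers" without deleting any functionality.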

    Original link: https://www.haomeiwen.com/subject/gnarpltx.html