iOS Internals 16: dyld Source Code Analysis

Author: 黑白森林无间道 | Published: 2021-07-20 16:54

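This article walks through the execution flow of the dyld source code. App launch, class loading, and category loading all pass through dyld, so reading its source gives a clearer picture of how an iOS app actually starts up.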

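What is dyld
dyld (the dynamic link editor) is Apple's dynamic linker and a core part of Apple's operating systems. Once the kernel has finished preparing the process, it hands control to dyld, which takes care of the rest of the launch. dyld is open source: anyone can download it from Apple's open source site to study how it works and how the system loads dynamic libraries.
dyld download: https://opensource.apple.com/tarballs/dyld. This article uses dyld-832.7.3.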

dyld load flow

Create a new project named DyldDemo, implement + (void)load {} in ViewController.m, set a breakpoint inside it, run the app, and print the call stack with bt.

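The backtrace shows that dyld starts at _dyld_start.

【1】__dyld_start

__dyld_start is defined in dyldStartup.s in the dyld source. It is written in assembly, with one variant per supported architecture. The arm64 version is shown below.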

#include <TargetConditionals.h>
    .globl __dyld_start
    
    #if __arm64__ && !TARGET_OS_SIMULATOR
    .text
    .align 2
    .globl __dyld_start
__dyld_start:
    mov     x28, sp
    and     sp, x28, #~15       // force 16-byte alignment of stack
    mov x0, #0
    mov x1, #0
    stp x1, x0, [sp, #-16]! // make aligned terminating frame
    mov fp, sp          // set up fp to point to terminating frame
    sub sp, sp, #16             // make room for local variables
#if __LP64__
    ldr     x0, [x28]               // get app's mh into x0
    ldr     x1, [x28, #8]           // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
    add     x2, x28, #16            // get argv into x2
#else
    ldr     w0, [x28]               // get app's mh into x0
    ldr     w1, [x28, #4]           // get argc into x1 (kernel passes 32-bit int argc as 64-bits on stack to keep alignment)
    add     w2, w28, #8             // get argv into x2
#endif
    adrp    x3,___dso_handle@page
    add     x3,x3,___dso_handle@pageoff // get dyld's mh in to x4
    mov x4,sp                   // x5 has &startGlue

    // call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
    bl  __ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm
    mov x16,x0                  // save entry point address in x16
#if __LP64__
    ldr     x1, [sp]
#else
    ldr     w1, [sp]
#endif
    cmp x1, #0
    b.ne    Lnew

    // LC_UNIXTHREAD way, clean up stack and jump to result
#if __LP64__
    add sp, x28, #8             // restore unaligned stack pointer without app mh
#else
    add sp, x28, #4             // restore unaligned stack pointer without app mh
#endif
#if __arm64e__
    braaz   x16                     // jump to the program's entry point
#else
    br      x16                     // jump to the program's entry point
#endif

    // LC_MAIN case, set up stack for call to main()
Lnew:   mov lr, x1          // simulate return address into _start in libdyld.dylib
#if __LP64__
    ldr x0, [x28, #8]       // main param1 = argc
    add x1, x28, #16        // main param2 = argv
    add x2, x1, x0, lsl #3
    add x2, x2, #8          // main param3 = &env[0]
    mov x3, x2
Lapple: ldr x4, [x3]
    add x3, x3, #8
#else
    ldr w0, [x28, #4]       // main param1 = argc
    add x1, x28, #8         // main param2 = argv
    add x2, x1, x0, lsl #2
    add x2, x2, #4          // main param3 = &env[0]
    mov x3, x2
Lapple: ldr w4, [x3]
    add x3, x3, #4
#endif
    cmp x4, #0
    b.ne    Lapple          // main param4 = apple
#if __arm64e__
    braaz   x16
#else
    br      x16
#endif

#endif // __arm64__ && !TARGET_OS_SIMULATOR

The listing contains a bl instruction; according to the comment above it, this is the call to dyldbootstrap::start():

// call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
    bl  __ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm
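
The bl target is a C++ mangled name. A minimal sketch of how to read it, assuming a toolchain that ships <cxxabi.h> (GCC or Clang): the helper name demangleMachOSymbol is made up for this example, and the only library call used is abi::__cxa_demangle. Mach-O symbol tables prepend one extra underscore to each symbol, which the helper strips first.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of dyld): demangle an Itanium C++ ABI name
// as it appears in a Mach-O symbol table.
std::string demangleMachOSymbol(const std::string& machoSymbol) {
    std::string itanium = machoSymbol;
    if (itanium.rfind("__Z", 0) == 0)
        itanium.erase(0, 1);                 // "__ZN..." -> "_ZN..."
    int status = 0;
    char* out = abi::__cxa_demangle(itanium.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr)
        return machoSymbol;                  // not a mangled name; return unchanged
    std::string result(out);
    std::free(out);                          // __cxa_demangle allocates with malloc
    return result;
}
```

Called with the symbol from the bl instruction, this should return a signature of the form dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*), which lines up with the five arguments named in the assembly comment.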

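【2】dyldbootstrap::start()

dyldbootstrap::start() performs much of dyld's own initialization:

Rebasing dyld itself: rebaseDyld(dyldsMachHeader);
Mach message initialization: mach_init(); (called at this point in earlier dyld releases; it is not called directly in the 832.7.3 listing below)
Stack overflow protection (stack canary): __guard_setup(apple);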

//
//  This is code to bootstrap dyld.  This work in normally done for a program by dyld and crt.
//  In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
                const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{

    // Emit kdebug tracepoint to indicate dyld bootstrap has started <rdar://46878536>
    dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);

    // if kernel had to slide dyld, we need to fix up load sensitive locations
    // we have to do this before using any global variables
    // rebase dyld itself
    rebaseDyld(dyldsMachHeader);

    // kernel sets up env pointer to be just past end of agv array
    const char** envp = &argv[argc+1];
    
    // kernel sets up apple pointer to be just past end of envp array
    const char** apple = envp;
    while(*apple != NULL) { ++apple; }
    ++apple;

    // set up random value for stack canary
    // stack overflow protection
    __guard_setup(apple);

#if DYLD_INITIALIZER_SUPPORT
    // run all C++ initializers inside dyld
    runDyldInitializers(argc, argv, envp, apple);
#endif

    _subsystem_init(apple);

    // now that we are done bootstrapping dyld, call dyld's main
    uintptr_t appsSlide = appsMachHeader->getSlide();
    // enter dyld::_main()
    return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}
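
The pointer arithmetic that start() uses to find envp and apple can be tried in isolation. The sketch below assumes the layout described by the comments in the listing (argv entries, a NULL, envp entries, a NULL, then the apple strings); LaunchVectors and locateVectors are names invented for this example.

```cpp
#include <cstddef>

// Pointers into the kernel-provided string vector, laid out as
// argv[0..argc-1], NULL, envp..., NULL, apple..., NULL.
struct LaunchVectors {
    const char** envp;
    const char** apple;
};

// Hypothetical helper mirroring the envp/apple computation in start().
LaunchVectors locateVectors(int argc, const char** argv) {
    LaunchVectors v;
    v.envp = &argv[argc + 1];        // skip the argv entries and their NULL terminator
    v.apple = v.envp;
    while (*v.apple != nullptr)      // walk to the end of envp
        ++v.apple;
    ++v.apple;                       // step over envp's NULL terminator
    return v;
}
```

With a vector such as {"app", "-v", NULL, "HOME=/root", NULL, "executable_path=/bin/app", NULL} and argc = 2, envp lands on the "HOME=/root" entry and apple on the "executable_path=/bin/app" entry.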

【3】dyld::_main()

dyld::_main() is the key function of the whole app launch and does most of the work.

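【3.1】Configuring the environment

This step sets up runtime parameters and environment variables. At the start, the mainExecutableMH argument is assigned to sMainExecutableMachHeader. It is a macho_header structure describing the Mach-O header of the main executable, and the loader uses it to parse the rest of the Mach-O file.

Set up the context: setContext(mainExecutableMH, argc, argv, envp, apple);
Configure process restrictions: configureProcessRestrictions(mainExecutableMH, envp);
Check environment variables: checkEnvironmentVariables(envp);
Print arguments and environment variables when DYLD_PRINT_OPTS and DYLD_PRINT_ENV are set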

    // Step 1: configure the environment
    // Grab the cdHash of the main executable from the environment
    uint8_t mainExecutableCDHashBuffer[20];
    const uint8_t* mainExecutableCDHash = nullptr;
    if ( const char* mainExeCdHashStr = _simple_getenv(apple, "executable_cdhash") ) {
        unsigned bufferLenUsed;
        if ( hexStringToBytes(mainExeCdHashStr, mainExecutableCDHashBuffer, sizeof(mainExecutableCDHashBuffer), bufferLenUsed) )
            mainExecutableCDHash = mainExecutableCDHashBuffer;
    }

    getHostInfo(mainExecutableMH, mainExecutableSlide);

#if !TARGET_OS_SIMULATOR
    // Trace dyld's load
    notifyKernelAboutImage((macho_header*)&__dso_handle, _simple_getenv(apple, "dyld_file"));
    // Trace the main executable's load
    notifyKernelAboutImage(mainExecutableMH, _simple_getenv(apple, "executable_file"));
#endif

    uintptr_t result = 0;
    // Mach-O header of the main executable
    sMainExecutableMachHeader = mainExecutableMH;
    // ASLR slide of the main executable
    sMainExecutableSlide = mainExecutableSlide;


    // Set the platform ID in the all image infos so debuggers can tell the process type
    // FIXME: This can all be removed once we make the kernel handle it in rdar://43369446
    // The host may not have the platform field in its struct, but there's space for it in the padding, so always set it
    {
        __block bool platformFound = false;
        ((dyld3::MachOFile*)mainExecutableMH)->forEachSupportedPlatform(^(dyld3::Platform platform, uint32_t minOS, uint32_t sdk) {
            if (platformFound) {
                halt("MH_EXECUTE binaries may only specify one platform");
            }
            gProcessInfo->platform = (uint32_t)platform;
            platformFound = true;
        });
        if (gProcessInfo->platform == (uint32_t)dyld3::Platform::unknown) {
            // There were no platforms found in the binary. This may occur on macOS for alternate toolchains and old binaries.
            // It should never occur on any of our embedded platforms.
#if TARGET_OS_OSX
            gProcessInfo->platform = (uint32_t)dyld3::Platform::macOS;
#else
            halt("MH_EXECUTE binaries must specify a minimum supported OS version");
#endif
        }
    }

#if TARGET_OS_OSX
    // Check to see if we need to override the platform
    const char* forcedPlatform = _simple_getenv(envp, "DYLD_FORCE_PLATFORM");
    if (forcedPlatform) {
        dyld_platform_t forcedPlatformType = 0;
        if (strncmp(forcedPlatform, "6", 1) == 0) {
            forcedPlatformType = PLATFORM_MACCATALYST;
        } else if (strncmp(forcedPlatform, "2", 1) == 0) {
            forcedPlatformType = PLATFORM_IOS;
        } else  {
            halt("DYLD_FORCE_PLATFORM is only supported for platform 2 or 6.");
        }
        const dyld3::MachOFile* mf = (dyld3::MachOFile*)sMainExecutableMachHeader;
        if (mf->allowsAlternatePlatform()) {
            gProcessInfo->platform = forcedPlatformType;
        }
    }

    // if this is host dyld, check to see if iOS simulator is being run
    const char* rootPath = _simple_getenv(envp, "DYLD_ROOT_PATH");
    if ( (rootPath != NULL) ) {
        // look to see if simulator has its own dyld
        char simDyldPath[PATH_MAX]; 
        strlcpy(simDyldPath, rootPath, PATH_MAX);
        strlcat(simDyldPath, "/usr/lib/dyld_sim", PATH_MAX);
        int fd = dyld3::open(simDyldPath, O_RDONLY, 0);
        if ( fd != -1 ) {
            const char* errMessage = useSimulatorDyld(fd, mainExecutableMH, simDyldPath, argc, argv, envp, apple, startGlue, &result);
            if ( errMessage != NULL )
                halt(errMessage);
            return result;
        }
    }
    else {
        ((dyld3::MachOFile*)mainExecutableMH)->forEachSupportedPlatform(^(dyld3::Platform platform, uint32_t minOS, uint32_t sdk) {
            if ( dyld3::MachOFile::isSimulatorPlatform(platform) )
                halt("attempt to run simulator program outside simulator (DYLD_ROOT_PATH not set)");
        });
    }
#endif

    CRSetCrashLogMessage("dyld: launch started");
    // set up the context
    setContext(mainExecutableMH, argc, argv, envp, apple);

    // Pickup the pointer to the exec path.
    sExecPath = _simple_getenv(apple, "executable_path");

    // <rdar://problem/13868260> Remove interim apple[0] transition code from dyld
    if (!sExecPath) sExecPath = apple[0];

#if TARGET_OS_IPHONE && !TARGET_OS_SIMULATOR
    // <rdar://54095622> kernel is not passing a real path for main executable
    if ( strncmp(sExecPath, "/var/containers/Bundle/Application/", 35) == 0 ) {
        if ( char* newPath = (char*)malloc(strlen(sExecPath)+10) ) {
            strcpy(newPath, "/private");
            strcat(newPath, sExecPath);
            sExecPath = newPath;
        }
    }
#endif

    if ( sExecPath[0] != '/' ) {
        // have relative path, use cwd to make absolute
        char cwdbuff[MAXPATHLEN];
        if ( getcwd(cwdbuff, MAXPATHLEN) != NULL ) {
            // maybe use static buffer to avoid calling malloc so early...
            char* s = new char[strlen(cwdbuff) + strlen(sExecPath) + 2];
            strcpy(s, cwdbuff);
            strcat(s, "/");
            strcat(s, sExecPath);
            sExecPath = s;
        }
    }

    // Remember short name of process for later logging
    sExecShortName = ::strrchr(sExecPath, '/');
    if ( sExecShortName != NULL )
        ++sExecShortName;
    else
        sExecShortName = sExecPath;

#if TARGET_OS_OSX && __has_feature(ptrauth_calls)
    // on Apple Silicon macOS, only Apple signed ("platform binary") arm64e can be loaded
    sOnlyPlatformArm64e = true;

    // internal builds, or if boot-arg is set, then non-platform-binary arm64e slices can be run
    if ( const char* abiMode = _simple_getenv(apple, "arm64e_abi") ) {
        if ( strcmp(abiMode, "all") == 0 )
            sOnlyPlatformArm64e = false;
    }
#endif
    // configure process restrictions
    configureProcessRestrictions(mainExecutableMH, envp);
    // AMFI (Apple Mobile File Integrity) related
    // Check if we should force dyld3.  Note we have to do this outside of the regular env parsing due to AMFI
    if ( dyld3::internalInstall() ) {
        if (const char* useClosures = _simple_getenv(envp, "DYLD_USE_CLOSURES")) {
            if ( strcmp(useClosures, "0") == 0 ) {
                sClosureMode = ClosureMode::Off;
            } else if ( strcmp(useClosures, "1") == 0 ) {
    #if !__i386__ // don't support dyld3 for 32-bit macOS
                sClosureMode = ClosureMode::On;
                sClosureKind = ClosureKind::full;
    #endif
            } else if ( strcmp(useClosures, "2") == 0 ) {
                sClosureMode = ClosureMode::On;
                sClosureKind = ClosureKind::minimal;
            } else {
                dyld::warn("unknown option to DYLD_USE_CLOSURES.  Valid options are: 0 and 1\n");
            }

        }
    }

#if TARGET_OS_OSX
    if ( !gLinkContext.allowEnvVarsPrint && !gLinkContext.allowEnvVarsPath && !gLinkContext.allowEnvVarsSharedCache ) {
        pruneEnvironmentVariables(envp, &apple);
        // set again because envp and apple may have changed or moved
        setContext(mainExecutableMH, argc, argv, envp, apple);
    }
    else
#endif
    {
        // check environment variables
        checkEnvironmentVariables(envp);
        defaultUninitializedFallbackPaths(envp);
    }
#if TARGET_OS_OSX
    switch (gProcessInfo->platform) {
#if (TARGET_OS_OSX && TARGET_CPU_ARM64)
        case PLATFORM_IOS:
            sClosureMode = ClosureMode::On; // <rdar://problem/56792308> Run iOS apps on macOS in dyld3 mode
            [[clang::fallthrough]];
#endif
        case PLATFORM_MACCATALYST:
            gLinkContext.rootPaths = parseColonList("/System/iOSSupport", NULL);
            gLinkContext.iOSonMac = true;
            if ( sEnv.DYLD_FALLBACK_LIBRARY_PATH == sLibraryFallbackPaths )
                sEnv.DYLD_FALLBACK_LIBRARY_PATH = sRestrictedLibraryFallbackPaths;
            if ( sEnv.DYLD_FALLBACK_FRAMEWORK_PATH == sFrameworkFallbackPaths )
                sEnv.DYLD_FALLBACK_FRAMEWORK_PATH = sRestrictedFrameworkFallbackPaths;
            break;
        case PLATFORM_DRIVERKIT:
            gLinkContext.driverKit = true;
            gLinkContext.sharedRegionMode = ImageLoader::kDontUseSharedRegion;
            break;
    }
#endif
    // if DYLD_PRINT_OPTS is set, print the arguments with printOptions()
    if ( sEnv.DYLD_PRINT_OPTS )
        printOptions(argv);
    // if DYLD_PRINT_ENV is set, print the environment with printEnvironmentVariables()
    if ( sEnv.DYLD_PRINT_ENV ) 
        printEnvironmentVariables(envp);
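
Several values in the listing above (executable_path, executable_cdhash, dyld_file) are read from the apple vector with _simple_getenv. The sketch below illustrates that kind of "key=value" lookup over a NULL-terminated string vector. lookupVectorValue is a hypothetical stand-in written for this example, not the libsystem implementation.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a _simple_getenv-style lookup: find "key=value"
// in a NULL-terminated vector and return a pointer to the value, or nullptr
// when the key is absent.
const char* lookupVectorValue(const char* const* vec, const char* key) {
    const std::size_t keyLen = std::strlen(key);
    for (const char* const* p = vec; *p != nullptr; ++p) {
        // the entry must start with the key and be followed directly by '='
        if (std::strncmp(*p, key, keyLen) == 0 && (*p)[keyLen] == '=')
            return *p + keyLen + 1;
    }
    return nullptr;
}
```

The check for '=' right after the key matters: without it, a lookup for "executable" would wrongly match "executable_path=...".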

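【3.2】Loading the shared cache

dyld first checks whether the shared cache is disabled by calling checkSharedRegionDisable(). iOS requires the shared cache (the source notes "iOS cannot run without shared region").
It then calls mapSharedCache(), which in turn calls loadDyldCache(): mapSharedCache(mainExecutableSlide); -> loadDyldCache(opts, &sSharedCacheLoadInfo);. The code handles three cases:
- Map the cache into the current process only, via mapCachePrivate(). iOS does not take this branch.
- The shared cache is already mapped, so nothing needs to be done.
- The shared cache has not been mapped yet and the current process is the first to load it, so mapCacheSystemWide() is called.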

    if ( sJustBuildClosure )
        sClosureMode = ClosureMode::On;

    // Step 2: load the shared cache
    // load shared cache
    // check whether the shared cache is disabled; iOS requires it
    checkSharedRegionDisable((dyld3::MachOLoaded*)mainExecutableMH, mainExecutableSlide);
    if ( gLinkContext.sharedRegionMode != ImageLoader::kDontUseSharedRegion ) {
#if TARGET_OS_SIMULATOR
        if ( sSharedCacheOverrideDir)
            mapSharedCache(mainExecutableSlide);
#else
        mapSharedCache(mainExecutableSlide);
#endif
    }
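
The three cases of loadDyldCache() described above can be written as a small decision function. This is only a sketch of the branching: CacheAction, chooseCacheAction, and both parameter names are invented here, and the real function works on an options struct rather than two booleans.

```cpp
// The three outcomes described in the text. All names are illustrative.
enum class CacheAction { MapPrivate, ReuseExisting, MapSystemWide };

// forcePrivate:  the process asked for its own private copy of the cache.
// alreadyMapped: the shared region already holds the cache.
CacheAction chooseCacheAction(bool forcePrivate, bool alreadyMapped) {
    if (forcePrivate)
        return CacheAction::MapPrivate;      // corresponds to mapCachePrivate()
    if (alreadyMapped)
        return CacheAction::ReuseExisting;   // nothing to map
    return CacheAction::MapSystemWide;       // corresponds to mapCacheSystemWide()
}
```

The private case is tested first, so a process that forces a private mapping never reuses the system-wide one even when it is available.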

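The dyld shared cache

On iOS every app loads system dynamic libraries such as UIKit and Foundation through dyld. To save space, Apple places these system libraries in one shared location: the dyld shared cache.

PIC (position-independent code)

Take NSLog as an example.

【Compile time】: every reference to a function that lives in a shared-cache system library goes through the indirect symbol table rather than a hard-coded address; the real address is bound at run time.
【Run time】: when dyld loads the app into memory, it loads the system dynamic libraries named in the load commands and then binds symbols. For NSLog, dyld looks up the real address of NSLog in Foundation and writes it into the NSLog entry of the symbol pointer table in the __DATA segment.

This indirection is what makes the code position independent (PIC): the code itself contains no absolute addresses of external functions, so it runs correctly wherever the libraries end up in memory.

This is also how fishhook works: at run time it rebinds the symbol pointer that compiled code uses for a system library function to a user-supplied function, and stores the original function's real address in a pointer supplied by the user.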
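
The idea of calling through a rebindable pointer slot can be modelled in plain C++. The sketch below is a simulation only: gLogSlot plays the role of a symbol pointer in __DATA, realLog plays the library function, and rebind mimics what a fishhook-style tool does to the slot. None of these names come from dyld or fishhook.

```cpp
#include <string>

using LogFn = std::string (*)(const std::string&);

// Stand-in for the real library implementation that dyld would bind to.
std::string realLog(const std::string& msg) { return "real:" + msg; }

// Stand-in for a slot in the symbol pointer table. Call sites never embed
// the target address; they always call through this slot.
LogFn gLogSlot = realLog;

// Receives the previous target when the slot is rebound.
LogFn gOriginalLog = nullptr;

// Replacement that forwards to the saved original.
std::string hookedLog(const std::string& msg) {
    return "hooked:" + gOriginalLog(msg);
}

// Point the slot at a replacement and hand the previous target back.
void rebind(LogFn* slot, LogFn replacement, LogFn* savedOriginal) {
    *savedOriginal = *slot;
    *slot = replacement;
}
```

Before rebind(&gLogSlot, hookedLog, &gOriginalLog) a call through gLogSlot reaches realLog directly; afterwards the same call site reaches hookedLog, which can still forward to the original through gOriginalLog.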

【3.3】dyld2 and dyld3 (closure mode)

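Starting with iOS 13, dyld3 fully replaces dyld2. dyld3 brings a significant performance improvement and shortens app launch time.
The main steps in dyld2 are:

Parse the Mach-O header and load commands to find the dependent libraries, then recursively find all of their dependencies
Map the Mach-O files
Perform symbol lookups
Bind and rebase
Run initializers

All of these steps happen at app launch and involve a large amount of computation and I/O.
To speed up launch, Apple's engineers formally introduced dyld3 in WWDC 2017 Session 413, App Startup Time: Past, Present, and Future.

👇 The dyld3 flow is analyzed below.

【3.3.1】Obtaining mainClosure

Look up mainClosure in the shared cache first
Validate mainClosure (the dyld closure version must match the dyld cache version)
If no valid cached closure is found, build a new one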

// dyld3 closure mode: faster launch
sLaunchModeUsed = DYLD_LAUNCH_MODE_USING_CLOSURE;
const dyld3::closure::LaunchClosure* mainClosure = nullptr;
dyld3::closure::LoadedFileInfo mainFileInfo;
mainFileInfo.fileContent = mainExecutableMH;
mainFileInfo.path = sExecPath;
// FIXME: If we are saving this closure, this slice offset/length is probably wrong in the case of FAT files.
mainFileInfo.sliceOffset = 0;
mainFileInfo.sliceLen = -1;
struct stat mainExeStatBuf;
if ( dyld3::stat(sExecPath, &mainExeStatBuf) == 0 ) {
    mainFileInfo.inode = mainExeStatBuf.st_ino;
    mainFileInfo.mtime = mainExeStatBuf.st_mtime;
}
// check for closure in cache first
if ( sSharedCacheLoadInfo.loadAddress != nullptr ) {
    // look up mainClosure in the shared cache
    mainClosure = sSharedCacheLoadInfo.loadAddress->findClosure(sExecPath);
    if ( gLinkContext.verboseWarnings && (mainClosure != nullptr) )
        dyld::log("dyld: found closure %p (size=%lu) in dyld shared cache\n", mainClosure, mainClosure->size());
    if ( mainClosure != nullptr )
        sLaunchModeUsed |= DYLD_LAUNCH_MODE_CLOSURE_FROM_OS;
}

// We only want to try build a closure at runtime if its an iOS third party binary, or a macOS binary from the shared cache
bool allowClosureRebuilds = false;
if ( sClosureMode == ClosureMode::On ) {
    allowClosureRebuilds = true;
} else if ( (sClosureMode == ClosureMode::PreBuiltOnly) && (mainClosure != nullptr) ) {
    allowClosureRebuilds = true;
}

if ( (mainClosure != nullptr) && ! closureValid(mainClosure, mainFileInfo, mainExecutableCDHash, true, envp) ) {
    // mainClosure failed validation, discard it
    mainClosure = nullptr;
    sLaunchModeUsed &= ~DYLD_LAUNCH_MODE_CLOSURE_FROM_OS;
}

// <rdar://60333505> bootToken is a concat of boot-hash kernel passes down for app and dyld's uuid
uint8_t bootTokenBufer[128];
unsigned bootTokenBufferLen = 0;
if ( const char* bootHashStr = _simple_getenv(apple, "executable_boothash") ) {
    if ( hexStringToBytes(bootHashStr, bootTokenBufer, sizeof(bootTokenBufer), bootTokenBufferLen) ) {
        if ( ((dyld3::MachOFile*)&__dso_handle)->getUuid(&bootTokenBufer[bootTokenBufferLen]) )
            bootTokenBufferLen += sizeof(uuid_t);
    }
}
dyld3::Array<uint8_t> bootToken(bootTokenBufer, bootTokenBufferLen, bootTokenBufferLen);

// If we didn't find a valid cache closure then try build a new one
if ( (mainClosure == nullptr) && allowClosureRebuilds ) {
    // if forcing closures, and no closure in cache, or it is invalid, check for cached closure
    if ( !sForceInvalidSharedCacheClosureFormat )
        mainClosure = findCachedLaunchClosure(mainExecutableCDHash, mainFileInfo, envp, bootToken);
    if ( mainClosure == nullptr ) {
        // if  no cached closure found, build new one
        mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp, bootToken);
        if ( mainClosure != nullptr )
            sLaunchModeUsed |= DYLD_LAUNCH_MODE_BUILT_CLOSURE_AT_LAUNCH;
    }
}

// exit dyld after closure is built, without running program
if ( sJustBuildClosure )
    _exit(EXIT_SUCCESS);
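
The order in which the listing above tries closure sources can be summarized in a short sketch. It simplifies the real code: the three lookups are passed in as callables, validity is a plain flag, and allowRebuilds is taken as an input. Closure, ClosureSource, ClosureSteps, and pickClosure are all names invented for this example.

```cpp
#include <functional>

struct Closure { bool valid; };   // illustrative stand-in for a launch closure

enum class ClosureSource { None, SharedCache, DiskCache, BuiltAtLaunch };

// The three lookups, injected so the ordering can be exercised in isolation.
struct ClosureSteps {
    std::function<const Closure*()> fromSharedCache;   // role of findClosure()
    std::function<const Closure*()> fromDiskCache;     // role of findCachedLaunchClosure()
    std::function<const Closure*()> build;             // role of buildLaunchClosure()
};

ClosureSource pickClosure(const ClosureSteps& steps, bool allowRebuilds) {
    const Closure* c = steps.fromSharedCache();
    if (c != nullptr && c->valid)
        return ClosureSource::SharedCache;     // valid closure shipped in the cache
    if (!allowRebuilds)
        return ClosureSource::None;            // fall through to the non-closure path
    if (steps.fromDiskCache() != nullptr)
        return ClosureSource::DiskCache;       // closure saved by an earlier launch
    if (steps.build() != nullptr)
        return ClosureSource::BuiltAtLaunch;   // built now
    return ClosureSource::None;
}
```

A stale shared-cache closure with rebuilds allowed falls through to the disk cache and then to a fresh build, which is the situation the DYLD_LAUNCH_MODE_BUILT_CLOSURE_AT_LAUNCH flag records in the listing.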

【3.3.2】Launching the program with mainClosure

// Step 3.2: launch the program in closure mode
    // try using launch closure
    if ( mainClosure != nullptr ) {
        // dyld3 launch begins
        CRSetCrashLogMessage("dyld3: launch started");
        if ( mainClosure->topImage()->fixupsNotEncoded() )
            sLaunchModeUsed |= DYLD_LAUNCH_MODE_MINIMAL_CLOSURE;
        Diagnostics diag;
        bool closureOutOfDate;
        bool recoverable;
        // launch the program with the closure
        bool launched = launchWithClosure(mainClosure, sSharedCacheLoadInfo.loadAddress, (dyld3::MachOLoaded*)mainExecutableMH,
                                          mainExecutableSlide, argc, argv, envp, apple, diag, &result, startGlue, &closureOutOfDate, &recoverable);
        if ( !launched && closureOutOfDate && allowClosureRebuilds ) {
            // closure is out of date, build new one
            // launch failed and the closure is out of date: rebuild mainClosure
            mainClosure = buildLaunchClosure(mainExecutableCDHash, mainFileInfo, envp, bootToken);
            if ( mainClosure != nullptr ) {
                diag.clearError();
                sLaunchModeUsed |= DYLD_LAUNCH_MODE_BUILT_CLOSURE_AT_LAUNCH;
                if ( mainClosure->topImage()->fixupsNotEncoded() )
                    sLaunchModeUsed |= DYLD_LAUNCH_MODE_MINIMAL_CLOSURE;
                else
                    sLaunchModeUsed &= ~DYLD_LAUNCH_MODE_MINIMAL_CLOSURE;
                launched = launchWithClosure(mainClosure, sSharedCacheLoadInfo.loadAddress, (dyld3::MachOLoaded*)mainExecutableMH,
                                             mainExecutableSlide, argc, argv, envp, apple, diag, &result, startGlue, &closureOutOfDate, &recoverable);
            }
        }
        if ( launched ) {
            gLinkContext.startedInitializingMainExecutable = true;
            if (sSkipMain)
                // sSkipMain is set: return fake_main instead of the real entry point
                result = (uintptr_t)&fake_main;
            return result;
        }
        else {
            if ( gLinkContext.verboseWarnings ) {
                dyld::log("dyld: unable to use closure %p\n", mainClosure);
            }
            if ( !recoverable )
                halt(diag.errorMessage());
        }
    }
}

👇 The dyld2 flow follows.

【3.4】Instantiating the main executable

// instantiate ImageLoader for main executable
sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);
gLinkContext.mainExecutable = sMainExecutable;
gLinkContext.mainExecutableCodeSigned = hasCodeSignatureLoadCommand(mainExecutableMH);

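Instantiating the main executable means creating the in-memory representation dyld needs for it: instantiateMainExecutable returns an ImageLoader instance. dyld then records whether the main executable carries a code signature load command (hasCodeSignatureLoadCommand); no signing happens here.

The instantiated image is added to the image array. Note that the main executable's image is the first one added.

Following the calls, the actual instantiation path is: instantiateFromLoadedImage -> instantiateMainExecutable -> sniffLoadCommands

sniffLoadCommands fills in several values (best read alongside a Mach-O file):

compressed: determined by the presence of LC_DYLD_INFO or LC_DYLD_INFO_ONLY
segCount: number of segment load commands, at most 255
libCount: number of dependent libraries, LC_LOAD_DYLIB commands (Foundation, UIKit, and so on), at most 4095
codeSigCmd: the code signature load command
encryptCmd: the encryption info load command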
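
To make the counting concrete, here is a reduced sketch of a load-command walk. It is not dyld's sniffLoadCommands: it only handles the commands named in the list above, the LoadCommand struct and the k-prefixed constants are redefined locally (with the values from <mach-o/loader.h>) so the example builds on any platform, and sniff and SniffResult are invented names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Values as defined in <mach-o/loader.h>, repeated so the sketch builds anywhere.
constexpr uint32_t kLC_REQ_DYLD       = 0x80000000;
constexpr uint32_t kLC_LOAD_DYLIB     = 0xc;
constexpr uint32_t kLC_SEGMENT_64     = 0x19;
constexpr uint32_t kLC_DYLD_INFO      = 0x22;
constexpr uint32_t kLC_DYLD_INFO_ONLY = 0x22 | kLC_REQ_DYLD;

struct LoadCommand { uint32_t cmd; uint32_t cmdsize; };   // same layout as load_command

struct SniffResult { bool compressed = false; unsigned segCount = 0; unsigned libCount = 0; };

// Walk ncmds load commands stored back to back in cmds and count what we see.
SniffResult sniff(const std::vector<uint8_t>& cmds, uint32_t ncmds) {
    SniffResult r;
    std::size_t offset = 0;
    for (uint32_t i = 0; i < ncmds; ++i) {
        LoadCommand lc;
        if (offset + sizeof(lc) > cmds.size())
            throw std::runtime_error("load command extends past buffer");
        std::memcpy(&lc, cmds.data() + offset, sizeof(lc));
        if (lc.cmdsize < sizeof(lc))
            throw std::runtime_error("malformed load command size");
        switch (lc.cmd) {
            case kLC_DYLD_INFO:
            case kLC_DYLD_INFO_ONLY: r.compressed = true; break;
            case kLC_SEGMENT_64:     ++r.segCount;        break;
            case kLC_LOAD_DYLIB:     ++r.libCount;        break;
            default: break;                               // ignored in this sketch
        }
        offset += lc.cmdsize;                             // commands are variable length
    }
    if (r.segCount > 255)  throw std::runtime_error("more than 255 segments");
    if (r.libCount > 4095) throw std::runtime_error("more than 4095 dylibs");
    return r;
}
```

Each command carries its own cmdsize, so the walk advances by that size instead of by a fixed stride; this is why a malformed cmdsize has to be rejected before it is used.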

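【3.5】Loading inserted dynamic libraries

In jailbreak development, the DYLD_INSERT_LIBRARIES environment variable determines whether inserted dynamic libraries are loaded.
Jailbreak tweaks are built on this mechanism: installing a tweak is enough to affect an app. Some protection techniques also rely on checking this environment variable.

【3.6】Linking the main executable

link() is called several times here. It loads dependent dynamic libraries recursively and performs rebasing and symbol binding (non-lazy and weak symbols) for the main executable and its dependencies.

【3.6.1】Linking the main executable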

// link main executable
// start linking the main executable
gLinkContext.linkingMainExecutable = true;      
link(sMainExecutable, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);

【3.6.2】After the main executable is linked, dyld checks whether sInsertedDylibCount is greater than 0 and then links each inserted library in a loop

// link any inserted libraries
// do this after linking main executable so that any dylibs pulled in by inserted 
// dylibs (e.g. libSystem) will not be in front of dylibs the program uses
if ( sInsertedDylibCount > 0 ) {
    for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
        ImageLoader* image = sAllImages[i+1];
        // link the inserted dylib
        link(image, sEnv.DYLD_BIND_AT_LAUNCH, true, ImageLoader::RPathChain(NULL, NULL), -1);
        image->setNeverUnloadRecursive();
    }
    if ( gLinkContext.allowInterposing ) {
        // only INSERTED libraries can interpose
        // register interposing info after all inserted libraries are bound so chaining works
        for(unsigned int i=0; i < sInsertedDylibCount; ++i) {
            ImageLoader* image = sAllImages[i+1];
            // register the inserted dylib's interposing info so that chaining works
            image->registerInterposing(gLinkContext);
        }
    }
}

【3.6.3】The link function

void ImageLoader::link(const LinkContext& context, bool forceLazysBound, bool preflightOnly, bool neverUnload, const RPathChain& loaderRPaths, const char* imagePath)
{
    //dyld::log("ImageLoader::link(%s) refCount=%d, neverUnload=%d\n", imagePath, fDlopenReferenceCount, fNeverUnload);
    
    // clear error strings
    (*context.setErrorStrings)(0, NULL, NULL, NULL);
    // start time, used to measure each phase
    uint64_t t0 = mach_absolute_time();
    // recursively load the libraries this image depends on, then send a notification
    this->recursiveLoadLibraries(context, preflightOnly, loaderRPaths, imagePath);
    context.notifyBatch(dyld_image_state_dependents_mapped, preflightOnly);

    // we only do the loading step for preflights
    if ( preflightOnly )
        return;

    uint64_t t1 = mach_absolute_time();
    context.clearAllDepths();
    this->updateDepth(context.imageCount());

    __block uint64_t t2, t3, t4, t5;
    {
        dyld3::ScopedTimer(DBG_DYLD_TIMING_APPLY_FIXUPS, 0, 0, 0);
        t2 = mach_absolute_time();
        // rebase: apply the ASLR slide
        this->recursiveRebaseWithAccounting(context);
        context.notifyBatch(dyld_image_state_rebased, false);

        t3 = mach_absolute_time();
        if ( !context.linkingMainExecutable )
            // bind non-lazy symbols
            this->recursiveBindWithAccounting(context, forceLazysBound, neverUnload);

        t4 = mach_absolute_time();
        if ( !context.linkingMainExecutable )
            // bind weak symbols
            this->weakBind(context);
        t5 = mach_absolute_time();
    }

    // interpose any dynamically loaded images
    if ( !context.linkingMainExecutable && (fgInterposingTuples.size() != 0) ) {
        dyld3::ScopedTimer timer(DBG_DYLD_TIMING_APPLY_INTERPOSING, 0, 0, 0);
        // recursively apply interposing
        this->recursiveApplyInterposing(context);
    }

    // now that all fixups are done, make __DATA_CONST segments read-only
    if ( !context.linkingMainExecutable )
        this->recursiveMakeDataReadOnly(context);

    if ( !context.linkingMainExecutable )
        context.notifyBatch(dyld_image_state_bound, false);
    uint64_t t6 = mach_absolute_time();

    if ( context.registerDOFs != NULL ) {
        std::vector<DOFInfo> dofs;
        this->recursiveGetDOFSections(context, dofs);
        // register the DOF sections
        context.registerDOFs(dofs);
    }
    // end time
    uint64_t t7 = mach_absolute_time();

    // clear error strings
    (*context.setErrorStrings)(0, NULL, NULL, NULL);

    fgTotalLoadLibrariesTime += t1 - t0;
    fgTotalRebaseTime += t3 - t2;
    fgTotalBindTime += t4 - t3;
    fgTotalWeakBindTime += t5 - t4;
    fgTotalDOF += t7 - t6;
    
    // done with initial dylib loads
    fgNextPIEDylibAddress = 0;
}
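
The rebase step in link() comes down to adding the ASLR slide to every stored pointer that was computed at link time. The following sketch shows just that arithmetic on a toy image; applyRebase and its parameters are invented for this example, and the real rebase information is encoded in LC_DYLD_INFO opcodes or chained fixups rather than in a plain list of offsets.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Add the ASLR slide to every pointer-sized slot listed in rebaseOffsets.
// image stands in for a mapped data segment; offsets are slot indices.
void applyRebase(std::vector<uint64_t>& image,
                 const std::vector<std::size_t>& rebaseOffsets,
                 uint64_t slide) {
    for (std::size_t off : rebaseOffsets)
        image.at(off) += slide;      // link-time address -> run-time address
}
```

For example, with a slide of 0x2A000 a slot holding 0x100004000 becomes 0x10002E000, while slots that are not listed (plain data rather than pointers) are left untouched.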

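【3.6.4】After the inserted libraries are linked, weak binding is performed for the main executable

【3.7】Initializing the main executable: initializeMainExecutable

Next comes the most important part, initializing the main executable with initializeMainExecutable.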

// run all initializers
// initialize the main executable
initializeMainExecutable();

The call chain is: initializeMainExecutable() -> runInitializers() -> processInitializers() -> recursiveInitialization()
Jump-to-definition stops working at this point, so use cmd + shift + o to search for recursiveInitialization.

// let objc know we are about to initialize this image
uint64_t t1 = mach_absolute_time();
fState = dyld_image_state_dependents_initialized;
oldState = fState;
context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
    
// initialize this image
bool hasInitializers = this->doInitialization(context);

// let anyone know we finished initializing this image
fState = dyld_image_state_initialized;
oldState = fState;
context.notifySingle(dyld_image_state_initialized, this, NULL);
    
if ( hasInitializers ) {
    uint64_t t2 = mach_absolute_time();
    timingInfo.addTime(this->getShortName(), t2-t1);
}

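Stepping into notifySingle eventually leads to (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());. sNotifyObjCInit is a callback pointer: it ends up invoking load_images, the callback that _objc_init registers with dyld. Analyzing it requires the objc4 source.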
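
The relationship between the loader and the runtime here is a plain registered callback. The sketch below models it with invented names (registerInitCallback, notifySingleSketch, onImageInit); it is not the dyld API, only the shape of the mechanism: one side stores a function pointer, the other side calls it when an image reaches the right state.

```cpp
#include <string>
#include <vector>

using InitCallback = void (*)(const char* path);

// Slot filled in at registration time, analogous in spirit to sNotifyObjCInit.
static InitCallback gNotifyInit = nullptr;

// Record of the images the callback was told about, for inspection.
std::vector<std::string> gLoadedLog;

// Runtime side: register the function the loader should call.
void registerInitCallback(InitCallback cb) { gNotifyInit = cb; }

// Runtime side: the role load_images plays in this picture.
void onImageInit(const char* path) { gLoadedLog.push_back(path); }

// Loader side: called once an image's dependents are initialized.
void notifySingleSketch(const char* path) {
    if (gNotifyInit != nullptr)
        (*gNotifyInit)(path);
}
```

Images that reach this state before the callback is registered are simply not reported through this path, which is why the order of registration and image initialization matters.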

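👇 The objc4-818.2 source is analyzed next.

【3.8】load_images

Searching the objc source for _objc_init shows that load_images is passed to dyld as a callback there.

Step into load_images.

From there, step into call_load_methods, which calls the +load methods of classes.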

void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            // call the +load methods of classes
            call_class_loads();
        }

        // 2. Call category +loads ONCE
        more_categories = call_category_loads();

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}
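
The loop structure of call_load_methods (drain all class +load calls, run one category pass, repeat while work remains) can be modelled with two queues. This is a simplified sketch with invented names (LoadQueues, runLoads); the real function works on the runtime's loadable_classes and loadable_categories tables, and a category pass can make new classes loadable, which this model does not simulate.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

struct LoadQueues {
    std::deque<std::string> classes;      // pending class +load calls
    std::deque<std::string> categories;   // pending category +load calls
};

// Drain class entries completely, then run one category pass, and repeat
// while either queue still has work. Returns the order of the calls.
std::vector<std::string> runLoads(LoadQueues q) {
    std::vector<std::string> order;
    do {
        while (!q.classes.empty()) {
            order.push_back("class:" + q.classes.front());
            q.classes.pop_front();
        }
        // one category pass: handle only what is queued right now
        const std::size_t n = q.categories.size();
        for (std::size_t i = 0; i < n; ++i) {
            order.push_back("category:" + q.categories.front());
            q.categories.pop_front();
        }
    } while (!q.classes.empty() || !q.categories.empty());
    return order;
}
```

The practical consequence is the familiar rule that a class's +load runs before the +load of any of its categories.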

【3.9】doInitialization

Back in ImageLoader::recursiveInitialization from 【3.7】: once the load_images callback returns, doInitialization runs.

bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
    CRSetCrashLogMessage2(this->getPath());

    // mach-o has -init and static initializers
    doImageInit(context);
    doModInitFunctions(context); // run C++ static initializers
    
    CRSetCrashLogMessage2(NULL);
    
    return (fHasDashInit || fHasInitializers);
}

C++ static initializers correspond to the __mod_init_func section in the __DATA segment of the Mach-O file.
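
A short sketch of the two common ways code ends up among an image's initializers, assuming Clang or GCC: a function marked __attribute__((constructor)) and a global object with a non-trivial constructor. initLog, earlyInit, and Tracker are names made up for this example; both initializers run before main().

```cpp
#include <string>
#include <vector>

// Function-local static so the log exists no matter which initializer runs first.
std::vector<std::string>& initLog() {
    static std::vector<std::string> log;
    return log;
}

// The compiler records this function among the image's initializers, so the
// loader calls it before main().
__attribute__((constructor))
static void earlyInit() {
    initLog().push_back("constructor");
}

// A global with a non-trivial constructor produces an initializer the same way.
struct Tracker {
    Tracker() { initLog().push_back("global-ctor"); }
};
static Tracker gTracker;
```

By the time main() starts, initLog() already holds both entries. Setting a breakpoint in either initializer and running bt should show a stack that passes through doModInitFunctions on the dyld2 path described above.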

【3.10】Finding the main executable's entry point

// find entry point for main executable
// locate the main executable's entry point so that main() can be called
result = (uintptr_t)sMainExecutable->getEntryFromLC_MAIN();
if ( result != 0 ) {
    // main executable uses LC_MAIN, we need to use helper in libdyld to call into main()
    if ( (gLibSystemHelpers != NULL) && (gLibSystemHelpers->version >= 9) )
        *startGlue = (uintptr_t)gLibSystemHelpers->startGlueToCallExit;
    else
        halt("libdyld.dylib support not present for LC_MAIN");
}
else {
    // main executable uses LC_UNIXTHREAD, dyld needs to let "start" in program set up for main()
    result = (uintptr_t)sMainExecutable->getEntryFromLC_UNIXTHREAD();
    *startGlue = 0;
}
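
The two branches above differ in how the entry address is obtained and in who sets up the call to main(). The sketch below captures that distinction under simplifying assumptions: LC_MAIN stores an offset from the start of the image, LC_UNIXTHREAD stores an unslid program counter, and startGlue is reduced to a boolean. EntryInfo, EntryResult, and resolveEntry are invented names.

```cpp
#include <cstdint>

struct EntryInfo {
    bool     hasLCMain;      // true when the image carries LC_MAIN
    uint64_t entryOffset;    // LC_MAIN: offset of main() from the mach_header
    uint64_t threadPC;       // LC_UNIXTHREAD: unslid initial program counter
};

struct EntryResult {
    uint64_t address;        // where execution continues
    bool     needsStartGlue; // true when libdyld's helper must call main()
};

// Hypothetical helper mirroring the LC_MAIN / LC_UNIXTHREAD branch.
EntryResult resolveEntry(const EntryInfo& info, uint64_t loadAddress, uint64_t slide) {
    if (info.hasLCMain)
        return { loadAddress + info.entryOffset, true };   // main() called via glue
    return { info.threadPC + slide, false };               // program's own start code
}
```

This matches the assembly in 【1】: after dyldbootstrap::start returns, __dyld_start checks whether startGlue is zero to decide between jumping straight to the result (LC_UNIXTHREAD) and setting up argc, argv, envp, and apple for main() (LC_MAIN).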

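Summary

The launch path traced in this article is:

_dyld_start -> dyldbootstrap::start() -> dyld::_main()
Inside dyld::_main(): configure the environment, load the shared cache, then either launch through a dyld3 closure or follow the dyld2 path
dyld2 path: instantiate the main executable, load inserted libraries, link, run initializers (notifySingle -> load_images -> +load, then doInitialization for C++ static initializers), and find the entry point of main()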
