iOS App Loading: The `dyld` Chapter

Author: just东东 | Published 2020-09-24 09:18


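Preface

We have explored how objects and classes work at the lowest level of iOS and have a rough understanding of them. So how does an iOS app actually start? What happens from the moment we tap the icon on the home screen? What does the app's life cycle really look like? Let's find out.

1. A First Look at App Loading

First, create a new Single View App project and set a breakpoint in main.m:

[Figure: main]

Looking at the call stack, we find a start frame before main runs. Selecting that frame shows that this step happens inside libdyld.dylib.

[Figure: start]

This means that before reaching main, the app has already gone through a lot of work in dyld. What exactly does dyld do? Let's grab the dyld source from Apple OpenSource and take a look. We use version dyld-733.6.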

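2. Exploring dyld

The dyld project has a very long directory structure, and analyzing it line by line would take forever. We all know that +load is one of the earliest methods called in our own code, so let's set a breakpoint in a +load method and look at the call stack.

[Figure: load]

Here we find a call named _dyld_start. We also see that the breakpoint in +load is hit before the one in main, which shows that +load is called earlier than main.

2.1 _dyld_start

Open the dyld 733.6 source and search globally for _dyld_start. This leads to dyldStartup.s, where we focus on the assembly for the arm64 architecture:

[Figure: _dyld_start]

There is no need to analyze this assembly line by line either; we go straight to the bl branch instruction: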

// call dyldbootstrap::start(app_mh, argc, argv, dyld_mh, &startGlue)
    bl  __ZN13dyldbootstrap5startEPKN5dyld311MachOLoadedEiPPKcS3_Pm

The comment says that this calls the start function in the dyldbootstrap namespace. Let's look into that start function next.

2.2 dyldbootstrap::start

Source of start:

//
//  This is code to bootstrap dyld.  This work in normally done for a program by dyld and crt.
//  In dyld we have to do this manually.
//
uintptr_t start(const dyld3::MachOLoaded* appsMachHeader, int argc, const char* argv[],
                const dyld3::MachOLoaded* dyldsMachHeader, uintptr_t* startGlue)
{

    // Emit kdebug tracepoint to indicate dyld bootstrap has started <rdar://46878536>
    dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);

    // if kernel had to slide dyld, we need to fix up load sensitive locations
    // we have to do this before using any global variables
    rebaseDyld(dyldsMachHeader);

    // kernel sets up env pointer to be just past end of agv array
    const char** envp = &argv[argc+1];
    
    // kernel sets up apple pointer to be just past end of envp array
    const char** apple = envp;
    while(*apple != NULL) { ++apple; }
    ++apple;

    // set up random value for stack canary
    __guard_setup(apple);

#if DYLD_INITIALIZER_SUPPORT
    // run all C++ initializers inside dyld
    runDyldInitializers(argc, argv, envp, apple);
#endif

    // now that we are done bootstrapping dyld, call dyld's main
    uintptr_t appsSlide = appsMachHeader->getSlide();
    return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}

First, it emits a kdebug tracepoint to indicate that dyld bootstrap has started:

// Emit kdebug tracepoint to indicate dyld bootstrap has started <rdar://46878536>
    dyld3::kdebug_trace_dyld_marker(DBG_DYLD_TIMING_BOOTSTRAP_START, 0, 0, 0, 0);

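Next it calls rebaseDyld. That function walks the chained fixups in dyld's own DATA segment and applies the rebase and bind fixups to the pointers there, initializes the mach/syscall layer, and finally marks dyld's read-only data segment (__DATA_CONST) as read-only.

Source of rebaseDyld: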

//
// On disk, all pointers in dyld's DATA segment are chained together.
// They need to be fixed up to be real pointers to run.
//
static void rebaseDyld(const dyld3::MachOLoaded* dyldMH)
{
    // walk all fixups chains and rebase dyld
    const dyld3::MachOAnalyzer* ma = (dyld3::MachOAnalyzer*)dyldMH;
    assert(ma->hasChainedFixups());
    uintptr_t slide = (long)ma; // all fixup chain based images have a base address of zero, so slide == load address
    __block Diagnostics diag;
    ma->withChainStarts(diag, 0, ^(const dyld_chained_starts_in_image* starts) {
        ma->fixupAllChainedFixups(diag, starts, slide, dyld3::Array<const void*>(), nullptr);
    });
    diag.assertNoError();

    // now that rebasing done, initialize mach/syscall layer
    mach_init();

    // <rdar://47805386> mark __DATA_CONST segment in dyld as read-only (once fixups are done)
    ma->forEachSegment(^(const dyld3::MachOFile::SegmentInfo& info, bool& stop) {
        if ( info.readOnlyData ) {
            ::mprotect(((uint8_t*)(dyldMH))+info.vmAddr, (size_t)info.vmSize, VM_PROT_READ);
        }
    });
}

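Next, start locates envp just past the end of the argv array and apple just past the end of the envp array, which is how the kernel lays them out. It then sets up a random value for the stack canary to protect against stack overflows.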

// kernel sets up env pointer to be just past end of agv array
const char** envp = &argv[argc+1];
    
// kernel sets up apple pointer to be just past end of envp array
const char** apple = envp;
while(*apple != NULL) { ++apple; }
++apple;

// set up random value for stack canary
__guard_setup(apple);

Then, if dyld is built with DYLD_INITIALIZER_SUPPORT, it runs all C++ initializers inside dyld:

#if DYLD_INITIALIZER_SUPPORT
    // run all C++ initializers inside dyld
    runDyldInitializers(argc, argv, envp, apple);
#endif

Finally, now that dyld has finished bootstrapping, it reads the slide appsSlide from the app's Mach-O header and calls the _main function in the dyld namespace:

// now that we are done bootstrapping dyld, call dyld's main
uintptr_t appsSlide = appsMachHeader->getSlide();
return dyld::_main((macho_header*)appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);

2.3 dyld::_main

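Searching for namespace dyld { brings us to dyld2.cpp, where we find _main. It is more than 600 lines long, so reading it line by line would not be wise. Instead we focus on the commented parts, since those mark the most important steps.

First, the comment on _main itself:

Entry point for dyld. The kernel loads dyld and jumps to __dyld_start, which sets up some registers and calls this function (_main). It returns the address of main() in the target program, which __dyld_start then jumps to.

[Figure: kernel flag check]

It then checks whether any kernel flags are set.

[Figure: cdHash value]

It reads the main executable's cdHash from the environment variables. This hash, mainExecutableCDHash, is used later to validate the dyld3 launch closure.

[Figure: simulator loading]

It traces the loading of dyld, then checks whether this is the simulator environment. If it is not the simulator, it traces the loading of both dyld and the main executable.

[Figure: platform ID]

It sets the platform ID in the all-image infos so that debuggers can tell the process type. According to the source comment, this can be removed once the kernel handles it in rdar://43369446.

It checks whether the execution environment is macOS and, if so, whether the platform needs to be overridden. It then checks whether the DYLD_ROOT_PATH environment variable is set. If it is, dyld checks whether the simulator has its own dyld; if so that dyld is used, otherwise an error message is returned.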

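Log: dyld launch starts
Set up the context based on the arguments passed in
Get a pointer to the exec path
When not in the simulator and targeting iPhone OS, fix up the path if the kernel did not pass the real path of the main executable
Check whether the exec path is absolute; if it is relative, convert it to an absolute path using cwd

For later log output, extract the process name from the exec path (strrchr finds the last occurrence of its second argument and returns a pointer to the string from that position to the end)
Configure process restrictions based on the contents of the app main executable's Mach-O header
Check whether we should force dyld3. Note that, because of AMFI, this has to be done outside the regular env parsing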

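Check whether the execution environment is macOS. If so, check whether certain context properties are already configured; if not, run the context setup once more
Check the environment variables
Initialize the default fallback paths
If the app's Mach-O platform is iOSMac, reset the context's root paths, mark iOSonMac as true, and then check and set up some fallback paths
If the platform is driverKit rather than iOSMac, apply some further configuration

Print some information depending on the environment variables
Parse some environment variables with fault tolerance
Exit if TMPDIR does not have the expected format

Get the Mach-O info
Check whether the shared cache is enabled; if it is, map the shared cache into the current process's virtual address space

If the closure mode has not been decided yet, check the environment and the cache type

Outside the simulator, check the closure mode. If it is Off, check the context's warning setting and print a log message
Otherwise, handle the closure. Because dyld3 creates a launch closure, we need to read it. dyld first looks for an existing launch closure in the cache. Launch closures for system apps live in the shared cache, while those for our own apps are built when the app is installed or updated, so checking dyld's cache here makes sense.
dyld only tries to build a closure at runtime if this is an iOS third-party binary, or a macOS binary from the shared cache
If no valid cached closure was found, try to build a new one
Exit dyld after the closure is built, without running the program, when dyld was only asked to build the closure
If a valid closure was found, try to launch with it
- Log: dyld3 launch starts
- If the launch fails, build a new launch closure and try again
- If the launch succeeds, the pointer returned from _main has to be signed, because start() calls it as a function pointer.
If no closure is in use, fall back to the old launch path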

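Start accepting gdb notifications
Make the initial allocations large enough that they are unlikely to need to be re-allocated

Add the gating mechanism that supports the system order file generation process
Most importantly, addDyldImageToUUIDList adds dyld's own image to the UUID list, mainly so that stack traces can be symbolicated

reloadAllImages:

This is the main event: loading all the images. We only look at the key points, because a lot of logic is involved here, including relaxed strictness for old Mach-O binaries and different handling for simulators, iOS, tvOS, and watchOS.

Next we come to the following code: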

// instantiate ImageLoader for main executable
        sMainExecutable = instantiateFromLoadedImage(mainExecutableMH, mainExecutableSlide, sExecPath);

Judging by the function's name, this instantiates our main executable. Let's step into it:

// The kernel maps in main executable before dyld gets control.  We need to 
// make an ImageLoader* for the already mapped in main executable.
static ImageLoaderMachO* instantiateFromLoadedImage(const macho_header* mh, uintptr_t slide, const char* path)
{
    // try mach-o loader
    if ( isCompatibleMachO((const uint8_t*)mh, path) ) {
        ImageLoader* image = ImageLoaderMachO::instantiateMainExecutable(mh, slide, path, gLinkContext);
        addImage(image);
        return (ImageLoaderMachO*)image;
    }
    
    throw "main executable not a known format";
}

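As the comment explains, the kernel maps in the main executable before dyld gets control, so dyld needs to create an ImageLoader for the already mapped main executable. The code shows that the object is actually created by ImageLoaderMachO::instantiateMainExecutable, so we jump to that function.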

// create image for main executable
ImageLoader* ImageLoaderMachO::instantiateMainExecutable(const macho_header* mh, uintptr_t slide, const char* path, const LinkContext& context)
{
    //dyld::log("ImageLoader=%ld, ImageLoaderMachO=%ld, ImageLoaderMachOClassic=%ld, ImageLoaderMachOCompressed=%ld\n",
    //  sizeof(ImageLoader), sizeof(ImageLoaderMachO), sizeof(ImageLoaderMachOClassic), sizeof(ImageLoaderMachOCompressed));
    bool compressed;
    unsigned int segCount;
    unsigned int libCount;
    const linkedit_data_command* codeSigCmd;
    const encryption_info_command* encryptCmd;
    sniffLoadCommands(mh, path, false, &compressed, &segCount, &libCount, context, &codeSigCmd, &encryptCmd);
    // instantiate concrete class based on content of load commands
    if ( compressed ) 
        return ImageLoaderMachOCompressed::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
    else
#if SUPPORT_CLASSIC_MACHO
        return ImageLoaderMachOClassic::instantiateMainExecutable(mh, slide, path, segCount, libCount, context);
#else
        throw "missing LC_DYLD_INFO load command";
#endif
}

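Going further, the function above requires a look at sniffLoadCommands. It is yet another very long function, which is painful to read. Its comment tells us that the main job of sniffLoadCommands is to determine whether the Mach-O file being loaded has classic or compressed LINKEDIT, and to count its segments and libraries (segCount and libCount in the call above). Now that the purpose of the function is clear, and since all we want is the ImageLoader, we go straight back. After sniffLoadCommands returns, the ImageLoader is instantiated by ImageLoaderMachOCompressed::instantiateMainExecutable if LINKEDIT is in the compressed format, or by ImageLoaderMachOClassic::instantiateMainExecutable if it is in the classic format.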

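With the ImageLoader in hand, dyld loads the inserted dynamic libraries and records how many were inserted, so that the order is inserted libraries first, then the main executable, then the others.

Once loading is done, the libraries are linked:

First the main executable is linked
Then the inserted dynamic libraries are linked

Now let's look at the link function: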

void link(ImageLoader* image, bool forceLazysBound, bool neverUnload, const ImageLoader::RPathChain& loaderRPaths, unsigned cacheIndex)
{
    // add to list of known images.  This did not happen at creation time for bundles
    if ( image->isBundle() && !image->isLinked() )
        addImage(image);

    // we detect root images as those not linked in yet 
    if ( !image->isLinked() )
        addRootImage(image);
    
    // process images
    try {
        const char* path = image->getPath();
#if SUPPORT_ACCELERATE_TABLES
        if ( image == sAllCacheImagesProxy )
            path = sAllCacheImagesProxy->getIndexedPath(cacheIndex);
#endif
        image->link(gLinkContext, forceLazysBound, false, neverUnload, loaderRPaths, path);
    }
    catch (const char* msg) {
        garbageCollectImages();
        throw;
    }
}

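Inside link, the call to image->link(...) works recursively, which is how the libraries referenced by a library get linked as well. The isLinked() checks make sure an image that has already been linked is not registered again: an unlinked bundle is added to the list of known images, and any unlinked image is added as a root image.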

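dyld should support interposing even when the DYLD_INSERT_LIBRARIES environment variable is not set

If accelerator tables are supported, process them

Apply interposing to the initial set of images
Bind and notify for the main executable now that interposing has been registered
Bind and notify for the inserted images now that interposing has been registered
Do weak binding only after all inserted images are linked
Set up the read-only context

Log that dyld is starting to run initializers
Call initializeMainExecutable to run the initialization

Source of initializeMainExecutable: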

void initializeMainExecutable()
{
    // record that we've reached this step
    gLinkContext.startedInitializingMainExecutable = true;

    // run initialzers for any inserted dylibs
    ImageLoader::InitializerTimingList initializerTimes[allImagesCount()];
    initializerTimes[0].count = 0;
    const size_t rootCount = sImageRoots.size();
    if ( rootCount > 1 ) {
        for(size_t i=1; i < rootCount; ++i) {
            sImageRoots[i]->runInitializers(gLinkContext, initializerTimes[0]);
        }
    }
    
    // run initializers for main executable and everything it brings up 
    sMainExecutable->runInitializers(gLinkContext, initializerTimes[0]);
    
    // register cxa_atexit() handler to run static terminators in all loaded images when this process exits
    if ( gLibSystemHelpers != NULL ) 
        (*gLibSystemHelpers->cxa_atexit)(&runAllStaticTerminators, NULL, NULL);

    // dump info if requested
    if ( sEnv.DYLD_PRINT_STATISTICS )
        ImageLoader::printStatistics((unsigned int)allImagesCount(), initializerTimes[0]);
    if ( sEnv.DYLD_PRINT_STATISTICS_DETAILS )
        ImageLoaderMachO::printStatisticsDetails((unsigned int)allImagesCount(), initializerTimes[0]);
}

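First, initializers are run for all inserted dynamic libraries that have been linked
Then initializers are run for the main executable and everything it brings in
A cxa_atexit() handler is registered to run the static terminators of all loaded images when the process exits
Statistics are dumped if requested

Source of runInitializers: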

void ImageLoader::runInitializers(const LinkContext& context, InitializerTimingList& timingInfo)
{
    uint64_t t1 = mach_absolute_time();
    mach_port_t thisThread = mach_thread_self();
    ImageLoader::UninitedUpwards up;
    up.count = 1;
    up.imagesAndPaths[0] = { this, this->getPath() };
    processInitializers(context, thisThread, timingInfo, up);
    context.notifyBatch(dyld_image_state_initialized, false);
    mach_port_deallocate(mach_task_self(), thisThread);
    uint64_t t2 = mach_absolute_time();
    fgTotalInitTime += (t2 - t1);
}

Inside runInitializers, we continue on to processInitializers.
Source of processInitializers:

// <rdar://problem/14412057> upward dylib initializers can be run too soon
// To handle dangling dylibs which are upward linked but not downward, all upward linked dylibs
// have their initialization postponed until after the recursion through downward dylibs
// has completed.
void ImageLoader::processInitializers(const LinkContext& context, mach_port_t thisThread,
                                     InitializerTimingList& timingInfo, ImageLoader::UninitedUpwards& images)
{
    uint32_t maxImageCount = context.imageCount()+2;
    ImageLoader::UninitedUpwards upsBuffer[maxImageCount];
    ImageLoader::UninitedUpwards& ups = upsBuffer[0];
    ups.count = 0;
    // Calling recursive init on all images in images list, building a new list of
    // uninitialized upward dependencies.
    for (uintptr_t i=0; i < images.count; ++i) {
        images.imagesAndPaths[i].first->recursiveInitialization(context, thisThread, images.imagesAndPaths[i].second, timingInfo, ups);
    }
    // If any upward dependencies remain, init them.
    if ( ups.count > 0 )
        processInitializers(context, thisThread, timingInfo, ups);
}

Following the source, we continue on to recursiveInitialization.

Source of recursiveInitialization:


void ImageLoader::recursiveInitialization(const LinkContext& context, mach_port_t this_thread, const char* pathToInitialize,
                                          InitializerTimingList& timingInfo, UninitedUpwards& uninitUps)
{
    recursive_lock lock_info(this_thread);
    recursiveSpinLock(lock_info);

    if ( fState < dyld_image_state_dependents_initialized-1 ) {
        uint8_t oldState = fState;
        // break cycles
        fState = dyld_image_state_dependents_initialized-1;
        try {
            // initialize lower level libraries first
            for(unsigned int i=0; i < libraryCount(); ++i) {
                ImageLoader* dependentImage = libImage(i);
                if ( dependentImage != NULL ) {
                    // don't try to initialize stuff "above" me yet
                    if ( libIsUpward(i) ) {
                        uninitUps.imagesAndPaths[uninitUps.count] = { dependentImage, libPath(i) };
                        uninitUps.count++;
                    }
                    else if ( dependentImage->fDepth >= fDepth ) {
                        dependentImage->recursiveInitialization(context, this_thread, libPath(i), timingInfo, uninitUps);
                    }
                }
            }
            
            // record termination order
            if ( this->needsTermination() )
                context.terminationRecorder(this);

            // let objc know we are about to initialize this image
            uint64_t t1 = mach_absolute_time();
            fState = dyld_image_state_dependents_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_dependents_initialized, this, &timingInfo);
            
            // initialize this image
            bool hasInitializers = this->doInitialization(context);

            // let anyone know we finished initializing this image
            fState = dyld_image_state_initialized;
            oldState = fState;
            context.notifySingle(dyld_image_state_initialized, this, NULL);
            
            if ( hasInitializers ) {
                uint64_t t2 = mach_absolute_time();
                timingInfo.addTime(this->getShortName(), t2-t1);
            }
        }
        catch (const char* msg) {
            // this image is not initialized
            fState = oldState;
            recursiveSpinUnLock();
            throw;
        }
    }
    
    recursiveSpinUnLock();
}

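Next we look at notifySingle:

[Figure: notifySingle]

Line 938 in the figure is the code that gets the image's real path; it does so in the call to sNotifyObjCInit.

So where is sNotifyObjCInit set? A global search shows that it is assigned in registerObjCNotifiers.

Source of registerObjCNotifiers: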

void registerObjCNotifiers(_dyld_objc_notify_mapped mapped, _dyld_objc_notify_init init, _dyld_objc_notify_unmapped unmapped)
{
    // record functions to call
    sNotifyObjCMapped   = mapped;
    sNotifyObjCInit     = init;
    sNotifyObjCUnmapped = unmapped;

    // call 'mapped' function with all images mapped so far
    try {
        notifyBatchPartial(dyld_image_state_bound, true, NULL, false, true);
    }
    catch (const char* msg) {
        // ignore request to abort during registration
    }

    // <rdar://problem/32209809> call 'init' function on all images already init'ed (below libSystem)
    for (std::vector<ImageLoader*>::iterator it=sAllImages.begin(); it != sAllImages.end(); it++) {
        ImageLoader* image = *it;
        if ( (image->getState() == dyld_image_state_initialized) && image->notifyObjC() ) {
            dyld3::ScopedTimer timer(DBG_DYLD_TIMING_OBJC_INIT, (uint64_t)image->machHeader(), 0, 0);
            (*sNotifyObjCInit)(image->getRealPath(), image->machHeader());
        }
    }
}

So who calls registerObjCNotifiers? Searching for registerObjCNotifiers leads us to the _dyld_objc_notify_register function.

Source of _dyld_objc_notify_register:

void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped)
{
    dyld::registerObjCNotifiers(mapped, init, unmapped);
}

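That still leaves a question: where is _dyld_objc_notify_register called from? Another global search turns up no suitable call site. We do, however, find the comment below: only for use by the objc runtime. That tells us the call is not in the dyld source, so we have to move to the libobjc source to continue.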

//
// Note: only for use by objc runtime
// Register handlers to be called when objc images are mapped, unmapped, and initialized.
// Dyld will call back the "mapped" function with an array of images that contain an objc-image-info section.
// Those images that are dylibs will have the ref-counts automatically bumped, so objc will no longer need to
// call dlopen() on them to keep them from being unloaded.  During the call to _dyld_objc_notify_register(),
// dyld will call the "mapped" function with already loaded objc images.  During any later dlopen() call,
// dyld will also call the "mapped" function.  Dyld will call the "init" function when dyld would be called
// initializers in that image.  This is when objc calls any +load methods in that image.
//
void _dyld_objc_notify_register(_dyld_objc_notify_mapped    mapped,
                                _dyld_objc_notify_init      init,
                                _dyld_objc_notify_unmapped  unmapped);

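Sure enough, in the objc source we find the call to _dyld_objc_notify_register in the _objc_init function. The version used here is objc4-779.1.

After this long chain of jumps it is easy to feel dizzy, but let's stay calm. What happens here is that callbacks are registered with dyld, so that the Runtime knows when an image has finished loading. ImageLoader::recursiveInitialization contains this line: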

// initialize this image
bool hasInitializers = this->doInitialization(context);

This is where the initialization is actually performed.

Source of doInitialization:

bool ImageLoaderMachO::doInitialization(const LinkContext& context)
{
    CRSetCrashLogMessage2(this->getPath());

    // mach-o has -init and static initializers
    doImageInit(context);
    doModInitFunctions(context);
    
    CRSetCrashLogMessage2(NULL);
    
    return (fHasDashInit || fHasInitializers);
}

The source of ImageLoaderMachO::doInitialization shows two main operations: doImageInit and doModInitFunctions.

Source of doImageInit:

void ImageLoaderMachO::doImageInit(const LinkContext& context)
{
    if ( fHasDashInit ) {
        const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
        const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
        const struct load_command* cmd = cmds;
        for (uint32_t i = 0; i < cmd_count; ++i) {
            switch (cmd->cmd) {
                case LC_ROUTINES_COMMAND:
                    Initializer func = (Initializer)(((struct macho_routines_command*)cmd)->init_address + fSlide);
#if __has_feature(ptrauth_calls)
                    func = (Initializer)__builtin_ptrauth_sign_unauthenticated((void*)func, ptrauth_key_asia, 0);
#endif
                    // <rdar://problem/8543820&9228031> verify initializers are in image
                    if ( ! this->containsAddress(stripPointer((void*)func)) ) {
                        dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
                    }
                    if ( ! dyld::gProcessInfo->libSystemInitialized ) {
                        // <rdar://problem/17973316> libSystem initializer must run first
                        dyld::throwf("-init function in image (%s) that does not link with libSystem.dylib\n", this->getPath());
                    }
                    if ( context.verboseInit )
                        dyld::log("dyld: calling -init function %p in %s\n", func, this->getPath());
                    {
                        dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                        func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                    }
                    break;
            }
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
    }
}

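Inside doImageInit, the initializer func is obtained as init_address plus the slide fSlide, and on builds with pointer authentication the pointer is then signed. Next it verifies that the initializer lies within the image and that libSystem has already been initialized, and only then calls the initializer.

Source of doModInitFunctions: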

void ImageLoaderMachO::doModInitFunctions(const LinkContext& context)
{
    if ( fHasInitializers ) {
        const uint32_t cmd_count = ((macho_header*)fMachOData)->ncmds;
        const struct load_command* const cmds = (struct load_command*)&fMachOData[sizeof(macho_header)];
        const struct load_command* cmd = cmds;
        for (uint32_t i = 0; i < cmd_count; ++i) {
            if ( cmd->cmd == LC_SEGMENT_COMMAND ) {
                const struct macho_segment_command* seg = (struct macho_segment_command*)cmd;
                const struct macho_section* const sectionsStart = (struct macho_section*)((char*)seg + sizeof(struct macho_segment_command));
                const struct macho_section* const sectionsEnd = &sectionsStart[seg->nsects];
                for (const struct macho_section* sect=sectionsStart; sect < sectionsEnd; ++sect) {
                    const uint8_t type = sect->flags & SECTION_TYPE;
                    if ( type == S_MOD_INIT_FUNC_POINTERS ) {
                        Initializer* inits = (Initializer*)(sect->addr + fSlide);
                        const size_t count = sect->size / sizeof(uintptr_t);
                        // <rdar://problem/23929217> Ensure __mod_init_func section is within segment
                        if ( (sect->addr < seg->vmaddr) || (sect->addr+sect->size > seg->vmaddr+seg->vmsize) || (sect->addr+sect->size < sect->addr) )
                            dyld::throwf("__mod_init_funcs section has malformed address range for %s\n", this->getPath());
                        for (size_t j=0; j < count; ++j) {
                            Initializer func = inits[j];
                            // <rdar://problem/8543820&9228031> verify initializers are in image
                            if ( ! this->containsAddress(stripPointer((void*)func)) ) {
                                dyld::throwf("initializer function %p not in mapped image for %s\n", func, this->getPath());
                            }
                            if ( ! dyld::gProcessInfo->libSystemInitialized ) {
                                // <rdar://problem/17973316> libSystem initializer must run first
                                const char* installPath = getInstallPath();
                                if ( (installPath == NULL) || (strcmp(installPath, libSystemPath(context)) != 0) )
                                    dyld::throwf("initializer in image (%s) that does not link with libSystem.dylib\n", this->getPath());
                            }
                            if ( context.verboseInit )
                                dyld::log("dyld: calling initializer function %p in %s\n", func, this->getPath());
                            bool haveLibSystemHelpersBefore = (dyld::gLibSystemHelpers != NULL);
                            {
                                dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                                func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                            }
                            bool haveLibSystemHelpersAfter = (dyld::gLibSystemHelpers != NULL);
                            if ( !haveLibSystemHelpersBefore && haveLibSystemHelpersAfter ) {
                                // now safe to use malloc() and other calls in libSystem.dylib
                                dyld::gProcessInfo->libSystemInitialized = true;
                            }
                        }
                    }
                    else if ( type == S_INIT_FUNC_OFFSETS ) {
                        const uint32_t* inits = (uint32_t*)(sect->addr + fSlide);
                        const size_t count = sect->size / sizeof(uint32_t);
                        // Ensure section is within segment
                        if ( (sect->addr < seg->vmaddr) || (sect->addr+sect->size > seg->vmaddr+seg->vmsize) || (sect->addr+sect->size < sect->addr) )
                            dyld::throwf("__init_offsets section has malformed address range for %s\n", this->getPath());
                        if ( seg->initprot & VM_PROT_WRITE )
                            dyld::throwf("__init_offsets section is not in read-only segment %s\n", this->getPath());
                        for (size_t j=0; j < count; ++j) {
                            uint32_t funcOffset = inits[j];
                            // verify initializers are in TEXT segment
                            if ( funcOffset > seg->filesize ) {
                                dyld::throwf("initializer function offset 0x%08X not in mapped image for %s\n", funcOffset, this->getPath());
                            }
                            if ( ! dyld::gProcessInfo->libSystemInitialized ) {
                                // <rdar://problem/17973316> libSystem initializer must run first
                                const char* installPath = getInstallPath();
                                if ( (installPath == NULL) || (strcmp(installPath, libSystemPath(context)) != 0) )
                                    dyld::throwf("initializer in image (%s) that does not link with libSystem.dylib\n", this->getPath());
                            }
                            Initializer func = (Initializer)((uint8_t*)this->machHeader() + funcOffset);
                            if ( context.verboseInit )
                                dyld::log("dyld: calling initializer function %p in %s\n", func, this->getPath());
                            bool haveLibSystemHelpersBefore = (dyld::gLibSystemHelpers != NULL);
                            {
                                dyld3::ScopedTimer(DBG_DYLD_TIMING_STATIC_INITIALIZER, (uint64_t)fMachOData, (uint64_t)func, 0);
                                func(context.argc, context.argv, context.envp, context.apple, &context.programVars);
                            }
                            bool haveLibSystemHelpersAfter = (dyld::gLibSystemHelpers != NULL);
                            if ( !haveLibSystemHelpersBefore && haveLibSystemHelpersAfter ) {
                                // now safe to use malloc() and other calls in libSystem.dylib
                                dyld::gProcessInfo->libSystemInitialized = true;
                            }
                        }
                    }
                }
            }
            cmd = (const struct load_command*)(((char*)cmd)+cmd->cmdsize);
        }
    }
}

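doModInitFunctions first checks fHasInitializers, then walks the load commands and, for every section of type S_MOD_INIT_FUNC_POINTERS or S_INIT_FUNC_OFFSETS, calls each initializer function in it.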

Now we go back to the _main function.

It notifies any monitoring processes that this process is about to enter main():

// notify any montoring proccesses that this process is about to enter main()
notifyMonitoringDyldMain();

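Finally, result is assigned according to the different versions, with some fault-tolerance handling, and result is returned.

Summary:

With that, our rough analysis of the launch path through dyld, from dyldbootstrap::start through dyld::_main, is complete. The process is fairly complex and I only half understand parts of it. dyld is a powerful tool and, with my limited knowledge, I cannot analyze it thoroughly; corrections are welcome.

3. A First Look at _objc_init

Note: objc4-779.1 is used here, and the _objc_init function is in the objc-os.mm file.

Based on the analysis above, we move to the objc4-779.1 source. Following the call relationships found so far, we set a breakpoint in _objc_init and look at the call stack to verify our analysis.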

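[Figure: function call stack]

The call stack clearly shows the order of calls here; the last function called inside dyld is doModInitFunctions. From this we can establish the following order: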

dyld->libSystem:libSystem_initializer->libdispatch:libdispatch_init->libdispatch:_os_object_init->libobjc:_objc_init

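We open the libSystem source, search globally for libSystem_initializer, and inside that function find the call to libdispatch_init.

[Figure: libSystem_initializer]

Likewise, in the libdispatch source we search for libdispatch_init, and inside that function find the call to _os_object_init.

[Figure: libdispatch_init]

Still in the libdispatch source, we look up _os_object_init, which confirms the call to _objc_init.

[Figure: _os_object_init]

We have now found where _objc_init is called, and this first look ends here. What _objc_init does internally will be analyzed in detail in a later article.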

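4. Summary

This article started from the start frame seen at app launch and used it as the entry point for exploring dyld
From _dyld_start in dyld's assembly we found dyldbootstrap::start
dyldbootstrap::start calls rebaseDyld to rebase and bind the pointers in the DATA segment of dyld's own Mach-O, and initializes the mach/syscall layer. The start function then performs several more initialization steps
dyldbootstrap::start then calls dyld::_main
_main initializes some context, distinguishes between loading environments, enables various log output, starts accepting notifications, and handles the dyld2 path and the dyld3 closure accordingly, before reaching the main event, reloadAllImages, which loads all the images
Loading the images begins by instantiating an ImageLoader; the inserted dynamic libraries are then loaded and their count recorded, with inserted libraries ordered first, then the main executable, then the others. After loading comes linking
Linking handles the main executable first and the inserted dynamic libraries second. It works recursively, which is how libraries referenced by other libraries get linked. After linking, interposing is applied to the initial set of images
Next, binding and notification are done for the main executable, weak binding is performed, and the read-only context is set up. Then all the dynamic libraries are initialized: first the inserted libraries that have been linked, then the main executable
During initialization the image's real path is obtained and passed along, and by following the chain of calls we found that the concrete initialization work happens in objc's _objc_init function
Finally, notifyMonitoringDyldMain notifies any monitoring processes that this process is about to enter main()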
