Mach-O File Structure

Author: Rimson | Published 2019-05-23 00:35

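    The source code quoted in this article comes from Apple's open source website.

    What is Mach-O

    Mach-O is short for the Mach Object file format. It is the file format used on iOS and macOS for executables, object code, dynamic libraries, and core dumps.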

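    The Mach-O file format

    Apple's official diagram of the file structure:


    Figure: Mach-O file structure

    We write a HelloWorld program, compile it, and open the resulting .out file in MachOView:

    This shows that a Mach-O file consists of three parts: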

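    • Header: basic information such as the CPU architecture, the file type, and the number of load commands.
    • Load Commands: describe the layout of the file and how to load it, most importantly how each segment is mapped into memory. A Mach-O file can contain multiple segments, and each segment contains zero or more sections.
    • Data: the actual contents of the segments, including code and data.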

    Header

    /*
     * The 32-bit mach header appears at the very beginning of the object file for
     * 32-bit architectures.
     */
    struct mach_header {
        uint32_t    magic;      /* mach magic number identifier */
        cpu_type_t  cputype;    /* cpu specifier */
        cpu_subtype_t   cpusubtype; /* machine specifier */
        uint32_t    filetype;   /* type of file */
        uint32_t    ncmds;      /* number of load commands */
        uint32_t    sizeofcmds; /* the size of all the load commands */
        uint32_t    flags;      /* flags */
    };
    
    /*
     * The 64-bit mach header appears at the very beginning of object files for
     * 64-bit architectures.
     */
    struct mach_header_64 {
        uint32_t    magic;      /* mach magic number identifier */
        cpu_type_t  cputype;    /* cpu specifier */
        cpu_subtype_t   cpusubtype; /* machine specifier */
        uint32_t    filetype;   /* type of file */
        uint32_t    ncmds;      /* number of load commands */
        uint32_t    sizeofcmds; /* the size of all the load commands */
        uint32_t    flags;      /* flags */
        uint32_t    reserved;   /* reserved */
    };
    
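    • magic: the magic number. 0xfeedface (MH_MAGIC) marks a 32-bit file and 0xfeedfacf (MH_MAGIC_64) marks a 64-bit file. 0xcefaedfe (MH_CIGAM) and 0xcffaedfe (MH_CIGAM_64) are the same values byte-swapped, meaning the file's byte order is the opposite of the reader's.
    /* Constant for the magic field of the mach_header (32-bit architectures) */
    #define MH_MAGIC    0xfeedface  /* the mach magic number */
    #define MH_CIGAM    0xcefaedfe  /* NXSwapInt(MH_MAGIC) */
    /* Constant for the magic field of the mach_header_64 (64-bit architectures) */
    #define MH_MAGIC_64 0xfeedfacf  /* the 64-bit mach magic number */
    #define MH_CIGAM_64 0xcffaedfe  /* NXSwapInt(MH_MAGIC_64) */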
    
    • cputype: the CPU type
    • cpusubtype: the specific CPU model within that type
    • filetype: the file type, for example executable or library
      The macros defined for filetype are:
    #define MH_OBJECT   0x1     /* relocatable object file */
    #define MH_EXECUTE  0x2     /* demand paged executable file */
    #define MH_FVMLIB   0x3     /* fixed VM shared library file */
    #define MH_CORE     0x4     /* core file */
    #define MH_PRELOAD  0x5     /* preloaded executable file */
    #define MH_DYLIB    0x6     /* dynamically bound shared library */
    #define MH_DYLINKER 0x7     /* dynamic link editor */
    #define MH_BUNDLE   0x8     /* dynamically bound bundle file */
    #define MH_DYLIB_STUB   0x9     /* shared library stub for static */
                        /*  linking only, no section contents */
    #define MH_DSYM     0xa     /* companion file with only debug */
                        /*  sections */
    #define MH_KEXT_BUNDLE  0xb     /* x86_64 kexts */
    
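    • ncmds: the number of load commands
    • sizeofcmds: the total size in bytes of all load commands
    • flags: a bit field describing further properties of the file
    • reserved: a reserved field that exists only in the 64-bit header and is currently unused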
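
    As a minimal sketch of how the magic field can be used, the function below reads the first four bytes of a buffer and reports whether it looks like a 32-bit or 64-bit Mach-O file and whether the fields need byte-swapping. The struct magic_info type and the classify_magic name are hypothetical and exist only for this example. The constants are redefined locally so the code also compiles without the Mach-O headers, and universal (fat) binaries are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Magic values as defined in <mach-o/loader.h>, redefined here so the
 * sketch compiles on systems that do not ship the Mach-O headers. */
#define MH_MAGIC     0xfeedfaceu /* 32-bit, same byte order as the reader */
#define MH_CIGAM     0xcefaedfeu /* 32-bit, byte-swapped */
#define MH_MAGIC_64  0xfeedfacfu /* 64-bit, same byte order as the reader */
#define MH_CIGAM_64  0xcffaedfeu /* 64-bit, byte-swapped */

/* Hypothetical result type for this example. */
struct magic_info {
    int is_macho;       /* 1 if the magic is one of the four values above */
    int is_64;          /* 1 for mach_header_64, 0 for mach_header */
    int swapped;        /* 1 if every header field must be byte-swapped */
    size_t header_size; /* 28 or 32 bytes; the load commands start here */
};

struct magic_info classify_magic(const uint8_t *buf, size_t len)
{
    struct magic_info info = {0, 0, 0, 0};
    uint32_t magic;

    assert(buf != NULL);
    if (len < sizeof magic)
        return info;
    memcpy(&magic, buf, sizeof magic); /* memcpy avoids unaligned reads */

    switch (magic) {
    case MH_MAGIC:    info.is_macho = 1; break;
    case MH_CIGAM:    info.is_macho = 1; info.swapped = 1; break;
    case MH_MAGIC_64: info.is_macho = 1; info.is_64 = 1; break;
    case MH_CIGAM_64: info.is_macho = 1; info.is_64 = 1; info.swapped = 1; break;
    default:          return info; /* not a thin Mach-O file */
    }
    /* mach_header has seven 4-byte fields; mach_header_64 adds reserved. */
    info.header_size = info.is_64 ? 32 : 28;
    return info;
}
```

    Because the load commands follow the header directly, header_size is also the file offset of the first load command.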

    The macros defined for the flags bit field are:

    #define MH_NOUNDEFS 0x1     /* the object file has no undefined
                           references */
    #define MH_INCRLINK 0x2     /* the object file is the output of an
                           incremental link against a base file
                           and can't be link edited again */
    #define MH_DYLDLINK 0x4     /* the object file is input for the
                           dynamic linker and can't be staticly
                           link edited again */
    #define MH_BINDATLOAD   0x8     /* the object file's undefined
                           references are bound by the dynamic
                           linker when loaded. */
    #define MH_PREBOUND 0x10        /* the file has its dynamic undefined
                           references prebound. */
    #define MH_SPLIT_SEGS   0x20        /* the file has its read-only and
                           read-write segments split */
    #define MH_LAZY_INIT    0x40        /* the shared library init routine is
                           to be run lazily via catching memory
                           faults to its writeable segments
                           (obsolete) */
    #define MH_TWOLEVEL 0x80        /* the image is using two-level name
                           space bindings */
    #define MH_FORCE_FLAT   0x100       /* the executable is forcing all images
                           to use flat name space bindings */
    #define MH_NOMULTIDEFS  0x200       /* this umbrella guarantees no multiple
                           defintions of symbols in its
                           sub-images so the two-level namespace
                           hints can always be used. */
    #define MH_NOFIXPREBINDING 0x400    /* do not have dyld notify the
                           prebinding agent about this
                           executable */
    #define MH_PREBINDABLE  0x800           /* the binary is not prebound but can
                           have its prebinding redone. only used
                                               when MH_PREBOUND is not set. */
    #define MH_ALLMODSBOUND 0x1000      /* indicates that this binary binds to
                                               all two-level namespace modules of
                           its dependent libraries. only used
                           when MH_PREBINDABLE and MH_TWOLEVEL
                           are both set. */ 
    #define MH_SUBSECTIONS_VIA_SYMBOLS 0x2000/* safe to divide up the sections into
                            sub-sections via symbols for dead
                            code stripping */
    #define MH_CANONICAL    0x4000      /* the binary has been canonicalized
                           via the unprebind operation */
    #define MH_WEAK_DEFINES 0x8000      /* the final linked image contains
                           external weak symbols */
    #define MH_BINDS_TO_WEAK 0x10000    /* the final linked image uses
                           weak symbols */
    
    #define MH_ALLOW_STACK_EXECUTION 0x20000/* When this bit is set, all stacks 
                           in the task will be given stack
                           execution privilege.  Only used in
                           MH_EXECUTE filetypes. */
    #define MH_DEAD_STRIPPABLE_DYLIB 0x400000 /* Only for use on dylibs.  When
                             linking against a dylib that
                             has this bit set, the static linker
                             will automatically not create a
                             LC_LOAD_DYLIB load command to the
                             dylib if no symbols are being
                             referenced from the dylib. */
    #define MH_ROOT_SAFE 0x40000           /* When this bit is set, the binary 
                          declares it is safe for use in
                          processes with uid zero */
                                             
    #define MH_SETUID_SAFE 0x80000         /* When this bit is set, the binary 
                          declares it is safe for use in
                          processes when issetugid() is true */
    
    #define MH_NO_REEXPORTED_DYLIBS 0x100000 /* When this bit is set on a dylib, 
                          the static linker does not need to
                          examine dependent dylibs to see
                          if any are re-exported */
    #define MH_PIE 0x200000         /* When this bit is set, the OS will
                           load the main executable at a
                           random address.  Only used in
                           MH_EXECUTE filetypes. */
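
    Since flags is a bit field, a header's value is decoded by testing each bit separately. The sketch below does this for four of the flags listed above. The describe_flags function and the known_flags table are hypothetical names for this example, and the table is deliberately incomplete.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* A subset of the flag bits quoted above, redefined for portability. */
#define MH_NOUNDEFS 0x1u
#define MH_DYLDLINK 0x4u
#define MH_TWOLEVEL 0x80u
#define MH_PIE      0x200000u

struct flag_name { uint32_t bit; const char *name; };

static const struct flag_name known_flags[] = {
    { MH_NOUNDEFS, "MH_NOUNDEFS" },
    { MH_DYLDLINK, "MH_DYLDLINK" },
    { MH_TWOLEVEL, "MH_TWOLEVEL" },
    { MH_PIE,      "MH_PIE" },
};

/* Writes the names of the known flags set in `flags` into `out`,
 * separated by single spaces, and returns how many names were written.
 * Stops early, keeping only whole names, if `out` is too small. */
int describe_flags(uint32_t flags, char *out, size_t out_len)
{
    int count = 0;
    size_t used = 0;

    assert(out != NULL && out_len > 0);
    out[0] = '\0';
    for (size_t i = 0; i < sizeof known_flags / sizeof known_flags[0]; i++) {
        if ((flags & known_flags[i].bit) == 0)
            continue;
        int n = snprintf(out + used, out_len - used, "%s%s",
                         count ? " " : "", known_flags[i].name);
        if (n < 0 || (size_t)n >= out_len - used) {
            out[used] = '\0'; /* drop the partially written name */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

    For an illustrative value of 0x00200085 (not taken from the article's screenshot), the function reports MH_NOUNDEFS, MH_DYLDLINK, MH_TWOLEVEL and MH_PIE.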
    

    For the HelloWorld program above, the header information is as follows:

    Load Commands

    struct load_command {
        uint32_t cmd;       /* type of load command */
        uint32_t cmdsize;   /* total size of command in bytes */
    };
    
    • cmd: the type of the load command
    • cmdsize: the size of the command in bytes, which is also the offset from this command to the next one
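
    The role of cmdsize is easiest to see in a loop that steps from one command to the next. The following is a sketch only: walk_load_commands and the lc_visitor callback type are hypothetical names, and the code assumes the commands are already in the host's byte order. It takes the bytes that follow the header together with the header's ncmds and sizeofcmds values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct load_command {
    uint32_t cmd;     /* type of load command */
    uint32_t cmdsize; /* total size of command in bytes */
};

/* Hypothetical callback invoked once per load command. */
typedef void (*lc_visitor)(uint32_t cmd, uint32_t cmdsize,
                           size_t offset, void *ctx);

/* Visits `ncmds` load commands stored in cmds[0 .. sizeofcmds).
 * Returns the number of commands visited, or -1 if a command is
 * truncated or its cmdsize would run past the end of the region. */
int walk_load_commands(const uint8_t *cmds, uint32_t sizeofcmds,
                       uint32_t ncmds, lc_visitor visit, void *ctx)
{
    size_t offset = 0;

    assert(cmds != NULL);
    for (uint32_t i = 0; i < ncmds; i++) {
        struct load_command lc;

        if (sizeofcmds - offset < sizeof lc)
            return -1; /* not enough bytes left for cmd and cmdsize */
        memcpy(&lc, cmds + offset, sizeof lc);
        if (lc.cmdsize < sizeof lc || lc.cmdsize > sizeofcmds - offset)
            return -1; /* a size of 0 would loop forever; too big overruns */
        if (visit != NULL)
            visit(lc.cmd, lc.cmdsize, offset, ctx);
        offset += lc.cmdsize; /* cmdsize is the distance to the next command */
    }
    return (int)ncmds;
}
```

    The bounds checks matter in practice, because cmdsize comes straight from the file and cannot be trusted.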

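    Load command types:

    cmd                        Purpose
    LC_SEGMENT/LC_SEGMENT_64   Maps a segment's data into memory
    LC_SYMTAB                  Symbol table information
    LC_DYSYMTAB                Dynamic symbol table information
    LC_DYLD_INFO_ONLY          Compressed information for dyld (rebase, bind, and export data)
    LC_LOAD_DYLINKER           Path of the dynamic linker (dyld) to load
    LC_UUID                    Unique identifier of the file
    LC_SOURCE_VERSION          Version of the source code
    LC_MAIN                    Program entry point
    LC_LOAD_DYLIB              A dynamic library to load
    LC_FUNCTION_STARTS         Table of function start addresses
    LC_DATA_IN_CODE            Table of data regions embedded in code sections
    LC_CODE_SIGNATURE          Code signature information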
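
    When printing load commands it helps to turn the numeric cmd value back into its name. Below is a small sketch covering only the commands in the table above. The load_command_name function is a hypothetical helper, and the numeric values are copied from <mach-o/loader.h> so the example compiles without that header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Commands that dyld must understand have this bit set in cmd. */
#define LC_REQ_DYLD        0x80000000u

#define LC_SEGMENT         0x1u
#define LC_SYMTAB          0x2u
#define LC_DYSYMTAB        0xbu
#define LC_LOAD_DYLIB      0xcu
#define LC_LOAD_DYLINKER   0xeu
#define LC_SEGMENT_64      0x19u
#define LC_UUID            0x1bu
#define LC_CODE_SIGNATURE  0x1du
#define LC_DYLD_INFO_ONLY  (0x22u | LC_REQ_DYLD)
#define LC_FUNCTION_STARTS 0x26u
#define LC_MAIN            (0x28u | LC_REQ_DYLD)
#define LC_DATA_IN_CODE    0x29u
#define LC_SOURCE_VERSION  0x2Au

/* Returns the macro name for a load command type, or "unknown". */
const char *load_command_name(uint32_t cmd)
{
    switch (cmd) {
    case LC_SEGMENT:         return "LC_SEGMENT";
    case LC_SYMTAB:          return "LC_SYMTAB";
    case LC_DYSYMTAB:        return "LC_DYSYMTAB";
    case LC_LOAD_DYLIB:      return "LC_LOAD_DYLIB";
    case LC_LOAD_DYLINKER:   return "LC_LOAD_DYLINKER";
    case LC_SEGMENT_64:      return "LC_SEGMENT_64";
    case LC_UUID:            return "LC_UUID";
    case LC_CODE_SIGNATURE:  return "LC_CODE_SIGNATURE";
    case LC_DYLD_INFO_ONLY:  return "LC_DYLD_INFO_ONLY";
    case LC_FUNCTION_STARTS: return "LC_FUNCTION_STARTS";
    case LC_MAIN:            return "LC_MAIN";
    case LC_DATA_IN_CODE:    return "LC_DATA_IN_CODE";
    case LC_SOURCE_VERSION:  return "LC_SOURCE_VERSION";
    default:                 return "unknown";
    }
}
```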

    segment

    First, the definition of a segment:

    struct segment_command { /* for 32-bit architectures */
        uint32_t    cmd;        /* LC_SEGMENT */
        uint32_t    cmdsize;    /* includes sizeof section structs */
        char        segname[16];    /* segment name */
        uint32_t    vmaddr;     /* memory address of this segment */
        uint32_t    vmsize;     /* memory size of this segment */
        uint32_t    fileoff;    /* file offset of this segment */
        uint32_t    filesize;   /* amount to map from the file */
        vm_prot_t   maxprot;    /* maximum VM protection */
        vm_prot_t   initprot;   /* initial VM protection */
        uint32_t    nsects;     /* number of sections in segment */
        uint32_t    flags;      /* flags */
    };
    
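    • cmd: the load command type described above
    • cmdsize: the size of the load command
    • segname[16]: the segment name. Common values:
      segname      Meaning
      __PAGEZERO   Segment in executables that catches NULL pointer dereferences
      __TEXT       Code and read-only data
      __DATA       Global and static variables
      __LINKEDIT   Symbol table, string table, and other data needed by the dynamic linker
    • vmaddr: the segment's virtual address before any slide is applied; the actual runtime address is vmaddr plus the ASLR slide
    • vmsize: the size of the segment in virtual memory
    • fileoff: the offset of the segment within the file
    • filesize: the size of the segment within the file
      Loading a segment means mapping filesize bytes, starting at file offset fileoff, to the virtual address vmaddr. If vmsize is larger than filesize, the remaining memory is zero-filled.
    • nsects: the number of sections in the segment
    • flags: a bit field describing further properties of the segment
      The flag macros are: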
    #define SG_HIGHVM   0x1 /* the file contents for this segment is for
                       the high part of the VM space, the low part
                       is zero filled (for stacks in core files) */
    #define SG_FVMLIB   0x2 /* this segment is the VM that is allocated by
                       a fixed VM library, for overlap checking in
                       the link editor */
    #define SG_NORELOC  0x4 /* this segment has nothing that was relocated
                       in it and nothing relocated to it, that is
                       it maybe safely replaced without relocation*/
    #define SG_PROTECTED_VERSION_1  0x8 /* This segment is protected.  If the
                           segment starts at file offset 0, the
                           first page of the segment is not
                           protected.  All other pages of the
                           segment are protected. */
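
    The loading rule described above (take filesize bytes at fileoff, place them at vmaddr, zero-fill up to vmsize) can be simulated with plain buffers. This is a sketch under simplifying assumptions: load_segment and runtime_address are hypothetical helpers, the destination buffer stands in for the memory at the segment's runtime address, and a real loader maps pages with the segment's protections instead of copying bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies one segment from the file image `file` into `dest`, which must
 * have room for `vmsize` bytes. Returns 0 on success, or -1 if the
 * segment's fields are inconsistent with each other or with the file. */
int load_segment(const uint8_t *file, size_t file_len,
                 uint64_t fileoff, uint64_t filesize, uint64_t vmsize,
                 uint8_t *dest)
{
    assert(file != NULL && dest != NULL);
    if (filesize > vmsize)
        return -1; /* cannot map more bytes than the segment occupies */
    if (fileoff > file_len || filesize > file_len - fileoff)
        return -1; /* segment extends past the end of the file */

    memcpy(dest, file + fileoff, (size_t)filesize);
    /* The part of the segment not backed by the file is zero-filled. */
    memset(dest + (size_t)filesize, 0, (size_t)(vmsize - filesize));
    return 0;
}

/* vmaddr is the address recorded in the file; the runtime address adds
 * the ASLR slide chosen when the image is loaded. */
uint64_t runtime_address(uint64_t vmaddr, uint64_t slide)
{
    return vmaddr + slide;
}
```

    The two early returns reject load commands whose sizes cannot be satisfied, which is the same kind of validation a parser of untrusted files needs.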
    

    section

    The definition of a section:

    struct section { /* for 32-bit architectures */
        char        sectname[16];   /* name of this section */
        char        segname[16];    /* segment this section goes in */
        uint32_t    addr;       /* memory address of this section */
        uint32_t    size;       /* size in bytes of this section */
        uint32_t    offset;     /* file offset of this section */
        uint32_t    align;      /* section alignment (power of 2) */
        uint32_t    reloff;     /* file offset of relocation entries */
        uint32_t    nreloc;     /* number of relocation entries */
        uint32_t    flags;      /* flags (section type and attributes)*/
        uint32_t    reserved1;  /* reserved (for offset or index) */
        uint32_t    reserved2;  /* reserved (for count or sizeof) */
    };
    
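    • sectname: the section name
    • segname: the name of the segment the section belongs to
      (By convention, uppercase names such as __TEXT denote segments and lowercase names such as __text denote sections.) Common sections:
      sectname          Meaning
      __text            Main program code
      __stubs           Stub code for calls to dynamically linked functions
      __stub_helper     Helper code the stubs use to have dyld bind a symbol lazily
      __cstring         Hard-coded C strings
      __la_symbol_ptr   Lazy symbol pointers, bound on first use
      __data            Initialized mutable variables
    • addr: the address of the section in memory
    • size: the size of the section in bytes
    • offset: the offset of the section within the file
    • align: the alignment of the section, stored as a power-of-2 exponent
    • reloff: the file offset of the relocation entries
    • nreloc: the number of relocation entries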
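
    Two details of struct section are easy to get wrong: the 16-byte name fields are not NUL-terminated when a name uses all 16 bytes, and align holds an exponent, not a byte count. The sketch below handles both. The find_section, name_equals and section_alignment_bytes functions are hypothetical helpers, and the struct repeats the 32-bit definition quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct section { /* for 32-bit architectures */
    char     sectname[16];
    char     segname[16];
    uint32_t addr;
    uint32_t size;
    uint32_t offset;
    uint32_t align;
    uint32_t reloff;
    uint32_t nreloc;
    uint32_t flags;
    uint32_t reserved1;
    uint32_t reserved2;
};

/* Compares a 16-byte, NUL-padded name field with a C string. The field
 * has no terminating NUL when the name is exactly 16 bytes long. */
static int name_equals(const char field[16], const char *name)
{
    size_t n = strlen(name);

    if (n > 16 || memcmp(field, name, n) != 0)
        return 0;
    return n == 16 || field[n] == '\0';
}

/* Returns the first section matching both names, or NULL if none does. */
const struct section *find_section(const struct section *sects,
                                   uint32_t nsects,
                                   const char *segname,
                                   const char *sectname)
{
    assert(sects != NULL || nsects == 0);
    for (uint32_t i = 0; i < nsects; i++) {
        if (name_equals(sects[i].segname, segname) &&
            name_equals(sects[i].sectname, sectname))
            return &sects[i];
    }
    return NULL;
}

/* align stores log2 of the alignment, so align == 4 means 16 bytes.
 * Returns 0 for exponents that do not fit in 32 bits. */
uint32_t section_alignment_bytes(const struct section *s)
{
    return s->align < 32 ? (uint32_t)1 << s->align : 0;
}
```

    In a real file the section structs sit directly after their segment_command inside the same load command, which is why the segment's cmdsize includes them.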

    Article link: https://www.haomeiwen.com/subject/euwtzqtx.html