Category Internals Analysis 1 - How the runtime merges category information

Author: Jacob_LJ | Published 2018-08-25 11:52


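2018-08-29

Added an analysis of the objc_object structure in 3.8.2, explaining how it relates to objc_class through the isa member

2018-08-26

Added a note on which source version is referenced

Note: this analysis follows the MJ low-level internals course; these are my own study notes.

The source code used in this article is objc4-723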

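1 What is a Category

1.1 A language feature

A category is an Objective-C language feature (categories predate Objective-C 2.0; class extensions are the related 2.0 addition)

1.2 It is an object

Judging by file types, a category has .h and .m files, just like an ordinary class
Judging by file contents, a category has a declaration and an implementation, just like an ordinary class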

Person

/// Person.h
@interface Person : NSObject
@end

/// Person.m
@implementation Person
@end

Person+Test1

/// Person+Test1.h 
@interface Person (Test1)
@end

/// Person+Test1.m
@implementation Person (Test1)
@end

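Converting Person+Test1.m to C++ code

Run the command xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc Person+Test1.m to obtain the file Person+Test1.cpp
In the generated code, Person+Test1 is represented by a struct of type struct _category_t

struct _category_t contains the following members:
1) the name of the class (name)
2) the class (cls)
3) the list of all instance methods the category adds to the class (instance_methods)
4) the list of all class methods the category adds (class_methods)
5) the list of all protocols the category adopts (protocols)
6) all properties the category adds (properties)

As described in the earlier exploration of what NSObject really is, objects are represented as structs at the lower level. A category is represented the same way, so in that sense a Category is also an object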
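As a rough sketch, the six members listed above line up with a struct like the following. The field names come from the list; the four pointee types are left as opaque forward declarations, and category_is_empty is a hypothetical helper added only for illustration, not something the rewriter generates.

```cpp
#include <cassert>
#include <cstddef>

// Opaque stand-ins for the types the rewriter emits. Only pointers to them
// are stored, so forward declarations are enough for this sketch.
struct _class_t;
struct _method_list_t;
struct _protocol_list_t;
struct _prop_list_t;

// Field order follows the member list in the text.
struct _category_t {
    const char *name;                               // class name
    struct _class_t *cls;                           // class being extended
    const struct _method_list_t *instance_methods;  // instance methods added
    const struct _method_list_t *class_methods;     // class methods added
    const struct _protocol_list_t *protocols;       // protocols adopted
    const struct _prop_list_t *properties;          // properties declared
};

// Hypothetical helper: true when the category contributes nothing,
// as is the case for an empty category such as Person (Test1).
inline bool category_is_empty(const _category_t &cat) {
    assert(cat.name != nullptr);
    return !cat.instance_methods && !cat.class_methods &&
           !cat.protocols && !cat.properties;
}
```

Every member is a pointer, so the whole record is six pointers wide; the lists themselves live elsewhere in the binary.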

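1.3 Use cases

The main purpose of a category is to add methods to a class that already exists.
Beyond that, Apple recommends two other uses for categories:

Splitting a class's implementation across several files. This has some obvious benefits:
a) it reduces the size of each individual file
b) it groups different features into different categories
c) it lets several developers work on one class together
d) it lets you load only the categories you want, and so on.
Declaring private methods

Besides the uses Apple recommends, developers have gotten creative and come up with a few other uses for categories:

Simulating multiple inheritance
Exposing a framework's private methods

REF

深入理解Objective-C：Category (Understanding Objective-C in Depth: Category)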

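2 The struct category_t implementation of Category in the runtime source

It is essentially the same as the type description obtained from the command line. In the runtime source the struct is named category_t and is defined in objc-runtime-new.h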
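For comparison, here is a sketch of the runtime-side struct, written from the field names that the `_read_images` listing in section 3.6 accesses (cat->instanceMethods, cat->classMethods, cat->protocols, cat->instanceProperties, cat->_classProperties). The classref_t typedef and the opaque list types are stand-ins assumed for this sketch, and methodsForMeta is written from my recollection of a helper with that name, so check objc-runtime-new.h in objc4-723 for the exact declaration.

```cpp
#include <cassert>

// Opaque stand-ins; the real definitions live in the runtime.
struct method_list_t;
struct protocol_list_t;
struct property_list_t;
struct objc_class;
typedef objc_class *classref_t;  // assumption made for this sketch only

struct category_t {
    const char *name;
    classref_t cls;
    method_list_t *instanceMethods;
    method_list_t *classMethods;
    protocol_list_t *protocols;
    property_list_t *instanceProperties;
    property_list_t *_classProperties;

    // Pick the list that belongs on the class object (isMeta == false)
    // or on the metaclass (isMeta == true).
    method_list_t *methodsForMeta(bool isMeta) const {
        return isMeta ? classMethods : instanceMethods;
    }
};
```

Compared with the rewriter's output, the visible differences are the camel-case field names and the extra _classProperties member.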

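3 How the runtime handles Categories while the program runs

The runtime merges a Category's instanceMethods, classMethods, protocols and other information into the class object and the metaclass object

Below we look through the runtime source for the logic that handles Categories when classes are initialized

3.1 In the runtime source, find the file objc-os.mm

3.2 In it, find the function `void _objc_init(void)`, which is the startup initialization function of the Objective-C runtime

3.3 Search for `map_images` to find the implementations of `map_images`, `load_images`, `unmap_image` and related functions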
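The relationship between these callbacks can be pictured with a small model. This is only a sketch: LoaderModel, notifyRegister and the model_* functions are hypothetical names standing in for the dynamic loader and for `_objc_init`, `map_images`, `load_images` and `unmap_image`; none of it is the runtime's actual code. It assumes the loader reports an image as mapped before it asks for that image to be initialized.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef void (*ImageCallback)(const char *imageName);

// Hypothetical model of a loader that notifies a runtime about images.
struct LoaderModel {
    ImageCallback mapped = nullptr;    // role of map_images
    ImageCallback init = nullptr;      // role of load_images
    ImageCallback unmapped = nullptr;  // role of unmap_image

    void notifyRegister(ImageCallback m, ImageCallback i, ImageCallback u) {
        assert(m && i && u);
        mapped = m;
        init = i;
        unmapped = u;
    }
    // Loading an image fires "mapped" first, then "init".
    void loadImage(const char *name) { mapped(name); init(name); }
    void unloadImage(const char *name) { unmapped(name); }
};

static std::vector<std::string> g_trace;  // records the callback order

static void model_map_images(const char *n)  { g_trace.push_back(std::string("map:") + n); }
static void model_load_images(const char *n) { g_trace.push_back(std::string("load:") + n); }
static void model_unmap_image(const char *n) { g_trace.push_back(std::string("unmap:") + n); }

// Plays the role of _objc_init: run once, then register the three callbacks.
inline void model_objc_init(LoaderModel &loader) {
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    loader.notifyRegister(model_map_images, model_load_images, model_unmap_image);
}
```

In this model the "map" step always runs before the "load" step for the same image, which is why the category handling is looked for on the map_images path rather than in load_images.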

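3.4 Comparing them

load_images is related to the implementation of +load
unmap_image is the counterpart of map_images and undoes its work
So we continue with the implementation of map_images, which calls the function map_images_nolock

3.5 `map_images_nolock` calls the function `_read_images`, and one of its arguments is `totalClasses`, so this is the function that handles the work for all class objects (other source omitted)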

void 
map_images_nolock(unsigned mhCount, const char * const mhPaths[],
                  const struct mach_header * const mhdrs[])
{
    // ... other code

    if (hCount > 0) {
        _read_images(hList, hCount, totalClasses, unoptimizedTotalClasses);
    }

    // ... other code
}

3.6 The part of `_read_images` that handles Categories (other source omitted)

/***********************************************************************
* _read_images
* Perform initial processing of the headers in the linked 
* list beginning with headerList. 
*
* Called by: map_images_nolock
*
* Locking: runtimeLock acquired by map_images
**********************************************************************/
void _read_images(header_info **hList, uint32_t hCount, int totalClasses, int unoptimizedTotalClasses)
{
    /// ... other code

    size_t count;

    // Discover categories. 
    for (EACH_HEADER) {
        category_t **catlist = 
            _getObjc2CategoryList(hi, &count);
        bool hasClassProperties = hi->info()->hasCategoryClassProperties();

        for (i = 0; i < count; i++) { // iterate over every category in this image
            category_t *cat = catlist[i]; // take out the category
            Class cls = remapClass(cat->cls); // get the class this category belongs to

            if (!cls) {
                // Category's target class is missing (probably weak-linked).
                // Disavow any knowledge of this category.
                catlist[i] = nil;
                if (PrintConnecting) {
                    _objc_inform("CLASS: IGNORING category \?\?\?(%s) %p with "
                                 "missing weak-linked target class", 
                                 cat->name, cat);
                }
                continue;
            }

            // Process this category. 
            // First, register the category with its target class. 
            // Then, rebuild the class's method lists (etc) if 
            // the class is realized. 
            bool classExists = NO;
            /// Handle the class object
            if (cat->instanceMethods ||  cat->protocols  
                ||  cat->instanceProperties) 
            { // if any of the category's instance methods, protocols or instance properties is non-null, enter the rebuild step
                addUnattachedCategoryForClass(cat, cls, hi);
                if (cls->isRealized()) {
                    remethodizeClass(cls); /// key step: rebuild the class object's method lists
                    classExists = YES;
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category -%s(%s) %s", 
                                 cls->nameForLogging(), cat->name, 
                                 classExists ? "on existing class" : "");
                }
            }

            /// Handle the metaclass object
            if (cat->classMethods  ||  cat->protocols  
                ||  (hasClassProperties && cat->_classProperties)) 
            {
                addUnattachedCategoryForClass(cat, cls->ISA(), hi);
                if (cls->ISA()->isRealized()) {
                    remethodizeClass(cls->ISA()); /// key step: rebuild the metaclass object's method lists
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category +%s(%s)", 
                                 cls->nameForLogging(), cat->name);
                }
            }
        }
    }

    ts.log("IMAGE TIMES: discover categories");

    // Category discovery MUST BE LAST to avoid potential races 
    // when other threads call the new category code before 
    // this thread finishes its fixups.

    /// ... other code
}
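The two `if` conditions in the loop decide where a category gets registered: on the class object, on the metaclass, or on both. A minimal sketch of just that decision, with each list reduced to a "present or not" flag, might look like this. CategoryFlags, Targets and targetsFor are hypothetical names for illustration, and imageHasClassProperties stands for the value of hi->info()->hasCategoryClassProperties().

```cpp
#include <cassert>

// Hypothetical, simplified view of a category: is each list present?
struct CategoryFlags {
    bool instanceMethods;
    bool classMethods;
    bool protocols;
    bool instanceProperties;
    bool classProperties;
};

struct Targets {
    bool classObject;  // category is registered on the class object
    bool metaClass;    // category is registered on the metaclass
};

// Mirrors the two conditions in the category-discovery loop.
inline Targets targetsFor(const CategoryFlags &cat, bool imageHasClassProperties) {
    Targets t;
    t.classObject = cat.instanceMethods || cat.protocols || cat.instanceProperties;
    t.metaClass = cat.classMethods || cat.protocols ||
                  (imageHasClassProperties && cat.classProperties);
    return t;
}
```

Note that protocols appear in both conditions, so a category that only adopts a protocol is registered on both the class and its metaclass.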

3.7 The implementation of `static void remethodizeClass(Class cls)`, which calls the function `attachCategories`
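As I understand the flow, `addUnattachedCategoryForClass` parks a category in a pending table keyed by class, and `remethodizeClass` later takes everything pending for that class out of the table and passes the batch to `attachCategories`. The sketch below models only that bookkeeping. PendingTable, addUnattached and takePending are hypothetical names, and categories are represented by plain strings rather than category_t pointers.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: class name -> categories registered but not yet attached.
typedef std::map<std::string, std::vector<std::string> > PendingTable;

// Plays the role of addUnattachedCategoryForClass.
inline void addUnattached(PendingTable &pending, const std::string &cls,
                          const std::string &category) {
    pending[cls].push_back(category);
}

// Plays the role of the first half of remethodizeClass: remove every pending
// category for cls from the table and return the batch in registration order.
// Returns an empty vector when nothing is pending.
inline std::vector<std::string> takePending(PendingTable &pending,
                                            const std::string &cls) {
    std::vector<std::string> batch;
    PendingTable::iterator it = pending.find(cls);
    if (it != pending.end()) {
        batch.swap(it->second);
        pending.erase(it);
    }
    return batch;
}
```

Because the batch is removed from the table when it is taken, calling takePending a second time for the same class yields nothing, so a category is attached at most once.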

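3.8 The implementation of `attachCategories`

Note the while loop in it: it assembles the lists in reverse order. By iterating backwards with i--, it places the method list of the last-compiled category at the front of the mlists array when building mlists

3.8.1 Finally, the method lists (mlists), property lists (proplists) and protocol lists (protolists) collected from the categories are spliced into the corresponding lists in the class's rw structure.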
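The reverse-order assembly can be illustrated with a few lines. This is a toy model, not the runtime code: MethodList is a hypothetical stand-in in which each "list" is just the name of the category it came from, and the input vector is assumed to be in compile order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in: a "method list" is the name of its category.
typedef std::string MethodList;

// Collect one list per category, walking the categories from last to first
// in the same way a while (i--) loop does.
inline std::vector<MethodList> collectReversed(const std::vector<MethodList> &cats) {
    std::vector<MethodList> mlists;
    mlists.reserve(cats.size());
    std::size_t i = cats.size();
    while (i--) {
        mlists.push_back(cats[i]);
    }
    return mlists;
}
```

With categories compiled in the order Test1, Test2, the collected array starts with Test2, so the last-compiled category ends up first.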

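3.8.2 Understanding `auto rw = cls->data();`:

The basic memory layout of instance objects, class objects and metaclass objects is as follows: class objects and metaclass objects are both struct objc_class underneath, while an instance object is a struct objc_object underneath

[Figure omitted; from the MJ low-level course slides]

The memory layout of struct objc_class is as follows (other source omitted):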

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    class_rw_t *data() { 
        return bits.data();
    }
    /// ... other code
};

The memory layout of objc_object is as follows (other source omitted):

struct objc_object {
private:
    isa_t isa;

public:

    // ISA() assumes this is NOT a tagged pointer object
    Class ISA();

    // getIsa() allows this to be a tagged pointer object
    Class getIsa();

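    /// ... other code
};

The isa member of objc_object has type isa_t, shown below. It contains a Class cls member, and Class is defined as typedef struct objc_class *Class;, so an object's isa can refer to an objc_class. Conversely, objc_class inherits from objc_object, so every class object begins with this same isa member and is itself an object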

union isa_t 
{
    isa_t() { }
    isa_t(uintptr_t value) : bits(value) { }

    Class cls;
    uintptr_t bits;

    /// ... other code
};
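The relationship between the three types can be checked with a stripped-down model. The model_* names below are hypothetical stand-ins for isa_t, objc_object and objc_class, and the sketch deliberately ignores non-pointer isa encodings: it assumes isa simply stores a class pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

struct model_class;               // plays the role of objc_class
typedef model_class *ModelClass;  // plays the role of Class

union model_isa_t {               // plays the role of isa_t
    ModelClass cls;
    std::uintptr_t bits;
};

struct model_object {             // plays the role of objc_object
    model_isa_t isa;
};

struct model_class : model_object {  // a class object is itself an object
    ModelClass superclass;
};

static_assert(std::is_base_of<model_object, model_class>::value,
              "class objects inherit the isa member");

// Hypothetical helper: follow the isa of any object, instance or class.
inline ModelClass classOf(const model_object &obj) {
    assert(obj.isa.cls != nullptr);
    return obj.isa.cls;
}
```

The same helper works on an instance (yielding its class) and on a class object (yielding its metaclass), because both start with the same isa member.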

So the rw obtained from auto rw = cls->data(); is of type struct class_rw_t, whose layout is shown below (other source omitted). This also explains how accesses such as rw->methods, rw->properties and rw->protocols work.

struct class_rw_t {
    // Be warned that Symbolication knows the layout of this structure.
    uint32_t flags;
    uint32_t version;

    const class_ro_t *ro;

    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;

    Class firstSubclass;
    Class nextSiblingClass;

    char *demangledName;
    /// ... other code
};
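The comment on the bits member in objc_class ("class_rw_t * plus custom rr/alloc flags") hints at how bits.data() can return a pointer: the pointer and a few flag bits share one word, and data() masks the flags away. The sketch below shows that idea under stated assumptions. The model_* types are hypothetical, and the mask that reserves the low three bits is chosen for illustration; the runtime's real mask is platform-specific.

```cpp
#include <cassert>
#include <cstdint>

struct alignas(8) model_rw_t {  // plays the role of class_rw_t
    std::uint32_t flags;
    std::uint32_t version;
};

// Assumption: the low three bits hold flags, the remaining bits the pointer.
const std::uintptr_t kModelDataMask = ~static_cast<std::uintptr_t>(7);

struct model_data_bits_t {      // plays the role of class_data_bits_t
    std::uintptr_t bits;

    void setData(model_rw_t *rw, std::uintptr_t flagBits) {
        std::uintptr_t p = reinterpret_cast<std::uintptr_t>(rw);
        assert((p & ~kModelDataMask) == 0);        // pointer is 8-byte aligned
        assert((flagBits & kModelDataMask) == 0);  // flags fit in the low bits
        bits = p | flagBits;
    }

    // Strip the flag bits to recover the pointer.
    model_rw_t *data() const {
        return reinterpret_cast<model_rw_t *>(bits & kModelDataMask);
    }
};
```

The packing works because an 8-byte-aligned pointer always has zeros in its low three bits, which leaves them free for flags.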

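The basic relationships are shown in the figure below:

[Figure omitted; from the MJ low-level course slides]

attachLists() (which splices in the instance method, property and protocol lists taken from the categories) is implemented as follows: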

void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;

        if (hasArray()) {
            // many lists -> many lists
            uint32_t oldCount = array()->count;
            uint32_t newCount = oldCount + addedCount;
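            /// Grow the class's array of lists by the number of newly added lists
            setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
            array()->count = newCount;

            /// Move the class's original lists to the end of the enlarged array
            memmove(array()->lists + addedCount, 
                    array()->lists, 
                    oldCount * sizeof(array()->lists[0]));

            /// Copy the lists assembled from the categories to the front of the enlarged array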
            memcpy(array()->lists, 
                   addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
        } 
        else {
            // 1 list -> many lists
            List* oldList = list;
            uint32_t oldCount = oldList ? 1 : 0;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            array()->count = newCount;
            if (oldList) array()->lists[addedCount] = oldList;
            memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
    }
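To see the "many lists -> many lists" branch in isolation, here is a miniature version that can be compiled on its own. It is a sketch under simplifying assumptions: ListArray and attachListsModel are hypothetical names, each "list" is just a C string, and the array header with its count is replaced by a plain struct.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical miniature of the runtime's array of list pointers.
struct ListArray {
    const char **lists;
    unsigned count;
};

inline void attachListsModel(ListArray &arr, const char *const *added,
                             unsigned addedCount) {
    if (addedCount == 0) return;
    unsigned oldCount = arr.count;
    unsigned newCount = oldCount + addedCount;

    // Grow the array so it can hold the old and the new entries.
    const char **grown = static_cast<const char **>(
        std::realloc(arr.lists, newCount * sizeof(arr.lists[0])));
    assert(grown != nullptr);
    arr.lists = grown;
    arr.count = newCount;

    // Slide the existing entries to the back of the enlarged array.
    std::memmove(arr.lists + addedCount, arr.lists,
                 oldCount * sizeof(arr.lists[0]));
    // Copy the new entries into the freed slots at the front.
    std::memcpy(arr.lists, added, addedCount * sizeof(arr.lists[0]));
}
```

Starting from an array that holds only the class's own list and attaching the lists {Test2, Test1} leaves the array as Test2, Test1, Person: the added lists sit in front and the original list is last.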

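3.9 The difference between using memmove and memcpy

* memmove is used first to move the class's original list entries to the end of the enlarged array. memcpy is not used here because correctness must be guaranteed: the source and destination both lie inside the same array and can overlap, and memcpy's behavior is undefined when dst and src overlap
* memcpy is used afterwards with no need to worry about overlap, because it copies between two separate arrays (addedLists and the class's own array), which cannot overlap.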
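The same two-step pattern can be tried on a plain character buffer, where the overlap is easy to see. The function name prepend is hypothetical, and the caller is assumed to provide a buffer with room for at least len + n bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Insert n bytes from src in front of the first len bytes of buf, in place.
inline void prepend(char *buf, std::size_t len, const char *src, std::size_t n) {
    assert(buf != nullptr && src != nullptr);
    // buf and buf + n belong to the same buffer and overlap whenever n < len,
    // so the shift must use memmove.
    std::memmove(buf + n, buf, len);
    // src is a separate buffer, so a plain memcpy is safe here.
    std::memcpy(buf, src, n);
}
```

Shifting "abcd" right by two positions reads and writes the bytes holding "cd" in the same call, which is exactly the situation memmove is specified to handle and memcpy is not.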

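4 Conclusions after the runtime has processed Categories

A category's methods, properties, protocols and other information are attached by the runtime to the class object and metaclass object of the corresponding class.
Because the category lists are assembled in reverse order and then attached at the front (the category's methods and so on are inserted ahead of the class's own lists), the result is:
- Category versus class: the category's method is called first
- Category versus category: the method of the category compiled later is called first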
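Both rules follow from a front-to-back search that stops at the first match. The sketch below models that search only. Method, MethodList and lookUp are hypothetical names, and it assumes the lists are already in their final order (category lists first, the class's own list last).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: a selector name plus a label for who implements it.
struct Method {
    std::string selector;
    std::string owner;
};
typedef std::vector<Method> MethodList;

// Scan the lists front to back and return the owner of the first match.
// Returns an empty string when no list implements the selector.
inline std::string lookUp(const std::vector<MethodList> &lists,
                          const std::string &sel) {
    for (std::size_t i = 0; i < lists.size(); ++i) {
        for (std::size_t j = 0; j < lists[i].size(); ++j) {
            if (lists[i][j].selector == sel) return lists[i][j].owner;
        }
    }
    return std::string();
}
```

The class's own implementation is never removed; it simply sits later in the array and is not reached when an earlier list already answers the selector.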

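5 Related questions

5.1 When are Categories used?

See section 1.3

5.2 How Categories are implemented

After compilation, a Category's underlying structure is struct category_t, which stores the category's instanceMethods, classMethods, instanceProperties, protocols and so on
When the program runs, the runtime merges the Category's data into the class information (class object and metaclass object)

5.3 What is the difference between a Category and a Class Extension?

A Category has an @implementation, while a Class Extension does not
The basic purpose of a Class Extension is to make properties, methods, instance variables, protocols and so on private (it is usually written in the .m file)
You need a class's source code to add an Extension to it, so in general you cannot add an extension to a system class such as NSString (because an Extension can only be written in the class's .h or .m), though declarations can be brought in indirectly by writing them in a Category
A Class Extension's data is already part of the class information at compile time
A Category's data is merged into the class information only at run time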

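By Jacob_LJ (Jianshu author)
PS: Unless otherwise stated, all articles are original works and the copyright belongs to the author. Please contact the author for permission before reposting, and credit the source.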
