Understanding Category Through the Runtime

Author: 平常心_kale | Published: 2019-05-21 19:31


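A Category lets you add methods to an existing class without modifying the original class; the methods are attached at runtime. It is usually the first choice when a class needs extra methods or properties. We know that instance methods are stored in the class object and class methods in the metaclass object. So where do the methods and properties declared in a category end up?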

1. The underlying structure of Category in the runtime

Open <objc/runtime.h>.

typedef struct objc_category *Category;   Category is a pointer to a struct that describes a category:

// Around line 1880 of runtime.h
struct objc_category {
    char * _Nonnull category_name                            OBJC2_UNAVAILABLE;  // category name
    char * _Nonnull class_name                               OBJC2_UNAVAILABLE;  // name of the class the category extends
    struct objc_method_list * _Nullable instance_methods     OBJC2_UNAVAILABLE;  // instance method list
    struct objc_method_list * _Nullable class_methods        OBJC2_UNAVAILABLE;  // class method list
    struct objc_protocol_list * _Nullable protocols          OBJC2_UNAVAILABLE;  // protocols adopted by the category
}                                                            OBJC2_UNAVAILABLE;

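The runtime declaration mirrors how categories are used day to day: instance methods, class methods and protocols each have a slot. The struct has no slot for instance variables, which is why a category cannot add them. A property declared in a category does not get a synthesized instance variable; the compiler only generates the getter and setter declarations, and we have to implement them ourselves.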

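What happens if we declare an instance variable in a category?

{
    NSInteger _sex;
    // error: Instance variables may not be placed in categories
}

Declaring an instance variable in a category is a compile error. Declaring a property is allowed, but only the getter and setter declarations are generated; we have to implement them ourselves.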

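Implementing the setter and getter with associated objects (requires #import <objc/runtime.h>):
- (void)setWeight:(NSInteger)weight {
    // The boxed NSNumber has to be retained by the association; with OBJC_ASSOCIATION_ASSIGN the pointer could dangle.
    // @selector(weight) serves as the key because it is a unique, constant pointer.
    objc_setAssociatedObject(self, @selector(weight), @(weight), OBJC_ASSOCIATION_RETAIN_NONATOMIC);
}
- (NSInteger)weight {
    return [objc_getAssociatedObject(self, @selector(weight)) integerValue];
}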

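So when a category adds a method to a class, how does that method actually get attached? With that question in mind, let's look at the runtime source.

First create a Student class that inherits from NSObject, then a Student+Category category. Running clang -rewrite-objc Student+Category.m in a terminal rewrites the file into C++ so we can inspect its structure.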

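// Rewritten C++ output
struct _category_t {
    const char *name;
    struct _class_t *cls;
    
    // name and cls at the top presumably relate to the class name and the category name
    const struct _method_list_t *instance_methods; // instance methods added by the category
    const struct _method_list_t *class_methods;    // class methods added by the category
    const struct _protocol_list_t *protocols;      // protocols the category conforms to
    const struct _prop_list_t *properties;         // properties added by the category
};

/*
 Searching the output for "_category_t" shows that every category, once rewritten to C++,
 exists as a struct of the form above. Each additional category adds one more such struct,
 and at runtime all of these structs are merged into the class.
 */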

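Question: why can a Category add methods but not instance variables?

Each category is compiled into a variable of type _category_t.
The layout of _category_t shows that properties are stored in a _prop_list_t, but there is nothing corresponding to the class's objc_ivar_list. The struct simply has nowhere to store ivars, so a category cannot add instance variables.

1. The analysis above confirms that the instance methods, class methods and properties we define in a category are all stored in the category_t struct. Next we go back to the runtime source to see how the methods, properties and protocols held in category_t end up in the class object.
2. The struct mainly holds the category's instance methods and class methods. Once attached, the instance_methods list is a subset of the method lists of objc_class, and the class_methods list is a subset of the metaclass's method lists.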

2. Source code analysis

Start with the runtime initialization function:

/***********************************************************************
* _objc_init
* Bootstrap initialization. Registers our image notifier with dyld.
* Called by libSystem BEFORE library initialization time
**********************************************************************/

// Entry point for loading the runtime
void _objc_init(void)
{
    static bool initialized = false;
    if (initialized) return;
    initialized = true;
    
    // fixme defer initialization until an objc-using image is found?
    environ_init();
    tls_init();
    static_init();
    lock_init();
    exception_init();

    _dyld_objc_notify_register(&map_images, load_images, unmap_image);
}

Images are read in through map_images, which calls map_images_nolock, which in turn calls _read_images. The category-related code lives in _read_images:

    // Discover and process all categories
    for (EACH_HEADER) {
        // Outer loop: for each loaded image (header), fetch that image's category list
        category_t **catlist = 
            _getObjc2CategoryList(hi, &count);
        bool hasClassProperties = hi->info()->hasCategoryClassProperties();

        // Inner loop: walk every category in this image's list
        for (i = 0; i < count; i++) {
            category_t *cat = catlist[i];
            Class cls = remapClass(cat->cls);
            
            if (!cls) {
                catlist[i] = nil;
                if (PrintConnecting) {
                    _objc_inform("CLASS: IGNORING category \?\?\?(%s) %p with "
                                 "missing weak-linked target class", 
                                 cat->name, cat);
                }
                continue;
            }
 
            // First, register the category with its target class. If the class is already realized, rebuild its method lists.
            bool classExists = NO;
            if (cat->instanceMethods ||  cat->protocols  
                ||  cat->instanceProperties) 
            {
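                // Record the category in the unattached-categories table, keyed by class; the value is the list of categories pending for that class
                addUnattachedCategoryForClass(cat, cls, hi);
                // If the class is already realized, attach the category's methods, protocols and properties to it now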
                if (cls->isRealized()) {
                    remethodizeClass(cls);
                    classExists = YES;
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category -%s(%s) %s", 
                                 cls->nameForLogging(), cat->name, 
                                 classExists ? "on existing class" : "");
                }
            }

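            // Same logic as above, except that it operates on the metaclass (cls->ISA()) rather than the class
            // In other words, as far as the code is concerned, categories are attached to the metaclass as well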
            if (cat->classMethods  ||  cat->protocols  
                ||  (hasClassProperties && cat->_classProperties)) 
            {
                addUnattachedCategoryForClass(cat, cls->ISA(), hi);
                if (cls->ISA()->isRealized()) {
                    remethodizeClass(cls->ISA());
                }
                if (PrintConnecting) {
                    _objc_inform("CLASS: found category +%s(%s)", 
                                 cls->nameForLogging(), cat->name);
                }
            }
        }
    }

    ts.log("IMAGE TIMES: discover categories");

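1. This code discovers categories: it fetches the category list with _getObjc2CategoryList and walks it, looking at each category's methods, protocols and properties.
2. For a class (or metaclass) that is already realized, both branches end up calling remethodizeClass.

Next, remethodizeClass: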

/***********************************************************************
* remethodizeClass
* Attach outstanding categories to an existing class.
* Fixes up cls's method list, protocol list, and property list.
* Updates method caches for cls and its subclasses.
* Locking: runtimeLock must be held by the caller
**********************************************************************/

// Attach the category's methods, properties and protocols to the class
static void remethodizeClass(Class cls)
{
    category_list *cats;
    bool isMeta;

    runtimeLock.assertWriting();

    isMeta = cls->isMetaClass();

    // Look up the class's pending categories in the category hash table and remove the found entry from the table
    if ((cats = unattachedCategoriesForClass(cls, false/*not realizing*/))) {
        if (PrintConnecting) {
            _objc_inform("CLASS: attaching categories to class '%s' %s", 
                         cls->nameForLogging(), isMeta ? "(meta)" : "");
        }
        
        attachCategories(cls, cats, true /*flush caches*/);        
        free(cats);
    }
}

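attachCategories receives the class object cls and the category list cats. As noted earlier, a single category is described by a category_t struct; multiple categories are collected in a category_list.

Next, attachCategories: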

// Attach method lists and properties and protocols from categories to a class.
// Assumes the categories in cats are all loaded and sorted by load order, 
// oldest categories first.

// Collect each category's protocol list, property list and method list, then add them to the target class with attachLists
static void 
attachCategories(Class cls, category_list *cats, bool flush_caches)
{
    if (!cats) return;
    if (PrintReplacedMethods) printReplacements(cls, cats);
    // Is cls a metaclass?
    bool isMeta = cls->isMetaClass();

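    // Allocate one slot per category for each kind of list
    // mlists is an array of method-list pointers (effectively a 2D array of methods)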
    method_list_t **mlists = (method_list_t **)
        malloc(cats->count * sizeof(*mlists));
    property_list_t **proplists = (property_list_t **)
        malloc(cats->count * sizeof(*proplists));
    protocol_list_t **protolists = (protocol_list_t **)
        malloc(cats->count * sizeof(*protolists));

    // Count backwards through cats to get newest categories first
    int mcount = 0;
    int propcount = 0;
    int protocount = 0;
    int i = cats->count;
    bool fromBundle = NO;
    
    // Walk the categories and collect their protocol, property and method lists
    while (i--) {
        auto& entry = cats->list[i];
        // Pick the class methods or the instance methods depending on whether cls is a metaclass
        method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
        if (mlist) {
            // 1. Store every category's method list in mlists
            mlists[mcount++] = mlist;
            fromBundle |= entry.hi->isBundle();
        }

        property_list_t *proplist = 
            entry.cat->propertiesForMeta(isMeta, entry.hi);
        if (proplist) {
            // 2. Store every category's property list in proplists
            proplists[propcount++] = proplist;
        }

        protocol_list_t *protolist = entry.cat->protocols;
        if (protolist) {
            // 3. Store every category's protocol list in protolists
            protolists[protocount++] = protolist;
        }
    }

    // rw is the class_rw_t struct that holds the class's methods, properties and protocols
    auto rw = cls->data();

    // Attach each kind of list to the class,
    // then free the temporary array
    prepareMethodLists(cls, mlists, mcount, NO, fromBundle);
    rw->methods.attachLists(mlists, mcount);
    free(mlists);
    if (flush_caches  &&  mcount > 0) flushCaches(cls);

    rw->properties.attachLists(proplists, propcount);
    free(proplists);

    rw->protocols.attachLists(protolists, protocount);
    free(protolists);
}

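1. Three temporary arrays are malloc'ed, one each for method lists, property lists and protocol lists. Each has one pointer-sized slot per category.
2. The categories are walked from last to first, and each category's method list, property list and protocol list is stored in mlists, proplists and protolists. Together these three arrays hold the methods, properties and protocols of all the categories.
3. cls->data() returns the class's class_rw_t struct, rw, which holds the class's own methods, properties and protocols.
4. attachLists is called on rw's methods, properties and protocols in turn, passing in the corresponding array collected from the categories.
5. Inside attachLists, the categories' methods, properties and protocols are merged with those of the class itself.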
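
The backward walk in step 2 can be sketched in plain C. This is a minimal model, not the runtime's code: method_list, category and collect_method_lists are hypothetical stand-ins for method_list_t, category_t and the while (i--) loop in attachCategories.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the runtime types; only the shape matters. */
typedef struct { const char *name; } method_list;   /* one category's methods */
typedef struct { method_list *methods; } category;  /* methods may be NULL */

/* Walk cats[] from the last entry to the first and copy every non-NULL
 * method list into out[], which must have room for count entries.
 * Returns the number of lists collected. */
size_t collect_method_lists(category *const *cats, size_t count,
                            method_list **out)
{
    size_t mcount = 0;
    size_t i = count;
    while (i--) {   /* visits count-1, ..., 1, 0; never runs when count is 0 */
        method_list *mlist = cats[i]->methods;
        if (mlist) out[mcount++] = mlist;
    }
    return mcount;
}
```

Because the walk starts at the end, the list of the category that was loaded last lands in out[0], and categories without methods are skipped.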

Next, the inside of attachLists:

 void attachLists(List* const * addedLists, uint32_t addedCount) {
        if (addedCount == 0) return;

        if (hasArray()) {
            // many lists -> many lists

            // array()->lists: the existing array of lists
            // addedLists: the array of lists coming from the categories
            uint32_t oldCount = array()->count;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
            array()->count = newCount;
            
            // Shift the existing lists back by addedCount slots to make room at the front for the new lists
            memmove(array()->lists + addedCount, array()->lists,
                    oldCount * sizeof(array()->lists[0]));
            
            // Copy the new lists into the front
            memcpy(array()->lists, addedLists, 
            addedCount * sizeof(array()->lists[0]));
        }
        else if (!list  &&  addedCount == 1) {
            // 0 lists -> 1 list
            list = addedLists[0];
        } 
        else {
            // 1 list -> many lists
            List* oldList = list;
            uint32_t oldCount = oldList ? 1 : 0;
            uint32_t newCount = oldCount + addedCount;
            setArray((array_t *)malloc(array_t::byteSize(newCount)));
            array()->count = newCount;
            if (oldList) array()->lists[addedCount] = oldList;
            memcpy(array()->lists, addedLists, 
                   addedCount * sizeof(array()->lists[0]));
        }
    }

    void tryFree() {
        if (hasArray()) {
            for (uint32_t i = 0; i < array()->count; i++) {
                try_free(array()->lists[i]);
            }
            try_free(array());
        }
        else if (list) {
            try_free(list);
        }
    }

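1. Two arrays matter here: (a) array()->lists, the class's existing method, property or protocol lists; (b) addedLists, the method, property or protocol lists coming from all the categories.
2. The new length is computed and realloc is called with newCount, because the existing array has to grow to hold the categories' lists.
3. memmove shifts the existing lists back by addedCount slots, leaving room at the front for the new lists.
4. memcpy copies the new lists into the front. The shift in step 3 moves elements within the array's own memory, so source and destination can overlap. memcpy gives no guarantee for overlapping regions and could lose elements, whereas memmove handles overlap correctly, which is why memmove is used for the shift.
5. When a category defines a method with the same name as one in the class, a call runs the category's version, because the category's list sits at the front and method lookup stops at the first match. The class's original implementation is still in the list; it is shadowed rather than replaced. That is what "a category method overrides the original method" really means.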
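
The prepend logic in steps 2 to 4 can be tried out in isolation. The following is a simplified sketch in plain C, assuming each "list" is reduced to a label; list_array and attach_lists are hypothetical names that only model the many-lists branch of attachLists.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified model: a "list" is just a label, and the class
 * keeps a growable array of list pointers, like array()->lists. */
typedef struct {
    const char **lists;   /* lists[0] is searched first */
    size_t count;
} list_array;

/* Prepend added[0..added_count) to arr, keeping the old entries after them.
 * Returns 0 on success, -1 if realloc fails (arr is left unchanged). */
int attach_lists(list_array *arr, const char *const *added, size_t added_count)
{
    if (added_count == 0) return 0;
    size_t old_count = arr->count;
    size_t new_count = old_count + added_count;
    const char **grown = realloc(arr->lists, new_count * sizeof *grown);
    if (!grown) return -1;
    arr->lists = grown;
    /* Source and destination overlap inside the same buffer: use memmove. */
    memmove(grown + added_count, grown, old_count * sizeof *grown);
    /* Different buffers, no overlap: memcpy is enough. */
    memcpy(grown, added, added_count * sizeof *grown);
    arr->count = new_count;
    return 0;
}
```

Starting from an array that holds only the class's own list and attaching two category lists leaves the category lists at indexes 0 and 1 and the class's list at index 2, which is why a lookup that stops at the first match finds the category's method.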

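Summary

1. A category's methods, properties and protocols are compiled into a category_t struct; at runtime the lists in that struct are merged into the class object's lists.
2. A category can declare properties, but neither the instance variable nor the setter/getter implementations are generated automatically.
3. category_t has no slot for instance variables. Instance variables are stored in each instance, and the instance layout is fixed at compile time.
4. Categories are only attached at runtime, and by then the layout of an instance can no longer be changed to make room for new instance variables. That is why a category cannot add instance variables.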
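
Since the instance layout cannot grow, state added by a category has to live outside the instance. The sketch below illustrates that idea in plain C with a side table keyed by (object, key), in the spirit of objc_setAssociatedObject. It is an illustration only, not how the runtime implements associations; student, assoc_set, assoc_get and weight_key are hypothetical names.

```c
#include <stddef.h>

/* The instance layout is fixed when the class is compiled. */
struct student { long age; };

/* Hypothetical side table, one entry per (object, key) pair. A real runtime
 * would use a hash map; a small fixed array keeps the sketch short. */
typedef struct {
    const void *object;
    const void *key;
    long value;
    int used;
} assoc_entry;

#define ASSOC_CAPACITY 16
static assoc_entry assoc_table[ASSOC_CAPACITY];

/* The address of this variable serves as a unique key. */
const char weight_key = 0;

/* Store value for (object, key). Returns 0 on success, -1 if the table is full. */
int assoc_set(const void *object, const void *key, long value)
{
    assoc_entry *free_slot = NULL;
    for (size_t i = 0; i < ASSOC_CAPACITY; i++) {
        assoc_entry *e = &assoc_table[i];
        if (e->used && e->object == object && e->key == key) {
            e->value = value;   /* overwrite an existing association */
            return 0;
        }
        if (!e->used && !free_slot) free_slot = e;
    }
    if (!free_slot) return -1;
    free_slot->object = object;
    free_slot->key = key;
    free_slot->value = value;
    free_slot->used = 1;
    return 0;
}

/* Return the stored value, or 0 when nothing is associated. */
long assoc_get(const void *object, const void *key)
{
    for (size_t i = 0; i < ASSOC_CAPACITY; i++) {
        const assoc_entry *e = &assoc_table[i];
        if (e->used && e->object == object && e->key == key) return e->value;
    }
    return 0;
}
```

Calling assoc_set(&s, &weight_key, 60) on a struct student s stores the weight without touching the struct itself, and assoc_get on an object that was never given a weight returns 0.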

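3. +load and +initialize

The two differ in how they are called and when they are called.

+load is called at program startup, when the class's information is loaded into the runtime.
The runtime source shows that the classes' +load methods are called first, and the categories' +load methods afterwards.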

void call_load_methods(void)
{
    static bool loading = NO;
    bool more_categories;

    loadMethodLock.assertLocked();

    // Re-entrant calls do nothing; the outermost call will finish the job.
    if (loading) return;
    loading = YES;

    void *pool = objc_autoreleasePoolPush();

    do {
        // 1. Repeatedly call class +loads until there aren't any more
        while (loadable_classes_used > 0) {
            call_class_loads();   // 1. class +load methods first
        }

        // 2. Call category +loads ONCE
        more_categories = call_category_loads();  // 2. then category +load methods

        // 3. Run more +loads if there are classes OR more untried categories
    } while (loadable_classes_used > 0  ||  more_categories);

    objc_autoreleasePoolPop(pool);

    loading = NO;
}

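1. The source confirms that class +load methods run before category +load methods.
2. To verify this, create an Animal class that inherits from AnimalFather, plus an Animal+fun category. Implement +load in each of them, logging with NSLog(@"%@, %s", [self class], __FUNCTION__);

The output is: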

2019-05-21 16:51:55.008427+0800 BaseProject[689:181934] AnimalFather, +[AnimalFather load]
2019-05-21 16:51:55.014857+0800 BaseProject[689:181934] Animal, +[Animal load]
2019-05-21 16:51:55.017435+0800 BaseProject[689:181934] Animal, +[Animal(fun) load]

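3. Class +load methods do run before category +load methods, and before a class's +load runs, its superclass's +load is guaranteed to have run.

Question

The category implements +load, yet Animal's own +load was still called. Doesn't a category method shadow the class's method of the same name?

Look at how the runtime calls +load: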

/***********************************************************************
* call_class_loads
* Call all pending class +load methods.
* If new classes become loadable, +load is NOT called for them.
*
* Called only by call_load_methods().
**********************************************************************/
static void call_class_loads(void)
{
    int i;
    
    // Detach current loadable list.
    struct loadable_class *classes = loadable_classes;
    int used = loadable_classes_used;
    loadable_classes = nil;
    loadable_classes_allocated = 0;
    loadable_classes_used = 0;
    
    // Call all +loads for the detached list.
    for (i = 0; i < used; i++) {
        Class cls = classes[i].cls;
        load_method_t load_method = (load_method_t)classes[i].method;
        if (!cls) continue; 

        if (PrintLoading) {
            _objc_inform("LOAD: +[%s load]\n", cls->nameForLogging());
        }
        (*load_method)(cls, SEL_load);
    }
    
    // Destroy the detached list.
    if (classes) free(classes);
}


/***********************************************************************
* call_category_loads
* Call some pending category +load methods.
* The parent class of the +load-implementing categories has all of 
*   its categories attached, in case some are lazily waiting for +initalize.
* Don't call +load unless the parent class is connected.
* If new categories become loadable, +load is NOT called, and they 
*   are added to the end of the loadable list, and we return TRUE.
* Return FALSE if no new categories became loadable.
*
* Called only by call_load_methods().
**********************************************************************/
static bool call_category_loads(void)
{
    int i, shift;
    bool new_categories_added = NO;
    
    // Detach current loadable list.
    struct loadable_category *cats = loadable_categories;
    int used = loadable_categories_used;
    int allocated = loadable_categories_allocated;
    loadable_categories = nil;
    loadable_categories_allocated = 0;
    loadable_categories_used = 0;

    // Call all +loads for the detached list.
    
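    /*
     +load is called directly through its function pointer (load_method), not
     through the message-sending machinery.

     Category +load methods are called the same way. So, as the experiment above
     showed, implementing +load in a category does not replace the class's own
     +load: both are called.
     */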
    
    for (i = 0; i < used; i++) {
        Category cat = cats[i].cat;
        load_method_t load_method = (load_method_t)cats[i].method;
        Class cls;
        if (!cat) continue;

        cls = _category_getClass(cat);
        if (cls  &&  cls->isLoadable()) {
            if (PrintLoading) {
                _objc_inform("LOAD: +[%s(%s) load]\n", 
                             cls->nameForLogging(), 
                             _category_getName(cat));
            }
            (*load_method)(cls, SEL_load);
            cats[i].cat = nil;
        }
    }

    // ... (the rest of call_category_loads is omitted in this excerpt)
}

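4. +load is called directly through the method's address rather than through message sending, and this holds for categories as well. So, as the experiment showed, a +load in a category neither takes precedence over nor replaces the class's own +load: the class's +load runs first and the category's +load runs afterwards.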
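
The calling convention can be modelled in a few lines of plain C. This is a sketch under stated assumptions: the function names and the load_log array are hypothetical, and the two arrays stand in for the runtime's loadable_classes and loadable_categories lists.

```c
#include <stddef.h>

typedef void (*load_fn)(void);

/* Hypothetical log of which +load ran, in call order. */
#define LOG_CAPACITY 8
const char *load_log[LOG_CAPACITY];
size_t load_log_count = 0;

static void record(const char *who)
{
    if (load_log_count < LOG_CAPACITY) load_log[load_log_count++] = who;
}

void animal_father_load(void) { record("+[AnimalFather load]"); }
void animal_load(void)        { record("+[Animal load]"); }
void animal_fun_load(void)    { record("+[Animal(fun) load]"); }

/* Call every entry through its stored function pointer. No lookup by name
 * takes place, so one entry can never shadow another. */
void call_loads(const load_fn *fns, size_t count)
{
    for (size_t i = 0; i < count; i++) fns[i]();
}

/* Classes first (superclass before subclass), then categories. */
void call_load_methods_sketch(void)
{
    const load_fn class_loads[] = { animal_father_load, animal_load };
    const load_fn category_loads[] = { animal_fun_load };
    call_loads(class_loads, 2);
    call_loads(category_loads, 1);
}
```

After call_load_methods_sketch() returns, load_log holds the three entries in the same order as the NSLog output shown earlier: AnimalFather, Animal, then Animal(fun).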

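Note

1. +(void)load and the autorelease pool:
A common warning says that no autorelease pool exists yet when the runtime calls +(void)load, so code in +load must not rely on autorelease. That does not match the call_load_methods source shown above: the runtime calls objc_autoreleasePoolPush() before any +load runs and objc_autoreleasePoolPop(pool) afterwards, so autoreleased objects created in +load are released when that pool is popped. It is still a good idea to keep +load small, because it runs while images are still being loaded.

Next, add an +initialize method to AnimalFather, Animal and Animal+fun.

The output is: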

2019-05-20 16:47:34.611819+0800 BaseProject[489:58023] AnimalFather, +[AnimalFather initialize]
2019-05-20 16:47:34.611933+0800 BaseProject[489:58023] Animal, +[Animal initialize]

So the category's implementation shadows the implementation in the class itself.
First, the runtime source for +initialize:

void callInitialize(Class cls){
    // Sent through the message-sending machinery
    ((void(*)(Class, SEL))objc_msgSend)(cls, SEL_initialize);
    asm("");
}

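1. +initialize is sent through objc_msgSend. Message sending follows the isa pointer and searches the method list for the implementation; the category's implementation sits at the front, so it is found first and is the one that runs.
2. +initialize is called the first time a class receives a message, that is, the first time the class is used. Before a class's +initialize is called, its superclass's +initialize is guaranteed to have been called. Once a class has been initialized, the runtime does not send it +initialize again. When a category implements +initialize, the category's version is called and the class's own version is shadowed.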
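
Both points can be modelled with a first-match lookup and a per-class flag. The following plain C sketch makes simplifying assumptions: selectors are compared as strings, and klass, method, lookup and initialize_if_needed are hypothetical names rather than runtime APIs.

```c
#include <stddef.h>
#include <string.h>

typedef const char *(*imp_fn)(void);
typedef struct { const char *selector; imp_fn imp; } method;

typedef struct {
    const method *methods;   /* category methods sit in front of the class's own */
    size_t method_count;
    int initialized;         /* set once "initialize" has been sent */
} klass;

const char *animal_initialize(void)     { return "+[Animal initialize]"; }
const char *animal_fun_initialize(void) { return "+[Animal(fun) initialize]"; }

/* First match wins, like a linear scan of the merged method list. */
imp_fn lookup(const klass *cls, const char *selector)
{
    for (size_t i = 0; i < cls->method_count; i++) {
        if (strcmp(cls->methods[i].selector, selector) == 0)
            return cls->methods[i].imp;
    }
    return NULL;
}

/* Run "initialize" at most once per class. Returns the name of the
 * implementation that ran, or NULL if nothing ran this time. */
const char *initialize_if_needed(klass *cls)
{
    if (cls->initialized) return NULL;
    cls->initialized = 1;
    imp_fn imp = lookup(cls, "initialize");
    return imp ? imp() : NULL;
}
```

With a method array that lists the category's entry before the class's own, the first call to initialize_if_needed runs the category's implementation and every later call returns NULL.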

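Note

1. +(void)initialize is sent to a class just before it receives its first message.
"First message" needs a qualification: the runtime's own call to + (void)load does not count, because +load is invoked directly through its function pointer rather than by message sending.
So when both +(void)load and +(void)initialize log with NSLog(@"%@, %s", [self class], __FUNCTION__);, the [self class] call inside +load is the first real message the class receives, and it triggers +initialize before the +load line is printed.
The output is: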

2019-05-20 16:51:18.487864+0800 BaseProject[503:59074] AnimalFather, +[AnimalFather initialize]
2019-05-20 16:51:18.488139+0800 BaseProject[503:59074] AnimalFather, +[AnimalFather load]
2019-05-20 16:51:18.495708+0800 BaseProject[503:59074] Animal, +[Animal(fun) initialize]
2019-05-20 16:51:18.495755+0800 BaseProject[503:59074] Animal, +[Animal load]
2019-05-20 16:51:18.498858+0800 BaseProject[503:59074] Animal, +[Animal(fun) load]

2. During the lifetime of the application, the runtime sends + (void)initialize to each class only once. Creating further instances does not trigger it again.
For example:

    Animal *a = [[Animal alloc]init];
    
    [a loadName2];
    
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, (int64_t)(3 * NSEC_PER_SEC)), dispatch_get_main_queue(), ^{
        Animal *a1 = [[Animal alloc]init];
        NSLog(@"Animal instantiated again; initialize is not called a second time");
    });

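Summary:

1. After the first Animal instance is created, creating a1 three seconds later does not call +initialize again.
2. If a subclass does not implement + (void)initialize, or explicitly calls [super initialize], the superclass's implementation runs for the subclass as well. In other words, the superclass's + (void)initialize may be called more than once.
3. If a class has a category that implements +initialize, the category's implementation is called and the class's own implementation is not.
4. The superclass's +initialize is called before the subclass's.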
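
Point 2 is the one that most often surprises people, so here is a small plain C model of it. It assumes Animal does not implement +initialize; klass, ensure_initialized and the two counters are hypothetical names, and the self check mirrors the usual Objective-C guard if (self == [AnimalFather class]).

```c
#include <stddef.h>

typedef struct klass klass;
typedef void (*init_fn)(klass *self);

struct klass {
    const char *name;
    klass *superclass;
    init_fn initialize;   /* NULL when the class does not implement it */
    int initialized;
};

int father_initialize_calls = 0;   /* every invocation of the parent's body */
int father_guarded_runs = 0;       /* invocations that passed the self check */

extern klass animal_father;

void animal_father_initialize(klass *self)
{
    father_initialize_calls++;
    /* Guard so the one-time setup runs only for AnimalFather itself. */
    if (self == &animal_father) father_guarded_runs++;
}

klass animal_father = { "AnimalFather", NULL, animal_father_initialize, 0 };
klass animal = { "Animal", &animal_father, NULL, 0 };

/* Resolve initialize by walking up the superclass chain, as dispatch does. */
static init_fn find_initialize(const klass *cls)
{
    for (; cls; cls = cls->superclass) {
        if (cls->initialize) return cls->initialize;
    }
    return NULL;
}

/* Initialize the superclass first, then cls itself, each at most once. */
void ensure_initialized(klass *cls)
{
    if (!cls || cls->initialized) return;
    ensure_initialized(cls->superclass);
    cls->initialized = 1;
    init_fn imp = find_initialize(cls);
    if (imp) imp(cls);
}
```

Calling ensure_initialized(&animal) runs the parent's body twice, once with self equal to AnimalFather and once with self equal to Animal, while the guarded part runs only once. A second call does nothing.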

Finally

If anything here is wrong, corrections are welcome. Thank you.
