How iOS Class Is Implemented: Structure Analysis

Author: 野码道人 | Published: 2021-01-24 02:42

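    This article addresses the following questions

    1. What Class is
    2. The memory layout of Class
    3. The design philosophy behind class_rw_t and class_ro_t
    4. How categories relate to class_rw_t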

    Reading the source (source version objc4-781.2)

    Opening objc-private.h shows that Class is a pointer to a struct

    typedef struct objc_class *Class;
    

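    Searching the source for "struct objc_class" turns up definitions in five header files, as the figure shows. The one in objc-runtime-new.h is the definition in effect for Objective-C 2.0; the others are guarded by macros that exclude them.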

    [Figure: searchobjc_class.jpg, search results for "struct objc_class"]

    A simplified definition of the objc_class struct:

    struct objc_class : objc_object {
        // Class ISA;
        Class superclass;
        cache_t cache;             // formerly cache pointer and vtable
        class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    
        class_rw_t *data() const {
            return bits.data();
        }
        ...
    };
    

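    objc_class inherits from objc_object. (C++ extends the C struct: a struct may define member functions and inherit from another struct, and its default member access and default inheritance are public, which is what distinguishes it from a C++ class.) Next, look at the definition of objc_object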

    struct objc_object {
    private:
        isa_t isa;
    
    public:
    
        // ISA() assumes this is NOT a tagged pointer object
        Class ISA();
    
        // rawISA() assumes this is NOT a tagged pointer object or a non pointer ISA
        Class rawISA();
    
        // getIsa() allows this to be a tagged pointer object
        Class getIsa();
        ...
    };
    

    So for now the struct can be thought of as looking roughly like this

    struct objc_class {
        // Class ISA;
        Class superclass;
        cache_t cache;             // formerly cache pointer and vtable
        class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    
        class_rw_t *data() const {
            return bits.data();
        }
        ...
        Class ISA();
    
        // rawISA() assumes this is NOT a tagged pointer object or a non pointer ISA
        Class rawISA();
    
        // getIsa() allows this to be a tagged pointer object
        Class getIsa();
        ...
    };
    

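    The member below is private, so even code inside objc_class cannot touch isa directly and has to go through the accessor functions that objc_object wraps around it. This is ordinary encapsulation: the isa encoding can change without affecting its callers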

    private:
        isa_t isa;
    

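    Going through the members from top to bottom:

    The struct declares a superclass pointer of type Class, a cache_t object, and a class_data_bits_t object. The wording matters here. In Objective-C an object is always handled through a pointer, but a struct is different: a struct pointer takes 8 bytes on a 64-bit system, whereas a struct object embedded by value takes the sum of its members' sizes plus any padding required by alignment. On iOS, instance sizes are aligned to 8 bytes and the allocator hands out memory in 16-byte units, both for access efficiency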

    cache_t structure analysis

    A simplified definition of cache_t follows; all member variables are kept and the member functions are omitted

    struct cache_t {
    #if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
        explicit_atomic<struct bucket_t *> _buckets;
        explicit_atomic<mask_t> _mask;
    #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
        explicit_atomic<uintptr_t> _maskAndBuckets;
        mask_t _mask_unused;
        
        // How much the mask is shifted by.
        static constexpr uintptr_t maskShift = 48;
        
        // Additional bits after the mask which must be zero. msgSend
        // takes advantage of these additional bits to construct the value
        // `mask << 4` from `_maskAndBuckets` in a single instruction.
        static constexpr uintptr_t maskZeroBits = 4;
        
        // The largest mask value we can store.
        static constexpr uintptr_t maxMask = ((uintptr_t)1 << (64 - maskShift)) - 1;
        
        // The mask applied to `_maskAndBuckets` to retrieve the buckets pointer.
        static constexpr uintptr_t bucketsMask = ((uintptr_t)1 << (maskShift - maskZeroBits)) - 1;
        
        // Ensure we have enough bits for the buckets pointer.
        static_assert(bucketsMask >= MACH_VM_MAX_ADDRESS, "Bucket field doesn't have enough bits for arbitrary pointers.");
    #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
        // _maskAndBuckets stores the mask shift in the low 4 bits, and
        // the buckets pointer in the remainder of the value. The mask
        // shift is the value where (0xffff >> shift) produces the correct
        // mask. This is equal to 16 - log2(cache_size).
        explicit_atomic<uintptr_t> _maskAndBuckets;
        mask_t _mask_unused;
    
        static constexpr uintptr_t maskBits = 4;
        static constexpr uintptr_t maskMask = (1 << maskBits) - 1;
        static constexpr uintptr_t bucketsMask = ~maskMask;
    #else
    #error Unknown cache mask storage type.
    #endif
        
    #if __LP64__
        uint16_t _flags;
    #endif
        uint16_t _occupied;
    
    public:
        ...
    };
    

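    That is still fairly long, so let us read through it. It contains conditional-compilation directives and some static members. Only one branch of the #if chain is compiled, and static constexpr members are not stored inside each instance, so they do not contribute to sizeof(cache_t). How much memory does a cache_t object occupy, then? Trim the struct once more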

    struct cache_t {
    #if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
        explicit_atomic<struct bucket_t *> _buckets;
        explicit_atomic<mask_t> _mask;
    #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
        explicit_atomic<uintptr_t> _maskAndBuckets;
        mask_t _mask_unused;
    #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
        explicit_atomic<uintptr_t> _maskAndBuckets;
        mask_t _mask_unused;
    #else
    #error Unknown cache mask storage type.
    #endif
        
    #if __LP64__
        uint16_t _flags;
    #endif
        uint16_t _occupied;
    
    public:
        ...
    };
    

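    explicit_atomic is a struct template that derives from std::atomic<T>; for the pointer-sized types used here it has the same size as T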

    template <typename T>
    struct explicit_atomic : public std::atomic<T> {
        explicit explicit_atomic(T initial) noexcept : std::atomic<T>(std::move(initial)) {}
        operator T() const = delete;
        
        T load(std::memory_order order) const noexcept {
            return std::atomic<T>::load(order);
        }
        void store(T desired, std::memory_order order) noexcept {
            std::atomic<T>::store(desired, order);
        }
        static explicit_atomic<T> *from_pointer(T *ptr) {
            static_assert(sizeof(explicit_atomic<T> *) == sizeof(T *),
                          "Size of atomic must match size of original");
            explicit_atomic<T> *atomic = (explicit_atomic<T> *)ptr;
            ASSERT(atomic->is_lock_free());
            return atomic;
        }
    };
    

    What is mask_t? On 64-bit targets it is a 32-bit unsigned integer, 4 bytes

    typedef uint32_t mask_t;
    

    And how is uintptr_t defined? As an unsigned long, which takes 8 bytes on a 64-bit system

    typedef unsigned long int uintptr_t;
    

    Now the size of cache_t on a 64-bit system can be added up: 8 + 4 + 2 + 2 = 16 bytes

    class_data_bits_t structure analysis

    It has a single member variable, so 8 bytes

    struct class_data_bits_t {
        uintptr_t bits;
    };
    

    Next, look at this const member function, which returns a class_rw_t pointer

    class_rw_t *data() const {
        return bits.data();
    }
    

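    class_rw_t structure analysis

    A simplified definition of class_rw_t follows. This is where the core data finally appears: the four functions below return, in order, a pointer to class_ro_t and objects of type method_array_t, property_array_t, and protocol_array_t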

    struct class_rw_t {
        const class_ro_t *ro() const {...}
        const method_array_t methods() const {...}
        const property_array_t properties() const {...}
        const protocol_array_t protocols() const {...}
    };
    

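    Leaving class_ro_t aside for now and reading on, all three array types derive from the class template list_array_tt, which implements the management functions for attaching, storing, and freeing lists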

    class method_array_t : public list_array_tt<method_t, method_list_t> 
    {
        ...
    };
    
    class property_array_t : public list_array_tt<property_t, property_list_t> 
    {
        ...
    };
    
    class protocol_array_t : public list_array_tt<protocol_ref_t, protocol_list_t> 
    {
        ...
    };
    

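    This one deserves a close read:

    attachLists, a member of the class template, is the core function behind Objective-C's dynamic behavior. If the container already holds an array of lists, it grows the array with realloc, uses memmove to shift the old lists back by addedCount slots, and then uses memcpy to copy addedLists into the front. Otherwise, if there is no list yet and exactly one is being added, it simply stores addedLists[0]. In the remaining case it converts a single list into an array: it allocates the array, places the old list at the end, and copies addedLists to the front. In every case the new lists end up ahead of the old ones, so at the data-structure level methods, properties, and protocols can all be updated at run time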

    template <typename Element, typename List>
    class list_array_tt {
        void attachLists(List* const * addedLists, uint32_t addedCount) {
            if (addedCount == 0) return;
    
            if (hasArray()) {
                // many lists -> many lists
                uint32_t oldCount = array()->count;
                uint32_t newCount = oldCount + addedCount;
                setArray((array_t *)realloc(array(), array_t::byteSize(newCount)));
                array()->count = newCount;
                memmove(array()->lists + addedCount, array()->lists, 
                        oldCount * sizeof(array()->lists[0]));
                memcpy(array()->lists, addedLists, 
                       addedCount * sizeof(array()->lists[0]));
            }
            else if (!list  &&  addedCount == 1) {
                // 0 lists -> 1 list
                list = addedLists[0];
            } 
            else {
                // 1 list -> many lists
                List* oldList = list;
                uint32_t oldCount = oldList ? 1 : 0;
                uint32_t newCount = oldCount + addedCount;
                setArray((array_t *)malloc(array_t::byteSize(newCount)));
                array()->count = newCount;
                if (oldList) array()->lists[addedCount] = oldList;
                memcpy(array()->lists, addedLists, 
                       addedCount * sizeof(array()->lists[0]));
            }
        }
    };
    

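    class_ro_t structure analysis

    This should look familiar: its base lists are what ends up as the oldList mentioned above. It likewise holds methods, properties, and protocols, plus instance variables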

    struct class_ro_t {
        uint32_t flags;
        uint32_t instanceStart;
        uint32_t instanceSize;
    #ifdef __LP64__
        uint32_t reserved;
    #endif
    
        const uint8_t * ivarLayout;
        
        const char * name;
        method_list_t * baseMethodList;
        protocol_list_t * baseProtocols;
        const ivar_list_t * ivars;
    
        const uint8_t * weakIvarLayout;
        property_list_t *baseProperties;
    };
    

    What shows that the property and method lists in class_ro_t are the old lists? Look at another piece of source

    /***********************************************************************
    * realizeClassWithoutSwift
    * Performs first-time initialization on class cls, 
    * including allocating its read-write data.
    * Does not perform any Swift-side initialization.
    * Returns the real class structure for the class. 
    * Locking: runtimeLock must be write-locked by the caller
    **********************************************************************/
    static Class realizeClassWithoutSwift(Class cls, Class previously)
    {
        runtimeLock.assertLocked();
    
        class_rw_t *rw;
        Class supercls;
        Class metacls;
    
        if (!cls) return nil;
        if (cls->isRealized()) return cls;
        ASSERT(cls == remapClass(cls));
    
        // fixme verify class is not in an un-dlopened part of the shared cache?
    
        auto ro = (const class_ro_t *)cls->data();
        auto isMeta = ro->flags & RO_META;
        if (ro->flags & RO_FUTURE) {
            // This was a future class. rw data is already allocated.
            rw = cls->data();
            ro = cls->data()->ro();
            ASSERT(!isMeta);
            cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
        } else {
            // Normal class. Allocate writeable class data.
            rw = objc::zalloc<class_rw_t>();
            rw->set_ro(ro);
            rw->flags = RW_REALIZED|RW_REALIZING|isMeta;
            cls->setData(rw);
        }
        ...
    }
    

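    Apple's comment is clear: "Performs first-time initialization on class cls". Every class passes through this function the first time it is realized. The class's initial information is stored in class_ro_t, and before realization cls->data() actually points at that class_ro_t, which is why the result is cast to const class_ro_t *. The function then allocates a class_rw_t, stores ro in it with set_ro, and installs it with setData, so from then on bits.data() returns the rw pointer. And what is bits? It should look familiar as well: it is the class_data_bits_t member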

    struct objc_class {
        // Class ISA;
        Class superclass;
        cache_t cache;             // formerly cache pointer and vtable
        class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    
        class_rw_t *data() const {
            return bits.data();
        }
        ...
    };
    

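    Here is one more piece of source, this time unabridged. When a class first needs its class_rw_ext_t (reached through extAllocIfNeeded, for example when categories are attached), extAlloc runs: it takes method_list_t, property_list_t, and protocol_list_t from ro and merges them into the extension of rw by calling attachLists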

    class_rw_ext_t *class_rw_t::extAlloc(const class_ro_t *ro, bool deepCopy)
    {
        runtimeLock.assertLocked();
    
        auto rwe = objc::zalloc<class_rw_ext_t>();
    
        rwe->version = (ro->flags & RO_META) ? 7 : 0;
    
        method_list_t *list = ro->baseMethods();
        if (list) {
            if (deepCopy) list = list->duplicate();
            rwe->methods.attachLists(&list, 1);
        }
    
        // See comments in objc_duplicateClass
        // property lists and protocol lists historically
        // have not been deep-copied
        //
        // This is probably wrong and ought to be fixed some day
        property_list_t *proplist = ro->baseProperties;
        if (proplist) {
            rwe->properties.attachLists(&proplist, 1);
        }
    
        protocol_list_t *protolist = ro->baseProtocols;
        if (protolist) {
            rwe->protocols.attachLists(&protolist, 1);
        }
    
        set_ro_or_rwe(rwe, ro);
        return rwe;
    }
    

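    The design philosophy behind class_rw_t and class_ro_t

    Why does Apple define two similar-looking structs to implement Class? ro stands for read-only and rw for read-write. class_ro_t is a compile-time product: the properties, methods, protocols, and instance variables declared in a class's source files are already stored in class_ro_t when compilation finishes. class_rw_t is a run-time product, designed to support the dynamic nature of Class: at run time the properties, protocols, and methods in class_ro_t are merged into the corresponding run-time data structures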

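    Categories really can add properties

    What about categories? The source is long, but it is worth quoting in full. Before attachCategories attaches anything it always executes auto rwe = cls->data()->extAllocIfNeeded();. extAllocIfNeeded() calls extAlloc() when no extension exists yet, and extAlloc() copies the lists from ro into rw. The category lists are then attached in front of them, which is why a category method with the same name as a method of the original class is the one that gets called. By the same logic, when a class has several categories, the same-named method from the category loaded last is always found first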

    static void
    attachCategories(Class cls, const locstamped_category_t *cats_list, uint32_t cats_count,
                     int flags)
    {
        if (slowpath(PrintReplacedMethods)) {
            printReplacements(cls, cats_list, cats_count);
        }
        if (slowpath(PrintConnecting)) {
            _objc_inform("CLASS: attaching %d categories to%s class '%s'%s",
                         cats_count, (flags & ATTACH_EXISTING) ? " existing" : "",
                         cls->nameForLogging(), (flags & ATTACH_METACLASS) ? " (meta)" : "");
        }
    
        /*
         * Only a few classes have more than 64 categories during launch.
         * This uses a little stack, and avoids malloc.
         *
         * Categories must be added in the proper order, which is back
         * to front. To do that with the chunking, we iterate cats_list
         * from front to back, build up the local buffers backwards,
         * and call attachLists on the chunks. attachLists prepends the
         * lists, so the final result is in the expected order.
         */
        constexpr uint32_t ATTACH_BUFSIZ = 64;
        method_list_t   *mlists[ATTACH_BUFSIZ];
        property_list_t *proplists[ATTACH_BUFSIZ];
        protocol_list_t *protolists[ATTACH_BUFSIZ];
    
        uint32_t mcount = 0;
        uint32_t propcount = 0;
        uint32_t protocount = 0;
        bool fromBundle = NO;
        bool isMeta = (flags & ATTACH_METACLASS);
        auto rwe = cls->data()->extAllocIfNeeded();
    
        for (uint32_t i = 0; i < cats_count; i++) {
            auto& entry = cats_list[i];
    
            method_list_t *mlist = entry.cat->methodsForMeta(isMeta);
            if (mlist) {
                if (mcount == ATTACH_BUFSIZ) {
                    prepareMethodLists(cls, mlists, mcount, NO, fromBundle);
                    rwe->methods.attachLists(mlists, mcount);
                    mcount = 0;
                }
                mlists[ATTACH_BUFSIZ - ++mcount] = mlist;
                fromBundle |= entry.hi->isBundle();
            }
    
            property_list_t *proplist =
                entry.cat->propertiesForMeta(isMeta, entry.hi);
            if (proplist) {
                if (propcount == ATTACH_BUFSIZ) {
                    rwe->properties.attachLists(proplists, propcount);
                    propcount = 0;
                }
                proplists[ATTACH_BUFSIZ - ++propcount] = proplist;
            }
    
            protocol_list_t *protolist = entry.cat->protocolsForMeta(isMeta);
            if (protolist) {
                if (protocount == ATTACH_BUFSIZ) {
                    rwe->protocols.attachLists(protolists, protocount);
                    protocount = 0;
                }
                protolists[ATTACH_BUFSIZ - ++protocount] = protolist;
            }
        }
    
        if (mcount > 0) {
            prepareMethodLists(cls, mlists + ATTACH_BUFSIZ - mcount, mcount, NO, fromBundle);
            rwe->methods.attachLists(mlists + ATTACH_BUFSIZ - mcount, mcount);
            if (flags & ATTACH_EXISTING) flushCaches(cls);
        }
    
        rwe->properties.attachLists(proplists + ATTACH_BUFSIZ - propcount, propcount);
    
        rwe->protocols.attachLists(protolists + ATTACH_BUFSIZ - protocount, protocount);
    }
    

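    As shown above, properties declared in a category are also added to the class's property list. Why, then, is it said that categories cannot add properties, when they clearly were added?

    The reason is that a property is accessed through dot syntax, which ultimately calls a getter that reads an instance variable. For a property declared in a category, no getter or setter is synthesized. Instance variables live in class_ro_t, and a category never adds one dynamically, so there is no underscore-prefixed variable to access either; code that refers to it does not compile because the variable does not exist. To make a category property usable, implement the getter and setter by hand and simulate the instance variable's storage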

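    Summary

    Class has many implementation details; this article covers only its memory structure. The next article is planned to cover isa.

    1. What Class is

    A pointer to objc_class, a struct that inherits from objc_object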

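    2. The memory layout of Class

    isa

    For an instance it points to the class object (for a class, to its metaclass). On 32-bit it is a plain cls pointer; on 64-bit it also packs in extra information, such as whether the object has a custom C++ destructor, whether it has associated objects, whether it is weakly referenced, and whether part of the reference count is kept in the side table
    superclass
    Pointer to the superclass
    cache
    Cache of the methods that have already been called on this class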

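    class_rw_t

    The struct that holds the dynamic data; through attachLists it supports run-time updates of methods, properties, and protocols

    class_ro_t

    The static data. A class's initial information is stored in class_ro_t; at run time method_list_t, property_list_t, and protocol_list_t are taken from ro and merged into rw by calling attachLists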

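    3. The design philosophy behind class_rw_t and class_ro_t

    class_rw_t exists to support the dynamic nature of Class: at run time the properties, protocols, and methods in class_ro_t are merged into the corresponding data structures. Note that instance variables are not included, because adding or removing them dynamically would invalidate the memory layout of existing instances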

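    4. How categories relate to class_rw_t

    attachCategories merges the methods of categories into class_rw_t. By that point ro has already been merged into rw, so a category method with the same name as a method of the original class is called in preference, and among several categories of one class the same-named method from the category loaded last is always found first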


          Article link: https://www.haomeiwen.com/subject/nxykzktx.html