美文网首页
iOS底层探索 -- cache_t的结构 和 insert流程

iOS底层探索 -- cache_t的结构 和 insert流程

作者: iOS小木偶 | 来源:发表于2020-09-20 16:46 被阅读0次

    在我们探索class的底层时,我们追踪到objc_class的源码,其中重要结构为

    
    struct objc_class : objc_object {
        // Class ISA;
        Class superclass;
        cache_t cache;             // formerly cache pointer and vtable
        class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    }
    

    可以看出四个最重要的模块

    1. isa (注释掉并不是说没有,只是提醒这里继承了objc_objectisa属性)
    2. superclass (父类)
    3. cache (缓存)
    4. bits (方法变量等数据)

    当研究节点到今天时,我们已经研究了isabits 的结构 而superclass 依旧是一个class的属性 so我们还剩下一个cache_t 类型的cache还没有分析。

    所以,今天的任务,就是分析cache的结构

    cache_t lldb 分析

    在我们之前的研究过程中,lldb都是我们的三板斧之一,简单,暴力,直观。所以今天我们继续用lldb分析
    (项目基于objc的公开源码 781版本 同时项目直接在mac上运行)

    @interface FQPerson : NSObject
    @property (nonatomic, strong) NSString * name;
    @property (nonatomic, strong) NSString * nikeName;
    -(void)sayHelloWorld;
    -(void)eat1;
    -(void)eat2;
    -(void)eat3;
    -(void)eat4;
    -(void)eat5;
    -(void)eat6;
    +(void)cry;
    @end
    

    测试的类 在.m文件中实现这三个方法。

    int main(int argc, const char * argv[]) {
        @autoreleasepool {
            // insert code here...
            FQPerson *person = [FQPerson alloc];
            Class personClass = [FQPerson class];
            [person eat1];
            [person sayHelloWorld];
            NSLog(@"%@",personClass);
        }
        return 0;
    }
    

    测试的入口
    我们在[person eat];前下一个断点
    开始我们的lldb尝试
    通过我们之前的内存地址平移的方式,我们可以获取到cache的指针地址,并打印其中内容

    cache_t打印.jpg

    从打印结果,我们可以看出cache_t的主要结构为

    1. _buckets,

    2. _mask,

    3. _flags,

    4. _occupied

      cache_t结构图.jpg

    bucket_t的内容中,我们看到了selimp

    而我们知道SelImp和方法有关。

    所以我们猜测cache缓存了方法相关的数据

    于是,我们让运行[person eat1];

    随后,我们继续打印cache_t

    (lldb) p * $1
    (cache_t) $3 = {
      _buckets = {
        std::__1::atomic<bucket_t *> = 0x0000000101906470 {
          _sel = {
            std::__1::atomic<objc_selector *> = ""
          }
          _imp = {
            std::__1::atomic<unsigned long> = 8560
          }
        }
      }
      _mask = {
        std::__1::atomic<unsigned int> = 3
      }
      _flags = 32804
      _occupied = 1
    }
    

    此时_selNull变为了""

    _mask变为了3

    _occupied增加了1

    可见确实在执行方法的过程中,在cache中存储了数据

    现在,我们尝试打印其中可能储存的方法信息

    方法打印.jpg

    可见cache_t中确实储存了调用过的方法信息

    同时,我们使用machOView也可以验证我们存储的方法

    machOView打印.jpg

    cache_t代码分析

    我们在lldb的分析中得到了一些成果

    1. cache_t中确实储存了方法信息
    2. 方法信息以SelImp对的方式存在_buckets中。

    但也存在很多问题,

    1. 缓存的存储伴随增删改查,这些是如何实现的?
    2. _mask,_occupied,_flags这些参数有什么作用?

    现在,源码在手的优势就来了,让我们分析一下源码

    struct cache_t {
    #if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
        explicit_atomic<struct bucket_t *> _buckets;
        explicit_atomic<mask_t> _mask;
    #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
        explicit_atomic<uintptr_t> _maskAndBuckets;
        mask_t _mask_unused;
        
        // How much the mask is shifted by.
        static constexpr uintptr_t maskShift = 48;
        
        // Additional bits after the mask which must be zero. msgSend
        // takes advantage of these additional bits to construct the value
        // `mask << 4` from `_maskAndBuckets` in a single instruction.
        static constexpr uintptr_t maskZeroBits = 4;
        
        // The largest mask value we can store.
        static constexpr uintptr_t maxMask = ((uintptr_t)1 << (64 - maskShift)) - 1;
        
        // The mask applied to `_maskAndBuckets` to retrieve the buckets pointer.
        static constexpr uintptr_t bucketsMask = ((uintptr_t)1 << (maskShift - maskZeroBits)) - 1;
        
        // Ensure we have enough bits for the buckets pointer.
        static_assert(bucketsMask >= MACH_VM_MAX_ADDRESS, "Bucket field doesn't have enough bits for arbitrary pointers.");
    #elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
        // _maskAndBuckets stores the mask shift in the low 4 bits, and
        // the buckets pointer in the remainder of the value. The mask
        // shift is the value where (0xffff >> shift) produces the correct
        // mask. This is equal to 16 - log2(cache_size).
        explicit_atomic<uintptr_t> _maskAndBuckets;
        mask_t _mask_unused;
    
        static constexpr uintptr_t maskBits = 4;
        static constexpr uintptr_t maskMask = (1 << maskBits) - 1;
        static constexpr uintptr_t bucketsMask = ~maskMask;
    #else
    #error Unknown cache mask storage type.
    #endif
        
    #if __LP64__
        uint16_t _flags;
    #endif
        uint16_t _occupied;
    
    public:
        static bucket_t *emptyBuckets();
        
        struct bucket_t *buckets();
        mask_t mask();
        mask_t occupied();
        void incrementOccupied();
        void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask);
        void initializeToEmpty();
    
        unsigned capacity();
        bool isConstantEmptyCache();
        bool canBeFreed();
    
    #if __LP64__
        bool getBit(uint16_t flags) const {
            return _flags & flags;
        }
        void setBit(uint16_t set) {
            __c11_atomic_fetch_or((_Atomic(uint16_t) *)&_flags, set, __ATOMIC_RELAXED);
        }
        void clearBit(uint16_t clear) {
            __c11_atomic_fetch_and((_Atomic(uint16_t) *)&_flags, ~clear, __ATOMIC_RELAXED);
        }
    #endif
    
    #if FAST_CACHE_ALLOC_MASK
        bool hasFastInstanceSize(size_t extra) const
        {
            if (__builtin_constant_p(extra) && extra == 0) {
                return _flags & FAST_CACHE_ALLOC_MASK16;
            }
            return _flags & FAST_CACHE_ALLOC_MASK;
        }
    
        size_t fastInstanceSize(size_t extra) const
        {
            ASSERT(hasFastInstanceSize(extra));
    
            if (__builtin_constant_p(extra) && extra == 0) {
                return _flags & FAST_CACHE_ALLOC_MASK16;
            } else {
                size_t size = _flags & FAST_CACHE_ALLOC_MASK;
                // remove the FAST_CACHE_ALLOC_DELTA16 that was added
                // by setFastInstanceSize
                return align16(size + extra - FAST_CACHE_ALLOC_DELTA16);
            }
        }
    
        void setFastInstanceSize(size_t newSize)
        {
            // Set during realization or construction only. No locking needed.
            uint16_t newBits = _flags & ~FAST_CACHE_ALLOC_MASK;
            uint16_t sizeBits;
    
            // Adding FAST_CACHE_ALLOC_DELTA16 allows for FAST_CACHE_ALLOC_MASK16
            // to yield the proper 16byte aligned allocation size with a single mask
            sizeBits = word_align(newSize) + FAST_CACHE_ALLOC_DELTA16;
            sizeBits &= FAST_CACHE_ALLOC_MASK;
            if (newSize <= sizeBits) {
                newBits |= sizeBits;
            }
            _flags = newBits;
        }
    #else
        bool hasFastInstanceSize(size_t extra) const {
            return false;
        }
        size_t fastInstanceSize(size_t extra) const {
            abort();
        }
        void setFastInstanceSize(size_t extra) {
            // nothing
        }
    #endif
    
        static size_t bytesForCapacity(uint32_t cap);
        static struct bucket_t * endMarker(struct bucket_t *b, uint32_t cap);
    
        void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld);
        void insert(Class cls, SEL sel, IMP imp, id receiver);
    
        static void bad_cache(id receiver, SEL sel, Class isa) __attribute__((noreturn, cold));
    };
    
    

    其中相关宏定义


    20200918014301.jpg

    于是,我们可以首先得到不同框架环境下cacha_t的中的属性并不相同,最大区别为真机中_maskAndBuckets maskbuckets 存在同一个地方而非真机中是分开存储的

    真机和非真机结构简图.jpg

    同时,我们也看到了一些值得我们研究的方法

    void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld);
    void insert(Class cls, SEL sel, IMP imp, id receiver);
    
    

    由此,我们来分别研究一下。

    void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld);

    显然,这个是向系统申请开辟内存空间的过程

    void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld)
    {
        bucket_t *oldBuckets = buckets();
        bucket_t *newBuckets = allocateBuckets(newCapacity);
    
        // Cache's old contents are not propagated. 
        // This is thought to save cache memory at the cost of extra cache fills.
        // fixme re-measure this
    
        ASSERT(newCapacity > 0);
        ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);
    
        setBucketsAndMask(newBuckets, newCapacity - 1);
        
        if (freeOld) {
            cache_collect_free(oldBuckets, oldCapacity);
        }
    }
    
    

    简化成流程图为


    reallocate流程.png

    中间_occupied 会被赋值为0 ,这就是为什么扩容后,_occupied 的值不会等于调用的方法数。

    解释一下这里为什么不保留原先的数据
    举个例子,你买了一个小户型的房子,你住了一段时间家里人增加了,想换个大点的房子,这时候你并不是把墙敲了直接再盖两间就行了,因为你隔壁可能已经被分配给别人了,只能在别的空地上再给你建一栋足够大的房子,那这样,你之前的房子其实跟现在的房子并没有关系,如果数据全部迁移也会麻烦很多。因为这里的数据是缓存数据,并不是不能丢失的,所以直接丢弃,只开辟新空间。

    void insert(Class cls, SEL sel, IMP imp, id receiver);

    这个是向cache中存储的方法,也是我们最需要研究的方法

    void cache_t::insert(Class cls, SEL sel, IMP imp, id receiver)
    {
    #if CONFIG_USE_CACHE_LOCK
        cacheUpdateLock.assertLocked();
    #else
        runtimeLock.assertLocked();
    #endif
    
        ASSERT(sel != 0 && cls->isInitialized());
    
        // Use the cache as-is if it is less than 3/4 full
        mask_t newOccupied = occupied() + 1;
        unsigned oldCapacity = capacity(), capacity = oldCapacity;
        if (slowpath(isConstantEmptyCache())) {
            // Cache is read-only. Replace it.
            if (!capacity) capacity = INIT_CACHE_SIZE;
            reallocate(oldCapacity, capacity, /* freeOld */false);
        }
        else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) {
            // Cache is less than 3/4 full. Use it as-is.
        }
        else {
            capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;
            if (capacity > MAX_CACHE_SIZE) {
                capacity = MAX_CACHE_SIZE;
            }
            reallocate(oldCapacity, capacity, true);
        }
    
        bucket_t *b = buckets();
        mask_t m = capacity - 1;
        mask_t begin = cache_hash(sel, m);
        mask_t i = begin;
    
        // Scan for the first unused slot and insert there.
        // There is guaranteed to be an empty slot because the
        // minimum size is 4 and we resized at 3/4 full.
        do {
            if (fastpath(b[i].sel() == 0)) {
                incrementOccupied();
                b[i].set<Atomic, Encoded>(sel, imp, cls);
                return;
            }
            if (b[i].sel() == sel) {
                // The entry was added to the cache by some other thread
                // before we grabbed the cacheUpdateLock.
                return;
            }
        } while (fastpath((i = cache_next(i, m)) != begin));
    
        cache_t::bad_cache(receiver, (SEL)sel, cls);
    }
    

    我们再次简化成流程图


    cache_t insert分析.jpg
    1. 先判断是否有空间,如果没有直接默认申请4个空间

    2. 如果本身已有空间,判断newOccupied + CACHE_END_MARKER <= capacity / 4 * 3

    3. 如果满足,直接对bucket赋值

    4. 如果不满足,则2倍扩容。 然后清理空间

    5. 然后存储bucket。

    bucket存储 流程

    buckets赋值流程.png

    至此,我们大概分析了cache_t的结构和 数据存储的流程总图为


    cache_t流程-2.png

    以及总结我们之前的问题 _occupied 为当前缓存中的计数 _mask 为当前申请的空间数-1.

    相关文章

      网友评论

          本文标题:iOS底层探索 -- cache_t的结构 和 insert流程

          本文链接:https://www.haomeiwen.com/subject/yzhpyktx.html