Part 6: A Deep Dive into Cache

Author: 坚持才会看到希望 | Published 2022-04-29 22:42

    In the objc source, the class structure looks like the snippet below. In this installment we'll dig into the `cache_t cache` member.

    struct objc_class : objc_object {
      objc_class(const objc_class&) = delete;
      objc_class(objc_class&&) = delete;
      void operator=(const objc_class&) = delete;
      void operator=(objc_class&&) = delete;
        // Class ISA;
        Class superclass;
        cache_t cache;             // formerly cache pointer and vtable
        class_data_bits_t bits;  
    

    Clicking into cache_t takes us to the cache struct shown below. _maybeMask represents the length (capacity) of the buckets array.

    struct cache_t {
    private:
        explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
        union {
            struct {
                explicit_atomic<mask_t>    _maybeMask;
    #if __LP64__
                uint16_t                   _flags;
    #endif
                uint16_t                   _occupied;
            };
            explicit_atomic<preopt_cache_t *> _originalPreoptCache;
        };
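
    As a mental model, the fields can be read roughly as follows. This is a simplified sketch rather than the real objc4 accessors: on x86_64 the mask simply lives in _maybeMask, while on arm64 (LP64) the mask is packed into the high 16 bits of _bucketsAndMaybeMask and the buckets pointer into the low 48 bits.

    // Simplified sketch, assuming the arm64 (LP64) packing described above.
    // The real cache_t accessors differ in detail between objc4 releases.
    #include <cstdint>

    struct bucket_t;  // opaque here; holds one cached (sel, imp) pair

    static constexpr uintptr_t kBucketsMask = ((uintptr_t)1 << 48) - 1;  // low 48 bits

    static inline bucket_t *sketch_buckets(uintptr_t bucketsAndMaybeMask) {
        return (bucket_t *)(bucketsAndMaybeMask & kBucketsMask);
    }

    static inline uint32_t sketch_mask(uintptr_t bucketsAndMaybeMask) {
        return (uint32_t)(bucketsAndMaybeMask >> 48);  // mask == capacity - 1
    }

    static inline uint32_t sketch_capacity(uintptr_t bucketsAndMaybeMask) {
        return sketch_mask(bucketsAndMaybeMask) + 1;   // e.g. mask 3 -> 4 buckets
    }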
    

    From the source above we still can't tell what the cache does. Searching further, we find the method void insert(SEL sel, IMP imp, id receiver). Stepping into insert, we can see it is really operating on buckets:

      bucket_t *b = buckets();
        mask_t m = capacity - 1;
        mask_t begin = cache_hash(sel, m);
        mask_t i = begin;
    
        // Scan for the first unused slot and insert there.
        // There is guaranteed to be an empty slot.
        do {
            if (fastpath(b[i].sel() == 0)) {
                incrementOccupied();
                b[i].set<Atomic, Encoded>(b, sel, imp, cls());
                return;
            }
            if (b[i].sel() == sel) {
                // The entry was added to the cache by some other thread
                // before we grabbed the cacheUpdateLock.
                return;
            }
        } while (fastpath((i = cache_next(i, m)) != begin));
    
        bad_cache(receiver, (SEL)sel);
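
    Two helpers drive the loop above: cache_hash decides which bucket a selector lands in, and cache_next handles collisions. Reconstructed from the objc4 source (exact preprocessor guards vary by version), they look roughly like this:

    // The hash is just the selector pointer masked down to an index;
    // newer runtimes also mix in (value >> 7) when preopt caches are enabled.
    static inline mask_t cache_hash(SEL sel, mask_t mask)
    {
        uintptr_t value = (uintptr_t)sel;
    #if CONFIG_USE_PREOPT_CACHES
        value ^= value >> 7;
    #endif
        return (mask_t)(value & mask);
    }

    #if CACHE_END_MARKER
    // x86_64 and friends: scan forward and wrap around with the mask.
    static inline mask_t cache_next(mask_t i, mask_t mask) {
        return (i+1) & mask;
    }
    #elif __arm64__
    // arm64: scan backward, wrapping from 0 back up to the mask.
    static inline mask_t cache_next(mask_t i, mask_t mask) {
        return i ? i-1 : mask;
    }
    #endif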
    

    Below it we find a templated method, bucket_t::set, which is what writes a sel/imp pair into a bucket (the matching read side is the sel() and imp() accessors we will look at later):

    template<Atomicity atomicity, IMPEncoding impEncoding>
    void bucket_t::set(bucket_t *base, SEL newSel, IMP newImp, Class cls)
    {
        ASSERT(_sel.load(memory_order_relaxed) == 0 ||
               _sel.load(memory_order_relaxed) == newSel);
    
        // objc_msgSend uses sel and imp with no locks.
        // It is safe for objc_msgSend to see new imp but NULL sel
        // (It will get a cache miss but not dispatch to the wrong place.)
        // It is unsafe for objc_msgSend to see old imp and new sel.
        // Therefore we write new imp, wait a lot, then write new sel.
        
        uintptr_t newIMP = (impEncoding == Encoded
                            ? encodeImp(base, newImp, newSel, cls)
                            : (uintptr_t)newImp);
    
        if (atomicity == Atomic) {
            _imp.store(newIMP, memory_order_relaxed);
            
            if (_sel.load(memory_order_relaxed) != newSel) {
    #ifdef __arm__
                mega_barrier();
                _sel.store(newSel, memory_order_relaxed);
    #elif __x86_64__ || __i386__
                _sel.store(newSel, memory_order_release);
    #else
    #error Don't know how to do bucket_t::set on this architecture.
    #endif
            }
        } else {
            _imp.store(newIMP, memory_order_relaxed);
            _sel.store(newSel, memory_order_relaxed);
        }
    }
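
    Note that encodeImp above does not store the raw function pointer. Depending on the platform the IMP is either signed with pointer authentication (arm64e) or XORed with the class, which is why decoding it later needs the class again. A minimal sketch of the XOR variant (not the exact objc4 code):

    // Sketch of the ISA-XOR encoding used by bucket_t on some 64-bit targets.
    static inline uintptr_t sketch_encodeImp(IMP newImp, Class cls) {
        if (!newImp) return 0;
        return (uintptr_t)newImp ^ (uintptr_t)cls;     // obfuscate the stored IMP
    }

    static inline IMP sketch_decodeImp(uintptr_t stored, Class cls) {
        if (!stored) return nil;
        return (IMP)(stored ^ (uintptr_t)cls);         // XOR again to recover it
    }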
    

    From the source above we can tell that the cache caches methods, i.e. sel and imp pairs. How do we prove that guess? Let's explore further: set a breakpoint below HPWPerson *p = [HPWPerson alloc] and poke around with lldb. Since the class structure begins with the 8-byte isa followed by the 8-byte superclass pointer, cache_t sits at offset 0x10 from the class address, which is why we cast 0x100008100 below. Printing all of cache_t's members, _bucketsAndMaybeMask, _maybeMask, _flags, _occupied and _originalPreoptCache, we still find no trace of sel or imp.

    int main(int argc, const char * argv[]) {
        @autoreleasepool {
            NSLog(@"Hello World!");
            
            HPWPerson *p = [HPWPerson alloc];
            [p method1];
           
        }
        return 0;
    }
    
    (lldb) x/6gx p.class
    0x1000080f0: 0x00000001000080c8 0x0000000100821140
    0x100008100: 0x0001000100d25f90 0x0002801000000000
    0x100008110: 0x0000000100ab3df4 0x0000000000000000
    (lldb) p/x (cache_t *)0x100008100
    (cache_t *) $1 = 0x0000000100008100
    (lldb) p *$1
    (cache_t) $2 = {
      _bucketsAndMaybeMask = {
        std::__1::atomic<unsigned long> = {
          Value = 281479285464976
        }
      }
       = {
         = {
          _maybeMask = {
            std::__1::atomic<unsigned int> = {
              Value = 0
            }
          }
          _flags = 32784
          _occupied = 2
        }
        _originalPreoptCache = {
          std::__1::atomic<preopt_cache_t *> = {
            Value = 0x0002801000000000
          }
        }
      }
    }
    (lldb) p $2._flags
    (uint16_t) $3 = 32784
    (lldb) p $2._occupied
    (uint16_t) $4 = 2
    (lldb) p $2._maybeMask
    (explicit_atomic<unsigned int>) $5 = {
      std::__1::atomic<unsigned int> = {
        Value = 0
      }
    }
    (lldb) p $2._bucketsAndMaybeMask
    (explicit_atomic<unsigned long>) $6 = {
      std::__1::atomic<unsigned long> = {
        Value = 281479285464976
      }
    }
    (lldb) p $2._originalPreoptCache
    (explicit_atomic<preopt_cache_t *>) $7 = {
      std::__1::atomic<preopt_cache_t *> = {
        Value = 0x0002801000000000
      }
    }
    

    So where are sel and imp actually stored? Let's dig into the buckets we mentioned earlier, again using lldb. Sure enough, the printout below shows that _sel and _imp live inside bucket_t.

    (lldb) p $2.buckets()
    (bucket_t *) $8 = 0x0000000100d25f90
    (lldb) p *$8
    (bucket_t) $9 = {
      _imp = {
        std::__1::atomic<unsigned long> = {
          Value = 8438676
        }
      }
      _sel = {
        std::__1::atomic<objc_selector *> = "" {
          Value = ""
        }
      }
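
    buckets() returns a pointer to an array, so if more than one method were cached you could keep walking it with pointer arithmetic (which slots are occupied depends on the hash, so some prints will show empty buckets), for example:

    (lldb) p *($8 + 1)
    (lldb) p *($8 + 2)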
    

    Let's go back to the source, step into sel and imp, and then call and print them through lldb. They do indeed hold the cached method.

    The imp source:

     inline IMP rawImp(MAYBE_UNUSED_ISA objc_class *cls) const {
            uintptr_t imp = _imp.load(memory_order_relaxed);
            if (!imp) return nil;
    
    (lldb) p *$9.imp(nil,p.class)
    (void (*)()) $10 = 0x0000000100804364 (libobjc.A.dylib`-[NSObject class] at NSObject.mm:2254)
    

    The sel source:

    public:
        static inline size_t offsetOfSel() { return offsetof(bucket_t, _sel); }
        inline SEL sel() const { return _sel.load(memory_order_relaxed); }
    
    (lldb) p *$9.sel()
    (SEL) $12 = "class"
    

    Expanding the Cache

    cache_t is 16 bytes in size. It contains a union, and as we covered in an earlier installment, a union's size is determined by its largest member (8 bytes here, the _originalPreoptCache pointer). _bucketsAndMaybeMask is another 8 bytes, so cache_t comes to 8 + 8 = 16 bytes.

    Now think about it: cache_t is 16 bytes, but what if a class has lots of methods, does the cache stay at 16 bytes? It does, because _bucketsAndMaybeMask only stores the starting address of the buckets array; the cached methods live in that separately allocated array and are all reachable through this one pointer.

    struct cache_t {                                         // 8 + 8 = 16 bytes
    private:
        explicit_atomic<uintptr_t> _bucketsAndMaybeMask;     // 8 bytes
        union {
            struct {
                explicit_atomic<mask_t>    _maybeMask;       // 4 bytes
    #if __LP64__
                uint16_t                   _flags;           // 2 bytes
    #endif
                uint16_t                   _occupied;        // 2 bytes
            };
            explicit_atomic<preopt_cache_t *> _originalPreoptCache;  // 8 bytes
        };
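
    If you build the objc4 project yourself, a quick sanity check of this arithmetic can be dropped next to the struct (hypothetical, not part of the original source):

    // Hypothetical checks, compiled somewhere that can see cache_t's definition:
    static_assert(sizeof(explicit_atomic<uintptr_t>) == 8,
                  "_bucketsAndMaybeMask is pointer-sized on 64-bit");
    static_assert(sizeof(cache_t) == 16,
                  "8-byte pointer + 8-byte union");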
    

    Analyzing the buckets source, the initial capacity is 2 under the arm64 architecture and 4 under x86_64.
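
    The constants behind those numbers look roughly like this (reconstructed from objc4; the exact preprocessor guards differ between releases):

    enum {
    #if CACHE_END_MARKER || (__arm64__ && !__LP64__)
        // x86_64 and similar: start with 4 buckets (one slot is the end marker).
        INIT_CACHE_SIZE_LOG2 = 2,
    #else
        // arm64 (LP64): start with 2 buckets, no end marker needed.
        INIT_CACHE_SIZE_LOG2 = 1,
    #endif
        INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2),
        MAX_CACHE_SIZE_LOG2  = 16,
        MAX_CACHE_SIZE       = (1 << MAX_CACHE_SIZE_LOG2),
        FULL_UTILIZATION_CACHE_SIZE_LOG2 = 3,
        FULL_UTILIZATION_CACHE_SIZE      = (1 << FULL_UTILIZATION_CACHE_SIZE_LOG2),  // = 8
    };

    Below are reallocate and the two cache_fill_ratio variants from the source.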

    ALWAYS_INLINE
    void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld)
    {
        bucket_t *oldBuckets = buckets();
        bucket_t *newBuckets = allocateBuckets(newCapacity);
    
        // Cache's old contents are not propagated. 
        // This is thought to save cache memory at the cost of extra cache fills.
        // fixme re-measure this
    
        ASSERT(newCapacity > 0);
        ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);
    
        setBucketsAndMask(newBuckets, newCapacity - 1);
        
        if (freeOld) {
            collect_free(oldBuckets, oldCapacity);
        }
    }
    
    #if __arm__  ||  __x86_64__  ||  __i386__
    
    // objc_msgSend has few registers available.
    // Cache scan increments and wraps at special end-marking bucket.
    #define CACHE_END_MARKER 1
    
    // Historical fill ratio of 75% (since the new objc runtime was introduced).
    static inline mask_t cache_fill_ratio(mask_t capacity) {
        return capacity * 3 / 4;
    }
    
    #elif __arm64__ && __LP64__
    
    // objc_msgSend has lots of registers available.
    // Cache scan decrements. No end marker needed.
    #define CACHE_END_MARKER 0
    
    // Allow 87.5% fill ratio in the fast path for all cache sizes.
    // Increasing the cache fill ratio reduces the fragmentation and wasted space
    // in imp-caches at the cost of potentially increasing the average lookup of
    // a selector in imp-caches by increasing collision chains. Another potential
    // change is that cache table resizes / resets happen at different moments.
    static inline mask_t cache_fill_ratio(mask_t capacity) {
        return capacity * 7 / 8;
    }
    
    
       else if (capacity <= FULL_UTILIZATION_CACHE_SIZE && newOccupied + CACHE_END_MARKER <= capacity) {
            // Allow 100% cache utilization for small buckets. Use it as-is.
        }
    

    From the above we can see:
    1) Under arm64, if the number of cached entries stays at or below 7/8 of the buckets capacity (3/4 under x86_64), nothing happens and the cache is used as-is. The full decision branch is sketched below.
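
    Putting the pieces together, the decision branch at the top of insert() (reconstructed from objc4-838; minor details vary between releases) reads roughly as follows. This is where the doubling and the freeing of the old buckets happen:

    mask_t newOccupied = occupied() + 1;
    unsigned oldCapacity = capacity(), capacity = oldCapacity;
    if (slowpath(isConstantEmptyCache())) {
        // First insert ever: allocate the initial buckets (2 on arm64, 4 on x86_64).
        capacity = INIT_CACHE_SIZE;
        reallocate(oldCapacity, capacity, /* freeOld */ false);
    }
    else if (fastpath(newOccupied + CACHE_END_MARKER <= cache_fill_ratio(capacity))) {
        // Cache is less than 3/4 (x86_64) or 7/8 (arm64) full. Use it as-is.
    }
    #if CACHE_ALLOW_FULL_UTILIZATION
    else if (capacity <= FULL_UTILIZATION_CACHE_SIZE &&
             newOccupied + CACHE_END_MARKER <= capacity) {
        // Allow 100% cache utilization for small buckets. Use it as-is.
    }
    #endif
    else {
        // Double the capacity (capped at MAX_CACHE_SIZE) and drop the old contents.
        capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;
        if (capacity > MAX_CACHE_SIZE) {
            capacity = MAX_CACHE_SIZE;
        }
        reallocate(oldCapacity, capacity, true);
    }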

    A summary of cache expansion:

    1. Under x86_64, when the occupancy reaches 3/4 of the buckets capacity, the capacity is doubled.

    2. Under arm64, when the occupancy grows past 7/8 of the buckets capacity, the capacity is doubled; when the capacity is 8 or less, 100% utilization is allowed, so no expansion happens until the buckets are actually full.

    3. A small detail: CACHE_END_MARKER is 1 under x86_64 and 0 under arm64. So under x86_64, when the incoming occupancy is 3 and the capacity is 4, expansion is triggered, because the end marker takes an extra slot (3 + 1 > 3, the 3/4 fill ratio of 4). Under arm64, when the incoming occupancy is 14 and the capacity is 16, nothing happens, because CACHE_END_MARKER is 0 (14 + 0 <= 14, the 7/8 fill ratio of 16); only at an occupancy of 15 does expansion kick in.

    4. Under x86_64 the initial buckets capacity is 4; under arm64 it is 2.

    With this understanding of expansion we can read the code below. When we print the cache after running it, method1 is nowhere to be found. Under arm64 the initial buckets capacity is 2, and with method1 there are three entries in play, so the cache expanded; expansion frees the old buckets, which is why method1 no longer shows up. The two methods that do show up, respondsToSelector: and class, are system methods, and since method1 is the one that got discarded, it must have run before them. (A hypothetical experiment for watching this happen follows the code.)

    int main(int argc, const char * argv[]) {
        @autoreleasepool {
            NSLog(@"Hello World!");
            
            HPWPerson *p = [HPWPerson alloc];
            [p method1];
           
        }
        return 0;
    }
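
    To watch the expansion happen yourself, you could call a few more methods and repeat the lldb steps from earlier. This is a hypothetical experiment: method2 and method3 are assumed to be declared on HPWPerson; they are not part of the original snippet.

    int main(int argc, const char * argv[]) {
        @autoreleasepool {
            HPWPerson *p = [HPWPerson alloc];
            [p method1];
            [p method2];   // hypothetical extra methods that push _occupied
            [p method3];   // past the fill ratio and trigger a reallocate
            // Set a breakpoint here, then repeat: x/6gx p.class,
            // p (cache_t *)(<class address> + 0x10), and p $n.buckets()
            // to see which selectors survived the expansion.
        }
        return 0;
    }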
    
