Exploring the cache_t structure

Author: 灰溜溜的小王子 | Published: 2020-09-21 17:48

I. `cache_t` structure

1. The cache_t structure

struct objc_class : objc_object {
    // Class ISA;
    Class superclass;          // isa (8 bytes) + superclass (8 bytes): cache starts at offset 0x10
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    class_rw_t *data() const {
        return bits.data();
    }
    ....
};

cache is of type cache_t, so let's look at what cache_t itself contains (part of the code is omitted):

struct cache_t {
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
    explicit_atomic<struct bucket_t *> _buckets; // 8 bytes; points to the buckets holding sel/imp pairs
    explicit_atomic<mask_t> _mask;               // 4 bytes
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
    // ... (fields omitted)
    // How much the mask is shifted by.
    static constexpr uintptr_t maskShift = 48;
    // ... (other storage layouts omitted)
#endif

#if __LP64__
    uint16_t _flags;    // 2 bytes; flag bits
#endif
    uint16_t _occupied; // 2 bytes; number of occupied buckets

public:
    static bucket_t *emptyBuckets();
    struct bucket_t *buckets();
    mask_t mask();      // returns the mask (hash table length - 1)
    mask_t occupied();  // returns the number of cached methods inserted so far
    void incrementOccupied();
    void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask);
    void initializeToEmpty();

    unsigned capacity();
    bool isConstantEmptyCache();
    bool canBeFreed();
    static size_t bytesForCapacity(uint32_t cap);
    static struct bucket_t * endMarker(struct bucket_t *b, uint32_t cap);

    void reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld);
    void insert(Class cls, SEL sel, IMP imp, id receiver);

    static void bad_cache(id receiver, SEL sel, Class isa) __attribute__((noreturn, cold));
};

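The members that record information are:
1. _buckets, a pointer to bucket_t structs
2. _mask, of type mask_t
3. the uint16_t fields _flags (flag bits) and _occupied
4. the buckets() function, which returns a bucket_t pointer
5. the mask() function, which returns a mask_t
6. the occupied() function, which returns a mask_t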

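    explicit_atomic<struct bucket_t *> _buckets; // 8 bytes; sel/imp pairs
    explicit_atomic<mask_t> _mask;               // 4 bytes
#if __LP64__
    uint16_t _flags;    // 2 bytes; flag bits
#endif
    uint16_t _occupied; // 2 bytes; number of occupied buckets
    struct bucket_t *buckets();
    mask_t mask();      // returns the mask (hash table length - 1)
    mask_t occupied();  // returns the number of cached methods inserted so far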

Judging by the name alone, buckets is the most likely place for things to be stored.


struct bucket_t {
private:
    // IMP-first is better for arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
    explicit_atomic<uintptr_t> _imp;
    explicit_atomic<SEL> _sel;
#else
    explicit_atomic<SEL> _sel;
    explicit_atomic<uintptr_t> _imp;
#endif
....
};

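The selector _sel and the implementation address _imp show that a bucket stores a method. The purpose is to make lookup convenient and fast.

Let's print _buckets, _mask, and _occupied from cache_t in LLDB and see what they contain.

Here is the code: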

// PHPerson.h
#import <Foundation/Foundation.h>
NS_ASSUME_NONNULL_BEGIN
@interface PHPerson : NSObject
- (void)doFirst;
- (void)doSecond;
- (void)doThird;
@end

NS_ASSUME_NONNULL_END
// PHPerson.m
#import "PHPerson.h"
@implementation PHPerson
- (void)doFirst {}
- (void)doSecond {}
- (void)doThird {}
@end

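// main.m
#import <Foundation/Foundation.h>
#import <objc/runtime.h>   // object_getClass
#import "PHPerson.h"

int main(int argc, const char * argv[]) {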
    @autoreleasepool {
        // insert code here...
        PHPerson * person = [PHPerson alloc];
        [person doFirst];
        [person doSecond];
        [person doThird];
        Class personClass = object_getClass(person);
        NSLog(@"%@",personClass);
    }
    return 0;
}

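To explore cache_t we first need to find where cache sits in memory. In a class, the isa pointer takes 8 bytes and the superclass pointer takes 8 bytes, so the address of cache_t is the class's base address + 16 bytes.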
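The offset arithmetic can be checked with a small C++ sketch. The `mock_*` types and `cache_of` below are hypothetical stand-ins, not the runtime's own declarations; they only mirror the field sizes quoted above and assume a 64-bit target where `mask_t` is 4 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for cache_t (outlined-mask layout); only sizes matter.
struct mock_cache_t {
    void*    buckets;   // 8 bytes on a 64-bit target
    uint32_t mask;      // 4 bytes
    uint16_t flags;     // 2 bytes
    uint16_t occupied;  // 2 bytes
};

// Hypothetical stand-in for objc_class.
struct mock_objc_class {
    void*        isa;         // offset 0x00
    void*        superclass;  // offset 0x08
    mock_cache_t cache;       // offset 0x10
    uintptr_t    bits;        // offset 0x20
};

// Given a class's base address, step over isa and superclass to reach cache.
inline const mock_cache_t* cache_of(const void* cls) {
    return reinterpret_cast<const mock_cache_t*>(
        static_cast<const char*>(cls) + offsetof(mock_objc_class, cache));
}

static_assert(sizeof(void*) == 8, "this sketch assumes a 64-bit target");
static_assert(offsetof(mock_objc_class, cache) == 16, "cache sits at base + 16");
static_assert(sizeof(mock_cache_t) == 16, "8 + 4 + 2 + 2 bytes");
static_assert(offsetof(mock_objc_class, bits) == 32, "bits follows the 16-byte cache");
```

Under these assumptions the cache occupies bytes 0x10 to 0x1f of the class, which is why adding 16 (0x10) to the class address in the debugger lands on it.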

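2. LLDB debugging

At this point _mask is 3 and _occupied is 2. Let's go on and print _buckets.

From the source analysis we know that the sel-imp pairs live in the _buckets member of cache_t (we are currently in a macOS environment), and that cache_t provides the accessor buckets() for reading _buckets.
Once we have _buckets we can read the sel and the imp. bucket_t likewise provides accessors for both: sel() and imp(pClass).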

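II. How cache_t works

As with SDWebImage, when studying the method cache we want to know not only when entries are stored but also how they are stored, where they are stored, how much memory they take, and how they are read and written. So let's dissect it step by step.
1. How is it stored?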

struct cache_t {
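#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED // macOS, simulators (mainly an architecture distinction)
    // explicit_atomic wraps std::atomic and makes every atomic access explicit,
    // which keeps concurrent reads and writes of the cache safe.
    // Equivalent to: struct bucket_t * _buckets;
    // _buckets holds the sel/imp pairs.
    // It is read through the accessor of the same name, buckets().
    explicit_atomic<struct bucket_t *> _buckets; // the smallest buckets size is 4 (required by the growth algorithm)
    explicit_atomic<mask_t> _mask;  // hash table length - 1
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16 // 64-bit devices
    explicit_atomic<uintptr_t> _maskAndBuckets; // mask and buckets packed into one word as an optimization
    mask_t _mask_unused;
    // ... (other storage layouts omitted)
#endif
public: // methods callable from outside
    static bucket_t *emptyBuckets(); // returns the shared empty buckets
    
    struct bucket_t *buckets(); // simple accessor that returns _buckets
    mask_t mask();  // returns _mask (hash table length - 1)
    mask_t occupied(); // returns _occupied, the number of cache slots in use
    void incrementOccupied(); // one more cached entry: _occupied++
    void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask); // installs new buckets together with their mask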
    void initializeToEmpty();
    unsigned capacity();
    bool isConstantEmptyCache();
    bool canBeFreed();
   ......
}

incrementOccupied() caught my attention, so let's keep digging:

2. cache_fill execution flow

3. cache_t::insert execution flow

4. The first step calls reallocate(oldCapacity, capacity, /* freeOld */false). Its source is:

void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld)
{
    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    ASSERT(newCapacity > 0);
    ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);
    
    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
    }
}


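If there are old buckets (freeOld is true), the previous cache has to be cleaned up, which is done by calling cache_collect_free.

Why create new buckets to replace the old ones and throw the old ones away, instead of growing the existing buckets in place?

Less impact on the fast method lookup path: every objc_msgSend call triggers the fast lookup, and growing in place would require extra reads and writes on the table that lookup is using, which would affect it heavily.

Performance: allocating fresh buckets and discarding the old ones is cheaper than expanding the existing ones. Because the index is computed as sel & mask, a larger mask changes the slot of every cached entry, so keeping them would mean rehashing the whole table. As the source comment says, the old contents are not propagated, at the cost of some extra cache fills.

5. Next, let's analyze step three in the figure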
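To make the trade-off concrete, here is a minimal C++ sketch. `DemoBucket`, `DemoCache`, and `slot_for` are hypothetical names made up for illustration and are not the runtime's code; the sketch only shows that the slot of a selector depends on the mask, and that a reallocate in this style starts over with an empty table.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical miniature of a bucket; sel == 0 means "empty slot".
struct DemoBucket {
    uintptr_t sel = 0;
    uintptr_t imp = 0;
};

// The slot of a selector depends on the mask, so it moves when the table grows.
inline uint32_t slot_for(uintptr_t sel, uint32_t mask) {
    return static_cast<uint32_t>(sel) & mask;
}

struct DemoCache {
    std::vector<DemoBucket> buckets;
    uint32_t mask = 0;
    uint32_t occupied = 0;

    // Same idea as reallocate(): install a fresh, empty table and drop the
    // old one. The old entries are deliberately not copied over.
    void reallocate(uint32_t newCapacity) {
        assert(newCapacity > 0);
        buckets = std::vector<DemoBucket>(newCapacity);
        mask = newCapacity - 1;
        occupied = 0;
    }
};
```

For example, a selector whose address ends in 0x1c maps to slot 0 with mask 3 but to slot 4 with mask 7, so an entry left in its old position would no longer be found where the lookup expects it.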

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the
    // minimum size is 4 and we resized at 3/4 full.
    do {
        if (fastpath(b[i].sel() == 0)) {
            incrementOccupied();
            b[i].set<Atomic, Encoded>(sel, imp, cls);
            return;
        }
        if (b[i].sel() == sel) {
            // The entry was added to the cache by some other thread
            // before we grabbed the cacheUpdateLock.
            return;
        }
    } while (fastpath((i = cache_next(i, m)) != begin));

    cache_t::bad_cache(receiver, (SEL)sel, cls);

Here `mask_t begin = cache_hash(sel, m);` computes the index at which the search starts, using a hash:

static inline mask_t cache_hash(SEL sel, mask_t mask) 
{
    return (mask_t)(uintptr_t)sel & mask;
}

The key is the SEL.
The mapping is simply sel & mask = index.
mask = hash table length - 1
So index is always <= mask.

The source comments provide the following information:
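The hash-and-probe loop can be tried out in isolation with the C++ sketch below. `Slot`, `demo_hash`, `demo_next`, and `probe_insert` are hypothetical names for illustration; the step function `(i + 1) & mask` is an assumption about how the next index is chosen (the runtime's `cache_next` differs by architecture), and returning -1 stands in for the call to `bad_cache`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using mask_t = uint32_t;

// Hypothetical miniature of a bucket; sel == 0 marks an empty slot.
struct Slot {
    uintptr_t sel;
    uintptr_t imp;
};

// Same mapping as cache_hash in the text: index = sel & mask.
inline mask_t demo_hash(uintptr_t sel, mask_t mask) {
    return static_cast<mask_t>(sel) & mask;
}

// Assumed probing step: move to the next slot, wrapping around at the end.
inline mask_t demo_next(mask_t i, mask_t mask) {
    return (i + 1) & mask;
}

// Inserts sel/imp into a table whose size is a power of two.
// Returns the slot index that holds sel afterwards, or -1 if the table is full.
inline int probe_insert(std::vector<Slot>& table, uintptr_t sel, uintptr_t imp) {
    const mask_t m = static_cast<mask_t>(table.size()) - 1;
    const mask_t begin = demo_hash(sel, m);
    mask_t i = begin;
    do {
        if (table[i].sel == 0) {        // first unused slot: store here
            table[i] = Slot{sel, imp};
            return static_cast<int>(i);
        }
        if (table[i].sel == sel) {      // already cached: nothing to do
            return static_cast<int>(i);
        }
    } while ((i = demo_next(i, m)) != begin);
    return -1;                          // no free slot found
}
```

With a table of 4 slots (mask 3), selectors 0x10 and 0x14 both hash to index 0, so the second one is stored in slot 1. The index never exceeds the mask because of the bitwise AND.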

 * Cache writers (hold cacheUpdateLock while reading or writing; not PC-checked)
 * cache_fill         (acquires lock)
 * cache_expand       (only called from cache_fill)
 * cache_create       (only called from cache_expand)
 * bcopy               (only called from instrumented cache_expand)
 * flush_caches        (acquires lock)
 * cache_flush        (only called from cache_fill and flush_caches)
 * cache_collect_free (only called from cache_expand and cache_flush)
 *
 * UNPROTECTED cache readers (NOT thread-safe; used for debug info only)
 * cache_print
 * _class_printMethodCaches
 * _class_printDuplicateCacheEntries
 * _class_printMethodCacheStatistics

III. Summary

[Figure: image.png]
