美文网首页
iOS底层原理:cache_t分析

iOS底层原理:cache_t分析

作者: 打碟的DJ | 来源:发表于2020-09-17 22:52 被阅读0次

再开始之前,先四个二带王炸,来波整体分析图:


Cache_t原理分析图.png

从上一篇博客中,我们可以知道cache_t cache是存储在objc_class的结构体中;

struct objc_class : objc_object {
    // Class ISA;  // 指针占 8 字节
    Class superclass; // 指针占 8 字节
    cache_t cache;   // 占 16 字节,具体看cache_t的结构体成员变量          // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags

    // 省略代码
    ...
};

chach_t 源码

struct cache_t {
#if CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_OUTLINED
    // 模拟器走这里
    explicit_atomic<struct bucket_t *> _buckets;
    explicit_atomic<mask_t> _mask;
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_HIGH_16
    // 真机走这里
    explicit_atomic<uintptr_t> _maskAndBuckets;
    mask_t _mask_unused;
    // 省略代码
    ...
#elif CACHE_MASK_STORAGE == CACHE_MASK_STORAGE_LOW_4
    // 其它非64位会走这里
    // 省略代码
    ...
    explicit_atomic<uintptr_t> _maskAndBuckets;
    mask_t _mask_unused;
#else
#error Unknown cache mask storage type.
#endif
    
#if __LP64__
    uint16_t _flags;
#endif
    uint16_t _occupied;

    // 省略代码
    ...
};

从源码中就可以看出模拟器和真机是会有差别的,所以我们最后一定要跑真机哟~~~

explicit_atomic

其实不用过多的分析,是c++类的一个泛型;看名字也可以清晰的知道是显式原子性。因为 cache是用于缓存,需要进行增删改查,从而为了安全性而引入。

bucket_t

struct bucket_t {
private:
    // IMP-first is better for arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
    explicit_atomic<uintptr_t> _imp;
    explicit_atomic<SEL> _sel;
#else
    explicit_atomic<SEL> _sel;
    explicit_atomic<uintptr_t> _imp;
#endif

    // 省略代码
    ...
};

bucket_t中可以看出,其中存储的主要是_sel_imp

从中我们可以看出cache_t的整体结构如下所示:


cache_t结构

lldb调试分析cache结构

lldb - cache_t

通过lldb调试,通过偏移isa(8字节)和superclass(8字节)共16字节来获取到cache_t的结构体首地址。

cache_t结构体的源码中提供了获取_buckets的函数:

struct bucket_t *buckets();

在获取到bucket_t的结构体后,可以通过bucket_t结构体中提供的方法sel()imp(Class cls) ,分别获取到_sel_imp:

inline SEL sel() const { return _sel.load(memory_order::memory_order_relaxed); }

    inline IMP imp(Class cls) const {
        uintptr_t imp = _imp.load(memory_order::memory_order_relaxed);
        if (!imp) return nil;
#if CACHE_IMP_ENCODING == CACHE_IMP_ENCODING_PTRAUTH
        SEL sel = _sel.load(memory_order::memory_order_relaxed);
        return (IMP)
            ptrauth_auth_and_resign((const void *)imp,
                                    ptrauth_key_process_dependent_code,
                                    modifierForSEL(sel, cls),
                                    ptrauth_key_function_pointer, 0);
#elif CACHE_IMP_ENCODING == CACHE_IMP_ENCODING_ISA_XOR
        return (IMP)(imp ^ (uintptr_t)cls);
#elif CACHE_IMP_ENCODING == CACHE_IMP_ENCODING_NONE
        return (IMP)imp;
#else
#error Unknown method cache IMP encoding.
#endif
    }

当我们执行[person say2]的时候,如何获取到对应的say2的信息呢?

第二个方法

通过buckets我们就可以看出他是多个bucket_t的集合,所以通过p *($9 + 1)来获取,当然也可以通过p $8.buckets()[0]p $8.buckets()[1]这种方式来获取。

那每次都得通过lldb调试来查找相关的信息是不是有点麻烦,那我们有更加直观的方式来获取cache_t的方式吗?

既然问出来,那肯定是有的。我们可以直接把源码的代码copy出来,进行模拟就可以打印相关的信息了。

模拟源码环境

源码如下:

typedef uint32_t mask_t;  // x86_64 & arm64 asm are less efficient with 16-bits

struct isc_bucket_t {
    SEL _sel;
    IMP _imp;
};

struct isc_cache_t {
    struct isc_bucket_t * _buckets;
    mask_t _mask;
    uint16_t _flags;
    uint16_t _occupied;
};

struct isc_class_data_bits_t {
    uintptr_t bits;
};

struct isc_objc_class {
    Class ISA;
    Class superclass;
    struct isc_cache_t cache;             // formerly cache pointer and vtable
    struct isc_class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
};

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        
        Person *person = [Person alloc];
        Class pClass = [Person class];
       
        [person say1];
        [person say2];
        
        struct isc_objc_class *isc_pClass = (__bridge struct isc_objc_class *)(pClass);
        NSLog(@"%hu - %u",isc_pClass->cache._occupied,isc_pClass->cache._mask);
        for (mask_t i = 0; i<isc_pClass->cache._mask; i++) {
            // 打印获取的 bucket
            struct isc_bucket_t bucket = isc_pClass->cache._buckets[i];
            NSLog(@"%@ - %p",NSStringFromSelector(bucket._sel),bucket._imp);
        }
        
        
        NSLog(@"Hello, World!");
    }
    return 0;
}
执行不同个数方法的打印结果

通过上图我们可以看到

_occupied_mask是什么呢?为什么会有变化?而bucket为什么会存在丢失的情况呢?

分析cachet_t结构

通过上述的问题我们可以看出 _occupied_mask 是在变化的,那我们可以先从这两个成员变量分析。

_occupied

通过查看源码,我们可以看到有个方法是处理_occupied的:

// cache_t 结构体中的方法
void incrementOccupied();

void cache_t::incrementOccupied() 
{
    _occupied++;
}

通过全局搜索一下 incrementOccupied(),我们可以发现在cache_t::insert方法中调用了该方法

cache_t::insert 源码分析

ALWAYS_INLINE
void cache_t::insert(Class cls, SEL sel, IMP imp, id receiver)
{
#if CONFIG_USE_CACHE_LOCK
    cacheUpdateLock.assertLocked();
#else
    runtimeLock.assertLocked();
#endif

    ASSERT(sel != 0 && cls->isInitialized());

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = occupied() + 1; // 获取新的`occupied`
    unsigned oldCapacity = capacity(), capacity = oldCapacity;
    if (slowpath(isConstantEmptyCache())) {
        // Cache is read-only. Replace it.
        // INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2),  1 左移三位即 100,即十进制的 4
        if (!capacity) capacity = INIT_CACHE_SIZE; 
        reallocate(oldCapacity, capacity, /* freeOld */false);
    }
    else if (fastpath(newOccupied + CACHE_END_MARKER <= capacity / 4 * 3)) {
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {
        capacity = capacity ? capacity * 2 : INIT_CACHE_SIZE;
        if (capacity > MAX_CACHE_SIZE) {
            capacity = MAX_CACHE_SIZE;
        }
        reallocate(oldCapacity, capacity, true);
    }

    bucket_t *b = buckets();
    mask_t m = capacity - 1;
    mask_t begin = cache_hash(sel, m);
    mask_t i = begin;

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the
    // minimum size is 4 and we resized at 3/4 full.

    // hash 算法存储bucket
    do {
        if (fastpath(b[i].sel() == 0)) {
            incrementOccupied();
            b[i].set<Atomic, Encoded>(sel, imp, cls);
            return;
        }
        if (b[i].sel() == sel) {
            // The entry was added to the cache by some other thread
            // before we grabbed the cacheUpdateLock.
            return;
        }
    } while (fastpath((i = cache_next(i, m)) != begin));

    cache_t::bad_cache(receiver, (SEL)sel, cls);
}

1、reallocate 开辟空间,初始化数据

reallocate(oldCapacity, capacity, /* freeOld */false);

ALWAYS_INLINE
void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity, bool freeOld)
{
    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity); // 主要是开辟空间,并强转为 bucket_t 类型

    // Cache's old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    ASSERT(newCapacity > 0);
    ASSERT((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1); // 设置 bucket、mask、 _occupied = 0,也就是初始化数据
    
    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
    }
}

1.1、cache_collect_free 释放cache

static void cache_collect_free(bucket_t *data, mask_t capacity)
{
#if CONFIG_USE_CACHE_LOCK
    cacheUpdateLock.assertLocked();
#else
    runtimeLock.assertLocked();
#endif

    if (PrintCaches) recordDeadCache(capacity);

    _garbage_make_room (); // 垃圾回收
    garbage_byte_size += cache_t::bytesForCapacity(capacity);
    garbage_refs[garbage_count++] = data; // 暂存于后置位
    cache_collect(false);
}

1.2、_garbage_make_room 垃圾回收,扩容

static void _garbage_make_room(void)
{
    static int first = 1;

    // Create the collection table the first time it is needed
    // 当是第一个时储存起来
    if (first)
    {
        first = 0;
        garbage_refs = (bucket_t**)
            malloc(INIT_GARBAGE_COUNT * sizeof(void *));
        garbage_max = INIT_GARBAGE_COUNT;
    }

    // Double the table if it is full
    // 如果内存段满了,扩容两倍
    else if (garbage_count == garbage_max)
    {
        garbage_refs = (bucket_t**)
            realloc(garbage_refs, garbage_max * 2 * sizeof(void *));
        garbage_max *= 2;
    }
}

hash 算法存储 bucket

// cache_t::insert 中的 hash 算法存储 bucket

bucket_t *b = buckets();
mask_t m = capacity - 1;
mask_t begin = cache_hash(sel, m);
mask_t i = begin;

do {
    // 当缓存中(buckets)不存在方法时,进行设置
    if (fastpath(b[i].sel() == 0)) {
        incrementOccupied(); // occupied 加 1
        b[i].set<Atomic, Encoded>(sel, imp, cls); // 缓存方法
        return;
    }
    // 存在方法则不做处理
    if (b[i].sel() == sel) {
        // The entry was added to the cache by some other thread
        // before we grabbed the cacheUpdateLock.
        return;
    }
} while (fastpath((i = cache_next(i, m)) != begin));

// cache_next 方法
#define CACHE_END_MARKER 1
static inline mask_t cache_next(mask_t i, mask_t mask) {
    return (i+1) & mask;
}
hash运算
  • 当我们进行hash运算存储的时候,如果集合中没有内容,则进行存储
  • 当集合中存在了内容,即hash冲突,所以需要我们再次进行hash运算,直到没有冲突的情况下,进行存储或者等于begin的时候,放弃存储

总结

  • 1、因为会进行扩容,所以mask会是4 * n - 1occupied则会进行加1
  • 2、bucket 因为在进行扩容的时候,会重新申请内存,会导致有所丢失
  • 3、因为hash算法存储导致顺序紊乱

相关文章

网友评论

      本文标题:iOS底层原理:cache_t分析

      本文链接:https://www.haomeiwen.com/subject/oudryktx.html