Implementing weak Properties in the ObjC Runtime (Part 1)

Author: iOSugarCom | Published: 2017-05-30 08:24 | Views: 100

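Preface

How are weak properties implemented in Objective-C, and why do they automatically become nil after the object is released? This article explores that question.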

Environment

macOS Sierra 10.12.4
objc4-709

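Reference answer

The runtime lays out each registered class and records weak references in a hash table, using the memory address of the referenced object as the key. When that object's reference count drops to 0, it is deallocated. If the object's address is a, the runtime looks up a in the weak table, finds every weak reference registered under that key, and sets each of them to nil.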
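The idea in the reference answer can be sketched in a few lines of C++. This is only a toy model, not the runtime's code: ToyObject, ToyWeakTable, and all of its member functions are hypothetical names introduced for illustration, and a std::unordered_map stands in for the runtime's own hash table.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an object; the real runtime works with objc_object.
struct ToyObject { int payload; };

// Toy weak table. Key: address of the referenced object.
// Value: addresses of every weak variable that currently points at it.
class ToyWeakTable {
public:
    // Make *location refer to obj. *location must already hold nullptr or a
    // pointer previously stored through this table.
    void storeWeak(ToyObject **location, ToyObject *obj) {
        unregister(location);
        *location = obj;
        if (obj != nullptr) referrers_[obj].push_back(location);
    }

    // Call just before obj is destroyed: every registered weak variable
    // is set to nullptr and the entry is dropped.
    void clearOnDealloc(ToyObject *obj) {
        auto it = referrers_.find(obj);
        if (it == referrers_.end()) return;
        for (ToyObject **location : it->second) *location = nullptr;
        referrers_.erase(it);
    }

    // Number of weak variables currently registered for obj.
    std::size_t weakCount(ToyObject *obj) const {
        auto it = referrers_.find(obj);
        return it == referrers_.end() ? 0 : it->second.size();
    }

private:
    // Remove location from the entry of the object it currently points at.
    void unregister(ToyObject **location) {
        ToyObject *old = *location;
        if (old == nullptr) return;
        auto it = referrers_.find(old);
        if (it == referrers_.end()) return;
        std::vector<ToyObject **> &list = it->second;
        list.erase(std::remove(list.begin(), list.end(), location), list.end());
        if (list.empty()) referrers_.erase(it);
    }

    std::unordered_map<ToyObject *, std::vector<ToyObject **>> referrers_;
};
```

In this sketch a caller would store two weak variables for the same object, call clearOnDealloc right before deleting the object, and then observe that both variables read as nullptr.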

Test

Code

#import <Foundation/Foundation.h>

@interface WeakProperty : NSObject

@property (nonatomic,weak) NSObject *obj;


@end

@implementation WeakProperty

- (void)dealloc {
    NSLog(@"%s",__func__);
}

@end


int main(int argc, const char * argv[]) {
    @autoreleasepool {
        WeakProperty *property = [[WeakProperty alloc] init];
        NSObject *obj = [[NSObject alloc] init];
        property.obj = obj;     
        NSLog(@"%@",property.obj);   
        
        // The following triggers ``id objc_initWeak(id *location, id newObj)``
        // NSObject *obj = [[NSObject alloc] init];
        // __weak NSObject *obj2 = obj;
        // The following triggers ``void objc_copyWeak(id *dst, id *src)``
        // __weak NSObject *obj3 = obj2;
    }
    return 0;
}

Results

When the setter of the object's weak property is called

Calls id objc_storeWeak(id *location, id newObj)
Calls static id storeWeak(id *location, objc_object *newObj)
...

When property.obj is printed with NSLog

Calls id objc_loadWeakRetained(id *location)

When the object is released in dealloc

Calls void objc_destroyWeak(id *location)

Summary

storeWeak assigns a value to a weak property (including destroying it)
objc_loadWeakRetained reads a weak property

Observation & Analysis

For storeWeak, two call paths are analyzed

Assignment, i.e. id objc_storeWeak(id *location, id newObj)
Destruction, i.e. void objc_destroyWeak(id *location)

For reading a weak property, the analysis focuses on

the function id objc_loadWeakRetained(id *location)

Observation: `id objc_storeWeak(id *location, id newObj)`

/** 
 * This function stores a new value into a __weak variable. It would
 * be used anywhere a __weak variable is the target of an assignment.
 * 
 * @param location The address of the weak pointer itself
 * @param newObj The new object this weak ptr should now point to
 * 
 * @return \e newObj
 */
id
objc_storeWeak(id *location, id newObj)
{
    return storeWeak<DoHaveOld, DoHaveNew, DoCrashIfDeallocating>
        (location, (objc_object *)newObj);
}

This function simply calls storeWeak.

Observation: `void objc_destroyWeak(id *location)`

/** 
 * Destroys the relationship between a weak pointer
 * and the object it is referencing in the internal weak
 * table. If the weak pointer is not referencing anything, 
 * there is no need to edit the weak table. 
 *
 * This function IS NOT thread-safe with respect to concurrent 
 * modifications to the weak variable. (Concurrent weak clear is safe.)
 * 
 * @param location The weak pointer address. 
 */
void
objc_destroyWeak(id *location)
{
    (void)storeWeak<DoHaveOld, DontHaveNew, DontCrashIfDeallocating>
        (location, nil);
}

This function also simply calls storeWeak.

Source of the `storeWeak` function

template <HaveOld haveOld, HaveNew haveNew,
          CrashIfDeallocating crashIfDeallocating>
static id 
storeWeak(id *location, objc_object *newObj)
{
    assert(haveOld  ||  haveNew);
    if (!haveNew) assert(newObj == nil);

    Class previouslyInitializedClass = nil;
    id oldObj;
    SideTable *oldTable;
    SideTable *newTable;

    // Acquire locks for old and new values.
    // Order by lock address to prevent lock ordering problems. 
    // Retry if the old value changes underneath us.
 retry:
    if (haveOld) {
        oldObj = *location;
        oldTable = &SideTables()[oldObj];
    } else {
        oldTable = nil;
    }
    if (haveNew) {
        newTable = &SideTables()[newObj];
    } else {
        newTable = nil;
    }

    SideTable::lockTwo<haveOld, haveNew>(oldTable, newTable);

    if (haveOld  &&  *location != oldObj) {
        SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
        goto retry;
    }

    // Prevent a deadlock between the weak reference machinery
    // and the +initialize machinery by ensuring that no 
    // weakly-referenced object has an un-+initialized isa.
    if (haveNew  &&  newObj) {
        Class cls = newObj->getIsa();
        if (cls != previouslyInitializedClass  &&  
            !((objc_class *)cls)->isInitialized()) 
        {
            SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);
            _class_initialize(_class_getNonMetaClass(cls, (id)newObj));

            // If this class is finished with +initialize then we're good.
            // If this class is still running +initialize on this thread 
            // (i.e. +initialize called storeWeak on an instance of itself)
            // then we may proceed but it will appear initializing and 
            // not yet initialized to the check above.
            // Instead set previouslyInitializedClass to recognize it on retry.
            previouslyInitializedClass = cls;

            goto retry;
        }
    }

    // Clean up old value, if any.
    if (haveOld) {
        weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);
    }

    // Assign new value, if any.
    if (haveNew) {
        newObj = (objc_object *)
            weak_register_no_lock(&newTable->weak_table, (id)newObj, location, 
                                  crashIfDeallocating);
        // weak_register_no_lock returns nil if weak store should be rejected

        // Set is-weakly-referenced bit in refcount table.
        if (newObj  &&  !newObj->isTaggedPointer()) {
            newObj->setWeaklyReferenced_nolock();
        }

        // Do not set *location anywhere else. That would introduce a race.
        *location = (id)newObj;
    }
    else {
        // No new value. The storage is not changed.
    }
    
    SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);

    return (id)newObj;
}

It is easiest to analyze this function while stepping through it in lldb.

Analysis: `id objc_storeWeak(id *location, id newObj)`

// Template parameters.
enum HaveOld { DontHaveOld = false, DoHaveOld = true };
enum HaveNew { DontHaveNew = false, DoHaveNew = true };

The template arguments passed here are DoHaveOld (true) and DoHaveNew (true).
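To make the role of these template arguments concrete, here is a minimal sketch that reuses the two enums and a hypothetical helper, describeStore, which is not part of the runtime. It only reports which branches a given instantiation would take; since the flags are compile-time constants, a compiler is free to drop the branch that can never run.

```cpp
#include <cassert>
#include <string>

// Same shape as the runtime's enums: named constants that convert to bool.
enum HaveOld { DontHaveOld = false, DoHaveOld = true };
enum HaveNew { DontHaveNew = false, DoHaveNew = true };

// Hypothetical helper: lists the steps an instantiation would perform.
template <HaveOld haveOld, HaveNew haveNew>
std::string describeStore() {
    // Mirrors the runtime's assert(haveOld || haveNew), but at compile time.
    static_assert(haveOld || haveNew, "at least one side must exist");
    std::string steps;
    if (haveOld) steps += "unregister-old ";
    if (haveNew) steps += "register-new ";
    return steps;
}
```

With this helper, describeStore<DoHaveOld, DoHaveNew>() corresponds to the combination used by objc_storeWeak, and describeStore<DoHaveOld, DontHaveNew>() to the one used by objc_destroyWeak.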

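On x86-64 (System V calling convention), the first six integer or pointer arguments are passed left to right in the registers rdi, rsi, rdx, rcx, r8, r9. Here location and newObj arrive in rdi and rsi respectively.

From the comments and by comparing addresses, location is the address of the weak pointer itself and newObj is the address the weak pointer should point to. In this scenario, newObj is the obj variable assigned to the obj property of WeakProperty.

In this scenario, after storeWeak runs, the memory at address 0x0000000101301638 holds the value 0x0000000101301490.

Background: `SideTable`

The SideTable struct is treated as a black box in this article.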

struct SideTable {
    spinlock_t slock;
    RefcountMap refcnts;
    weak_table_t weak_table;

    SideTable() {
        memset(&weak_table, 0, sizeof(weak_table));
    }

    ~SideTable() {
        _objc_fatal("Do not delete SideTable.");
    }

    void lock() { slock.lock(); }
    void unlock() { slock.unlock(); }
    void forceReset() { slock.forceReset(); }

    // Address-ordered lock discipline for a pair of side tables.

    template<HaveOld, HaveNew>
    static void lockTwo(SideTable *lock1, SideTable *lock2);
    template<HaveOld, HaveNew>
    static void unlockTwo(SideTable *lock1, SideTable *lock2);
};

Regarding spinlock_t, the Wikipedia entry for Spinlock explains:

In software engineering, a spinlock is a lock which causes a thread trying to acquire it to simply wait in a loop ("spin") while repeatedly checking if the lock is available. Since the thread remains active but is not performing a useful task, the use of such a lock is a kind of busy waiting. Once acquired, spinlocks will usually be held until they are explicitly released, although in some implementations they may be automatically released if the thread being waited on (that which holds the lock) blocks, or "goes to sleep".

Example

; Intel syntax

locked:                      ; The lock variable. 1 = locked, 0 = unlocked.
     dd      0               ; Define the lock variable, initially 0.

spin_lock:
     mov     eax, 1          ; Set the EAX register to 1.

     xchg    eax, [locked]   ; Atomically swap the EAX register with
                             ;  the lock variable.
                             ; This will always store 1 to the lock, leaving
                             ;  the previous value in the EAX register.

     test    eax, eax        ; Test EAX with itself. Among other things, this will
                             ;  set the processor's Zero Flag if EAX is 0.
                             ; If EAX is 0, then the lock was unlocked and
                             ; we just locked it.
                             ; Otherwise, EAX is 1 and we didn't acquire the lock.
     jnz     spin_lock       ; Jump back to the MOV instruction if the Zero Flag is
                             ;  not set; the lock was previously locked, and so
                             ; we need to spin until it becomes unlocked.
     ret                     ; The lock has been acquired, return to the calling
                             ;  function.

; Once the holder finishes and calls spin_unlock, locked becomes 0. The next thread
; that runs spin_lock then reads 0 into EAX, acquires the lock, and sets locked
; back to 1.

spin_unlock:
     mov     eax, 0          ; Set the EAX register to 0.

     xchg    eax, [locked]   ; Atomically swap the EAX register with
                             ;  the lock variable.

     ret                     ; The lock has been released.

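In short, a spinlock waits in a loop until the lock becomes available. Note that the lockTwo implementation shown later takes mutex_tt pointers, so in this version of the runtime spinlock_t appears to be backed by a mutex type rather than by a literal busy-wait loop.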
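The same busy-wait idea can be written in portable C++ with std::atomic_flag, whose test_and_set plays the role of the xchg instruction in the assembly listing. This is a minimal sketch with a hypothetical class name, ToySpinLock; it is not how the runtime implements spinlock_t.

```cpp
#include <atomic>
#include <cassert>

class ToySpinLock {
public:
    // Atomically mark the lock as taken and report whether it was free before.
    bool tryLock() {
        return !flag_.test_and_set(std::memory_order_acquire);
    }

    // Keep retrying until the previous state was "unlocked".
    void lock() {
        while (!tryLock()) {
            // Busy-wait: another thread holds the lock.
        }
    }

    // Mark the lock as free again.
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};
```

A thread that calls lock() while another thread holds the lock stays in the while loop, burning CPU time, until the holder calls unlock().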

The comment on the weak_table_t struct states that it stores object ids as keys and weak_entry_t structs as their values.

/**
 * The global weak references table. Stores object ids as keys,
 * and weak_entry_t structs as their values.
 */
struct weak_table_t {
    weak_entry_t *weak_entries;
    size_t    num_entries;
    uintptr_t mask;
    uintptr_t max_hash_displacement;
};

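SideTable can be viewed as a lock-protected collection whose elements are stored as key-value pairs.

The ObjC entry function _objc_init calls arr_init to initialize the static variable SideTableBuf.

Main analysis: `id objc_storeWeak(id *location, id newObj)`

Entering the if (haveOld) branch

A new element is being created, so the original value stored at location is nil.

Entering the SideTables() function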

static StripedMap<SideTable>& SideTables() {
    return *reinterpret_cast<StripedMap<SideTable>*>(SideTableBuf);
}

A discussion of reinterpret_cast:

reinterpret_cast is the most dangerous cast, and should be used very sparingly. It turns one type directly into another - such as casting the value from one pointer to another, or storing a pointer in an int, or all sorts of other nasty things. Largely, the only guarantee you get with reinterpret_cast is that normally if you cast the result back to the original type, you will get the exact same value (but not if the intermediate type is smaller than the original type). There are a number of conversions that reinterpret_cast cannot do, too. It's used primarily for particularly weird conversions and bit manipulations, like turning a raw data stream into actual data, or storing data in the low bits of an aligned pointer.

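It is a form of forced type conversion.

SideTableBuf is a buffer of size 4096 that backs the array of SideTable structs. Assigning oldTable amounts to indexing into that array with the object's address; nil can be treated as 0, which selects the first element.

Likewise, haveNew is true, so newTable is the element of SideTableBuf looked up with newObj as the index.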
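Indexing a fixed array with an object address can be sketched as follows. ToyStripedMap is a hypothetical name, and both the stripe count of 64 and the shift-and-xor hash are assumptions made for this sketch rather than values taken from the listings above. The point is only that a pointer is reduced to a small index, and that a null pointer lands on index 0.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy striped map: a fixed array of T, where the slot is chosen by hashing
// a pointer. Stripe count and hash are assumptions for illustration.
template <typename T, std::size_t kStripeCount = 64>
class ToyStripedMap {
public:
    // Reduce an address to an index in [0, kStripeCount).
    static std::size_t indexForPointer(const void *p) {
        auto addr = reinterpret_cast<std::uintptr_t>(p);
        return ((addr >> 4) ^ (addr >> 9)) % kStripeCount;
    }

    // Return the slot that guards the given address.
    T &operator[](const void *p) { return slots_[indexForPointer(p)]; }

private:
    std::array<T, kStripeCount> slots_{};
};
```

Because many addresses map to the same slot, one slot (one SideTable in the article's terms) is shared by many objects, which is why each slot needs its own lock.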

Calling SideTable::lockTwo

SideTable::lockTwo<haveOld, haveNew>(oldTable, newTable);

Entering SideTable::lockTwo

template<>
void SideTable::lockTwo<DoHaveOld, DoHaveNew>
    (SideTable *lock1, SideTable *lock2)
{
    spinlock_t::lockTwo(&lock1->slock, &lock2->slock);
}

Entering lockTwo

// Address-ordered lock discipline for a pair of locks.

static void lockTwo(mutex_tt *lock1, mutex_tt *lock2) {
   if (lock1 < lock2) {
       lock1->lock();
       lock2->lock();
   } else {
       lock2->lock();
       if (lock2 != lock1) lock1->lock(); 
   }
}
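The address-ordered discipline above avoids deadlock because every caller acquires any given pair of locks in the same order, no matter how the arguments are passed. The sketch below restates it as a template and adds a hypothetical TracingLock that records acquisition order, so the behavior can be observed without threads. lockTwoOrdered, unlockTwoOrdered, and TracingLock are illustrative names, not runtime APIs.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical lock that logs each acquisition (single-threaded demo only).
struct TracingLock {
    explicit TracingLock(std::vector<const TracingLock *> *l) : log(l) {}
    void lock() { assert(!held); held = true; log->push_back(this); }
    void unlock() { assert(held); held = false; }
    std::vector<const TracingLock *> *log;
    bool held = false;
};

// Acquire two locks in address order; if both pointers name the same lock,
// acquire it only once.
template <typename Lock>
void lockTwoOrdered(Lock *a, Lock *b) {
    std::less<Lock *> before;  // well-defined ordering for arbitrary pointers
    if (before(a, b)) {
        a->lock();
        b->lock();
    } else {
        b->lock();
        if (b != a) a->lock();
    }
}

// Release both locks, again handling the "same lock" case.
template <typename Lock>
void unlockTwoOrdered(Lock *a, Lock *b) {
    a->unlock();
    if (b != a) b->unlock();
}
```

Calling lockTwoOrdered(&x, &y) and lockTwoOrdered(&y, &x) produces the same acquisition order, so two threads passing the locks in opposite order cannot each end up holding one lock while waiting for the other.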

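Checking the condition if (haveOld && *location != oldObj)

oldObj was read from *location before the locks were acquired. Normally the two are still equal. If they differ, the weak variable was modified in the meantime, so the function unlocks and jumps back to retry; the check guards against that race.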
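The reason a re-check is needed is that the lock is chosen from the value that was read, so the value can change between the read and the lock. The sketch below models that pattern in a single thread. DemoLock, kLockCount, lockForCurrentValue, and the interfere callback (which stands in for another thread writing to the slot) are all hypothetical names for illustration.

```cpp
#include <cassert>
#include <functional>

// Hypothetical lock that only tracks whether it is held (single-threaded demo).
struct DemoLock {
    bool held = false;
    void lock() { assert(!held); held = true; }
    void unlock() { assert(held); held = false; }
};

constexpr int kLockCount = 4;

// Locks the DemoLock that guards the current value of *slot and returns that
// value (assumed non-negative). Since the lock is chosen from the value, the
// value is re-checked once the lock is held; if it changed, unlock and retry.
inline int lockForCurrentValue(int *slot, DemoLock (&locks)[kLockCount],
                               const std::function<void()> &interfere,
                               int *retries) {
    for (;;) {
        int snapshot = *slot;                    // read before locking
        DemoLock &lock = locks[snapshot % kLockCount];
        interfere();                             // simulated concurrent write
        lock.lock();
        if (*slot == snapshot) return snapshot;  // still valid: keep the lock
        lock.unlock();                           // stale snapshot: try again
        ++*retries;
    }
}
```

If the callback changes the slot once, the first pass locks the wrong lock, notices the mismatch, releases it, and the second pass locks the lock that matches the new value.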

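Checking the condition if (haveNew && newObj)

According to the comment, this is also a safeguard: it makes sure the class of newObj has finished +initialize, which prevents a deadlock between the weak reference machinery and the +initialize machinery.

Clearing the old value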

if (haveOld) {
   weak_unregister_no_lock(&oldTable->weak_table, oldObj, location);
}

Assigning the new value

// Assign new value, if any.
if (haveNew) {
   newObj = (objc_object *)
       weak_register_no_lock(&newTable->weak_table, (id)newObj, location, 
                             crashIfDeallocating);
   // weak_register_no_lock returns nil if weak store should be rejected

   // Set is-weakly-referenced bit in refcount table.
   if (newObj  &&  !newObj->isTaggedPointer()) {
       newObj->setWeaklyReferenced_nolock();
   }

   // Do not set *location anywhere else. That would introduce a race.
   *location = (id)newObj;
}
else {
   // No new value. The storage is not changed.
}

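newObj is registered in the corresponding weak_table_t: the object is the key, and location (the address of the weak pointer) is recorded in the entry stored as its value.

Calling SideTable::unlockTwo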

SideTable::unlockTwo<haveOld, haveNew>(oldTable, newTable);

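Analysis: `void objc_destroyWeak(id *location)`

Because the template argument is DontHaveNew, once the old value has been unregistered the function does not enter the if (haveNew) branch to store a new value.

Analysis: `id objc_loadWeakRetained(id *location)`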

retry:
    // fixme std::atomic this load
    obj = *location;
    ...
    result = obj;
    ... 
    return result;

Dereferencing location with the * operator yields the address the weak reference points to.
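As an analogy only, the C++ standard library offers a comparable "load a weak reference and get a strong one back" operation in std::weak_ptr::lock(). The sketch below wraps it in a function that borrows the name loadWeakRetained for readability; it uses std::shared_ptr reference counting and says nothing about how the ObjC runtime implements the elided parts of objc_loadWeakRetained.

```cpp
#include <cassert>
#include <memory>

// Analogy: promote a weak reference to a strong one, or get null if the
// object is already gone. The name mirrors the runtime function only for
// readability; the mechanism here is std::weak_ptr, not the ObjC weak table.
inline std::shared_ptr<int> loadWeakRetained(const std::weak_ptr<int> &location) {
    return location.lock();
}
```

While the object is alive the returned pointer keeps it alive for the duration of the use, which is the same reason a weak load in ObjC hands back a retained object; after the last strong reference is gone the call returns a null pointer, much like a weak property reading as nil.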

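Conclusion

This article takes a rough look at the ObjC runtime to understand how weak properties are stored, read, and released. The runtime keeps a static key-value table that records weak references: the key is the address of the referenced object, and the value records the weak pointers that refer to it. When the object is destroyed, the runtime looks it up by key, sets those weak pointers to nil, and removes the entry from the table.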

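Reader comments

72c06b01e75f: Hello, I'd like to ask about debugging.
1. When I debug, the objc functions do not show up in the call stack, for example as described in the article:
"When the setter of the object's weak property is called,
. Calls id objc_storeWeak(id *location, id newObj)
. Calls static id storeWeak(id *location, objc_object *newObj)"
How did you debug this?
2. How do you hit breakpoints inside the objc source?
iOSugarCom: Hello. The objc runtime library is open source, so the code can be found at https://opensource.apple.com/ . I used the Mac version; a buildable version can be found on GitHub.

The function id objc_storeWeak(id *location, id newObj)

Official comment
This function stores a new value into a __weak variable. It would
be used anywhere a __weak variable is the target of an assignment.

You can of course also guess its purpose from the function name...

You can also verify this with an iOS build, for example

...
@property (nonatomic,copy) NSString *test;
...

- (void)viewDidLoad {
    [super viewDidLoad];
    self.test = @"123";
}

When debugging with lldb in Xcode

// Set a function breakpoint
(lldb) br set --name objc_storeWeak

// Output
Breakpoint 5: where = libobjc.A.dylib`objc_storeWeak, address = 0x0000000183454260

Then continue execution and you will find it stops in this function, although everything shown is assembly.

As for debugging, hold down ctrl and the debug buttons in the console area below change slightly, letting you step at a finer granularity by instruction and by thread.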