Class Objects
objc_class
objc_class is defined in objc-runtime-new.h and inherits from objc_object, so from here on it is simply called the class object.
struct objc_class : objc_object {
// Class ISA;
Class superclass;
cache_t cache; // formerly cache pointer and vtable
class_data_bits_t bits; // class_rw_t * plus custom rr/alloc flags
Class getSuperclass() const {...}
void setSuperclass(Class newSuperclass) {...}
class_rw_t *data() const {
return bits.data();
}
void setData(class_rw_t *newData) {
bits.setData(newData);
}
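Start with these members. The first member is isa, inherited from objc_object; a class object's isa points to its metaclass.
The second member is a Class pointer to the superclass.
The third member, cache, is the method cache.
The fourth member, bits, holds the actual contents of the class.
Next come getSuperclass() and setSuperclass(), which read and write the superclass.
On arm64e, when ISA_SIGNING_SIGN_MODE is not NONE, the pointer is signed on set and the signature is checked or removed on get;
otherwise superclass is assigned and returned directly.
The last two functions get and set a class_rw_t *, and both simply forward to bits.
So class_rw_t and class_data_bits_t are closely related; the next sections look at each struct in turn.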
class_data_bits_t
#if __LP64__
#define FAST_DATA_MASK 0x00007ffffffffff8UL
#else
#define FAST_DATA_MASK 0xfffffffcUL
#endif
struct class_data_bits_t {
friend objc_class;
uintptr_t bits;
public:
class_rw_t* data() const {
return (class_rw_t *)(bits & FAST_DATA_MASK);
}
void setData(class_rw_t *newData)
{
ASSERT(!data() || (newData->flags & (RW_REALIZING | RW_FUTURE)));
// Set during realization or construction only. No locking needed.
// Use a store-release fence because there may be concurrent
// readers of data and data's contents.
uintptr_t newBits = (bits & ~FAST_DATA_MASK) | (uintptr_t)newData;
atomic_thread_fence(memory_order_release);
bits = newBits;
}
// Get the class's ro data, even in the presence of concurrent realization.
// fixme this isn't really safe without a compiler barrier at least
// and probably a memory barrier when realizeClass changes the data field
const class_ro_t *safe_ro() const {
class_rw_t *maybe_rw = data();
if (maybe_rw->flags & RW_REALIZED) {
// maybe_rw is rw
return maybe_rw->ro();
} else {
// maybe_rw is actually ro
return (class_ro_t *)maybe_rw;
}
}
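The struct has a single member, bits, of type uintptr_t: an unsigned integer the same size as a pointer in the current environment (8 bytes on 64-bit), so it can hold a pointer value.
data() and setData() read and write the class_rw_t pointer. The comment says setData() is only called during realization or construction, so no lock is needed, but a store-release fence is used because other threads may be reading data concurrently.
On set, bits is ANDed with the complement of FAST_DATA_MASK and then ORed with newData, which gives the new bits.
Take 32-bit as an example and look only at the lowest hex digit: if bits is xxxx xxxP, ~FAST_DATA_MASK is 0000 0003, and newData is yyyy yyyQ, then the lowest digit of the result is (P & 3) | Q and the upper digits come from newData unchanged. Likewise, get keeps the upper digits of bits and reduces the lowest digit to P & 0xc.
0x00007ffffffffff8 in binary is 17 zero bits, then 44 one bits, then 3 zero bits (0000 0000 0000 0000 0111 1111 ... 1111 1000). As with isa, the class_rw_t pointer is stored in the 4th through 47th bits of bits (bits 3 to 46 counting from zero).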
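Next is safe_ro(), which fetches the class_ro_t. Its comment says it has to work even while the class is being realized concurrently, and the fixme admits this is not strictly safe without a barrier.
The function has two cases. If RW_REALIZED is set, data() really is a class_rw_t, and the class_ro_t is obtained from it via maybe_rw->ro().
Otherwise the class has not been realized yet and data() actually points at a class_ro_t, so the pointer is simply cast to class_ro_t * and returned.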
class_rw_t
Next, the structure of class_rw_t.
struct class_rw_t {
// Be warned that Symbolication knows the layout of this structure.
uint32_t flags;
uint16_t witness;
explicit_atomic<uintptr_t> ro_or_rw_ext;
Class firstSubclass;
Class nextSiblingClass;
private:
using ro_or_rw_ext_t = objc::PointerUnion<const class_ro_t, class_rw_ext_t, PTRAUTH_STR("class_ro_t"), PTRAUTH_STR("class_rw_ext_t")>;
const ro_or_rw_ext_t get_ro_or_rwe()
void set_ro_or_rwe(const class_ro_t *ro)
void set_ro_or_rwe(class_rw_ext_t *rwe, const class_ro_t *ro)
class_rw_ext_t *extAlloc(const class_ro_t *ro, bool deep = false);
public:
void setFlags(uint32_t set)
void clearFlags(uint32_t clear)
void changeFlags(uint32_t set, uint32_t clear)
class_rw_ext_t *ext()
class_rw_ext_t *extAllocIfNeeded()
class_rw_ext_t *deepCopy(const class_ro_t *ro)
const class_ro_t *ro() const
void set_ro(const class_ro_t *ro)
const method_array_t methods()
const property_array_t properties()
const protocol_array_t protocols()
};
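First, explicit_atomic: it inherits from C++ std::atomic and wraps a value so that multiple threads can access it without a data race; it also requires every load and store to name a memory order explicitly.
using here works like typedef: it declares ro_or_rw_ext_t as an alias, so ro_or_rw_ext_t is now a type name.
PointerUnion is a class with a purpose similar to a C union: it can hold a pointer of one of several types, but only one at a time.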
template <class T1, class T2, typename Auth1, typename Auth2>
class PointerUnion {
uintptr_t _value;
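It takes four template parameters: two types and two pointer-authentication parameters (Auth1 and Auth2, supplied here through PTRAUTH_STR). The types passed here are class_ro_t and class_rw_ext_t;
in other words, an ro_or_rw_ext_t holds either a class_rw_ext_t pointer or a class_ro_t pointer.
PointerUnion provides is() to test which one is held. The call names the type, for example v.is<class_rw_ext_t *>(), which returns true if the union currently holds a class_rw_ext_t.
PointerUnion also provides get() to fetch the data. It is called the same way as is() and returns a pointer of the requested type.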
const ro_or_rw_ext_t get_ro_or_rwe() const {
return ro_or_rw_ext_t{ro_or_rw_ext};
}
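This method initializes an ro_or_rw_ext_t from ro_or_rw_ext. PointerUnion has only one field, _value, and here it is assigned from ro_or_rw_ext.
From then on, creating, reading, and writing the data all go through PointerUnion, that is, through get_ro_or_rwe().
Sometimes an anonymous ro_or_rw_ext_t{ro_or_rw_ext} appears instead. The purpose is the same: a PointerUnion initialized from the same ro_or_rw_ext represents the same underlying value.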
const method_array_t methods() const {
auto v = get_ro_or_rwe();
if (v.is<class_rw_ext_t *>()) {
return v.get<class_rw_ext_t *>(&ro_or_rw_ext)->methods;
} else {
return method_array_t{v.get<const class_ro_t *>(&ro_or_rw_ext)->baseMethods()};
}
}
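The member functions that follow all start by calling get_ro_or_rwe().
Take methods() above, which returns the method list: it first gets the ro_or_rw_ext_t; if that holds a class_rw_ext_t it returns that struct's methods, and if it holds a class_ro_t it returns the result of baseMethods().
Finally, the size of class_rw_t: the members add up to 4 + 2 + 8 + 8 + 8 = 30 bytes, but alignment pads the struct to 32 bytes (two padding bytes follow witness so that ro_or_rw_ext starts on an 8-byte boundary).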
class_rw_ext_t
PointerUnion holds either a class_rw_ext_t pointer or a class_ro_t pointer, so look at those two structs next.
struct class_rw_ext_t {
DECLARE_AUTHED_PTR_TEMPLATE(class_ro_t)
class_ro_t_authed_ptr<const class_ro_t> ro;
method_array_t methods;
property_array_t properties;
protocol_array_t protocols;
char *demangledName;
uint32_t version;
};
DECLARE_AUTHED_PTR_TEMPLATE declares an alias template for a wrapped pointer type; the name it produces here is class_ro_t_authed_ptr.
The macro is defined like this:
#if __has_feature(ptrauth_calls)
// ...
#define DECLARE_AUTHED_PTR_TEMPLATE(name) \
template <typename T> using name ## _authed_ptr \
= WrappedPtr<T, PTRAUTH_STR(name)>;
#else
#define PTRAUTH_STR(name) PtrauthRaw
#define DECLARE_AUTHED_PTR_TEMPLATE(name) \
template <typename T> using name ## _authed_ptr = RawPtr<T>;
#endif
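Its purpose is pointer signing, for security. There are two definitions: one that signs pointers on arm64e and one that does not sign.
Besides signing there is wrapping. The wrapper is the struct below; it has a pointer ptr, which is what ultimately points to the class_ro_t.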
template<typename T, typename Auth>
struct WrappedPtr {
private:
T *ptr;
Besides ro, class_rw_ext_t also holds the method list, property list, and protocol list.
class_ro_t
Its contents are roughly as follows:
struct class_ro_t {
uint32_t flags;
uint32_t instanceStart;
uint32_t instanceSize;
#ifdef __LP64__
uint32_t reserved;
#endif
union {
const uint8_t * ivarLayout;
Class nonMetaclass;
};
explicit_atomic<const char *> name;
void *baseMethodList;
protocol_list_t * baseProtocols;
const ivar_list_t * ivars;
const uint8_t * weakIvarLayout;
property_list_t *baseProperties;
method_list_t *baseMethods()
Class getNonMetaclass()
const uint8_t *getIvarLayout()
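Both ro and rw expose methods, protocols, and properties, but the types differ: class_ro_t holds single lists (baseMethodList, baseProtocols, baseProperties), while the rw side uses the array types method_array_t, property_array_t, and protocol_array_t.
class_ro_t is set up together with the class and offers no functions for modifying it, which is why it is called ro, read only. Before the class is realized there is no rw at all: bits.data() points straight at the class_ro_t, as the second branch of safe_ro() shows.
class_rw_t, by contrast, is readable and writable. Once the class is realized, ro_or_rw_ext in class_rw_t points to the class_ro_t; if a class_rw_ext_t is allocated later, ro_or_rw_ext points to it instead, and its class_ro_t_authed_ptr<const class_ro_t> ro member points to the class_ro_t.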
cache_t
struct cache_t {
private:
explicit_atomic<uintptr_t> _bucketsAndMaybeMask;
union {
struct {
explicit_atomic<mask_t> _maybeMask;
#if __LP64__
uint16_t _flags;
#endif
uint16_t _occupied;
};
explicit_atomic<preopt_cache_t *> _originalPreoptCache;
};
//...
explicit_atomic wraps C++ std::atomic; here the wrapped type is uintptr_t, which is pointer sized, so _bucketsAndMaybeMask is 8 bytes.
Next comes a union.
#if __LP64__
typedef uint32_t mask_t; // x86_64 & arm64 asm are less efficient with 16-bits
#else
typedef uint16_t mask_t;
#endif
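mask_t is an alias that is 4 bytes on 64-bit. The struct is 4 + 2 + 2 bytes and the preopt_cache_t pointer is also 8 bytes, so the union is 8 bytes.
cache_t is therefore 8 + 8 = 16 bytes. Starting from the address of the class, the layout is isa_t (8 bytes) + Class (8 bytes) + cache_t (16 bytes) + class_data_bits_t (8 bytes).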
ivar_t
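ivars in class_ro_t has type ivar_list_t, which inherits from entsize_list_tt.
Many of the lists in ro and rw are either entsize_list_tt or list_array_tt, which is built on top of entsize_list_tt; for details see the next article: https://www.jianshu.com/p/52080de84f38
The details are not needed for this article. It is enough to know that entsize_list_tt behaves like an array and has a get() method for fetching an element, for example get(0).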
struct ivar_t {
int32_t *offset;
const char *name;
const char *type;
// alignment is sometimes -1; use alignment() instead
uint32_t alignment_raw;
uint32_t size;
uint32_t alignment() const {
if (alignment_raw == ~(uint32_t)0) return 1U << WORD_SHIFT;
return 1 << alignment_raw;
}
};
8 + 8 + 8 + 4 + 4 = 32 bytes. How it is used is covered later.
property_t
struct property_t {
const char *name;
const char *attributes;
};
property_t contains only two string pointers, because it only describes the property.
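Memory Layout of Instance Variables
ivar_list_t exists only in class_ro_t; rw has nothing related to ivars.
When a class_ro_t is set up, however, the ivars are not allocated one by one; they are read from the Mach-O file. The only function in objc4 for adding an ivar dynamically is class_addIvar, and it is not called when classes are loaded statically.
Static loading is a complicated process, so for ivars it is easier to start by observing them through the runtime.
Before that, here is what an ivar_t looks like in memory.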
#import <Foundation/Foundation.h>

@interface MyClass : NSObject
{
    NSInteger _num;
}
@end
@implementation MyClass
- (instancetype)init {
    if (self = [super init]) {
        _num = 5;
    }
    return self;
}
@end
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        MyClass *my = [[MyClass alloc] init];
        NSLog(@"Hello, World!"); // set the breakpoint on this line
        return 0;
    }
}
Declare the class and set a breakpoint at NSLog.
(lldb) p my.class
(Class) $0 = 0x0000000100008118
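(lldb) p (class_data_bits_t *)((char *)$0 + 0x20)
(class_data_bits_t *) $1 = 0x0000000100008138
Starting from the address of the class object, skipping 8 (isa) + 8 (superclass) + 16 (cache) bytes reaches bits; that is 0x20 in hex and gives a class_data_bits_t *. The offset has to be added as a byte offset before the cast, because adding 0x20 to a class_data_bits_t * advances by 0x20 * sizeof(class_data_bits_t) = 0x100 bytes.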
(lldb) p (objc_class *)$0
(objc_class *) $2 = 0x0000000100008118
(lldb) p $2->data()
(class_rw_t *) $3 = 0x0000000109412280
(lldb) p $2->safe_ro()
(const class_ro_t *) $4 = 0x0000000100008098
Cast the Class to objc_class *, then fetch the class_rw_t and the class_ro_t.
(lldb) p $3->ro_or_rw_ext
(explicit_atomic<unsigned long>) $5= {
std::__1::atomic<unsigned long> = {
Value = 4295000216
}
}
(lldb) p/x 4295000216
(long) $6 = 0x0000000100008098
(lldb) p $3->ro()
(const class_ro_t *) $7 = 0x0000000100008098
Print ro_or_rw_ext from rw; at this point it holds the address of ro. Calling ro() gives the same result.
(lldb) p *$3
(class_rw_t) $8 = {
flags = 2148007936
witness = 1
ro_or_rw_ext = {
std::__1::atomic<unsigned long> = {
Value = 4295000216
}
}
firstSubclass = nil
nextSiblingClass = 0x00007ff85e83b9c8
}
(lldb) p sizeof($3->flags)
(unsigned long) $9 = 4
(lldb) p sizeof($3->witness)
(unsigned long) $10 = 2
(lldb) p sizeof($3->ro_or_rw_ext)
(unsigned long) $11 = 8
(lldb) p sizeof($3->firstSubclass)
(unsigned long) $12 = 8
(lldb) p sizeof($3->nextSiblingClass)
(unsigned long) $13 = 8
(lldb) p sizeof(*$3)
(unsigned long) $14 = 32
Print the whole rw; the sizeof results also show the effect of alignment.
(lldb) p *$4
(const class_ro_t) $15 = {
flags = 128
instanceStart = 8
instanceSize = 16
reserved = 0
= {
ivarLayout = 0x0000000000000000
nonMetaclass = nil
}
name = {
std::__1::atomic<const char *> = "MyClass" {
Value = 0x0000000100003fa8 "MyClass"
}
}
baseMethods = {
ptr = nil
}
baseProtocols = nil
ivars = 0x0000000100008070
weakIvarLayout = 0x0000000000000000
baseProperties = nil
_swiftMetadataInitializer_NEVER_USE = {}
}
Then look at the contents of class_ro_t. name is the name of the class this ro belongs to.
ivars has a value.
(lldb) p *$4->ivars
(const ivar_list_t) $16 = {
entsize_list_tt<ivar_t, ivar_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 32, count = 1)
}
(lldb) p $16.get(0)
(ivar_t) $17 = {
offset = 0x00000001000080e8
name = 0x0000000100003fb0 "_num"
type = 0x0000000100003fb5 "q"
alignment_raw = 3
size = 8
}
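Print ivars and take element 0.
So where is the actual value of _num stored? It is found through offset. offset is the ivar's offset relative to the start of the instance, but the field itself is a pointer: the address it points to holds the actual offset.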
(lldb) p/x my
(MyClass *) $18 = 0x0000000108e4c3b0
(lldb) x/wx 0x00000001000080e8
0x1000080e8: 0x00000008
So _num is stored 8 bytes past my. The object header itself is 8 bytes (the size of isa), so _num follows it immediately.
(lldb) x/4gx $18
0x108e4c3b0: 0x011d800100008129 0x0000000000000005
0x108e4c3c0: 0x0000000108e4c490 0x0000000108e4c6d0
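Read the memory at the address my points to, 8 x 4 bytes. The first word is isa, and the second holds the value of _num.
If the ivar were a pointer, these 8 bytes would hold that pointer.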
Memory Layout of Properties
@interface MyClass : NSObject
@property(nonatomic, strong) NSNumber *number;
@property(nonatomic, assign) NSInteger integer;
@property(atomic, assign) NSInteger atomic;
@property(nonatomic, copy) NSString *Str;
@property(nonatomic, weak) NSObject *weak;
@property(nonatomic, strong, readonly) NSObject *readonly;
@end
@implementation MyClass
- (instancetype)init{
if(self = [super init]){
_readonly = NSObject.new;
}
return self;
}
@end
This defines six properties, each with different attributes.
(lldb) p my.class
(Class) $0 = 0x0000000100008408
(lldb) p (objc_class *)$0
(objc_class *) $1 = 0x0000000100008408
(lldb) p $1->data()
(class_rw_t *) $2 = 0x0000000108e27090
(lldb) p $2->ro()
(const class_ro_t *) $3 = 0x0000000100008328
(lldb) p *$3
(const class_ro_t) $4 = {
flags = 388
instanceStart = 8
instanceSize = 56
reserved = 0
= {
ivarLayout = 0x0000000100003f46 "\U00000001!\U00000011"
nonMetaclass = 0x0000000100003f46
}
name = {
std::__1::atomic<const char *> = "MyClass" {
Value = 0x0000000100003f3e "MyClass"
}
}
baseMethods = {
ptr = 0x00000001000080b8
}
baseProtocols = nil
ivars = 0x00000001000081f8
weakIvarLayout = 0x0000000100003f4a "A"
baseProperties = 0x00000001000082c0
_swiftMetadataInitializer_NEVER_USE = {}
}
(lldb) p $4.baseProperties
(property_list_t *const) $5 = 0x00000001000082c0
(lldb) p *$5
(property_list_t) $6 = {
entsize_list_tt<property_t, property_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 16, count = 6)
}
(lldb) p $6.get(0)
(property_t) $7 = (name = "number", attributes = "T@\"NSNumber\",&,N,V_number")
(lldb) p $6.get(1)
(property_t) $8 = (name = "integer", attributes = "Tq,N,V_integer")
(lldb) p $6.get(2)
(property_t) $9 = (name = "atomic", attributes = "Tq,V_atomic")
(lldb) p $6.get(3)
(property_t) $10 = (name = "Str", attributes = "T@\"NSString\",C,N,V_Str")
(lldb) p $6.get(4)
(property_t) $11 = (name = "weak", attributes = "T@\"NSObject\",W,N,V_weak")
(lldb) p $6.get(5)
(property_t) $12 = (name = "readonly", attributes = "T@\"NSObject\",R,N,V_readonly")
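As shown, property_t stores name and attributes. An attribute string such as "T@"NSObject",R,N,V_readonly" follows these rules:
It starts with T followed by the @encode type and a comma; for example NSInteger is q and NSNumber * is @"NSNumber".
Then come the attribute codes, separated by commas.
It ends with V followed by the name of the backing ivar, which here is an underscore plus the property name; more on this later.
The attribute codes are roughly these, as listed in Apple's Objective-C Runtime Programming Guide:
R            readonly
C            copy
&            retain (strong)
N            nonatomic
G<name>      custom getter
S<name>      custom setter
D            @dynamic
W            weak
P            eligible for garbage collection
t<encoding>  old-style type encoding
The documentation also gives some examples:
For instance Tc, Td, Ti, and Tf are char, double, enum/int, and float.
Some cases need attention, for example @property(getter=intGetFoo, setter=intSetFoo:) int intSetterGetter; is encoded as Ti,GintGetFoo,SintSetFoo:,VintSetterGetter
Pointers add a ^, for example int * is T^i and void * is T^v.
An id is T@, that is, the class name after it is empty.
And so on.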
So where do a property's actual structure and data live?
Continuing the lldb session above:
(lldb) p $4.ivars
(const ivar_list_t *const) $13 = 0x00000001000081f8
(lldb) p *$13
(const ivar_list_t) $14 = {
entsize_list_tt<ivar_t, ivar_list_t, 0, PointerModifierNop> = (entsizeAndFlags = 32, count = 6)
}
(lldb) p $14.get(0)
(ivar_t) $15 = {
offset = 0x00000001000083d8
name = 0x0000000100003e90 "_number"
type = 0x0000000100003f7a "@\"NSNumber\""
alignment_raw = 3
size = 8
}
(lldb) p $14.get(1)
(ivar_t) $16 = {
offset = 0x00000001000083e0
name = 0x0000000100003e98 "_integer"
type = 0x0000000100003f86 "q"
alignment_raw = 3
size = 8
}
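So declaring a property also generates an ivar.
That is not all: @property also generates a setter and a getter, which will be analyzed later together with methods, messaging, and class loading.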