tagged Pointer

Author: 码农农农SL | Published: 2019-06-30 21:56

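Where the Question Came From

In a recent interview I was asked how Apple's tagged pointer optimization works. I was completely stumped. Beyond muttering "it improves access efficiency and saves memory", I had nothing, and when pressed on how it is implemented I knew nothing at all. Things got awkward fast and I had no idea how to keep the conversation going. I ran out of words because, frankly, I was ignorant. So afterwards I hurried to fill the gap!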

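The Journey

1. Baidu

Since I set out to master tagged pointers, I needed to know what they are, what they are for, how they are implemented, their pros and cons, and so on. The pile of questions nearly buried me. Luckily, I had Baidu.
so...

[Screenshot: Baidu search results]
Seeing so many results, my thirst for knowledge went through the roof and I was too thrilled for words. I tried reading a few and found the explanations one-sided and shallow, nowhere near what I wanted. A little disappointing. But I am a "study hard and improve every day" model student, a flower of the motherland, a successor of communism; how could I give up so easily? So I kept chasing the truth through a pile of excellent (garbage) articles...

Heaven rewards the persistent.

[Screenshot: search result]
I saw that the well-known Tang Qiao had looked into this too. Mm-hm, so the only thing separating me from the gurus is mastering tagged pointers!!! (smug & daydreaming that I'm on my way to becoming a guru.png)
I opened his blog, and it seemed that even he did not analyze tagged pointers in much depth; the post gives readers the general idea. That left me lost in thought: is surface knowledge, knowing the what but not the why, really all I want? Countless motivational quotes flashed through my mind and it suddenly clicked: go, Pikachu, chase what you are after!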

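2. Developer Documentation

The drive was there, but where to start? How do you actually study tagged pointers? That's when the official documentation came to mind, so I opened Xcode, went to Help-->Developer Documentation and searched for Tagged Pointer:

[Screenshot: Developer Documentation search for Tagged Pointer]
WTF....
Ten thousand alpacas stampeded through my heart~~~
Thinking it over later, I felt like an idiot. Developer Documentation is the API reference for developers, and tagged pointer is not an API; it is an optimization Apple applied to classes such as NSString, NSNumber and NSDate. Optimization. Optimization. Hmm, wait, something is off: isn't an optimization implemented at the low level, and to explore the low level, shouldn't I be reading the runtime? Sigh, still too green. Why did it take me this long to think of the runtime? No wonder communism keeps putting off handing things over to me!!!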

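The runtime source

Now that I knew which way to go, no more wandering~
Enough chatter: first grab a copy of the runtime source from OpenSource.
Here I downloaded the latest one, objc-750.
Once the download finished, I had a faint feeling that mastering Tagged Pointer was not far off.
So I calmed my excited heart, steadied my trembling hands, opened the objc-750 source and, well, got lost again in an endless sea of header files.

[Screenshot: header files in objc-750]
Damn, which header holds the Tagged Pointer material? I can't go through them one by one, can I? You might think I had already lost myself among the headers. All I can say is: ha, how naive. I am a successor of communism; you think this can stump me?
So I opened file search 😎~
[Screenshot: file search results]
Argh, there are still several of them. Nothing for it but to ask Baidu again.
After some searching I finally landed on objc-internal.h. (smug.jpg)
At last, the real topic (full-concentration mode.jpg).
Time to read the source:
[Screenshot: objc-internal.h]
The comments are a good guide here; the Tagged Pointer part of the file sits at around line 209:
[Screenshot: objc-internal.h near line 209]
So let's jump straight there.
It opens with a check for a 64-bit system: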
#if __LP64__
#define OBJC_HAVE_TAGGED_POINTERS 1  // on a 64-bit (LP64) system, define the macro OBJC_HAVE_TAGGED_POINTERS
#endif

The code below then tests the OBJC_HAVE_TAGGED_POINTERS macro:


#if OBJC_HAVE_TAGGED_POINTERS    // so tagged pointers exist only on 64-bit systems; that is where Apple uses them to optimize NSString and similar classes.
...

Reading on:

#if __has_feature(objc_fixed_enum)  ||  __cplusplus >= 201103L
enum objc_tag_index_t : uint16_t
#else
typedef uint16_t objc_tag_index_t;
enum
#endif
{
    // 60-bit payloads
    OBJC_TAG_NSAtom            = 0, 
    OBJC_TAG_1                 = 1, 
    OBJC_TAG_NSString          = 2, 
    OBJC_TAG_NSNumber          = 3, 
    OBJC_TAG_NSIndexPath       = 4, 
    OBJC_TAG_NSManagedObjectID = 5, 
    OBJC_TAG_NSDate            = 6,

    // 60-bit reserved
    OBJC_TAG_RESERVED_7        = 7, 

    // 52-bit payloads
    OBJC_TAG_Photos_1          = 8,
    OBJC_TAG_Photos_2          = 9,
    OBJC_TAG_Photos_3          = 10,
    OBJC_TAG_Photos_4          = 11,
    OBJC_TAG_XPC_1             = 12,
    OBJC_TAG_XPC_2             = 13,
    OBJC_TAG_XPC_3             = 14,
    OBJC_TAG_XPC_4             = 15,

    OBJC_TAG_First60BitPayload = 0, 
    OBJC_TAG_Last60BitPayload  = 6, 
    OBJC_TAG_First52BitPayload = 8, 
    OBJC_TAG_Last52BitPayload  = 263, 

    OBJC_TAG_RESERVED_264      = 264
};

The moment you see #if / #else / #endif you know a condition is being tested.
Let's see exactly what is being tested:

#if __has_feature(objc_fixed_enum)  ||  __cplusplus >= 201103L
/*
  __has_feature(objc_fixed_enum):
__has_feature evaluates to 1 if the feature is both supported by Clang and standardized in the current language standard or 0 if not.
Use __has_feature(objc_fixed_enum) to determine whether support for fixed underlying types is available in Objective-C.
In other words, it asks whether the compiler lets an Objective-C enum declare a fixed underlying type.
__cplusplus >= 201103L:
C++11 or a newer standard.
*/
enum objc_tag_index_t : uint16_t
/*
Declares the enum objc_tag_index_t with the underlying type uint16_t.
This syntax is a C++11 feature (and is available in Objective-C when Clang reports objc_fixed_enum).
*/
#else
typedef uint16_t objc_tag_index_t;
enum
#endif

The enumerators follow:

{
    // 60-bit payloads
    OBJC_TAG_NSAtom            = 0, 
    OBJC_TAG_1                 = 1, 
    OBJC_TAG_NSString          = 2,   //NSString
    OBJC_TAG_NSNumber          = 3,  //NSNumber
    OBJC_TAG_NSIndexPath       = 4, //NSIndexPath
    OBJC_TAG_NSManagedObjectID = 5, 
    OBJC_TAG_NSDate            = 6,//NSDate  

    // 60-bit reserved
    OBJC_TAG_RESERVED_7        = 7, 

    // 52-bit payloads
    OBJC_TAG_Photos_1          = 8,
    OBJC_TAG_Photos_2          = 9,
    OBJC_TAG_Photos_3          = 10,
    OBJC_TAG_Photos_4          = 11,
    OBJC_TAG_XPC_1             = 12,
    OBJC_TAG_XPC_2             = 13,
    OBJC_TAG_XPC_3             = 14,
    OBJC_TAG_XPC_4             = 15,

    OBJC_TAG_First60BitPayload = 0, 
    OBJC_TAG_Last60BitPayload  = 6, 
    OBJC_TAG_First52BitPayload = 8, 
    OBJC_TAG_Last52BitPayload  = 263, 

    OBJC_TAG_RESERVED_264      = 264
};
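
The ranges in this enum can be read as a small classification rule: tags 0 to 6 are basic tags with 60-bit payloads, tags 8 to 263 are extended tags with 52-bit payloads, and 7 and 264 are reserved. Here is a minimal C sketch of that rule; the `sl_` helper names are hypothetical and not part of the runtime, and only the numeric bounds come from the enum above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bounds copied from the OBJC_TAG_* enumerators quoted above. */
enum {
    SL_LAST_60BIT  = 6,    /* OBJC_TAG_Last60BitPayload  */
    SL_FIRST_52BIT = 8,    /* OBJC_TAG_First52BitPayload */
    SL_LAST_52BIT  = 263   /* OBJC_TAG_Last52BitPayload  */
};

/* Basic tags start at 0 (OBJC_TAG_First60BitPayload), so only the upper bound is checked. */
static bool sl_is_basic_tag(uint16_t tag) {
    return tag <= SL_LAST_60BIT;
}

/* Extended tags occupy the closed range 8..263. */
static bool sl_is_ext_tag(uint16_t tag) {
    return tag >= SL_FIRST_52BIT && tag <= SL_LAST_52BIT;
}

/* Payload width in bits for a valid tag; the reserved values 7 and 264 are rejected. */
static int sl_payload_bits(uint16_t tag) {
    assert(sl_is_basic_tag(tag) || sl_is_ext_tag(tag));
    return sl_is_basic_tag(tag) ? 60 : 52;
}
```

For example, `sl_payload_bits(3)` (OBJC_TAG_NSNumber) gives 60, while `sl_payload_bits(8)` (OBJC_TAG_Photos_1) gives 52.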

Next come the functions that deal with tagged pointers:

// Returns true if tagged pointers are enabled.
// The other functions below must not be called if tagged pointers are disabled.
static inline bool 
_objc_taggedPointersEnabled(void);

// Register a class for a tagged pointer tag.
// Aborts if the tag is invalid or already in use.
OBJC_EXPORT void
_objc_registerTaggedPointerClass(objc_tag_index_t tag, Class _Nonnull cls)
    OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);

// Returns the registered class for the given tag.
// Returns nil if the tag is valid but has no registered class.
// Aborts if the tag is invalid.
OBJC_EXPORT Class _Nullable
_objc_getClassForTag(objc_tag_index_t tag)
    OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);

// Create a tagged pointer object with the given tag and payload.
// Assumes the tag is valid.
// Assumes tagged pointers are enabled.
// The payload will be silently truncated to fit.
static inline void * _Nonnull
_objc_makeTaggedPointer(objc_tag_index_t tag, uintptr_t payload);

// Return true if ptr is a tagged pointer object.
// Does not check the validity of ptr's class.
static inline bool 
_objc_isTaggedPointer(const void * _Nullable ptr);

// Extract the tag value from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// Does not check the validity of ptr's tag.
static inline objc_tag_index_t 
_objc_getTaggedPointerTag(const void * _Nullable ptr);

// Extract the payload from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// The payload value is zero-extended.
static inline uintptr_t
_objc_getTaggedPointerValue(const void * _Nullable ptr);

// Extract the payload from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// The payload value is sign-extended.
static inline intptr_t
_objc_getTaggedPointerSignedValue(const void * _Nullable ptr);

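Each function comes with a comment, and the comments are reasonably clear. So let's go through the implementations one by one, guided by the comments and the source (below, each declaration is pasted together with its implementation):
_objc_taggedPointersEnabled: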

// Returns true if tagged pointers are enabled.
// The other functions below must not be called if tagged pointers are disabled.
/*
In short: returns true when tagged pointers are enabled.
     None of the other functions below may be called when tagged pointers are disabled.
*/
static inline bool 
_objc_taggedPointersEnabled(void);

static inline bool 
_objc_taggedPointersEnabled(void)
{
    extern uintptr_t objc_debug_taggedpointer_mask;
    // declare the external variable objc_debug_taggedpointer_mask
    return (objc_debug_taggedpointer_mask != 0);
  // returns false if objc_debug_taggedpointer_mask is 0, true otherwise
}

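The implementation is simple. The key is objc_debug_taggedpointer_mask: what is it?
Since it is brought in with extern, it must be defined and initialized in some other file. So, let's find it.
It is declared in objc-gdb.h: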

OBJC_EXPORT uintptr_t objc_debug_taggedpointer_mask
    OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);

// OBJC_EXPORT is what exports objc_debug_taggedpointer_mask here.
// OBJC_EXPORT is defined as:

#if !defined(OBJC_EXPORT)
#   define OBJC_EXPORT  OBJC_EXTERN OBJC_VISIBLE
#endif

// and OBJC_EXTERN is defined as:

#if !defined(OBJC_EXTERN)
#   if defined(__cplusplus)
#       define OBJC_EXTERN extern "C" 
#   else
#       define OBJC_EXTERN extern
#   endif
#endif

// OBJC_VISIBLE is defined as:
#if !defined(OBJC_VISIBLE)

#       define OBJC_VISIBLE  __attribute__((visibility("default")))

#endif

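// So the declaration is equivalent to this (extern "C" applies only when compiling as C++; in C it is plain extern):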
extern "C" __attribute__((visibility("default"))) uintptr_t objc_debug_taggedpointer_mask
    OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);

// Likewise, the definition of OBJC_AVAILABLE can be found:
/* OBJC_AVAILABLE: shorthand for all-OS availability */

#if !defined(OBJC_AVAILABLE)
#   define OBJC_AVAILABLE(x, i, t, w, b)                            \
        __OSX_AVAILABLE(x)  __IOS_AVAILABLE(i)  __TVOS_AVAILABLE(t) \
        __WATCHOS_AVAILABLE(w)  __BRIDGEOS_AVAILABLE(b)
#endif
/* So OBJC_AVAILABLE expands to five macros, one per platform. Take __OSX_AVAILABLE as an example: */

/* for use marking APIs available info for Mac OSX */

#if defined(__has_attribute)
  #if __has_attribute(availability)
    #define __OSX_UNAVAILABLE                    __OS_AVAILABILITY(macosx,unavailable)
    #define __OSX_AVAILABLE(_vers)               __OS_AVAILABILITY(macosx,introduced=_vers)
    #define __OSX_DEPRECATED(_start, _dep, _msg) __OSX_AVAILABLE(_start) __OS_AVAILABILITY_MSG(macosx,deprecated=_dep,_msg)
  #endif
#endif

#define __OS_AVAILABILITY(_target, _availability)            __attribute__((availability(_target,_availability)))

// So, in effect:
#define __OSX_AVAILABLE(x)  __attribute__((availability(macosx,introduced=x)))
// The others work the same way, so I won't expand them here. Their main purpose is to record version constraints.
// About __attribute__((availability(...))):
/*

platform: one of macosx, ios, tvos, watchos;
introduced=version: when the declaration was introduced;
deprecated=version: when it was deprecated;
obsoleted=version: when it was obsoleted;
unavailable: marks the declaration as unavailable on that platform;
message=string: extra text, for example to point to a replacement API.
*/
/* So OBJC_EXPORT uintptr_t objc_debug_taggedpointer_mask
    OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);
expands to roughly this:
*/
extern "C" __attribute__((visibility("default"))) uintptr_t objc_debug_taggedpointer_mask __attribute__((availability(macosx,introduced=10.9))) __attribute__((availability(ios,introduced=7.0))) __attribute__((availability(tvos,introduced=9.0))) __attribute__((availability(watchos,introduced=1.0))) __attribute__((availability(bridgeos,introduced=2.0)));
    
/*
All that define-on-top-of-define looks fancy, but the statement really just says:
the symbol objc_debug_taggedpointer_mask is exported with visibility `default`, and it was introduced in macosx 10.9, ios 7.0, tvos 9.0, watchos 1.0 and bridgeos 2.0 respectively.
*/

Yet after all that, I still did not know the value of objc_debug_taggedpointer_mask.
So the search went on.
After some digging I finally found where it is assigned:

uintptr_t objc_debug_taggedpointer_mask = _OBJC_TAG_MASK;

#if OBJC_MSB_TAGGED_POINTERS
#   define _OBJC_TAG_MASK (1UL<<63)
#else
#   define _OBJC_TAG_MASK 1UL
#endif

#if (TARGET_OS_OSX || TARGET_OS_IOSMAC) && __x86_64__
    // 64-bit Mac - tag bit is LSB    
// The comment says it all: on a 64-bit Mac the tag bit is the LSB (least significant bit).
#   define OBJC_MSB_TAGGED_POINTERS 0
#else
    // Everything else - tag bit is MSB
// Everywhere else the tag bit is the MSB (most significant bit).
#   define OBJC_MSB_TAGGED_POINTERS 1
#endif
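
To make the two layouts concrete, here is a small C sketch of the tag-bit test with the mask passed in explicitly, so both cases can be tried on any machine. The `sl_`/`SL_` names are hypothetical; the two mask values are the `_OBJC_TAG_MASK` definitions quoted above, and the comparison is the one `_objc_isTaggedPointer` is declared to perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SL_TAG_MASK_LSB ((uint64_t)1)        /* _OBJC_TAG_MASK when OBJC_MSB_TAGGED_POINTERS is 0 */
#define SL_TAG_MASK_MSB ((uint64_t)1 << 63)  /* _OBJC_TAG_MASK when OBJC_MSB_TAGGED_POINTERS is 1 */

/* True when the bit selected by mask is set in the raw pointer bits. */
static bool sl_has_tag_bit(uint64_t ptr_bits, uint64_t mask) {
    assert(mask == SL_TAG_MASK_LSB || mask == SL_TAG_MASK_MSB);
    return (ptr_bits & mask) == mask;
}
```

An ordinary heap address such as 0x100001058 is at least 8-byte aligned and lives in the lower half of the address space, so neither its lowest nor its highest bit is set and it fails the test under both masks.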

All that detour, and this is what I get? so interesting!!!
Back to the very first function, _objc_taggedPointersEnabled:

static inline bool 
_objc_taggedPointersEnabled(void);

static inline bool 
_objc_taggedPointersEnabled(void)
{
    extern uintptr_t objc_debug_taggedpointer_mask;
    return (objc_debug_taggedpointer_mask != 0);
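/*
On a 64-bit system this is roughly equivalent to:
if (tagged pointers have not been disabled, i.e. the mask is still nonzero)
{
  return true;
}
return false;
*/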
}

That settles this function; on to the next one.
_objc_registerTaggedPointerClass

OBJC_EXPORT void
_objc_registerTaggedPointerClass(objc_tag_index_t tag, Class _Nonnull cls)
    OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);
/***********************************************************************
* _objc_registerTaggedPointerClass
* Set the class to use for the given tagged pointer index.
* Aborts if the tag is out of range, or if the tag is already 
* used by some other class.
**********************************************************************/
void
_objc_registerTaggedPointerClass(objc_tag_index_t tag, Class cls)
{
    if (objc_debug_taggedpointer_mask == 0) {// First check whether tagged pointers are enabled; objc_debug_taggedpointer_mask == 0 means tagged pointers are disabled.
        _objc_fatal("tagged pointers are disabled");
/*
This calls _objc_fatal.
Here is the implementation of _objc_fatal:
void _objc_fatal(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    _vcprintf(fmt, args);
    va_end(args);
// Collect the format arguments through va_list and print them.
    _cprintf("\n");

    abort();
// Terminate the process.
}
*/
// So this prints "tagged pointers are disabled" and aborts.
    }

    Class *slot = classSlotForTagIndex(tag);
    if (!slot) {
        _objc_fatal("tag index %u is invalid", (unsigned int)tag);
    }

    Class oldCls = *slot;
    
    if (cls  &&  oldCls  &&  cls != oldCls) {
        _objc_fatal("tag index %u used for two different classes "
                    "(was %p %s, now %p %s)", tag, 
                    oldCls, oldCls->nameForLogging(), 
                    cls, cls->nameForLogging());
    }

    *slot = cls;

    // Store a placeholder class in the basic tag slot that is 
    // reserved for the extended tag space, if it isn't set already.
    // Do this lazily when the first extended tag is registered so 
    // that old debuggers characterize bogus pointers correctly more often.
    if (tag < OBJC_TAG_First60BitPayload || tag > OBJC_TAG_Last60BitPayload) {
        Class *extSlot = classSlotForBasicTagIndex(OBJC_TAG_RESERVED_7);
        if (*extSlot == nil) {
            extern objc_class OBJC_CLASS_$___NSUnrecognizedTaggedPointer;
            *extSlot = (Class)&OBJC_CLASS_$___NSUnrecognizedTaggedPointer;
        }
    }
}

Reading on, we hit a function we have not seen before:

Class * classSlotForTagIndex(tag);

Its implementation can be found in the runtime:

// Returns a pointer to the class's storage in the tagged class arrays, 
// or nil if the tag is out of range.
/*
In other words: returns a pointer to the slot that stores the class in the tagged class arrays, or nil if the tag is out of range.
*/
static Class *  
classSlotForTagIndex(objc_tag_index_t tag)
{
    if (tag >= OBJC_TAG_First60BitPayload && tag <= OBJC_TAG_Last60BitPayload) {
// This checks whether the tag is one of the tags with a 60-bit payload.
/*
// 60-bit payloads
    OBJC_TAG_NSAtom            = 0, 
    OBJC_TAG_1                 = 1, 
    OBJC_TAG_NSString          = 2, 
    OBJC_TAG_NSNumber          = 3, 
    OBJC_TAG_NSIndexPath       = 4, 
    OBJC_TAG_NSManagedObjectID = 5, 
    OBJC_TAG_NSDate            = 6,
*/
// All 7 tags above have 60-bit payloads (60 bits are used to store the data).
        return classSlotForBasicTagIndex(tag);// calls another function; its implementation is analyzed below
    }

    if (tag >= OBJC_TAG_First52BitPayload && tag <= OBJC_TAG_Last52BitPayload) {
        int index = tag - OBJC_TAG_First52BitPayload;
        uintptr_t tagObfuscator = ((objc_debug_taggedpointer_obfuscator
                                    >> _OBJC_TAG_EXT_INDEX_SHIFT)
                                   & _OBJC_TAG_EXT_INDEX_MASK);
        return &objc_tag_ext_classes[index ^ tagObfuscator];
    }

    return nil;
}

Analyzing classSlotForBasicTagIndex.

// Returns a pointer to the class's storage in the tagged class arrays.
// Assumes the tag is a valid basic tag. (The caller has already range-checked the tag.)
static Class *
classSlotForBasicTagIndex(objc_tag_index_t tag)
{
// Here comes another odd variable, objc_debug_taggedpointer_obfuscator 😞... What on earth is it?
    uintptr_t tagObfuscator = ((objc_debug_taggedpointer_obfuscator
                                >> _OBJC_TAG_INDEX_SHIFT)
                               & _OBJC_TAG_INDEX_MASK);
    uintptr_t obfuscatedTag = tag ^ tagObfuscator;
    // Array index in objc_tag_classes includes the tagged bit itself
#if SUPPORT_MSB_TAGGED_POINTERS
    return &objc_tag_classes[0x8 | obfuscatedTag];
#else
    return &objc_tag_classes[(obfuscatedTag << 1) | 1];
#endif
}

Analyzing objc_debug_taggedpointer_obfuscator

/***********************************************************************
* initializeTaggedPointerObfuscator
* Initialize objc_debug_taggedpointer_obfuscator with randomness.
*
* The tagged pointer obfuscator is intended to make it more difficult
* for an attacker to construct a particular object as a tagged pointer,
* in the presence of a buffer overflow or other write control over some
* memory. The obfuscator is XORed with the tagged pointers when setting
* or retrieving payload values. They are filled with randomness on first
* use.
**********************************************************************/
static void
initializeTaggedPointerObfuscator(void)
{
    if (sdkIsOlderThan(10_14, 12_0, 12_0, 5_0, 3_0) ||
        // Set the obfuscator to zero for apps linked against older SDKs,
        // in case they're relying on the tagged pointer representation.
        DisableTaggedPointerObfuscation) {
/*
// OPTION(var, env, help)
OPTION( DisableTaggedPointerObfuscation, OBJC_DISABLE_TAG_OBFUSCATION,    "disable obfuscation of tagged pointers")
*/
/*
#define sdkIsOlderThan(x, i, t, w) (sdkVersion() < DYLD_OS_VERSION(x, i, t, w))
#if TARGET_OS_OSX
#   define DYLD_OS_VERSION(x, i, t, w) DYLD_MACOSX_VERSION_##x
#   define sdkVersion() dyld_get_program_sdk_version()

#elif TARGET_OS_IOS
#   define DYLD_OS_VERSION(x, i, t, w) DYLD_IOS_VERSION_##i
#   define sdkVersion() dyld_get_program_sdk_version()

#elif TARGET_OS_TV
    // dyld does not currently have distinct constants for tvOS
#   define DYLD_OS_VERSION(x, i, t, w) DYLD_IOS_VERSION_##t
#   define sdkVersion() dyld_get_program_sdk_version()

#elif TARGET_OS_WATCH
#   define DYLD_OS_VERSION(x, i, t, w) DYLD_WATCHOS_VERSION_##w
    // watchOS has its own API for compatibility reasons
#   define sdkVersion() dyld_get_program_sdk_watch_os_version()

#else
#   error unknown OS
#endif
*/
        objc_debug_taggedpointer_obfuscator = 0;
    } else {

        // Pull random data into the variable, then shift away all non-payload bits.
        arc4random_buf(&objc_debug_taggedpointer_obfuscator,
                       sizeof(objc_debug_taggedpointer_obfuscator));
    //arc4random_buf() fills the region buf of length nbytes with random data.

        objc_debug_taggedpointer_obfuscator &= ~_OBJC_TAG_MASK;
// Take a random number and clear its tagged pointer flag bit (the highest or the lowest bit). That is:
//objc_debug_taggedpointer_obfuscator = objc_debug_taggedpointer_obfuscator & (0x7FFFFFFFFFFFFFFF or 0xFFFFFFFFFFFFFFFE)
    }
}
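
The effect of this initialization is easy to try in isolation. Below is a hedged C sketch for the MSB layout: the obfuscator is a random value with the tag bit cleared, and because XOR is its own inverse, the same operation both hides and recovers the pointer bits. The `sl_` helpers are hypothetical stand-ins, and the caller supplies the random bits instead of calling `arc4random_buf`.

```c
#include <assert.h>
#include <stdint.h>

#define SL_TAG_MASK_MSB ((uint64_t)1 << 63)  /* _OBJC_TAG_MASK in the MSB layout */

/* Clear the tag bit of a random seed, as the &= ~_OBJC_TAG_MASK line above does. */
static uint64_t sl_make_obfuscator(uint64_t random_bits) {
    return random_bits & ~SL_TAG_MASK_MSB;
}

/* XOR is its own inverse, so this one function both encodes and decodes. */
static uint64_t sl_xor_obfuscate(uint64_t bits, uint64_t obfuscator) {
    assert((obfuscator & SL_TAG_MASK_MSB) == 0);  /* tag bit must pass through unchanged */
    return bits ^ obfuscator;
}
```

Applying `sl_xor_obfuscate` twice with the same obfuscator returns the original bits, and the tag bit is the same before and after, which is why the tagged-pointer test still works on an obfuscated pointer.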

With objc_debug_taggedpointer_obfuscator more or less digested, back to classSlotForBasicTagIndex:

static Class *
classSlotForBasicTagIndex(objc_tag_index_t tag)
{
    uintptr_t tagObfuscator = ((objc_debug_taggedpointer_obfuscator
                                >> _OBJC_TAG_INDEX_SHIFT)
                               & _OBJC_TAG_INDEX_MASK);
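// This is now easy to follow: take the random value (tag bit already cleared) and shift it right by _OBJC_TAG_INDEX_SHIFT bits.

/*
#if OBJC_MSB_TAGGED_POINTERS
#   define _OBJC_TAG_INDEX_SHIFT 60
#else
#   define _OBJC_TAG_INDEX_SHIFT 1
#endif
There are always two cases; only one is examined here ("_OBJC_TAG_INDEX_SHIFT 60").
Shifting right by 60 keeps the top four bits of the random value objc_debug_taggedpointer_obfuscator.

#define _OBJC_TAG_INDEX_MASK 0x7
The result is then ANDed with 0x7.
The net effect is to extract bits 60~62 (the low 3 of the top 4 bits).
*/
    uintptr_t obfuscatedTag = tag ^ tagObfuscator;
// XOR tag with tagObfuscator.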
    // Array index in objc_tag_classes includes the tagged bit itself
#if SUPPORT_MSB_TAGGED_POINTERS
    return &objc_tag_classes[0x8 | obfuscatedTag];
#else
    return &objc_tag_classes[(obfuscatedTag << 1) | 1];
#endif
// The array index includes the tag bit itself, set to 1; the function returns the address of that entry in objc_tag_classes.
}
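
The index arithmetic can be checked by hand with a short C sketch. It takes the already shifted and masked 3-bit `tagObfuscator` as a parameter and returns the index that would be used into `objc_tag_classes` under each layout. `sl_basic_slot_index` and its `msb_layout` flag are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Index into objc_tag_classes for a basic tag, given the 3 obfuscator bits
   (tagObfuscator in the runtime code above). */
static unsigned sl_basic_slot_index(unsigned tag, unsigned tag_obfuscator, int msb_layout) {
    assert(tag <= 7 && tag_obfuscator <= 7);
    unsigned obfuscated = tag ^ tag_obfuscator;
    return msb_layout ? (0x8u | obfuscated)        /* tag bit becomes bit 3 of the index */
                      : ((obfuscated << 1) | 1u);  /* tag bit becomes bit 0 of the index */
}
```

With the obfuscator at 0 (as it is for apps linked against older SDKs), OBJC_TAG_NSNumber = 3 maps to index 0xB in the MSB layout and to index 7 in the LSB layout.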

Back to the function we started with, _objc_registerTaggedPointerClass:

void
_objc_registerTaggedPointerClass(objc_tag_index_t tag, Class cls)
{
    if (objc_debug_taggedpointer_mask == 0) {
        _objc_fatal("tagged pointers are disabled");
    }

    Class *slot = classSlotForTagIndex(tag);
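/*
As we now know, slot is the address of the entry for this tag value in the objc_tag_classes array (or in objc_tag_ext_classes for an extended tag).
*/
    if (!slot) {
        _objc_fatal("tag index %u is invalid", (unsigned int)tag);
    }
// If no slot was found, abort and report that the tag index is invalid.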

    Class oldCls = *slot;
    
    if (cls  &&  oldCls  &&  cls != oldCls) {
        _objc_fatal("tag index %u used for two different classes "
                    "(was %p %s, now %p %s)", tag, 
                    oldCls, oldCls->nameForLogging(), 
                    cls, cls->nameForLogging());
    }
  // This checks whether the tag index is already used by a different class.

    *slot = cls;

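// The block below still needs further study~~~ Going by its own comment, it lazily stores a placeholder class in the reserved basic slot (OBJC_TAG_RESERVED_7) the first time an extended tag is registered.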
    // Store a placeholder class in the basic tag slot that is 
    // reserved for the extended tag space, if it isn't set already.
    // Do this lazily when the first extended tag is registered so 
    // that old debuggers characterize bogus pointers correctly more often.
    if (tag < OBJC_TAG_First60BitPayload || tag > OBJC_TAG_Last60BitPayload) {
        Class *extSlot = classSlotForBasicTagIndex(OBJC_TAG_RESERVED_7);
        if (*extSlot == nil) {
            extern objc_class OBJC_CLASS_$___NSUnrecognizedTaggedPointer;
            *extSlot = (Class)&OBJC_CLASS_$___NSUnrecognizedTaggedPointer;
        }
    }
}

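That wraps up this function. What it does is:
1. Check whether tagged pointers are disabled.
2. Use the tag value to look up the matching slot in the objc_tag_classes array (the lookup was analyzed above).
3. Abort if no slot was found, or if the slot is already taken by a different class.
4. Store cls into the slot.

Next function: _objc_getClassForTag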

// Returns the registered class for the given tag.
// Returns nil if the tag is valid but has no registered class.
// Aborts if the tag is invalid.
OBJC_EXPORT Class _Nullable
_objc_getClassForTag(objc_tag_index_t tag)
    OBJC_AVAILABLE(10.9, 7.0, 9.0, 1.0, 2.0);

/***********************************************************************
* _objc_getClassForTag
* Returns the class that is using the given tagged pointer tag.
* Returns nil if no class is using that tag or the tag is out of range.
**********************************************************************/
Class
_objc_getClassForTag(objc_tag_index_t tag)
{
    Class *slot = classSlotForTagIndex(tag);
    if (slot) return *slot;
    else return nil;
}

Well, what a coincidence: classSlotForTagIndex was already covered while analyzing the previous function. See the walkthrough of classSlotForBasicTagIndex for the details.

Next function: _objc_makeTaggedPointer

// Create a tagged pointer object with the given tag and payload.
// Assumes the tag is valid.
// Assumes tagged pointers are enabled.
// The payload will be silently truncated to fit.
static inline void * _Nonnull
_objc_makeTaggedPointer(objc_tag_index_t tag, uintptr_t payload);

static inline void * _Nonnull
_objc_makeTaggedPointer(objc_tag_index_t tag, uintptr_t value)
{
    // PAYLOAD_LSHIFT and PAYLOAD_RSHIFT are the payload extraction shifts.
    // They are reversed here for payload insertion.

    // assert(_objc_taggedPointersEnabled());
    if (tag <= OBJC_TAG_Last60BitPayload) {
// only this branch is analyzed here
        // assert(((value << _OBJC_TAG_PAYLOAD_RSHIFT) >> _OBJC_TAG_PAYLOAD_LSHIFT) == value);
        uintptr_t result =
            (_OBJC_TAG_MASK | 
             ((uintptr_t)tag << _OBJC_TAG_INDEX_SHIFT) | 
             ((value << _OBJC_TAG_PAYLOAD_RSHIFT) >> _OBJC_TAG_PAYLOAD_LSHIFT));
        return _objc_encodeTaggedPointer(result);
    } else {
        // assert(tag >= OBJC_TAG_First52BitPayload);
        // assert(tag <= OBJC_TAG_Last52BitPayload);
        // assert(((value << _OBJC_TAG_EXT_PAYLOAD_RSHIFT) >> _OBJC_TAG_EXT_PAYLOAD_LSHIFT) == value);
        uintptr_t result =
            (_OBJC_TAG_EXT_MASK |
             ((uintptr_t)(tag - OBJC_TAG_First52BitPayload) << _OBJC_TAG_EXT_INDEX_SHIFT) |
             ((value << _OBJC_TAG_EXT_PAYLOAD_RSHIFT) >> _OBJC_TAG_EXT_PAYLOAD_LSHIFT));
        return _objc_encodeTaggedPointer(result);
    }
}

First, the value of result:

uintptr_t result = (_OBJC_TAG_MASK |  ((uintptr_t)tag << _OBJC_TAG_INDEX_SHIFT) | ((value << _OBJC_TAG_PAYLOAD_RSHIFT) >> _OBJC_TAG_PAYLOAD_LSHIFT));

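/*
As before, only the _OBJC_TAG_INDEX_SHIFT = 60 case is analyzed here.
((uintptr_t)tag << _OBJC_TAG_INDEX_SHIFT): moves the low 4 bits of tag into the top 4 bits. A basic tag is at most 6, so only bits 60~62 are ever set and bit 63 stays free for the flag bit.
((value << _OBJC_TAG_PAYLOAD_RSHIFT) >> _OBJC_TAG_PAYLOAD_LSHIFT): clears the top 4 bits of value.
((uintptr_t)tag << _OBJC_TAG_INDEX_SHIFT) | ((value << _OBJC_TAG_PAYLOAD_RSHIFT) >> _OBJC_TAG_PAYLOAD_LSHIFT): puts the low four bits of tag into the top four bits of value.
A 32-bit analogy of that step (shifting by 28 instead of 60):
tag:0x89abcdef
value:0x76543210
tagvalue:0xf6543210

So result is that combination of tag and value with the tagged pointer flag bit (bit 63) set to 1.
A 64-bit example:
tag=3 (OBJC_TAG_NSNumber)
value=0xf00000000000007b
then result=(1UL<<63) | (tag<<60) | ((value<<4)>>4) = 0x8000000000000000 | 0x3000000000000000 | 0x000000000000007b = 0xb00000000000007b
*/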
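
The same composition can be written as a few lines of C for both layouts. This is only a sketch under the shift values quoted in this article (index shift 60 or 1, payload shifted by 4); the `sl_` function names are hypothetical, and the obfuscation step of `_objc_encodeTaggedPointer` is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Basic tag, flag bit in the MSB: [1][tag:3][payload:60]. */
static uint64_t sl_make_basic_msb(unsigned tag, uint64_t value) {
    assert(tag <= 6);                    /* OBJC_TAG_Last60BitPayload */
    return ((uint64_t)1 << 63)           /* _OBJC_TAG_MASK */
         | ((uint64_t)tag << 60)         /* _OBJC_TAG_INDEX_SHIFT */
         | ((value << 4) >> 4);          /* top 4 bits of value are silently dropped */
}

/* Basic tag, flag bit in the LSB: [payload:60][tag:3][1]. */
static uint64_t sl_make_basic_lsb(unsigned tag, uint64_t value) {
    assert(tag <= 6);
    return (uint64_t)1                   /* _OBJC_TAG_MASK */
         | ((uint64_t)tag << 1)          /* _OBJC_TAG_INDEX_SHIFT */
         | (value << 4);                 /* payload sits above the 4 tag bits */
}
```

In the MSB layout, tag 3 with value 0xf00000000000007b produces 0xb00000000000007b: the stray top nibble of the value is truncated, exactly as the header comment "The payload will be silently truncated to fit" warns.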

Now for _objc_encodeTaggedPointer:

static inline void * _Nonnull
_objc_encodeTaggedPointer(uintptr_t ptr)
{
    return (void *)(objc_debug_taggedpointer_obfuscator ^ ptr);
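/*
objc_debug_taggedpointer_obfuscator was analyzed above; it has an initializer, initializeTaggedPointerObfuscator.
Its value is either 0 or a 64-bit random number produced by arc4random_buf with the _OBJC_TAG_MASK bit (the highest or lowest bit, depending on the platform) cleared.
So what is returned is result ^ objc_debug_taggedpointer_obfuscator, and because the obfuscator's flag bit is 0, the flag bit of result survives the XOR.
*/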
}

Next function: _objc_isTaggedPointer

// Return true if ptr is a tagged pointer object.
// Does not check the validity of ptr's class.
static inline bool 
_objc_isTaggedPointer(const void * _Nullable ptr);

static inline bool 
_objc_isTaggedPointer(const void * _Nullable ptr)
{
    return ((uintptr_t)ptr & _OBJC_TAG_MASK) == _OBJC_TAG_MASK;
/*
This one is simple: given a pointer, check whether its _OBJC_TAG_MASK bit is 1.
*/
}

_objc_getTaggedPointerTag

// Extract the tag value from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// Does not check the validity of ptr's tag.
static inline objc_tag_index_t 
_objc_getTaggedPointerTag(const void * _Nullable ptr);

static inline objc_tag_index_t 
_objc_getTaggedPointerTag(const void * _Nullable ptr) 
{
    // assert(_objc_isTaggedPointer(ptr));
    uintptr_t value = _objc_decodeTaggedPointer(ptr);
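/*
Decode the pointer with the random value objc_debug_taggedpointer_obfuscator.
*/
    uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
//    uintptr_t basicTag = (value >> 60) & 0x7;// MSB case: move the top four bits of the decoded value down to the low four bits, then clear bit 3 (the flag bit), leaving the 3-bit tag index.

// If you only care about classes such as NSNumber, there is no need to dig into the rest.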
    uintptr_t extTag =   (value >> _OBJC_TAG_EXT_INDEX_SHIFT) & _OBJC_TAG_EXT_INDEX_MASK;

    if (basicTag == _OBJC_TAG_INDEX_MASK) {
        return (objc_tag_index_t)(extTag + OBJC_TAG_First52BitPayload);
    } else {
        return (objc_tag_index_t)basicTag;
    }
}
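
Here is a small C sketch of the basic-tag part of this function, usable for either layout by passing the index shift as a parameter. It works on a value that has already been de-obfuscated. `sl_basic_tag` and `sl_uses_extended_tag` are hypothetical helper names; the mask 0x7 and the shifts 60 and 1 are the ones quoted in this article.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SL_TAG_INDEX_MASK 0x7u   /* _OBJC_TAG_INDEX_MASK */

/* Extract the 3-bit basic tag. shift is _OBJC_TAG_INDEX_SHIFT:
   60 for the MSB layout, 1 for the LSB layout. */
static unsigned sl_basic_tag(uint64_t value, unsigned shift) {
    assert(shift == 60 || shift == 1);
    return (unsigned)((value >> shift) & SL_TAG_INDEX_MASK);
}

/* A basic tag of 7 (all index bits set) signals that the real tag is in the extended field. */
static bool sl_uses_extended_tag(uint64_t value, unsigned shift) {
    return sl_basic_tag(value, shift) == SL_TAG_INDEX_MASK;
}
```

For instance, `sl_basic_tag(0xb00000000000007b, 60)` and `sl_basic_tag(0x7b27, 1)` both give 3, which is OBJC_TAG_NSNumber.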

_objc_getTaggedPointerValue

// Extract the payload from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// The payload value is zero-extended.
static inline uintptr_t
_objc_getTaggedPointerValue(const void * _Nullable ptr);

static inline uintptr_t
_objc_getTaggedPointerValue(const void * _Nullable ptr) 
{
    // assert(_objc_isTaggedPointer(ptr));
    uintptr_t value = _objc_decodeTaggedPointer(ptr);
    uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
    if (basicTag == _OBJC_TAG_INDEX_MASK) {
        return (value << _OBJC_TAG_EXT_PAYLOAD_LSHIFT) >> _OBJC_TAG_EXT_PAYLOAD_RSHIFT;
    } else {
        return (value << _OBJC_TAG_PAYLOAD_LSHIFT) >> _OBJC_TAG_PAYLOAD_RSHIFT;
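// This is the branch that matters: value is what _objc_decodeTaggedPointer returned; shifting it left and then right strips the tag bits (the top four bits in the MSB layout) and leaves the payload.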
    }
}

_objc_getTaggedPointerSignedValue

// Extract the payload from the given tagged pointer object.
// Assumes ptr is a valid tagged pointer object.
// The payload value is sign-extended.
static inline intptr_t
_objc_getTaggedPointerSignedValue(const void * _Nullable ptr);

static inline intptr_t
_objc_getTaggedPointerSignedValue(const void * _Nullable ptr) 
{
    // assert(_objc_isTaggedPointer(ptr));
    uintptr_t value = _objc_decodeTaggedPointer(ptr);
    uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
    if (basicTag == _OBJC_TAG_INDEX_MASK) {
        return ((intptr_t)value << _OBJC_TAG_EXT_PAYLOAD_LSHIFT) >> _OBJC_TAG_EXT_PAYLOAD_RSHIFT;
    } else {
        return ((intptr_t)value << _OBJC_TAG_PAYLOAD_LSHIFT) >> _OBJC_TAG_PAYLOAD_RSHIFT;
    }
}
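/*
This function mirrors _objc_getTaggedPointerValue above, with one difference: value is cast to intptr_t before shifting, so the payload is sign-extended instead of zero-extended.
*/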
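
The difference between the two extraction functions only shows up for payloads whose top bit is set. The C sketch below contrasts them for the MSB layout (both payload shifts are 4). The `sl_` names are hypothetical, and unlike the runtime code the left shift is done on the unsigned value before the cast, which sidesteps signed-overflow questions in plain C.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extended payload: a logical right shift fills the top 4 bits with 0. */
static uint64_t sl_payload_unsigned(uint64_t value) {
    return (value << 4) >> 4;
}

/* Sign-extended payload: an arithmetic right shift copies payload bit 59
   into the top 4 bits. */
static int64_t sl_payload_signed(uint64_t value) {
    /* Right-shifting a negative signed value is implementation-defined in C;
       this sketch assumes the usual sign-extending behavior and checks it. */
    assert((-1 >> 1) == -1);
    return (int64_t)(value << 4) >> 4;
}
```

A 60-bit payload of all ones comes back as 0x0fffffffffffffff from the unsigned version and as -1 from the signed one; small positive payloads come back identical from both.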

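That more or less covers the theory.
To sum up:
Tagged pointers are an optimization Apple applies on 64-bit systems to NSNumber and similar classes, mainly to save memory. The principle: 4 bits of the pointer are set aside as tag bits. One of them marks whether the object pointer is a tagged pointer, and the other 3 store the index of the pointer's class in the objc_tag_classes array; the remaining bits carry the payload. A set of functions operates on them, for example to test whether a pointer is a tagged pointer (_objc_isTaggedPointer), to get a tagged pointer's data (_objc_getTaggedPointerValue), to get its type (_objc_getTaggedPointerTag), and so on.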

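Practice (environment: Mac, version 10.14.5)

With the theory understood, let's put it to work:

  • 1 Getting information out of a known tagged pointer.
    (1) Check whether tagged pointers are enabled or disabled.
    By the analysis above, the way to check whether tagged pointers are enabled is to call _objc_taggedPointersEnabled.
    However, _objc_taggedPointersEnabled is declared static, so it cannot be declared extern and called. The only option is to reimplement it.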
#import <Foundation/Foundation.h>

bool sl_objc_taggedPointersEnabled(void)
{
    extern uintptr_t objc_debug_taggedpointer_mask;
    return (objc_debug_taggedpointer_mask != 0);
}

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        
        bool b = sl_objc_taggedPointersEnabled();
        NSLog(@"Enable:%d",b);
    }
    return 0;
}

Output:

2019-07-01 17:37:14.584230+0800 AAA[10865:711050] Enable:1

So on Mac 10.14.5 tagged pointers are enabled.
(2) Check whether a pointer is a tagged pointer.

#   define _OBJC_TAG_MASK 1UL// This should really be selected per platform; the test runs on a known Mac 10.14.5, so it is defined directly.
bool sl_objc_isTaggedPointer(const void * _Nullable ptr)
{
    return ((uintptr_t)ptr & _OBJC_TAG_MASK) == _OBJC_TAG_MASK;
}
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSNumber *n = @123;
        NSString *s = @"123";
        NSString *s_m_c = [[s mutableCopy] copy];
        NSObject *o = [NSObject new];
        bool nb = sl_objc_isTaggedPointer((__bridge void *)n);
        bool sb = sl_objc_isTaggedPointer((__bridge void *)s);
        bool smcb = sl_objc_isTaggedPointer((__bridge void *)s_m_c);
        bool ob = sl_objc_isTaggedPointer((__bridge void *)o);
        NSLog(@"nb(0x%lx) is TaggedPointer %d",(uintptr_t)n,nb);
        NSLog(@"sb(0x%lx) is TaggedPointer %d",(uintptr_t)s,sb);
        NSLog(@"smcb(0x%lx) is TaggedPointer %d",(uintptr_t)s_m_c,smcb);
        NSLog(@"ob(0x%lx) is TaggedPointer %d",(uintptr_t)o,ob);
    }
    return 0;
}

Output:

2019-07-01 17:58:36.116391+0800 AAA[10973:719191] nb(0x56b19f9c2784f641) is TaggedPointer 1
2019-07-01 17:58:36.116530+0800 AAA[10973:719191] sb(0x100001058) is TaggedPointer 0
2019-07-01 17:58:36.116538+0800 AAA[10973:719191] smcb(0x56b19f9c14b6bc53) is TaggedPointer 1
2019-07-01 17:58:36.116544+0800 AAA[10973:719191] ob(0x10051e520) is TaggedPointer 0

(3) Get the type of a tagged pointer (the tag enum value)

#   define _OBJC_TAG_INDEX_SHIFT 1
#define _OBJC_TAG_INDEX_MASK 0x7
extern uintptr_t objc_debug_taggedpointer_obfuscator;
uintptr_t sl_objc_getTaggedPointerTag(const void * _Nullable ptr)
{
    uintptr_t value = objc_debug_taggedpointer_obfuscator ^ (uintptr_t)ptr;
    uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
    return basicTag;
}
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSNumber *n = @123;
        NSString *s = @"123";
        NSString *s_m_c = [[s mutableCopy] copy];
        uintptr_t n_t = sl_objc_getTaggedPointerTag((__bridge void *)n);
        uintptr_t s_m_c_t = sl_objc_getTaggedPointerTag((__bridge void *)s_m_c);
        NSLog(@"n's type is %ld",n_t);
        NSLog(@"s_m_c's type is %ld",s_m_c_t);        
    }
    return 0;
}

log:

2019-07-04 09:41:34.017122+0800 AAA[11655:849484] n's type is 3
2019-07-04 09:41:34.017318+0800 AAA[11655:849484] s_m_c's type is 2

Compare the log with objc_tag_index_t:

// 60-bit payloads
    OBJC_TAG_NSAtom            = 0, 
    OBJC_TAG_1                 = 1, 
    OBJC_TAG_NSString          = 2, 
    OBJC_TAG_NSNumber          = 3, 
    OBJC_TAG_NSIndexPath       = 4, 
    OBJC_TAG_NSManagedObjectID = 5, 
    OBJC_TAG_NSDate            = 6,

(4) Get the class a tagged pointer belongs to:

#   define _OBJC_TAG_INDEX_SHIFT 1
#define _OBJC_TAG_INDEX_MASK 0x7
extern uintptr_t objc_debug_taggedpointer_obfuscator;
uintptr_t sl_objc_getTaggedPointerTag(const void * _Nullable ptr)
{
    uintptr_t value = objc_debug_taggedpointer_obfuscator ^ (uintptr_t)ptr;
    uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
    return basicTag;
}
extern Class objc_debug_taggedpointer_classes[];
Class *sl_classSlotForBasicTagIndex(uintptr_t tag)
{
    uintptr_t tagObfuscator = ((objc_debug_taggedpointer_obfuscator
                                >> _OBJC_TAG_INDEX_SHIFT)
                               & _OBJC_TAG_INDEX_MASK);
    uintptr_t obfuscatedTag = tag ^ tagObfuscator;
    // Array index in objc_tag_classes includes the tagged bit itself
    return &objc_debug_taggedpointer_classes[(obfuscatedTag << 1) | 1];
}
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSNumber *n = @123;
        NSString *s = @"123";
        NSString *s_m_c = [[s mutableCopy] copy];
        uintptr_t n_t = sl_objc_getTaggedPointerTag((__bridge void *)n);
        uintptr_t s_m_c_t = sl_objc_getTaggedPointerTag((__bridge void *)s_m_c);
        Class *n_t_c = sl_classSlotForBasicTagIndex(n_t);
        Class *s_m_c_t_c = sl_classSlotForBasicTagIndex(s_m_c_t);
        NSLog(@"n's Class is %@",*n_t_c);
        NSLog(@"s_m_c's Clss is %@",*s_m_c_t_c);
    }
    return 0;
}

log:

2019-07-04 10:58:05.075581+0800 AAA[11913:876264] n's Class is __NSCFNumber
2019-07-04 10:58:05.075754+0800 AAA[11913:876264] s_m_c's Clss is NSTaggedPointerString

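Looking at the examples above, maybe I should flash a smile that is cocky yet still modest, because everything went so smoothly and the theory was confirmed. But is that really the whole story? Here is an example that does not go as hoped.
(5) Get the value from a tagged pointer: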

#   define _OBJC_TAG_PAYLOAD_LSHIFT 0
#   define _OBJC_TAG_PAYLOAD_RSHIFT 4
uintptr_t sl_objc_getTaggedPointerValue(const void * _Nullable ptr)
{
    // assert(_objc_isTaggedPointer(ptr));
    uintptr_t value = objc_debug_taggedpointer_obfuscator ^ (uintptr_t)ptr;
//    uintptr_t basicTag = (value >> _OBJC_TAG_INDEX_SHIFT) & _OBJC_TAG_INDEX_MASK;
    return (value << _OBJC_TAG_PAYLOAD_LSHIFT) >> _OBJC_TAG_PAYLOAD_RSHIFT;
}
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSNumber *n = @123;
        uintptr_t n_v = sl_objc_getTaggedPointerValue((__bridge void *)n);
        NSLog(@"hex(123) = 0x%x",123);
        NSLog(@"n's value is 0x%lx",n_v);
    }
    return 0;
}

log:

2019-07-04 11:14:59.639165+0800 AAA[11959:881845] hex(123) = 0x7b
2019-07-04 11:14:59.639316+0800 AAA[11959:881845] n's value is 0x7b2

Something feels a bit off, doesn't it?
Let's try another number:

NSNumber *n = [NSNumber numberWithLongLong:121111];
        uintptr_t n_v = sl_objc_getTaggedPointerValue((__bridge void *)n);
        NSLog(@"hex(121111) = 0x%x",121111);
        NSLog(@"n's value is 0x%lx",n_v);

log:

2019-07-04 11:18:03.279558+0800 AAA[11967:882991] hex(121111) = 0x1d917
2019-07-04 11:18:03.279719+0800 AAA[11967:882991] n's value is 0x1d9173

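Every time, the computed value has one extra hex digit at the end....
I went through quite a lot of material and only learned this much:
the low 4 bits of the value hold the NSNumber's type: for example, char is 0, short is 1, int is 2 and float is 4 (the long long in the log above shows 3).
As for why it works this way, I could not find Apple's code that implements it. (Forgive my incompetence 😔.)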
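
Taking that observation at face value, the two logged payloads can be split into a type code and the stored number with a couple of bit operations. This C sketch is only a reading of the logs above, not Apple's implementation: the struct and function names are hypothetical, and it ignores negative numbers, which would need sign handling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of an NSNumber payload as observed in the logs:
   low 4 bits = type code, remaining bits = the number itself. */
typedef struct {
    unsigned type_code;   /* 2 was observed for @123, 3 for a long long */
    uint64_t number;
} sl_number_payload;

static sl_number_payload sl_split_number_payload(uint64_t payload) {
    assert((payload >> 60) == 0);   /* a basic payload has at most 60 bits */
    sl_number_payload out = { (unsigned)(payload & 0xf), payload >> 4 };
    return out;
}
```

Splitting 0x7b2 gives type code 2 and number 0x7b (123), and splitting 0x1d9173 gives type code 3 and number 0x1d917 (121111), which matches both log lines.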
Of course, that is not even the biggest pitfall; NSString is. To keep this post a reasonable length, I'll dig into it in the next one.
