iOS Class本质之cache_t

背景:

我们知道oc调用方法是objc_send_message,具体流程就是,去isa指针所指的类或元类对象的对应的方法缓存cache_t中查找,如果在缓存中找到,则直接返回对应IMP;如果没找到,则去类或元类的方法类表中查找,找到了,返回IMP,并保存到对应的cache_t中,便于下次查找使用;如果在类或元类的方法列表也没找到,则进入消息解析转发流程,直到unrecognized selector sent to instance。
方法的查找是需要速度的,为了速度,就有查找流程中的缓存,所以我们有必要了解cache_t。

类结构

我们知道object-c的类Class的底层实现是一个object_class的结构体如下:

struct objc_class : objc_object {//继承objc_object,objc_object对应oc就是实例对象,从这点讲,类也是对象,万物皆对象
    // Class ISA;//isa指针
    Class superclass;//指向父类的指针
    cache_t cache;             //  方法缓存列表
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags 类的其他信息数据,例如,属性、方法、协议等
    ...
}

cache_t的结构

struct cache_t {
    struct bucket_t *_buckets;///存储方法的类表,hash表
    mask_t _mask;//hash算法使用到的盐
    mask_t _occupied;//列表中也被占用的位置数
    ...
}

bucket_t的结构

struct bucket_t {
private:
    // IMP-first is better for arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__//iOS
    uintptr_t _imp;//函数指针IMP
    SEL _sel;//selector
#else
    SEL _sel;
    uintptr_t _imp;
#endif
    ...
}

既然是缓存列表,就有更新(写入),查(读取)等逻辑,有一段官方文字如下:

Method cache locking (GrP 2001-1-14)
 *
 * For speed, objc_msgSend does not acquire any locks when it reads 
 * method caches. Instead, all cache changes are performed so that any 
 * objc_msgSend running concurrently with the cache mutator will not 
 * crash or hang or get an incorrect result from the cache. 
 *
 * When cache memory becomes unused (e.g. the old cache after cache 
 * expansion), it is not immediately freed, because a concurrent 
 * objc_msgSend could still be using it. Instead, the memory is 
 * disconnected from the data structures and placed on a garbage list. 
 * The memory is now only accessible to instances of objc_msgSend that 
 * were running when the memory was disconnected; any further calls to 
 * objc_msgSend will not see the garbage memory because the other data 
 * structures don't point to it anymore. The collecting_in_critical
 * function checks the PC of all threads and returns FALSE when all threads 
 * are found to be outside objc_msgSend. This means any call to objc_msgSend 
 * that could have had access to the garbage has finished or moved past the 
 * cache lookup stage, so it is safe to free the memory.
 *
 * All functions that modify cache data or structures must acquire the 
 * cacheUpdateLock to prevent interference from concurrent modifications.
 * The function that frees cache garbage must acquire the cacheUpdateLock 
 * and use collecting_in_critical() to flush out cache readers.
 * The cacheUpdateLock is also used to protect the custom allocator used 
 * for large method cache blocks.
 *
 * Cache readers (PC-checked by collecting_in_critical())//读取
 * objc_msgSend*
 * cache_getImp
 *
 * Cache writers (hold cacheUpdateLock while reading or writing; not PC-checked)//写入
 * cache_fill         (acquires lock)
 * cache_expand       (only called from cache_fill)
 * cache_create       (only called from cache_expand)
 * bcopy               (only called from instrumented cache_expand)
 * flush_caches        (acquires lock)
 * cache_flush        (only called from cache_fill and flush_caches)
 * cache_collect_free (only called from cache_expand and cache_flush)
 *
 * UNPROTECTED cache readers (NOT thread-safe; used for debug info only)
 * cache_print
 * _class_printMethodCaches
 * _class_printDuplicateCacheEntries
 * _class_printMethodCacheStatistics

从上面的文档可以看出,写入的入口函数是cache_fill,读取缓存的入口是cache_getImp,下面一一解读:

写入操作cache_fill:
void cache_fill(Class cls, SEL sel, IMP imp, id receiver)
{
#if !DEBUG_TASK_THREADS
    mutex_locker_t lock(cacheUpdateLock);//加锁 
    cache_fill_nolock(cls, sel, imp, receiver);
#else
    _collecting_in_critical();
    return;
#endif
}
调用另外一个函数cache_fill_nolock:
static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
    cacheUpdateLock.assertLocked();//加锁

    // Never cache before +initialize is done
    if (!cls->isInitialized()) return;//类对象没有初始化,直接return

    // Make sure the entry wasn't added to the cache by some other thread 
    // before we grabbed the cacheUpdateLock.
    if (cache_getImp(cls, sel)) return;
/**先试着取一下,可能其他的线程
调用过该方法,取到直接return,
后面详细讲解读取缓存,这里的
cache_getImp函数实现是通过
汇编语言写的,就是为了速度*/

    cache_t *cache = getCache(cls);//拿到类对象对应的cache_t

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = cache->occupied() + 1;//暂用位置试着加1
    mask_t capacity = cache->capacity();//获取hash表的容量
    if (cache->isConstantEmptyCache()) {
        // Cache is read-only. Replace it.
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);//为空,创建bucket
    }
    else if (newOccupied <= capacity / 4 * 3) {
        // Cache is less than 3/4 full. Use it as-is.没超过3/4,直接使用
    }
    else {
        // Cache is too full. Expand it.
        cache->expand();//超过3/4,hash扩容
    }

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the 
    // minimum size is 4 and we resized at 3/4 full.
    bucket_t *bucket = cache->find(sel, receiver);//通过hash算法找到对应位置的bucket
    if (bucket->sel() == 0) cache->incrementOccupied();//更新Occupied的值
    bucket->set<Atomic>(sel, imp);存储sel及对应的imp
}

函数几点说明:

  1. cache_getImp()的具体实现是汇编语言,一切是为了速度跟性能;
  2. 如果其他线程在执行写操作,直接return;
  3. buckets初始的容量是4, INIT_CACHE_SIZE;
enum {
    INIT_CACHE_SIZE_LOG2 = 2,
    INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2)
};
  1. 当buckets的存储的bucket超过了容量的3/4时,会触发扩容。
创建bucket
void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity)
{
    bool freeOld = canBeFreed();//旧的缓存是否能清空

    bucket_t *oldBuckets = buckets();//先拿到旧的缓存
    bucket_t *newBuckets = allocateBuckets(newCapacity);//根据新的容量,new一份新的缓存

    // Cache's old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    assert(newCapacity > 0);
    assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    setBucketsAndMask(newBuckets, newCapacity - 1);//设置对应的成员变量
    
    if (freeOld) {//是否需要释放旧的缓存,扩容后旧的缓存处理
        cache_collect_free(oldBuckets, oldCapacity);//暂存在其他地方
        cache_collect(false);//在合适的时机释放,没有其他的线程在read cache的时候
    }
}

在上面的函数中,我们注意到了旧的缓存的处理逻辑的影子(可以猜测到,扩容应该也会调用这个函数,后面会讲解到),在这里先说明一下几点:

  1. 扩容重新hash,旧的缓存没有分配到新的buckets中;
  2. 没有在新的buckets中分配位置,但并不是立马就释放掉了,
    而是存放在通过cache_collect_free 申请的garbage_refs里面,等到该类没有对象发送消息或者读取缓存时,然后释放掉;
  3. 上面两点的出发点,都是为了性能跟速度。
将扩容导致的旧缓存存放在内存的其他区域garbage_refs
// table of refs to free,bucket_t的二维表
static bucket_t **garbage_refs = 0;
enum {
    INIT_GARBAGE_COUNT = 128
};
 //一开始是一个容量为128的二维表,随着增多,会不断的2倍扩容
static void _garbage_make_room(void)
{
    static int first = 1;

    // Create the collection table the first time it is needed
    if (first)
    {
        first = 0;
        garbage_refs = (bucket_t**)
            malloc(INIT_GARBAGE_COUNT * sizeof(void *));
        garbage_max = INIT_GARBAGE_COUNT;
    }

    // Double the table if it is full
    else if (garbage_count == garbage_max)
    {//满了二倍扩容
        garbage_refs = (bucket_t**)
            realloc(garbage_refs, garbage_max * 2 * sizeof(void *));
        garbage_max *= 2;
    }
}
cache_collect_free,存放旧缓存
static void cache_collect_free(bucket_t *data, mask_t capacity)
{
    cacheUpdateLock.assertLocked();

    if (PrintCaches) recordDeadCache(capacity);

    _garbage_make_room ();
    garbage_byte_size += cache_t::bytesForCapacity(capacity);
    garbage_refs[garbage_count++] = data;
}

在合适的时机释放旧缓存

void cache_collect(bool collectALot)
{
    cacheUpdateLock.assertLocked();

    // Done if the garbage is not full,可用内存未使用完不需要释放
    if (garbage_byte_size < garbage_threshold  &&  !collectALot) {
        return;
    }

    // Synchronize collection with objc_msgSend and other cache readers
    if (!collectALot) {
        if (_collecting_in_critical ()) {//检验其他的线程是否正在发送消息或者读取,缓存不释放
            // objc_msgSend (or other cache reader) is currently looking in
            // the cache and might still be using some garbage.
            if (PrintCaches) {
                _objc_inform ("CACHES: not collecting; "
                              "objc_msgSend in progress");
            }
            return;
        }
    } 
    else {
        // No excuses.
        while (_collecting_in_critical()) //一直检验是否有发送该sel,或者读取缓存
            ;
    }

    // No cache readers in progress - garbage is now deletable

    // Log our progress
    if (PrintCaches) {
        cache_collections++;
        _objc_inform ("CACHES: COLLECTING %zu bytes (%zu allocations, %zu collections)", garbage_byte_size, cache_allocations, cache_collections);
    }
    //能走到这,说明该类相关的对象包括类对象没有发送消息,或者读取该缓存,可以释放了这部分缓存
    // Dispose all refs now in the garbage
    // Erase each entry so debugging tools don't see stale pointers.
    while (garbage_count--) {
        auto dead = garbage_refs[garbage_count];
        garbage_refs[garbage_count] = nil;
        free(dead);//释放
    }
    
    // Clear the garbage count and total size indicator
    garbage_count = 0;
    garbage_byte_size = 0;

    if (PrintCaches) {
        size_t i;
        size_t total_count = 0;
        size_t total_size = 0;

        for (i = 0; i < countof(cache_counts); i++) {
            int count = cache_counts[i];
            int slots = 1 << i;
            size_t size = count * slots * sizeof(bucket_t);

            if (!count) continue;

            _objc_inform("CACHES: %4d slots: %4d caches, %6zu bytes", 
                         slots, count, size);

            total_count += count;
            total_size += size;
        }

        _objc_inform("CACHES:      total: %4zu caches, %6zu bytes", 
                     total_count, total_size);
    }
}
扩容函数expand()
void cache_t::expand()
{
    cacheUpdateLock.assertLocked();
    
    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;//扩容的在之前的基础上翻倍。

    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        // mask overflow - can't grow further
        // fixme this wastes one bit of mask
        newCapacity = oldCapacity;
    }

    reallocate(oldCapacity, newCapacity);//调用上面讲到过的reallocate
}

扩容的在之前的基础上翻倍。

同过sel的hash查到到对应的bucket
bucket_t * cache_t::find(SEL s, id receiver)
{
    assert(s != 0);

    bucket_t *b = buckets();
    mask_t m = mask();
    mask_t begin = cache_hash(s, m);//hash算法
    mask_t i = begin;
    do {
        if (b[i].sel() == 0  ||  b[i].sel() == s) {
            return &b[i];
        }
    } while ((i = cache_next(i, m)) != begin);//hash冲突解决方案,向上个位置查找

    // hack,没查找到,坏的缓存
    Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
    cache_t::bad_cache(receiver, (SEL)s, cls);
}
hash算法
static inline mask_t cache_hash(SEL sel, mask_t mask) 
{
    return (mask_t)(uintptr_t)sel & mask;//相当于%mask,确保所有结果都在buckets的容量之内,mask = capacity -1
}
hash冲突算法
static inline mask_t cache_next(mask_t i, mask_t mask) {
    return i ? i-1 : mask;//返回前面一个桶子bucket,直到0的时候,返回mask(从最后一个位置开始继续往前找)
}
读取操作

cache_getImp,这部分代码苹果是用汇编代码写的,一切都是为了性能和速度,ARM64汇编的相关可以点击这里了解

STATIC_ENTRY _cache_getImp

    GetClassFromIsa_p16 p0//拿到isa,并放到p16寄存器
    CacheLookup GETIMP//查找imp

LGetImpMiss:
    mov p0, #0
    ret

    END_ENTRY _cache_getImp
/////////////////////////////////////////////////////////////////////
.macro CacheLookup
    // p1 = SEL, p16 = isa
    ldp p10, p11, [x16, #CACHE] // p10 = buckets, p11 = occupied|mask,通过isa的cache获取到对应的buckets放到p10寄存器,occupied|mask放到p11寄存器
#if !__LP64__
    and w11, w11, 0xffff    // p11 = mask
#endif
    and w12, w1, w11        
// x12 = _cmd & mask,64位走这个分支,w11寄存器(p11寄存器的32位)存放的mask,
//跟p1存放的sel(_cmd就是sel,我们知道obj_msgSend前面有两个
//默认参数,第一个是self,第二个是_cmd,_cmd就是方法sel,这里也
//一样,当然这里的sel是通过_cache_getImp函数放进去到p1寄存器的,
//汇编前面8个寄存器默认放函数的参数的)进行&运算
//(cache_fill_nolock函数的hash算法函数cache_hash上面有提到过)
//得到buckets的下标index,放到p12寄存器
    add p12, p10, p12, LSL #(1+PTRSHIFT)                 
// p12 = buckets + ((_cmd & mask) << (1+PTRSHIFT)),上面提到bucket
//是一个结构体,他的成员是sel跟imp,而且都是 uintptr_t类型,而uintptr_t就是
//unsigned long类型,在64位下就是都占8个字节,随意一个bucket结构体所占的
//内存是16个字节,所以从buckets数组找到对应的元素就可以通过数组的
//首地址+每个元素所占内存的大小,即sel对应的bucket = buckets + index*16
//也可以表示为bucket = buckets + index<<4,并且放到p12寄存器中。

    ldp p17, p9, [x12]      // {imp, sel} = *bucket,p12寄存器中的imp,sel取出来,分别放到p17和p9寄存器中
1:  cmp p9, p1          // if (bucket->sel != _cmd),比较p1,p9寄存器中的sel
    b.ne    2f          //     scan more,不相等,跳转到2:
    CacheHit $0         // call or return imp ,相等,返回imp
    
2:  // not hit: p12 = not-hit bucket,不相等
    CheckMiss $0            
// miss if bucket->sel == 0,如果对应的sel为空,说明,没有缓存,直接return
    cmp p12, p10        
// wrap if bucket == buckets,比较bucket是否为第一个元素
    b.eq    3f //为第一个元素,跳转到3
    ldp p17, p9, [x12, #-BUCKET_SIZE]!  
// {imp, sel} = *--bucket,不是第一个元素,说明hash冲突,向上一个位置的
//bucket查找,即cache_next函数,得到的bucket再次将imp、sel存放在p17和p9寄存器中
    b   1b          // loop,跳转到1,循环

3:  // wrap: p12 = first bucket, w11 = mask
    add p12, p12, w11, UXTW #(1+PTRSHIFT)                          
 // p12 = buckets + (mask << 1+PTRSHIFT),移动到最后一个元素,并赋值给p12寄存器

    // Clone scanning loop to miss instead of hang when cache is corrupt.
    // The slow path may detect any corruption and halt later.

    ldp p17, p9, [x12]      
// {imp, sel} = *bucket,将imp、sel存放在p17和p9寄存器中,继续下面的操作,
//跟上面的1,2相同,区别在3,当跳转到3时,说明从最后一个到第一个都比较
//都没有找到对应的缓存,就是没有查找到,结束查找,返回
1:  cmp p9, p1          // if (bucket->sel != _cmd)
    b.ne    2f          //     scan more
    CacheHit $0         // call or return imp
    
2:  // not hit: p12 = not-hit bucket
    CheckMiss $0            // miss if bucket->sel == 0
    cmp p12, p10        // wrap if bucket == buckets
    b.eq    3f
    ldp p17, p9, [x12, #-BUCKET_SIZE]!  // {imp, sel} = *--bucket
    b   1b          // loop

3:  // double wrap
    JumpMiss $0
    
.endmacro

以上,请指正!!!

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 206,311评论 6 481
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 88,339评论 2 382
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 152,671评论 0 342
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 55,252评论 1 279
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 64,253评论 5 371
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 49,031评论 1 285
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 38,340评论 3 399
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 36,973评论 0 259
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 43,466评论 1 300
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 35,937评论 2 323
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 38,039评论 1 333
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 33,701评论 4 323
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 39,254评论 3 307
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 30,259评论 0 19
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 31,485评论 1 262
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 45,497评论 2 354
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 42,786评论 2 345