美文网首页
Go语言——Map分析

Go语言——Map分析

作者: 陈先生_9e91 | 来源:发表于2018-10-22 13:08 被阅读0次

    Go语言——Map分析

    go\src\runtime\hashmap.go

    简介

    // A map is just a hash table. The data is arranged
    // into an array of buckets. Each bucket contains up to
    // 8 key/value pairs. The low-order bits of the hash are
    // used to select a bucket. Each bucket contains a few
    // high-order bits of each hash to distinguish the entries
    // within a single bucket.
    //
    // If more than 8 keys hash to a bucket, we chain on
    // extra buckets.
    //
    // When the hashtable grows, we allocate a new array
    // of buckets twice as big. Buckets are incrementally
    // copied from the old bucket array to the new bucket array.
    //
    // Map iterators walk through the array of buckets and
    // return the keys in walk order (bucket #, then overflow
    // chain order, then bucket index).  To maintain iteration
    // semantics, we never move keys within their bucket (if
    // we did, keys might be returned 0 or 2 times).  When
    // growing the table, iterators remain iterating through the
    // old table and must check the new table if the bucket
    // they are iterating through has been moved ("evacuated")
    // to the new table.
    

    map就是一个hash表。数据被分配到bucket桶数组中。每个桶有8个kv键值对。hash值的低八位用来选择桶。高八位用来在桶内部区分kv。这个很有意思,因为一个指针就是8位。

    如果桶中kv超过8个,就新建一个桶,与之相连。

    当hash表需要扩容时,我们每次增长两倍的桶数组。桶会从老的桶数组赋值到新的桶数组。

    为了维护迭代语义,我们不会移动桶中的key(如果这么做了,key有可能返回0次或者2次)。扩容时,迭代器迭代老表,并且必须检查新表,是否正在迭代的桶已经撤离到了新表。

    struct

    type hmap struct {
       count     int // # live cells == size of map.  Must be first (used by len() builtin)
       flags     uint8
       B         uint8  // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
       noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
       hash0     uint32 // hash seed
    
       buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
       oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
       nevacuate  uintptr        // progress counter for evacuation (buckets less than this have been evacuated)
       overflow *[2]*[]*bmap
    }
    

    重要属性:

    • buckets,桶数组,长度2^B
    • oldbuckets,老桶数组,扩容时不为nil
    • nevacuate,num evacuate,撤离进度,小于它的都已经撤离到新桶数组
    • overflow *[2]*[]*bmap,这个挺有意思的,长度为2的数组,元素是桶数组。overflow 保存溢出的桶,即桶超过8个kv。overflow[0]对应buckets溢出,overflow[1]对应oldbuckets溢出。
    // A bucket for a Go map.
    type bmap struct {
       // tophash generally contains the top byte of the hash value
       // for each key in this bucket. If tophash[0] < minTopHash,
       // tophash[0] is a bucket evacuation state instead.
       tophash [bucketCnt]uint8
    }
    

    桶中只有一个8字节数组,长度为8。tophash保存hash的高8位,根据高8位找到条目。value保存这里就跟大多数map实现不一样,揣测8位就是value的指针?

    code

    带着上面的理解,进行源代码学习

    创建

    func makemap(t *maptype, hint int64, h *hmap, bucket unsafe.Pointer) *hmap {
       // find size parameter which will hold the requested # of elements
       B := uint8(0)
       for ; overLoadFactor(hint, B); B++ {
       }
    
       buckets := bucket
       if B != 0 {
          buckets = newarray(t.bucket, 1<<B)
       }
    
       // initialize Hmap
       if h == nil {
          h = (*hmap)(newobject(t.hmap))
       }
       h.count = 0
       h.B = B
       h.flags = 0
       h.hash0 = fastrand()
       h.buckets = buckets
       h.oldbuckets = nil
       h.nevacuate = 0
       h.noverflow = 0
    
       return h
    }
    

    根据B,创建长度2^B的桶数组。这里B会受到初始值8和load factor平衡因子的影响。

    get

    func mapaccess2(t *maptype, h *hmap, key unsafe.Pointer) (unsafe.Pointer, bool) {
       alg := t.key.alg
       hash := alg.hash(key, uintptr(h.hash0))
       m := bucketMask(h.B)
       b := (*bmap)(unsafe.Pointer(uintptr(h.buckets) + (hash&m)*uintptr(t.bucketsize)))
       if c := h.oldbuckets; c != nil {
          oldb := (*bmap)(unsafe.Pointer(uintptr(c) + (hash&m)*uintptr(t.bucketsize)))
          if !evacuated(oldb) {
             b = oldb
          }
       }
       top := tophash(hash)
       for ; b != nil; b = b.overflow(t) {
          for i := uintptr(0); i < bucketCnt; i++ {
             if b.tophash[i] != top {
                continue
             }
             k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
             if t.indirectkey {
                k = *((*unsafe.Pointer)(k))
             }
             if alg.equal(key, k) {
                v := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.valuesize))
                if t.indirectvalue {
                   v = *((*unsafe.Pointer)(v))
                }
                return v, true
             }
          }
       }
       return unsafe.Pointer(&zeroVal[0]), false
    }
    
    1. 利用hash算法得到key的hash值
    2. 这个位运算没看懂,根据注释应该是利用hash的低八位找到bucket
    3. 如果oldbuckets不为空,即正在执行扩容,所以优先从oldbuckets读取
    4. 根据hash的高8位遍历bucket,得到v。(这一大坨位运算没看懂。)

    put

    //go:linkname reflect_mapassign reflect.mapassign
    func reflect_mapassign(t *maptype, h *hmap, key unsafe.Pointer, val unsafe.Pointer) {
       p := mapassign(t, h, key)
       typedmemmove(t.elem, p, val)
    }
    
    // Like mapaccess, but allocates a slot for the key if it is not present in the map.
    func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {}
    

    put过程比较特别,之前说过bucket里面保存的是uint8数组,也就是指针。所以这里需要先给key找到对应的slot,然后将value拷贝到对应的地址。

    相关文章

      网友评论

          本文标题:Go语言——Map分析

          本文链接:https://www.haomeiwen.com/subject/fwxgzftx.html