HashMap Source Code: Study and Notes (Based on JDK 1.8)

Author: ZT5250Z | Published 2020-04-22 17:18

    Design notes

    The HashMap source itself documents the design in detail. To avoid a misleading paraphrase, the English implementation notes are quoted verbatim below for anyone interested.

    /*
     * Implementation notes.
     *
     * This map usually acts as a binned (bucketed) hash table, but
     * when bins get too large, they are transformed into bins of
     * TreeNodes, each structured similarly to those in
     * java.util.TreeMap. Most methods try to use normal bins, but
     * relay to TreeNode methods when applicable (simply by checking
     * instanceof a node).  Bins of TreeNodes may be traversed and
     * used like any others, but additionally support faster lookup
     * when overpopulated. However, since the vast majority of bins in
     * normal use are not overpopulated, checking for existence of
     * tree bins may be delayed in the course of table methods.
     *
     * Tree bins (i.e., bins whose elements are all TreeNodes) are
     * ordered primarily by hashCode, but in the case of ties, if two
     * elements are of the same "class C implements Comparable<C>",
     * type then their compareTo method is used for ordering. (We
     * conservatively check generic types via reflection to validate
     * this -- see method comparableClassFor).  The added complexity
     * of tree bins is worthwhile in providing worst-case O(log n)
     * operations when keys either have distinct hashes or are
     * orderable, Thus, performance degrades gracefully under
     * accidental or malicious usages in which hashCode() methods
     * return values that are poorly distributed, as well as those in
     * which many keys share a hashCode, so long as they are also
     * Comparable. (If neither of these apply, we may waste about a
     * factor of two in time and space compared to taking no
     * precautions. But the only known cases stem from poor user
     * programming practices that are already so slow that this makes
     * little difference.)
     *
     * Because TreeNodes are about twice the size of regular nodes, we
     * use them only when bins contain enough nodes to warrant use
     * (see TREEIFY_THRESHOLD). And when they become too small (due to
     * removal or resizing) they are converted back to plain bins.  In
     * usages with well-distributed user hashCodes, tree bins are
     * rarely used.  Ideally, under random hashCodes, the frequency of
     * nodes in bins follows a Poisson distribution
     * (http://en.wikipedia.org/wiki/Poisson_distribution) with a
     * parameter of about 0.5 on average for the default resizing
     * threshold of 0.75, although with a large variance because of
     * resizing granularity. Ignoring variance, the expected
     * occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
     * factorial(k)). The first values are:
     *
     * 0:    0.60653066
     * 1:    0.30326533
     * 2:    0.07581633
     * 3:    0.01263606
     * 4:    0.00157952
     * 5:    0.00015795
     * 6:    0.00001316
     * 7:    0.00000094
     * 8:    0.00000006
     * more: less than 1 in ten million
     *
     * The root of a tree bin is normally its first node.  However,
     * sometimes (currently only upon Iterator.remove), the root might
     * be elsewhere, but can be recovered following parent links
     * (method TreeNode.root()).
     *
     * All applicable internal methods accept a hash code as an
     * argument (as normally supplied from a public method), allowing
     * them to call each other without recomputing user hashCodes.
     * Most internal methods also accept a "tab" argument, that is
     * normally the current table, but may be a new or old one when
     * resizing or converting.
     *
     * When bin lists are treeified, split, or untreeified, we keep
     * them in the same relative access/traversal order (i.e., field
     * Node.next) to better preserve locality, and to slightly
     * simplify handling of splits and traversals that invoke
     * iterator.remove. When using comparators on insertion, to keep a
     * total ordering (or as close as is required here) across
     * rebalancings, we compare classes and identityHashCodes as
     * tie-breakers.
     *
     * The use and transitions among plain vs tree modes is
     * complicated by the existence of subclass LinkedHashMap. See
     * below for hook methods defined to be invoked upon insertion,
     * removal and access that allow LinkedHashMap internals to
     * otherwise remain independent of these mechanics. (This also
     * requires that a map instance be passed to some utility methods
     * that may create new nodes.)
     *
     * The concurrent-programming-like SSA-based coding style helps
     * avoid aliasing errors amid all of the twisty pointer operations.
     */
    

    Key fields

    HashMap's default initial capacity must be a power of two. Why a power of two? This is explained later, in the discussion of tableSizeFor.

    /** Default initial capacity.
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
    
    /** Maximum capacity.
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;
    
    /** Default load factor.
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    

    The threshold for converting a linked list to a tree: when a chain's length reaches TREEIFY_THRESHOLD, the chain is treeified.
    Why 8? The answer is in the official implementation notes quoted above: it is the result of weighing time against space.

    Ideally, under random hashCodes, the frequency of nodes in bins follows a Poisson distribution
    Ideally, under random hashCodes, chain lengths follow a Poisson distribution; the probability of a chain reaching 8 elements is 0.00000006, practically impossible.
    0: 0.60653066
    1: 0.30326533
    2: 0.07581633
    3: 0.01263606
    4: 0.00157952
    5: 0.00015795
    6: 0.00001316
    7: 0.00000094
    8: 0.00000006
    more: less than 1 in ten million
    But the JDK cannot control user-defined hashCode implementations. A custom hashCode with poor randomness breaks the assumption above, so balancing the ideal randomness against user-defined behavior, 8 was chosen as the conversion threshold.
    When a tree shrinks to a length of 6, it is converted back into a linked list. Why 6 rather than 8? If the element count hovered around 8, the bucket would convert back and forth repeatedly; choosing 6 as the downgrade threshold leaves a gap that avoids this thrashing.
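The probabilities in the table above come straight from the formula exp(-0.5) * pow(0.5, k) / factorial(k) given in the implementation notes. As a quick check, a small sketch (the class name PoissonBins is ours, not from the JDK) reproduces them:

```java
public class PoissonBins {
    // P(a bin holds exactly k nodes) under the Poisson(0.5) model
    // quoted in the HashMap implementation notes
    static double prob(int k) {
        double factorial = 1.0;
        for (int i = 2; i <= k; i++) factorial *= i;
        return Math.exp(-0.5) * Math.pow(0.5, k) / factorial;
    }

    public static void main(String[] args) {
        for (int k = 0; k <= 8; k++)
            System.out.printf("%d: %.8f%n", k, prob(k));
    }
}
```

Running it prints the same table as above; at k = 8 the probability is already below one in ten million, which is why treeification is expected to be rare in practice.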

    /** Treeify threshold.
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     */
    static final int TREEIFY_THRESHOLD = 8;
    
    /** Untreeify threshold.
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     */
    static final int UNTREEIFY_THRESHOLD = 6;
    
    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     */
    static final int MIN_TREEIFY_CAPACITY = 64;
    

    Constructors

    HashMap(int initialCapacity, float loadFactor)
    HashMap(int initialCapacity)
    HashMap()
    HashMap(Map<? extends K, ? extends V> m)

    HashMap's data structure

    HashMap's underlying data structure is shown below: an array whose slots each hold a linked list or a tree.

    When a chain reaches a certain length, it is converted into a tree.

    Array ------ linked lists
    
    |---|
    | 0 |->| a |->| b |->| c |->| d |->null
    |---|
    | 1 |->| a |->| b |->| c |->| d |->| e |->null
    |---|
    | 2 |->| a |->null
    |---|
    | 3 |->| a |->| b |->| c |->| d |->null
    |---|
    | 4 |->| a |->| b |->| c |->null
    |---|
    | 5 |->| a |->null
    |---|
    | 6 |->null
    |---|
    | 7 |->| a |->| b |->null
    |---|
    

    Map<String, String> m = new HashMap<>(cap);

    We know HashMap requires the initial capacity to be a power of two, so how the requested capacity is rounded up to a power of two is a point worth examining.

    int cap = 11;
    Map<String, String> m = new HashMap<>(cap);
    
    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }
    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }
    

    Why must the capacity be a power of two?

    Briefly: HashMap computes an element's index as (n - 1) & hash, where n is the capacity. When n is a power of two, n - 1 is a number whose low bits are all 1. Since & yields 1 only where both operands have a 1, an all-ones mask preserves exactly the corresponding bits of the hash: the result is determined by the hash itself and is not distorted by n. If n were not a power of two, n - 1 would have 0s scattered through its binary form, and every hash bit aligned with a 0 would be discarded.

    For example, when n is a power of two, n - 1 in binary looks like:

    0000000000000000111111111111111
    000000000000000000000111111111
    ...
    

    and so on: during the &, the result is determined by the hash itself.
    If n is not a power of two, the corresponding n - 1 might be:

    00000001000101010010101010011
    00000001010101010000000010011
    ...
    

    Every position where n - 1 has a 0 forces the & result to 0 at that position, so the computed index is unstable and no longer driven by the hash.
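A tiny sketch of the masking behavior described above (the hash value here is made up for illustration, not taken from the JDK source):

```java
public class IndexMask {
    public static void main(String[] args) {
        int hash = 0b110110;   // an arbitrary 6-bit hash for illustration
        int n = 16;            // power of two: n - 1 == 0b1111, all low bits set
        // The low 4 bits of the hash survive intact: 0110 -> 6
        System.out.println((n - 1) & hash);  // 6
        int m = 12;            // not a power of two: m - 1 == 0b1011, bit 2 is 0
        // Bit 2 of the hash is always discarded: 0110 & 1011 == 0010 -> 2
        System.out.println((m - 1) & hash);  // 2
    }
}
```

With the non-power-of-two mask, two hashes that differ only in bit 2 would collide at the same index; the all-ones mask keeps them apart.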

    Keeping the capacity a power of two - tableSizeFor(initialCapacity)

    When an initial capacity is specified, tableSizeFor(initialCapacity) is called to round it up to a power of two.

    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        // OR with itself shifted right 1: the top two bits
        // (counting from the highest set bit) become 1
        n |= n >>> 1;
        // The top two bits are now 1, so shift right 2 and OR with
        // the previous result to turn the top four bits into 1
        n |= n >>> 2;
        // The top four bits are now 1, so shift right 4 and OR again
        n |= n >>> 4;
        // The top eight bits are now 1, so shift right 8 and OR again
        n |= n >>> 8;
        // The top sixteen bits are now 1, so shift right 16 and OR again
        n |= n >>> 16;
        // All 32 bits of the int have now been handled (where needed):
        // every bit below the highest set bit is 1.
        // Adding 1 to the result then yields a power of two
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
    

    For example, with an initial capacity cap of 11, n = cap - 1 = 10, binary: 1010

    • Step 1: OR with itself shifted right 1, guaranteeing at least the top 2 bits are 1
    1010
     101  -> 1111
    
    • Step 2: OR with the previous result shifted right 2, guaranteeing at least the top 4 bits are 1
    1111
    0011  -> 1111
    
    • Step 3: OR with the previous result shifted right 4, guaranteeing at least the top 8 bits are 1
      Note: because our initial capacity is small and occupies only four bits, all of its bits were already 1 after step 2, so the remaining steps leave the value unchanged at four 1s
    1111
    0000  -> 1111
    
    • Step 4: OR with the previous result shifted right 8, guaranteeing at least the top 16 bits are 1
    1111
    0000  -> 1111
    
    • Step 5: OR with the previous result shifted right 16, guaranteeing all 32 bits can become 1. The process ends here because an int has at most 32 bits; only a large enough initial value needs this final step to set all of its bits
    1111
    0000  -> 1111
    

    Finally, for a normal positive int, 1 is added before returning: since every bit below the highest is 1, adding 1 produces a 1 followed by all 0s, i.e. a power of two.
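To see the rounding in action, the method can be copied out and run on a few sample capacities (the class name TableSize is ours; the method body is the JDK 1.8 one quoted above):

```java
public class TableSize {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same bit-smearing as java.util.HashMap#tableSizeFor (JDK 1.8)
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(11)); // 16
        System.out.println(tableSizeFor(16)); // 16 (already a power of two)
        System.out.println(tableSizeFor(17)); // 32
    }
}
```

Note the `cap - 1` at the start: without it, an input that is already a power of two (such as 16) would be doubled unnecessarily.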

    m.put("key1", "value1")

    HashMap adds a key/value pair via the put method. The basic idea: compute the key's index into the internal table, then place the value in the table slot at that index. Two keys can compute the same index, which is a hash collision. In general there are four standard ways to resolve hash collisions:

    • Open addressing
    • Rehashing
    • Separate chaining
    • A common overflow area

    HashMap uses separate chaining to resolve collisions. This avoids allocating an enormous array just to hold the elements, keeping memory use down while still allowing a large number of entries: extra elements hang off the array slots as chains. Since an array supports random access by index, HashMap can jump straight to the right slot when looking up an element; if the node there is not the target, it walks the chain. This keeps the array's fast-lookup advantage while adding the list's unbounded-append ability.
    Chaining resolves collisions, but in the extreme case, when every element hashes to the same index, the result is one very long chain. Unlike an array, a linked list cannot be indexed randomly; it must be traversed from the head until the target is found, so lookup performance degrades steadily as the chain grows.

    Questions

    1. How can the randomness of the hash be increased so that elements compute distinct indices as often as possible, spreading them across the array?
    2. How can the extreme case be avoided, where a chain grows so long that fast lookup is lost?
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
    
    /**
         * Computes key.hashCode() and spreads (XORs) higher bits of hash
         * to lower.  Because the table uses power-of-two masking, sets of
         * hashes that vary only in bits above the current mask will
         * always collide. (Among known examples are sets of Float keys
         * holding consecutive whole numbers in small tables.)  So we
         * apply a transform that spreads the impact of higher bits
         * downward. There is a tradeoff between speed, utility, and
         * quality of bit-spreading. Because many common sets of hashes
         * are already reasonably distributed (so don't benefit from
         * spreading), and because we use trees to handle large sets of
         * collisions in bins, we just XOR some shifted bits in the
         * cheapest possible way to reduce systematic lossage, as well as
         * to incorporate impact of the highest bits that would otherwise
         * never be used in index calculations because of table bounds.
         */
        static final int hash(Object key) {
            int h;
            return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
        }
    

    Increasing the dispersion of the hash

    HashMap defines an internal hash(key) method that XORs the key's original hashCode with itself shifted right 16 bits, mixing the high 16 bits into the low 16 bits (this is the key point; it is explained below).

    This hash method works together with the index computation that follows, (n - 1) & hash, where n is the array length:

    1. Since n is always a power of two, n - 1 is a number whose binary form is all 1s below the top bit, which guarantees that every bit of hash can participate in the & operation. This is another view of why the capacity must be a power of two:
      with an all-ones mask the result depends entirely on the hash itself; otherwise the index would be distorted by n and become unstable.

    For example, if the mask were 5, binary 101: after &-ing with the hash, the middle bit is always 0, so that bit of the hash is discarded, because & yields 1 only when both bits are 1.

    Note: masking with n directly (instead of n - 1) would make collisions extremely likely, because n, being a power of two, has a single 1 in its highest bit and 0s everywhere else; &-ing the hash against it would be meaningless, since every bit except the top one would come out 0.

    A small aside on the index formula (n - 1) & hash, for anyone who shared my initial doubt (skip this if you have none): can the index overflow the array? Writing the numbers in binary makes it clear. n - 1 has one fewer significant bit than n: for example, if n is 1000000, then n - 1 is 0111111. Only the low bits participate in the &, so the result can never reach the array capacity, and no out-of-bounds index is possible.

    2. Normally the table is fairly small, so when &-ing with the hash, the bits above the table size never participate, which means most of the hash's high-order information would simply be thrown away. HashMap therefore defines its own hash(key) that mixes the high bits into the low bits, so the high bits are not wasted; this adds randomness to the computed index, increasing dispersion and lowering the chance of collisions.

    For example, the string "test":
    "test".hashCode() is 3556498, in binary:
    1101100100010010010010
    Suppose n is 8, binary 1000, so n - 1 is 111.
    (n - 1) & "test".hashCode():
    0000000000000000000111
    1101100100010010010010  ->  010
    hash("test"), i.e. the hashCode XORed with itself shifted right 16:
    1101100100010010010010
    0000000000000000110110  ->  1101100100010010100100
    (n - 1) & hash("test"):
    0000000000000000000111
    1101100100010010100100  ->  100
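The worked example above can be checked by copying HashMap's hash method into a small class (the class name SpreadHash is our own):

```java
public class SpreadHash {
    // Same high/low-bit mixing as java.util.HashMap#hash (JDK 1.8)
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int raw = "test".hashCode();         // 3556498
        int mixed = hash("test");
        int n = 8;
        System.out.println((n - 1) & raw);   // 2: only the low 3 bits are used
        System.out.println((n - 1) & mixed); // 4: the high bits now influence the index
    }
}
```

The raw hashCode and the mixed hash land in different buckets, showing that the XOR really does let the otherwise-ignored high bits affect the index.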

    A brief walkthrough of put

    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                       boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        // If the table is null or empty, initialize it first: the constructor does not
        // allocate the array, so resize() performs the lazy initialization
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // Compute the key's index via (n - 1) & hash and locate the bucket
        // (the design of this index formula is explained in detail above)
        // If the bucket is null, create a new node and store it there directly
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            // The bucket is occupied: check whether its first node's key equals the key being inserted
            Node<K,V> e; K k;
            // Compare the hash, then the key itself
            // If both match, this node holds the same key as the one being inserted
            // Otherwise traverse onward from it: the bucket may be a linked list or a tree
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                // The node is a tree: insert via the tree traversal
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                // Otherwise the node is a linked list: walk it
                for (int binCount = 0; ; ++binCount) {
                    // If next is null, the tail was reached without a match: append here
                    if ((e = p.next) == null) {
                        // Create the new node at the tail of the list
                        p.next = newNode(hash, key, value, null);
                        // After inserting, check the list length: if it reached the
                        // treeify threshold, convert the list into a tree
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    // If the current node's hash and key match the inserted key,
                    // stop: the key already exists
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            // e is the node found by the traversal above
            // If it is non-null, a mapping with the same key already exists
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                // onlyIfAbsent decides whether the existing value is overwritten
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // A brand-new node was inserted: bump the counters and resize if needed
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
    

    The basic put flow:

    1. If the table is empty, initialize it.
    2. Compute the index and locate the bucket in the table.
    • If the bucket is empty, create a new node and place it there.
    • If it is not empty, a hash collision has occurred and the chain behind it must be traversed. JDK 1.8 adds an optimization here: the bucket may be a linked list or a tree, and the insertion traverses it with whichever method applies.
    3. If a node with the same key already exists, onlyIfAbsent decides whether the old value is overwritten; if no such key exists, a new node is created.
    4. If a new node was created, check whether the size has reached the resize threshold; if it has, resize the table.
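The overwrite and return-value behavior in the flow above can be observed directly through the public API (a plain usage sketch, no internals required):

```java
import java.util.HashMap;
import java.util.Map;

public class PutDemo {
    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        // No previous mapping: put returns null
        System.out.println(m.put("key1", "value1"));         // null
        // Same key again: the old value is returned and replaced
        System.out.println(m.put("key1", "value2"));         // value1
        // putIfAbsent takes the onlyIfAbsent path: no overwrite
        System.out.println(m.putIfAbsent("key1", "value3")); // value2
        System.out.println(m.get("key1"));                   // value2
    }
}
```

putIfAbsent calls the same putVal with onlyIfAbsent set to true, which is why the existing value survives.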

    The resize() method

    /**
     * Initializes or doubles table size.  If null, allocates in
     * accord with initial capacity target held in field threshold.
     * Otherwise, because we are using power-of-two expansion, the
     * elements from each bin must either stay at same index, or move
     * with a power of two offset in the new table.
     *
     * @return the table
     */
    final Node<K,V>[] resize() {
        // Keep a reference to the current table
        Node<K,V>[] oldTab = table;
        // Record the old capacity
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        // Record the threshold before resizing
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            // The table already exists: check it against the maximum capacity
            // If already at the maximum allowed by HashMap, stop growing and return as-is
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            // Otherwise double the capacity by shifting left one bit
            // Check that the doubled capacity is below the maximum and that the
            // old capacity is at least the default initial capacity;
            // if so, double the threshold as well (also a left shift)
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                    oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                    (int)ft : Integer.MAX_VALUE);
        }
        // Everything above computes the new values: the new capacity and the new threshold
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        // If the old table is non-null it holds entries that must be migrated to the new array
        if (oldTab != null) {
            // Walk every slot of the old array
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    // If next is null, the slot holds a single node:
                    // recompute its index with the new capacity and place it directly
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    // If next is non-null, the slot holds several nodes
                    // Check the node type: a tree is split with the tree methods,
                    // a linked list is split by walking it
                    else if (e instanceof TreeNode)
                        // tree
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        // linked list
                        // Note: after migration a list generally splits into two lists,
                        // spread across the low and high halves of the new array
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            // The & here uses the old capacity itself, not oldCap - 1:
                            // the mask is a single 1 bit followed by all 0s,
                            // so the result tests exactly one bit of the hash
                            // If that bit is 0, the node stays at its original index
                            // If it is 1, the node moves to original index + oldCap
                            if ((e.hash & oldCap) == 0) {
                                // Build the list that stays at the original index (the low half of the new array)
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                // Build the list that moves to original index + oldCap (the high half)
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            // original index
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            // original index + oldCap
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
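The lo/hi split above can be illustrated with plain integers (the hash values here are made up for the example; both hash to slot 5 under the old mask):

```java
public class SplitDemo {
    public static void main(String[] args) {
        int oldCap = 16, j = 5;      // old index 5 in a table of 16
        // During a doubling resize, a node either stays at j or moves to
        // j + oldCap, depending on the single bit tested by (hash & oldCap)
        int hashLo = 0b0000_0101;    // bit 4 clear -> stays at index 5
        int hashHi = 0b0001_0101;    // bit 4 set   -> moves to index 5 + 16 = 21
        System.out.println((hashLo & oldCap) == 0 ? j : j + oldCap); // 5
        System.out.println((hashHi & oldCap) == 0 ? j : j + oldCap); // 21
        // Equivalently: the new index is just hash & (2 * oldCap - 1)
        System.out.println(hashHi & (2 * oldCap - 1));               // 21
    }
}
```

This is why a power-of-two expansion never needs to rehash: the new index differs from the old one by exactly one bit, which is what makes the split into two lists possible.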
    

    m.get("key5")

    HashMap's get method is comparatively simple: compute the index from the key's hash and locate the target node.

    /**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
     * key.equals(k))}, then this method returns {@code v}; otherwise
     * it returns {@code null}.  (There can be at most one such mapping.)
     *
     * <p>A return value of {@code null} does not <i>necessarily</i>
     * indicate that the map contains no mapping for the key; it's also
     * possible that the map explicitly maps the key to {@code null}.
     * The {@link #containsKey containsKey} operation may be used to
     * distinguish these two cases.
     *
     * @see #put(Object, Object)
     */
    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
    
    /**
     * Implements Map.get and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        // When the table is non-empty, compute the index and fetch the bucket's head node
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            // Check whether the head node is the target
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            // Otherwise keep traversing
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    // Traverse as a tree
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    // Traverse as a linked list
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }
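The null-versus-missing distinction described in the javadoc above can be demonstrated with a short usage sketch:

```java
import java.util.HashMap;
import java.util.Map;

public class GetDemo {
    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        m.put("key5", "value5");
        m.put("nullKey", null);
        System.out.println(m.get("key5"));             // value5
        System.out.println(m.get("missing"));          // null: no mapping
        System.out.println(m.get("nullKey"));          // null: explicitly mapped to null
        // containsKey distinguishes the two null cases
        System.out.println(m.containsKey("nullKey"));  // true
        System.out.println(m.containsKey("missing"));  // false
    }
}
```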
    

    To be continued...

    Original: https://www.haomeiwen.com/subject/yofiihtx.html