HashMap 源码分析

作者: terry蒋 | 来源:发表于2021-01-25 00:55 被阅读0次

java源码分析之LinkedHashMap
ConcurrentHashMap 源码分析（Java vers
java源码分析之HashMap（三）
JDK1.8HashMap源码分析
HashMap源码分析——put和get（三）
HashMap源码分析 —— put与get（四）
面试准备
HashMap源码分析——put和get（二）
JAVA 8 HashMap 源码分析
HashMap源码笔记（二）

一、前言

花了一周学习 HashMap 的源码，探究其原理和设计思路，发现 HashMap 设计非常精妙，而且一直在精益求精地完善。学后发现 HashMap 有太多有价值的知识，难怪面试官对 HashMap 这么关注。

打算写一篇总结的文章，但是 HashMap 有非常多的细节，一篇文章肯定不够讲述清楚，本文只分析 HashMap 基础原理，后面再抽空再写其他文章补充细节。

二、概述

１. HashMap 简介

HashMap 是使用频率非常高的 Key-Value 关系的数据结构。从 JDK 1.2 开始就已经有该类，并在不同版本中不断迭代优化。

２. HashMap 特性

允许 key 为 null，value 为 null。key 为 null 时，hash 值为 0。
HashMap　键值对是无序的，键值对存放的位置和插入的顺序无法。且在进行某些操作(put、remove)后，键值对的顺序可能会发生变化。
HashMap　非线程安全。

3. HashMap 原理

算法：HashMap 底层基于拉链式的散列算法实现。
数据结构：HashMap 在 JDK 1.7 基于 数组 + 链表 实现，在 JDK 1.8 基于 数组 + 链表 + 红黑树 实现。

三、源码分析

本文将分析 JDK 1.7 和 JDK 1.8 的 HashMap 源码。

1. 核心变量

JDK 1.8 中 HashMap 源码中主要的变量

    /* --------------  静态变量 -------------- */
    /**
     * The default initial capacity - MUST be a power of two.
     * 中文解释: 默认初始容量-必须是 2 的幂
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The load factor used when none specified in constructor.
     * 中文解释: 默认的负载因子。构造函数中未指定负载因子时，将使用该默认值.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     * 中文解释: 决定桶用红黑树而非链表的阀值。当添加一个元素到桶时，如果桶中元素个数达到该阀值，桶将转换为树。
     * 该值必须大于 2 且小于等于 8 。
     */
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     * 中文解释: 用于容量调整操作时将拆分树结构的阀值。应该小于TREEIFY_THRESHOLD，且最大为6。
     */
    static final int UNTREEIFY_THRESHOLD = 6;

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     * 中文解释: 必须大于等于该值，桶才允许被转换成数结构。(否则数组将因为单个桶中节点过多而扩容)
     * 应该至少是 TREEIFY_THRESHOLD 的四倍，以避免调整大小和树化阈值之间的冲突。
     */
    static final int MIN_TREEIFY_CAPACITY = 64;


    /* --------------  成员变量 -------------- */
    /**
     * The table, initialized on first use, and resized as
     * necessary. When allocated, length is always a power of two.
     * (We also tolerate length zero in some operations to allow
     * bootstrapping mechanics that are currently not needed.)
     * 中文解释: 该 table 在首次使用时初始化，并根据需要调整大小。
     * 分配空间时，长度始终是 2 的幂。
     */
    transient Node<K,V>[] table;

    /**
     * Holds cached entrySet(). Note that AbstractMap fields are used
     * for keySet() and values().
     * 中文解释: 保存缓存的 entry 集合。          
     * 请注意，keySet() 和 values() 取的是 HashMap 的 父类 AbstractMap 中的字段 keySet 和keySet。
     */
    transient Set<Map.Entry<K,V>> entrySet;

    /**
     * The number of key-value mappings contained in this map.
     * 中文解释: 在当前 map 中 key-value 键值对的总数
     */
    transient int size;

    /**
     * The next size value at which to resize (capacity * load factor).
     * 中文解释: 到达该值时将跳转 table 的大小（该值 = 容量 * 负载因子）。
     * @serial
     */
    // (The javadoc description is true upon serialization.
    // Additionally, if the table array has not been allocated, this
    // field holds the initial array capacity, or zero signifying
    // DEFAULT_INITIAL_CAPACITY.)
    // 如果尚未分配表数组，则此字段保留初始数组容量，或者为零，表示DEFAULT_INITIAL_CAPACITY
    int threshold;

    /**
     * The load factor for the hash table.
     * 中文解释: hash table 的负载因子
     *
     * @serial
     */
    final float loadFactor;

initialCapacity

loadFactor

threshold

2. 核心方法

小提示：

HashMap 源码中，经常会在 if 判断表达式、for 循环遍历等步骤中进行赋值操作，这个赋值操作可能比较容易忽略，且表达式过长导致阅读时不易理解，需要多留意。

在 putVal() 方法中有一段代码作为示例：

// 注意此时 p 被赋值了 tab[i = (n - 1) & hash]。与我们平时的习惯不一样，平时一般会在 if 判断前进行赋值
if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
else {
    Node<K,V> e; K k;
    // p 在此处用到
    if (p.hash == hash &&
        ((k = p.key) == key || (key != null && key.equals(k))))
        ......
        ......
        ......
}
}

2.1 构造函数

JDK 1.8 中 HashMap 的构造函数源码

    /**
     * 构造函数1(最常用): 无参构造函数。负载因子取默认的0.75
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }   

    /**
     * 构造函数2 (推荐使用): 指定初始容量。负载因子取默认的0.75
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    /**
     * 构造函数3: 指定初始容量和负载因子
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        // 找到大于或等于 initialCapacity 的最小2的幂，设置为容量
        this.threshold = tableSizeFor(initialCapacity);
    }

    /**
     * Returns a power of two size for the given target capacity.
     * 计算出大于或等于 initialCapacity 的最小2的幂
     */
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    /**
     * 构造函数4: 指定要拷贝的map。负载因子取默认的0.75
     * Constructs a new <tt>HashMap</tt> with the same mappings as the
     * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with
     * default load factor (0.75) and an initial capacity sufficient to
     * hold the mappings in the specified <tt>Map</tt>.
     *
     * @param   m the map whose mappings are to be placed in this map
     * @throws  NullPointerException if the specified map is null
     */
    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false);
    }

构造函数 1 是最常用的构造函数，但是推荐使用构造函数 2 ，指定初始容量。当未指定初始容量时，会指定默认的初始容量 16，如果在实际使用中容量需求远大于 16 ，则需要多次扩容，造成性能消耗。

2.2 put() 插入

JDK 1.8 中 HashMap 的 put() 方法源码

/**
 * Associates the specified value with the specified key in this map.
 * If the map previously contained a mapping for the key, the old
 * value is replaced.
 * 中文解释: 将指定的 value 与 当前 map 中指定的 key 相关联。
 * 如果当前 map 原先已有 指定 key 相关的 映射关系，则老的 value 将被覆盖。
 * @param key key with which the specified value is to be associated
 * 将与 value 关联的 key
 * @param value value to be associated with the specified key
 * 将与 key 关联的 value
 * @return the previous value associated with <tt>key</tt>, or
 *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
 *         (A <tt>null</tt> return can also indicate that the map
 *         previously associated <tt>null</tt> with <tt>key</tt>.)
 * 返回原先与 key 关联的值；如果没有 key 关联的映射，则为 null 。 （返回 null 也可能是因为原先 key 关联的 value 为 null。）
 */
public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

/**
 * Implements Map.put and related methods
 * 中文解释: 该方法是 Map.put() 和 其他相关方法 的具体实现
 * @param hash hash for key  中文解释:  key 的 hash 值
 * @param key the key        中文解释:  key 值
 * @param value the value to put   中文解释: put 设置的 value
 * @param onlyIfAbsent if true, don't change existing value 
 * 中文解释: onlyIfAbsent，直译是，是否仅仅在不存在时才执行。直白解释是，如果为 true ，则不改变已存在的 value。一般情况，都是 false，只有 putIfAbsent(K key, V value) 方法中 调用 putVal() 时，onlyIfAbsent = true。
 * @param evict if false, the table is in creation mode.  中文解释: 如果为 false，则 table 处于创建模式
 * @return previous value, or null if none 中文解释: 返回原先关联的 value ，如果无关联的 value 则返回 null
 */
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    
    // 如果桶数组 table 为空，即未进行初始化，则进行初始化
    if ((tab = table) == null || (n = tab.length) == 0)
        // 用 resize() 扩容的方式进行初始化
        n = (tab = resize()).length;
    // 如果 table 中不包含 hash 值对应的节点，则新建键值对节点，并放入 table。注意此时已经将 p 指向了tab[i = (n - 1) & hash]，即 table 中 hash 值相同
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    // 如果 table 中包含 hash 值对应的节点
    else {
        Node<K,V> e; K k;
        
        /* 
         * 如果当前桶中第一个节点的 hash 值、 键值都相等时，则将 e 指向该节点。
         * 为什么是第一个呢？因为 table 存放的是链表的第一个节点或者红黑树的根节点
         */
        if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // 如果桶中的引用类型为TreeNode，则调用红黑树的插入方法
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // 如果不是红黑树，则一定是链表
        else {
            // 对链表进行遍历，并在遍历的过程中统计链表长度
            for (int binCount = 0; ; ++binCount) {
                // 链表中不包含要插入的键值对节点时，则将该节点接在链表的最后
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    // 如果链表长度大于或等于树化阈值，则链表转为树
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                // 如果当前链表包含了相同 key 的键值对，则终止遍历
                if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        
        // 当前 map 中 存在要插入的 key 
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            // onlyIfAbsent 如果为 false ，且 原先的 value 为 null 时，才覆盖旧的值。
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            // 在 HashMap 中 afterNodeAccess(e) 是空方法
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    
    // 键值对数量超过阀值，则进行扩容
    if (++size > threshold)
        resize();
    
    // 在 HashMap 中 afterNodeInsertion(evict) 是空方法
    afterNodeInsertion(evict);
    return null;
}

HashMap会缩容吗？

HashMap 只有扩容，没有缩容的机制。

参考：为什么java的hashmap不支持动态缩小容量？

2.3 get() 查询

JDK 1.8 中 HashMap 的 get() 方法源码

/**
 * Returns the value to which the specified key is mapped,
 * or {@code null} if this map contains no mapping for the key.
 *
 * <p>More formally, if this map contains a mapping from a key
 * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
 * key.equals(k))}, then this method returns {@code v}; otherwise
 * it returns {@code null}.  (There can be at most one such mapping.)
 *
 * <p>A return value of {@code null} does not <i>necessarily</i>
 * indicate that the map contains no mapping for the key; it's also
 * possible that the map explicitly maps the key to {@code null}.
 * The {@link #containsKey containsKey} operation may be used to
 * distinguish these two cases.
 *
 * @see #put(Object, Object)
 */
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

/**
 * Implements Map.get and related methods
 * 中文解释: 该方法是 Map.get() 和 其他相关方法 的具体实现
 * @param hash hash for key  中文解释: key 的 hash 值
 * @param key the key   中文解释: key 值
 * @return the node, or null if none  中文解释: 返回 key 对应的节点，如果节点不存在，则返回 null
 */
final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    
    // 如果 table 不为空，且 key 在 table 存在，则定位 key 对应节点所在位置
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        // 如果正好是桶的第一个节点
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        // 如果不是桶的第一个节点，则继续往后查找
        if ((e = first.next) != null) {
            // 如果桶中的引用类型为TreeNode，则根据红黑树的查找方式查找节点
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // 如果不是树，则一定是链表。遍历链表，查找出节点
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    
    
    // 如果 table 空，或 key 在 table 不存在，则返回 null
    return null;
}

查找的步骤还是比较简单的。

2.4 遍历

遍历的方式

Map<String, Integer> map = new HashMap<>(16);
map.put("a", 1);
map.put("b", 2);

// 遍历方式1
for (Map.Entry<String, Integer> next : map.entrySet()) {
    System.out.println("key=" + next.getKey() + " value=" + next.getValue());
}

// 遍历方式2
for (String key : map.keySet()) {
    System.out.println("key=" + key + " value=" + map.get(key));
}

推荐使用方式 1 ，因为第一种方式可以直接把 key 和 value 取出，而方式 2 需要通过 key 再去搜索一次，效率差得多。

2.5 remove() 删除

JDK 1.8 中 HashMap 的 remove() 方法源码

/**
 * Removes the mapping for the specified key from this map if present.
 * 删除当前 map 中指定 key 对应的映射关系
 *
 * @param  key key whose mapping is to be removed from the map
 * 中文解释: 要从 map 中删除对应映射关系的 key
 * @return the previous value associated with <tt>key</tt>, or
 *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
 *         (A <tt>null</tt> return can also indicate that the map
 *         previously associated <tt>null</tt> with <tt>key</tt>.)
 * 中文解释: 返回 key 原先关联的 value，当原先的 value 为 null，或 key 无对应的映射关系时，则返回 null
 */
public V remove(Object key) {
    Node<K,V> e;
    return (e = removeNode(hash(key), key, null, false, true)) == null ?
        null : e.value;
}

/**
 * Implements Map.remove and related methods
 * 中文解释: 该方法是 Map.remove() 和 其他相关方法 的具体实现
 * @param hash hash for key 中文解释: key 对应的 hash 值
 * @param key the key 中文解释: key
 * @param value the value to match if matchValue, else ignored 中文解释: 如果matchValue 为 true 才匹配的 value ， 否则忽略 TODO 还是不太理解干啥用的
 * @param matchValue if true only remove if value is equal 中文解释: 如果为 true ，则仅在 value 相等时才删除
 * @param movable if false do not move other nodes while removing 中文解释: 如果为 false ，则在删除时，不会移动其他节点
 * @return the node, or null if none 中文解释: 返回 key 对应的节点，如果不存在对应节点，则返回 null
 */
final Node<K,V> removeNode(int hash, Object key, Object value,
                           boolean matchValue, boolean movable) {
    Node<K,V>[] tab; Node<K,V> p; int n, index;
  
  // 和 get() 类似，如果 table 不为空，且 key 在 table 存在，则定位 key 对应节点所在位置
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (p = tab[index = (n - 1) & hash]) != null) {
        Node<K,V> node = null, e; K k; V v;
    
    // 如果正好是桶的第一个节点
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            node = p;
    // 如果不是桶的第一个节点，则继续往后查找
        else if ((e = p.next) != null) {
      // 如果桶中节点的引用类型为TreeNode，则根据红黑树的查找方式查找节点
            if (p instanceof TreeNode)
                node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
      // 如果不是树，则一定是链表。遍历链表，查找出节点
            else {
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key ||
                         (key != null && key.equals(k)))) {
                        node = e;
                        break;
                    }
          // p = e 的目的是用 p 指向当前节点，而while 表达式中 e = e.next 是让 e 指向 下一个节点，该步骤是后面移除节点，修复链表的准备工作
                    p = e;
                } while ((e = e.next) != null); // 如果 (e = e.next) == null 说明已经遍历到最后一个节点，依旧没有找到对应的节点，将退出 do-while 循环，此时 node 依旧为 null
            }
        }
    
    // 如果存在对应节点，则删除节点，并修复链表或红黑树
        if (node != null && (!matchValue || (v = node.value) == value ||
                             (value != null && value.equals(v)))) {
      // 如果桶中节点的引用类型为TreeNode，则按红黑树删除节点的方式删除该节点
            if (node instanceof TreeNode)
                ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
      // 如果不是树，则一定是链表结构。如果是第一个节点，将 table 的 index 下标的位置指向 node 节点的下一个节点，即第二个节点移动到第一个节点，这样 node 节点就在链表中删除了
            else if (node == p)
                tab[index] = node.next;
      // 如果是链表结构，且不是第一个第一个节点，则 key 对应的节点 node 的前一个节点的 next 指 向 node 之后的节点，这样 node 节点就就在链表中删除了
            else
                p.next = node.next;
            ++modCount;
      
      // map 的 键值对的总数 size 减一 
            --size;
      // 在 HashMap 中 afterNodeRemoval(e) 是空方法
            afterNodeRemoval(node);
            return node;
        }
    }
  
  // 如果 table 空，或 key 在 table 不存在，则返回 null
    return null;
}

romove() 前半部分和 get() 方法步骤是几乎是一样的。

java源码分析之LinkedHashMap
相关文章java源码分析之HashMap（一）java源码分析之HashMap（二）java源码分析之HashMa...
ConcurrentHashMap 源码分析（Java vers
在我之前的文章《HashMap 源码分析》中分析了HashMap的源码，众所周知，ConcurrentHashMa...
java源码分析之HashMap（三）
相关文章java源码分析之HashMap（一）java源码分析之HashMap（二）https://blog.cs...
JDK1.8HashMap源码分析
HashMap源码分析分析源码之前，先了解一下HashMap的结构，JDK1.7之前HashMap是通过数组结构...
HashMap源码分析——put和get（三）
HashMap源码分析——put和get（三）链接上一节： HashMap源码分析——put和get（二） ...
HashMap源码分析 —— put与get（四）
HashMap源码分析 —— put与get（四）链接上一节 : HashMap源码分析 —— put与get...
面试准备
1.HashMap && CurrentHashMap源码分析 HashMap源码解析 java 并发编程之 Co...
HashMap源码分析——put和get（二）
HashMap源码分析——put和get（二）链接上一节：HashMap源码分析——put和get（一） 2...
JAVA 8 HashMap 源码分析
JAVA 8 HashMap 源码分析一什么是HashMap？ HashMap 继承了AbstractMap,...
HashMap源码笔记（二）
紧接这上一篇：HashMap源码笔记（一）我们继续来分析HashMap源码，接下来我们来看看HashMap的源码说...

HashMap 源码分析

一、前言

二、概述

１. HashMap 简介

２. HashMap 特性

3. HashMap 原理

三、源码分析

1. 核心变量

initialCapacity

loadFactor

threshold

2. 核心方法

2.1 构造函数

2.2 put() 插入

HashMap会缩容吗？

2.3 get() 查询

2.4 遍历

2.5 remove() 删除

相关文章

java源码分析之LinkedHashMap

ConcurrentHashMap 源码分析（Java vers

java源码分析之HashMap（三）

JDK1.8HashMap源码分析

HashMap源码分析——put和get（三）

HashMap源码分析 —— put与get（四）

面试准备

HashMap源码分析——put和get（二）

JAVA 8 HashMap 源码分析

HashMap源码笔记（二）

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读