HashMap Source Code: Overview

Author: kkyeer | Published 2018-11-25 12:35

Implementation Notes

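HashMap has two tuning parameters: initialCapacity (default 16) and loadFactor (default 0.75). When the number of entries exceeds capacity * loadFactor, the table is automatically resized to twice its capacity.
A larger loadFactor saves space but raises the time cost of lookups (put and get); a smaller one lowers the time cost but uses more space.
Normally the map is a bucketed hash table in which each bin is a linked list of Nodes. When a single bin grows too large, it is converted into a bin of TreeNodes, structured much like a TreeMap. Most HashMap methods work on plain bins and hand off to the TreeNode methods only when a bin holds TreeNodes (checked with instanceof). Tree bins keep lookups fast in overpopulated bins, but because the vast majority of bins never get that large (treeification requires a bin of at least 8 nodes and a table capacity of at least 64), checking for the existence of tree bins may be delayed in the course of table methods.
Tree bins are ordered primarily by the keys' hash codes. On ties, if the keys are of the same class and that class implements Comparable of itself, compareTo is used for ordering. Implementing compareTo therefore helps performance when many distinct keys return the same hashCode.
A TreeNode is roughly twice the size of a regular Node, so a threshold (TREEIFY_THRESHOLD) controls when tree bins are used.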
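
To make the point about colliding hash codes concrete, here is a minimal sketch. BadKey is a hypothetical key class invented for this illustration: every instance returns the same hashCode, so all entries land in one bin, and the map stays correct (and, with compareTo available, can order the tree bin by key).

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeysDemo {
    /** Hypothetical key type: every instance returns the same hashCode. */
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; } // all keys share one bin

        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        // Gives a tree bin a total order to fall back on when hashes tie.
        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    /** Inserts n colliding keys and counts how many can be read back. */
    static int roundTrip(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        int found = 0;
        for (int i = 0; i < n; i++) {
            Integer v = map.get(new BadKey(i));
            if (v != null && v == i) {
                found++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println("entries read back: " + roundTrip(10_000));
    }
}
```

The sketch only shows that lookups remain correct under total collision; it does not measure the speed difference between list bins and tree bins.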

Constants

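DEFAULT_INITIAL_CAPACITY: default initial capacity, 16
MAXIMUM_CAPACITY: maximum capacity, 2^30
DEFAULT_LOAD_FACTOR: default load factor, 0.75
TREEIFY_THRESHOLD: treeify threshold, 8. Keys that map to the same bucket are chained in one bin through their next references; when an element is added to a bin that already holds at least TREEIFY_THRESHOLD nodes, the chain is converted into a tree to speed up access
UNTREEIFY_THRESHOLD: threshold for converting a tree bin back into a linked list during a resize, 6
MIN_TREEIFY_CAPACITY: smallest table capacity at which bins may be treeified, 64; below it, an overfull bin triggers a resize instead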
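
How TREEIFY_THRESHOLD and MIN_TREEIFY_CAPACITY interact can be sketched as a small decision function. This is a simplified reading of the constants, not the JDK code: the class, the enum, and the method name onInsert are made up for illustration.

```java
class TreeifyDecisionSketch {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    /** Hypothetical labels for what happens to a bin after an insert. */
    enum Action { KEEP_LIST, RESIZE_TABLE, TREEIFY_BIN }

    /**
     * @param nodesInBin    nodes already in the bin before the new one is added
     * @param tableCapacity current length of the table
     */
    static Action onInsert(int nodesInBin, int tableCapacity) {
        if (nodesInBin < TREEIFY_THRESHOLD) {
            return Action.KEEP_LIST;        // bin is still short: stay a linked list
        }
        if (tableCapacity < MIN_TREEIFY_CAPACITY) {
            return Action.RESIZE_TABLE;     // small table: spread entries out instead
        }
        return Action.TREEIFY_BIN;          // long bin in a large table: build a tree
    }

    public static void main(String[] args) {
        System.out.println(onInsert(7, 64));
        System.out.println(onInsert(8, 16));
        System.out.println(onInsert(8, 64));
    }
}
```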

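Static nested Node class

It stores four fields: final hash, final key, value, and next
hashCode returns Objects.hashCode(key) ^ Objects.hashCode(value)

public final int hashCode() {
    return Objects.hashCode(key) ^ Objects.hashCode(value);
}

equals returns true only when the other object is a Map.Entry whose key and value are both equal to this node's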
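
Node is not accessible from outside HashMap, but its hashCode and equals can be observed through the Map.Entry objects that entrySet() returns. A small sketch, in which the class and method names are illustrative only:

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class EntryContractDemo {
    /** Returns the single entry of a one-element HashMap. */
    static Map.Entry<String, Integer> firstEntry() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        return map.entrySet().iterator().next();
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> node = firstEntry();

        // hashCode is the XOR of the key's and the value's hash codes.
        int expected = Objects.hashCode("a") ^ Objects.hashCode(1);
        System.out.println(node.hashCode() == expected);

        // equals holds for any Map.Entry with an equal key and an equal value.
        System.out.println(node.equals(new AbstractMap.SimpleEntry<>("a", 1)));
    }
}
```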

Methods

1. hash method

static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

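This computes the hash used for the key: the high 16 bits of key.hashCode() (an int) are kept as they are, and the low 16 bits are replaced by the XOR of the high 16 bits and the low 16 bits. A null key hashes to 0.
The reason: the table computes a key's bucket index as (capacity - 1) & hash. With a capacity of 16 that is 15 & hash, i.e. 0000 0000 0000 0000 0000 0000 0000 1111 & hash, so the upper 28 bits are always masked to 0 and only the lowest four bits of the hash decide whether two keys collide. Keys that differ only in their high bits would therefore collide with high probability. Spreading the high 16 bits into the low bits reduces collisions in such cases. The following program verifies this: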

public class SpreadHashDemo {
    public static void main(String[] args) {
        Float f1 = 11111.0f;
        Float f2 = 111111.0f;
        int capacity = 16;
        System.out.println("f1 in binary: " + Integer.toBinaryString(f1.hashCode()));
        System.out.println("f2 in binary: " + Integer.toBinaryString(f2.hashCode()));
        System.out.println("Per line 630 of the HashMap source, the index is (n - 1) & hash; assume the capacity is 16\nWithout spreading:");
        // Index taken from the raw hashCode: only the lowest 4 bits matter
        int index1 = (capacity - 1) & f1.hashCode();
        int index2 = (capacity - 1) & f2.hashCode();
        System.out.println("Index 1: " + index1);
        System.out.println("Index 2: " + index2);
        System.out.println("Collision\nWith spreading:");
        // Index taken from the spread hash: the high bits now take part
        index1 = (capacity - 1) & spreadHash(f1.hashCode());
        index2 = (capacity - 1) & spreadHash(f2.hashCode());
        System.out.println("Index 1: " + index1);
        System.out.println("Index 2: " + index2);
        System.out.println("No collision");
    }

    /**
     * Computes the spread hash the same way HashMap.hash does.
     * @param hash the raw hashCode
     * @return the hash with its high 16 bits XORed into its low 16 bits
     */
    static int spreadHash(int hash) {
        return hash ^ (hash >>> 16);
    }
}

The program prints:

f1 in binary: 1000110001011011001110000000000
f2 in binary: 1000111110110010000001110000000
Per line 630 of the HashMap source, the index is (n - 1) & hash; assume the capacity is 16
Without spreading:
Index 1: 0
Index 2: 0
Collision
With spreading:
Index 1: 13
Index 2: 9
No collision

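Why ^ rather than | or &? Testing shows that & is a poor choice: in the example above, bitwise AND still collides (both indices are 0). In that example ^ and | happen to give the same result, because the low four bits of both hash codes are 0. In general they differ: | biases each result bit toward 1 and & biases it toward 0, while ^ leaves 0 and 1 equally likely, so the spread bits stay evenly distributed.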
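
The difference between the three operators can be checked empirically. The sketch below is an illustration, not part of HashMap: it feeds pseudo-random ints through each candidate mixing function and measures how often the lowest bit of the result is 1. The class name, method name, and seed are arbitrary choices.

```java
import java.util.Random;
import java.util.function.IntUnaryOperator;

class SpreadOperatorBias {
    /** Fraction of results whose lowest bit is 1, over pseudo-random inputs. */
    static double fractionOfOnes(IntUnaryOperator mix, int samples, long seed) {
        Random random = new Random(seed);
        int ones = 0;
        for (int i = 0; i < samples; i++) {
            ones += mix.applyAsInt(random.nextInt()) & 1;
        }
        return (double) ones / samples;
    }

    public static void main(String[] args) {
        int n = 100_000;
        // For uniformly random input bits: XOR gives about 1/2,
        // OR about 3/4, AND about 1/4.
        System.out.println("xor: " + fractionOfOnes(h -> h ^ (h >>> 16), n, 1L));
        System.out.println("or:  " + fractionOfOnes(h -> h | (h >>> 16), n, 1L));
        System.out.println("and: " + fractionOfOnes(h -> h & (h >>> 16), n, 1L));
    }
}
```

A skewed bit means some bucket indices are chosen more often than others, which is what a hash spread is meant to avoid.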

2. comparableClassFor method: checks whether the object's class implements Comparable of itself

static Class<?> comparableClassFor(Object x)

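String is the most common key type and implements Comparable, so a String argument returns String.class directly.
Otherwise, if the class implements Comparable and the generic type argument is the class itself, that class is returned.
In every other case the method returns null.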
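
The three rules above can be sketched with reflection. This is a readable approximation written for this article, not a copy of the JDK method; Loose is a hypothetical class whose Comparable type argument is not itself.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

class ComparableClassSketch {
    /** Hypothetical key: Comparable, but not Comparable of itself. */
    static final class Loose implements Comparable<Object> {
        @Override public int compareTo(Object o) { return 0; }
    }

    static Class<?> comparableClassFor(Object x) {
        if (!(x instanceof Comparable)) {
            return null;
        }
        Class<?> c = x.getClass();
        if (c == String.class) {
            return c;                       // shortcut for the most common key type
        }
        // Look for a directly declared interface of the form Comparable<C>.
        for (Type t : c.getGenericInterfaces()) {
            if (t instanceof ParameterizedType) {
                ParameterizedType p = (ParameterizedType) t;
                Type[] typeArgs = p.getActualTypeArguments();
                if (p.getRawType() == Comparable.class
                        && typeArgs.length == 1 && typeArgs[0] == c) {
                    return c;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(comparableClassFor("text"));
        System.out.println(comparableClassFor(42));
        System.out.println(comparableClassFor(new Loose()));
    }
}
```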

3. compareComparables method: returns the result of comparing two Comparable objects

static int compareComparables(Class<?> kc, Object k, Object x)

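kc: the class that implements Comparable (the class of k)
k: the first object; x: the second object
Callers guarantee that k is of type kc
Returns 0 when x is null or not of class kc; otherwise returns k.compareTo(x)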
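
A minimal sketch of that behaviour, assuming as the text says that callers already guarantee k is an instance of kc. It is written here for illustration and may differ in detail from the JDK source.

```java
class CompareComparablesSketch {
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int compareComparables(Class<?> kc, Object k, Object x) {
        // An x that is null or of another class cannot be compared: report a tie.
        if (x == null || x.getClass() != kc) {
            return 0;
        }
        return ((Comparable) k).compareTo(x);
    }

    public static void main(String[] args) {
        System.out.println(compareComparables(String.class, "a", "b"));  // negative
        System.out.println(compareComparables(String.class, "a", 1));    // 0: different class
        System.out.println(compareComparables(String.class, "a", null)); // 0: null
    }
}
```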

4. tableSizeFor: computes the smallest power-of-two capacity that is greater than or equal to the requested capacity

static final int tableSizeFor(int cap)
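
One way to compute this is to smear the highest set bit of cap - 1 into every lower position and then add 1. The sketch below follows that approach as I recall it from the JDK 8 source; other JDK versions may compute the same result differently, so treat it as an illustration rather than the definitive implementation.

```java
class TableSizeSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    static int tableSizeFor(int cap) {
        int n = cap - 1;      // subtract 1 so an exact power of two maps to itself
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;        // every bit below the highest set bit is now 1
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(16)); // 16
        System.out.println(tableSizeFor(17)); // 32
        System.out.println(tableSizeFor(0));  // 1
    }
}
```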

Fields

1.transient Node<K,V>[] table;

The core storage array. It is allocated on first use and resized as needed; its length is always a power of two.
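
The power-of-two length is what lets the bucket index be computed with a bit mask instead of a remainder. A small sketch (indexFor is an illustrative name, not a HashMap method) compares the mask with Math.floorMod:

```java
class BucketIndexDemo {
    /** Bucket index for a hash in a table whose capacity is a power of two. */
    static int indexFor(int hash, int capacity) {
        return (capacity - 1) & hash;
    }

    public static void main(String[] args) {
        int capacity = 16;
        int[] hashes = {0, 1, 15, 16, 35, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : hashes) {
            // For power-of-two capacities the mask equals the non-negative remainder.
            System.out.println(h + " -> " + indexFor(h, capacity)
                    + " (floorMod: " + Math.floorMod(h, capacity) + ")");
        }
    }
}
```

The equivalence only holds when the capacity is a power of two; with a capacity such as 10 the mask would skip some buckets entirely.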

2.transient Set<Map.Entry<K,V>> entrySet;

Caches the entrySet view

3.transient int size;

The number of key-value mappings currently in the map

4.transient int modCount;

The number of structural modifications made to the map, used to make iterators fail fast
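
Fail-fast behaviour is easy to trigger: add a new key while iterating and the iterator's next step throws ConcurrentModificationException. A minimal sketch, with class and method names chosen for this example:

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class FailFastDemo {
    /** Returns true if modifying the map during iteration is detected. */
    static boolean failsFast() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // Adding a new key is a structural modification.
                map.put(key + "-copy", 0);
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("fail-fast detected: " + failsFast());
    }
}
```

Overwriting the value of an existing key is not a structural modification and would not trip this check.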

5.int threshold;

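The size at which the next resize happens (capacity * loadFactor). For example, with a capacity of 16 and a load factor of 0.75 the threshold is 12, so adding the 13th entry triggers a resize.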
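
The arithmetic can be sketched as follows. This is a simplified model with made-up helper names; it ignores edge cases such as the maximum capacity.

```java
class ThresholdSketch {
    /** Threshold for a given capacity and load factor. */
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    /** A resize is needed once the size after an insert exceeds the threshold. */
    static boolean needsResize(int sizeAfterInsert, int threshold) {
        return sizeAfterInsert > threshold;
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f;
        // The capacity doubles on each resize, and so does the threshold.
        for (int capacity = 16; capacity <= 128; capacity <<= 1) {
            System.out.println("capacity " + capacity
                    + " -> threshold " + threshold(capacity, loadFactor));
        }
    }
}
```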

6.final float loadFactor;

The load factor
