哈希表

作者: code希必地 | 来源:发表于2021-01-25 13:25 被阅读0次

Java数据结构_哈希表_基本概念
redis数据结构--字典
哈希表和链表
算法-哈希表算法总结
数据结构 -- 哈希表及其应用
数据结构与算法（第一季）：哈希表（Hash Table）
深入理解哈希表
2019 算法面试相关(leetcode)--哈希表
Redis中的字典
Redis数据结构与对象——哈希

在学习Hash表之前，先来分析一下之前映射（Map）文章中我们自己实现的TreeMap。

1、TreeMap分析

时间复杂度：

TreeMap内部我们使用红黑树实现的，TreeMap的搜索、添加、删除的时间复杂度为O(logn)级别的。

特点：

Key必须具有可比较性

TreeMap无法保证存储的元素顺序和插入顺序一致，但是存储的元素是根据Key的排序进行有序分布的。

但是在很多实际需求中

Map中存储的元素并不讲究顺序。
并不要求Key具有可比较性。
在不考虑元素顺序，不要求Key具有可比较性，使用哈希表实现的Map，可以使得时间复杂度达到O(1)级别。

2、需求

需求：设计一个写字楼通讯录，存放所有公司的通讯信息

要求添加、删除、搜索的时间复杂度为O(1)。

针对这个需求，我们可以创建一个足够大的数组，将座机号码作为数组的索引，将公司信息作为数组中的值。这里假设座机号码的长度为8，设计如下：

Company[] companies = new Company[100000000];

public void add(int phone, Company company) {
    companies[phone] = company;
}

public void remove(int phone) {
    companies[phone] = null;
}

public Company get(int phone) {
    return companies[phone];
}

image.png

可以看到上面设计是可以做到搜索、添加、删除的时间复杂度达到O(1)级别的，但是有如下缺点：

空间复杂度较大，申请了长度为100000000数组。
数组中可能只有一小部分被在使用，造成空间的极大浪费。
其实数组companies就是一个哈希表，典型的【空间换时间】。

3、哈希表（Hash Table）

哈希表也叫散列表，下面举例看下哈希表是如何高效处理数据的。

put("Jack",666);
put("Rose",777);
put("Kate",888);

哈希表搜索、添加、删除的流程是类似的。

1、通过哈希函数将key生成对应的索引index，哈希函数的时间复杂度为【O(1)】。
2、根据index操作指定位置的元素，时间复杂度【O(1)】。

image.png

上图中的table就是哈希表，哈希表是典型的空间换时间的应用，哈希表内部的数组元素很多地方也叫Buket（桶），整个数组叫Buckets或Buckets Array。

4、哈希冲突（Hash Collision）

哈希冲突也叫哈希碰撞。

如果两个不同的key，经过哈希函数计算出来的值相同，这样就会造成哈希冲突。

比如两个key，Rose和Kate经过哈希函数计算出来的index都为3，此时就会出现哈希冲突。

image.png

4.1、哈希冲突的解决方案

1、开放定址法

按照一定的规则向其他地址探测，直到遇到空桶，将重复的值存入到空桶中。

2、再哈希法

设计多个哈希函数，比如使用哈希函数1时发生了哈希冲突，则使用哈希函数2进行计算，如果依然冲突则使用哈希函数3计算，直到不再冲突为止。

3、链地址法

使用单链表存储相同index的元素。

4.2、JDK1.8的哈希冲突解决方案

JDK1.8使用数组+单向链表+红黑树来处理哈希冲突，在JDK1.8之前使用数组+单向链表进行处理冲突。

image.png

1、默认使用单向链表将元素串起来。
2、在添加元素时，当哈希表容量>=64且单向链表的节点数大于8时就会由单向链表转成红黑树存储元素。
3、当红黑树节点数量少到一定程度时，又会转到单向链表。
4、JDK1.8中使用链表+红黑树解决哈希冲突。
为啥要使用单向链表呢？双向链表的性能不应该更高吗？

因为在哈希冲突时，会从头开始遍历链表，如果冲突的key和遍历到的节点中的key相同，则进行value覆盖，如果全都不相同，则添加到链表的尾部，所以没有必要使用双向链表。

而且双向链表中的节点比单向链表多了一个pre指针，浪费空间。

5、哈希函数

哈希表中哈希函数的实现步骤如下：

先计算出key的哈希值（哈希值必须是整数）。

再让key的哈希值和数组的大小进行运算，生成一个索引值。

//生成哈希表中的索引
public int hash(Object key) {
    return hashCode(key) % table.length;
}

为了提高效率，使用&位运算替代%。【前提是数组的长度必须是2^n次幂】

image.png
从上图可以看到：任意数和1111进行&位运算后的结果肯定是小于等于1111。如果1111是数组的长度，那么和1111进行&运算得到的结果肯定就是数组中的索引值。而我们要求数组的长度为2^n，2^n-1使用二进制表示为11....11。最终优化代码如下：

public int hash(Object key) {
    return hashCode(key) & (table.length-1);
}

良好的哈希函数

良好的哈希函数是：让哈希值更加均匀分布---->减少哈希冲突---->提升哈希表的性能。

6、如何生成key的哈希值

1、key的常见种类有：

整数、浮点数、字符串、自定义对象。

不同种类的key，hash值生成的方式不一样，但是目标是一致的
1. 尽量让每个key的hash值是不一样的。
2. 尽量让每个key的所有信息都参与运算。

2、在java中HashMap的key必须实现hashCode、equals方法，也允许key为null。

6.1、整数的哈希值

public int hashCode() {
    return Integer.hashCode(value);
}

public static int hashCode(int value) {
    return value;
}

如果是整数的，直接将整数值当做哈希值，比如100的哈希值就是100。

6.2、Float的哈希值

public int hashCode() {
    return Float.hashCode(value);
}

public static int hashCode(float value) {
    return floatToIntBits(value);
}

如果是单精度的浮点数，哈希值是将在存储的二进制格式转成整数。

6.3、Long和Double的哈希值

Long的哈希值

public int hashCode() {
    return Long.hashCode(value);
}

 public static int hashCode(long value) {
    return (int)(value ^ (value >>> 32));
}

^ 和 >>>的作用

^异或位运算符，相同为0，不同为1。
>>>表示无符号右移,无论正数还是负数，右移高位都补0。
>>表示有符号右移，正数高位补0，负数高位补1。
举例看下value ^ (value >>> 32)的作用

image.png
作用如下：

高32bit和低32bit混合计算出32bit的哈希值。

这样就做到了充分利用key的所有信息计算出hash值，来降低hash冲突的风险。

只能使用异或^才能做到，高位和低位的完全混合，使用位运算符&和|，都无法做到。

Double的哈希值

 public int hashCode() {
    return Double.hashCode(value);
}

public static int hashCode(double value) {
    long bits = doubleToLongBits(value);
    return (int)(bits ^ (bits >>> 32));
}

如果key是Double类型的，将存储的二进制转成Long型，然后对Long型的高32bit和低32bit进行混合计算得到一个32bit的hash值。

6.4、字符串的hash值

整数5438是如何计算出来的

5 * 10³+4 * 10²+3 * 10¹+8 * 10º

字符串是由若干个字符组成的

比如字符串jack，是由字符j、a、c、k四个字符组成的，字符本质是整数。

因此字符串jack的哈希值可以表示为：j * n³ + a * n² + c* n¹ + k* nº，等价于(( j * n + a ) * n+c ) * n+k。

在JDK中，n为31，为什么使用31？
因为31是一个奇素数，JVM会将31 * i自动优化成（i << 5）- i。
推导过程：31 * i = (2^5 - 1 ) * i = (2^5) * i - i = i << 5 - i

字符串的hash值计算过程如下

public int hash(String s) {
    int len = s.length();
    int hash = 0;
    for (int i = 0; i < len; i++) {
        char c = s.charAt(i);
        hash = hash * 31 + c;
    }
    return hash;
}

由于31 * i可以优化为 （i << 5）- i，优化后的代码如下

public int hash(String s) {
    int len = s.length();
    int hash = 0;
    for (int i = 0; i < len; i++) {
        char c = s.charAt(i);
        hash = hash * 31 + c;
    }
    return hash;
}

优化后的代码如下

public  int hash1(String s) {
    int len = s.length();
    int hash = 0;
    for (int i = 0; i < len; i++) {
        char c = s.charAt(i);
        hash = (hash << 5) - hash + c;
    }
    return hash;
}

关于31的讨论

31不仅满足2^n - 1的特点，它还是一个奇素数（既是奇数，又是素数，即是质数）。
素数和其他数相乘的结果比其他方式更容易产生唯一性，减少哈希冲突。
最终选择31是经过观测分布结果后的选择。

6.5、自定对象的hash值

上面讲解了整数、单精度浮点数、双精度浮点数、长整型、字符串的哈希值结算，其实对应的对象类中已经做了具体实现，而对于自定义的对象来说我们需要自己实现哈希值的计算，如果没有实现hashCode()方法，那么自定义对象的哈希值默认为对象地址值。
那么我们能不能直接使用对象的地址值来作为哈希值呢？下面举例说明下，先定义一个Person类

public class Person {
    private String name;
    private float height;
    private int age;

    public Person(String name, float height, int age) {
        this.name = name;
        this.height = height;
        this.age = age;
    }
}

那么

Person p1=new Person("jack", 1.80f, 22);
Person p2=new Person("jack", 1.80f, 22);

由于Person对象的哈希值默认为地址值，显然p1和p2的哈希值不同。
如果你的需求就是将对象的地址值作为对象的哈希值，你可以不用重写hashCode()，不过实际情况中默认的哈希值并不能满足需求。假如我们的需求是当Person的name、height、age相同时是同一个人，那么Person中的这些成员变量都要参与哈希值的计算。而且前面我们提到了一个好的哈希函数需要做到尽量保证唯一、尽量让key的所有信息都参与哈希值的运算。也就是说：让Person中的所有成员变量都参与hash值的运算。

@Override
public int hashCode() {
    int hash = Integer.hashCode(age);
    hash = 31 * hash + Float.hashCode(height);
    hash = 31 * hash + name == null ? 0 : name.hashCode();
    return hash;
}

hash值是int型的，在计算过程中哈希值太大，溢出了该怎么办？
溢出了不需要做任何处理，因为我们想要得到的就是整形的hash值，溢出的部分省略即可。

6.5.1、自定义对象的equals

自定义对象作为HashMap的key，最好同时重写hashCode()和equals()方法。
HashMap中使用哈希表来存储数据的

image.png

hashCode()：重写hashCode的目的是为了通过key计算出来的哈希值进行运算获取key对应的索引。
equals()：如果两个key计算出来的索引值相同，这样就会发生哈希冲突。由于数组中存储的单向链表的根节点，当发生冲突时，就需要使用key1.eqauls(key2)去链表中找到相同的key，若链表中有相同的key则会将value进行覆盖，如果没有则将value添加到链表的末尾。equals()的作用就是判断两个key是否是同一key。
自定义对象equals的默认实现如下：

@Override
public boolean equals(Object obj) {
    return super.equals(obj);
}

public boolean equals(Object obj) {
    return (this == obj);
}

equals的默认实现是通过比较对象的内存地址，如果内存地址相等，则表示是两个对象相同。
默认的实现一般情况下是无法满足我们的需求的，假如Person中的name、height、age都相等时，我们才认为两个Person对象相同。下面看下如何实现

@Override
public boolean equals(Object obj) {
    if (this == obj)
        return true;

    if (obj == null || this.getClass() != obj.getClass())
        return false;
    Person person = (Person) obj;
    return this.age == person.age && 
            this.height == person.height &&
            this.name == null ? person.name == null : this.name.equals(person.name);
}

另外需要注意的是：

必须保证equals()为true的两个key，它们的哈希值一定相同

假设Person的name和age相同的对象是相同的对象，此时我们实现equals()和hashCode()方法时就只需要将name和age参与运算即可。否则无法保证两个equals()为true的key，它们的哈希值也相同。

反过来，哈希值相同的两个key，它们的equals()不一定为true。
如果只重写equals()，不重写hashCode()

这样可能会导致2个equals为true的key都存储在哈希表中

如果只重写hashCode()，不重写equals()

可能会导致2个name、age、height相等的Person对象都存储在哈希表中。

7、实现HashMap

7.1、HashMap的接口设计

public interface Map<K, V> {

    int size();

    boolean isEmpty();

    void clear();

    /**
     * @param key
     * @param value
     * @return key相同会覆盖旧的value，返回被覆盖的value
     */
    V put(K key, V value);

    V get(K key);

    V remove(K key);

    boolean containsKey(K key);

    boolean containsValue(V value);

    void traversal(Visitor<K, V> visitor);

    public static abstract class Visitor<K, V> {
        boolean stop;

        public abstract boolean visitor(K key, V value,boolean color);
    }
}

定义HashMap类实现Map接口

public class HashMap<K, V> implements Map<K, V>{

    @Override
    public int size() {
        // TODO Auto-generated method stub
        return 0;
    }

    @Override
    public boolean isEmpty() {
        // TODO Auto-generated method stub
        return false;
    }

    @Override
    public void clear() {
        // TODO Auto-generated method stub
        
    }

    @Override
    public V put(K key, V value) {
        // TODO Auto-generated method stub
        return null;
    }

    @Override
    public V get(K key) {
        // TODO Auto-generated method stub
        return null;
    }

    @Override
    public V remove(K key) {
        // TODO Auto-generated method stub
        return null;
    }

    @Override
    public boolean containsKey(K key) {
        // TODO Auto-generated method stub
        return false;
    }

    @Override
    public boolean containsValue(V value) {
        // TODO Auto-generated method stub
        return false;
    }

    @Override
    public void traversal(Visitor<K, V> visitor) {
        // TODO Auto-generated method stub
        
    }
}

前面我们提到了在JDK1.8中，HashMap底层是由哈希表（数组）+单向链表+红黑树来实现数据的存储以及哈希冲突的处理。哈希表中存储的是单向链表或者红黑树的根节点，这里我们简单处理一下，统一存放红黑树的节点。下面定义一个红黑树的Node类

public static final boolean RED = true;
public static final boolean BLACK = false;

private static class Node<K,V> {
    K key;
    V value;
    // 新添加的节点默认为RED
    boolean color = RED;

    Node<K, V> parent;
    Node<K, V> left;
    Node<K, V> right;

    public Node(K key, V value, Node<K, V> parent) {
        this.key = key;
        this.value = value;
        this.parent = parent;
    }

    public boolean isLeftChild() {
        return parent != null && this.equals(parent.left);
    }

    public boolean isRightChild() {
        return parent != null && this.equals(parent.right);
    }

    public boolean isLeaf() {
        return left == null && right == null;
    }

    public boolean hasTwoChildren() {
        return left != null && right != null;
    }

    public Node<K, V> sibling() {
        if (isLeftChild()) {
            return parent.right;
        }

        else if (isRightChild()) {
            return parent.left;
        }
        return null;
    }
}

由于HashMap是使用数组存储元素，所以需要定义一个数组，数组容量大小为2^n次方，数组中存入的Node类型的元素。

private Node<K, V>[] table;
private static final int DEFAULT_CAPACITY = 1 << 4;

public HashMap() {
    table = new Node[DEFAULT_CAPACITY];
}

@Override
public int size() {
    return size;
}

@Override
public boolean isEmpty() {
    return size == 0;
}

7.2、clear()

@Override
public void clear() {
    if (size == 0)
        return;
    size = 0;
    for (int i = 0; i < table.length; i++) {
        table[i] = null;
    }
}

1、清空直接把数组中的元素置成null即可，遍历的范围为:[0，table.length)，而不应该为：[0，size)。因为数组中存入的元素的顺序并不是按照索引的顺序的。
2、clear()方法实现时需要注意在遍历前先对size进行判断，如果size=0时表示HashMap中没有元素，就不需要遍历整个数组并将元素置成null了。否则仍然需要对数组进行遍历，造成不必要的浪费。

7.3、put(K key, V value)

在添加之前需要找到存入哪个桶中，即找到key对应的索引

/**
 * @param key
 * @return key对应的索引
 */
private int index(K key) {
    if (key == null)
        return 0;
    int hash = key.hashCode();
    return hash & (table.length - 1);
}

/**
 * @param node
 * @return 找出node对应的索引
 */
private int index(Node<K, V> node) {
    if (node == null)
        return 0;
    return node.hash  & (table.length - 1);
}

为了减少哈希冲突，对哈希值做进一步的处理：扰动计算

/**
 * @param key
 * @return key对应的索引
 */
private int index(K key) {
    if (key == null)
        return 0;
    int hash = key.hashCode();
    return (hash ^ (hash >>> 16)) & (table.length - 1);
}

/**
 * @param node
 * @return 找出node对应的索引
 */
private int index(Node<K, V> node) {
    if (node == null)
        return 0;
    return (node.hash ^ (node.hash >>> 16)) & (table.length - 1);
}

put()的具体实现

@Override
public V put(K key, V value) {
    // 添加之前，需要使用key来计算得到索引
    int index = index(key);
    // 数组中存入的元素是不同红黑树的根节点，取出index位置对应的红黑树的根节点
    Node<K, V> root = table[index];
    if (root == null) { // 添加的第一个节点
        root = new Node<>(key, value, null);
        table[index] = root;
        size++;
        afterPut(root);
        return null;
    }
    // 来到这里说明发生了哈希冲突（key1和key2计算出来的索引相同）,这时候就需要向红黑树中添加节点
    int cmp = 0;
    Node<K, V> node = root;
    Node<K, V> parent = root;
    int k1 = key == null ? 0 : key.hashCode();
    while (node != null) {
        // 由于在HashMap中并不要求key具有可比较性，该如何进行比较呢？
        cmp = compare(key, k1, node.key, node.hash);
        parent = node;
        if (cmp > 0) {
            node = node.right;
        } else if (cmp < 0) {
            node = node.left;
        } else {
            V oldValue = node.value;
            node.key = key;
            node.value = value;
            return oldValue;
        }
    }

    Node<K, V> newNode = new Node<>(key, value, parent);
    if (cmp > 0) {
        parent.right = newNode;
    } else if (cmp < 0) {
        parent.left = newNode;
    }
    afterPut(newNode);
    size++;
    return null;
}

由于HashMap中的key不要求具有可比较性，那么该如何对key进行比较呢？
key的比较

private int compare(K k1, int h1, K k2, int h2) {
    // 比较哈希值
    int result = h1 - h2;
    if (result != 0)
        return result;

    // 来到这里说明哈希值相等
    // 比较equals
    if (Objects.equals(k1, k2))
        return 0;

    // 来到这里说明哈希值相等但不equals
    // 比较类名
    if (k1 != null && k2 != null) {
        String k1Cls = k1.getClass().getName();
        String k2Cls = k2.getClass().getName();
        result = k1Cls.compareTo(k2Cls);
        if (result != 0)
            return result;

        // 类名相同，则表示同种类型，且具有可比较性
        if (k1 instanceof Comparable) {
            return ((Comparable) k1).compareTo(k2);
        }
    }

    // 来到这里的可能情况：k1和k2不可能同时为null 否则Objects.equals(k1,k2)必定返回true
    // 1、k1 = null && k2 != null
    // 2、k1 != null && k2 = null
    // 3、同种类型且不具有可比较性
    // 目前没有其他的比较方式了，这里只能使用内存地址进行比较了
    return System.identityHashCode(k1) - System.identityHashCode(k2);
}

7.4、get(K key)

get(K key)方法返回key对应的value，key和value都是存在红黑树的节点上的，所以找到key对应的节点即可，如果未找到返回null即可。
问题就在于如何找到key对应的节点？

红黑树在添加节点时是通过对key进行比较来确定节点添加的位置，查询按照相同的规则去对key进行比较查询即可，具体实现看方法node(Node<K, V> node, K k1)。

@Override
public V get(K key) {
    Node<K, V> node = node(key);
    return node == null ? null : node.value;
}

/**
 * @param key
 * @return 返回key对应的节点
 */
private Node<K, V> node(K key) {
    Node<K, V> root = table[index(key)];
    return root == null ? null : node(root, key);
}

private Node<K, V> node(Node<K, V> node, K k1) {
    int h1 = k1 == null ? 0 : k1.hashCode();
    while (node != null) {
        int cmp = compare(k1, h1, node.key, node.hash);
        if (cmp == 0)
            return node;
        if (cmp > 0)
            node = node.right;
        else
            node = node.left;
    }
    return null;
}

7.5、containsKey(K key)和containsValue(V value)

containsKey(K key)：表示HashMap中是否存在key，其实就是去找是否存在key对应的节点，我们可以利用上面的方法node(key)即可实现功能。
containsValue(V value)：表示HashMap中是否存在value，由于我们存入的数据是根据key来进行比较存储的，所以我们只能遍历HashMap中的哈希表中存在的所有红黑树。

@Override
public boolean containsKey(K key) {
    return node(key) != null;
}

@Override
public boolean containsValue(V value) {
    if (size == 0)
        return false;
    // 遍历数组中的所有树
    Queue<Node<K, V>> queue = new LinkedList<>();
    for (int i = 0; i < table.length; i++) {
        Node<K, V> root = table[i];
        if (root == null)
            continue;
        queue.offer(root);
        while (!queue.isEmpty()) {
            Node<K, V> node = queue.poll();
            if (Objects.equals(node.value, value))
                return true;
            if (node.left != null)
                queue.offer(node.left);
            if (node.right != null)
                queue.offer(node.right);
        }
    }
    return false;
}

7.6、traversal(Visitor<K, V> visitor)

和containsValue()方法类似，我们需要遍历哈希表中所有的红黑树。

@Override
public void traversal(Visitor<K, V> visitor) {
    if (size == 0 || visitor == null)
        return;
    Queue<Node<K, V>> queue = new LinkedList<>();
    for (int i = 0; i < table.length; i++) {
        Node<K, V> root = table[i];
        if (root == null)
            continue;

        queue.offer(root);
        while (!queue.isEmpty()) {
            Node<K, V> node = queue.poll();
            if (visitor.visitor(node.key, node.value, node.color))
                return;
            if (node.left != null)
                queue.offer(node.left);
            if (node.right != null)
                queue.offer(node.right);
        }
    }
}

8、问题剖析

上面我们已经实现了Map中所有的接口方法，下面进行测试一下
先创建一个Key类

public class Key {
    protected int value;

    public Key(int value) {
        this.value = value;
    }
    
    @Override
    public int hashCode() {
        return value / 10;
    }
    
    @Override
    public boolean equals(Object obj) {
        if (obj == this) return true;
        if (obj == null || obj.getClass() != getClass()) return false;
        return ((Key) obj).value == value;
    }
    
    @Override
    public String toString() {
        return "v(" + value + ")";
    }
}

这个类中我们重写了hashCode()和equals()方法，当value值相等表示是同一个Key对象；当value的值大于等于0，小于10时哈希值都是0。测试用例如下：

static void test2() {
    HashMap<Object, Integer> map = new HashMap<>();
    for (int i = 0; i < 10; i++) {
        map.put(new Key(i), i);
    }
    System.out.println(map.get(new Key(1)));
    map.print();
}

上面示例中我们向map中添加10个元素，key是Key对象，它们的哈希值全部为0，所以这10个元素肯定是添加到同一棵红黑树上的。打印红黑树结果如下：

image.png
那么调用map.get(new Key(1))时，key=new Key(1)，那么key的哈希值是等于0的，会在上面红黑树中查找，而且只要传入Key对象的value相同时则表示是同一个对象，那么正常来说map.get(new Key(1))得到的value肯定是等于1的。但是打印结果却为null。
下面再来看下map.get(K key)方法到底做了什么？

@Override
public V get(K key) {
    Node<K, V> node = node(key);
    return node == null ? null : node.value;
}

/**
 * @param key
 * @return 返回key对应的节点
 */
private Node<K, V> node(K key) {
    Node<K, V> root = table[index(key)];
    return root == null ? null : node(root, key);
}

private Node<K, V> node(Node<K, V> node, K k1) {
    int h1 = k1 == null ? 0 : k1.hashCode();
    while (node != null) {
        int cmp = compare(k1, h1, node.key, node.hash);
        if (cmp == 0)
            return node;
        if (cmp > 0)
            node = node.right;
        else
            node = node.left;
    }
    return null;
}

private int compare(K k1, int h1, K k2, int h2) {
    // 比较哈希值
    int result = h1 - h2;
    if (result != 0)
        return result;

    // 来到这里说明哈希值相等
    // 比较equals
    if (Objects.equals(k1, k2))
        return 0;

    // 来到这里说明哈希值相等但不equals
    // 比较类名
    if (k1 != null && k2 != null) {
        String k1Cls = k1.getClass().getName();
        String k2Cls = k2.getClass().getName();
        result = k1Cls.compareTo(k2Cls);
        if (result != 0)
            return result;

        // 类名相同，则表示同种类型，且具有可比较性
        if (k1 instanceof Comparable) {
            return ((Comparable) k1).compareTo(k2);
        }
    }

    // 来到这里的可能情况：k1和k2不可能同时为null 否则Objects.equals(k1,k2)必定返回true
    // 1、k1 = null && k2 != null
    // 2、k1 != null && k2 = null
    // 3、同种类型且不具有可比较性
    // 目前没有其他的比较方式了，这里只能使用内存地址进行比较了
    return System.identityHashCode(k1) - System.identityHashCode(k2);
}

经过断点可知问题出现在方法compare(K k1, int h1, K k2, int h2)中。
我们结合compare()方法，对照着刚刚打印的红黑树来分析下问题：
首先拿k1=Key(1)的哈希值和根节点k2=Key(3)的哈希值比较，哈希值相等，向下执行equals()，明显它们并不相同的对象，而且它们是相同类型且不具有比较性，下面直接会通过System.identityHashCode(k1) - System.identityHashCode(k2)内存地址进行比较，如果k1对象的内存地址较大，那么会去红黑树的右子树中查找，这样的话永远也找不到所以会返回null，如果运气好的话，k1的内存地址较小，会去左子树中查找，这样也能找到。
所以通过内存地址去比较并不合理，不稳定。

8.1、node(Node<K, V> node, K k1)的bug修复

通过上面的分析，我们是不是只要将compare()进行完善就可以了，其实不然，由于put()和node()方法内部都使用了compare()方法，而它们的处理是有所不同的，下面看下如何对node()方法进行优化。

private Node<K, V> node(Node<K, V> node, K k1) {
    int h1 = k1 == null ? 0 : k1.hashCode();
    int cmp = 0;
    // 存储查找结果
    Node<K, V> result = null;
    while (node != null) {
        int h2 = node.hash;
        K k2 = node.key;
        // 这里没有使用 int result=h1 - h2;然后比较result的正负
        // 原因很简单：哈希值可能为负值，所以h1 - h2 当h2为负值是result=h1 + h2;可能会造成溢出。
        // 那么比较就不准确了，这里直接比较h1和h2.
        // 先比较hash值
        if (h1 > h2)
            cmp = 1;
        else if (h1 < h2)
            cmp = -1;
        else if (Objects.equals(k1, k2)) {
            cmp = 0;
        } else if (k1 != null && k2 != null && k1.getClass().equals(k2.getClass()) && k1 instanceof Comparable) {
            cmp = ((Comparable) k1).compareTo(k2);
        } else {
            // 哈希值相同，但不equals且不具有比较性，这时候没其他办法比较，只能去扫描整棵树
            if (node.right != null && (result = node(node.right, k1)) != null) {
                return result;
            } else if (node.left != null && (result = node(node.left, k1)) != null) {
                return result;
            } else {
                return null;
            }
        }
        if (cmp == 0)
            return node;
        if (cmp > 0)
            node = node.right;
        else
            node = node.left;
    }
    return null;
}

主要修改了key的比较逻辑：

1、比较哈希值，哈希值不相等则去左边或右边查找，相等则比较equals()。
2、比较equals()，equals返回true直接将node返回即可，反之继续比较。
3、若key具有可比性，则调用compareTo()进行比较。
4、否则直接去扫描整棵树去查找，不再使用内存地址去比较。

这样比较key真的就没有问题了吗？下面再看个示例。让Key实现Comparable类

package com.ygj.map.model;

public class Key implements Comparable<Key> {
    protected int value;
    protected int value2;

    public Key(int value, int value2) {
        this.value = value;
        this.value2 = value2;
    }

    @Override
    public int hashCode() {
        return value / 10;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this)
            return true;
        if (obj == null || obj.getClass() != getClass())
            return false;
        return ((Key) obj).value == value && ((Key) obj).value2 == value2;
    }

    @Override
    public String toString() {
        return "v(" + value + ",_" + value2 + ")";
    }

    @Override
    public int compareTo(Key key) {
        return this.value2 - key.value2;
    }
}

Key中增加一个成员变量value2，而且只有当value和value2都相等时才是同一个对象，hashCode()方法不变。compareTo()方法只比较values2的大小。
测试示例

static void test2() {
    HashMap<Object, Integer> map = new HashMap<>();
    for (int i = 1; i < 10; i++) {
        map.put(new Key(i, i), i);
    }
    System.out.println(map.get(new Key(1, 2)));
    map.print();
}

输出结果为2。

不对啊，HashMap中并没有插入key为(new Key(1, 2)的值啊，应该返回null才对啊。

首先k1=(new Key(1, 2)的哈希值为0，存入的9个元素key的哈希值也为0。
在查找时我们先比较的哈希值，而哈希值相等，所以会去比较equals()，而明显k1和9个元素的equals()均返回false。
此时Key已经有可比较性了，所以当k1和k2=Key(2，2)进行比较时k1.compareTo(k2)=0,所以就把k2对应node返回了，所以输出结果为2。
通过compareTo()方法比较相等并不代表它们是同一个key，只有k1.equals(k2)=true时才表示是同一个key'，所以需要忽略compareTo()=0的情况。
优化如下：

private Node<K, V> node(Node<K, V> node, K k1) {
    int h1 = k1 == null ? 0 : k1.hashCode();
    int cmp = 0;
    // 存储查找结果
    Node<K, V> result = null;
    while (node != null) {
        int h2 = node.hash;
        K k2 = node.key;
        // 这里没有使用 int result=h1 - h2;然后比较result的正负
        // 原因很简单：哈希值可能为负值，所以h1 - h2 当h2为负值是result=h1 + h2;可能会造成溢出。
        // 那么比较就不准确了，这里直接比较h1和h2.
        // 先比较hash值
        if (h1 > h2)
            cmp = 1;
        else if (h1 < h2)
            cmp = -1;
        else if (Objects.equals(k1, k2)) {
            cmp = 0;
        } else if (k1 != null && k2 != null && k1.getClass().equals(k2.getClass()) && k1 instanceof Comparable
                && (cmp = ((Comparable) k1).compareTo(k2)) != 0) {
//当满足 (cmp = ((Comparable) k1).compareTo(k2))== 0忽略
//              cmp = ((Comparable) k1).compareTo(k2);
        } else {
            // 哈希值相同，但不equals且不具有比较性，这时候没其他办法比较，只能去扫描整棵树
            if (node.right != null && (result = node(node.right, k1)) != null) {
                return result;
            } else if (node.left != null && (result = node(node.left, k1)) != null) {
                return result;
            } else {
                return null;
            }
        }
        if (cmp == 0)
            return node;
        if (cmp > 0)
            node = node.right;
        else
            node = node.left;
    }
    return null;
}

到此为止node()方法已经处理完了，同样的put()方法也使用compare()，也需要进行处理。

8.1、put(K key, V value)的bug修复

@Override
public V put(K key, V value) {
    // 添加之前，需要使用key来计算得到索引
    int index = index(key);
    // 数组中存入的元素是不同红黑树的根节点，取出index位置对应的红黑树的根节点
    Node<K, V> root = table[index];
    if (root == null) { // 添加的第一个节点
        root = new Node<>(key, value, null);
        table[index] = root;
        size++;
        afterPut(root);
        return null;
    }
    // 来到这里说明发生了哈希冲突（key1和key2计算出来的索引相同）,这时候就需要向红黑树中添加节点
    int cmp = 0;
    Node<K, V> node = root;
    Node<K, V> parent = root;
    int h1 = key == null ? 0 : key.hashCode();
    K k1 = key;
    // 记录是否已经扫描过整棵树，避免重复扫描
    boolean isSearched = false;
    // 记录查找结果
    Node<K, V> result = null;
    while (node != null) {
        // 由于在HashMap中并不要求key具有可比较性，该如何进行比较呢？
//          cmp = compare(key, k1, node.key, node.hash);
        K k2 = node.key;
        int h2 = node.hash;
        // 比较hash
        if (h1 > h2) {
            cmp = 1;
        } else if (h1 < h2) {
            cmp = -1;
        } else if (Objects.equals(k1, k2)) {
            cmp = 0;
        } else if (k1 != null && k2 != null && k1.getClass() == k2.getClass() && k1 instanceof Comparable
                && (cmp = ((Comparable) k1).compareTo(k2)) != 0) {

        } else if (isSearched) {// 表示已经扫描过了
            cmp = System.identityHashCode(k1) - System.identityHashCode(k2);
        } else {// searched == false; 还没有扫描，然后再根据内存地址大小决定左右
            if ((node.left != null && (result = node(node.left, k1)) != null)
                    || (node.right != null && (result = node(node.right, k1)) != null)) {
                // 找到的是Node是result不再是node
                node = result;
                cmp = 0;
            } else {
                isSearched = true;
                // 如果扫描整棵树也没有找到元素，则使用内存地址进行比较
                cmp = System.identityHashCode(k1) - System.identityHashCode(k2);
            }
        }
        parent = node;
        if (cmp > 0) {
            node = node.right;
        } else if (cmp < 0) {
            node = node.left;
        } else {
            V oldValue = node.value;
            node.key = key;
            node.value = value;
            node.hash = h1;
            return oldValue;
        }
    }

    Node<K, V> newNode = new Node<>(key, value, parent);
    if (cmp > 0) {
        parent.right = newNode;
    } else if (cmp < 0) {
        parent.left = newNode;
    }
    afterPut(newNode);
    size++;
    return null;
}

和查询不同的是：如果在扫描整棵树都没有找到时需要使用内存地址进行比较。

8.2、key比较的总结

从添加和查询中的key比较逻辑来看，只有当k1.equals(k2)=true时，才认为k1和k2相同，其他情况下（哈希值相等、compareTo()方法返回0）都不认为两个key相同。我们增加这么多判断规则是只是为了提高查询和插入的效率，下面我们修改下put()和node()的key比较逻辑

@Override
public V put(K key, V value) {
    // 添加之前，需要使用key来计算得到索引
    int index = index(key);
    // 数组中存入的元素是不同红黑树的根节点，取出index位置对应的红黑树的根节点
    Node<K, V> root = table[index];
    if (root == null) { // 添加的第一个节点
        root = new Node<>(key, value, null);
        table[index] = root;
        size++;
        afterPut(root);
        return null;
    }
    // 来到这里说明发生了哈希冲突（key1和key2计算出来的索引相同）,这时候就需要向红黑树中添加节点
    int cmp = 0;
    Node<K, V> node = root;
    Node<K, V> parent = root;
    // 记录是否已经扫描过整棵树，避免重复扫描
    boolean isSearched = false;
    // 记录查找结果
    Node<K, V> result = null;
    K k1 = key;
    int h1 = key == null ? 0 : key.hashCode();
    while (node != null) {
        K k2 = node.key;
        if (Objects.equals(k1, k2)) {
            cmp = 0;
        } else if (isSearched) {// 表示已经扫描过了
            cmp = System.identityHashCode(k1) - System.identityHashCode(k2);
        } else {// searched == false; 还没有扫描，然后再根据内存地址大小决定左右
            if ((node.left != null && (result = node(node.left, k1)) != null)
                    || (node.right != null && (result = node(node.right, k1)) != null)) {
                // 找到的是Node是result不再是node
                node = result;
                cmp = 0;
            } else {
                isSearched = true;
                // 如果扫描整棵树也没有找到元素，则使用内存地址进行比较
                cmp = System.identityHashCode(k1) - System.identityHashCode(k2);
            }
        }
        parent = node;
        if (cmp > 0) {
            node = node.right;
        } else if (cmp < 0) {
            node = node.left;
        } else {
            V oldValue = node.value;
            node.key = key;
            node.value = value;
            node.hash = h1;
            return oldValue;
        }
    }

    Node<K, V> newNode = new Node<>(key, value, parent);
    if (cmp > 0) {
        parent.right = newNode;
    } else if (cmp < 0) {
        parent.left = newNode;
    }
    afterPut(newNode);
    size++;
    return null;
}

在添加元素时，我们去除了哈希值的比较，先判断k1.equals(k2)，如果equals()返回false，则去整棵树中搜索。添加元素的方法修改了，查询的方法也需要进行修改。

private Node<K, V> node(Node<K, V> node, K k1) {
    int cmp = 0;
    // 存储查找结果
    Node<K, V> result = null;
    while (node != null) {
        K k2 = node.key;
        if (Objects.equals(k1, k2)) {
            cmp = 0;
        } else {
            // 哈希值相同，但不equals且不具有比较性，这时候没其他办法比较，只能去扫描整棵树
            if (node.right != null && (result = node(node.right, k1)) != null) {
                return result;
            } else if (node.left != null && (result = node(node.left, k1)) != null) {
                return result;
            } else {
                return null;
            }
        }
        if (cmp == 0)
            return node;
        if (cmp > 0)
            node = node.right;
        else
            node = node.left;
    }
    return null;
}

测试示例:向map中添加大量的数据，然后再将这些数据都删除

static void test1Map(Map<String, Integer> map, String[] words) {
    Times.test(map.getClass().getName(), new Task() {
        @Override
        public void execute() {
            for (String word : words) {
                Integer count = map.get(word);
                count = count == null ? 0 : count;
                map.put(word, count + 1);
            }
            System.out.println(map.size()); // 17188

            int count = 0;
            for (String word : words) {
                Integer i = map.get(word);
                count += i == null ? 0 : i;
                map.remove(word);
            }
            Asserts.test(count == words.length);
            Asserts.test(map.size() == 0);
        }
    });
}

修改key之前比较逻辑之前的耗时

单词总数：200369
-------------------------------------
【com.ygj.map.HashMap】
开始：17:44:34.989
6996
结束：17:44:35.101
耗时：0.111秒
-------------------------------------

修改之后key比较逻辑之前的耗时

单词总数：200369
-------------------------------------
【com.ygj.map.HashMap_Test】
开始：17:44:17.851
6996
结束：17:44:19.544
耗时：1.692秒

很明显，修改key比较逻辑之后，效率变的很低了，这是因为在比较key时需要扫描整棵树，而如果在添加时先进行比较哈希值，哈希值大的添加在树的右边，那么在查询时也会根据哈希值大小去左子树或右子树中查询，比如要查询key的哈希值较大，那么必然直接去右子树中查找，这样以来左子树中大量节点我们就不要去比较了，很显然效率很高。

8.3、remove(K key)实现

通过key找到节点，然后删除节点即可

@Override
public V remove(K key) {
    Node<K, V> node = node(key);
    if (node == null)
        return null;
    V oldValue = node.value;
    if (node.hasTwoChildren()) { // 度为2的节点
        // 找到前驱节点后继节点 然后覆盖节点的值
        Node<K, V> preNode = precursor(node);
        node.key = preNode.key;
        node.value = preNode.value;
        //注意替换hash
        node.hash = preNode.hash;
        // 删除前驱节点或后继节点
        // 前驱节点或后继节点肯定是度为1 或0的节点
        node = preNode;
    }
    int index = index(node);
    // 删除度为1或0的节点
    Node<K, V> replacement = node.left != null ? node.left : node.right;
    if (replacement != null) {
        replacement.parent = node.parent;
        if (node.isLeftChild()) {
            node.parent.left = replacement;
        } else if (node.isRightChild()) {
            node.parent.right = replacement;
        } else {
            // 根节点
            table[index] = replacement;
        }
        afterRemove(node,replacement);
    } else if (node.parent == null) {
        table[index] = null;
    } else {
        // 叶子节点
        if (node.isLeftChild()) {
            node.parent.left = null;
        } else if (node.isRightChild()) {
            node.parent.right = null;
        }
        afterRemove(node,null);
    }

    size--;
    return oldValue;
}

9、扩容

到目前为止HashMap的方法已经实现完了，但是在初始化时我们创建了一个容量为16的数组，如果存入大量的元素，那么每个桶中的红黑树的深度就会越大，这样就会造成性能较差。这时候就需要对桶数组进行扩容。

装填因子：节点总数量 / 哈希表桶数组长度，也叫负载因子。
在JDK1.8的HashMap中，如果装填因子超过0.75时，就扩容为原来的2倍。
而扩容的逻辑肯定是在添加函数中，即HashMap的put()函数中。
上面我们说了当装填因子大于0.75时，进行扩容，容量增加为原来的2倍。
需要将旧的哈希表中所有元素都移动到新的哈希表中，类似于向新的哈希表中添加元素。具体逻辑如下：

/**
 * 扩容
 */
private void resize() {
    // 装填因子小于等于0.75不进行扩容
    if (size * 1.0f / table.length <= LOAD_FACTOR)
        return;

    Node<K, V>[] oldTable = table;
    table = new Node[table.length << 1];
    Queue<Node<K, V>> queue = new LinkedList<>();
    for (int i = 0; i < oldTable.length; i++) {
        Node<K, V> root = oldTable[i];
        if (root == null)
            continue;
        queue.offer(root);
        while (!queue.isEmpty()) {
            Node<K, V> node = queue.poll();

            if (node.left != null)
                queue.offer(node.left);

            if (node.right != null)
                queue.offer(node.right);

            // 挪动节点到新的哈希表中，
            // 注意挪动代码需要放到最后，否则会导致获得node.left和node.right是错乱的
            moveNode(node);
        }

    }
}

/**
 * 挪动节点到新的哈希表中 类似于添加元素的逻辑
 * 
 * @param node
 */
private void moveNode(Node<K, V> newNode) {
    // 注意重置newNode的属性
    newNode.parent = null;
    newNode.left = null;
    newNode.right = null;
    newNode.color = RED;

    // 首先计算出添加到那个桶中
    int index = index(newNode);
    // 取出对应index位置的红黑树的根节点
    Node<K, V> root = table[index];
    // 说明添加进来的是根节点
    if (root == null) {
        root = newNode;
        table[index] = root;
        afterPut(root);
        return;
    }

    // 添加到红黑树上
    Node<K, V> node = root;
    Node<K, V> parent = root;
    int cmp = 0;
    int h1 = newNode.hash;
    K k1 = newNode.key;

    while (node != null) {
        int h2 = node.hash;
        K k2 = node.key;
        parent = node;
        if (h1 > h2) {
            cmp = 1;
        } else if (h1 < h2) {
            cmp = -1;
        } else if (k1 != null && k2 != null && k1.getClass() == k2.getClass() && k1 instanceof Comparable
                && (cmp = ((Comparable) k1).compareTo(k2)) != 0) {
        } else {
            cmp = System.identityHashCode(k1) - System.identityHashCode(k2);
        }

        if (cmp > 0)
            node = node.right;
        else if (cmp < 0)
            node = node.left;
    }

    // 记得维护parent信息
    newNode.parent = parent;
    if (cmp > 0)
        parent.right = newNode;
    else if (cmp < 0)
        parent.left = newNode;

    afterPut(newNode);
}

注意：在移动节点元素之前记得重置节点的属性。
moveNode()逻辑和put()逻辑类似，只不过在判断添加到节点的左边还是右边时需要去掉equals()和搜索逻辑，因为旧的哈希表中的节点不可能是重复的，所以没必要进行判断。

10、HashMap的完整代码

public class HashMap<K, V> implements Map<K, V> {
    private int size;
    public static final boolean RED = true;
    public static final boolean BLACK = false;

    private Node<K, V>[] table;
    private static final int DEFAULT_CAPACITY = 1 << 4;
    private static final float LOAD_FACTOR = 0.75f;

    public HashMap() {
        table = new Node[DEFAULT_CAPACITY];
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public boolean isEmpty() {
        return size == 0;
    }

    @Override
    public void clear() {
        if (size == 0)
            return;
        size = 0;
        for (int i = 0; i < table.length; i++) {
            table[i] = null;
        }
    }

    /**
     * 扩容
     */
    private void resize() {
        // 装填因子小于等于0.75不进行扩容
        if (size * 1.0f / table.length <= LOAD_FACTOR)
            return;

        Node<K, V>[] oldTable = table;
        table = new Node[table.length << 1];
        Queue<Node<K, V>> queue = new LinkedList<>();
        for (int i = 0; i < oldTable.length; i++) {
            Node<K, V> root = oldTable[i];
            if (root == null)
                continue;
            queue.offer(root);
            while (!queue.isEmpty()) {
                Node<K, V> node = queue.poll();

                if (node.left != null)
                    queue.offer(node.left);

                if (node.right != null)
                    queue.offer(node.right);

                // 挪动节点到新的哈希表中，
                // 注意挪动代码需要放到最后，否则会导致获得node.left和node.right是错乱的
                moveNode(node);
            }

        }
    }

    /**
     * 挪动节点到新的哈希表中 类似于添加元素的逻辑
     * 
     * @param node
     */
    private void moveNode(Node<K, V> newNode) {
        // 注意重置newNode的属性
        newNode.parent = null;
        newNode.left = null;
        newNode.right = null;
        newNode.color = RED;

        // 首先计算出添加到那个桶中
        int index = index(newNode);
        // 取出对应index位置的红黑树的根节点
        Node<K, V> root = table[index];
        // 说明添加进来的是根节点
        if (root == null) {
            root = newNode;
            table[index] = root;
            afterPut(root);
            return;
        }

        // 添加到红黑树上
        Node<K, V> node = root;
        Node<K, V> parent = root;
        int cmp = 0;
        int h1 = newNode.hash;
        K k1 = newNode.key;

        while (node != null) {
            int h2 = node.hash;
            K k2 = node.key;
            parent = node;
            if (h1 > h2) {
                cmp = 1;
            } else if (h1 < h2) {
                cmp = -1;
            } else if (k1 != null && k2 != null && k1.getClass() == k2.getClass() && k1 instanceof Comparable
                    && (cmp = ((Comparable) k1).compareTo(k2)) != 0) {
            } else {
                cmp = System.identityHashCode(k1) - System.identityHashCode(k2);
            }

            if (cmp > 0)
                node = node.right;
            else if (cmp < 0)
                node = node.left;
        }

        // 记得维护parent信息
        newNode.parent = parent;
        if (cmp > 0)
            parent.right = newNode;
        else if (cmp < 0)
            parent.left = newNode;

        afterPut(newNode);
    }

    @Override
    public V put(K key, V value) {
        resize();
        // 添加之前，需要使用key来计算得到索引
        int index = index(key);
        // 数组中存入的元素是不同红黑树的根节点，取出index位置对应的红黑树的根节点
        Node<K, V> root = table[index];
        if (root == null) { // 添加的第一个节点
            root = new Node<>(key, value, null);
            table[index] = root;
            size++;
            afterPut(root);
            return null;
        }
        // 来到这里说明发生了哈希冲突（key1和key2计算出来的索引相同）,这时候就需要向红黑树中添加节点
        int cmp = 0;
        Node<K, V> node = root;
        Node<K, V> parent = root;
        int h1 = key == null ? 0 : key.hashCode();
        K k1 = key;
        // 记录是否已经扫描过整棵树，避免重复扫描
        boolean isSearched = false;
        // 记录查找结果
        Node<K, V> result = null;
        while (node != null) {
            // 由于在HashMap中并不要求key具有可比较性，该如何进行比较呢？
//          cmp = compare(key, k1, node.key, node.hash);
            K k2 = node.key;
            int h2 = node.hash;
            // 比较hash
            if (h1 > h2) {
                cmp = 1;
            } else if (h1 < h2) {
                cmp = -1;
            } else if (Objects.equals(k1, k2)) {
                cmp = 0;
            } else if (k1 != null && k2 != null && k1.getClass() == k2.getClass() && k1 instanceof Comparable
                    && (cmp = ((Comparable) k1).compareTo(k2)) != 0) {

            } else if (isSearched) {// 表示已经扫描过了
                cmp = System.identityHashCode(k1) - System.identityHashCode(k2);
            } else {// searched == false; 还没有扫描，然后再根据内存地址大小决定左右
                if ((node.left != null && (result = node(node.left, k1)) != null)
                        || (node.right != null && (result = node(node.right, k1)) != null)) {
                    // 找到的是Node是result不再是node
                    node = result;
                    cmp = 0;
                } else {
                    isSearched = true;
                    // 如果扫描整棵树也没有找到元素，则使用内存地址进行比较
                    cmp = System.identityHashCode(k1) - System.identityHashCode(k2);
                }
            }
            parent = node;
            if (cmp > 0) {
                node = node.right;
            } else if (cmp < 0) {
                node = node.left;
            } else {
                V oldValue = node.value;
                node.key = key;
                node.value = value;
                node.hash = h1;
                return oldValue;
            }
        }

        Node<K, V> newNode = new Node<>(key, value, parent);
        if (cmp > 0) {
            parent.right = newNode;
        } else if (cmp < 0) {
            parent.left = newNode;
        }
        afterPut(newNode);
        size++;
        return null;
    }

    @Override
    public V get(K key) {
        Node<K, V> node = node(key);
        return node == null ? null : node.value;
    }

    @Override
    public V remove(K key) {
        Node<K, V> node = node(key);
        if (node == null)
            return null;
        V oldValue = node.value;
        if (node.hasTwoChildren()) { // 度为2的节点
            // 找到前驱节点后继节点 然后覆盖节点的值
            Node<K, V> preNode = precursor(node);
            node.key = preNode.key;
            node.value = preNode.value;
            // 注意替换hash
            node.hash = preNode.hash;
            // 删除前驱节点或后继节点
            // 前驱节点或后继节点肯定是度为1 或0的节点
            node = preNode;
        }
        int index = index(node);
        // 删除度为1或0的节点
        Node<K, V> replacement = node.left != null ? node.left : node.right;
        if (replacement != null) {
            replacement.parent = node.parent;
            if (node.isLeftChild()) {
                node.parent.left = replacement;
            } else if (node.isRightChild()) {
                node.parent.right = replacement;
            } else {
                // 根节点
                table[index] = replacement;
            }
            afterRemove(node, replacement);
        } else if (node.parent == null) {
            table[index] = null;
        } else {
            // 叶子节点
            if (node.isLeftChild()) {
                node.parent.left = null;
            } else if (node.isRightChild()) {
                node.parent.right = null;
            }
            afterRemove(node, null);
        }

        size--;
        return oldValue;
    }

    @Override
    public boolean containsKey(K key) {
        return node(key) != null;
    }

    @Override
    public boolean containsValue(V value) {
        if (size == 0)
            return false;
        // 遍历数组中的所有树
        Queue<Node<K, V>> queue = new LinkedList<>();
        for (int i = 0; i < table.length; i++) {
            Node<K, V> root = table[i];
            if (root == null)
                continue;
            queue.offer(root);
            while (!queue.isEmpty()) {
                Node<K, V> node = queue.poll();
                if (Objects.equals(node.value, value))
                    return true;
                if (node.left != null)
                    queue.offer(node.left);
                if (node.right != null)
                    queue.offer(node.right);
            }
        }
        return false;
    }

    @Override
    public void traversal(Visitor<K, V> visitor) {
        if (size == 0 || visitor == null)
            return;
        Queue<Node<K, V>> queue = new LinkedList<>();
        for (int i = 0; i < table.length; i++) {
            Node<K, V> root = table[i];
            if (root == null)
                continue;

            queue.offer(root);
            while (!queue.isEmpty()) {
                Node<K, V> node = queue.poll();
                if (visitor.visitor(node.key, node.value, node.color))
                    return;
                if (node.left != null)
                    queue.offer(node.left);
                if (node.right != null)
                    queue.offer(node.right);
            }
        }
    }

    // 打印红黑树
    public void print() {
        if (size == 0)
            return;
        for (int i = 0; i < table.length; i++) {
            final Node<K, V> root = table[i];
            System.out.println("【index = " + i + "】");
            BinaryTrees.println(new BinaryTreeInfo() {
                @Override
                public Object string(Object node) {
                    return node;
                }

                @Override
                public Object root() {
                    return root;
                }

                @Override
                public Object right(Object node) {
                    return ((Node<K, V>) node).right;
                }

                @Override
                public Object left(Object node) {
                    return ((Node<K, V>) node).left;
                }
            });
            System.out.println("---------------------------------------------------");
        }
    }

    private void afterRemove(Node<K, V> node, Node<K, V> replacement) {

        // 如果删除的红色节点 不需处理
        if (isRed(node))
            return;

        // 来到这里删除是BLACK节点
        // 删除有一个RED子节点的
        if (isRed(replacement)) {
            // 将替代的子节点染成BLACK
            black(replacement);
            return;
        }

        // 删除BLACK叶子节点
        Node<K, V> parent = node.parent;
        if (parent == null)// 删除的是根节点
            return;

        // 找兄弟节点
        boolean isLeft = parent.left == null || node.isLeftChild();
        Node<K, V> sibling = isLeft ? parent.right : parent.left;

        if (isLeft) {
            // 转换成兄弟节点为BLACK
            if (isRed(sibling)) {
                black(sibling);
                red(parent);
                rotateLeft(parent);
                sibling = parent.right;
            }
            if (isBlack(sibling.left) && isBlack(sibling.right)) { // 不能借
                boolean isParentBlack = isBlack(parent);
                black(parent);
                red(sibling);
                // 父节点向下合并，可能还会造成下溢，将parent作为删除的节点处理
                if (isParentBlack)
                    afterRemove(parent, null);
            } else {

                if (isBlack(sibling.right)) { // RL
                    rotateRight(sibling);
                    sibling = parent.right;
                }
                color(sibling, colorOf(parent));
                black(parent);
                black(sibling.right);
                rotateLeft(parent);
            }
        } else {
            // 转换成兄弟节点为BLACK
            if (isRed(sibling)) {
                black(sibling);
                red(parent);
                rotateRight(parent);
                sibling = parent.left;
            }
            if (isBlack(sibling.left) && isBlack(sibling.right)) { // 不能借
                boolean isParentBlack = isBlack(parent);
                black(parent);
                red(sibling);
                // 父节点向下合并，可能还会造成下溢，将parent作为删除的节点处理
                if (isParentBlack)
                    afterRemove(parent, null);
            } else {

                if (isBlack(sibling.left)) { // LR
                    rotateLeft(sibling);
                    sibling = parent.left;
                }
                color(sibling, colorOf(parent));
                black(parent);
                black(sibling.left);
                rotateRight(parent);
            }

        }

    }

    private Node<K, V> precursor(Node<K, V> node) {
        Node<K, V> leftNode = node.left;
        if (leftNode != null) { // node.left.right.right.....
            while (leftNode.right != null) {
                leftNode = leftNode.right;
            }
            return leftNode;

        } else { // node.parent.parent....
            while (node.parent != null && node.isLeftChild()) {
                node = node.parent;
            }
            return node.parent;
        }
    }

    /**
     * @param key
     * @return 返回key对应的节点
     */
    private Node<K, V> node(K key) {
        Node<K, V> root = table[index(key)];
        return root == null ? null : node(root, key);
    }

    private Node<K, V> node(Node<K, V> node, K k1) {
        int h1 = k1 == null ? 0 : k1.hashCode();
        int cmp = 0;
        // 存储查找结果
        Node<K, V> result = null;
        while (node != null) {
            int h2 = node.hash;
            K k2 = node.key;
            // 这里没有使用 int result=h1 - h2;然后比较result的正负
            // 原因很简单：哈希值可能为负值，所以h1 - h2 当h2为负值是result=h1 + h2;可能会造成溢出。
            // 那么比较就不准确了，这里直接比较h1和h2.
            // 先比较hash值
            if (h1 > h2)
                cmp = 1;
            else if (h1 < h2)
                cmp = -1;
            else if (Objects.equals(k1, k2)) {
                cmp = 0;
            } else if (k1 != null && k2 != null && k1.getClass().equals(k2.getClass()) && k1 instanceof Comparable
                    && (cmp = ((Comparable) k1).compareTo(k2)) != 0) {
//              cmp = ((Comparable) k1).compareTo(k2);
            } else {
                // 哈希值相同，但不equals且不具有比较性，这时候没其他办法比较，只能去扫描整棵树
                if (node.right != null && (result = node(node.right, k1)) != null) {
                    return result;
                } else if (node.left != null && (result = node(node.left, k1)) != null) {
                    return result;
                } else {
                    return null;
                }
            }
            if (cmp == 0)
                return node;
            if (cmp > 0)
                node = node.right;
            else
                node = node.left;
        }
        return null;
    }

//  private int compare(K k1, int h1, K k2, int h2) {
//      // 比较哈希值
//      int result = h1 - h2;
//      if (result != 0)
//          return result;
//
//      // 来到这里说明哈希值相等
//      // 比较equals
//      if (Objects.equals(k1, k2))
//          return 0;
//
//      // 来到这里说明哈希值相等但不equals
//      // 比较类名
//      if (k1 != null && k2 != null) {
//          String k1Cls = k1.getClass().getName();
//          String k2Cls = k2.getClass().getName();
//          result = k1Cls.compareTo(k2Cls);
//          if (result != 0)
//              return result;
//
//          // 类名相同，则表示同种类型，且具有可比较性
//          if (k1 instanceof Comparable) {
//              return ((Comparable) k1).compareTo(k2);
//          }
//      }
//
//      // 来到这里的可能情况：k1和k2不可能同时为null 否则Objects.equals(k1,k2)必定返回true
//      // 1、k1 = null && k2 != null
//      // 2、k1 != null && k2 = null
//      // 3、同种类型且不具有可比较性
//      // 目前没有其他的比较方式了，这里只能使用内存地址进行比较了
//      return System.identityHashCode(k1) - System.identityHashCode(k2);
//  }

    /**
     * @param key
     * @return key对应的索引
     */
    private int index(K key) {
        if (key == null)
            return 0;
        int hash = key.hashCode();
        return (hash ^ (hash >>> 16)) & (table.length - 1);
    }

    /**
     * @param node
     * @return 找出node对应的索引
     */
    private int index(Node<K, V> node) {
        if (node == null)
            return 0;
        return (node.hash ^ (node.hash >>> 16)) & (table.length - 1);
    }

    /**
     * //用于添加后的修复
     * 
     * @param node
     */
    private void afterPut(Node<K, V> node) {
        Node<K, V> parent = node.parent;
        if (parent == null) { // 根节点
            black(node);
            return;
        }
        // 添加有12种
        // 4种parent为黑色，不需处理
        if (isBlack(parent))
            return;

        // 来到这里parent是RED，判断uncle节点是否红色
        Node<K, V> sibling = parent.sibling();
        Node<K, V> grand = parent.parent;
        if (isRed(sibling)) { // uncle节点为红色
            // 将parent、uncle染成黑色
            black(parent);
            black(sibling);
            // 将grand染成红色作为新添加的节点
            grand = red(grand);
            afterPut(grand);
            return;
        }

        // 来到这里uncle节点为黑色
        if (parent.isLeftChild()) { // L
            if (node.isLeftChild()) { // LL
                black(parent);
                red(grand);
                rotateRight(grand);
            } else { // LR
                black(node);
                red(grand);
                rotateLeft(parent);
                rotateRight(grand);
            }
        } else { // R
            if (node.isLeftChild()) { // RL
                black(node);
                red(grand);
                rotateRight(parent);
                rotateLeft(grand);
            } else { // RR
                black(parent);
                red(grand);
                rotateLeft(grand);
            }
        }

    }

    private void rotateLeft(Node<K, V> grand) { // RR
        Node<K, V> parent = grand.right;
        Node<K, V> child = parent.left;

        grand.right = child;
        parent.left = grand;

        parent.parent = grand.parent;
        if (grand.isLeftChild()) {
            grand.parent.left = parent;
        } else if (grand.isRightChild()) {
            grand.parent.right = parent;
        } else {
            // 能来到这里说明，红黑树上所有节点中的key计算出来的索引都一样，
            // 所以root=table[index(grand)]，index(Node node)传入这一棵红黑树上的任意节点都可以
            table[index(grand)] = parent;
        }

        grand.parent = parent;
        if (child != null)
            child.parent = grand;
    }

    private void rotateRight(Node<K, V> grand) { // LL
        Node<K, V> parent = grand.left;
        Node<K, V> child = parent.right;

        grand.left = child;
        parent.right = grand;

        parent.parent = grand.parent;
        if (grand.isLeftChild()) {
            grand.parent.left = parent;
        } else if (grand.isRightChild()) {
            grand.parent.right = parent;
        } else {
            // 能来到这里说明，红黑树上所有节点中的key计算出来的索引都一样，
            // 所以root=table[index(grand)]，index(Node node)传入这一棵红黑树上的任意节点都可以
            table[index(grand)] = parent;
        }

        grand.parent = parent;
        if (child != null)
            child.parent = grand;
    }

    private Node<K, V> black(Node<K, V> node) {
        return color(node, BLACK);
    }

    private Node<K, V> red(Node<K, V> node) {
        return color(node, RED);
    }

    private boolean isBlack(Node<K, V> node) {
        return colorOf(node) == BLACK;
    }

    private boolean isRed(Node<K, V> node) {
        return colorOf(node) == RED;
    }

    private boolean colorOf(Node<K, V> node) {
        return node == null ? BLACK : node.color;
    }

    private Node<K, V> color(Node<K, V> node, boolean color) {
        if (node != null)
            node.color = color;
        return node;
    }

    private static class Node<K, V> {
        K key;
        V value;
        int hash;
        // 新添加的节点默认为RED
        boolean color = RED;

        Node<K, V> parent;
        Node<K, V> left;
        Node<K, V> right;

        public Node(K key, V value, Node<K, V> parent) {
            this.key = key;
            this.value = value;
            this.parent = parent;
            this.hash = key == null ? 0 : key.hashCode();
        }

        public boolean isLeftChild() {
            return parent != null && this.equals(parent.left);
        }

        public boolean isRightChild() {
            return parent != null && this.equals(parent.right);
        }

        public boolean isLeaf() {
            return left == null && right == null;
        }

        public boolean hasTwoChildren() {
            return left != null && right != null;
        }

        public Node<K, V> sibling() {
            if (isLeftChild()) {
                return parent.right;
            }

            else if (isRightChild()) {
                return parent.left;
            }
            return null;
        }

        @Override
        public String toString() {
            return "K" + "_" + key + "_v_" + value;
        }
    }

}

11、TreeMap VS HashMap

1、时间复杂度
TreeMap的底层使用红黑树实现的，搜索、添加、删除的时间复杂度为O(logn)；
HashMap的底层使用哈希表+单链表+红黑树实现的，搜索、添加、删除的时间复杂度为O(1)。
为啥同样使用红黑树，HashMap的复杂度为啥为O(1)呢？

比如添加到HashMap中的节点总数为n，那么这n个节点并不是添加到同一棵红黑树上的，而是会均分到桶数组中每个桶中的红黑树上，而且装填因子比较小为0.75，当桶数组中的红黑树增加到一定数量时又会进行扩容，所以分配到一棵红黑树上的节点是常数级别的。所以HashMap的搜索、添加、删除的复杂度为O(1)级别的。

2、何时选择TreeMap
既然HashMap的时间复杂度为O(1)，是不是只要使用到映射就直接使用HashMap呢？
TreeMap中的key是要求具有可比较性的，而且TreeMap的元素的遍历是按照key的升序遍历的，所以如果要求是有序遍历就使用TreeMap，否则使用HashMap。

Java数据结构_哈希表_基本概念
本文目标哈希表的基本概念,哈希冲突,哈希函数什么是哈希表哈希表也叫做散列表(hash有剁碎的意思)哈希表是空...
redis数据结构--字典
Redis的字典底层就是哈希表。哈希表首先给出哈希表的定义：其中可以看到，table是一个哈希表节点的数组，...
哈希表和链表
优秀文章：Chapter: 散列表（哈希表）一、哈希表哈希表hashtable(key，value) 就是把K...
算法-哈希表算法总结
1 哈希表模拟思路：通过设计哈希表，模拟O(1)时间复杂度的哈希表。 2 数组作为哈希表思路：数组就是简单的哈...
数据结构 -- 哈希表及其应用
这篇主要用来记录一下学习到的关于哈希表的知识点。文章结构哈希表哈希表的定义哈希表的优缺点哈希碰撞负载因...
数据结构与算法（第一季）：哈希表（Hash Table）
一、哈希表（Hash Table） 1、概念哈希表也叫做散列表。哈希表的原理：利用哈希函数生成key对应的i...
深入理解哈希表
深入理解哈希表深入理解哈希表
2019 算法面试相关(leetcode)--哈希表
哈希表相关的原理可以参考下：浅谈哈希表(HashTable)深入理解哈希表哈希表的理解理解HashSet及使用哈...
Redis中的字典
Redis中的字典 Redis中的字典使用哈希表作为底层实现，一个哈希表中可以有多个哈希表结点，而每个哈希表结点保...
Redis数据结构与对象——哈希
1 字典的实现 Redis的字典使用哈希表作为底层实现，一个哈希表可以有多个哈希表节点，即每个哈希表节点就保存了字...

哈希表

1、TreeMap分析

2、需求

3、哈希表（Hash Table）

4、哈希冲突（Hash Collision）

4.1、哈希冲突的解决方案

4.2、JDK1.8的哈希冲突解决方案

5、哈希函数

6、如何生成key的哈希值

6.1、整数的哈希值

6.2、Float的哈希值

6.3、Long和Double的哈希值

6.4、字符串的hash值

关于31的讨论

6.5、自定对象的hash值

6.5.1、自定义对象的equals

7、实现HashMap

7.1、HashMap的接口设计

7.2、clear()

7.3、put(K key, V value)

7.4、get(K key)

7.5、containsKey(K key)和containsValue(V value)

7.6、traversal(Visitor<K, V> visitor)

8、问题剖析

8.1、node(Node<K, V> node, K k1)的bug修复

8.1、put(K key, V value)的bug修复

8.2、key比较的总结

8.3、remove(K key)实现

9、扩容

10、HashMap的完整代码

11、TreeMap VS HashMap

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读