HashMap的源码分析有注释！！！

作者: ae12 | 来源:发表于2018-09-21 09:43 被阅读2次

Q：일상, 그리고 너
补充链接，equals() Vs "==":
https://www.cnblogs.com/zhxhdean/archive/2011/03/25/1995431.html还有外文网站上的这个：
https://www.geeksforgeeks.org/difference-equals-method-java/
1.hashMap的get() 和put()? 以及用到的数据结构？
2.hash冲突解决方法?
1、2 答案呢，在源码分析的流程图里了

3.与hashTable 区别？

hashtable 是线程安全的:从源码回答有sychronized() 修饰符，所以性能低。
2)key value 不可以为null;
3)One of HashMap's subclasses is [LinkedHashMap] so in the event that you'd want predictable iteration order (which is insertion order by default), you could easily swap out the HashMap for a LinkedHashMap. This wouldn't be as easy if you were using Hashtable.
----HashMap的子类之一是LinkedHashMap，因此如果您想要可预测的迭代顺序（默认情况下是插入顺序），您可以轻松地为LinkedHashMap换出HashMap。如果您使用Hashtable，这将不那么容易
4)hash（）以及扩容机制不一致
具体hashTablede 的hash实现以及 2*n-1 扩容原因在这里：https://www.jianshu.com/writer#/notebooks/20925198/notes/34152101

4.如何保证线程安全？什么原理？hashmap不是线程安全的
解释不错的：https://coolshell.cn/articles/9606.html
hashmap是线程不安全的
Because during the auto-resizing mechanism, if a thread tries to put or get an object, the map might use the old index value and won’t find the new bucket in which the entry is.
---因为它的 auto-resizing mechanism,如果线程一挂起，而线程二执行完了，且改变了元素位置，而线程一仍然会取旧的那个 bukcet index;这有可能导致死循环

The worst case scenario is when 2 threads put a data at the same time and the 2 put() calls resize the Map at the same time. Since both threads modify the linked lists at the same time, the Map might end up with an inner-loop in one of its linked lists. If you tries to get a data in the list with an inner loop, the get() will never end.
---而线程同时读或是取数据，map可能出现内部死循环

至于如何保证线程安全？
-----可以用hashtable 或是concurrentHashMap; hashTable在
下面解释：
The HashTable implementation is a thread safe implementation that prevents from this situation. But, since all the CRUD methods are synchronized this implementation is very slow. For example, if thread 1 calls get(key1), thread 2 calls get(key2) and thread 3 calls get(key3), only one thread at a time will be able to get its value whereas the 3 of them could access the data at the same time.
-----hashtable可以防止上面情况发生，但是因为CRUD会synchronized，所以会慢，例如如果线程1、2、3同时访问数据，那么只有一个线程可以访问数据。

A smarter implementation of a thread safe HashMap exists since JAVA 5: the ConcurrentHashMap. Only the buckets are synchronized so multiples threads can get(), remove() or put() data at the same time if it doesn’t imply accessing the same bucket or resizing the inner array. It’s better to use this implementation in a multithreaded application.

更明智的做法是ConcurrentHashMap.只有buckts被synchronized，那么get()\move() put() 可以同时访问数据只要不同时访问相同的buckets或是resizing
inner array.在多线程的应用，用ConcurrentHashMap更好

5.hashmap如何扩容的？如何避免扩容？
当hashmap的size>capacoty* loadFactor -1，hashmap自动扩容，且容量增2倍。当未指定loadFactor和 capacity时，默认0.75、16,当容量=160.75 -1 =11 时候，自动扩容为 216=32；
为什么避免自动扩容？
JProfiler的统计信息如下，可以看到扩容的耗费在transfer方法上。

hashmap扩容耗费.jpeg

扩容的目的：
This aim of this resize operation is to decrease the size of the linked lists so that the time cost of put(), remove() and get() methods stays low. ---- 为了减少put() get() remve() 方法在查找元素时的array()的size（）

扩容前后，元素散列.jpg

he picture shows a representation before and after the resizing of the inner array. Before the increase, in order to get Entry E, the map had to iterate through a list of 5 elements. After the resizing, the same get() just iterates through a linked list of 2 elements, the get() is 2 times faster after the resizing !
可见，扩容后重新散列的表，在查找同样的EntryE,map内的array需要迭代5,而新的只需迭代2, 效率更快。

因为每次扩容，都要new Table，并将元素现有元素拷贝到新的Table中，所以，如果元素数量比较多，在扩容时候，就会影响效率，所以，要避免，避免方法：指定 Capacity、loadFactor.

详细扩容机制的，可以看其源码resize()实现原理：

java8新特征

hashmap(java8).jpg

By inheritance, the inner table can contain both Node (linked list ) and TreeNode (red-black tree). Oracle decided to use both data structures with the following rules:
– If for a given index (bucket) in the inner table there are more than 8 nodes, the linked list is transformed into a red black tree
– If for a given index (bucket) in the inner table there are less than 6 nodes, the tree is transformed into a linked list

红黑树，查找元素时间复杂度 cost O(log(n))而链表 it would have cost O(n) with a linked list.
Memory overhead --内存开销

6.hashmap是有序的吗？如何实现有序？ hashmap是无序的。从put()方法实现原理（看put()流程图）：根据key的hash()值计算Bucket的位置，不涉及有序。实现有序方法：LinkedHashMap

hashmap的数据结构、实现原理、源码、put get方法、负载因子、扩容机制、底层的
底层数组长度 2的 n次方原因。。

HashMap 实现了Map接口，继承AbstractMap；Map里规定了键值对映射规则，而AbstractMap提供了MAp的骨干实现，以最大限度减少了实现Map的工作。

HashMap<Key,Value> extends AbstractMap<Key,Value> implement Map<Key,Value>

HashMap（jdk 8）的put的流程图：

hashmap.put()流程图.jpeg

总结：
1.在桶内buket[i] 是否存在，不存在（ p==buket[i]!=null没有哈希冲突），则新建节点（new node）

如果存在冲突，则判断
2-1:hash的key值和给定的key值相等或是地址相同，这里用到了equal()，则赋值给e,最后更新e的value
2-2:p不存在，就判断是否是红黑树：
2-2-1: p是否TreeNode,是TreeNode 就去插入红黑树中
2-2-2: 不是TreeNode，就是链表，则循环链表，直到1.找到key, 则break跳出循环 2.找到p.next==null后，插入链尾，然后判断是否 hashmap.length>=8？就 treeifyBin(tab, hash)（主主要是重新），

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }
然后是详细的put（）方法的代码
    /**
     * Implements Map.put and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value(put 传 false)
     * @param evict if false, the table is in creation mode.(put 传 true)
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
        Node<K,V>[] tab;  //缓存底层数组用,都是指向一个地址的引用
        Node<K,V> p;      //插入数组的桶i处的键值对节点
        int n;            //底层数组的长度
        int i;            //插入数组的桶的下标
        //刚开始table是null或空的时候，初始化个默认的table；为tab和n赋值，tab指向底层数组，n为底层数组的长度
        if ((tab = table) == null || (n = tab.length) == 0){
            n = (tab = resize()).length;
        }
        //(n - 1) & hash：根据hash值算出插入点在底层数组的桶的位置，即下标值；为p赋值，也为i赋值（不管碰撞与否，都已经赋值了）
        //如果在数组上，没有发生碰撞，即当前要插入的位置上之前没有插入过值，则直接在此位置插入要插入的键值对
        if ((p = tab[i = (n - 1) & hash]) == null){
            tab[i] = newNode(hash, key, value, null);//插入的节点的next属性是null
        } else {    //发生碰撞，即当前位置已经插入了值
            Node<K,V> e;    //中间变量吧，跟冒泡排序里面的那个中间变量似的，起到个值交换的作用
            K k;            //同上
            //hash值相同，key也相同，那么就是更新这个键值对的值。同 jdk 1.7
            if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k)))){ //注意在这个if内【e != null】
                e = p;//这地方，e = p  他们两个都是指向数组下标为i的地方，在这if else if else结束之后，再把节点的值value给更新了
            } else if (p instanceof TreeNode){  //这个树方法可能会返回null。
                //jdk 1.8引入了红黑树来处理碰撞，上面判断p的类型已经是树结构了，
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);//如果是，则走添加树的方法。
            } else {    //注意在这个else内，当为添加新节点时，【e == 】；更新某个节点时，就不是null
                for (int binCount = 0; ; ++binCount) {//还未形成树结构，还是jdk 1.7的链表结构
                    //差别就是1.7:是头插法，后来的留在数组上，先来的链在尾上；1.8:是先来的就留在数组上，后来的链在尾上
                    //判断p.next是否为空，同时为e赋值，若为空，则p.next指向新添加的节点，这是在链表长度小于7的时候
                    if ((e = p.next) == null) {
                        //这个地方有个不好理解的地方：在判断条件里面，把e指向p.next，也就是说现在e=null而不是下下一行错误的理解。
                        //这也就解释了更新的时候，返回oldValue，新建的时候，是不在那地方返回的。
                        p.next = newNode(hash, key, value, null);//e = p.next,p.next指向新生成的节点，也就是说e指向新节点（错误）
                        //对于临界值的分析：
                        //假设此次是第六次，binCount == 6,不会进行树变，当前链表长度是7；下次循环。
                        //binCount == 7，条件成立，进行树变，以后再put到这个桶的位置的时候，这个else就不走了，走中间的那个数结构的分叉语句啦
                        //这个时候，长度为8的链表就变成了红黑树啦
                        if (binCount >= TREEIFY_THRESHOLD - 1){// -1 for 1st //TREEIFY_THRESHOLD == 8
                            treeifyBin(tab, hash);
                        }
                        break;//插入新值或进行树变后，跳出for循环。此时e未重定向，还是指向null，虽然后面p.next指向了新节点。
                        //但是，跟e没关系。
                    }
                    //如果在循环链表的时候，找到key相同的节点，那么就跳出循环，就走不到链表的尾上了。
                    // e已经在上一步已经赋值了，且不为null,也会跳出for循环，会在下面更新value的值
                    if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))){
                        break;
                    }
                    //这个就是p.next也就是e不为空，然后，还没有key相同的情况出现，那就继续循环链表，
                    // p指向p.next也就是e，继续循环，继续，e=p.next
                    p = e;
                    //直到p.next为空，添加新的节点；或者出现key相等，更新旧值的情况才跳出循环。
                }
            }
            //经过上面if else if else之后，e在新建节点的时候，为null；更新的时候，则被赋值。在树里面处理putTreeVal（）同样如此，
            if (e != null) { // existing mapping for key//老外说的对，就是只有更新的时候，才走这，才会直接return oldValue
                V oldValue = e.value;
                //onlyIfAbsent 这个在调用hashMap的put()的时候，一直是false，那么下面更新value是肯定执行的
                if (!onlyIfAbsent || oldValue == null){
                    e.value = value;
                }
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold){
            resize();
        }
        afterNodeInsertion(evict);
        return null;
    }

treeifyBin(tab, hash)源码：

   /**
     * Replaces all linked nodes in bin at index for given hash unless
     * table is too small, in which case resizes instead.
     */
    final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
     // < 64 扩容
            resize();
        else if ((e = tab[index = (n - 1) & hash]) != null) {

            TreeNode<K,V> hd = null, tl = null;
            do {
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            if ((tab[index] = hd) != null)
  // > 64,转为红黑树
                hd.treeify(tab);
        }
    }

resize ：初始化table或是扩容（2）
官方解释，自行理解哈哈哈😄
/*
* Initializes or doubles table size. If null, allocates in
* accord with initial capacity target held in field threshold.
* Otherwise, because we are using power-of-two expansion, the
* elements from each bin must either stay at same index, or move
* with a power of two offset in the new table.
*
* @return the table
*/

//一些关于红黑树的操作（树形化、放入树中）参考https://juejin.im/entry/5839ad0661ff4b007ec7cc7a
2.get()


    /**
     * Implements Map.get and related methods
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

HashMap的源码分析有注释！！！

相关文章

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读