美文网首页
JDK1.7 HashMap源码解读

JDK1.7 HashMap源码解读

作者: zclzhangcl | 来源:发表于2017-03-22 00:39 被阅读0次

    之前分析过JDK8的源码,对于共同的内容就不再赘述。
    首先看hashmap的继承关系:

    继承关系.png

    可以看出与JDK8变化不大。

    那我们从头开始读吧。

        /**
         * holds values which can't be initialized until after VM is booted.
         */
        private static class Holder {
    
            /**
             * Table capacity above which to switch to use alternative hashing.
             */
            static final int ALTERNATIVE_HASHING_THRESHOLD;
    
            static {
                String altThreshold = java.security.AccessController.doPrivileged(
                    new sun.security.action.GetPropertyAction(
                        "jdk.map.althashing.threshold"));
    
                int threshold;
                try {
                    threshold = (null != altThreshold)
                            ? Integer.parseInt(altThreshold)
                            : ALTERNATIVE_HASHING_THRESHOLD_DEFAULT;
    
                    // disable alternative hashing if -1
                    if (threshold == -1) {
                        threshold = Integer.MAX_VALUE;
                    }
    
                    if (threshold < 0) {
                        throw new IllegalArgumentException("value must be positive integer.");
                    }
                } catch(IllegalArgumentException failed) {
                    throw new Error("Illegal value for 'jdk.map.althashing.threshold'", failed);
                }
    
                ALTERNATIVE_HASHING_THRESHOLD = threshold;
            }
        }
    

    先看这段源码,这段是JDK7内的第一段代码.作为静态内部类,这个类的作用只有一个,那就是初始化ALTERNATIVE_HASHING_THRESHOLD值。看其过程可知,是为了解析出初始的临界值,超出这个值map则会再hash扩大。并且在调用的地方也写明了,只会在jvm启动之后才会调用。对于这段代码stackoverflow还有一个提问:http://stackoverflow.com/questions/29918624/what-is-the-use-of-holder-class-in-hashmap

    看一下hashmap最重要的一个构造方法:

    /**
    * Constructs an empty <tt>HashMap</tt> with the specified initial
    * capacity and load factor.
    *
    * @param  initialCapacity the initial capacity
    * @param  loadFactor      the load factor
    * @throws IllegalArgumentException if the initial capacity is negative
    *         or the load factor is nonpositive
    */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
    
        this.loadFactor = loadFactor;
        threshold = initialCapacity;
        init();
    }
    

    这个构造方法就是为了给负载因子和临界值赋值。对于我们一般常用的new HashMap();的初始是多少呢?

    /**
    * Constructs an empty <tt>HashMap</tt> with the default initial capacity
    * (16) and the default load factor (0.75).
    */
    public HashMap() {
        this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
    }
    

    注释中写的很清楚了。
    这里还有一个init方法,看源码可以看出是一个空方法,那么是不是有点奇怪呢?其实这个是在hashmap中是用不着的,但是在linkedHashMap中会覆写这个方法。和jdk8有点类似。

    看一下对于定位元素的核心2个方法:

    /**
    * Retrieve object hash code and applies a supplemental hash function to the
    * result hash, which defends against poor quality hash functions.  This is
    * critical because HashMap uses power-of-two length hash tables, that
    * otherwise encounter collisions for hashCodes that do not differ
    * in lower bits. Note: Null keys always map to hash 0, thus index 0.
    */
    final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }
    
        h ^= k.hashCode();
    
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }
    
    /**
    * Returns index for hash code h.
    */
    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
    }
    

    先说indexFor,看注释:length must be a non-zero power of 2,意思是length必定是被2除尽,也就是这里的length肯定是2的幂次方,好处可以自行查询。这个indexFor的含义其实是定位出这个hash值位于HashMap的哪个bucket中,等同于h%length。

    对于hash方法,这里引入了hashSeed,为了降低hash冲突的概率。若入参为String且hashSeed不为0,即需要动态生成新的hash值。(加入这段分支代码,是为了解决jdk7u40以下的一个bug,有兴趣可以单独了解)
    否则的话会走下面的分支。下面这段分支的含义是为了解决hash冲突而做的二次hash值,即将高位的hash值与低位进行异或操作,降低冲突的概率。

    看完了定位元素,那么我们再来看获取元素:

    /**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
     * key.equals(k))}, then this method returns {@code v}; otherwise
     * it returns {@code null}.  (There can be at most one such mapping.)
     *
     * <p>A return value of {@code null} does not <i>necessarily</i>
     * indicate that the map contains no mapping for the key; it's also
     * possible that the map explicitly maps the key to {@code null}.
     * The {@link #containsKey containsKey} operation may be used to
     * distinguish these two cases.
     *
     * @see #put(Object, Object)
     */
    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        Entry<K,V> entry = getEntry(key);
    
        return null == entry ? null : entry.getValue();
    }
    /**
     * Offloaded version of get() to look up null keys.  Null keys map
     * to index 0.  This null case is split out into separate methods
     * for the sake of performance in the two most commonly used
     * operations (get and put), but incorporated with conditionals in
     * others.
     */
    private V getForNullKey() {
        if (size == 0) {
            return null;
        }
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null)
                return e.value;
        }
        return null;
    }
    /**
     * Returns the entry associated with the specified key in the
     * HashMap.  Returns null if the HashMap contains no mapping
     * for the key.
     */
    final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }
    
        int hash = (key == null) ? 0 : hash(key);
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }
    

    取元素的代码分为3段,主要看最下面方法getEntry。
    Entry<K,V> e = table[indexFor(hash, table.length)]这段是定位到该元素在hashmap是位于哪个bucket内。jdk中的hashmap是数组+链表的结构,会根据key的hash值映射到对应的数组元素内,但是这个数组元素不是存储一个对象,而是维护一个链表。每个数组元素被称之为一个bucket。在取的时候里面会判断是否为同一个key。判断的逻辑有2层,注意体会。

    put方法,则更复杂一点。

    /**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     */
    public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key);
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
    
        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }
    /**
     * Offloaded version of put for null keys
     */
    private V putForNullKey(V value) {
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        addEntry(0, null, value, 0);
        return null;
    }
    /**
     * Adds a new entry with the specified key, value and hash code to
     * the specified bucket.  It is the responsibility of this
     * method to resize the table if appropriate.
     *
     * Subclass overrides this to alter the behavior of put method.
     */
    void addEntry(int hash, K key, V value, int bucketIndex) {
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length);
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }
    
        createEntry(hash, key, value, bucketIndex);
    }
    /**
     * Like addEntry except that this version is used when creating entries
     * as part of Map construction or "pseudo-construction" (cloning,
     * deserialization).  This version needn't worry about resizing the table.
     *
     * Subclass overrides this to alter the behavior of HashMap(Map),
     * clone, and readObject.
     */
    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }
    

    inflateTable这个方法就不说了,主要是初始化table。在jdk8已废弃,我就没细看了。
    put时会先判断put的key是否在bucket存在,若存在,则更新value,返回旧的value。否则的话,则会去新增一个bucket。

    若map存储空间不够时(大于临界值threshold),则会进行扩容。看一下源码:

    /**
     * Rehashes the contents of this map into a new array with a
     * larger capacity.  This method is called automatically when the
     * number of keys in this map reaches its threshold.
     *
     * If current capacity is MAXIMUM_CAPACITY, this method does not
     * resize the map, but sets threshold to Integer.MAX_VALUE.
     * This has the effect of preventing future calls.
     *
     * @param newCapacity the new capacity, MUST be a power of two;
     *        must be greater than current capacity unless current
     *        capacity is MAXIMUM_CAPACITY (in which case value
     *        is irrelevant).
     */
    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }
    
        Entry[] newTable = new Entry[newCapacity];
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        table = newTable;
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }
    /**
     * Transfers all entries from current table to newTable.
     */
    void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        for (Entry<K,V> e : table) {
            while(null != e) {
                Entry<K,V> next = e.next;
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }
    }
    

    先判断容量是否已达最大值(1<<30),若已达系统默认最大值,则直接赋为Integer.MAX_Value.未达最大值,则会走下面的transfer逻辑。
    transfer的逻辑是将原map各bucket的值,按照新的策略,重新放入各bucket内。但是由于HashMap是非线程安全的,在这一步可能会出现环,那么在get的时候就会出现oom异常。一个例子可以看这个:http://www.cnblogs.com/ITtangtang/p/3966467.html

    移除元素的源码如下:

    /**
         * Removes the mapping for the specified key from this map if present.
         *
         * @param  key key whose mapping is to be removed from the map
         * @return the previous value associated with <tt>key</tt>, or
         *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
         *         (A <tt>null</tt> return can also indicate that the map
         *         previously associated <tt>null</tt> with <tt>key</tt>.)
         */
        public V remove(Object key) {
            Entry<K,V> e = removeEntryForKey(key);
            return (e == null ? null : e.value);
        }
    
        /**
         * Removes and returns the entry associated with the specified key
         * in the HashMap.  Returns null if the HashMap contains no mapping
         * for this key.
         */
        final Entry<K,V> removeEntryForKey(Object key) {
            if (size == 0) {
                return null;
            }
            int hash = (key == null) ? 0 : hash(key);
            int i = indexFor(hash, table.length);
            Entry<K,V> prev = table[i];
            Entry<K,V> e = prev;
    
            while (e != null) {
                Entry<K,V> next = e.next;
                Object k;
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k)))) {
                    modCount++;
                    size--;
                    if (prev == e)
                        table[i] = next;
                    else
                        prev.next = next;
                    e.recordRemoval(this);
                    return e;
                }
                prev = e;
                e = next;
            }
    
            return e;
        }
    

    核心也是在下面的removeEntryForKey方法上。步骤也是类似的,先定位到bucket,然后对这个bucket进行循环,找到该key对应的元素,将其指向下一个,当前元素没有引用了,会被GC回收掉。

    最后看一下类图中继承的cloneable接口的实现:

    /**
         * Returns a shallow copy of this <tt>HashMap</tt> instance: the keys and
         * values themselves are not cloned.
         *
         * @return a shallow copy of this map
         */
        public Object clone() {
            HashMap<K,V> result = null;
            try {
                result = (HashMap<K,V>)super.clone();
            } catch (CloneNotSupportedException e) {
                // assert false;
            }
            if (result.table != EMPTY_TABLE) {
                result.inflateTable(Math.min(
                    (int) Math.min(
                        size * Math.min(1 / loadFactor, 4.0f),
                        // we have limits...
                        HashMap.MAXIMUM_CAPACITY),
                   table.length));
            }
            result.entrySet = null;
            result.modCount = 0;
            result.size = 0;
            result.init();
            result.putAllForCreate(this);
    
            return result;
        }
    

    注释里已经声明了这个clone方法只是一个浅复制。一般是不推荐直接clone的。

    相关文章

      网友评论

          本文标题:JDK1.7 HashMap源码解读

          本文链接:https://www.haomeiwen.com/subject/qsunnttx.html