HashMap的数据结构是基于数组+链表实现的。
1 HashMap中重要的静态常量
- DEFAULT_INITIAL_CAPACITY(默认初始化容量)
/**
* The default initial capacity - MUST be a power of two.
*/
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
默认初始化为1左移4位,为16。要求初始化容量必须为2的幂次方。这里的容量指的是HashMap中数组的长度,即桶的数量。
疑问:为什么HashMap的初始化容量必须为2的幂次方呢?
- DEFAULT_LOAD_FACTOR(默认加载因子)
默认加载因子为0.75。/** * The load factor used when none specified in constructor. */ static final float DEFAULT_LOAD_FACTOR = 0.75f;
疑问:加载因子作用是啥?
2 HashMap中的实例变量
- table(存储键值对的数组,又称为桶数组)
/**
* The table, resized as necessary. Length MUST Always be a power of two.
*/
transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;
- size(实际键值对数量)
/** * The number of key-value mappings contained in this map. */ transient int size;
- loadFactor(加载因子)
/** * The load factor for the hash table. * * @serial */ final float loadFactor;
- threshold(阈值)
HashMap扩容的阈值,到达该阈值后,HashMap可能进行扩容。/** * The next size value at which to resize (capacity * load factor). * @serial */ // If table == EMPTY_TABLE then this is the initial capacity at which the // table will be created when inflated. int threshold;
threshold=capacity * load factor
3 HashMap存储键值对的内部节点Entry
static class Entry<K,V> implements Map.Entry<K,V> {
final K key;
V value;
Entry<K,V> next;
int hash;
/**
* Creates new entry.
*/
Entry(int h, K k, V v, Entry<K,V> n) {
value = v;
next = n;
key = k;
hash = h;
}
public final K getKey() {
return key;
}
public final V getValue() {
return value;
}
public final V setValue(V newValue) {
V oldValue = value;
value = newValue;
return oldValue;
}
public final boolean equals(Object o) {
if (!(o instanceof Map.Entry))
return false;
Map.Entry e = (Map.Entry)o;
Object k1 = getKey();
Object k2 = e.getKey();
if (k1 == k2 || (k1 != null && k1.equals(k2))) {
Object v1 = getValue();
Object v2 = e.getValue();
if (v1 == v2 || (v1 != null && v1.equals(v2)))
return true;
}
return false;
}
public final int hashCode() {
return Objects.hashCode(getKey()) ^ Objects.hashCode(getValue());
}
省略...
Entry中存储的信息:
- key
- key所对应的hash
- value
- 该Entry的下一个节点
值得关注的是Entry的equals方法:
当且仅当两个Entry的key相同,value也相同时,两个Entry才equals。
4 HashMap常用默认构造方法
/**
* Constructs an empty <tt>HashMap</tt> with the default initial capacity
* (16) and the default load factor (0.75).
*/
public HashMap() {
this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
}
上面的无参构造方法使用调用下面的有参构造方法,其中初始容量为16(扩容阈值为16),加载因子为0.75;
/**
* Constructs an empty <tt>HashMap</tt> with the specified initial
* capacity and load factor.
*
* @param initialCapacity the initial capacity
* @param loadFactor the load factor
* @throws IllegalArgumentException if the initial capacity is negative
* or the load factor is nonpositive
*/
public HashMap(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal initial capacity: " +
initialCapacity);
if (initialCapacity > MAXIMUM_CAPACITY)
initialCapacity = MAXIMUM_CAPACITY;
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal load factor: " +
loadFactor);
this.loadFactor = loadFactor;
threshold = initialCapacity;
init();
}
5 HashMap put方法详解
/**
* Associates the specified value with the specified key in this map.
* If the map previously contained a mapping for the key, the old
* value is replaced.
*
* @param key key with which the specified value is to be associated
* @param value value to be associated with the specified key
* @return the previous value associated with <tt>key</tt>, or
* <tt>null</tt> if there was no mapping for <tt>key</tt>.
* (A <tt>null</tt> return can also indicate that the map
* previously associated <tt>null</tt> with <tt>key</tt>.)
*/
public V put(K key, V value) {
if (table == EMPTY_TABLE) {
inflateTable(threshold);
}
if (key == null)
return putForNullKey(value);
int hash = hash(key);
int i = indexFor(hash, table.length);
for (Entry<K,V> e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(hash, key, value, i);
return null;
}
- 如果table数组下标为i的桶内没有元素,则根据key和value创建一个entry,将其放入下标i的桶中,并返回null。
- 如果talbe数组下标为i的桶内存在元素,并在存在相同key的元素,则使用新value替换该key所对应的旧值。
-
如果talbe数组下标为i的桶内存在元素,并且存在相同key的元素则根据key和value创建一个entry,将其放入下标i的桶中,并返回null。
image.png
5.1 inflateTable方法(初始化数组)
/**
* Inflates the table.
*/
private void inflateTable(int toSize) {
// Find a power of 2 >= toSize
int capacity = roundUpToPowerOf2(toSize);
threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
table = new Entry[capacity];
initHashSeedAsNeeded(capacity);
}
该方法主要作用:
- 将HashMap的capacity(容量)调整为2的幂次方
- 重新计算扩容threshold(阈值)
- 根据capacity初始化Entry数组
疑问:HashMap为啥需要将capacity调整为2的幂次方?
5.2 putForNullKey方法(存放key为null的Entry)
/**
* Offloaded version of put for null keys
*/
private V putForNullKey(V value) {
for (Entry<K,V> e = table[0]; e != null; e = e.next) {
if (e.key == null) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}
modCount++;
addEntry(0, null, value, 0);
return null;
}
我们一般将HashMap中数组table,index为0的位置称为第一个桶,index为1的位置称为第二个桶,以此类推。
HashMap中将key为null的键值对存储在第一个桶内。存放key为null的键值对存在两种情况:
- 数组table的第一个桶,内有元素(不一定是key为null的元素,还有其他元素),循环从该桶的链表中查找key为null的键值对。如果找到,则替换key为null的键值对中value的值(即使用新value值替换key所在entry中的value),并返回旧value值;如果没有找到,则根据key和value新建一个entry对象,将其放在第一个桶内,并返回null。
- 数组table的第一个桶,没有元素,根据key和value新建一个entry对象,将其放在第一个桶内,并返回null。
5.3 hash方法
/**
* Retrieve object hash code and applies a supplemental hash function to the
* result hash, which defends against poor quality hash functions. This is
* critical because HashMap uses power-of-two length hash tables, that
* otherwise encounter collisions for hashCodes that do not differ
* in lower bits. Note: Null keys always map to hash 0, thus index 0.
*/
final int hash(Object k) {
int h = hashSeed;
if (0 != h && k instanceof String) {
return sun.misc.Hashing.stringHash32((String) k);
}
h ^= k.hashCode();
// This function ensures that hashCodes that differ only by
// constant multiples at each bit position have a bounded
// number of collisions (approximately 8 at default load factor).
h ^= (h >>> 20) ^ (h >>> 12);
return h ^ (h >>> 7) ^ (h >>> 4);
}
hash函数,计算key的哈希值
5.4 indexFor方法(计算数组的下标)
/**
* Returns index for hash code h.
*/
static int indexFor(int h, int length) {
// assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
return h & (length-1);
}
5.5 addEntry方法
/**
* Adds a new entry with the specified key, value and hash code to
* the specified bucket. It is the responsibility of this
* method to resize the table if appropriate.
*
* Subclass overrides this to alter the behavior of put method.
*/
void addEntry(int hash, K key, V value, int bucketIndex) {
if ((size >= threshold) && (null != table[bucketIndex])) {
resize(2 * table.length);
hash = (null != key) ? hash(key) : 0;
bucketIndex = indexFor(hash, table.length);
}
createEntry(hash, key, value, bucketIndex);
}
如果HashMap中键值对的个数size超过阈值,并且key所在桶内不存在键值对,则将该HashMap扩容至2倍。
5.6 resize方法
/**
* Rehashes the contents of this map into a new array with a
* larger capacity. This method is called automatically when the
* number of keys in this map reaches its threshold.
*
* If current capacity is MAXIMUM_CAPACITY, this method does not
* resize the map, but sets threshold to Integer.MAX_VALUE.
* This has the effect of preventing future calls.
*
* @param newCapacity the new capacity, MUST be a power of two;
* must be greater than current capacity unless current
* capacity is MAXIMUM_CAPACITY (in which case value
* is irrelevant).
*/
void resize(int newCapacity) {
Entry[] oldTable = table;
int oldCapacity = oldTable.length;
if (oldCapacity == MAXIMUM_CAPACITY) {
threshold = Integer.MAX_VALUE;
return;
}
Entry[] newTable = new Entry[newCapacity];
transfer(newTable, initHashSeedAsNeeded(newCapacity));
table = newTable;
threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}
该方法作用:
- 创建新table数组
- 调用transfer方法将旧table数组中的数据迁移至新table数组
- 重新计算HashMap的扩容阈值threshold
5.7 transfer方法
/**
* Transfers all entries from current table to newTable.
*/
void transfer(Entry[] newTable, boolean rehash) {
int newCapacity = newTable.length;
for (Entry<K,V> e : table) {
while(null != e) {
Entry<K,V> next = e.next;
if (rehash) {
e.hash = null == e.key ? 0 : hash(e.key);
}
int i = indexFor(e.hash, newCapacity);
e.next = newTable[i];
newTable[i] = e;
e = next;
}
}
}
该方法将当前table数组中的元素移动到新table数组中:
- 首先判断是否需要对key,进行重新hash,如需要则重新计算散列值
- 根据hash值,重新计算数组的下标(即所在的桶)
- 将取出的元素插入到桶中链表的头部
6 HashMap Get方法
/**
* Returns the value to which the specified key is mapped,
* or {@code null} if this map contains no mapping for the key.
*
* <p>More formally, if this map contains a mapping from a key
* {@code k} to a value {@code v} such that {@code (key==null ? k==null :
* key.equals(k))}, then this method returns {@code v}; otherwise
* it returns {@code null}. (There can be at most one such mapping.)
*
* <p>A return value of {@code null} does not <i>necessarily</i>
* indicate that the map contains no mapping for the key; it's also
* possible that the map explicitly maps the key to {@code null}.
* The {@link #containsKey containsKey} operation may be used to
* distinguish these two cases.
*
* @see #put(Object, Object)
*/
public V get(Object key) {
if (key == null)
return getForNullKey();
Entry<K,V> entry = getEntry(key);
return null == entry ? null : entry.getValue();
}
6.1 getForNullKey方法
/**
* Offloaded version of get() to look up null keys. Null keys map
* to index 0. This null case is split out into separate methods
* for the sake of performance in the two most commonly used
* operations (get and put), but incorporated with conditionals in
* others.
*/
private V getForNullKey() {
if (size == 0) {
return null;
}
for (Entry<K,V> e = table[0]; e != null; e = e.next) {
if (e.key == null)
return e.value;
}
return null;
}
getForNullKey从table数组的第一个桶里面,循环查找key为null的entry所对应的value;若没有找到,则返回null
6.2 getEntry方法
/**
* Returns the entry associated with the specified key in the
* HashMap. Returns null if the HashMap contains no mapping
* for the key.
*/
final Entry<K,V> getEntry(Object key) {
if (size == 0) {
return null;
}
int hash = (key == null) ? 0 : hash(key);
for (Entry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
}
return null;
}
getEntry:
- 计算key的hash值
- 根据hash值计算key在table数组中所在的下标(即找到桶)
- 在桶中便利链表,若找到与该key值相等的entry,则返回entry内的value;否则返回null
7 疑问解答
- 为什么HashMap的容量必须为2的幂次方呢?
因为计算key所在数组下标时,我们的目的是将key所对应的hash值映射到数组下标0---length-1内即可。当然方法有很多,比如取余,我们需要找一个高效的方法达到该目的。
当容量为2的幂次方时,使用key所对应的hash值h与数组长度length进行”与“操作,比hash值h与数组长度length进行取余操作快。因为计算对位操作,更为迅速。
static int indexFor(int h, int length) {
// assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
return h & (length-1);
}
- 加载因子作用是啥?
加载因子loadFactor,用来影响HashMap的扩容时机。threshold=capacity * load factor,当HashMap中键值对数量size达到threshold阈值并且添加键值对中key所对应的桶内已存在其他键值对,则扩容。
网友评论