JDK 1.7 HashMap Source Code Explained

Posted 2021-05-11 Lyy11


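Contents: the underlying data structure, hash collisions, computing the storage index, fail-fast behavior, resizing, avoiding resizes, and thread-safety problems under concurrent access along with ways to address them.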

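Underlying data structure: a HashMap is laid out like a bead curtain, a row of slots with a chain hanging from each one. JDK 7 stores data in an array plus linked lists, and both the array slots and the list nodes are Entry objects.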

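Hash collisions: if the target array slot is empty, the new element is placed there directly. If the slot is already occupied, put walks the chain in that slot and checks whether an existing entry has the same key (same hash, and either the same reference or equals() returns true). If it finds one, it overwrites the value and returns the old value. Otherwise it inserts the new entry at the head of the slot's chain (head insertion), so the array slot now points at the new entry.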
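The overwrite-or-head-insert rule for a single bucket can be sketched as follows. This is a minimal illustration, not the JDK code: the Node class and the put/keys helpers are hypothetical stand-ins for HashMap.Entry and its chain handling, and keys are compared with equals() only.

```java
public class HeadInsertSketch {
    // Hypothetical stand-in for HashMap.Entry, reduced to what the example needs.
    static final class Node {
        final String key;
        String value;
        final Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Returns the head of the chain after the put.
    // An existing key is overwritten in place; a new key becomes the new head.
    static Node put(Node head, String key, String value) {
        for (Node n = head; n != null; n = n.next) {
            if (n.key.equals(key)) {
                n.value = value;
                return head;
            }
        }
        return new Node(key, value, head);
    }

    // Joins the keys of the chain from head to tail, separated by commas.
    static String keys(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(',');
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node head = null;
        head = put(head, "a", "1");
        head = put(head, "b", "2"); // new key: inserted in front of "a"
        head = put(head, "a", "3"); // existing key: value replaced, order unchanged
        System.out.println(keys(head)); // b,a
    }
}
```

The most recently added key always sits first in the chain, which is the property that matters later when the table is resized.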

Source of HashMap's put method

public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(hash, key, value, i);
    return null;
}

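table is a field of the HashMap object, namely the bucket array. If it is still empty (the first call to put), the map is initialized here. When calling new HashMap() we can specify the initial capacity and the load factor (the defaults are 16 and 0.75). [Load factor: once the number of entries reaches capacity * load factor, the map resizes and the array doubles in size.] The requested initial capacity is passed through roundUpToPowerOf2(), which rounds it up to the smallest power of two greater than or equal to it: 17 becomes 32, while 16 stays 16. The reason for this is covered further down.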

private void inflateTable(int toSize) {
    // Find a power of 2 >= toSize
    int capacity = roundUpToPowerOf2(toSize);
    threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    table = new Entry[capacity];
    initHashSeedAsNeeded(capacity);
}

private static int roundUpToPowerOf2(int number) {
    // assert number >= 0 : "number must be non-negative";
    int rounded = number >= MAXIMUM_CAPACITY
            ? MAXIMUM_CAPACITY
            : (rounded = Integer.highestOneBit(number)) != 0
                ? (Integer.bitCount(number) > 1) ? rounded << 1 : rounded
                : 1;
    return rounded;
}

Suppose the number passed in is 17. MAXIMUM_CAPACITY is 1 shifted left by 30 bits, i.e. 1073741824, the largest capacity a HashMap can have.

 static final int MAXIMUM_CAPACITY = 1 << 30;

public static int highestOneBit(int i) {
    // HD, Figure 3-1
    i |= (i >> 1);
    i |= (i >> 2);
    i |= (i >> 4);
    i |= (i >> 8);
    i |= (i >> 16);
    return i - (i >>> 1);
}

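This is the key method. It ORs the value with itself shifted right by 1, 2, 4, 8 and 16 bits, which sets every bit below the highest 1 bit to 1; the final i - (i >>> 1) then clears everything except that highest bit. For example 15: 0000 1111 -> 0000 1000, i.e. 8. roundUpToPowerOf2() builds on this: if the number has more than one bit set (it is not already a power of two), the result is shifted left once to reach the next power of two; otherwise it is returned unchanged. An equivalent trick is to shift the number left by one bit (doubling it), subtract one, and pass that in, which also yields the smallest power of two that is not less than the number. The capacity has to be a power of two because of how the bucket index is computed and how resizing works, both discussed below.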
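To make the rounding concrete, here is a small sketch that reproduces the same decision using only the public Integer.highestOneBit and Integer.bitCount methods. The class name and the roundUp helper are illustrative, and negative inputs are assumed not to occur (the JDK method carries the same assumption in its commented-out assert).

```java
public class PowerOfTwoSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= number, capped at MAXIMUM_CAPACITY.
    // Assumes number >= 0.
    static int roundUp(int number) {
        if (number >= MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        int highest = Integer.highestOneBit(number);
        if (highest == 0) {
            return 1; // number was 0
        }
        // More than one bit set means number is not itself a power of two.
        return Integer.bitCount(number) > 1 ? highest << 1 : highest;
    }

    public static void main(String[] args) {
        System.out.println(Integer.highestOneBit(15)); // 8
        System.out.println(roundUp(15)); // 16
        System.out.println(roundUp(16)); // 16
        System.out.println(roundUp(17)); // 32
    }
}
```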

table = new Entry[capacity];

After that, a new Entry array is allocated and assigned to the map, and initialization is complete.

Note that HashMap accepts a null key, which is one difference from Hashtable. An entry whose key is null is stored directly in the chain at table[0].

private V putForNullKey(V value) {
    for (Entry<K,V> e = table[0]; e != null; e = e.next) {
        if (e.key == null) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }
    modCount++;
    addEntry(0, null, value, 0);
    return null;
}

Computing the storage index:

Computing the hash:

final int hash(Object k) {
    int h = hashSeed;
    if (0 != h && k instanceof String) {
        return sun.misc.Hashing.stringHash32((String) k);
    }
    h ^= k.hashCode();
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

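hashSeed is the hash seed. The repeated right shifts fold the high bits of the hash code into the low bits, so that the high bits still influence the bucket index, which is taken from the low bits only.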
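The effect of the shifts is easiest to see with two hash codes that differ only in their high bits. The sketch below copies just the two mixing lines from hash() and assumes hashSeed is 0 (so the alternative String hashing branch is never taken); the class and method names are illustrative.

```java
public class HashSpreadSketch {
    // The bit-mixing step of the JDK 7 hash(), assuming hashSeed == 0.
    static int spread(int hashCode) {
        int h = hashCode;
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    public static void main(String[] args) {
        int a = 0x10000;
        int b = 0x20000;
        // Without mixing, the low four bits are identical, so both
        // values would land in bucket 0 of a 16-slot table.
        System.out.println((a & 15) + " " + (b & 15));
        // After mixing, the high bits have reached the low four bits
        // and the two values land in different buckets.
        System.out.println((spread(a) & 15) + " " + (spread(b) & 15));
    }
}
```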

Computing the index:

int i = indexFor(hash, table.length);

static int indexFor(int h, int length) {
    // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
    return h & (length-1);
}

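The index is computed with a single bitwise AND. For example, with length 16, length - 1 is 15 (0000 1111); if h is 0011 1010, the result is 0000 1010. The result is always a value in the range 0 to 15, so the AND guarantees that the index never falls outside the array and, provided the low bits of the hash are well mixed, spreads entries evenly across it. The index could also be computed with a modulo operation; for a power-of-two length the AND gives the same result for non-negative h and is typically cheaper.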
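A quick check of the AND against a modulo, as a sketch. indexFor is copied from the source above; the comparison with Math.floorMod (available since Java 8) and the choice of sample values are additions for illustration.

```java
public class IndexForSketch {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        // The example from the text: 0011 1010 & 0000 1111 = 0000 1010
        System.out.println(indexFor(0b0011_1010, 16)); // 10

        int[] samples = {0, 1, 15, 16, 58, 12345, -1, -7, Integer.MIN_VALUE};
        for (int h : samples) {
            // For a power-of-two length, the AND matches the floor modulus,
            // including for negative hashes.
            System.out.println(h + " -> " + indexFor(h, 16)
                    + " (floorMod: " + Math.floorMod(h, 16) + ")");
        }
        // The plain % operator keeps the sign of the dividend and
        // would produce an invalid index for a negative hash.
        System.out.println(-7 % 16); // -7
    }
}
```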

for (Entry<K,V> e = table[i]; e != null; e = e.next) {
    Object k;
    if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
        V oldValue = e.value;
        e.value = value;
        e.recordAccess(this);
        return oldValue;
    }
}

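Next the chain in that bucket is traversed. If an entry with the same key as the new one already exists, its value is overwritten and the old value is returned.

addEntry stores the element in the map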

addEntry(hash, key, value, i);

void addEntry(int hash, K key, V value, int bucketIndex) {
    if ((size >= threshold) && (null != table[bucketIndex])) {
        resize(2 * table.length);
        hash = (null != key) ? hash(key) : 0;
        bucketIndex = indexFor(hash, table.length);
    }
    createEntry(hash, key, value, bucketIndex);
}

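The method first checks whether the map's size has reached the threshold (current capacity * load factor) and whether the target bucket is already occupied. If both hold, the map is resized first and the hash and bucket index are recomputed. In either case a new Entry is then created and stored in the map.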
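The two-part condition can be isolated in a small sketch. The threshold formula is the one used in inflateTable() and resize(); the class name and the needsResize helper are hypothetical and only restate the if condition from addEntry.

```java
public class ResizeConditionSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same formula as in inflateTable() and resize().
    static int threshold(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    // Restates the condition in addEntry: both parts must hold.
    static boolean needsResize(int size, int threshold, boolean bucketOccupied) {
        return size >= threshold && bucketOccupied;
    }

    public static void main(String[] args) {
        int t = threshold(16, 0.75f);
        System.out.println(t); // 12
        // 12 entries already stored, but the new key maps to an empty bucket:
        System.out.println(needsResize(12, t, false)); // false
        // 12 entries already stored and the new key collides:
        System.out.println(needsResize(12, t, true)); // true
    }
}
```

One consequence is that a JDK 7 map can briefly hold more entries than its threshold, as long as the new entries keep landing in empty buckets.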

void createEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
    table[bucketIndex] = new Entry<>(hash, key, value, e);
    size++;
}

Resizing:

if ((size >= threshold) && (null != table[bucketIndex])) {
    resize(2 * table.length);
    hash = (null != key) ? hash(key) : 0;
    bucketIndex = indexFor(hash, table.length);
}

resize():

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        threshold = Integer.MAX_VALUE;
        return;
    }
    Entry[] newTable = new Entry[newCapacity];
    transfer(newTable, initHashSeedAsNeeded(newCapacity));
    table = newTable;
    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
}

newCapacity is the size after resizing. resize() creates a new Entry[] of that size, and transfer() is the code that actually moves the elements.

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
        while(null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}

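It iterates over the array and over every chain hanging off it, recomputes the storage index of each element, and inserts the element at the head of the corresponding chain in the new array.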
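Because transfer() also uses head insertion, entries that end up in the same new bucket come out in reverse order. The sketch below runs the same four-statement loop body on a single chain; the Node class is a hypothetical stand-in for Entry, and for simplicity every node is assumed to map to the same new bucket.

```java
public class TransferOrderSketch {
    // Hypothetical stand-in for HashMap.Entry with a mutable next pointer.
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Moves a chain into one new bucket using the same pointer
    // updates as the inner while loop of transfer().
    static Node transfer(Node oldHead) {
        Node newBucket = null;
        Node e = oldHead;
        while (e != null) {
            Node next = e.next;
            e.next = newBucket; // link the node in front of the new chain
            newBucket = e;      // the node becomes the new head
            e = next;
        }
        return newBucket;
    }

    // Joins the keys of the chain from head to tail, separated by commas.
    static String keys(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(',');
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node chain = new Node("A", new Node("B", new Node("C", null)));
        System.out.println(keys(chain));           // A,B,C
        System.out.println(keys(transfer(chain))); // C,B,A
    }
}
```

This order reversal is harmless in a single thread, but it is the ingredient that lets two threads resizing at the same time link a chain into a cycle.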

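initHashSeedAsNeeded() decides whether the hash seed needs to be recomputed. The hash seed exists to make the hash function harder to predict and the resulting hash values more scattered. After a resize, the seed may need to be recomputed.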

final boolean initHashSeedAsNeeded(int capacity) {
    boolean currentAltHashing = hashSeed != 0;
    boolean useAltHashing = sun.misc.VM.isBooted() &&
            (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
    boolean switching = currentAltHashing ^ useAltHashing;
    if (switching) {
        hashSeed = useAltHashing
            ? sun.misc.Hashing.randomHashSeed(this)
            : 0;
    }
    return switching;
}

hashSeed starts at 0, so currentAltHashing is false. For switching to become true, useAltHashing must be true, i.e. the VM has finished booting and capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD.

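Holder.ALTERNATIVE_HASHING_THRESHOLD can be set from the command line when the program is launched (through the jdk.map.althashing.threshold system property).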

Avoiding resizes:

The only way is to estimate, when creating the HashMap, how many entries it will hold, and to initialize it with a suitable capacity.
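One way to turn an estimate into a constructor argument, as a sketch: divide the expected number of entries by the load factor and round up, so that the threshold of the resulting table is at least the expected size. The helper name initialCapacityFor is hypothetical, and the formula is one reasonable choice rather than something prescribed by the JDK.

```java
import java.util.HashMap;
import java.util.Map;

public class PresizeSketch {
    // Smallest capacity whose threshold (capacity * loadFactor)
    // is at least expectedSize. HashMap rounds it up to a power of two.
    static int initialCapacityFor(int expectedSize, float loadFactor) {
        return (int) Math.ceil(expectedSize / loadFactor);
    }

    public static void main(String[] args) {
        int expected = 1000;
        int capacity = initialCapacityFor(expected, 0.75f);
        System.out.println(capacity); // 1334, which HashMap rounds up to 2048

        Map<Integer, String> map = new HashMap<>(capacity);
        for (int i = 0; i < expected; i++) {
            map.put(i, "v" + i); // no resize is expected while filling
        }
        System.out.println(map.size()); // 1000
    }
}
```

Passing 1000 directly would give a table of 1024 with a threshold of 768, so the map would resize once before reaching 1000 entries.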

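Thread-safety problems under concurrency and how to address them:

HashMap is not safe under concurrent use. When several threads operate on the same HashMap at once, differences in their timing can, during a resize, leave a bucket's chain linked into a cycle, after which any operation that walks that chain loops forever.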

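Solutions:

1: Use Hashtable. Every method is synchronized, which guarantees thread safety but greatly reduces throughput.

2: Use ConcurrentHashMap, the thread-safe map from the java.util.concurrent package. It uses finer-grained locking to keep the map safe under multithreaded access.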
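A minimal usage sketch for the second option: two threads insert disjoint key ranges into one ConcurrentHashMap, and every entry is present afterwards. The class name, the fillConcurrently helper and the key ranges are made up for the example; the lambdas require Java 8 or later.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentPutSketch {
    // Two threads each insert perThread distinct keys; returns the final size.
    static int fillConcurrently(final int perThread) {
        final Map<Integer, Integer> map = new ConcurrentHashMap<>();
        Thread first = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                map.put(i, i);
            }
        });
        Thread second = new Thread(() -> {
            for (int i = perThread; i < 2 * perThread; i++) {
                map.put(i, i);
            }
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(10_000)); // 20000
    }
}
```

Running the same two loops against a plain HashMap gives no such guarantee: entries can be lost, and on JDK 7 a thread can get stuck in a cyclic chain.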

Fail-fast:

HashMap<Object, Object> hashMap = new HashMap<>();
hashMap.put("1","1");
hashMap.put("2","2");
for (Object o : hashMap.keySet()) {
    if (o.equals("2")){
        hashMap.remove("2");
    }
}

This code looks fine at first glance, but on JDK 1.7 it throws an exception at runtime.

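This is the fail-fast mechanism. HashMap does not store entries in insertion order, and here "2" is visited first and removed. The removal increments modCount, the modification counter. Once modCount no longer matches the value the iterator recorded when it was created (expectedModCount), the next call to next() throws ConcurrentModificationException.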

What this loop compiles to:

while(i$.hasNext()) {
    Object o = i$.next();
    if (o.equals("2")) {
        hashMap.remove("2");
    }
}

The enhanced for loop is just syntactic sugar; after compilation it is still an iterator calling hasNext() and next().

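Looking at the source, next() first compares modCount with expectedModCount and throws the exception if they are not equal. Why throw at all? Again because of concurrency: if one thread is iterating while another is deleting, the iteration is no longer reliable. The check is a protective mechanism.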

final Entry<K,V> nextEntry() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
    Entry<K,V> e = next;
    if (e == null)
        throw new NoSuchElementException();
    if ((next = e.next) == null) {
        Entry[] t = table;
        while (index < t.length && (next = t[index++]) == null)
            ;
    }
    current = e;
    return e;
}
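For completeness, the usual way to delete while iterating is to go through the iterator itself. Iterator.remove() deletes the current entry and keeps the iterator's expected count in step with modCount, so the check in nextEntry() does not fire. The sketch below is an illustration; the class name and the removeKey helper are made up.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class SafeRemovalSketch {
    // Removes the given key while iterating, using the iterator's own remove().
    static Map<String, String> removeKey(Map<String, String> map, String key) {
        Iterator<String> it = map.keySet().iterator();
        while (it.hasNext()) {
            if (it.next().equals(key)) {
                it.remove(); // safe: performed through the iterator
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put("1", "1");
        hashMap.put("2", "2");
        removeKey(hashMap, "2");
        System.out.println(hashMap.keySet()); // [1]
    }
}
```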
