Differences Between HashMap and ArrayMap (Part 1)

Posted 2021-04-24 灰灰的Rom笔记

I spent a whole afternoon reading through how HashMap is implemented, and it seems worth writing down before I forget it.

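How HashMap is built

1. HashMap structure

In HashMap, the data ultimately lives in an array of HashMapEntry objects:

HashMapEntry<K, V>[] table

Each HashMapEntry is in turn a node of a singly linked list, so the overall layout is effectively two-dimensional: an array of buckets, each holding a chain of entries.
Here is the HashMapEntry class.
It has four fields: key, value, the hash, and a reference to the next node.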

static class HashMapEntry<K, V> implements Entry<K, V> {
    final K key;
    V value;
    final int hash;
    HashMapEntry<K, V> next;

    HashMapEntry(K key, V value, int hash, HashMapEntry<K, V> next) {
        this.key = key;
        this.value = value;
        this.hash = hash;
        this.next = next;
    }

    public final K getKey() {
        return key;
    }

    public final V getValue() {
        return value;
    }

    public final V setValue(V value) {
        V oldValue = this.value;
        this.value = value;
        return oldValue;
    }

    @Override public final boolean equals(Object o) {
        if (!(o instanceof Entry)) {
            return false;
        }
        Entry<?, ?> e = (Entry<?, ?>) o;
        return Objects.equal(e.getKey(), key)
                && Objects.equal(e.getValue(), value);
    }

    @Override public final int hashCode() {
        return (key == null ? 0 : key.hashCode()) ^
                (value == null ? 0 : value.hashCode());
    }

    @Override public final String toString() {
        return key + "=" + value;
    }
}

2. Looking up the value for a key

Now that we know HashMap is built on an array, let's look at how it finds the value for a given key.

public V get(Object key) {
    if (key == null) {
        HashMapEntry<K, V> e = entryForNullKey;
        return e == null ? null : e.value;
    }

    int hash = Collections.secondaryHash(key);
    HashMapEntry<K, V>[] tab = table;
    for (HashMapEntry<K, V> e = tab[hash & (tab.length - 1)];
            e != null; e = e.next) {
        K eKey = e.key;
        if (eKey == key || (e.hash == hash && key.equals(eKey))) {
            return e.value;
        }
    }
    return null;
}

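First, a hash value is computed from the key via Collections.secondaryHash(key).

That hash is then ANDed with (array length - 1). Because the array length is always a power of two, the result is an index between 0 and length - 1.

For example, with a hash of 0x0007 (hex) and an array length of 16, the computation is 7 & 15 = 7, so the entry lives in the chain at index 7 of the array.

That raises an obvious question: what about a hash of 0x0017 (hex)? It also yields index 7 (0x17 & 15 = 7), so that entry lands in the same chain at index 7.

This is where combining an array with linked lists pays off: without the chain, entries whose hashes map to the same index would have nowhere to go.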
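The masking step can be tried in isolation. The following is a minimal sketch, not the library's code: the class name BucketIndexDemo and the helper indexFor are made up for illustration, and the hash is passed in directly rather than derived from a key.

```java
// Hypothetical demo class; only the expression hash & (length - 1) comes from the code above.
class BucketIndexDemo {
    // Same masking as tab[hash & (tab.length - 1)]; tableLength is assumed to be a power of two.
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(0x07, 16)); // 7
        System.out.println(indexFor(0x17, 16)); // 7, collides with 0x07
        System.out.println(indexFor(0x17, 32)); // 23, a longer table separates them
    }
}
```

The mask only keeps the low bits of the hash, which is why two hashes that differ only in higher bits share a bucket until the table grows.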

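From the loop in the code above,

for (HashMapEntry<K, V> e = tab[hash & (tab.length - 1)]; e != null; e = e.next)

we can see that the chain is walked in order until an entry with an equal key is found and its value is returned. It also follows that in the most extreme case, where every entry falls into the same bucket, finding a key means walking map.size() entries.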
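To make the worst case concrete, here is a small sketch of a chained table that counts how many nodes a lookup visits. Everything in it is an assumption made for illustration: the class CollisionChainDemo, its Node type, the fixed table of 16 buckets, int keys that double as their own hash, and new entries being prepended to the chain.

```java
// Hypothetical, simplified chained table; not the HashMap source.
class CollisionChainDemo {
    static final class Node {
        final int key;       // in this sketch the key is also its hash
        final String value;
        final Node next;     // next node in the same bucket, or null

        Node(int key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node[] table = new Node[16]; // fixed size, never resized
    private int size;
    int lastProbes; // nodes visited by the most recent get()

    // Assumes the key is not present yet; prepends to the bucket's chain.
    void put(int key, String value) {
        int index = key & (table.length - 1);
        table[index] = new Node(key, value, table[index]);
        size++;
    }

    String get(int key) {
        lastProbes = 0;
        for (Node e = table[key & (table.length - 1)]; e != null; e = e.next) {
            lastProbes++;
            if (e.key == key) {
                return e.value;
            }
        }
        return null;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        CollisionChainDemo map = new CollisionChainDemo();
        // All four keys mask to index 7 in a table of 16.
        for (int key : new int[] {0x07, 0x17, 0x27, 0x37}) {
            map.put(key, "v" + key);
        }
        map.get(0x07); // inserted first, so it sits at the tail of the chain
        System.out.println(map.lastProbes + " of " + map.size()); // 4 of 4
    }
}
```

With well-spread hashes most chains hold zero or one node, so this degenerate case is the exception rather than the norm.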

By the same principle, and leaving array resizing aside, adding an entry follows the same flow: compute the index from the hash, then insert into the chain at that index.

3. Adding a value for a key

First, the source code:

@Override public V put(K key, V value) {
    if (key == null) {
        return putValueForNullKey(value);
    }

    int hash = Collections.secondaryHash(key);
    HashMapEntry<K, V>[] tab = table;
    int index = hash & (tab.length - 1);
    for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
        if (e.hash == hash && key.equals(e.key)) {
            preModify(e);
            V oldValue = e.value;
            e.value = value;
            return oldValue;
        }
    }

    // No entry for (non-null) key is present; create one
    modCount++;
    if (size++ > threshold) {
        tab = doubleCapacity();
        index = hash & (tab.length - 1);
    }
    addNewEntry(key, value, hash, index);
    return null;
}

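The code before the comment "// No entry for (non-null) key is present; create one" is essentially what was described above: compute the index from the hash and walk the corresponding chain. If an entry with the same key already exists, its value is overwritten and the old value is returned.

The part worth a closer look is how the array grows.

A quick note on threshold: when the map is first created, threshold is -1; after the first insertion triggers a resize, threshold becomes 3/4 of the array length. The array is resized when size exceeds the threshold. Note that the check is size++ > threshold, so it compares the size before the new entry is counted.

Resizing is done by the doubleCapacity() method.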

private HashMapEntry<K, V>[] doubleCapacity() {
    HashMapEntry<K, V>[] oldTable = table;
    int oldCapacity = oldTable.length;
    if (oldCapacity == MAXIMUM_CAPACITY) {
        return oldTable;
    }
    int newCapacity = oldCapacity * 2;
    HashMapEntry<K, V>[] newTable = makeTable(newCapacity);
    if (size == 0) {
        return newTable;
    }

    for (int j = 0; j < oldCapacity; j++) {
        /*
         * Rehash the bucket using the minimum number of field writes.
         * This is the most subtle and delicate code in the class.
         */
        HashMapEntry<K, V> e = oldTable[j];
        if (e == null) {
            continue;
        }
        int highBit = e.hash & oldCapacity;
        HashMapEntry<K, V> broken = null;
        newTable[j | highBit] = e;
        for (HashMapEntry<K, V> n = e.next; n != null; e = n, n = n.next) {
            int nextHighBit = n.hash & oldCapacity;
            if (nextHighBit != highBit) {
                if (broken == null)
                    newTable[j | nextHighBit] = n;
                else
                    broken.next = n;
                broken = e;
                highBit = nextHighBit;
            }
        }
        if (broken != null)
            broken.next = null;
    }
    return newTable;
}

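Let's walk through this code.
When resizing, a new array twice the length of the old one is allocated first, and then every chain from the old array is moved into the new one.
The code looks complicated, but the idea behind it is fairly simple.
Each entry's hash is ANDed with the old array length (e.hash & oldCapacity) to extract one extra bit, and that bit, ORed with the old index j, gives the entry's slot in the new array.

Example:
With a hash of 0x0013 (hex) and an original array length of 16, the entry sits at index 3 (19 & 15). After resizing, the array length is 32.
The index in the new array is (19 & 15) | (19 & 16) = 3 | 16 = 19. This is exactly what the line newTable[j | highBit] = e; does.
So the entry ends up at index 19 of the new array.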
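The index arithmetic can be checked without the linked-list bookkeeping. This is a minimal sketch with made-up names (ResizeIndexDemo, indexFor, indexAfterDoubling); it only reproduces the j | highBit computation and does not model how doubleCapacity() relinks the chains.

```java
// Hypothetical demo of the index arithmetic only.
class ResizeIndexDemo {
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    // Old index j, ORed with the single hash bit that the doubled table newly looks at.
    static int indexAfterDoubling(int hash, int oldCapacity) {
        int j = indexFor(hash, oldCapacity);
        int highBit = hash & oldCapacity;
        return j | highBit;
    }

    public static void main(String[] args) {
        int oldCapacity = 16;
        // All four hashes share index 3 in a table of 16.
        for (int hash : new int[] {0x03, 0x13, 0x23, 0x33}) {
            System.out.println(Integer.toHexString(hash) + ": "
                    + indexFor(hash, oldCapacity) + " -> "
                    + indexAfterDoubling(hash, oldCapacity));
        }
        // Prints:
        // 3: 3 -> 3
        // 13: 3 -> 19
        // 23: 3 -> 3
        // 33: 3 -> 19
    }
}
```

An entry from old bucket j therefore either stays at j or moves to j + oldCapacity, and the result is the same as recomputing hash & (newCapacity - 1) from scratch.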

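4. When resizing happens

First, if an initial capacity is specified at construction time and it lies between MINIMUM_CAPACITY and MAXIMUM_CAPACITY (4 and 2^30), the value is rounded to the smallest power of two that is greater than or equal to it. A requested capacity of 19, for example, becomes 32.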

if (capacity < MINIMUM_CAPACITY) {
    capacity = MINIMUM_CAPACITY;
} else if (capacity > MAXIMUM_CAPACITY) {
    capacity = MAXIMUM_CAPACITY;
} else {
    capacity = Collections.roundUpToPowerOfTwo(capacity);
}
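For readers who want to experiment with the rounding, here is a sketch of the same three-way branch. The bit-smearing roundUpToPowerOfTwo below is a common technique used as a stand-in, and the class name CapacityRoundingDemo, the method normalizeCapacity, and the two constant values are assumptions for illustration rather than quotes from the library.

```java
// Hypothetical stand-in for the capacity normalization shown above.
class CapacityRoundingDemo {
    static final int MINIMUM_CAPACITY = 4;       // assumed value
    static final int MAXIMUM_CAPACITY = 1 << 30; // assumed value

    // Smallest power of two >= i, for 1 <= i <= 2^30.
    static int roundUpToPowerOfTwo(int i) {
        i--;            // so that exact powers of two map to themselves
        i |= i >>> 1;   // smear the highest set bit into every lower position
        i |= i >>> 2;
        i |= i >>> 4;
        i |= i >>> 8;
        i |= i >>> 16;
        return i + 1;
    }

    static int normalizeCapacity(int capacity) {
        if (capacity < MINIMUM_CAPACITY) {
            return MINIMUM_CAPACITY;
        } else if (capacity > MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        return roundUpToPowerOfTwo(capacity);
    }

    public static void main(String[] args) {
        System.out.println(normalizeCapacity(19)); // 32
        System.out.println(normalizeCapacity(16)); // 16
        System.out.println(normalizeCapacity(3));  // 4
    }
}
```

Keeping the length a power of two is what lets the lookup use hash & (length - 1) instead of a modulo.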

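Seeing this, the benefit of setting the capacity up front becomes obvious.
If I need a map that holds 96 entries, setting the initial capacity to 128 avoids the three resizes from 16 to 32 to 64 to 128, which saves both the intermediate array allocations and the rehashing work.
Of course, if the map has to grow past 3/4 of that capacity (96 entries for a capacity of 128), the initial capacity needs to be 256, otherwise there will still be one resize. Note that with the size++ > threshold check in put(), that resize fires on the insert made while size is already 97, that is, on the 98th entry.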
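The resize counts can be estimated with a rough simulation. The sketch below follows the size++ > threshold check from put() literally and assumes the threshold is always 3/4 of the current capacity; the class ResizeCountDemo and the method countResizes are hypothetical, and the model ignores the special threshold of -1 on a freshly created map.

```java
// Hypothetical model of the growth policy; it tracks sizes only, not entries.
class ResizeCountDemo {
    // Number of times the table doubles while inserting `inserts` distinct keys.
    static int countResizes(int initialCapacity, int inserts) {
        int capacity = initialCapacity;
        int threshold = capacity / 4 * 3; // assumed: 3/4 of the capacity
        int size = 0;
        int resizes = 0;
        for (int i = 0; i < inserts; i++) {
            // Post-increment: compares the size before this insert is counted.
            if (size++ > threshold) {
                capacity *= 2;
                threshold = capacity / 4 * 3;
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println(countResizes(16, 96));  // 3 (16 -> 32 -> 64 -> 128)
        System.out.println(countResizes(128, 96)); // 0
        System.out.println(countResizes(128, 98)); // 1 (the 98th insert doubles the table)
    }
}
```

Under these assumptions a table of 128 absorbs 97 inserts before doubling, so when the expected size is known, picking the capacity from it avoids every intermediate copy.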

To be continued in the next post: Differences Between HashMap and ArrayMap (Part 2)
