Data Structures - ConcurrentHashMap Step by Step

Posted 2021-02-20 yuanjiangnan

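Introduction
The previous article covered the core methods of ConcurrentHashMap. This final part looks at element counting, resizing, and element migration. First, a quick recap of the earlier chapters: an ordinary node's hash is the hash of its key; a tree bin is headed by a TreeBin, which wraps the root of a red-black tree and maintains that tree, and its hash is -2; a migration node is a ForwardingNode, whose hash is -1. In the core methods, every modification other than a direct write into an array slot locks the bin's head node.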

ConcurrentHashMap: addCount

private final void addCount(long x, int check) {
    CounterCell[] as; long b, s;
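    // If counterCells is null, first try to update baseCount with a CAS.
    // The block below runs when counterCells already exists or when that CAS fails
    // (in the failure case s = b + x has still been assigned)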
    if ((as = counterCells) != null ||
            !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
        CounterCell a; long v; int m;
        boolean uncontended = true;
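        // Fall back to fullAddCount if counterCells is null or empty,
        // or if the cell picked by (thread probe & m) is null
        // (counterCells.length is always a power of two, so & m acts as a modulo),
        // or if the CAS that adds x to that cell fails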
        if (as == null || (m = as.length - 1) < 0 ||
                (a = as[ThreadLocalRandom.getProbe() & m]) == null ||
                !(uncontended =
                        U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
            fullAddCount(x, uncontended);
            return;
        }
        // check is the bin count passed by the caller; at <= 1 skip the resize check
        if (check <= 1)
            return;
        // Recompute the total count
        s = sumCount();
    }
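    // check >= 0: the caller wants a resize check
    if (check >= 0) {
        Node<K,V>[] tab, nt; int n, sc;
        // Loop while the element count has reached the sizeCtl threshold,
        // the table is non-null, and its length is below 1 << 30.
        // At this point sc holds the current sizeCtl value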
        while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
                (n = tab.length) < MAXIMUM_CAPACITY) {
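            // Generate a resize stamp for table length n
            int rs = resizeStamp(n);
            // sc < 0 means a resize is already running
            // (never the case for the thread that starts it)
            if (sc < 0) {
                // A resize is in progress.
                // If any one of these five conditions holds, this thread must not help migrate
                if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                        sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                        transferIndex <= 0)
                    break;
                // Register as a helper by incrementing sizeCtl
                if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                    // Help migrate elements
                    transfer(tab, nt);
            }
            // Start a resize: set sizeCtl to (rs << 16) + 2, which is always negative
            else if (U.compareAndSwapInt(this, SIZECTL, sc,
                    (rs << RESIZE_STAMP_SHIFT) + 2))
                // Begin migrating elements
                transfer(tab, null);
            // Recount the elements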
            s = sumCount();
        }
    }
}

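The logic here is quite convoluted, so let's take it apart. Start with the element counter counterCells: why is it needed?
Consider what happens under high concurrency when every thread is changing the element count. Adding and removing nodes is efficient, but all threads then compete to update baseCount, and that contention drags performance back down. counterCells exists to remove that bottleneck. Without contention, a thread updates baseCount directly. With contention, some CAS on baseCount is bound to fail, and the failed update cannot simply be dropped, so the losing thread picks a slot in the counterCells array and records its change there by adding x to that cell's value (each cell is a CounterCell object). To get the element count, baseCount and the values of all cells in counterCells are added together. This is the striped counting scheme behind ConcurrentHashMap.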
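
The striped counting idea described above can be sketched in a few lines. This is a simplified illustration, not the JDK code: the class name StripedCounterSketch and its demo method are hypothetical, the cell array has a fixed size, and a random index stands in for the thread probe, whereas the real fullAddCount creates and grows counterCells lazily.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical, simplified striped counter; not the JDK implementation.
class StripedCounterSketch {
    private final AtomicLong base = new AtomicLong(); // plays the role of baseCount
    private final AtomicLongArray cells;              // plays the role of counterCells
    private final int mask;

    StripedCounterSketch(int cellCount) {
        // A power-of-two length lets (random & mask) act as a modulo.
        if (Integer.bitCount(cellCount) != 1)
            throw new IllegalArgumentException("cellCount must be a power of two");
        cells = new AtomicLongArray(cellCount);
        mask = cellCount - 1;
    }

    void add(long x) {
        long b = base.get();
        // Fast path: one CAS attempt on the base value.
        if (base.compareAndSet(b, b + x))
            return;
        // Contended path: record the delta in a randomly chosen cell instead.
        int idx = ThreadLocalRandom.current().nextInt() & mask;
        cells.addAndGet(idx, x);
    }

    long sum() {
        // Total = base + every cell, mirroring sumCount().
        long s = base.get();
        for (int i = 0; i < cells.length(); i++)
            s += cells.get(i);
        return s;
    }

    // Starts `threads` threads that each add 1 `perThread` times, then returns the sum.
    static long demo(int threads, int perThread) {
        StripedCounterSketch c = new StripedCounterSketch(8);
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            ts[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) c.add(1);
            });
            ts[t].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c.sum();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 100_000));
    }
}
```

Once all writer threads have finished, sum() is exact because every delta landed either in base or in exactly one cell. While writers are still running, the sum is only a moving estimate, which is also true of sumCount().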

ConcurrentHashMap: counting elements and where the count is used

public int size() {
    // Return the current element count, clamped to the int range
    long n = sumCount();
    return ((n < 0L) ? 0 :
            (n > (long)Integer.MAX_VALUE) ? Integer.MAX_VALUE :
                    (int)n);
}
final long sumCount() {
    CounterCell[] as = counterCells; CounterCell a;
    long sum = baseCount;
    if (as != null) {
        // Iterate over all counter cells
        for (int i = 0; i < as.length; ++i) {
            // Only non-null cells contribute to the sum
            if ((a = as[i]) != null)
                sum += a.value;
        }
    }
    return sum;
}
// The map is empty when the count is zero (or transiently negative)
public boolean isEmpty() {
    return sumCount() <= 0L;
}
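
As a small usage sketch of the methods above: size() clamps the count to an int, while the public mappingCount() method returns the same count as a long. The class name SizeUsageSketch and its fill helper are hypothetical names used only for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class: fills a map from several threads, then reads the counts.
class SizeUsageSketch {
    // Each thread inserts `perThread` distinct keys, so the final count is threads * perThread.
    static ConcurrentHashMap<Integer, Integer> fill(int threads, int perThread) {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int offset = t * perThread;
            ts[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) map.put(offset + i, i);
            });
            ts[t].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<Integer, Integer> map = fill(4, 1000);
        System.out.println(map.size());         // int, clamped at Integer.MAX_VALUE
        System.out.println(map.mappingCount()); // long, not clamped
        System.out.println(map.isEmpty());
    }
}
```

The counts are read only after every writer has been joined. Read during concurrent updates, they are estimates, since sumCount() adds up baseCount and the cells without taking a lock.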

ConcurrentHashMap: resizing

// nextTab is null for the thread that starts the resize
private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) {
    int n = tab.length, stride;
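    // stride is the number of slots a thread claims per step.
    // With more than one CPU it is (length / 8) / NCPU, otherwise the whole length;
    // if the result is below 16, 16 is used instead.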
    if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
        stride = MIN_TRANSFER_STRIDE;
    // nextTab is null (call from the initiating thread)
    if (nextTab == null) {
        try {
            // New array, twice the old length
            Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n << 1];
            nextTab = nt;
        } catch (Throwable ex) {
            // Allocation failed (for example OutOfMemoryError); give up on resizing
            sizeCtl = Integer.MAX_VALUE;
            return;
        }
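        // Publish the new array as nextTable
        nextTable = nextTab;
        // Set the transfer index to the length of the old table
        transferIndex = n;
    }
    // Length of the new array
    int nextn = nextTab.length;
    // Create the forwarding node used as a placeholder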
    ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab);
    boolean advance = true;
    boolean finishing = false;
    for (int i = 0, bound = 0;;) {
        Node<K,V> f; int fh;
        // Claim a range of slots to migrate
        while (advance) {
            int nextIndex, nextBound;
            if (--i >= bound || finishing)
                advance = false;
            else if ((nextIndex = transferIndex) <= 0) {
                i = -1;
                advance = false;
            }
            // CAS transferIndex down by one stride, leaving the remaining range for other threads
            else if (U.compareAndSwapInt
                    (this, TRANSFERINDEX, nextIndex,
                            nextBound = (nextIndex > stride ?
                                    nextIndex - stride : 0))) {
                bound = nextBound;
                i = nextIndex - 1;
                advance = false;
            }
        }
        // i < 0, or
        // i >= tab.length, or
        // i + tab.length >= nextTable.length
        if (i < 0 || i >= n || i + n >= nextn) {
            int sc;
            // The resize is complete
            if (finishing) {
                // Clear nextTable, publish the new table, set the new threshold (0.75 * 2n)
                nextTable = null;
                table = nextTab;
                sizeCtl = (n << 1) - (n >>> 1);
                return;
            }
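            // Not finishing yet: this thread has no more ranges, so CAS sizeCtl down by one
            if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
                // If the old value was not (stamp << 16) + 2, other threads are still migrating,
                // so this thread simply exits
                if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
                    return;
                // This was the last thread: mark finishing and recheck the whole table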
                finishing = advance = true;
                i = n;
            }
        }
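        // Old slot is empty: CAS the forwarding node into it
        else if ((f = tabAt(tab, i)) == null)
            advance = casTabAt(tab, i, null, fwd);
        // Head of the old slot is already a ForwardingNode: skip it
        else if ((fh = f.hash) == MOVED)
            advance = true;
        // Migrate this slot
        else {
            // Lock the head node
            synchronized (f) {
                // Check that the head node has not changed
                if (tabAt(tab, i) == f) {
                    Node<K,V> ln, hn;
                    // hash >= 0: an ordinary linked-list bin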
                    if (fh >= 0) {
                        int runBit = fh & n;
                        Node<K,V> lastRun = f;
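                        // Walk the list to find lastRun, the start of the trailing run
                        // of nodes that all land in the same new slot
                        for (Node<K,V> p = f.next; p != null; p = p.next) {
                            int b = p.hash & n;
                            // Run bit changed: a new run starts here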
                            if (b != runBit) {
                                runBit = b;
                                lastRun = p;
                            }
                        }
                        // High bit is 0: the trailing run stays at index i
                        if (runBit == 0) {
                            ln = lastRun;
                            hn = null;
                        }
                        // High bit is 1: the trailing run moves to index i + n
                        else {
                            hn = lastRun;
                            ln = null;
                        }
                        // Build the two new lists from the nodes before lastRun
                        for (Node<K,V> p = f; p != lastRun; p = p.next) {
                            int ph = p.hash; K pk = p.key; V pv = p.val;
                            if ((ph & n) == 0)
                                ln = new Node<K,V>(ph, pk, pv, ln);
                            else
                                hn = new Node<K,V>(ph, pk, pv, hn);
                        }
                        // Store both lists in the new table
                        setTabAt(nextTab, i, ln);
                        setTabAt(nextTab, i + n, hn);
                        // Put the forwarding placeholder in the old slot
                        setTabAt(tab, i, fwd);
                        // This slot is done
                        advance = true;
                    }
                    // Head node is a TreeBin
                    else if (f instanceof TreeBin) {
                        TreeBin<K,V> t = (TreeBin<K,V>)f;
                        TreeNode<K,V> lo = null, loTail = null;
                        TreeNode<K,V> hi = null, hiTail = null;
                        int lc = 0, hc = 0;
                        // Walk the tree's node list and split it into two lists (lo, hi)
                        for (Node<K,V> e = t.first; e != null; e = e.next) {
                            int h = e.hash;
                            TreeNode<K,V> p = new TreeNode<K,V>
                                    (h, e.key, e.val, null, null);
                            if ((h & n) == 0) {
                                if ((p.prev = loTail) == null)
                                    lo = p;
                                else
                                    loTail.next = p;
                                loTail = p;
                                ++lc;
                            }
                            else {
                                if ((p.prev = hiTail) == null)
                                    hi = p;
                                else
                                    hiTail.next = p;
                                hiTail = p;
                                ++hc;
                            }
                        }
                        // A half with at most 6 nodes (UNTREEIFY_THRESHOLD) goes back to a linked list
                        ln = (lc <= UNTREEIFY_THRESHOLD) ? untreeify(lo) :
                                (hc != 0) ? new TreeBin<K,V>(lo) : t;
                        hn = (hc <= UNTREEIFY_THRESHOLD) ? untreeify(hi) :
                                (lc != 0) ? new TreeBin<K,V>(hi) : t;
                        // Store both halves in the new table
                        setTabAt(nextTab, i, ln);
                        setTabAt(nextTab, i + n, hn);
                        // Put the forwarding placeholder in the old slot
                        setTabAt(tab, i, fwd);
                        // This slot is done
                        advance = true;
                    }
                }
            }
        }
    }
}
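
The lo/hi split in transfer relies on one property of power-of-two tables: when the length doubles from n to 2n, a node in old slot i ends up in slot i or in slot i + n, depending only on the bit (hash & n). A minimal sketch of that arithmetic follows; the class and method names are hypothetical.

```java
// Hypothetical helper illustrating where a node lands after the table doubles.
class SplitIndexSketch {
    // n is the old table length (a power of two); returns the index in the new table.
    static int newIndex(int hash, int n) {
        int i = hash & (n - 1);             // index in the old table
        return (hash & n) == 0 ? i : i + n; // "lo" list stays at i, "hi" list moves to i + n
    }

    public static void main(String[] args) {
        int n = 16;
        for (int hash : new int[] {5, 21, 37, 53}) {
            // The split result always equals a direct index computation on the new length.
            System.out.println(hash + " -> " + newIndex(hash, n)
                    + " (direct: " + (hash & (2 * n - 1)) + ")");
        }
    }
}
```

Because the target slot depends on a single bit, one pass over a bin is enough to partition it, and no node ever needs to move to a slot other than these two.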

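Resizing here differs considerably from HashMap, because several threads can cooperate on one migration. On a single-core machine the stride equals the table length, so one thread claims every slot in a single step; on a multi-core machine each thread claims slots in batches of at least 16 (MIN_TRANSFER_STRIDE).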
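
The stride rule and the way ranges are handed out can be tried in isolation. The sketch below copies the stride expression from transfer and simulates claiming ranges with an AtomicInteger standing in for transferIndex; the class and method names are hypothetical, and the simulation runs on one thread only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of stride selection and range claiming during a resize.
class StrideSketch {
    static final int MIN_TRANSFER_STRIDE = 16;

    // Same expression as in transfer(): (n / 8) / ncpu on multi-core, n on single-core, floor of 16.
    static int stride(int n, int ncpu) {
        int s = (ncpu > 1) ? (n >>> 3) / ncpu : n;
        return Math.max(s, MIN_TRANSFER_STRIDE);
    }

    // Claims ranges from the top of the table downward until none are left.
    // Each entry is {bound, lastIndex}, both inclusive.
    static List<int[]> claimAll(int n, int stride) {
        AtomicInteger transferIndex = new AtomicInteger(n);
        List<int[]> ranges = new ArrayList<>();
        int next;
        while ((next = transferIndex.get()) > 0) {
            int bound = next > stride ? next - stride : 0;
            // The CAS can only fail under contention; a loser would re-read and retry.
            if (transferIndex.compareAndSet(next, bound))
                ranges.add(new int[] {bound, next - 1});
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(stride(1024, 4));
        for (int[] r : claimAll(64, 16))
            System.out.println(r[0] + ".." + r[1]);
    }
}
```

Ranges are claimed from the high end of the old table toward index 0, which matches the way transfer walks i downward inside each claimed range.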

ConcurrentHashMap: resize triggered by list-to-tree conversion

private final void tryPresize(int size) {
    // The caller has already doubled size before passing it in; clamp it to a valid capacity
    int c = (size >= (MAXIMUM_CAPACITY >>> 1)) ? MAXIMUM_CAPACITY :
            tableSizeFor(size + (size >>> 1) + 1);
    int sc;
    while ((sc = sizeCtl) >= 0) {
        Node<K,V>[] tab = table; int n;
        // Table not allocated yet
        if (tab == null || (n = tab.length) == 0) {
            n = (sc > c) ? sc : c;
            // CAS sizeCtl to -1 to claim the initialization
            if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
                try {
                    // Initialize the table
                    if (table == tab) {
                        Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                        table = nt;
                        sc = n - (n >>> 2);
                    }
                } finally {
                    // Publish the new resize threshold
                    sizeCtl = sc;
                }
            }
        }
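        // No resize needed
        else if (c <= sc || n >= MAXIMUM_CAPACITY)
            break;
        // Proceed only if table has not changed
        else if (tab == table) {
            // What follows mirrors the second half of addCount
            int rs = resizeStamp(n);
            // sizeCtl below zero (set negative by a resizing thread);
            // note that the loop condition sc >= 0 means sc cannot be negative at this point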
            if (sc < 0) {
                Node<K,V>[] nt;
                if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                        sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                        transferIndex <= 0)
                    break;
                // Help migrate elements
                if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                    transfer(tab, nt);
            }
            // Mark the start of migration in sizeCtl
            else if (U.compareAndSwapInt(this, SIZECTL, sc,
                    (rs << RESIZE_STAMP_SHIFT) + 2))
                transfer(tab, null);
        }
    }
}
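
Both addCount and tryPresize start a resize by writing (rs << RESIZE_STAMP_SHIFT) + 2 into sizeCtl. The sketch below reproduces that arithmetic with the JDK 8 constant values to show why the result is negative and how the low 16 bits track the resizing threads. The class name and the helpers initialSizeCtl and activeResizers are hypothetical names for this illustration.

```java
// Hypothetical sketch of the sizeCtl encoding used while a resize is running.
class ResizeStampSketch {
    static final int RESIZE_STAMP_BITS = 16;
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    // Stamp for table length n: leading-zero count of n with bit 15 set.
    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    // Value written to sizeCtl by the thread that starts the resize.
    static int initialSizeCtl(int n) {
        return (resizeStamp(n) << RESIZE_STAMP_SHIFT) + 2;
    }

    // The low 16 bits hold (number of resizing threads + 1).
    static int activeResizers(int sc) {
        return (sc & 0xffff) - 1;
    }

    public static void main(String[] args) {
        int n = 16;
        int sc = initialSizeCtl(n);
        System.out.println(Integer.toHexString(sc));
        System.out.println(sc < 0);                                        // bit 15 of the stamp becomes the sign bit
        System.out.println((sc >>> RESIZE_STAMP_SHIFT) == resizeStamp(n)); // the stamp check used by helpers
        System.out.println(activeResizers(sc));                            // the initiating thread
        System.out.println(activeResizers(sc + 1));                        // after one helper has joined
    }
}
```

This also explains the exit test in transfer: a thread that decrements sizeCtl from exactly (stamp << 16) + 2 knows it was the last one working and performs the final recheck.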
