Can I safely share a variable across threads in C++ using only std::atomic without std::mutex?

Posted 2023-02-21

Asked: 2018-01-16 15:34:24

Question:

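I wrote a program that computes prime numbers on multiple cores. (Please ignore the fact that the algorithm is not very efficient and that the numbers 0 and 1 are treated as primes here. The purpose is only to practice using threads.)

The variable taken (the next number to test) is shared among 8 threads.

The problem is that it can be incremented by one thread and immediately incremented again by another, and then read by both after it has already been incremented twice (or more), so some values can be skipped, which is bad.

I thought this could be solved by using std::atomic_uint as the variable's type, but I was apparently wrong.

Is there any way to solve this without using std::mutex, since I have heard it causes considerable overhead? Source code: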

#include <iostream>
#include <chrono>
#include <vector>
#include <algorithm>
#include <thread>
#include <atomic>

int main()
{
    const uint MAX = 1000;

    std::vector<bool> isPrime(MAX), done(MAX);
    std::fill(done.begin(), done.end(), false);
    std::atomic_uint taken{0}; //shared variable
    std::vector<std::thread> threads;
    auto start = std::chrono::system_clock::now();

    for (uint i = 0; i < 8; ++i) {
        threads.emplace_back(
            [&]() {
                bool res;
                for (uint tested; (tested = taken.fetch_add(1)) < MAX; ) { //taken should be incremented and copied atomically
                    res = true;
                    for (uint k = 2; k < tested; ++k) {
                        if (tested % k == 0) {
                            res = false;
                            break;
                        }
                    }
                    isPrime[tested] = res;
                    done[tested] = true;
                }
            }
        );
    }
    for (auto & t : threads) {
        t.join();
    }

    auto end = std::chrono::system_clock::now();
    auto milliseconds = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
    uint num = std::count_if(isPrime.begin(), isPrime.end(), [](bool b){return b;});
    uint nDone = std::count_if(done.begin(), done.end(), [](bool b){return !b;});
    std::cout << "number: " << num << " duration: " << milliseconds.count() << '\n';
    std::cout << "not done: " << nDone << '\n';
    for (uint i = 0; i < MAX; ++i) { //Some numbers are always skipped
        if (!done[i]) {
            std::cout << i << ", ";
        }
    }
    std::cout << '\n';
    return 0;
}

The code was compiled with g++ using the -O3 and -pthread flags. Output:

number: 169 duration: 1
not done: 23
143, 156, 204, 206, 207, 327, 328, 332, 334, 392, 393, 396, 502, 637, 639, 671, 714, 716, 849, 934, 935, 968, 969,

The output is different on every run.

Comments:

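Beware of std::vector<bool>: it gives a race condition even if overlapping indexes are never accessed...

The loop looks correct to me. Each time taken is incremented, the value it fetched (tested) is processed. I would guess the problem is a data race caused by std::vector<bool>. Try changing it to std::vector<char> and see what happens.

@jameslarge I disagree. You can access different memory locations from multiple threads without synchronization. Different elements of a vector<char> or vector<int> are different memory locations. You only need to make sure you never update elements with the same index, and that is guaranteed in the OP's code.

Just replace std::vector<bool> isPrime(MAX), isDone(MAX); with std::array<bool, MAX> isPrime, isDone;. You will fix your data race and get a nice speed-up for free. Demo: coliru.stacked-crooked.com/a/36110afafdea0ee8

Possible duplicate of "Write concurrently vector<bool>"

Answer 1: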

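The specialization std::vector<bool> packs its values into single bits. Several vector elements therefore live in one byte, that is, in a single memory location. As a result, your threads update the same memory location without synchronization, which is a data race (and therefore undefined behavior according to the standard).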

Try changing std::vector<bool> to std::vector<char>.

Comments:

Changing std::vector<bool> to std::vector<char> solved it. Thank you very much.
