std::async causes deadlock?
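[Title]: std::async causes deadlock? [Posted]: 2019-12-31 15:36:40 [Question]: I tried to use std::async in a heavy-workload application to improve performance, but I run into a deadlock from time to time. I debugged for a long time, and I am almost sure that my code is fine and that the problem is in the std library.
So I wrote a simple test program to demonstrate it: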
#include <iostream>
#include <vector>
#include <algorithm>
#include <numeric>
#include <future>
#include <string>
#include <mutex>
#include <unistd.h>
#include <atomic>
#include <iomanip>

std::atomic_long numbers[6];

void add(std::atomic_long& n)
{
    ++n;
}

void func2(std::atomic_long& n)
{
    for (auto i = 0L; i < 1000000000000L; ++i)
    {
        std::async(std::launch::async, [&] { add(n); }); // Small task, I want to run them simultaneously
    }
}

int main()
{
    std::vector<std::future<void>> results;
    for (int i = 0; i < 6; ++i)
    {
        auto& n = numbers[i];
        results.push_back(std::async(std::launch::async, [&n] { func2(n); }));
    }
    // Print all six counters once per second.
    while (true)
    {
        sleep(1);
        for (int i = 0; i < 6; ++i)
            std::cout << std::setw(20) << numbers[i] << " ";
        std::cout << std::endl;
    }
    // Never reached: the loop above does not terminate.
    for (auto& r : results)
        r.wait();
    return 0;
}
This program produces output like this:
763700 779819 754005 763287 767713 748994
768822 785172 759678 769393 772956 754469
773529 789382 763524 772704 776398 757864
778560 794419 768580 777507 781542 762991
782056 795578 771704 780554 784865 766162
801633 812610 788111 802617 803661 784894
After some time (minutes or hours), if a deadlock occurs, the output looks like this:
4435337 4452421 4507907 4501378 2549550 4462899
4441213 4457648 4514424 4506626 2549550 4468019
4446301 4462675 4519272 4511889 2549550 4473266
4453940 4470304 4526382 4519513 2549550 4480872
4461095 4477708 4533272 4526901 2549550 4488313
4470974 4488287 4543442 4537286 2549550 4498733
The fifth column is frozen.
After a day, it looks like this:
23934912 23967635 24007250 23931203 2549550 3249788689
23934912 23967635 24007250 23931203 2549550 3249816818
23934912 23967635 24007250 23931203 2549550 3249835009
23934912 23967635 24007250 23931203 2549550 3249860262
23934912 23967635 24007250 23931203 2549550 3249894331
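Every column except the last one is frozen. That looks very strange.
I ran it on Linux, macOS and FreeBSD, and the results are:
macOS: 10.15.2, Clang: 11.0.0, no deadlock
FreeBSD: 12.0, Clang: 6.0.1, deadlock
Linux: ubuntu 5.0.0-37, g++: 7.4.0, no deadlock
Linux: ubuntu 4.4.0-21, Clang: 3.8.0, deadlock
In gdb, the call stack is: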
(gdb) thread apply all bt
Thread 10 (LWP 100467 of process 37763):
#0 0x000000080025c630 in ?? () from /lib/libthr.so.3
#1 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fff4ad57000
Thread 9 (LWP 100464 of process 37763):
#0 0x000000080046fafa in _umtx_op () from /lib/libc.so.7
#1 0x0000000800264912 in ?? () from /lib/libthr.so.3
#2 0x000000080031f9f9 in std::__1::mutex::unlock() () from /usr/lib/libc++.so.1
#3 0x00000008002e8f55 in std::__1::__assoc_sub_state::set_value() () from /usr/lib/libc++.so.1
#4 0x00000000002053e1 in std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >::__execute() ()
#5 0x0000000000205763 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >::*)(), std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >*> >(void*) ()
#6 0x000000080025c776 in ?? () from /lib/libthr.so.3
#7 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fff6944a000
Thread 8 (LWP 100431 of process 37763):
#0 0x000000080046fafa in _umtx_op () from /lib/libc.so.7
#1 0x0000000800264912 in ?? () from /lib/libthr.so.3
#2 0x000000080031f9f9 in std::__1::mutex::unlock() () from /usr/lib/libc++.so.1
#3 0x00000008002e8f55 in std::__1::__assoc_sub_state::set_value() () from /usr/lib/libc++.so.1
#4 0x00000000002053e1 in std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >::__execute() ()
#5 0x0000000000205763 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >::*)(), std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >*> >(void*) ()
#6 0x000000080025c776 in ?? () from /lib/libthr.so.3
#7 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffc371a000
Thread 7 (LWP 100657 of process 37763):
#0 0x000000080026a66c in ?? () from /lib/libthr.so.3
#1 0x000000080025e731 in ?? () from /lib/libthr.so.3
#2 0x0000000800268388 in ?? () from /lib/libthr.so.3
#3 0x000000080032de72 in std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) () from /usr/lib/libc++.so.1
#4 0x00000008002e971b in std::__1::__assoc_sub_state::wait() () from /usr/lib/libc++.so.1
#5 0x0000000000205389 in std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >::__on_zero_shared() ()
#6 0x000000000020346b in func2(std::__1::atomic<long>&) ()
#7 0x0000000000206f18 in main::$_1::operator()() const ()
#8 0x0000000000206eed in void std::__1::__async_func<main::$_1>::__execute<>(std::__1::__tuple_indices<>) ()
#9 0x0000000000206ea5 in std::__1::__async_func<main::$_1>::operator()() ()
#10 0x0000000000206df3 in std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::__execute() ()
#11 0x0000000000207183 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::*)(), std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >*> >(void*) ()
#12 0x000000080025c776 in ?? () from /lib/libthr.so.3
#13 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdf5f9000
Thread 6 (LWP 100656 of process 37763):
#0 0x000000080026a66c in ?? () from /lib/libthr.so.3
#1 0x000000080025e731 in ?? () from /lib/libthr.so.3
#2 0x0000000800268388 in ?? () from /lib/libthr.so.3
#3 0x000000080032de72 in std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) () from /usr/lib/libc++.so.1
#4 0x00000008002e971b in std::__1::__assoc_sub_state::wait() () from /usr/lib/libc++.so.1
#5 0x0000000000205389 in std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >::__on_zero_shared() ()
#6 0x0000000000207a22 in std::__1::__release_shared_count::operator()(std::__1::__shared_count*) ()
#7 0x00000000002044f4 in std::__1::future<void> std::__1::__make_async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >(std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0>&&) ()
#8 0x00000000002035ea in std::__1::future<std::__1::__invoke_of<std::__1::decay<func2(std::__1::atomic<long>&)::$_0>::type>::type> std::__1::async<func2(std::__1::atomic<long>&)::$_0>(std::__1::launch, func2(std::__1::atomic<long>&)::$_0&&) ()
#9 0x0000000000203462 in func2(std::__1::atomic<long>&) ()
#10 0x0000000000206f18 in main::$_1::operator()() const ()
#11 0x0000000000206eed in void std::__1::__async_func<main::$_1>::__execute<>(std::__1::__tuple_indices<>) ()
#12 0x0000000000206ea5 in std::__1::__async_func<main::$_1>::operator()() ()
#13 0x0000000000206df3 in std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::__execute() ()
#14 0x0000000000207183 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::*)(), std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >*> >(void*) ()
#15 0x000000080025c776 in ?? () from /lib/libthr.so.3
#16 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdf7fa000
Thread 5 (LWP 100655 of process 37763):
#0 0x000000080026a66c in ?? () from /lib/libthr.so.3
#1 0x000000080025e731 in ?? () from /lib/libthr.so.3
#2 0x0000000800268388 in ?? () from /lib/libthr.so.3
#3 0x000000080032de72 in std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) () from /usr/lib/libc++.so.1
#4 0x00000008002e971b in std::__1::__assoc_sub_state::wait() () from /usr/lib/libc++.so.1
#5 0x0000000000205389 in std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >::__on_zero_shared() ()
#6 0x0000000000207a22 in std::__1::__release_shared_count::operator()(std::__1::__shared_count*) ()
#7 0x00000000002044f4 in std::__1::future<void> std::__1::__make_async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >(std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0>&&) ()
#8 0x00000000002035ea in std::__1::future<std::__1::__invoke_of<std::__1::decay<func2(std::__1::atomic<long>&)::$_0>::type>::type> std::__1::async<func2(std::__1::atomic<long>&)::$_0>(std::__1::launch, func2(std::__1::atomic<long>&)::$_0&&) ()
#9 0x0000000000203462 in func2(std::__1::atomic<long>&) ()
#10 0x0000000000206f18 in main::$_1::operator()() const ()
#11 0x0000000000206eed in void std::__1::__async_func<main::$_1>::__execute<>(std::__1::__tuple_indices<>) ()
#12 0x0000000000206ea5 in std::__1::__async_func<main::$_1>::operator()() ()
#13 0x0000000000206df3 in std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::__execute() ()
#14 0x0000000000207183 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::*)(), std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >*> >(void*) ()
#15 0x000000080025c776 in ?? () from /lib/libthr.so.3
#16 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdf9fb000
Thread 4 (LWP 100654 of process 37763):
#0 0x000000080026a66c in ?? () from /lib/libthr.so.3
#1 0x000000080025e731 in ?? () from /lib/libthr.so.3
#2 0x0000000800268388 in ?? () from /lib/libthr.so.3
#3 0x000000080032de72 in std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) () from /usr/lib/libc++.so.1
#4 0x00000008002e971b in std::__1::__assoc_sub_state::wait() () from /usr/lib/libc++.so.1
#5 0x0000000000205389 in std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >::__on_zero_shared() ()
#6 0x0000000000207a22 in std::__1::__release_shared_count::operator()(std::__1::__shared_count*) ()
#7 0x00000000002044f4 in std::__1::future<void> std::__1::__make_async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >(std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0>&&) ()
#8 0x00000000002035ea in std::__1::future<std::__1::__invoke_of<std::__1::decay<func2(std::__1::atomic<long>&)::$_0>::type>::type> std::__1::async<func2(std::__1::atomic<long>&)::$_0>(std::__1::launch, func2(std::__1::atomic<long>&)::$_0&&) ()
#9 0x0000000000203462 in func2(std::__1::atomic<long>&) ()
#10 0x0000000000206f18 in main::$_1::operator()() const ()
#11 0x0000000000206eed in void std::__1::__async_func<main::$_1>::__execute<>(std::__1::__tuple_indices<>) ()
#12 0x0000000000206ea5 in std::__1::__async_func<main::$_1>::operator()() ()
#13 0x0000000000206df3 in std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::__execute() ()
#14 0x0000000000207183 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::*)(), std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >*> >(void*) ()
#15 0x000000080025c776 in ?? () from /lib/libthr.so.3
#16 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfbfc000
Thread 3 (LWP 100653 of process 37763):
#0 0x000000080026a66c in ?? () from /lib/libthr.so.3
#1 0x000000080025e731 in ?? () from /lib/libthr.so.3
#2 0x0000000800268388 in ?? () from /lib/libthr.so.3
#3 0x000000080032de72 in std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) () from /usr/lib/libc++.so.1
#4 0x00000008002e971b in std::__1::__assoc_sub_state::wait() () from /usr/lib/libc++.so.1
#5 0x0000000000205389 in std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >::__on_zero_shared() ()
#6 0x0000000000207a22 in std::__1::__release_shared_count::operator()(std::__1::__shared_count*) ()
#7 0x00000000002044f4 in std::__1::future<void> std::__1::__make_async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >(std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0>&&) ()
#8 0x00000000002035ea in std::__1::future<std::__1::__invoke_of<std::__1::decay<func2(std::__1::atomic<long>&)::$_0>::type>::type> std::__1::async<func2(std::__1::atomic<long>&)::$_0>(std::__1::launch, func2(std::__1::atomic<long>&)::$_0&&) ()
#9 0x0000000000203462 in func2(std::__1::atomic<long>&) ()
#10 0x0000000000206f18 in main::$_1::operator()() const ()
#11 0x0000000000206eed in void std::__1::__async_func<main::$_1>::__execute<>(std::__1::__tuple_indices<>) ()
#12 0x0000000000206ea5 in std::__1::__async_func<main::$_1>::operator()() ()
#13 0x0000000000206df3 in std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::__execute() ()
#14 0x0000000000207183 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::*)(), std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >*> >(void*) ()
#15 0x000000080025c776 in ?? () from /lib/libthr.so.3
#16 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfdfd000
Thread 2 (LWP 100652 of process 37763):
#0 0x000000080026a66c in ?? () from /lib/libthr.so.3
#1 0x000000080025e731 in ?? () from /lib/libthr.so.3
#2 0x0000000800268388 in ?? () from /lib/libthr.so.3
#3 0x000000080032de72 in std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) () from /usr/lib/libc++.so.1
#4 0x00000008002e971b in std::__1::__assoc_sub_state::wait() () from /usr/lib/libc++.so.1
#5 0x0000000000205389 in std::__1::__async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >::__on_zero_shared() ()
#6 0x0000000000207a22 in std::__1::__release_shared_count::operator()(std::__1::__shared_count*) ()
#7 0x00000000002044f4 in std::__1::future<void> std::__1::__make_async_assoc_state<void, std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0> >(std::__1::__async_func<func2(std::__1::atomic<long>&)::$_0>&&) ()
#8 0x00000000002035ea in std::__1::future<std::__1::__invoke_of<std::__1::decay<func2(std::__1::atomic<long>&)::$_0>::type>::type> std::__1::async<func2(std::__1::atomic<long>&)::$_0>(std::__1::launch, func2(std::__1::atomic<long>&)::$_0&&) ()
#9 0x0000000000203462 in func2(std::__1::atomic<long>&) ()
#10 0x0000000000206f18 in main::$_1::operator()() const ()
#11 0x0000000000206eed in void std::__1::__async_func<main::$_1>::__execute<>(std::__1::__tuple_indices<>) ()
#12 0x0000000000206ea5 in std::__1::__async_func<main::$_1>::operator()() ()
#13 0x0000000000206df3 in std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::__execute() ()
#14 0x0000000000207183 in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >::*)(), std::__1::__async_assoc_state<void, std::__1::__async_func<main::$_1> >*> >(void*) ()
#15 0x000000080025c776 in ?? () from /lib/libthr.so.3
#16 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdfffe000
Thread 1 (LWP 100148 of process 37763):
#0 0x00000008004f984a in _nanosleep () from /lib/libc.so.7
#1 0x000000080025f17c in ?? () from /lib/libthr.so.3
#2 0x000000080045fe0b in sleep () from /lib/libc.so.7
#3 0x0000000000203b7b in main ()
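It looks like many threads are stuck in std::__1::condition_variable::wait, which does not make sense to me: the test code does not use any condition variable at all.
Can anyone tell me whether I am doing something wrong or whether there is a bug in the std library?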
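For context on why condition_variable::wait shows up even though the test code never uses one: in the backtrace, std::__1::__assoc_sub_state::wait() calls it, so the wait belongs to the future's shared state rather than to user code. The sketch below is a deliberately simplified, hypothetical stand-in for such a shared state (the names SharedState and run_and_wait are made up for illustration and are not libc++ code); it only shows the general wait/notify shape.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical, simplified stand-in for the shared state behind a future.
// Real implementations (such as the __assoc_sub_state frames in the
// backtrace) are more involved; this only illustrates the mechanism.
class SharedState {
public:
    // Called by the worker thread once the task has finished.
    void set_value() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ready_ = true;
        }
        cv_.notify_all();
    }

    // Called by whoever waits for the result.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return ready_; });
    }

    bool ready() {
        std::lock_guard<std::mutex> lock(mutex_);
        return ready_;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool ready_ = false;
};

// Runs a trivial task on a new thread and blocks until it signals completion.
bool run_and_wait() {
    SharedState state;
    std::thread worker([&state] { state.set_value(); });
    state.wait();                     // blocks in condition_variable::wait
    const bool done = state.ready();
    assert(done);                     // wait() only returns once ready_ is set
    worker.join();
    return done;
}
```

A waiter in this sketch can only be released by a matching set_value(), which is the pairing visible in the stuck threads of the backtrace.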
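Thanks. Update: this example does not exactly mimic the actual behavior of my program. I simplified it too much.
Now I have added a vector of futures, which is closer to the real code: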
void func2(std::atomic_long& n)
{
    std::vector<std::future<void>> rs;
    for (auto i = 0L; i < 1000000000000L; ++i)
    {
        rs.push_back(std::async(std::launch::async, [&] { add(n); }));
    }
    for (auto& r : rs)
        r.wait();
}
But it still gives the same result. On macOS, there is no problem:
29693311 29904143 29994992 29856976 30020535 29832796
29709344 29917687 30005488 29875611 30039727 29848932
29725334 29930826 30019428 29892350 30056678 29866293
29737403 29948258 30036760 29904964 30074102 29883648
29746597 29965134 30050115 29914459 30086189 29900767
29761543 29977363 30066833 29929475 30101723 29915059
29777678 29993381 30084101 29949095 30117847 29926040
29794253 30007301 30102985 29972819 30129613 29939935
On FreeBSD, it hangs again:
34079 29595 38239 508788 30194 41242
34079 29595 38239 509103 30194 41242
34079 29595 38239 509583 30194 41242
34079 29595 38239 509808 30194 41242
34079 29595 38239 510187 30194 41242
34079 29595 38239 510543 30194 41242
34079 29595 38239 510932 30194 41242
34079 29595 38239 511616 30194 41242
34079 29595 38239 512111 30194 41242
34079 29595 38239 512952 30194 41242
34079 29595 38239 514032 30194 41242
34079 29595 38239 514205 30194 41242
34079 29595 38239 514577 30194 41242
[Comments]:
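Note that std::async(std::launch::async, [&] { add(n); }); is not asynchronous, because the return value is ignored: the temporary future is destroyed at the end of the statement, and its destructor blocks until the task has finished, as explained here.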
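The effect described in this comment can be checked with a small experiment. The following is a minimal sketch, not taken from the original post: the function names discarded_future_blocks and run_kept_futures and the 50 ms delay are assumptions chosen for illustration. The first function discards the future as the question does; the second keeps the futures so the tasks are free to overlap.

```cpp
#include <atomic>
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Returns true if the task had already finished by the time the std::async
// statement completed, i.e. the call behaved synchronously.
bool discarded_future_blocks() {
    std::atomic<bool> done{false};
    // The returned future is a temporary. Its destructor runs at the end of
    // this statement and blocks until the task has finished.
    // (C++20 compilers may warn here because std::async is [[nodiscard]].)
    std::async(std::launch::async, [&done] {
        std::this_thread::sleep_for(std::chrono::milliseconds(50));
        done = true;
    });
    return done.load();
}

// Keeps every future alive so the tasks may run concurrently, then waits
// for all of them and returns the final counter value.
long run_kept_futures(int tasks) {
    std::atomic<long> counter{0};
    std::vector<std::future<void>> futures;
    for (int i = 0; i < tasks; ++i) {
        futures.push_back(std::async(std::launch::async, [&counter] { ++counter; }));
    }
    for (auto& f : futures) {
        f.wait();
    }
    return counter.load();
}
```

With the discarded future, each call completes before the next one starts, so the loop in func2 runs its small tasks one at a time, paying the cost of task creation on every iteration.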
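This is almost certainly not a bug in the standard library implementation.
Yes, it sounds like you probably just want to start a few threads?
I don't understand why you expect your async workers to keep working forever. They call the provided function, which eventually completes. The deadlock you are seeing may simply be workers that have finished.
@FrançoisAndrieux Those workers should not finish before the counter has been incremented all the way to 1000000000000L.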
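[Answer 1]:
You are not taking the return value of std::async into account: the returned future blocks any further execution until the task you started with std::async has finished. As written, this program does not do what you expect.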
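In addition, you are calling std::async recursively, and it is not guaranteed to spawn a new thread; it may manage a pool, so if the pool is busy your program can obviously freeze, because the loop you are running is very long. If you want more control, you can use std::thread and std::packaged_task.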
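As one possible reading of that suggestion, the sketch below starts a fixed number of long-lived threads, each running a std::packaged_task that does all of its increments in a plain loop, instead of issuing one std::async call per increment. The helper name run_workers and its parameters are hypothetical and only serve the illustration.

```cpp
#include <future>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: starts `workers` threads, each counting up to
// `iterations`, and returns the sum of all per-thread counts.
long run_workers(int workers, long iterations) {
    std::vector<std::future<long>> results;
    std::vector<std::thread> threads;

    for (int i = 0; i < workers; ++i) {
        // Each task does all of its work in one loop on its own thread.
        std::packaged_task<long()> task([iterations] {
            long count = 0;
            for (long k = 0; k < iterations; ++k) {
                ++count;
            }
            return count;
        });
        results.push_back(task.get_future());
        threads.emplace_back(std::move(task));  // the thread owns the task
    }

    long total = 0;
    for (auto& r : results) {
        total += r.get();  // blocks until that worker has finished
    }
    for (auto& t : threads) {
        t.join();
    }
    return total;
}
```

For example, run_workers(6, 1000) uses six threads in total. The thread count is explicit here, and futures obtained from a std::packaged_task do not block in their destructor the way futures returned by std::async do.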
[Comments]:
Thanks. Yes, good point. The program does not work as I expected. In fact, it is completely synchronous. But the point is that if it is synchronous, it should not freeze.