Easy way to run a function multiple times in parallel in C++

Posted 2023-02-22

[Asked]: 2021-12-26 07:58:41 [Question]:

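I would like to know whether there is an easy way to run a function multiple times in parallel. I have tried multithreading, but either there is something I do not understand or it does not actually speed up the computation (in fact it does the opposite). Here is the function I want to run in parallel: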

void heun_update_pos(vector<planet>& planets, vector<double> x_i, vector<double> y_i, vector<double> mass, size_t n_planets, double h, int i)
{
    if (planets[i].mass != 0) {
        double sum_gravity_x = 0;
        double sum_gravity_y = 0;

        //loop for collision check and gravitational contribution
        for (size_t j = 0; j < n_planets; j++) {
            if (planets[j].mass != 0) {
                //computing the distances between two planets in x and y
                double delta_x = planets[i].x_position - x_i[j];
                double delta_y = planets[i].y_position - y_i[j];

                if (delta_x != 0 && delta_y != 0) {
                    //collision test
                    if (collision_test(planets[i], planets[j], delta_x, delta_y) == true) {
                        planets[i].mass += planets[j].mass;
                        planets[j].mass = 0;
                    }

                    //sum of the gravity contributions from other planets
                    sum_gravity_x += gravity_x(delta_x, delta_y, mass[j]);
                    sum_gravity_y += gravity_y(delta_x, delta_y, mass[j]);
                }
            }
        }
        double sx_ip1 = planets[i].x_speed + (h / 2) * sum_gravity_x;
        double sy_ip1 = planets[i].y_speed + (h / 2) * sum_gravity_y;
        double x_ip1 = planets[i].x_position + (h / 2) * (planets[i].x_speed + sx_ip1);
        double y_ip1 = planets[i].y_position + (h / 2) * (planets[i].y_speed + sy_ip1);
        planets[i].update_position(x_ip1, y_ip1, sx_ip1, sy_ip1);
    }
}

This is how I tried to use multithreading:

    const int cores = 6;
    vector<thread> threads(cores);
    int active_threads = 0;
    int closing_threads = 1;

    for (int i = 0; i < n_planets; i++) {

        threads[active_threads] = thread(&heun_update_pos, ref(planets), x_i, y_i, mass, n_planets, h, i);

        if (i > cores - 2) threads[closing_threads].join();

        //There should only be as many threads as there are cores
        closing_threads++;
        if (closing_threads > cores - 1) closing_threads = 0;

        active_threads++; // counting the number of active threads
        if (active_threads >= cores) active_threads = 0;

    }

    //CLOSING REMAINING THREADS
    for (int k = 0; k < cores; k++) {
        if (threads[k].joinable()) threads[k].join();
    }

I only started learning C++ today (I used Python before) and this is my first program, so I am not very familiar with everything C++ offers.

[Comments]:

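How long does your serial version take?

Use a parallel standard algorithm, for example std::for_each(std::execution::par_unseq, ...).

Something feels off about the way you limit and join the active threads. As a test function I suggest a plain sleep(), nothing complicated. Then print some output to the console whenever a thread is started or joined. I think that will show you what is happening.

Maybe you could start with std::async. It may have lower overhead.

[Answer 1]: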

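Creating a new thread is expensive, typically taking 50-100 microseconds. Depending on how long your serial version takes, threading may not help much. If you run this code many times, it is worth trying a thread pool, since waking an existing thread takes at most about 5 microseconds.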
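
To make the cost argument concrete, here is a minimal sketch of one way to pay the thread-creation cost only once per core instead of once per planet: split the index range into contiguous blocks and give each thread a whole block. It is not a full thread pool, and parallel_for_blocks and squares are hypothetical names invented for this illustration; the squaring workload is only a stand-in for the per-planet update.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not from the original post): calls body(i) for every
// i in [0, n), using at most num_threads threads. Each thread handles one
// contiguous block of indices, so only a handful of threads are created.
void parallel_for_blocks(std::size_t n, unsigned num_threads,
                         const std::function<void(std::size_t)>& body) {
    if (n == 0) return;
    if (num_threads == 0) num_threads = 1;
    num_threads = static_cast<unsigned>(std::min<std::size_t>(num_threads, n));
    const std::size_t block = (n + num_threads - 1) / num_threads;  // ceiling division

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (std::size_t begin = 0; begin < n; begin += block) {
        const std::size_t end = std::min(begin + block, n);
        workers.emplace_back([begin, end, &body] {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    // Wait for every block to finish before returning.
    for (std::thread& w : workers) w.join();
}

// Stand-in workload (an assumption for this sketch): every index writes only
// its own output slot, so the threads never touch the same element.
std::vector<double> squares(std::size_t n, unsigned num_threads) {
    std::vector<double> out(n, 0.0);
    parallel_for_blocks(n, num_threads, [&out](std::size_t i) {
        out[i] = static_cast<double>(i) * static_cast<double>(i);
    });
    return out;
}
```

A call such as squares(1000, std::thread::hardware_concurrency()) would use roughly one thread per hardware thread. The sketch relies on each index writing only its own data. The function in the question also sets planets[j].mass in the collision branch, so running it from several threads unchanged would additionally need synchronization or a separate merge step.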

See a similar answer here:

Is there a performance benefit in using a pool of threads over simply creating threads?

OpenMP is a multithreading API that is available for C++ through compiler support. You might consider using it.

https://bisqwit.iki.fi/story/howto/openmp/
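
As a rough illustration of what the OpenMP route could look like, the sketch below parallelizes an outer loop over bodies with a single pragma. The function pairwise_sums and its 1/d^2 term are made up for this example and are not the gravity functions from the question. With GCC or Clang the code needs the -fopenmp flag; without it the pragma is ignored and the loop simply runs serially.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: for each element i, sum a made-up 1/d^2 term over
// every other element j at a different position. Each iteration of the outer
// loop writes only out[i], so the iterations are independent of each other.
std::vector<double> pairwise_sums(const std::vector<double>& x) {
    const long n = static_cast<long>(x.size());
    std::vector<double> out(x.size(), 0.0);

    // OpenMP divides the iterations of this loop among its worker threads.
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < n; ++i) {
        double sum = 0.0;  // private to the iteration, so no locking is needed
        for (long j = 0; j < n; ++j) {
            const double d = x[i] - x[j];
            if (d != 0.0) sum += 1.0 / (d * d);
        }
        out[i] = sum;
    }
    return out;
}
```

The OpenMP runtime typically keeps its worker threads alive between parallel regions, which fits the thread-pool advice above. Whether this gives a speedup still depends on how much work each iteration does.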
