C++: OpenMP parallel loop memory leaks
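【Title】C++: OpenMP parallel loop memory leaks 【Posted】2019-08-03 09:10:01 【Question】Hello everyone, I have run into a rather serious memory leak when using OpenMP in my C++ code. I am writing a library for some geophysical computations, and they are very time-consuming. I created a simple piece of code to give you an idea (the original code is quite long and hopefully not needed for a solution). To avoid rewriting the "same" lines of code over and over, I have some templates that are specialized by means of pointers to member functions (for example, I know the position in space and need to compute different quantities). I am also using the Armadillo library (http://arma.sourceforge.net/) for some of the computations, but the problem persists with or without it.
If the code runs on a single thread, there is no problem. With OpenMP (#pragma directives), however, memory leaks over time: the program effectively consumes all available memory and then crashes. You can reproduce the problem with the code I provide (just change the "size of iterations" from 5000 to a larger value).
I tried replacing the Armadillo vectors with my own arrays, and Armadillo does not appear to be the cause. I ran Valgrind memcheck but am not sure what exactly is going on (two kinds of "errors"):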
149,520,000 bytes in 3,738 blocks are definitely lost in loss record 24 of 25
in Openmp_class::generate_data(unsigned long) in $HOME/Programovanie/Ptr2memOpenMP/openmp_class.cpp:20
1: operator new[](unsigned long) in /builddir/build/BUILD/valgrind-3.15.0/coregrind/m_replacemalloc/vg_replace_malloc.c:433
2: Openmp_class::generate_data(unsigned long) in $HOME/Programovanie/Ptr2memOpenMP/openmp_class.cpp:20
3: main._omp_fn.0 in $HOME/Programovanie/Ptr2memOpenMP/main.cpp:44
4: /usr/lib64/libgomp.so.1.0.0
5: start_thread in /usr/lib64/libpthread-2.29.so
6: clone in /usr/lib64/libc-2.29.so
49,840,000 bytes in 1,246 blocks are definitely lost in loss record 22 of 25
in Openmp_class::multiply_elements() in $HOME/Programovanie/Ptr2memOpenMP/openmp_class.cpp:90
1: operator new[](unsigned long) in /builddir/build/BUILD/valgrind-3.15.0/coregrind/m_replacemalloc/vg_replace_malloc.c:433
2: Openmp_class::multiply_elements() in $HOME/Programovanie/Ptr2memOpenMP/openmp_class.cpp:90
3: main._omp_fn.0 in $HOME/Programovanie/Ptr2memOpenMP/main.cpp:45
4: GOMP_parallel in /usr/lib64/libgomp.so.1.0.0
5: main in $HOME/Programovanie/Ptr2memOpenMP/main.cpp:14
Header file: openmp_class.h
#ifndef OPENMP_CLASS_H
#define OPENMP_CLASS_H
#include <iomanip>
#include <iostream>
#include <cmath>
#include <armadillo>
using namespace std;
class Openmp_class
{
    double* xvec;
    double* yvec;
    size_t size;
public:
    Openmp_class();
    ~Openmp_class();
    void generate_data( size_t n );
    double add_element( size_t n );
    double substract_element( size_t n );
    arma::vec add_elements( size_t upto_n );
    arma::vec multiply_elements( size_t upto_n );
    double *multiply_elements();
};
#endif // OPENMP_CLASS_H
CPP file: openmp_class.cpp
#include "openmp_class.h"
Openmp_class::Openmp_class()
{}
Openmp_class::~Openmp_class()
{
    this->xvec = nullptr;
    this->yvec = nullptr;
    delete [] this->xvec;
    delete [] this->yvec;
}
void Openmp_class::generate_data(size_t n)
{
    this->xvec = new double[n];
    this->yvec = new double[n];
    this->size = n;
    arma::vec xrand = arma::randu<arma::vec>(n);
    arma::vec yrand = arma::randu<arma::vec>(n);
    for (unsigned int i = 0; i < xrand.n_elem; i++)
    {
        this->xvec[i] = xrand(i);
        this->yvec[i] = yrand(i);
    }
    xrand.reset();
    yrand.reset();
}
double Openmp_class::add_element(size_t n)
{
    if ( n < this->size )
    {
        return this->xvec[n] + this->yvec[n];
    }
    else
    {
        string errmsg = "Openmp_class::add_element index n out of bounds!";
        throw runtime_error( errmsg );
    }
}
double Openmp_class::substract_element(size_t n)
{
    if ( n < this->size )
    {
        return this->xvec[n] - this->yvec[n];
    }
    else
    {
        string errmsg = "Openmp_class::substract_element index n out of bounds!";
        throw runtime_error( errmsg );
    }
}
arma::vec Openmp_class::add_elements(size_t upto_n)
{
    if ( upto_n < this->size )
    {
        arma::vec results = arma::zeros<arma::vec>( upto_n );
        for (unsigned int i = 0; i < upto_n; i++ )
            results(i) = this->xvec[i] + this->yvec[i];
        return results;
    }
    else
    {
        string errmsg = "Openmp_class::add_elements index n out of bounds!";
        throw runtime_error( errmsg );
    }
}
arma::vec Openmp_class::multiply_elements(size_t upto_n)
{
    if ( upto_n < this->size )
    {
        arma::vec results = arma::zeros<arma::vec>( upto_n );
        for (unsigned int i = 0; i < upto_n; i++ )
            results(i) = this->xvec[i] * this->yvec[i];
        return results;
    }
    else
    {
        string errmsg = "Openmp_class::add_elements index n out of bounds!";
        throw runtime_error( errmsg );
    }
}
double *Openmp_class::multiply_elements()
{
    double *xy = new double[this->size ];
    for (unsigned int i = 0; i < this->size; i++)
        xy[i] = this->xvec[i] * this->yvec[i];
    return xy;
}
Main file: main.cpp
#include <iostream>
#include <iomanip>
#include <cmath>
#define ARMA_USE_HDF5
#include <armadillo>
#include "openmp_class.h"
using namespace std;
//#define ARMA_OPEN_MP
int main(/*int argc, char *argv[]*/ void)
{
    Openmp_class Myclass;
    Myclass.generate_data( 10 );
#ifdef ARMA_OPEN_MP
    #pragma omp parallel
    {
        #pragma omp for
        for (unsigned int j = 10; j <= 500000; j++)
        {
            arma::vec (Openmp_class::*ptrmem) (size_t) = &Openmp_class::multiply_elements;
            Openmp_class TestClass;
            TestClass.generate_data( 5000 );
            arma::vec x_vec = (TestClass.*ptrmem)(4999);
            ptrmem = nullptr;
        }
        #pragma omp barrier
    }
#else
    #pragma omp parallel
    {
        #pragma omp for
        for (unsigned int j = 10; j <= 500000; j++)
        {
            double* (Openmp_class::*ptre2mltply)() = &Openmp_class::multiply_elements;
            Openmp_class TestClass;
            TestClass.generate_data( 5000 );
            double* x_vec = (TestClass.*ptre2mltply)();
            x_vec = nullptr;
            delete [] x_vec;
            ptre2mltply = nullptr;
        }
        #pragma omp barrier
    }
#endif
    return 1;
}
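Has anyone dealt with this problem? Any suggestions?
Thank you for your time.
P.S. How exactly are pointers to functions (or class members) shared between multiple threads?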
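【Comments】:
In the destructor of Openmp_class you assign nullptr to the pointers and then delete them. Is that a copy-paste error, or am I missing something?
Yes, it is a copy-paste error. Originally (before I rewrote a large chunk of the code) I was using variables such as double**. But it seems to work. I can't believe I made such a mistake.
Just to make sure I understand: changing the corresponding statements in the "real" code solved the problem?
In main() you also have this bit of code: x_vec = nullptr; delete [] x_vec; Just as in your class destructor, shouldn't the order be reversed?
In this example, it worked. I will check it in my own code as soon as possible. I have to leave the computer now.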
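【Answer 1】:
You should not assign nullptr to pointers before deleting them. Calling delete [] on a null pointer does nothing, so the arrays are never freed.
In your dtor, you assign nullptr to your members and then free them:
this->xvec = nullptr; this->yvec = nullptr; delete [] this->xvec; delete [] this->yvec;
And likewise in the main function:
x_vec = nullptr; delete [] x_vec; ptre2mltply = nullptr;
Just remove those assignments from your code.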
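To make the ordering issue concrete, here is a minimal sketch. The Tracked type and both function names are hypothetical and exist only for this illustration; the instance counter stands in for what Valgrind reports as lost blocks.

```cpp
#include <cassert>  // for checks like the ones described in the note below

// Hypothetical element type that counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Order used in the question: the pointer is nulled first, so delete[]
// receives a null pointer and does nothing. The array stays allocated.
// The original address is returned only so this demo can clean up later.
Tracked* null_then_delete() {
    Tracked* p = new Tracked[4];
    Tracked* keep = p;  // not present in the question's code
    p = nullptr;
    delete[] p;         // no-op on a null pointer
    return keep;
}

// Corrected order: free the array first, then (optionally) reset the pointer.
void delete_then_null() {
    Tracked* p = new Tracked[4];
    delete[] p;
    p = nullptr;
}
```

After delete_then_null() the counter Tracked::live is back at 0, while after null_then_delete() it stays at 4 until the returned pointer is passed to delete[].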
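【Answer 2】: I suggest that you manage your own vectors the same way the Armadillo objects are managed, that is, let the C++ runtime system take care of memory allocation and deallocation.
In your case, most accesses are done with syntax like x_vec[index], so this is easy: that syntax is fully compatible with std::vector objects. You can therefore get rid of the raw pointers and use only std::vector objects. Then you no longer need to delete anything manually.
In your second multiply_elements() function, you can efficiently return a std::vector object by value, thanks to the std::vector move constructor provided by the STL.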
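The claim about returning by value can be checked with a small sketch. The free function multiply and its out_buffer parameter are hypothetical and not part of the original class; out_buffer only records where the local vector stored its elements, so a caller can compare it with the returned vector's buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: element-wise product returned by value.
// Assumes x and y have the same length.
std::vector<double> multiply(const std::vector<double>& x,
                             const std::vector<double>& y,
                             const double** out_buffer) {
    assert(x.size() == y.size());
    std::vector<double> xy(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        xy[i] = x[i] * y[i];
    }
    *out_buffer = xy.data();  // address of the local vector's storage
    return xy;                // moved or elided, not deep-copied
}
```

If a caller writes std::vector<double> r = multiply(x, y, &p); then r.data() is expected to equal p, because the heap buffer is handed over rather than copied, and it is released automatically when r goes out of scope.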
I took the liberty of modifying your code, and the resulting program seems to make Valgrind happy.
Header file: openmp_class.h
#ifndef OPENMP_CLASS_H
#define OPENMP_CLASS_H
#include <iomanip>
#include <iostream>
#include <cmath>
#include <stdexcept>
#include <string>
#include <armadillo>
#include <vector>
using namespace std;
class Openmp_class
{
private:
    // CHANGE:
    std::vector<double> xvec;
    std::vector<double> yvec;
    size_t size;
public:
    Openmp_class();
    ~Openmp_class();
    void generate_data( size_t n );
    double add_element( size_t n );
    double substract_element( size_t n );
    arma::vec add_elements( size_t upto_n );
    arma::vec multiply_elements( size_t upto_n );
    std::vector<double> multiply_elements();
};
#endif // OPENMP_CLASS_H
CPP file: openmp_class.cpp
#include "openmp_class.h"
Openmp_class::Openmp_class()
{}
Openmp_class::~Openmp_class()
{
    // vector components automatically deleted
}
void Openmp_class::generate_data(size_t n)
{
    // CHANGE:
    this->xvec.resize(n);
    this->yvec.resize(n);
    this->size = n;
    arma::vec xrand = arma::randu<arma::vec>(n);
    arma::vec yrand = arma::randu<arma::vec>(n);
    for (unsigned int i = 0; i < xrand.n_elem; i++)
    {
        this->xvec[i] = xrand(i);
        this->yvec[i] = yrand(i);
    }
    xrand.reset();
    yrand.reset();
}
double Openmp_class::add_element(size_t n)
{
    if ( n < this->size )
    {
        return this->xvec[n] + this->yvec[n];
    }
    else
    {
        string errmsg = "Openmp_class::add_element index n out of bounds!";
        throw runtime_error( errmsg );
    }
}
double Openmp_class::substract_element(size_t n)
{
    if ( n < this->size )
    {
        return this->xvec[n] - this->yvec[n];
    }
    else
    {
        string errmsg = "Openmp_class::substract_element index n out of bounds!";
        throw runtime_error( errmsg );
    }
}
arma::vec Openmp_class::add_elements(size_t upto_n)
{
    if ( upto_n < this->size )
    {
        arma::vec results = arma::zeros<arma::vec>( upto_n );
        for (unsigned int i = 0; i < upto_n; i++ )
            results(i) = this->xvec[i] + this->yvec[i];
        return results;
    }
    else
    {
        string errmsg = "Openmp_class::add_elements index n out of bounds!";
        throw runtime_error( errmsg );
    }
}
arma::vec Openmp_class::multiply_elements(size_t upto_n)
{
    if ( upto_n < this->size )
    {
        arma::vec results = arma::zeros<arma::vec>( upto_n );
        for (unsigned int i = 0; i < upto_n; i++ )
            results(i) = this->xvec[i] * this->yvec[i];
        return results;
    }
    else
    {
        string errmsg = "Openmp_class::multiply_elements index n out of bounds!";
        throw runtime_error( errmsg );
    }
}
std::vector<double> Openmp_class::multiply_elements()
{
    // CHANGE:
    std::vector<double> xy(this->size);
    for (unsigned int i = 0; i < this->size; i++)
        xy[i] = this->xvec[i] * this->yvec[i];
    return xy;
}
Main file: main.cpp
#include <iostream>
#include <iomanip>
#include <cmath>
#define ARMA_USE_HDF5
#include <armadillo>
#include "openmp_class.h"
using namespace std;
//#define ARMA_OPEN_MP
int main(/*int argc, char *argv[]*/ void)
{
    Openmp_class Myclass;
    Myclass.generate_data( 10 );
#ifdef ARMA_OPEN_MP
    #pragma omp parallel
    {
        #pragma omp for
        for (unsigned int j = 10; j <= 500000; j++)
        {
            arma::vec (Openmp_class::*ptrmem) (size_t) = &Openmp_class::multiply_elements;
            Openmp_class testObj;
            testObj.generate_data( 5000 );
            arma::vec x_vec = (testObj.*ptrmem)(4999);
            ptrmem = nullptr;
        }
        #pragma omp barrier
    }
#else
    #pragma omp parallel
    {
        #pragma omp for
        for (unsigned int j = 10; j <= 500000; j++)
        {
            std::vector<double> (Openmp_class::*ptre2mltply)() = &Openmp_class::multiply_elements;
            Openmp_class testObj;
            testObj.generate_data( 5000 );
            // x_vec releases its storage automatically at the end of each iteration
            std::vector<double> x_vec = (testObj.*ptre2mltply)();
            ptre2mltply = nullptr;
        }
        #pragma omp barrier
    }
#endif
    return 1;
}
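P.S. How exactly are pointers to functions (or class members) shared between multiple threads?
A pointer to a class member is physically a memory offset, and therefore an integer constant. So each thread can have its own (cheap) copy.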
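As a rough illustration of the per-thread copy, the sketch below passes a pointer to member function into an OpenMP loop with the firstprivate clause, which gives every thread its own initialized copy. The Pair struct and the reduce_with function are hypothetical names invented for this example. The exact in-memory representation of a member pointer is implementation-defined, but it is a small trivially copyable value either way. Without OpenMP enabled the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical data type with two interchangeable operations.
struct Pair {
    double x;
    double y;
    double sum() const { return x + y; }
    double product() const { return x * y; }
};

// Applies the member function selected by op to every element and adds up
// the results. Each thread works with its own copy of op (firstprivate).
double reduce_with(const std::vector<Pair>& data, double (Pair::*op)() const) {
    assert(op != nullptr);
    double total = 0.0;
    const long n = static_cast<long>(data.size());
    #pragma omp parallel for firstprivate(op) reduction(+:total)
    for (long i = 0; i < n; ++i) {
        total += (data[static_cast<std::size_t>(i)].*op)();
    }
    return total;
}
```

Calling reduce_with(data, &Pair::sum) or reduce_with(data, &Pair::product) selects the operation at the call site. Declaring the member pointer inside the loop body, as the question's code does, also makes it private to each thread.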