What is a good way to run experiments for the memory usage of an algorithm in C++?

Posted 2023-02-23


【Originally asked】: 2017-04-16 12:26:47 【Question】:

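I have algorithm A and algorithm B, both implemented in C++. In theory A uses more space than B, and this turns out to be true in practice as well. I would like to generate some nice graphs to illustrate this. Both algorithms receive an input n, and I want my experiments to vary over different n, so the x-axis of the graph should be something like n = 10^6, 2*10^6, ...

Usually, when it comes to data such as time or cache misses, my preferred way of setting up an experiment is as follows. In a C++ file I have the algorithm implemented like this: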

#include <cstdlib>   // atoi
#include <iostream>
using namespace std;

int counters[1000];

void init_statistics() {
    // use some library, for example PAPI (http://icl.cs.utk.edu/papi/software/),
    // to start counting; store the results in the counters array
}

void stop_statistics() {
    // this is just to stop counting
}

int algA(int n) {
    // algorithm code
    int result = 0;  // ... (computed by the algorithm)
    return result;
}

int main(int argc, const char* argv[]) {
    if (argc < 2) return 1;  // expects n as the first argument
    int n = atoi(argv[1]);
    init_statistics();  // initializes the statistics counters
    int res = algA(n);
    stop_statistics();  // stops the statistics counters
    // print the result followed by the counters, tab-separated
    cout << res << '\t' << counters[0] << '\t' << counters[1] /* << ... */ << endl;
    return 0;
}

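Then I create a python script that, for different n, calls result = subprocess.check_output(['./algB',...]). After that, I parse the result string in python and print it in a suitable format. For example, if I use R for plotting, I can print the data to an external file in which the counters are separated by \t.

This has worked well for me, but now, for the first time, I need data about the space used by an algorithm, and I do not know how to measure it. One way is to use valgrind; this is a possible valgrind output: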

==15447== Memcheck, a memory error detector
==15447== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==15447== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==15447== Command: ./algB 1.txt 2.txt
==15447== 
==15447== 
==15447== HEAP SUMMARY:
==15447==     in use at exit: 72,704 bytes in 1 blocks
==15447==   total heap usage: 39 allocs, 38 frees, 471,174,306 bytes allocated
==15447== 
==15447== LEAK SUMMARY:
==15447==    definitely lost: 0 bytes in 0 blocks
==15447==    indirectly lost: 0 bytes in 0 blocks
==15447==      possibly lost: 0 bytes in 0 blocks
==15447==    still reachable: 72,704 bytes in 1 blocks
==15447==         suppressed: 0 bytes in 0 blocks
==15447== Rerun with --leak-check=full to see details of leaked memory
==15447== 
==15447== For counts of detected and suppressed errors, rerun with: -v
==15447== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

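The interesting number is 471,174,306 bytes. However, valgrind slows execution down considerably, and it returns not just this number but this whole large string. I am also not sure how to parse it because, for some reason, when I call result = subprocess.check_output(['valgrind','./algB',...]) from python, the result string only stores the output of ./algB and completely ignores what valgrind returns.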
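A note on the parsing problem: by default valgrind writes its report to stderr rather than stdout, which would explain why capturing only the program's standard output misses it. Once the report text has been captured (for example by redirecting stderr), the figure can be pulled out of the "total heap usage" line. The following is a minimal C++17 sketch; the function name parse_total_allocated is made up for illustration, and it assumes the line has the same shape as in the output above. Keep in mind that this figure is the cumulative number of bytes allocated over the whole run, not the peak in use at any one time.

```cpp
#include <cstdint>
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: extracts the "bytes allocated" figure from the text of a
// memcheck report. Returns std::nullopt if no "total heap usage" line is found.
std::optional<std::uint64_t> parse_total_allocated(const std::string& report) {
    // Matches e.g. "total heap usage: 39 allocs, 38 frees, 471,174,306 bytes allocated"
    static const std::regex re(R"(total heap usage:.*?([\d,]+) bytes allocated)");
    std::smatch m;
    if (!std::regex_search(report, m, re)) return std::nullopt;

    // Drop the thousands separators before converting to a number.
    std::string digits;
    for (char c : m[1].str()) {
        if (c != ',') digits += c;
    }
    return std::stoull(digits);
}
```

Applied to the report shown above, the function would return 471174306.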

Thank you!

【Comments】:

You should be able to overwrite operator new, as shown in the example, to take precise measurements.

【Answer 1】:

memcheck is a tool for finding memory leaks; for profiling memory allocation you should use massif (another tool available in valgrind).
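Separately from massif, the suggestion in the comments of replacing operator new can be sketched roughly as follows. This is only an illustration under stated assumptions: the memstats namespace, toy_algorithm and peak_bytes_during_toy are hypothetical names, and toy_algorithm merely stands in for algA. The sketch stores each block's size in a small header in front of the block so that operator delete knows how much to subtract.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

namespace memstats {  // hypothetical helper namespace, not from the original post
std::atomic<std::size_t> current{0};  // bytes currently allocated via operator new
std::atomic<std::size_t> peak{0};     // highest value `current` has reached

// Header in front of each block; sized to the strictest fundamental alignment
// so the pointer handed back to the caller stays suitably aligned.
constexpr std::size_t kHeader = alignof(std::max_align_t);
static_assert(kHeader >= sizeof(std::size_t), "header must hold the block size");

void record_alloc(std::size_t size) {
    std::size_t now = current.fetch_add(size) + size;
    std::size_t old = peak.load();
    // Raise the peak if this allocation pushed usage above it.
    while (now > old && !peak.compare_exchange_weak(old, now)) {}
}

void reset_peak() { peak.store(current.load()); }
}  // namespace memstats

// Replacement global allocation function: remembers the requested size.
void* operator new(std::size_t size) {
    void* raw = std::malloc(size + memstats::kHeader);
    if (raw == nullptr) throw std::bad_alloc();
    *static_cast<std::size_t*>(raw) = size;
    memstats::record_alloc(size);
    return static_cast<char*>(raw) + memstats::kHeader;
}

// Replacement deallocation function: reads the size back from the header.
void operator delete(void* p) noexcept {
    if (p == nullptr) return;
    char* raw = static_cast<char*>(p) - memstats::kHeader;
    memstats::current.fetch_sub(*reinterpret_cast<std::size_t*>(raw));
    std::free(raw);
}
void operator delete(void* p, std::size_t) noexcept { operator delete(p); }

// Hypothetical stand-in for algA: builds a vector of n ints and sums it.
long long toy_algorithm(std::size_t n) {
    std::vector<int> v(n, 1);
    long long sum = 0;
    for (int x : v) sum += x;
    return sum;
}

// Peak number of extra heap bytes requested while toy_algorithm(n) runs.
std::size_t peak_bytes_during_toy(std::size_t n) {
    std::size_t before = memstats::current.load();
    memstats::reset_peak();
    volatile long long sink = toy_algorithm(n);  // keep the call from being optimized away
    (void)sink;
    return memstats::peak.load() - before;
}
```

In a setup like the one in the question, main could call memstats::reset_peak() before algA(n) and print the peak afterwards next to the other counters. Note that this counts only requests that go through the plain operator new; direct malloc calls, over-aligned allocations and stack usage are not included.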

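【Discussion】:

This seems like a good option, but is there a way to return only the peak memory consumption?

Just the peak memory consumption? That is what any process monitoring utility can show you. If you are using VS, you can also use the built-in memory monitor (main menu -> Analyze -> Performance Profiler).

Thanks, I will keep that in mind. I have tried massif and, honestly, even though it is slow, it gives a very detailed overview of memory consumption. After considering what you wrote in the comments, I think I will have to use /usr/bin/time -v and the Maximum resident set size (kbytes) value it reports...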
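A peak resident set size figure of this kind can also be read from inside the program through the POSIX getrusage call, which would let it be printed together with the other counters. A minimal sketch, assuming a POSIX system; the function name peak_rss is made up for illustration:

```cpp
#include <sys/resource.h>  // getrusage, struct rusage (POSIX)

// Hypothetical helper: peak resident set size of the calling process so far,
// or -1 if getrusage fails. On Linux ru_maxrss is reported in kilobytes;
// macOS reports it in bytes instead.
long peak_rss() {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
    return usage.ru_maxrss;
}
```

The value covers the whole process (code, stack and heap), so it is probably most useful for comparing runs across different n rather than as an exact measure of what the algorithm itself allocates.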
