Why does this code consume so much heap?

Posted 2023-03-06


【Asked】: 2012-12-17 07:11:08 【Question】:

Here is the full repository. It is a very simple test that inserts 50000 random Things into a database using the postgresql-simple database bindings. It uses MonadRandom and can generate the Things lazily.

Here is the lazy Thing generator.

Here is case1, and the specific snippet of code that uses the Thing generator:

insertThings c = do
  ts <- genThings
  withTransaction c $ do
    executeMany c "insert into things (a, b, c) values (?, ?, ?)" $ map (\(Thing ta tb tc) -> (ta, tb, tc)) $ take 50000 ts

Here is case2, which simply dumps the Things to stdout:

main = do
  ts <- genThings
  mapM print $ take 50000 ts

In the first case I get very bad GC times:

cabal-dev/bin/posttest +RTS -s       
   1,750,661,104 bytes allocated in the heap
     619,896,664 bytes copied during GC
      92,560,976 bytes maximum residency (10 sample(s))
         990,512 bytes maximum slop
             239 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3323 colls,     0 par   11.01s   11.46s     0.0034s    0.0076s
  Gen  1        10 colls,     0 par    0.74s    0.77s     0.0769s    0.2920s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    2.97s  (  3.86s elapsed)
  GC      time   11.75s  ( 12.23s elapsed)
  RP      time    0.00s  (  0.00s elapsed)
  PROF    time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   14.72s  ( 16.09s elapsed)

  %GC     time      79.8%  (76.0% elapsed)

  Alloc rate    588,550,530 bytes per MUT second

  Productivity  20.2% of total user, 18.5% of total elapsed

While in the second case the times are fine:

cabal-dev/bin/dumptest +RTS -s > out
   1,492,068,768 bytes allocated in the heap
       7,941,456 bytes copied during GC
       2,054,008 bytes maximum residency (3 sample(s))
          70,656 bytes maximum slop
               6 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2888 colls,     0 par    0.13s    0.16s     0.0001s    0.0089s
  Gen  1         3 colls,     0 par    0.01s    0.01s     0.0020s    0.0043s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    2.00s  (  2.37s elapsed)
  GC      time    0.14s  (  0.16s elapsed)
  RP      time    0.00s  (  0.00s elapsed)
  PROF    time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    2.14s  (  2.53s elapsed)

  %GC     time       6.5%  (6.4% elapsed)

  Alloc rate    744,750,084 bytes per MUT second

  Productivity  93.5% of total user, 79.0% of total elapsed

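I have tried heap profiling but could not make sense of the results. It looks like all 50000 Things are first constructed in memory, then converted to ByteStrings by the query, and only then are those strings sent to the database. But why does that happen? How do I determine the guilty code?

GHC version is 7.4.2.

All libraries and the package itself are compiled with -O2 (built by cabal-dev in a sandbox).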

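【Comments】:

I don't see what code differs between your first and second cases. Could you post 1) standalone code, or at least state explicitly what the cases are, 2) the GHC version, and 3) the compiler flags?

This uses almost no heap for me. Are you compiling without optimization? That would be a problem.

I recompiled all the libraries with -O2, but it had no effect.

【Answer 1】: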

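I checked the profile using formatMany with 50k Things. Memory climbs steadily and then drops off quickly. Maximum memory used was a little over 40 MB. The main cost centres were buildQuery and escapeStringConn, followed by toRow. Half of the data was ARR_WORDS (byte strings), Actions, and lists.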

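formatMany pretty much builds one long ByteString out of pieces assembled from nested lists of Actions. The Actions are converted to ByteString Builders, which retain the ByteStrings until they are used to produce the final long strict ByteString. These ByteStrings are therefore long-lived: they stay alive until the final ByteString has been constructed.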
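The retention described above can be illustrated with a minimal sketch. It assumes the `Data.ByteString.Builder` module from the `bytestring` package, which is not necessarily the builder that postgresql-simple used internally at the time; `rowBuilder`, `renderLazy`, and `renderStrict` are hypothetical names made up for this illustration.

```haskell
module Main where

import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import Data.ByteString.Builder (Builder, byteString, char7, toLazyByteString)

-- | Wrap one already-encoded field in parentheses, like one row of a VALUES list.
rowBuilder :: BC.ByteString -> Builder
rowBuilder field = mconcat [char7 '(', byteString field, char7 ')']

-- | Lazy output: a consumer can process chunks one at a time.
renderLazy :: [BC.ByteString] -> BL.ByteString
renderLazy = toLazyByteString . mconcat . map rowBuilder

-- | Strict output: every chunk must exist before the single final buffer
-- can be filled, so the input pieces stay reachable until then.
renderStrict :: [BC.ByteString] -> BC.ByteString
renderStrict = BL.toStrict . renderLazy

main :: IO ()
main = do
  let fields = map (BC.pack . show) [1 .. 5 :: Int]
  print (BL.length (renderLazy fields))
  BC.putStrLn (renderStrict fields)
```

With 50000 rows instead of 5, the strict variant is the one that has to keep every piece alive until the end, which matches the residency pattern reported in the profile.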

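The strings need to be escaped using libPQ, so the ByteString in any non-Plain Action is passed to libPQ and replaced with a new one in escapeStringConn and friends, which adds more garbage. If you replace the Text in Thing with another Int, GC time drops from 75% to 45%.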

I tried to reduce the use of temporary lists in formatMany and buildQuery by replacing mapM with a foldM over a Builder. It does not help much, and it adds a bit of code complexity.

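TL;DR: the Builders cannot be consumed lazily, because all of them are needed to produce the final strict ByteString (which is pretty much an array of bytes). If you have memory problems, split the executeMany into chunks within the same transaction.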
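A minimal sketch of the chunking idea, using only `base` so that it runs on its own. `chunksOf` and `insertChunked` are hypothetical helpers, and the `insertBatch` argument is a stand-in for the real database call; none of this comes from the original repository.

```haskell
module Main where

-- | Split a list into pieces of at most n elements.
-- Lazy in the input: each chunk is produced only when it is demanded.
chunksOf :: Int -> [a] -> [[a]]
chunksOf n xs
  | n <= 0    = error "chunksOf: chunk size must be positive"
  | null xs   = []
  | otherwise = let (h, t) = splitAt n xs in h : chunksOf n t

-- | Run a batch action once per chunk, in order.
-- 'insertBatch' stands in for something like @executeMany c query@.
insertChunked :: Monad m => Int -> ([a] -> m b) -> [a] -> m ()
insertChunked size insertBatch = mapM_ insertBatch . chunksOf size

main :: IO ()
main =
  -- Demo with a printing action instead of a database call.
  insertChunked 3 (\rows -> putStrLn ("batch of " ++ show (length rows)))
                [1 .. 10 :: Int]
```

With postgresql-simple this would presumably be wired up as `withTransaction c $ insertChunked 1000 (executeMany c q) rows`, where the chunk size of 1000 is an arbitrary guess to be tuned. Each call then builds a strict query ByteString for one chunk only, so the pieces for earlier chunks can be collected before later ones are built.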

【Comments】:

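I had figured this out myself, but not as clearly. A Builder can also produce lazy ByteStrings, but unfortunately libPQ does not consume lazy strings. It is a really silly problem with Haskell's ByteStrings.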
