"Javascript heap out of memory" using Lodash memoize

Published: 2019-03-04 21:08:44

Question:

I'm trying to solve LeetCode's longest palindromic subsequence problem by applying memoization to a recursive solution. Here is the recursive solution, longestPalindromicSubsequence.js:

function longestPalindromicSubsequence(string, start = 0, end = string.length) {
  if (end < start) { return 0; }
  if (start === end) { return 1; }
  if (string[start] === string[end]) {
    return 2 + longestPalindromicSubsequence(string, start + 1, end - 1);
  }
  return Math.max(
    longestPalindromicSubsequence(string, start + 1, end),
    longestPalindromicSubsequence(string, start, end - 1),
  );
}

module.exports = longestPalindromicSubsequence;

Here are some Jest test cases, longestPalindromicSubsequence.test.js:

const longestPalindromicSubsequence = require('./longestPalindromicSubsequence');

describe('longest palindromic subsequence', () => {
  test('works for aab', () => {
    expect(longestPalindromicSubsequence('aab')).toBe(2);
  });

  test('works for long string', () => {
    expect(longestPalindromicSubsequence(`${'a'.repeat(50)}bcdef`)).toBe(50);
  });
});

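This works, but it is rather slow because the number of recursive calls grows exponentially. For example, for a string of length ~50 it takes 9 seconds: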

$ jest longestPalindromicSubsequence.test.js
 PASS  ./longestPalindromicSubsequence.test.js (9.6s)
  longest palindromic subsequence
    ✓ works for aab (3ms)
    ✓ works for long string (9315ms)

Test Suites: 1 passed, 1 total
Tests:       2 passed, 2 total
Snapshots:   0 total
Time:        10.039s
Ran all test suites matching /longestPalindromicSubsequence.test.js/i.
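
To see the exponential growth concretely, here is a small sketch that counts how often the un-memoized recursion is entered for strings of distinct characters (a worst case, since no two characters ever match). The `countCalls` helper is hypothetical and introduced only for illustration; the recursion itself is copied from the question.

```javascript
// Count how many times the un-memoized recursion is entered.
// `countCalls` is a hypothetical helper, not part of the original module.
function countCalls(string) {
  let calls = 0;
  function lps(start = 0, end = string.length) {
    calls += 1;
    if (end < start) { return 0; }
    if (start === end) { return 1; }
    if (string[start] === string[end]) {
      return 2 + lps(start + 1, end - 1);
    }
    return Math.max(lps(start + 1, end), lps(start, end - 1));
  }
  lps();
  return calls;
}

for (const n of [5, 10, 15]) {
  const input = 'abcdefghijklmno'.slice(0, n); // n distinct characters
  console.log(n, countCalls(input));
}
// prints:
// 5 63
// 10 2047
// 15 65535
```

In this worst case each extra character roughly doubles the number of calls, which is the growth that memoization is meant to eliminate.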

To improve this performance, I tried using _.memoize in an updated module, longestPalindromicSubsequence2.js:

const _ = require('lodash');

const longestPalindromicSubsequence = _.memoize(
  (string, start = 0, end = string.length) => {
    if (end < start) { return 0; }
    if (start === end) { return 1; }
    if (string[start] === string[end]) {
      return 2 + longestPalindromicSubsequence(string, start + 1, end - 1);
    }
    return Math.max(
      longestPalindromicSubsequence(string, start + 1, end),
      longestPalindromicSubsequence(string, start, end - 1),
    );
  },
  (string, start, end) => [string, start, end], // resolver function
);

module.exports = longestPalindromicSubsequence;

However, when I try to run the tests with this module, I get a "JavaScript heap out of memory" error:

$ jest longestPalindromicSubsequence.test.js

 RUNS  ./longestPalindromicSubsequence.test.js

<--- Last few GCs --->
at[89308:0x104801e00]    15800 ms: Mark-sweep 1379.2 (1401.3) -> 1379.2 (1401.3) MB, 1720.4 / 0.0 ms  (+ 0.0 ms in 5 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 1735 ms) (average mu = 0.128, current mu = 0.057) allocat[89308:0x104801e00]    17606 ms: Mark-sweep 1390.0 (1412.3) -> 1390.0 (1412.3) MB, 1711.7 / 0.0 ms  (+ 0.0 ms in 4 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 1764 ms) (average mu = 0.091, current mu = 0.052) allocat

<--- JS stacktrace --->

==== JS stack trace =========================================

    0: ExitFrame [pc: 0x20b000bdc01d]
Security context: 0x1c189571e549 <JSObject>
    1: /* anonymous */ [0x1c18f7682201] [/Users/kurtpeek/GoogleDrive/LeetCode/longestPalindromicSubsequence2.js:~14] [pc=0x20b0015cd091](this=0x1c18d38893a1 <JSGlobal Object>,string=0x1c18f7682271 <String[55]: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabcdef>,start=45,end=45)
    2: memoized [0x1c18f7682309] [/Users/kurtpeek/GoogleDrive/LeetCode/node_...

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
 1: 0x100037733 node::Abort() [/usr/local/bin/node]
 2: 0x1000378d6 node::FatalTryCatch::~FatalTryCatch() [/usr/local/bin/node]
 3: 0x10018e57b v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 4: 0x10018e51c v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
 5: 0x1004682ee v8::internal::Heap::UpdateSurvivalStatistics(int) [/usr/local/bin/node]
 6: 0x100469ed7 v8::internal::Heap::CheckIneffectiveMarkCompact(unsigned long, double) [/usr/local/bin/node]
 7: 0x1004675cb v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/usr/local/bin/node]
 8: 0x1004663e6 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/usr/local/bin/node]
 9: 0x10046eafc v8::internal::Heap::AllocateRawWithLigthRetry(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/usr/local/bin/node]
10: 0x10046eb48 v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/usr/local/bin/node]
11: 0x10044eb7a v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationSpace) [/usr/local/bin/node]
12: 0x100634916 v8::internal::Runtime_AllocateInTargetSpace(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/local/bin/node]
13: 0x20b000bdc01d 
Abort trap: 6

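I understand from Node.js heap out of memory that the standard memory usage for Node is 1.7GB, which I would think should be sufficient. Any ideas why the memoized version is not working, and how to fix it?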

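Comments:

Did you check how many calls to longestPalindromicSubsequence need to be memoized? You may also want to try providing a different resolver, since IIRC it uses the first argument by default, which I think is always the same string here, and that might also cause a problem (but I'm only guessing on that one).

Ugh, sorry, I completely skipped over that. But yes, I would guess a single string key is more efficient than an array. My guess is that it depends on the JS implementation, for example how arrays are implemented and what the lookup cost is.

@KurtPeek I'm not familiar with the memoize function, but if you give it an array it might be memoizing the array reference, which would never be the same.

@juvian Ah--because it compares strict references, not values.

@KurtPeek That is worth exploring, and it should not be hard to find out.

Answer 1: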

I managed to fix the problem by changing the resolver function from (string, start, end) => [string, start, end] to (string, start, end) => string + start + end:

const _ = require('lodash');

const longestPalindromicSubsequence = _.memoize(
  (string, start = 0, end = string.length) => {
    if (end < start) { return 0; }
    if (start === end) { return 1; }
    if (string[start] === string[end]) {
      return 2 + longestPalindromicSubsequence(string, start + 1, end - 1);
    }
    return Math.max(
      longestPalindromicSubsequence(string, start + 1, end),
      longestPalindromicSubsequence(string, start, end - 1),
    );
  },
  (string, start, end) => string + start + end, // resolver function
);

module.exports = longestPalindromicSubsequence;

Now the "long string" test takes only 3 milliseconds:

$ jest longestPalindromicSubsequence.test.js
 PASS  ./longestPalindromicSubsequence.test.js
  longest palindromic subsequence
    ✓ works for aab (3ms)
    ✓ works for long string (3ms)

Test Suites: 1 passed, 1 total
Tests:       2 passed, 2 total
Snapshots:   0 total
Time:        1.004s, estimated 10s
Ran all test suites matching /longestPalindromicSubsequence.test.js/i.

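Using a string as the cache key seems to be more memory-efficient than using an array, perhaps because strings are immutable in Javascript? Any explanation of this improvement would be welcome.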
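
One way to see what is going on: arrays used as `Map` keys are compared by reference, so a resolver that returns a fresh array on every call never produces a cache hit, while every call still adds an entry to the cache. The sketch below uses a hand-written `memoizeWith` as a simplified stand-in for `_.memoize` (an assumption for illustration, not lodash's actual implementation), backed by a native `Map`; `makeLps` is likewise a hypothetical helper.

```javascript
// Simplified stand-in for _.memoize (hypothetical; not lodash's real code).
function memoizeWith(fn, resolver) {
  const cache = new Map();
  const memoized = (...args) => {
    const key = resolver(...args);
    if (cache.has(key)) { return cache.get(key); }
    const result = fn(...args);
    cache.set(key, result);
    return result;
  };
  memoized.cache = cache;
  return memoized;
}

// Build a memoized version of the recursion and count real invocations.
function makeLps(resolver) {
  let calls = 0;
  const lps = memoizeWith((string, start = 0, end = string.length) => {
    calls += 1;
    if (end < start) { return 0; }
    if (start === end) { return 1; }
    if (string[start] === string[end]) {
      return 2 + lps(string, start + 1, end - 1);
    }
    return Math.max(lps(string, start + 1, end), lps(string, start, end - 1));
  }, resolver);
  return { lps, getCalls: () => calls };
}

const byArray = makeLps((string, start, end) => [string, start, end]);
const byString = makeLps((string, start, end) => JSON.stringify([string, start, end]));

byArray.lps('abcdefghij');
byString.lps('abcdefghij');

// Array keys: every call misses, and every call stores a new entry.
console.log(byArray.getCalls(), byArray.lps.cache.size);   // 2047 2047
// String keys: each distinct (start, end) pair is computed once.
console.log(byString.getCalls(), byString.lps.cache.size); // 66 66
```

With array keys the cache grows as fast as the call count, which is consistent with the heap running out on a 55-character input.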

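Comments:

Each new array instance is treated as unique, even if it contains the same elements. In other words, ['abcd', 4, 4] !== ['abcd', 4, 4], and memoizing on such keys only consumes memory without actually helping at all. You can easily check this by adding console.log(string, start, end); at the end of the function and verifying that successive calls really are made with the same arguments.

string + start + end is not great either: (asd, 17, 3) will use the same key as (asd, 1, 73). Just use JSON.stringify([string, start, end]).

@juvian I guess a delimiter would be enough? For example asd:1:73

@AlexanderAzarov In this case yes; JSON.stringify is just a more general solution.

Answer 2: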

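I know you have already posted the most suitable answer, but I wanted to add something. The underlying problem is that using an array as the key is what causes the bottleneck. Under the hood, lodash has its own MapCache, which appears to assume that strings will be passed in.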

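But looking again at the documentation and comments, lodash does expose the Cache object for you to override, provided the replacement has the same interface as their Map.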

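Creates a function that memoizes the result of func. If resolver is provided, it determines the cache key for storing the result based on the arguments provided to the memoized function. By default, the first argument provided to the memoized function is used as the map cache key. The func is invoked with the this binding of the memoized function.

Note: The cache is exposed as the cache property on the memoized function. Its creation may be customized by replacing the _.memoize.Cache constructor with one whose instances implement the Map method interface of clear, delete, get, has, and set.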

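I went in and tested your code, because if you want to use objects or other non-string references as keys, the Map you should actually be using is WeakMap. This is what I tested: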

const _ = require('lodash');

// override Cache and use WeakMap
_.memoize.Cache = WeakMap;

const longestPalindromicSubsequence = _.memoize(
  (string, start = 0, end = string.length) => {
    if (end < start) { return 0; }
    if (start === end) { return 1; }
    if (string[start] === string[end]) {
      return 2 + longestPalindromicSubsequence(string, start + 1, end - 1);
    }
    return Math.max(
      longestPalindromicSubsequence(string, start + 1, end),
      longestPalindromicSubsequence(string, start, end - 1),
    );
  },
  (string, start, end) => [string, start, end], // resolver function
);

module.exports = longestPalindromicSubsequence;

Although it still takes a long time, it does eventually pass without running into the JavaScript heap out of memory error.
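
A possible explanation, offered as a sketch rather than a verified account of lodash internals: a `WeakMap` holds its keys weakly, so the throwaway arrays returned by the resolver can be garbage-collected together with their entries, but a fresh array still never matches an earlier one, so there are no cache hits and the running time stays exponential. The snippet below shows that lookup behaviour with a plain `WeakMap`. It also shows that a `WeakMap` rejects string keys, so this override would not combine with a resolver that returns strings.

```javascript
const cache = new WeakMap();

const key = ['abcd', 4, 4];
cache.set(key, 1);

console.log(cache.has(key));            // true: same array reference
console.log(cache.has(['abcd', 4, 4])); // false: equal contents, different reference

// Primitive strings are not valid WeakMap keys.
let rejected = false;
try {
  cache.set('abcd:4:4', 1);
} catch (err) {
  rejected = err instanceof TypeError;
}
console.log(rejected); // true
```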

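As you have already determined, the best solution is simply to stringify the key :) (though do take into account @juvian's comment about using JSON.stringify, since the final string can come out identical when the individual parts happen to collide)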
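
To make the collision concern concrete, here is a small comparison of three candidate resolvers. The names `concatKey`, `delimitedKey`, and `jsonKey` are made up for this illustration.

```javascript
const concatKey = (string, start, end) => string + start + end;
const delimitedKey = (string, start, end) => `${string}:${start}:${end}`;
const jsonKey = (string, start, end) => JSON.stringify([string, start, end]);

// Plain concatenation cannot tell where one number ends and the next begins.
console.log(concatKey('asd', 17, 3)); // asd173
console.log(concatKey('asd', 1, 73)); // asd173 (same key, different arguments)

// A delimiter or JSON.stringify keeps the two argument lists apart.
console.log(delimitedKey('asd', 17, 3), delimitedKey('asd', 1, 73)); // asd:17:3 asd:1:73
console.log(jsonKey('asd', 17, 3), jsonKey('asd', 1, 73)); // ["asd",17,3] ["asd",1,73]
```

Either of the last two could be passed as the second argument to `_.memoize` in place of the concatenating resolver.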
