How does malloc work in a multithreaded environment?

Posted 2023-02-14


[Asked]: 2012-05-29 04:56:24

[Question]:

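Does a typical malloc (on the x86-64 platform under Linux) naively lock a mutex at the start of the call and release it when done, or does it lock at a finer granularity in some smarter way so that lock contention is reduced? If it is the second, how does it do that?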

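[Comments on the question]:

What is the context in which you saw this? Is there any code or reference you can cite?

Softly: I am asking, not asserting.

[Answer 1]: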

glibc 2.15 operates multiple allocation arenas. Each arena has its own lock. When a thread needs to allocate memory, malloc() picks an arena, locks it, and allocates memory from it.

The mechanism for choosing an arena is somewhat elaborate and is aimed at reducing lock contention:

/* arena_get() acquires an arena and locks the corresponding mutex.
   First, try the one last locked successfully by this thread.  (This
   is the common case and handled with a macro for speed.)  Then, loop
   once over the circularly linked list of arenas.  If no arena is
   readily available, create a new one.  In this latter case, `size'
   is just a hint as to how much memory will be required immediately
   in the new arena. */

With this in mind, malloc() basically looks like this (edited for brevity):

  mstate ar_ptr;
  void *victim;

  arena_lookup(ar_ptr);
  arena_lock(ar_ptr, bytes);
  if(!ar_ptr)
    return 0;
  victim = _int_malloc(ar_ptr, bytes);
  if(!victim) {
    /* Maybe the failure is due to running out of mmapped areas. */
    if(ar_ptr != &main_arena) {
      (void)mutex_unlock(&ar_ptr->mutex);
      ar_ptr = &main_arena;
      (void)mutex_lock(&ar_ptr->mutex);
      victim = _int_malloc(ar_ptr, bytes);
      (void)mutex_unlock(&ar_ptr->mutex);
    } else {
      /* ... or sbrk() has failed and there is still a chance to mmap() */
      ar_ptr = arena_get2(ar_ptr->next ? ar_ptr : 0, bytes);
      (void)mutex_unlock(&main_arena.mutex);
      if(ar_ptr) {
        victim = _int_malloc(ar_ptr, bytes);
        (void)mutex_unlock(&ar_ptr->mutex);
      }
    }
  } else
    (void)mutex_unlock(&ar_ptr->mutex);

  return victim;

This allocator is called ptmalloc. It is based on earlier work by Doug Lea and is maintained by Wolfram Gloger.
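The arena selection described in the arena_get() comment above can be illustrated with a small standalone sketch. This is not glibc's code: `arena_t`, `first_arena`, `list_lock`, `last_arena`, and `arena_get_sketch()` are hypothetical names, the arena holds nothing but its lock, and new arenas are obtained with plain malloc() purely for illustration. It only shows the three steps: retry the arena this thread locked last, loop once over the circular list with trylock, and create a new arena if everything is busy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified arena; not glibc's struct malloc_state. */
typedef struct arena {
    pthread_mutex_t mutex;          /* protects this arena's heap state */
    _Atomic(struct arena *) next;   /* circular list; arenas are never removed */
} arena_t;

static arena_t first_arena = { PTHREAD_MUTEX_INITIALIZER, &first_arena };
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER; /* serializes insertions */
static _Thread_local arena_t *last_arena; /* arena this thread locked last */

/* Returns a locked arena, or NULL if all arenas are busy and no new one
   could be created. The caller unlocks arena->mutex when done. */
arena_t *arena_get_sketch(void)
{
    /* Steps 1 and 2: start at the arena this thread used last (the common
       case), then loop once over the circular list, taking the first arena
       whose lock is free. */
    arena_t *start = last_arena ? last_arena : &first_arena;
    arena_t *a = start;
    do {
        if (pthread_mutex_trylock(&a->mutex) == 0) {
            last_arena = a;
            return a;
        }
        a = atomic_load(&a->next);
    } while (a != start);

    /* Step 3: every arena is busy, so create a new one, already locked. */
    arena_t *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return NULL;
    if (pthread_mutex_init(&fresh->mutex, NULL) != 0) {
        free(fresh);
        return NULL;
    }
    pthread_mutex_lock(&fresh->mutex);

    /* Link the new arena in right after first_arena. */
    pthread_mutex_lock(&list_lock);
    atomic_init(&fresh->next, atomic_load(&first_arena.next));
    atomic_store(&first_arena.next, fresh);
    pthread_mutex_unlock(&list_lock);
    assert(atomic_load(&fresh->next) != NULL); /* list stays circular */

    last_arena = fresh;
    return fresh;
}
```

Because a thread usually finds its own last arena unlocked, the trylock in the first loop iteration succeeds and threads rarely wait on each other.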


[Answer 2]:

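Doug Lea's malloc uses coarse locking (or no locking, depending on the configuration settings), where every call to malloc/realloc/free is protected by a global mutex. This is safe but can be inefficient in highly multithreaded environments.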
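To make the coarse-locking approach concrete, here is a minimal sketch. It is not dlmalloc's code: the bump allocator `bump_alloc_unlocked()`, the static `pool`, and `coarse_malloc()` are hypothetical stand-ins for a non-thread-safe allocator core and its locked entry point.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_SIZE (1u << 20)

static _Alignas(max_align_t) unsigned char pool[POOL_SIZE];
static size_t pool_used;
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical allocator core. Not thread-safe on its own: it reads and
   writes pool_used without any synchronization. */
static void *bump_alloc_unlocked(size_t n)
{
    const size_t align = _Alignof(max_align_t);
    if (n > POOL_SIZE)
        return NULL;
    size_t rounded = (n + align - 1) & ~(align - 1);
    if (rounded > POOL_SIZE - pool_used)
        return NULL;
    void *p = pool + pool_used;
    pool_used += rounded;
    assert(pool_used <= POOL_SIZE);
    return p;
}

/* Coarse locking: one global mutex is held for the whole call, so all
   threads are serialized no matter what they allocate. */
void *coarse_malloc(size_t n)
{
    pthread_mutex_lock(&global_lock);
    void *p = bump_alloc_unlocked(n);
    pthread_mutex_unlock(&global_lock);
    return p;
}
```

The wrapper is trivially correct, but every thread queues on `global_lock`, which is the contention the per-arena design avoids.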

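ptmalloc3, which is the default malloc implementation in the GNU C library (libc) used on most Linux systems today, has a more fine-grained strategy, as described in aix's answer, which allows multiple threads to allocate memory concurrently and safely.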

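nedmalloc is another, independent implementation that claims better multithreaded performance than ptmalloc3 and various other allocators. I don't know how it works, and there doesn't seem to be any obvious documentation, so you will have to read the source code to find out.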

[Comments]:

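I'm still not sure whether nedmalloc is a genuine feat of engineering or SEO spam... :-)

There is also tcmalloc from Google, which uses locks on buckets chosen according to the size of your request. Better threading performance, less contention, more over-allocation.

@R..: It does look a little suspicious at first glance, but it comes with source code, so you can benchmark it yourself (I haven't done so). Doug Lea also says in the comments of dlmalloc.c, "If you are using malloc in a concurrent program, consider instead using nedmalloc or ptmalloc". So I think it is probably legitimate.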
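The idea of locking per size bucket, as the tcmalloc comment above describes it, can be sketched as follows. This is only an illustration of the locking granularity, not tcmalloc's implementation (which also keeps per-thread caches that this sketch omits). `bucket_alloc()`, `bucket_free()`, `size_class()`, and the power-of-two class sizes are all assumptions, and the caller has to pass the original size back when freeing.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NUM_CLASSES 8        /* hypothetical classes: 16, 32, ..., 2048 bytes */
#define MIN_CLASS_SIZE 16

typedef struct free_node { struct free_node *next; } free_node_t;

typedef struct {
    pthread_mutex_t lock;    /* one lock per size class */
    free_node_t *free_list;  /* recycled blocks of this class */
} bucket_t;

static bucket_t buckets[NUM_CLASSES];
static pthread_once_t buckets_once = PTHREAD_ONCE_INIT;

static void buckets_init(void)
{
    for (int i = 0; i < NUM_CLASSES; i++) {
        pthread_mutex_init(&buckets[i].lock, NULL);
        buckets[i].free_list = NULL;
    }
}

/* Index of the smallest class that fits n bytes, or -1 if n is too large. */
static int size_class(size_t n)
{
    size_t cap = MIN_CLASS_SIZE;
    for (int i = 0; i < NUM_CLASSES; i++, cap <<= 1)
        if (n <= cap)
            return i;
    return -1;
}

void *bucket_alloc(size_t n)
{
    int c = size_class(n);
    if (c < 0)
        return malloc(n);            /* large requests bypass the buckets */
    assert(c < NUM_CLASSES);
    pthread_once(&buckets_once, buckets_init);

    bucket_t *b = &buckets[c];
    pthread_mutex_lock(&b->lock);    /* only same-class requests contend */
    free_node_t *node = b->free_list;
    if (node != NULL)
        b->free_list = node->next;
    pthread_mutex_unlock(&b->lock);

    /* Empty bucket: fall back to a fresh block of the full class size. */
    return node != NULL ? (void *)node : malloc((size_t)MIN_CLASS_SIZE << c);
}

void bucket_free(void *p, size_t n)
{
    if (p == NULL)
        return;
    int c = size_class(n);
    if (c < 0) {
        free(p);
        return;
    }
    pthread_once(&buckets_once, buckets_init);

    bucket_t *b = &buckets[c];
    free_node_t *node = p;
    pthread_mutex_lock(&b->lock);
    node->next = b->free_list;       /* push onto this class's free list */
    b->free_list = node;
    pthread_mutex_unlock(&b->lock);
}
```

Rounding every request up to its class size is where the "more over-allocation" in the comment comes from: a 17-byte request occupies a 32-byte block in this sketch.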
