The mlock family: locking physical memory

Posted 2020-09-23 sky

Reposted from: http://blog.csdn.net/fjt19900921/article/details/8074541

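Locking memory keeps the operating system from swapping it out. Because the operation is risky, it has traditionally been reserved for the superuser; since Linux 2.6.9 an unprivileged process may also lock memory, up to its RLIMIT_MEMLOCK limit.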

The members of the family:

#include <sys/mman.h>

int mlock(const void *addr, size_t len);

int munlock(const void *addr, size_t len);

int mlockall(int flags);

int munlockall(void);

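The mlock family of system calls lets a program lock part or all of its address space into physical memory. This prevents Linux from paging that memory out to swap space, even if the program has not accessed it for a while.

A time-critical program may want to lock physical memory because the delay of paging memory out and back in can be too long or too unpredictable. A security-sensitive application may want to keep sensitive data from being written to the swap file, where an attacker could recover it after the program exits.

To lock a region of memory, call mlock with a pointer to the start of the region and the region's length. Linux manages memory in pages and can lock only whole pages, so every page that the region touches is locked. The getpagesize function returns the system's page size, which is 4 KB on x86 Linux.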

A simple test program:

#include<unistd.h>
#include<stdio.h>

int main()
{
        int i = getpagesize();
        printf("page size = %d.\n",i);
        return 0;
}

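For example, to allocate 32 MB of address space and lock it into memory, you would use code like this:

const int alloc_size = 32 * 1024 * 1024;
char* memory = malloc (alloc_size);
mlock (memory, alloc_size);

Note that merely allocating memory and calling mlock does not reserve that memory for the calling process, because the pages may be copy-on-write⁵. You should therefore write a dummy value to each page:

size_t i;
size_t page_size = getpagesize ();
for (i = 0; i < alloc_size; i += page_size)
  memory[i] = 0;

Writing to each page forces Linux to give the current process its own private copy of that page.

To unlock the region, call munlock with the same arguments.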

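If you want the program's entire address space locked into physical memory, use mlockall. This system call takes a single flags argument: MCL_CURRENT locks only the memory that is currently mapped, not memory allocated later, while MCL_FUTURE locks all pages mapped after the call. MCL_CURRENT|MCL_FUTURE locks both current and future allocations into physical memory.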

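Locking large amounts of memory, especially with mlockall, can be dangerous for the whole system. Indiscriminate locking can bring the system to its knees: the other processes must compete for fewer resources and are swapped in and out of physical memory more often (this is known as thrashing). If you lock too much memory, the system as a whole runs short of memory and Linux starts killing processes.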

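For this reason, locking memory with mlock or mlockall was originally restricted to processes with superuser privileges. Since Linux 2.6.9, an unprivileged process may lock memory up to its RLIMIT_MEMLOCK soft limit. A call that is not permitted fails: it returns -1 and sets errno to EPERM when the process lacks the required privilege, or to ENOMEM when the request exceeds the limit.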

The munlockall system call unlocks all memory locked by the current process, including regions locked with mlock or mlockall.

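A convenient way to monitor a program's memory use is the top command. In top's output, the SIZE column (VIRT in current versions of top) shows the size of each program's virtual address space: all of its code, data, and stack, some of which may have been swapped out. The RSS column (resident set size; RES in current versions) shows how much physical memory the program occupies. Leaving aside pages shared between processes, which are counted once per process, the RSS values of all running programs add up to no more than the machine's physical memory. Each address space is also limited in size (on 32-bit x86 Linux, typically to 3 GB of user space).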

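If you use the mlock system calls, include the <sys/mman.h> header.

5 Copy-on-write means that Linux creates a private copy of a region of memory for a process only when that process writes to some location in the region.

A sample program: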

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>      /* getpagesize */
#include<sys/mman.h>

const size_t alloc_size = 32 * 1024 * 1024; /* allocate 32 MB */
int main()
{
        char *memory = malloc(alloc_size);
        if(memory == NULL) {
                perror("malloc");
                return (-1);
        }
        if(mlock(memory,alloc_size) == -1) {
                perror("mlock");
                free(memory);
                return (-1);
        }
        size_t i;
        size_t page_size = getpagesize();
        /* write to every page so each one gets a private copy */
        for(i=0;i<alloc_size;i+=page_size) {
                printf("i=%zu\n",i);
                memory[i] = 0;
        }

        if(munlock(memory,alloc_size) == -1) {
                perror("munlock");
                free(memory);
                return (-1);
        }

        free(memory);
        return 0;
}

Remember to run it as root, or first raise the locked-memory limit (ulimit -l) to more than 32 MB.

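Copy on write is a fairly common technique in programming, and it occasionally comes up in interviews (Java written tests apparently often ask about copy-on-write strings). I ran into it today in the reference-counting chapter of More Effective C++. Below is a brief introduction to copy on write. We assume that the STL string supports copy on write (an assumption only, not verified; Microsoft Visual Studio 6.0 is used as the example here, and you can read the source yourself if you are interested).

How does copy on write work?
Experienced programmers will know that copy on write relies on reference counting: a variable stores the number of references. When the first string object is constructed, its constructor allocates heap memory for the data passed in. Whenever another object needs to share that memory, the count is incremented; whenever an object is destroyed, the count is decremented. Only when the last object is destroyed, and the count reaches 1 or 0 depending on the implementation, does the program actually free the heap memory.
Reference counting is what makes copy on write in the string class work!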

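When is copy on write triggered?
Clearly, a copy happens when an object that shares a block of memory changes its contents: for example through string's [], =, +=, and + operators, or through member functions such as insert, replace, and append. Destroying an object also updates the shared state, since it decrements the reference count.

Sample code: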

// Author: 代码疯子
// Blog: http://www.programlife.net/
// Reference counting & copy on write
#include <cstdio>
#include <iostream>
#include <string>
using namespace std;
int main(int argc, char **argv)
{
string sa = "Copy on write";
string sb = sa;
string sc = sb;
// %p with a void* cast prints the buffer address portably
printf("sa char buffer address: %p\n", (const void *)sa.c_str());
printf("sb char buffer address: %p\n", (const void *)sb.c_str());
printf("sc char buffer address: %p\n", (const void *)sc.c_str());
sc = "Now writing...";
printf("After writing sc:\n");
printf("sa char buffer address: %p\n", (const void *)sa.c_str());
printf("sb char buffer address: %p\n", (const void *)sb.c_str());
printf("sc char buffer address: %p\n", (const void *)sc.c_str());
return 0;
}

Output (VC 6.0):

[Figure: program output under VC 6.0]

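As the output shows, the string in VC6 supports copy on write, but my Visual Studio 2008 does not have this feature (in either Debug or Release builds):

[Figure: program output under Visual Studio 2008]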

Further reading (excerpted from Windows via C/C++, 5th Edition; if you would rather not read the English, see page 442 of the Chinese edition PDF):
Static Data Is Not Shared by Multiple Instances of an Executable or a DLL

When you create a new process for an application that is already running, the system simply opens another memory-mapped view of the file-mapping object that identifies the executable file’s image and creates a new process object and a new thread object (for the primary thread). The system also assigns new process and thread IDs to these objects. By using memory-mapped files, multiple running instances of the same application can share the same code and data in RAM.

Note one small problem here. Processes use a flat address space. When you compile and link your program, all the code and data are thrown together as one large entity. The data is separated from the code but only to the extent that it follows the code in the .exe file. (See the following note for more detail.) The following illustration shows a simplified view of how the code and data for an application are loaded into virtual memory and then mapped into an application’s address space.

[Figure: an application's code and data loaded into virtual memory and mapped into its address space]

As an example, let’s say that a second instance of an application is run. The system simply maps the pages of virtual memory containing the file’s code and data into the second application’s address space, as shown next.

[Figure: the same code and data pages mapped into the second instance's address space]

If one instance of the application alters some global variables residing in a data page, the memory contents for all instances of the application change. This type of change could cause disastrous effects and must not be allowed.

The system prohibits this by using the copy-on-write feature of the memory management system. Any time an application attempts to write to its memory-mapped file, the system catches the attempt, allocates a new block of memory for the page containing the memory the application is trying to write to, copies the contents of the page, and allows the application to write to this newly allocated memory block. As a result, no other instances of the same application are affected. The following illustration shows what happens when the first instance of an application attempts to change a global variable in data page 2:

[Figure: a new page allocated when the first instance writes to data page 2]

The system allocated a new page of virtual memory (labeled as “New page” in the image above) and copied the contents of data page 2 into it. The first instance’s address space is changed so that the new data page is mapped into the address space at the same location as the original address page. Now the system can let the process alter the global variable without fear of altering the data for another instance of the same application.

A similar sequence of events occurs when an application is being debugged. Let’s say that you’re running multiple instances of an application and want to debug only one instance. You access your debugger and set a breakpoint in a line of source code. The debugger modifies your code by changing one of your assembly language instructions to an instruction that causes the debugger to activate itself. So you have the same problem again. When the debugger modifies the code, it causes all instances of the application to activate the debugger when the changed assembly instruction is executed. To fix this situation, the system again uses copy-on-write memory. When the system senses that the debugger is attempting to change the code, it allocates a new block of memory, copies the page containing the instruction into the new page, and allows the debugger to modify the code in the page copy.

Copied from 程序人生
Home Page:http://www.programlife.NET
Source URL:http://www.programlife.net/copy-on-write.html

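Copy-on-write:

When an object is copied, the original is not actually duplicated to another location in memory. Instead, a pointer to the source object's location is set in the new object's memory mapping table, and the copy-on-write bit for that memory is set to 1.

A read on the new object then leaves memory untouched and simply proceeds. A write on the new object copies the real object to a new memory address, updates the new object's memory mapping table to point to the new location, and performs the write there.

The technique has to be used together with virtual memory and paging. Its advantage is that a copy operation only sets up a pointer instead of copying memory, which is far more efficient. That does not always hold, though: if most of the object is written after the copy, the writes cause a large number of page faults and the cost outweighs the gain. COW is therefore efficient only when a small share of the pages is written after the copy.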

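Linux implements fork with copy-on-write. Copy-on-write is a technique that delays or even avoids copying data. Rather than duplicating the whole address space at fork time, the kernel lets the parent and child share a single copy. Data is copied only when it is written, at which point each process gets its own copy. In other words, resources are duplicated only on write; until then they are shared read-only. This defers the copying of each page in the address space until it is actually written. When the pages are never written, for example when exec is called immediately after fork, they never need to be copied.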

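vfork does the same job as fork, except that it does not copy the parent's page table entries. The child runs as a separate thread in the parent's address space, and the parent is blocked until the child exits or calls exec. The child must not write to the address space.

fork: the child gets copies of the parent's data segment and stack segment. vfork: the child shares the data segment with the parent.

fork: the order in which parent and child run is not defined. vfork: the child is guaranteed to run first and shares the parent's data until it calls exec or exit; only after that call can the parent be scheduled. If the child depends on some further action by the parent before calling either function, the result is a deadlock.

The fork system call is the preferred way to create a new process. fork returns 0 in the child and the child's process ID in the parent (or -1 on failure), so the return value is what tells parent and child apart.

Apart from guaranteeing that user-space memory (the data segment) is not copied, the vfork system call is almost identical to fork.

The problem with vfork is that it requires the child to call exec immediately without modifying any memory, which is much harder to get right in practice, especially since the exec call may fail.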

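vfork was introduced to address the user-space memory that fork used to waste: a fork is very often followed by exec, and the idea behind vfork is to skip the copy.

All current Unix/Linux variants use copy on write, which makes an ordinary fork behave much like vfork. vfork has therefore become largely unnecessary.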
