Incorrect memory access: why is my kernel *not* crashing

【Posted】2020-05-02 12:10:36 【Question】:

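I want to show someone an example of an incorrect memory access: kernel-space code that touches user-space memory directly and faults.

So I took an old tutorial as a PoC. The important part is: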

static ssize_t dev_write(struct file *filep, const char *buffer, size_t len, loff_t *offset){
   sprintf(message, "%s(%zu letters)", buffer, len);   // appending received string with its length
   // [...]
}

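In one of my test environments this crashes, which is the expected behavior: I access buffer, a user-space variable, without the copy_*_user functions, so the memory protection mechanism is triggered and my process is killed.

On another machine, however, this snippet works just fine, which seems strange to me. Both machines run a 5.3 kernel with very similar kernel configurations.

Is the VM that does not crash broken? Is my code actually UB? Am I missing something?

After checking in gdb, I really am accessing the buffer variable, which gdb shows as unmapped...: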

gdb-peda$ hb dev_write
gdb-peda$ c
Thread 3 hit Breakpoint 1, dev_write () at /home/user/testMmap/ebbchar.c:144
144     static ssize_t dev_write(struct file *filep, const char *buffer, size_t len, loff_t *offset)
gdb-peda$ x/x $rip
0xffffffffc010d000 <dev_write>: 0x0f
gdb-peda$ x/s buffer
0x56139a0e5650: "a\n"
gdb-peda$ maintenance info sections
Exec file:
    `/home/max/prog/kgdb/remote/vmlinux', file type elf64-x86-64.
 [0]     0xffffffff81000000->0xffffffff81c04371 at 0x00200000: .text ALLOC LOAD RELOC READONLY CODE HAS_CONTENTS
 [1]     0xffffffff81c04374->0xffffffff81c0456c at 0x00e04374: .notes ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [2]     0xffffffff81c04570->0xffffffff81c08188 at 0x00e04570: __ex_table ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [3]     0xffffffff81e00000->0xffffffff82154f32 at 0x01000000: .rodata ALLOC LOAD RELOC DATA HAS_CONTENTS
 [4]     0xffffffff82154f40->0xffffffff82157af0 at 0x01354f40: .pci_fixup ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [5]     0xffffffff82157af0->0xffffffff82160b18 at 0x01357af0: __ksymtab ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [6]     0xffffffff82160b18->0xffffffff82169090 at 0x01360b18: __ksymtab_gpl ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [7]     0xffffffff82169090->0xffffffff8216d8a4 at 0x01369090: __kcrctab ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [8]     0xffffffff8216d8a4->0xffffffff82171b60 at 0x0136d8a4: __kcrctab_gpl ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [9]     0xffffffff82171b60->0xffffffff8219c23c at 0x01371b60: __ksymtab_strings ALLOC LOAD READONLY DATA HAS_CONTENTS
 [10]     0xffffffff8219c240->0xffffffff8219e478 at 0x0139c240: __param ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [11]     0xffffffff8219e478->0xffffffff8219f000 at 0x0139e478: __modver ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [12]     0xffffffff82200000->0xffffffff82349a00 at 0x01400000: .data ALLOC LOAD RELOC DATA HAS_CONTENTS
 [13]     0xffffffff82349a00->0xffffffff8235d2a8 at 0x01549a00: __bug_table ALLOC LOAD RELOC DATA HAS_CONTENTS in
 [14]     0xffffffff8235d2a8->0xffffffff824a7e28 at 0x0155d2a8: .orc_unwind_ip ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [15]     0xffffffff824a7e28->0xffffffff82697f68 at 0x016a7e28: .orc_unwind ALLOC LOAD READONLY DATA HAS_CONTENTS
 [16]     0xffffffff82697f68->0xffffffff826c807c at 0x01897f68: .orc_lookup ALLOC
 [17]     0xffffffff826c9000->0xffffffff826ca000 at 0x018c9000: .vvar ALLOC LOAD DATA HAS_CONTENTS
 [18]     0x00000000->0x0002b318 at 0x01a00000: .data..percpu ALLOC LOAD RELOC DATA HAS_CONTENTS
 [19]     0xffffffff826f6000->0xffffffff82764674 at 0x01af6000: .init.text ALLOC LOAD RELOC READONLY CODE HAS_CONTENTS
 [20]     0xffffffff82764674->0xffffffff8276500c at 0x01b64674: .altinstr_aux ALLOC LOAD RELOC READONLY CODE HAS_CONTENTS
 [21]     0xffffffff82766000->0xffffffff8284ccb0 at 0x01b66000: .init.data ALLOC LOAD RELOC DATA HAS_CONTENTS
 [22]     0xffffffff8284ccb0->0xffffffff8284ccd0 at 0x01c4ccb0: .x86_cpu_dev.init ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [23]     0xffffffff8284ccd0->0xffffffff8286ba8c at 0x01c4ccd0: .parainstructions ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [24]     0xffffffff8286ba90->0xffffffff828709bb at 0x01c6ba90: .altinstructions ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [25]     0xffffffff828709bb->0xffffffff82871f93 at 0x01c709bb: .altinstr_replacement ALLOC LOAD RELOC READONLY CODE HAS_CONTENTS
 [26]     0xffffffff82871f98->0xffffffff82872060 at 0x01c71f98: .iommu_table ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [27]     0xffffffff82872060->0xffffffff82872088 at 0x01c72060: .apicdrivers ALLOC LOAD RELOC DATA HAS_CONTENTS
 [28]     0xffffffff82872088->0xffffffff82872a81 at 0x01c72088: .exit.text ALLOC LOAD RELOC READONLY CODE HAS_CONTENTS
 [29]     0xffffffff82873000->0xffffffff8287a000 at 0x01c73000: .smp_locks ALLOC LOAD RELOC READONLY DATA HAS_CONTENTS
 [30]     0xffffffff8287a000->0xffffffff8287b000 at 0x01c7a000: .data_nosave ALLOC LOAD DATA HAS_CONTENTS
 [31]     0xffffffff8287b000->0xffffffff82a00000 at 0x01c7b000: .bss ALLOC
 [32]     0xffffffff82a00000->0xffffffff82a2c000 at 0x01c7b000: .brk ALLOC
 [33]     0x00000000->0x0000001c at 0x01c7b000: .comment READONLY HAS_CONTENTS
 [34]     0x00000000->0x000276c0 at 0x01c7b020: .debug_aranges RELOC READONLY HAS_CONTENTS
 [35]     0x00000000->0x0b2ba185 at 0x01ca26e0: .debug_info RELOC READONLY HAS_CONTENTS
 [36]     0x00000000->0x005172ad at 0x0cf5c865: .debug_abbrev READONLY HAS_CONTENTS
 [37]     0x00000000->0x012752a1 at 0x0d473b12: .debug_line RELOC READONLY HAS_CONTENTS
 [38]     0x00000000->0x0024d428 at 0x0e6e8db8: .debug_frame RELOC READONLY HAS_CONTENTS
 [39]     0x00000000->0x002d5379 at 0x0e9361e0: .debug_str READONLY HAS_CONTENTS
 [40]     0x00000000->0x00d028ae at 0x0ec0b559: .debug_loc RELOC READONLY HAS_CONTENTS
 [41]     0x00000000->0x00d46440 at 0x0f90de10: .debug_ranges RELOC READONLY HAS_CONTENTS
gdb-peda$ c
Continuing.
(finishes without crashing)

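Edit: to make sure the memory is unmapped, I tried, following @Tsyvarev's answer, to map it with the user-space test below. Oddly, my program does not crash even in this case...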

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/mman.h>

int main()
{
    struct stat s;
    int in = open("aaa", O_RDONLY | O_RSYNC);
    fstat(in, &s);
    int size = s.st_size;
    // File-backed mapping: its pages are typically not mapped until first touched
    char* ptr = mmap(NULL, size, PROT_READ, MAP_PRIVATE, in, 0);
    int out = open("/dev/ebbchar", O_WRONLY);
    // write() returns ssize_t, so it is printed with %zd
    printf("Written = %zd", write(out, ptr, size));
    close(in);
    close(out);
    return 0;
}

Note: the full PoC code can be found below (context there)

/**
 * @file   ebbchar.c
 * @author Derek Molloy
 * @date   7 April 2015
 * @version 0.1
 * @brief   An introductory character driver to support the second article of my series on
 * Linux loadable kernel module (LKM) development. This module maps to /dev/ebbchar and
 * comes with a helper C program that can be run in Linux user space to communicate with
 * this the LKM.
 * @see http://www.derekmolloy.ie/ for a full description and follow-up descriptions.
 */

#include <linux/init.h>           // Macros used to mark up functions e.g. __init __exit
#include <linux/module.h>         // Core header for loading LKMs into the kernel
#include <linux/device.h>         // Header to support the kernel Driver Model
#include <linux/kernel.h>         // Contains types, macros, functions for the kernel
#include <linux/fs.h>             // Header for the Linux file system support
#include <linux/uaccess.h>          // Required for the copy to user function
#define  DEVICE_NAME "ebbchar"    ///< The device will appear at /dev/ebbchar using this value
#define  CLASS_NAME  "ebb"        ///< The device class -- this is a character device driver

MODULE_LICENSE("GPL");            ///< The license type -- this affects available functionality
MODULE_AUTHOR("Derek Molloy");    ///< The author -- visible when you use modinfo
MODULE_DESCRIPTION("A simple Linux char driver for the BBB");  ///< The description -- see modinfo
MODULE_VERSION("0.1");            ///< A version number to inform users

static int    majorNumber;                  ///< Stores the device number -- determined automatically
static char   message[256] = {0};           ///< Memory for the string that is passed from userspace
static short  size_of_message;              ///< Used to remember the size of the string stored
static int    numberOpens = 0;              ///< Counts the number of times the device is opened
static struct class*  ebbcharClass  = NULL; ///< The device-driver class struct pointer
static struct device* ebbcharDevice = NULL; ///< The device-driver device struct pointer

// The prototype functions for the character driver -- must come before the struct definition
static int     dev_open(struct inode *, struct file *);
static int     dev_release(struct inode *, struct file *);
static ssize_t dev_read(struct file *, char *, size_t, loff_t *);
static ssize_t dev_write(struct file *, const char *, size_t, loff_t *);

/** @brief Devices are represented as file structure in the kernel. The file_operations structure from
 *  /linux/fs.h lists the callback functions that you wish to associated with your file operations
 *  using a C99 syntax structure. char devices usually implement open, read, write and release calls
 */
static struct file_operations fops =
{
   .open = dev_open,
   .read = dev_read,
   .write = dev_write,
   .release = dev_release,
};

/** @brief The LKM initialization function
 *  The static keyword restricts the visibility of the function to within this C file. The __init
 *  macro means that for a built-in driver (not a LKM) the function is only used at initialization
 *  time and that it can be discarded and its memory freed up after that point.
 *  @return returns 0 if successful
 */
static int __init ebbchar_init(void){
   printk(KERN_INFO "EBBChar: Initializing the EBBChar LKM\n");

   // Try to dynamically allocate a major number for the device -- more difficult but worth it
   majorNumber = register_chrdev(0, DEVICE_NAME, &fops);
   if (majorNumber<0){
      printk(KERN_ALERT "EBBChar failed to register a major number\n");
      return majorNumber;
   }
   printk(KERN_INFO "EBBChar: registered correctly with major number %d\n", majorNumber);

   // Register the device class
   ebbcharClass = class_create(THIS_MODULE, CLASS_NAME);
   if (IS_ERR(ebbcharClass)){                // Check for error and clean up if there is
      unregister_chrdev(majorNumber, DEVICE_NAME);
      printk(KERN_ALERT "Failed to register device class\n");
      return PTR_ERR(ebbcharClass);          // Correct way to return an error on a pointer
   }
   printk(KERN_INFO "EBBChar: device class registered correctly\n");

   // Register the device driver
   ebbcharDevice = device_create(ebbcharClass, NULL, MKDEV(majorNumber, 0), NULL, DEVICE_NAME);
   if (IS_ERR(ebbcharDevice)){               // Clean up if there is an error
      class_destroy(ebbcharClass);           // Repeated code but the alternative is goto statements
      unregister_chrdev(majorNumber, DEVICE_NAME);
      printk(KERN_ALERT "Failed to create the device\n");
      return PTR_ERR(ebbcharDevice);
   }
   printk(KERN_INFO "EBBChar: device class created correctly\n"); // Made it! device was initialized
   return 0;
}

/** @brief The LKM cleanup function
 *  Similar to the initialization function, it is static. The __exit macro notifies that if this
 *  code is used for a built-in driver (not a LKM) that this function is not required.
 */
static void __exit ebbchar_exit(void){
   device_destroy(ebbcharClass, MKDEV(majorNumber, 0));     // remove the device
   class_unregister(ebbcharClass);                          // unregister the device class
   class_destroy(ebbcharClass);                             // remove the device class
   unregister_chrdev(majorNumber, DEVICE_NAME);             // unregister the major number
   printk(KERN_INFO "EBBChar: Goodbye from the LKM!\n");
}

/** @brief The device open function that is called each time the device is opened
 *  This will only increment the numberOpens counter in this case.
 *  @param inodep A pointer to an inode object (defined in linux/fs.h)
 *  @param filep A pointer to a file object (defined in linux/fs.h)
 */
static int dev_open(struct inode *inodep, struct file *filep){
   numberOpens++;
   printk(KERN_INFO "EBBChar: Device has been opened %d time(s)\n", numberOpens);
   return 0;
}

/** @brief This function is called whenever device is being read from user space i.e. data is
 *  being sent from the device to the user. In this case is uses the copy_to_user() function to
 *  send the buffer string to the user and captures any errors.
 *  @param filep A pointer to a file object (defined in linux/fs.h)
 *  @param buffer The pointer to the buffer to which this function writes the data
 *  @param len The length of the b
 *  @param offset The offset if required
 */
static ssize_t dev_read(struct file *filep, char *buffer, size_t len, loff_t *offset){
   int error_count = 0;
   // copy_to_user has the format ( * to, *from, size) and returns 0 on success
   error_count = copy_to_user(buffer, message, size_of_message);

   if (error_count==0){            // if true then have success
      printk(KERN_INFO "EBBChar: Sent %d characters to the user\n", size_of_message);
      return (size_of_message=0);  // clear the position to the start and return 0
   }
   else {
      printk(KERN_INFO "EBBChar: Failed to send %d characters to the user\n", error_count);
      return -EFAULT;              // Failed -- return a bad address message (i.e. -14)
   }
}

/** @brief This function is called whenever the device is being written to from user space i.e.
 *  data is sent to the device from the user. The data is copied to the message[] array in this
 *  LKM using the sprintf() function along with the length of the string.
 *  @param filep A pointer to a file object
 *  @param buffer The buffer to that contains the string to write to the device
 *  @param len The length of the array of data that is being passed in the const char buffer
 *  @param offset The offset if required
 */
static ssize_t dev_write(struct file *filep, const char *buffer, size_t len, loff_t *offset){
   // Deliberate for this PoC: buffer is a user-space pointer read directly, without copy_from_user
   sprintf(message, "%s(%zu letters)", buffer, len);   // appending received string with its length
   size_of_message = strlen(message);                 // store the length of the stored message
   printk(KERN_INFO "EBBChar: Received %zu characters from the user\n", len);
   return len;
}

/** @brief The device release function that is called whenever the device is closed/released by
 *  the userspace program
 *  @param inodep A pointer to an inode object (defined in linux/fs.h)
 *  @param filep A pointer to a file object (defined in linux/fs.h)
 */
static int dev_release(struct inode *inodep, struct file *filep){
   printk(KERN_INFO "EBBChar: Device successfully closed\n");
   return 0;
}

/** @brief A module must use the module_init() module_exit() macros from linux/init.h, which
 *  identify the initialization function at insertion time and the cleanup function (as
 *  listed above)
 */
module_init(ebbchar_init);
module_exit(ebbchar_exit);

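【Comments】:

Once you start dealing with different privilege levels and memory spaces, the ordinary C notions of UB and IDB break down.

@ThomasJager That is true, but it does not answer my question. I still do not know why, in one environment, I can read an unmapped user-space pointer (low address) from a kernel module (high address). What could cause this strange behavior?

【Answer 1】: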

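Accessing user-space memory directly from kernel code is bad because two situations are possible:

    The accessed memory may not belong to the process, because user-space code passed a wrong pointer to the system call (by mistake or on purpose).

    The accessed memory may belong to the process but not be currently mapped.

In both cases a page fault is triggered, and because the fault was caused by kernel code, the system treats it as a kernel bug.

Accessing user-space memory properly, via copy_to_user/copy_from_user, handles these scenarios gracefully:

    If the memory does not belong to the user-space process, the copy_*_user function returns an error indicator.

    If the memory belongs to the user-space process, the copy_*_user function makes sure it is mapped during the access.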
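The first case can be observed from user space without any custom module. The sketch below is an illustration added for this write-up, not code from the thread. It makes three assumptions: it writes into a pipe rather than /dev/ebbchar (on the expectation that the Linux pipe write path copies the buffer through the checked user-copy routines), it uses a PROT_NONE page as a stand-in for an invalid pointer, and the helper name errno_for_bad_buffer is made up for the example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Writes 4 bytes from an inaccessible buffer into a pipe.
 * Returns the errno reported by write(), 0 if the write unexpectedly
 * succeeded, or -1 if the setup itself failed. */
int errno_for_bad_buffer(void)
{
    int fds[2];
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    if (pipe(fds) != 0)
        return -1;

    /* A page the process owns but may not read: stand-in for a bad pointer. */
    void *bad = mmap(NULL, (size_t)page, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (bad == MAP_FAILED) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    errno = 0;
    ssize_t n = write(fds[1], bad, 4);   /* kernel must read from 'bad' */
    int err = (n < 0) ? errno : 0;

    munmap(bad, (size_t)page);
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

On Linux the helper is expected to return EFAULT: the system call fails cleanly and nothing crashes. Pointing the same write at the unmodified /dev/ebbchar would exercise the unchecked sprintf path from the question instead.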

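So, to illustrate why accessing user-space memory directly is bad, you can trigger the scenarios above:

    Pass an invalid pointer (e.g. NULL) to the write system call and observe a kernel crash instead of a returned error code.

    Pass a correct pointer to currently unmapped memory to the write system call and observe a kernel crash instead of a correct memory access.

    A non-mapped pointer can be obtained by opening some (other) file and mmap-ing its contents: for most filesystems, mmap returns memory that is initially unmapped.

    Clarification: a successful mmap() call returns a pointer to memory that belongs to the user process. But that memory may not be mapped at that point. The first access to the memory (from user-space code) triggers a page fault, during which the memory gets mapped.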
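The "mapped on first access" behavior from the clarification can be probed from user space. This is a minimal sketch added for illustration, under stated assumptions: it uses an anonymous private mapping rather than the file-backed one the answer describes (for file mappings, mincore() can reflect the page cache rather than this process's mapping), and the names page_resident and probe_first_touch are invented for the example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 1 if the page at addr is reported resident, 0 if not, -1 on error. */
static int page_resident(void *addr, size_t page)
{
    unsigned char vec = 0;
    if (mincore(addr, page, &vec) != 0)
        return -1;
    return vec & 1;
}

/* Maps one anonymous page and records what mincore() reports before and
 * after the first write to it. Returns 0 on success, -1 on failure. */
int probe_first_touch(int *before, int *after)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    *before = page_resident(p, (size_t)page);
    *(volatile char *)p = 'a';   /* first access: the fault handler maps the page */
    *after = page_resident(p, (size_t)page);

    munmap(p, (size_t)page);
    return 0;
}
```

After the write, the page should be reported as resident. The value seen before the write depends on the kernel and environment, so the sketch only records it and does not rely on it.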

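【Discussion】:

Interesting answer. However, in my case I test my code with # echo 'test'>/dev/ebbchar. So, unless I am missing something, my buffer is not mapped. And since I access it directly (without copy_*_user), I should crash. Strangely, that is not the case in one VM...

It does not have to be mmap()ed; it has to be in physical memory.

@MaximeB.: "So, unless I am missing something, my buffer is not mapped." - The buffer is allocated by the cat program, so it belongs to the process's memory and is normally mapped. (I wrote "normally" because nobody can be sure of that.) By contrast, the memory region returned by the mmap system call, while also belonging to the process's memory, is usually unmapped until it is accessed.

@Tsyvarev Sorry, I misunderstood your answer. With this clarification it is much clearer. I tried using mmap, but it still does not crash... I edited my post to show this.

Yes, your current code reflects my suggestion. So the mmap approach does not work for demonstrating scenario 2. I do not know why...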
