Linux/Document: Livepatch
Posted rtoax
Livepatch (The Linux Kernel documentation): https://www.kernel.org/doc/html/latest/livepatch/livepatch.html
This document outlines basic information about kernel livepatching.
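Livepatch limitations:
- Only functions that can be traced can be patched.
- Livepatch works reliably only when the dynamic ftrace is located at the very beginning of the function.
- Kretprobes using the ftrace framework conflict with patched functions.
- Kprobes in the original function are ignored when the code is redirected to the new implementation.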
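1. Motivation
There are many situations where users are reluctant to reboot a system. It may be because the system is performing complex scientific computations or is under heavy load during peak usage. In addition to keeping systems up and running, users also want a stable and secure system. Livepatching gives users both by allowing function calls to be redirected, so critical functions can be fixed without a system reboot.
2. Kprobes, Ftrace, Livepatching
There are multiple mechanisms in the Linux kernel that are directly related to redirection of code execution, namely kernel probes, function tracing, and livepatching:
Kernel probes are the most generic. The code can be redirected by putting a breakpoint instruction in place of any instruction.
The function tracer calls the code from a predefined location that is close to the function entry point. This location is generated by the compiler using the '-pg' gcc option.
Livepatching typically needs to redirect the code at the very beginning of the function entry, before the function parameters or the stack are modified in any way.
All three approaches need to modify the existing code at runtime. Therefore they need to be aware of each other and not step on each other's toes. Most of these problems are solved by using the dynamic ftrace framework as a base. A Kprobe is registered as an ftrace handler when the function entry is probed, see CONFIG_KPROBES_ON_FTRACE. An alternative function from a live patch is likewise called with the help of a custom ftrace handler. But there are some limitations, see below.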
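3. Consistency model
Functions are there for a reason. They take some input parameters, get or release locks, read, process, and even write some data in a defined way, and have return values. In other words, each function has defined semantics.
Many fixes do not change the semantics of the modified functions. For example, they add a NULL pointer or a boundary check, fix a race by adding a missing memory barrier, or add some locking around a critical section. Most of these changes are self-contained and the function presents itself the same way to the rest of the system. In this case, the functions can be updated independently, one by one.
But there are more complex fixes. For example, a patch might change the ordering of locking in multiple functions at the same time. Or a patch might exchange the meaning of some temporary structures and update all the relevant functions. In this case, the affected unit (thread, whole kernel) needs to start using all new versions of the functions at the same time. The switch must also happen only when it is safe to do so, e.g. when the affected locks are released or no data are stored in the modified structures at that moment.
The theory about how to apply functions in a safe way is rather complex. The aim is to define a so-called consistency model. It attempts to define the conditions under which the new implementation can be used so that the system stays consistent.
Livepatch has a consistency model which is a hybrid of kGraft and kpatch: it uses kGraft's per-task consistency and syscall barrier switching combined with kpatch's stack trace switching. There are also a number of fallback options which make it quite flexible.
Patches are applied on a per-task basis, when the task is deemed safe to switch over. When a patch is enabled, livepatch enters a transition state where tasks converge to the patched state. Usually this transition state completes in a few seconds. The same sequence occurs when a patch is disabled, except that the tasks converge from the patched state to the unpatched state.
An interrupt handler inherits the patched state of the task it interrupts. The same is true for forked tasks: the child inherits the patched state of the parent.
Livepatch uses several complementary approaches to determine when it is safe to patch tasks: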
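- The first and most effective approach is stack checking of sleeping tasks. If no affected functions are on the stack of a given task, the task is patched. In most cases this will patch most or all of the tasks on the first try. Otherwise it will keep trying periodically. This option is only available if the architecture has reliable stacks (HAVE_RELIABLE_STACKTRACE).
- The second approach, if needed, is kernel exit switching. A task is switched when it returns to user space from a system call, a user space IRQ, or a signal. It is useful in the following cases:
  - Patching I/O-bound user tasks which are sleeping on an affected function. In this case you have to send SIGSTOP and SIGCONT to force the task to exit the kernel and be patched.
  - Patching CPU-bound user tasks. If the task is highly CPU-bound then it will get patched the next time it gets interrupted by an IRQ.
- For idle "swapper" tasks, since they never exit the kernel, there is instead a klp_update_patch_state() call in the idle loop which allows them to be patched before the CPU enters the idle state.
  (Note there is not yet such an approach for kthreads.)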
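The stack-checking approach can be pictured with a small user-space toy model. This is only a sketch under loud assumptions: a "stack" here is an array of function names and `stack_is_safe()` is a hypothetical helper, whereas the kernel works on a reliable stack trace of addresses rather than on strings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Toy model (hypothetical, not kernel code): a task may be switched
 * only if none of the functions touched by the patch appear anywhere
 * on its stack.
 */
static bool stack_is_safe(const char *const *stack, size_t depth,
			  const char *const *affected, size_t n_affected)
{
	size_t i, j;

	for (i = 0; i < depth; i++)
		for (j = 0; j < n_affected; j++)
			if (strcmp(stack[i], affected[j]) == 0)
				return false; /* still running old code */
	return true;
}
```

A task sleeping in, say, `schedule()` with no patched function below it would pass this check on the first try; a task sleeping inside a patched function would fail and be retried later, or be handled by kernel exit switching.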
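Architectures which do not have HAVE_RELIABLE_STACKTRACE rely solely on the second approach. It is highly likely that some tasks will still be running an old version of a function until that function returns. In this case you would have to signal the tasks. This especially applies to kthreads. They may not be woken up and would need to be forced. See below for more information.
Unless another way to patch kthreads is found, architectures without HAVE_RELIABLE_STACKTRACE are not considered fully supported by kernel livepatching.
The /sys/kernel/livepatch/<patch>/transition file shows whether a patch is in transition. Only a single patch can be in transition at a given time. A patch can remain in transition indefinitely if any of the tasks are stuck in the initial patch state.
A transition can be reversed and effectively canceled by writing the opposite value to the /sys/kernel/livepatch/<patch>/enabled file while the transition is in progress. All the tasks will then attempt to converge back to the original patch state.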
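As a rough illustration of the two sysfs files just described, the following user-space sketch reads `transition` and, if a transition is in progress, writes the opposite value to `enabled`. The helper names (`lp_attr_path`, `lp_read_flag`, `lp_write_flag`, `lp_reverse_transition`) are hypothetical; only the two file paths come from the text, and the code assumes each file holds a single `0` or `1`.

```c
#include <stdio.h>

/* Build "/sys/kernel/livepatch/<patch>/<attr>"; 0 on success, -1 if it does not fit. */
static int lp_attr_path(char *buf, size_t len, const char *patch, const char *attr)
{
	int n = snprintf(buf, len, "/sys/kernel/livepatch/%s/%s", patch, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read a 0/1 attribute; returns the value, or -1 on any error. */
static int lp_read_flag(const char *patch, const char *attr)
{
	char path[256];
	FILE *f;
	int c;

	if (lp_attr_path(path, sizeof(path), patch, attr) != 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	c = fgetc(f);
	fclose(f);
	return (c == '0' || c == '1') ? c - '0' : -1;
}

/* Write 0 or 1 to an attribute; returns 0 on success, -1 on error. */
static int lp_write_flag(const char *patch, const char *attr, int value)
{
	char path[256];
	FILE *f;
	int ok;

	if (lp_attr_path(path, sizeof(path), patch, attr) != 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputc(value ? '1' : '0', f) != EOF;
	if (fclose(f) != 0) /* buffered write errors surface here */
		ok = 0;
	return ok ? 0 : -1;
}

/* Cancel an in-progress transition by flipping "enabled"; -1 if none is running. */
static int lp_reverse_transition(const char *patch)
{
	int enabled;

	if (lp_read_flag(patch, "transition") != 1)
		return -1;
	enabled = lp_read_flag(patch, "enabled");
	if (enabled < 0)
		return -1;
	return lp_write_flag(patch, "enabled", !enabled);
}
```

Calling `lp_reverse_transition()` with the patch's directory name would need root privileges; on a system where that patch is not loaded, or not in transition, the function simply returns -1.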
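There is also a /proc/<pid>/patch_state file which can be used to determine which tasks are blocking completion of a patching operation. If a patch is in transition, this file shows 0 to indicate the task is unpatched and 1 to indicate it is patched. Otherwise, if no patch is in transition, it shows -1. Any tasks which are blocking the transition can be signaled with SIGSTOP and SIGCONT to force them to change their patched state. This may be harmful to the system though. Sending a fake signal to all remaining blocking tasks is a better alternative. No proper signal is actually delivered (there is no data in the signal pending structures). Tasks are interrupted or woken up, and forced to change their patched state. The fake signal is automatically sent every 15 seconds.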
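The three values of patch_state can be decoded with a few lines of user-space C. This is a minimal sketch: the enum and the `ps_` helpers are hypothetical names invented for illustration, and the only facts taken from the text are the file path and the meaning of -1, 0 and 1.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical labels for the values described in the text. */
enum ps_state {
	PS_NO_TRANSITION = -1, /* file shows -1: no patch in transition */
	PS_UNPATCHED = 0,      /* file shows 0 */
	PS_PATCHED = 1,        /* file shows 1 */
	PS_UNKNOWN = 2         /* unreadable or unexpected content */
};

/* Map the text content of a patch_state file to a state. */
static enum ps_state ps_parse(const char *text)
{
	char *end;
	long v;

	if (!text)
		return PS_UNKNOWN;
	v = strtol(text, &end, 10);
	if (end == text || v < -1 || v > 1)
		return PS_UNKNOWN;
	return (enum ps_state)v;
}

/* Read /proc/<pid>/patch_state for a single task. */
static enum ps_state ps_read(int pid)
{
	char path[64], buf[16];
	enum ps_state st = PS_UNKNOWN;
	FILE *f;

	snprintf(path, sizeof(path), "/proc/%d/patch_state", pid);
	f = fopen(path, "r");
	if (!f)
		return PS_UNKNOWN;
	if (fgets(buf, sizeof(buf), f))
		st = ps_parse(buf);
	fclose(f);
	return st;
}
```

While a patch is being enabled, the tasks still reporting 0 are the ones holding up the transition; while it is being disabled, the ones still reporting 1 are.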
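An administrator can also affect a transition through the /sys/kernel/livepatch/<patch>/force attribute. Writing 1 there clears the TIF_PATCH_PENDING flag of all tasks and thus forces the tasks to the patched state.
Important note! The force attribute is intended for cases when the transition gets stuck for a long time because of a blocking task. The administrator is expected to collect all necessary data (namely stack traces of such blocking tasks) and request clearance from the patch distributor before forcing the transition. Unauthorized usage may cause harm to the system. It depends on the nature of the patch, which functions are (un)patched, and which functions the blocking tasks are sleeping in (/proc/<pid>/stack may help here).
Removal (rmmod) of patch modules is permanently disabled when the force feature is used. It cannot be guaranteed that no task is sleeping in such a module. This implies an unbounded reference count if a patch module is disabled and enabled in a loop.
Moreover, the usage of force may also affect future applications of live patches and cause even more harm to the system. The administrator should first consider simply canceling the transition (see above). If force is used, a reboot should be planned and no more live patches applied.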
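3.1 Adding consistency model support to new architectures
For adding consistency model support to new architectures, there are a few options:
- Add CONFIG_HAVE_RELIABLE_STACKTRACE. This means porting objtool, and for non-DWARF unwinders, also making sure there is a way for the stack tracing code to detect interrupts on the stack.
- Alternatively, ensure that every kthread has a call to klp_update_patch_state() in a safe location. Kthreads are typically in an infinite loop which does some action repeatedly. The safe location to switch the kthread's patch state would be at a designated point in the loop where no locks are taken and all data structures are in a well-defined state.
  The location is clear when using workqueues or the kthread worker API. These kthreads process independent actions in a generic loop.
  It is much more complicated with kthreads which have a custom loop. There the safe location must be carefully selected on a case-by-case basis.
  In that case, architectures without HAVE_RELIABLE_STACKTRACE would still be able to use the non-stack-checking parts of the consistency model:
  - patching user tasks when they cross the kernel/user space boundary; and
  - patching kthreads and idle tasks at their designated patch points.
  This option is not as good as the first one because it requires signaling user tasks and waking kthreads to patch them. But it could still be a good backup option for architectures which do not have reliable stack traces yet.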
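4. Livepatch module
Livepatches are distributed as kernel modules; see samples/livepatch/livepatch-sample.c, whose code is reproduced below.
The module includes a new implementation of the functions that we want to replace. In addition, it defines some structures describing the relation between the original and the new implementation. Then there is code that makes the kernel start using the new code when the livepatch module is loaded. There is also code that cleans up before the livepatch module is removed. All this is explained in more detail in the next sections.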
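#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/livepatch.h>
#include <linux/seq_file.h>

/* New implementation; the prefix makes it easy to spot in backtraces. */
static int livepatch_cmdline_proc_show(struct seq_file *m, void *v)
{
	seq_printf(m, "%s\n", "this has been live patched");
	return 0;
}

/* One entry per patched function; the empty entry terminates the array. */
static struct klp_func funcs[] = {
	{
		.old_name = "cmdline_proc_show",
		.new_func = livepatch_cmdline_proc_show,
	}, { }
};

/* One entry per patched object (vmlinux or a module). */
static struct klp_object objs[] = {
	{
		/* name being NULL means vmlinux */
		.funcs = funcs,
	}, { }
};

static struct klp_patch patch = {
	.mod = THIS_MODULE,
	.objs = objs,
};

/* Enable the patch at load time; an error refuses the module load. */
static int livepatch_init(void)
{
	return klp_enable_patch(&patch);
}

/* Intentionally empty in the upstream sample. */
static void livepatch_exit(void)
{
}

module_init(livepatch_init);
module_exit(livepatch_exit);
MODULE_LICENSE("GPL");
MODULE_INFO(livepatch, "Y");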
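My personal impression is that this approach is not very different from ordinary ftrace and kprobe usage and demands a lot of work from the programmer, whereas the user-space libcare does not work this way. I will look into how kpatch works later. (2021-09-20 18:25:48)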
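4.1. New functions
New versions of functions are typically just copied from the original sources. A good practice is to add a prefix to the names so that they can be distinguished from the original ones, e.g. in a backtrace. They can also be declared static because they are not called directly and do not need global visibility.
The patch contains only functions that are really modified. But they might want to access functions or data from the original source file that may only be locally accessible. This can be solved by a special relocation section in the generated livepatch module, see Livepatch module ELF format for more details.
4.2. Metadata
The patch is described by several structures that split the information into three levels:
struct klp_func is defined for each patched function. It describes the relation between the original and the new implementation of a particular function.
The structure includes the name, as a string, of the original function. The function address is found via kallsyms at runtime.
Then it includes the address of the new function. It is defined directly by assigning the function pointer. Note that the new function is typically defined in the same source file.
As an optional parameter, the symbol position in the kallsyms database can be used to disambiguate functions of the same name. This is not the absolute position in the database, but rather the order in which the symbol is found within a particular object (vmlinux or a kernel module). Note that kallsyms allows for searching symbols according to the object name.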
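The per-object symbol position can be illustrated with a toy lookup over a made-up symbol table. Everything here is hypothetical (`struct toy_sym`, `toy_lookup`, the addresses, and the use of the string "vmlinux" for the kernel image); the convention that position 0 means "the name must be unique within the object" is an assumption about the kernel's behavior, not something stated in the text above.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for one kallsyms entry (hypothetical). */
struct toy_sym {
	const char *object;  /* "vmlinux" or a module name */
	const char *name;    /* symbol name */
	unsigned long addr;  /* made-up address */
};

/*
 * Return the address of the sympos-th occurrence of name within object,
 * counting only that object's symbols. With sympos == 0 the name must
 * be unique in the object (assumed convention). Returns 0 if unresolved.
 */
static unsigned long toy_lookup(const struct toy_sym *tab, size_t n,
				const char *object, const char *name,
				unsigned long sympos)
{
	unsigned long count = 0, addr = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(tab[i].object, object) != 0 ||
		    strcmp(tab[i].name, name) != 0)
			continue;
		count++;
		addr = tab[i].addr;
		if (sympos != 0 && count == sympos)
			return addr;
	}
	if (sympos == 0 && count == 1)
		return addr;
	return 0; /* not found, ambiguous, or position out of range */
}
```

With two static functions named `foo` in vmlinux, position 2 selects the second one regardless of how many symbols from other objects sit between them in the table.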
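struct klp_object defines an array of patched functions (struct klp_func) in the same object, where the object is either vmlinux (NULL) or a module name.
The structure helps to group and handle the functions of each object together. Note that patched modules might be loaded later than the patch itself, and the relevant functions might be patched only when they become available.
struct klp_patch defines an array of patched objects (struct klp_object).
This structure handles all patched functions consistently and, eventually, synchronously. The whole patch is applied only when all patched symbols are found. The only exception is symbols from objects (kernel modules) that have not been loaded yet.
For more details on how the patch is applied on a per-task basis, see the "Consistency model" section.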
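5. Livepatch life-cycle
Livepatching can be described by five basic operations: loading, enabling, replacing, disabling, removing.
The replacing and the disabling operations are mutually exclusive. They have the same result for the given patch but not for the system.
5.1. Loading
The only reasonable way is to enable the patch when the livepatch kernel module is being loaded. For this, klp_enable_patch() has to be called in the module_init() callback. There are two main reasons:
First, only the module has easy access to the related struct klp_patch.
Second, the error code can be used to refuse loading the module when the patch cannot be enabled.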
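5.2. Enabling
The livepatch gets enabled by calling klp_enable_patch() from the module_init() callback. The system will start using the new implementation of the patched functions at this stage.
First, the addresses of the patched functions are found according to their names. The special relocations, mentioned in the section "New functions", are applied. The relevant entries are created under /sys/kernel/livepatch/<name>. The patch is rejected when any of these operations fails.
Second, livepatch enters a transition state where tasks converge to the patched state. If an original function is patched for the first time, a function-specific struct klp_ops is created and a universal ftrace handler is registered. This stage is indicated by a value of '1' in /sys/kernel/livepatch/<name>/transition. For more information about this process, see the "Consistency model" section.
Finally, once all tasks have been patched, the 'transition' value changes to '0'.
Note that functions might be patched multiple times. The ftrace handler is registered only once for a given function. Further patches just add an entry to the list (see field func_stack) of the struct klp_ops. The right implementation is selected by the ftrace handler, see the "Consistency model" section.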
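How a handler might pick an implementation from the func_stack list can be sketched as a simplified user-space model. The names `struct toy_func` and `toy_select` are hypothetical, and the fallback rule (a task that is still unpatched skips an entry whose patch is in transition and uses the previous entry, or the original function if there is none) is an assumption drawn from the consistency model rather than a copy of the kernel handler.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one func_stack entry (hypothetical). */
struct toy_func {
	const char *impl;            /* label of the implementation */
	bool transition;             /* its patch is still in transition */
	const struct toy_func *prev; /* entry added before this one, or NULL */
};

/*
 * Return the label of the code a task would run. "top" is the most
 * recently added entry; task_patched is the task's per-task patch state.
 */
static const char *toy_select(const struct toy_func *top, bool task_patched)
{
	const struct toy_func *f = top;

	/* A not-yet-switched task must not see the in-transition entry. */
	if (f && f->transition && !task_patched)
		f = f->prev;
	return f ? f->impl : "original";
}
```

In this model, two tasks calling the same function during a transition can end up in different implementations, which is exactly the per-task behavior the consistency model describes.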
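That said, it is highly recommended to use cumulative livepatches because they help keep all changes consistent. In this case, functions might be patched twice only during the transition period.
5.3. Replacing
All enabled patches might get replaced by a cumulative patch that has the .replace flag set.
Once the new patch is enabled and the 'transition' finishes, all the functions (struct klp_func) associated with the replaced patches are removed from the corresponding struct klp_ops. In addition, the ftrace handler is unregistered and the struct klp_ops is freed when the related function is not modified by the new patch and the func_stack list becomes empty.
See Atomic Replace & Cumulative Patches for more details.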
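5.4. Disabling
Enabled patches might get disabled by writing '0' to /sys/kernel/livepatch/<name>/enabled.
First, livepatch enters a transition state where tasks converge to the unpatched state. The system starts using either the code from the previously enabled patch or the original code. This stage is indicated by a value of '1' in /sys/kernel/livepatch/<name>/transition. For more information about this process, see the "Consistency model" section.
Second, once all tasks have been unpatched, the 'transition' value changes to '0'. All the functions (struct klp_func) associated with the to-be-disabled patch are removed from the corresponding struct klp_ops. The ftrace handler is unregistered and the struct klp_ops is freed when the func_stack list becomes empty.
Third, the sysfs interface is destroyed.
5.5. Removing
Module removal is only safe when there are no users of the functions provided by the module. This is the reason why the force feature permanently disables the removal. Only when the system has successfully transitioned to a new patch state (patched/unpatched) without being forced is it guaranteed that no task sleeps or runs in the old code.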
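6. Sysfs
Information about the registered patches can be found under /sys/kernel/livepatch. The patches can be enabled and disabled by writing there.
The /sys/kernel/livepatch/<patch>/force attribute allows an administrator to affect a patching operation.
See Documentation/ABI/testing/sysfs-kernel-livepatch for more details.
7. Limitations
The current Livepatch implementation has several limitations:
Only functions that can be traced can be patched.
Livepatch is based on the dynamic ftrace. In particular, functions implementing ftrace or the livepatch ftrace handler cannot be patched. Otherwise, the code would end up in an infinite loop. A potential mistake is prevented by marking the problematic functions with "notrace".
Livepatch works reliably only when the dynamic ftrace is located at the very beginning of the function.
The function needs to be redirected before the stack or the function parameters are modified in any way. For example, livepatch requires using the -fentry gcc compiler option on x86_64.
One exception is the PPC port. It uses relative addressing and TOC. Each function has to handle TOC and save LR before it can call the ftrace handler. This operation has to be reverted on return. Fortunately, the generic ftrace code has the same problem and all this is handled on the ftrace level.
Kretprobes using the ftrace framework conflict with the patched functions.
Both kretprobes and livepatches use an ftrace handler that modifies the return address. The first user wins. Either the probe or the patch is rejected when the handler is already in use by the other.
Kprobes in the original function are ignored when the code is redirected to the new implementation.
There is work in progress to add warnings about this situation.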
English original
1. Motivation
There are many situations where users are reluctant to reboot a system. It may be because their system is performing complex scientific computations or under heavy load during peak usage. In addition to keeping systems up and running, users want to also have a stable and secure system. Livepatching gives users both by allowing for function calls to be redirected; thus, fixing critical functions without a system reboot.
2. Kprobes, Ftrace, Livepatching
There are multiple mechanisms in the Linux kernel that are directly related to redirection of code execution; namely: kernel probes, function tracing, and livepatching:
The kernel probes are the most generic. The code can be redirected by putting a breakpoint instruction instead of any instruction.
The function tracer calls the code from a predefined location that is close to the function entry point. This location is generated by the compiler using the ‘-pg’ gcc option.
Livepatching typically needs to redirect the code at the very beginning of the function entry before the function parameters or the stack are in any way modified.
All three approaches need to modify the existing code at runtime. Therefore they need to be aware of each other and not step over each other’s toes. Most of these problems are solved by using the dynamic ftrace framework as a base. A Kprobe is registered as a ftrace handler when the function entry is probed, see CONFIG_KPROBES_ON_FTRACE. Also an alternative function from a live patch is called with the help of a custom ftrace handler. But there are some limitations, see below.
3. Consistency model
Functions are there for a reason. They take some input parameters, get or release locks, read, process, and even write some data in a defined way, have return values. In other words, each function has a defined semantic.
Many fixes do not change the semantic of the modified functions. For example, they add a NULL pointer or a boundary check, fix a race by adding a missing memory barrier, or add some locking around a critical section. Most of these changes are self contained and the function presents itself the same way to the rest of the system. In this case, the functions might be updated independently one by one.
But there are more complex fixes. For example, a patch might change ordering of locking in multiple functions at the same time. Or a patch might exchange meaning of some temporary structures and update all the relevant functions. In this case, the affected unit (thread, whole kernel) needs to start using all new versions of the functions at the same time. Also the switch must happen only when it is safe to do so, e.g. when the affected locks are released or no data are stored in the modified structures at the moment.
The theory about how to apply functions in a safe way is rather complex. The aim is to define a so-called consistency model. It attempts to define conditions when the new implementation could be used so that the system stays consistent.
Livepatch has a consistency model which is a hybrid of kGraft and kpatch: it uses kGraft’s per-task consistency and syscall barrier switching combined with kpatch’s stack trace switching. There are also a number of fallback options which make it quite flexible.
Patches are applied on a per-task basis, when the task is deemed safe to switch over. When a patch is enabled, livepatch enters into a transition state where tasks are converging to the patched state. Usually this transition state can complete in a few seconds. The same sequence occurs when a patch is disabled, except the tasks converge from the patched state to the unpatched state.
An interrupt handler inherits the patched state of the task it interrupts. The same is true for forked tasks: the child inherits the patched state of the parent.
Livepatch uses several complementary approaches to determine when it’s safe to patch tasks:
- The first and most effective approach is stack checking of sleeping tasks. If no affected functions are on the stack of a given task, the task is patched. In most cases this will patch most or all of the tasks on the first try. Otherwise it’ll keep trying periodically. This option is only available if the architecture has reliable stacks (HAVE_RELIABLE_STACKTRACE).
- The second approach, if needed, is kernel exit switching. A task is switched when it returns to user space from a system call, a user space IRQ, or a signal. It’s useful in the following cases:
  - Patching I/O-bound user tasks which are sleeping on an affected function. In this case you have to send SIGSTOP and SIGCONT to force it to exit the kernel and be patched.
  - Patching CPU-bound user tasks. If the task is highly CPU-bound then it will get patched the next time it gets interrupted by an IRQ.
- For idle “swapper” tasks, since they don’t ever exit the kernel, they instead have a klp_update_patch_state() call in the idle loop which allows them to be patched before the CPU enters the idle state.
  (Note there’s not yet such an approach for kthreads.)
Architectures which don’t have HAVE_RELIABLE_STACKTRACE solely rely on the second approach. It’s highly likely that some tasks may still be running with an old version of the function, until that function returns. In this case you would have to signal the tasks. This especially applies to kthreads. They may not be woken up and would need to be forced. See below for more information.
Unless we can come up with another way to patch kthreads, architectures without HAVE_RELIABLE_STACKTRACE are not considered fully supported by the kernel livepatching.
The /sys/kernel/livepatch/<patch>/transition file shows whether a patch is in transition. Only a single patch can be in transition at a given time. A patch can remain in transition indefinitely, if any of the tasks are stuck in the initial patch state.
A transition can be reversed and effectively canceled by writing the opposite value to the /sys/kernel/livepatch/<patch>/enabled file while the transition is in progress. Then all the tasks will attempt to converge back to the original patch state.
There’s also a /proc/<pid>/patch_state file which can be used to determine which tasks are blocking completion of a patching operation. If a patch is in transition, this file shows 0 to indicate the task is unpatched and 1 to indicate it’s patched. Otherwise, if no patch is in transition, it shows -1. Any tasks which are blocking the transition can be signaled with SIGSTOP and SIGCONT to force them to change their patched state. This may be harmful to the system though. Sending a fake signal to all remaining blocking tasks is a better alternative. No proper signal is actually delivered (there is no data in signal pending structures). Tasks are interrupted or woken up, and forced to change their patched state. The fake signal is automatically sent every 15 seconds.
An administrator can also affect a transition through the /sys/kernel/livepatch/<patch>/force attribute. Writing 1 there clears the TIF_PATCH_PENDING flag of all tasks and thus forces the tasks to the patched state.
Important note! The force attribute is intended for cases when the transition gets stuck for a long time because of a blocking task. The administrator is expected to collect all necessary data (namely stack traces of such blocking tasks) and request clearance from a patch distributor to force the transition. Unauthorized usage may cause harm to the system. It depends on the nature of the patch, which functions are (un)patched, and which functions the blocking tasks are sleeping in (/proc/<pid>/stack may help here).
Removal (rmmod) of patch modules is permanently disabled when the force feature is used. It cannot be guaranteed that no task is sleeping in such a module. It implies an unbounded reference count if a patch module is disabled and enabled in a loop.
Moreover, the usage of force may also affect future applications of live patches and cause even more harm to the system. The administrator should first consider simply canceling a transition (see above). If force is used, a reboot should be planned and no more live patches applied.
3.1 Adding consistency model support to new architectures
For adding consistency model support to new architectures, there are a few options:
- Add CONFIG_HAVE_RELIABLE_STACKTRACE. This means porting objtool, and for non-DWARF unwinders, also making sure there’s a way for the stack tracing code to detect interrupts on the stack.
- Alternatively, ensure that every kthread has a call to klp_update_patch_state() in a safe location. Kthreads are typically in an infinite loop which does some action repeatedly. The safe location to switch the kthread’s patch state would be at a designated point in the loop where there are no locks taken and all data structures are in a well-defined state.
  The location is clear when using workqueues or the kthread worker API. These kthreads process independent actions in a generic loop.
  It’s much more complicated with kthreads which have a custom loop. There the safe location must be carefully selected on a case-by-case basis.
  In that case, arches without HAVE_RELIABLE_STACKTRACE would still be able to use the non-stack-checking parts of the consistency model:
  - patching user tasks when they cross the kernel/user space boundary; and
  - patching kthreads and idle tasks at their designated patch points.
  This option isn’t as good as option 1 because it requires signaling user tasks and waking kthreads to patch them. But it could still be a good backup option for those architectures which don’t have reliable stack traces yet.
4. Livepatch module
Livepatches are distributed using kernel modules, see samples/livepatch/livepatch-sample.c.
The module includes a new implementation of functions that we want to replace. In addition, it defines some structures describing the relation between the original and the new implementation. Then there is code that makes the kernel start using the new code when the livepatch module is loaded. Also there is code that cleans up before the livepatch module is removed. All this is explained in more details in the next sections.
4.1. New functions
New versions of functions are typically just copied from the original sources. A good practice is to add a prefix to the names so that they can be distinguished from the original ones, e.g. in a backtrace. Also they can be declared as static because they are not called directly and do not need the global visibility.
The patch contains only functions that are really modified. But they might want to access functions or data from the original source file that may only be locally accessible. This can be solved by a special relocation section in the generated livepatch module, see Livepatch module Elf format for more details.
4.2. Metadata
The patch is described by several structures that split the information into three levels:
struct klp_func is defined for each patched function. It describes the relation between the original and the new implementation of a particular function.
The structure includes the name, as a string, of the original function. The function address is found via kallsyms at runtime.
Then it includes the address of the new function. It is defined directly by assigning the function pointer. Note that the new function is typically defined in the same source file.
As an optional parameter, the symbol position in the kallsyms database can be used to disambiguate functions of the same name. This is not the absolute position in the database, but rather the order it has been found only for a particular object ( vmlinux or a kernel module ). Note that kallsyms allows for searching symbols according to the object name.
struct klp_object defines an array of patched functions (struct klp_func) in the same object. Where the object is either vmlinux (NULL) or a module name.
The structure helps to group and handle functions for each object together. Note that patched modules might be loaded later than the patch itself and the relevant functions might be patched only when they are available.
struct klp_patch defines an array of patched objects (struct klp_object).
This structure handles all patched functions consistently and eventually, synchronously. The whole patch is applied only when all patched symbols are found. The only exception are symbols from objects (kernel modules) that have not been loaded yet.
For more details on how the patch is applied on a per-task basis, see the “Consistency model” section.
5. Livepatch life-cycle
Livepatching can be described by five basic operations: loading, enabling, replacing, disabling, removing.
Where the replacing and the disabling operations are mutually exclusive. They have the same result for the given patch but not for the system.
5.1. Loading
The only reasonable way is to enable the patch when the livepatch kernel module is being loaded. For this, klp_enable_patch() has to be called in the module_init() callback. There are two main reasons:
First, only the module has an easy access to the related struct klp_patch.
Second, the error code might be used to refuse loading the module when the patch cannot get enabled.
5.2. Enabling
The livepatch gets enabled by calling klp_enable_patch() from the module_init() callback. The system will start using the new implementation of the patched functions at this stage.
First, the addresses of the patched functions are found according to their names. The special relocations, mentioned in the section “New functions”, are applied. The relevant entries are created under /sys/kernel/livepatch/<name>. The patch is rejected when any above operation fails.
Second, livepatch enters into a transition state where tasks are converging to the patched state. If an original function is patched for the first time, a function-specific struct klp_ops is created and a universal ftrace handler is registered. This stage is indicated by a value of ‘1’ in /sys/kernel/livepatch/<name>/transition. For more information about this process, see the “Consistency model” section.
Finally, once all tasks have been patched, the ‘transition’ value changes to ‘0’.
Note that functions might be patched multiple times. The ftrace handler is registered only once for a given function. Further patches just add an entry to the list (see field func_stack) of the struct klp_ops. The right implementation is selected by the ftrace handler, see the “Consistency model” section.
That said, it is highly recommended to use cumulative livepatches because they help keep all changes consistent. In this case, functions might be patched twice only during the transition period.
5.3. Replacing
All enabled patches might get replaced by a cumulative patch that has the .replace flag set.
Once the new patch is enabled and the ‘transition’ finishes then all the functions (struct klp_func) associated with the replaced patches are removed from the corresponding struct klp_ops. Also the ftrace handler is unregistered and the struct klp_ops is freed when the related function is not modified by the new patch and func_stack list becomes empty.
See Atomic Replace & Cumulative Patches for more details.
5.4. Disabling
Enabled patches might get disabled by writing ‘0’ to /sys/kernel/livepatch/<name>/enabled.
First, livepatch enters into a transition state where tasks are converging to the unpatched state. The system starts using either the code from the previously enabled patch or even the original one. This stage is indicated by a value of ‘1’ in /sys/kernel/livepatch/<name>/transition. For more information about this process, see the “Consistency model” section.
Second, once all tasks have been unpatched, the ‘transition’ value changes to ‘0’. All the functions (struct klp_func) associated with the to-be-disabled patch are removed from the corresponding struct klp_ops. The ftrace handler is unregistered and the struct klp_ops is freed when the func_stack list becomes empty.
Third, the sysfs interface is destroyed.
5.5. Removing
Module removal is only safe when there are no users of functions provided by the module. This is the reason why the force feature permanently disables the removal. Only when the system is successfully transitioned to a new patch state (patched/unpatched) without being forced it is guaranteed that no task sleeps or runs in the old code.
6. Sysfs
Information about the registered patches can be found under /sys/kernel/livepatch. The patches could be enabled and disabled by writing there.
The /sys/kernel/livepatch/<patch>/force attribute allows an administrator to affect a patching operation.
See Documentation/ABI/testing/sysfs-kernel-livepatch for more details.
7. Limitations
The current Livepatch implementation has several limitations:
Only functions that can be traced could be patched.
Livepatch is based on the dynamic ftrace. In particular, functions implementing ftrace or the livepatch ftrace handler could not be patched. Otherwise, the code would end up in an infinite loop. A potential mistake is prevented by marking the problematic functions by “notrace”.
Livepatch works reliably only when the dynamic ftrace is located at the very beginning of the function.
The function needs to be redirected before the stack or the function parameters are modified in any way. For example, livepatch requires using the -fentry gcc compiler option on x86_64.
One exception is the PPC port. It uses relative addressing and TOC. Each function has to handle TOC and save LR before it could call the ftrace handler. This operation has to be reverted on return. Fortunately, the generic ftrace code has the same problem and all this is handled on the ftrace level.
Kretprobes using the ftrace framework conflict with the patched functions.
Both kretprobes and livepatches use a ftrace handler that modifies the return address. The first user wins. Either the probe or the patch is rejected when the handler is already in use by the other.
Kprobes in the original function are ignored when the code is redirected to the new implementation.
There is a work in progress to add warnings about this situation.