Exploring OOM: XNU Memory Status Management and How Jetsam Works

Posted 想名真难


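OOM is short for Out Of Memory. On iOS it refers to the operating system forcibly terminating the current app because its memory usage is too high. To the user it looks like the app suddenly exits, which is indistinguishable from an ordinary crash. During debugging, however, you will not find a normal crash log for it under Settings > Privacy > Analytics & Improvements. You will only find logs whose names start with Jetsam, which the system generates specifically to record this kind of memory-related termination. Jetsam is the mechanism the operating system uses to manage memory under pressure: a phone's memory is limited, so when it runs short, lower-priority processes are killed to free memory for higher-priority ones.

So how exactly is an OOM triggered? The material currently available is fairly vague and none of it gives a clear description, so after reading a number of articles I have tried to summarize the mechanism here.

Digging into the source

The XNU kernel is open source, and the full kernel code can be downloaded from opensource.apple.com. The memory status management code lives mainly in kern_memorystatus.c and kern_memorystatus.h.

Priority queue

First, the system assigns priorities to processes, and the whole system maintains a priority queue.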

#define MEMSTAT_BUCKET_COUNT (JETSAM_PRIORITY_MAX + 1)

typedef struct memstat_bucket {
    TAILQ_HEAD(, proc) list;
    int count;
} memstat_bucket_t;

memstat_bucket_t memstat_bucket[MEMSTAT_BUCKET_COUNT];

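kern_memorystatus.c defines a struct, memstat_bucket_t. It is simple: count is the number of processes at this priority, and list is a linked list holding those processes. (A linked list makes insertion and removal cheap.)

Right after the struct definition, the system declares an array of memstat_bucket_t that serves as the priority queue for processes. Each priority has one memstat_bucket_t, which holds all processes at that priority.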
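To make the bucket layout concrete, here is a minimal user-space sketch of the same idea: one list per priority band, and a scan from band 0 upward to find the first kill candidate. All sketch_ names are hypothetical and a hand-rolled singly linked list stands in for the kernel's TAILQ, so this illustrates the data structure rather than XNU's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PRIORITY_MAX 21
#define SKETCH_BUCKET_COUNT (SKETCH_PRIORITY_MAX + 1)

/* Hypothetical stand-in for the kernel's proc structure. */
struct sketch_proc {
    int pid;
    struct sketch_proc *next;   /* next process in the same priority band */
};

/* One bucket per priority band: a list of processes plus its length. */
struct sketch_bucket {
    struct sketch_proc *head;
    struct sketch_proc *tail;
    int count;
};

static struct sketch_bucket sketch_buckets[SKETCH_BUCKET_COUNT];

/* Append a process to the tail of the bucket for the given priority. */
static void sketch_bucket_insert(struct sketch_proc *p, int priority)
{
    struct sketch_bucket *b = &sketch_buckets[priority];

    p->next = NULL;
    if (b->tail != NULL) {
        b->tail->next = p;
    } else {
        b->head = p;
    }
    b->tail = p;
    b->count++;
}

/* Return the first process in the lowest non-empty band, or NULL if all are empty. */
static struct sketch_proc *sketch_first_victim(int *priority_out)
{
    for (int i = 0; i < SKETCH_BUCKET_COUNT; i++) {
        if (sketch_buckets[i].count > 0) {
            if (priority_out != NULL) {
                *priority_out = i;
            }
            return sketch_buckets[i].head;
        }
    }
    return NULL;
}
```

With a foreground process inserted at band 10 and a background process at band 3, sketch_first_victim returns the background one, which matches the kill order described in this article.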

kern_memorystatus.h defines the available priorities:

#define JETSAM_PRIORITY_REVISION                  2

#define JETSAM_PRIORITY_IDLE_HEAD                -2
/* The value -1 is an alias to JETSAM_PRIORITY_DEFAULT */
#define JETSAM_PRIORITY_IDLE                      0
#define JETSAM_PRIORITY_IDLE_DEFERRED         1 /* Keeping this around till all xnu_quick_tests can be moved away from it.*/
#define JETSAM_PRIORITY_AGING_BAND1       JETSAM_PRIORITY_IDLE_DEFERRED
#define JETSAM_PRIORITY_BACKGROUND_OPPORTUNISTIC  2
#define JETSAM_PRIORITY_AGING_BAND2       JETSAM_PRIORITY_BACKGROUND_OPPORTUNISTIC
#define JETSAM_PRIORITY_BACKGROUND                3
#define JETSAM_PRIORITY_ELEVATED_INACTIVE     JETSAM_PRIORITY_BACKGROUND
#define JETSAM_PRIORITY_MAIL                      4
#define JETSAM_PRIORITY_PHONE                     5
#define JETSAM_PRIORITY_UI_SUPPORT                8
#define JETSAM_PRIORITY_FOREGROUND_SUPPORT        9
#define JETSAM_PRIORITY_FOREGROUND               10
#define JETSAM_PRIORITY_AUDIO_AND_ACCESSORY      12
#define JETSAM_PRIORITY_CONDUCTOR                13
#define JETSAM_PRIORITY_HOME                     16
#define JETSAM_PRIORITY_EXECUTIVE                17
#define JETSAM_PRIORITY_IMPORTANT                18
#define JETSAM_PRIORITY_CRITICAL                 19

#define JETSAM_PRIORITY_MAX                      21

/* TODO - tune. This should probably be lower priority */
#define JETSAM_PRIORITY_DEFAULT                  18
#define JETSAM_PRIORITY_TELEPHONY                19

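As you can see, foreground is 10 and background is 3. When memory is tight, background processes are killed first. Normally a foreground process is killed only if memory is still tight after every process in the bands below foreground has been killed.

The priority rule is: kernel processes > operating system processes > apps. A foreground app has a higher priority than an app running in the background. Among processes of equal priority, the one using more CPU has its priority lowered.

The threads of a user-space application can never outrank the operating system and the kernel. On iOS the application with the highest priority is SpringBoard. Thread priority is also not fixed: Mach adjusts it dynamically according to the thread's utilization and the overall system load. A thread that consumes too much CPU has its priority lowered, and a thread that is starved has its priority raised. Whatever the adjustment, a thread cannot move outside the priority range it belongs to.

OOM types

The cause enum currently lists 11 values (including the invalid placeholder):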

/* Cause */
enum {
    kMemorystatusInvalid            = JETSAM_REASON_INVALID,
    kMemorystatusKilled         = JETSAM_REASON_GENERIC,
    kMemorystatusKilledHiwat        = JETSAM_REASON_MEMORY_HIGHWATER, // high water
    kMemorystatusKilledVnodes       = JETSAM_REASON_VNODE, // vnode
    kMemorystatusKilledVMPageShortage   = JETSAM_REASON_MEMORY_VMPAGESHORTAGE, // vm page shortage
    kMemorystatusKilledVMThrashing      = JETSAM_REASON_MEMORY_VMTHRASHING, // vm thrashing
    kMemorystatusKilledFCThrashing      = JETSAM_REASON_MEMORY_FCTHRASHING, // fc thrashing
    kMemorystatusKilledPerProcessLimit  = JETSAM_REASON_MEMORY_PERPROCESSLIMIT, // per process limit
    kMemorystatusKilledDiagnostic       = JETSAM_REASON_MEMORY_DIAGNOSTIC, // diagnostic
    kMemorystatusKilledIdleExit     = JETSAM_REASON_MEMORY_IDLE_EXIT, // idle exit
    kMemorystatusKilledZoneMapExhaustion    = JETSAM_REASON_ZONE_MAP_EXHAUSTION // zone map exhaustion
};

For each type there is a corresponding string that is written to the log:

/* For logging clarity */
static const char *memorystatus_kill_cause_name[] = {
    ""                      ,
    "jettisoned"        ,       /* kMemorystatusKilled          */
    "highwater"             ,       /* kMemorystatusKilledHiwat     */ 
    "vnode-limit"           ,       /* kMemorystatusKilledVnodes        */
    "vm-pageshortage"       ,       /* kMemorystatusKilledVMPageShortage    */
    "vm-thrashing"          ,       /* kMemorystatusKilledVMThrashing   */
    "fc-thrashing"          ,       /* kMemorystatusKilledFCThrashing   */
    "per-process-limit"     ,       /* kMemorystatusKilledPerProcessLimit   */
    "diagnostic"            ,       /* kMemorystatusKilledDiagnostic    */
    "idle-exit"             ,       /* kMemorystatusKilledIdleExit      */
    "zone-map-exhaustion"   ,       /* kMemorystatusKilledZoneMapExhaustion */
};
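The name table is indexed by the cause value, so a lookup in either direction is a small helper. The sketch below assumes the causes are numbered 0 through 10 in the order of the enum above (an assumption, since the JETSAM_REASON_* values are not shown in this article), and the sketch_ helpers are hypothetical, not kernel functions. The reverse lookup is the kind of thing a tool parsing the reason field of a Jetsam log might use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same strings as memorystatus_kill_cause_name, indexed by cause value
 * (assumed to run 0..10 in enum order). */
static const char *const sketch_cause_names[] = {
    "",
    "jettisoned",
    "highwater",
    "vnode-limit",
    "vm-pageshortage",
    "vm-thrashing",
    "fc-thrashing",
    "per-process-limit",
    "diagnostic",
    "idle-exit",
    "zone-map-exhaustion",
};

#define SKETCH_CAUSE_COUNT (sizeof(sketch_cause_names) / sizeof(sketch_cause_names[0]))

/* Map a cause value to its log string, guarding against out-of-range values. */
static const char *sketch_cause_name(unsigned cause)
{
    return cause < SKETCH_CAUSE_COUNT ? sketch_cause_names[cause] : "unknown";
}

/* Map a log string back to its cause value, or -1 if it is not a known reason. */
static int sketch_cause_from_name(const char *name)
{
    for (size_t i = 1; i < SKETCH_CAUSE_COUNT; i++) {
        if (strcmp(sketch_cause_names[i], name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```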

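When our app triggers an OOM, the system writes a log on the phone under Settings > Privacy > Analytics > Analytics Data, in a file named JetsamEvent-xxx. Open the file and the reason field shows the OOM type.

Here is a JetsamEvent file from my phone: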

...
  "largestProcess" : "Boom",
  "genCounter" : 23,
  "processes" : [
  {
    "uuid" : "ebd916c8-96e7-3b8f-985d-027098a13fd6",
    "states" : [
      "daemon",
      "idle"
    ],
    "killDelta" : 1887,
    "genCount" : 0,
    "age" : 200706725,
    "purgeable" : 0,
    "fds" : 50,
    "coalition" : 268,
    "rpages" : 34,
    "reason" : "vm-pageshortage",
    "pid" : 2205,
    "cpuTime" : 0.0030500000000000002,
    "name" : "xpcproxy",
    "lifetimeMax" : 79
  },

...

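Here we can see that the process using the most memory is Boom, and that the process entry shown was killed with reason vm-pageshortage.

The main fields are described below. The Heimdallr-Example values come from a different sample log, not from the excerpt above.

  • pageSize: the size of a physical memory page on the current device. The device here is an iPhone XS Max, where it is 16 KB. On devices older than the Apple A7 chip the physical page size is 4 KB.

  • states: the running state of the app. Heimdallr-Example was running in the foreground, and this kind of crash is called a FOOM (Foreground Out Of Memory). The counterpart, an OOM crash that happens while the app is in the background, is called a BOOM (Background Out Of Memory).

  • rpages: short for resident pages, the number of memory pages the process currently occupies. Heimdallr-Example occupied 92800 pages. From pageSize and rpages we can compute the memory the app was using when it was killed: 16384 * 92800 / 1024 / 1024 = 1450 MB, about 1.4 GB.

  • reason: why the process was terminated. Heimdallr-Example was terminated because it exceeded the operating system's limit on physical memory for a single process.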
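The rpages arithmetic above is easy to wrap in a helper when reading Jetsam logs. This is a small sketch with hypothetical function names, and it assumes both values are taken straight from the log's rpages and pageSize fields.

```c
#include <assert.h>
#include <stdint.h>

/* Resident footprint in bytes: number of resident pages times the page size. */
static uint64_t sketch_footprint_bytes(uint64_t rpages, uint64_t page_size)
{
    return rpages * page_size;
}

/* The same footprint in whole mebibytes, rounded down. */
static uint64_t sketch_footprint_mib(uint64_t rpages, uint64_t page_size)
{
    return sketch_footprint_bytes(rpages, page_size) / (1024ULL * 1024ULL);
}
```

For the figures quoted above, sketch_footprint_mib(92800, 16384) gives 1450, which is the roughly 1.4 GB mentioned in the text.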

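Jetsam's reclamation policy can be summarized in two points:

  1. A single app's physical memory usage exceeds its limit.

  2. The device as a whole is under physical memory pressure, in which case processes are reclaimed in the following order (the left side is killed before the right side):

    1. User apps > system apps

    2. Apps using more memory > apps using less memory

    3. Background apps > foreground apps

Stack Overflow has a set of data that collects the OOM threshold for a single app on various devices: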

| device | crash amount (MB) | total amount (MB) | percentage of total |
| iPad1 | 127 | 256 | 49% |
| iPad2 | 275 | 512 | 53% |
| iPad3 | 645 | 1024 | 62% |
| iPad4 (iOS 8.1) | 585 | 1024 | 57% |
| iPad Mini 1st Generation | 297 | 512 | 58% |
| iPad Mini retina (iOS 7.1) | 696 | 1024 | 68% |
| iPad Air | 697 | 1024 | 68% |
| iPad Air 2 (iOS 10.2.1) | 1383 | 2048 | 68% |
| iPad Pro 9.7" (iOS 10.0.2 (14A456)) | 1395 | 1971 | 71% |
| iPad Pro 10.5" (iOS 11 beta4) | 3057 | 4000 | 76% |
| iPad Pro 12.9" (2015) (iOS 11.2.1) | 3058 | 3999 | 76% |
| iPad 10.2 (iOS 13.2.3) | 1844 | 2998 | 62% |
| iPod touch 4th gen (iOS 6.1.1) | 130 | 256 | 51% |
| iPod touch 5th gen | 286 | 512 | 56% |
| iPhone4 | 325 | 512 | 63% |
| iPhone4s | 286 | 512 | 56% |
| iPhone5 | 645 | 1024 | 62% |
| iPhone5s | 646 | 1024 | 63% |
| iPhone6 (iOS 8.x) | 645 | 1024 | 62% |
| iPhone6 Plus (iOS 8.x) | 645 | 1024 | 62% |
| iPhone6s (iOS 9.2) | 1396 | 2048 | 68% |
| iPhone6s Plus (iOS 10.2.1) | 1396 | 2048 | 68% |
| iPhoneSE (iOS 9.3) | 1395 | 2048 | 68% |
| iPhone7 (iOS 10.2) | 1395 | 2048 | 68% |
| iPhone7 Plus (iOS 10.2.1) | 2040 | 3072 | 66% |
| iPhone8 (iOS 12.1) | 1364 | 1990 | 70% |
| iPhoneX (iOS 11.2.1) | 1392 | 2785 | 50% |
| iPhoneXS (iOS 12.1) | 2040 | 3754 | 54% |
| iPhoneXS Max (iOS 12.1) | 2039 | 3735 | 55% |
| iPhoneXR (iOS 12.1) | 1792 | 2813 | 63% |
| iPhone11 (iOS 13.1.3) | 2068 | 3844 | 54% |
| iPhone11 Pro Max (iOS 13.2.3) | 2067 | 3740 | 55% |

How an OOM is triggered

There are normally two ways an OOM is triggered: synchronously and asynchronously. For example, here is the trigger for a VMPageShortage OOM:

boolean_t memorystatus_kill_on_VM_page_shortage(boolean_t async) {
    if (async) {
        return memorystatus_kill_process_async(-1, kMemorystatusKilledVMPageShortage);
    } else {
        os_reason_t jetsam_reason = os_reason_create(OS_REASON_JETSAM, JETSAM_REASON_MEMORY_VMPAGESHORTAGE);
        if (jetsam_reason == OS_REASON_NULL) {
            printf("memorystatus_kill_on_VM_page_shortage -- sync: failed to allocate jetsam reason\n");
        }

        return memorystatus_kill_process_sync(-1, kMemorystatusKilledVMPageShortage, jetsam_reason);
    }
}

The synchronous trigger is blunt: it kills the process identified by pid directly. If pid is -1, it kills the lowest-priority process in the priority queue. (If several processes share the same priority, the system sorts them by memory usage and kills the one using the most memory.)

static boolean_t 
memorystatus_kill_process_sync(pid_t victim_pid, uint32_t cause, os_reason_t jetsam_reason) {
    boolean_t res;

    uint32_t errors = 0;

    if (victim_pid == -1) {
        /* No pid, so kill first process */
        res = memorystatus_kill_top_process(TRUE, TRUE, cause, jetsam_reason, NULL, &errors);
    } else {
        res = memorystatus_kill_specific_process(victim_pid, cause, jetsam_reason);
    }
    
    if (errors) {
        memorystatus_clear_errors();
    }

    return res;
}

The asynchronous trigger, by contrast, sets a flag indicating that memory is already in trouble and then wakes the dedicated memory management thread, which handles the memory state, triggers the OOM, kills some processes, and reclaims memory.

static boolean_t 
memorystatus_kill_process_async(pid_t victim_pid, uint32_t cause) {
    /*
     * TODO: allow a general async path
     *
     * NOTE: If a new async kill cause is added, make sure to update memorystatus_thread() to
     * add the appropriate exit reason code mapping.
     */
    if ((victim_pid != -1) || (cause != kMemorystatusKilledVMPageShortage && cause != kMemorystatusKilledVMThrashing &&
                   cause != kMemorystatusKilledFCThrashing && cause != kMemorystatusKilledZoneMapExhaustion)) {
        return FALSE;
    }
    
    kill_under_pressure_cause = cause;
    memorystatus_thread_wake();
    return TRUE;
}

Memory status management thread

The system has a dedicated thread for managing memory status. When the memory state goes wrong or memory pressure becomes too high, it follows a set policy to kill some apps and reclaim memory.

With some unrelated code removed, the memory status management thread looks like this:

static void
memorystatus_thread(void *param __unused, wait_result_t wr __unused)
{
    static boolean_t is_vm_privileged = FALSE;

    boolean_t post_snapshot = FALSE;
    uint32_t errors = 0;
    uint32_t hwm_kill = 0;
    boolean_t sort_flag = TRUE;
    boolean_t corpse_list_purged = FALSE;
    int jld_idle_kills = 0;

    if (is_vm_privileged == FALSE) {
        /* One-time initialization */
        thread_wire(host_priv_self(), current_thread(), TRUE);
        is_vm_privileged = TRUE;
        
        if (vm_restricted_to_single_processor == TRUE)
            thread_vm_bind_group_add();
        thread_set_thread_name(current_thread(), "VM_memorystatus");
        memorystatus_thread_block(0, memorystatus_thread);
    }
    
    // The actual memory management loop
    while (memorystatus_action_needed()) {
        boolean_t killed;
        int32_t priority;
        uint32_t cause;
        uint64_t jetsam_reason_code = JETSAM_REASON_INVALID;
        os_reason_t jetsam_reason = OS_REASON_NULL;

        cause = kill_under_pressure_cause;
        switch (cause) {
            case kMemorystatusKilledFCThrashing:
                jetsam_reason_code = JETSAM_REASON_MEMORY_FCTHRASHING;
                break;
            case kMemorystatusKilledVMThrashing:
                jetsam_reason_code = JETSAM_REASON_MEMORY_VMTHRASHING;
                break;
            case kMemorystatusKilledZoneMapExhaustion:
                jetsam_reason_code = JETSAM_REASON_ZONE_MAP_EXHAUSTION;
                break;
            case kMemorystatusKilledVMPageShortage:
                /* falls through */
            default:
                jetsam_reason_code = JETSAM_REASON_MEMORY_VMPAGESHORTAGE;
                cause = kMemorystatusKilledVMPageShortage;
                break;
        }

        /* Trigger for HIGHWATER-type OOMs */
        boolean_t is_critical = TRUE;
        if (memorystatus_act_on_hiwat_processes(&errors, &hwm_kill, &post_snapshot, &is_critical)) {
            if (is_critical == FALSE) {
                /*
                 * For now, don't kill any other processes.
                 */
                break;
            } else {
                goto done;
            }
        }

        jetsam_reason = os_reason_create(OS_REASON_JETSAM, jetsam_reason_code);
        if (jetsam_reason == OS_REASON_NULL) {
            printf("memorystatus_thread: failed to allocate jetsam reason\n");
        }
        
        // The core OOM trigger
        if (memorystatus_act_aggressive(cause, jetsam_reason, &jld_idle_kills, &corpse_list_purged, &post_snapshot)) {
            goto done;
        }

        os_reason_ref(jetsam_reason);

        /* LRU: kill the lowest-priority process */
        killed = memorystatus_kill_top_process(TRUE, sort_flag, cause, jetsam_reason, &priority, &errors);
        sort_flag = FALSE;

        if (killed) {
            /* Jetsam Loop Detection */
            if (memorystatus_jld_enabled == TRUE) {
                if ((priority == JETSAM_PRIORITY_IDLE) || (priority == system_procs_aging_band) || (priority == applications_aging_band)) {
                    jld_idle_kills++;
                }
            }

            if ((priority >= JETSAM_PRIORITY_UI_SUPPORT) && (total_corpses_count() > 0) && (corpse_list_purged == FALSE)) {
                task_purge_all_corpses();
                corpse_list_purged = TRUE;
            }
            goto done;
        }
        
        if (memorystatus_avail_pages_below_critical()) {
            /*
             * Still under pressure and unable to kill a process - purge corpse memory
             */
            if (total_corpses_count() > 0) {
                task_purge_all_corpses();
                corpse_list_purged = TRUE;
            }

            if (memorystatus_avail_pages_below_critical()) {
                /*
                 * Still under pressure and unable to kill a process - panic
                 */
                panic("memorystatus_jetsam_thread: no victim! available pages:%llu\n", (uint64_t)memorystatus_available_pages);
            }
        }
            
done:       

        /*
         * We do not want to over-kill when thrashing has been detected.
         * To avoid that, we reset the flag here and notify the
         * compressor.
         */
        if (is_reason_thrashing(kill_under_pressure_cause)) {
            kill_under_pressure_cause = 0;
#if CONFIG_JETSAM
            vm_thrashing_jetsam_done();
#endif /* CONFIG_JETSAM */
        } else if (is_reason_zone_map_exhaustion(kill_under_pressure_cause)) {
            kill_under_pressure_cause = 0;
        }

        os_reason_free(jetsam_reason);
    }

    kill_under_pressure_cause = 0;
    
    if (errors) {
        memorystatus_clear_errors();
    }
}

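That is a lot of code, so let us go through it piece by piece.

Entry condition

The real core of the code is inside the while (memorystatus_action_needed()) loop, and memorystatus_action_needed is the key condition for triggering an OOM.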

/* Does cause indicate vm or fc thrashing? */
static boolean_t
is_reason_thrashing(unsigned cause)
{
    switch (cause) {
    case kMemorystatusKilledVMThrashing:
    case kMemorystatusKilledFCThrashing:
        return TRUE;
    default:
        return FALSE;
    }
}

/* Is the zone map almost full? */
static boolean_t
is_reason_zone_map_exhaustion(unsigned cause)
{
    if (cause == kMemorystatusKilledZoneMapExhaustion)
        return TRUE;
    return FALSE;
}

static boolean_t
memorystatus_action_needed(void)
{
    return (is_reason_thrashing(kill_under_pressure_cause) ||
            is_reason_zone_map_exhaustion(kill_under_pressure_cause) ||
           memorystatus_available_pages <= memorystatus_available_pages_pressure);
}

The loop is entered and an OOM is triggered when kill_under_pressure_cause is kMemorystatusKilledVMThrashing, kMemorystatusKilledFCThrashing, or kMemorystatusKilledZoneMapExhaustion, or when the available memory memorystatus_available_pages is at or below the threshold memorystatus_available_pages_pressure.
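The interplay between the asynchronous request and this entry condition can be sketched in user space as follows. The enum values, the struct, and the sketch_ functions are all hypothetical stand-ins; only the shape of the two checks follows the kernel excerpts quoted in this article, and the thread wake-up is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cause codes; only the names mirror the kernel's kill causes. */
enum sketch_cause {
    SKETCH_CAUSE_NONE = 0,
    SKETCH_CAUSE_VM_PAGE_SHORTAGE,
    SKETCH_CAUSE_VM_THRASHING,
    SKETCH_CAUSE_FC_THRASHING,
    SKETCH_CAUSE_ZONE_MAP_EXHAUSTION,
    SKETCH_CAUSE_PER_PROCESS_LIMIT
};

/* Hypothetical snapshot of the state the memory status thread looks at. */
struct sketch_state {
    enum sketch_cause pending_cause;      /* set by the async request path */
    uint32_t available_pages;
    uint32_t pressure_threshold_pages;
};

/* Async request: accepted only for "no specific pid" and a whitelisted cause. */
static bool sketch_request_async_kill(struct sketch_state *s, int victim_pid,
                                      enum sketch_cause cause)
{
    bool cause_ok = cause == SKETCH_CAUSE_VM_PAGE_SHORTAGE ||
                    cause == SKETCH_CAUSE_VM_THRASHING ||
                    cause == SKETCH_CAUSE_FC_THRASHING ||
                    cause == SKETCH_CAUSE_ZONE_MAP_EXHAUSTION;

    if (victim_pid != -1 || !cause_ok) {
        return false;
    }
    s->pending_cause = cause;
    /* The kernel would wake the memory status thread at this point. */
    return true;
}

/* Loop entry test: a thrashing or zone-map cause is pending, or pages are at or below the threshold. */
static bool sketch_action_needed(const struct sketch_state *s)
{
    return s->pending_cause == SKETCH_CAUSE_VM_THRASHING ||
           s->pending_cause == SKETCH_CAUSE_FC_THRASHING ||
           s->pending_cause == SKETCH_CAUSE_ZONE_MAP_EXHAUSTION ||
           s->available_pages <= s->pressure_threshold_pages;
}
```

One detail this makes visible: a pending page-shortage cause does not by itself satisfy the entry test. In that case the loop runs only if the available page count is actually at or below the threshold.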

high-water

After entering the loop, execution first reaches memorystatus_act_on_hiwat_processes:

/* Trigger for HIGHWATER-type OOMs */
boolean_t is_critical = TRUE;
if (memorystatus_act_on_hiwat_processes(&errors, &hwm_kill, &post_snapshot, &is_critical)) {
    if (is_critical == FALSE) {
        /*
         * For now, don't kill any other processes.
         */
        break;
    } else {
        goto done;
    }
}

This is the key function for triggering a HIGHWATER-type OOM:

static boolean_t
memorystatus_act_on_hiwat_processes(uint32_t *errors, uint32_t *hwm_kill, boolean_t *post_snapshot, __unused boolean_t *is_critical)
{
    boolean_t killed = memorystatus_kill_hiwat_proc(errors);

    if (killed) {
        *hwm_kill = *hwm_kill + 1;
        *post_snapshot = TRUE;
        return TRUE;
    } else {
        memorystatus_hwm_candidates = FALSE;
    }
    return FALSE;
}

memorystatus_act_on_hiwat_processes calls memorystatus_kill_hiwat_proc directly, and the core logic is all in memorystatus_kill_hiwat_proc.

static boolean_t
memorystatus_kill_hiwat_proc(uint32_t *errors)
{
    pid_t aPid = 0;
    proc_t p = PROC_NULL, next_p = PROC_NULL;
    boolean_t new_snapshot = FALSE, killed = FALSE;
    int kill_count = 0;
    unsigned int i = 0;
    uint32_t aPid_ep;
    uint64_t killtime = 0;
    clock_sec_t     tv_sec;
    clock_usec_t    tv_usec;
    uint32_t        tv_msec;
    os_reason_t jetsam_reason = OS_REASON_NULL;
    
    jetsam_reason = os_reason_create(OS_REASON_JETSAM, JETSAM_REASON_MEMORY_HIGHWATER);
    proc_list_lock();
    
    next_p = memorystatus_get_first_proc_locked(&i, TRUE);
    while (next_p) {
        uint64_t footprint_in_bytes = 0;
        uint64_t memlimit_in_bytes  = 0;
        boolean_t skip = 0;

        p = next_p;
        next_p = memorystatus_get_next_proc_locked(&i, p, TRUE);
        
        aPid = p->p_pid;
        aPid_ep = p->p_memstat_effectivepriority;
        
        if (p->p_memstat_state  & (P_MEMSTAT_ERROR | P_MEMSTAT_TERMINATED)) {
            continue;
        }
        
        /* skip if no limit set */
        if (p->p_memstat_memlimit <= 0) {
            continue;
        }

        footprint_in_bytes = get_task_phys_footprint(p->task);
        memlimit_in_bytes  = (((uint64_t)p->p_memstat_memlimit) * 1024ULL * 1024ULL);   /* convert MB to bytes */
        skip = (footprint_in_bytes <= memlimit_in_bytes);

#if CONFIG_FREEZE
        if (!skip) {
            if (p->p_memstat_state & P_MEMSTAT_LOCKED) {
                skip = TRUE;
            } else {
                skip = FALSE;
            }
        }
#endif

        if (skip) {
            continue;
        } else {
            if (memorystatus_jetsam_snapshot_count == 0) {
                /* snapshot initialization elided in this excerpt */
            }
            p->p_memstat_state |= P_MEMSTAT_TERMINATED;

            killtime = mach_absolute_time();
            absolutetime_to_microtime(killtime, &tv_sec, &tv_usec);
            tv_msec = tv_usec / 1000;
                
            {
                memorystatus_update_jetsam_snapshot_entry_locked(p, kMemorystatusKilledHiwat, killtime);
                    
                if (proc_ref_locked(p) == p) {
                    proc_list_unlock();

                    /*
                     * memorystatus_do_kill drops a reference, so take another one so we can
                     * continue to use this exit reason even after memorystatus_do_kill()
                     * returns
                     */
                    os_reason_ref(jetsam_reason);

                    killed = memorystatus_do_kill(p, kMemorystatusKilledHiwat, jetsam_reason);

                    /* Success? */
                    if (killed) {
                        proc_rele(p);
                        kill_count++;
                        goto exit;
                    }

                    proc_list_lock();
                    proc_rele_locked(p);
                    p->p_memstat_state &= ~P_MEMSTAT_TERMINATED;
                    p->p_memstat_state |= P_MEMSTAT_ERROR;
                    *errors += 1;
                }

                i = 0;
                next_p = memorystatus_get_first_proc_locked(&i, TRUE);
            }
        }
    }
    
    proc_list_unlock();
    
exit:
    os_reason_free(jetsam_reason);
    return killed;
}

It first calls memorystatus_get_first_proc_locked(&i, TRUE) to fetch the lowest-priority process from the priority queue. If that process's footprint is within its limit (footprint_in_bytes <= memlimit_in_bytes), it is skipped and the next process is fetched with memorystatus_get_next_proc_locked. If the footprint exceeds the limit, the process is killed via memorystatus_do_kill and the loop ends.
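Stripped of locking, snapshots, and reference counting, the selection rule described above reduces to a linear scan. The sketch below is a user-space approximation with a hypothetical sketch_proc struct. It assumes the caller passes processes ordered from lowest to highest priority, which is the order the kernel walks its buckets in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened view of the fields the high-water scan reads. */
struct sketch_proc {
    int pid;
    int32_t memlimit_mb;        /* <= 0 means no limit is set */
    uint64_t footprint_bytes;   /* physical footprint of the task */
    bool terminated;            /* already being killed, or failed earlier */
};

/*
 * Return the index of the first process that is over its limit, or -1.
 * procs must be ordered from lowest to highest priority.
 */
static int sketch_find_hiwat_victim(const struct sketch_proc *procs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const struct sketch_proc *p = &procs[i];

        if (p->terminated) {
            continue;
        }
        if (p->memlimit_mb <= 0) {
            continue;   /* skip if no limit set */
        }

        /* convert MB to bytes before comparing, as the kernel excerpt does */
        uint64_t limit_bytes = (uint64_t)p->memlimit_mb * 1024ULL * 1024ULL;
        if (p->footprint_bytes <= limit_bytes) {
            continue;   /* at or under the limit: not a candidate */
        }
        return (int)i;
    }
    return -1;
}
```

Note that a process sitting exactly at its limit is skipped, because the comparison is <=, so only a footprint strictly above the limit makes it a victim.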

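We can see that memory is measured here mainly by phys_footprint. That said, among the OOMs I have observed on my own phone I have never seen a high-water one. My guess is that the high-water threshold is fairly high and hard to hit. You can check the OOM types on your own phone too, and let me know if you find a high-water one.

normal kill

If no process is over its high-water mark, execution continues into memorystatus_act_aggressive. This is the usual OOM trigger, and most OOMs are triggered here.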

static boolean_t
memorystatus_act_aggressive(uint32_t cause, os_reason_t jetsam_reason, int *jld_idle_kills, boolean_t *corpse_list_purged, boolean_t *post_snapshot)
{
    if (memorystatus_jld_enabled == TRUE) {

        boolean_t killed;
        uint32_t errors = 0;

        /* Jetsam Loop Detection - locals */
        memstat_bucket_t *bucket;
        int     jld_bucket_count = 0;
        struct timeval  jld_now_tstamp = {0,0};
        uint64_t    jld_now_msecs = 0;
        int     elevated_bucket_count = 0;

        /* Jetsam Loop Detection - statics */
        static uint64_t  jld_timestamp_msecs = 0;
        static int   jld_idle_kill_candidates = 0;  /* Number of available processes in band 0,1 at start */
        static int   jld_eval_aggressive_count = 0;     /* Bumps the max priority in aggressive loop */
        static int32_t   jld_priority_band_max = JETSAM_PRIORITY_UI_SUPPORT;

        microuptime(&jld_now_tstamp);

        jld_now_msecs = (jld_now_tstamp.tv_sec * 1000);

        proc_list_lock();
        switch (jetsam_aging_policy) {
        case kJetsamAgingPolicyLegacy:
            bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
            jld_bucket_count = bucket->count;
            bucket = &memstat_bucket[JETSAM_PRIORITY_AGING_BAND1];
            jld_bucket_count += bucket->count;
            break;
        case kJetsamAgingPolicySysProcsReclaimedFirst:
        case kJetsamAgingPolicyAppsReclaimedFirst:
            bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
            jld_bucket_count = bucket->count;
            bucket = &memstat_bucket[system_procs_aging_band];
            jld_bucket_count += bucket->count;
            bucket = &memstat_bucket[applications_aging_band];
            jld_bucket_count += bucket->count;
            break;
        case kJetsamAgingPolicyNone:
        default:
            bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
            jld_bucket_count = bucket->count;
            break;
        }

        bucket = &memstat_bucket[JETSAM_PRIORITY_ELEVATED_INACTIVE];
        elevated_bucket_count = bucket->count;

        proc_list_unlock();

        if ( (jld_bucket_count == 0) || 
             (jld_now_msecs > (jld_timestamp_msecs + memorystatus_jld_eval_period_msecs))) {
            jld_timestamp_msecs  = jld_now_msecs;
            // Reclaim the very low priority processes first: JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band. Once they are reclaimed, jld_bucket_count will be 0
            jld_idle_kill_candidates = jld_bucket_count;
            *jld_idle_kills      = 0;
            jld_eval_aggressive_count = 0;
            jld_priority_band_max   = JETSAM_PRIORITY_UI_SUPPORT;
        }

        // Normally the processes that can be reclaimed at any time go first: JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band. Only after they are reclaimed is this branch entered
        if (*jld_idle_kills > jld_idle_kill_candidates) {
            jld_eval_aggressive_count++;

            if ((jld_eval_aggressive_count == memorystatus_jld_eval_aggressive_count) &&
                (total_corpses_count() > 0) && (*corpse_list_purged == FALSE)) {
                task_purge_all_corpses();
                *corpse_list_purged = TRUE;
            }
            else if (jld_eval_aggressive_count > memorystatus_jld_eval_aggressive_count) {
                if ((memorystatus_jld_eval_aggressive_priority_band_max < 0) ||
                    (memorystatus_jld_eval_aggressive_priority_band_max >= MEMSTAT_BUCKET_COUNT)) {
                    /* out of range: keep the default band */
                } else {
                    jld_priority_band_max = memorystatus_jld_eval_aggressive_priority_band_max;
                }
            }

            // Kill background processes first
            /* Visit elevated processes first */
            while (elevated_bucket_count) {

                elevated_bucket_count--;

                os_reason_ref(jetsam_reason);
                killed = memorystatus_kill_elevated_process(
                    cause,
                    jetsam_reason,
                    jld_eval_aggressive_count,
                    &errors);

                if (killed) {
                    *post_snapshot = TRUE;
                    // If there is still pressure, keep killing apps
                    if (memorystatus_avail_pages_below_pressure()) {
                        /*
                         * Still under pressure.
                         * Find another pinned processes.
                         */
                        continue;
                    } else {
                        return TRUE;
                    }
                } else {
                    break;
                }
            }

            // Then kill the lowest-priority process left, which may be the foreground app
            killed = memorystatus_kill_top_process_aggressive(
                kMemorystatusKilledVMThrashing,
                jld_eval_aggressive_count, 
                jld_priority_band_max, 
                &errors);
                
            if (killed) {
                /* Always generate logs after aggressive kill */
                *post_snapshot = TRUE;
                *jld_idle_kills = 0;
                return TRUE;
            } 
        }

        return FALSE;
    }

    return FALSE;
}

There is a lot of logic here, so let us take it step by step.

First there is jld_bucket_count, which holds the number of low-priority processes that can be killed outright.

switch (jetsam_aging_policy) {
case kJetsamAgingPolicyLegacy:
    bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
    jld_bucket_count = bucket->count;
    bucket = &memstat_bucket[JETSAM_PRIORITY_AGING_BAND1];
    jld_bucket_count += bucket->count;
    break;
case kJetsamAgingPolicySysProcsReclaimedFirst:
case kJetsamAgingPolicyAppsReclaimedFirst:
    bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
    jld_bucket_count = bucket->count;
    bucket = &memstat_bucket[system_procs_aging_band];
    jld_bucket_count += bucket->count;
    bucket = &memstat_bucket[applications_aging_band];
    jld_bucket_count += bucket->count;
    break;
case kJetsamAgingPolicyNone:
default:
    bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
    jld_bucket_count = bucket->count;
    break;
}

if ( (jld_bucket_count == 0) || 
     (jld_now_msecs > (jld_timestamp_msecs + memorystatus_jld_eval_period_msecs))) {
    jld_timestamp_msecs  = jld_now_msecs;
    // Reclaim the very low priority processes first: JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band. Once they are reclaimed, jld_bucket_count will be 0
    jld_idle_kill_candidates = jld_bucket_count;
    *jld_idle_kills      = 0;
    jld_eval_aggressive_count = 0;
    jld_priority_band_max   = JETSAM_PRIORITY_UI_SUPPORT;
}

// Normally the processes that can be reclaimed at any time go first: JETSAM_PRIORITY_IDLE, system_procs_aging_band, applications_aging_band. Only after they are reclaimed is this branch entered
if (*jld_idle_kills > jld_idle_kill_candidates) {
    // This is where our apps usually trigger an OOM
}
    
killed = memorystatus_kill_top_process(TRUE, sort_flag, cause, jetsam_reason, &priority, &errors);
if (killed) {
    jld_idle_kills++;
}

jetsam_aging_policy determines which priority bands hold processes that are to be killed outright. Normally execution takes the kJetsamAgingPolicyAppsReclaimedFirst or kJetsamAgingPolicySysProcsReclaimedFirst branch, so jld_bucket_count is the sum of the process counts in JETSAM_PRIORITY_IDLE, system_procs_aging_band, and applications_aging_band.

*jld_idle_kills is the number of low-priority processes already killed, and it is incremented (jld_idle_kills++) each time one is killed. jld_idle_kill_candidates is set to jld_bucket_count. The branch guarded by if (*jld_idle_kills > jld_idle_kill_candidates) is therefore entered only after all of the low-priority processes counted in jld_bucket_count have been killed.

So when memory runs short, the system first reclaims processes in the JETSAM_PRIORITY_IDLE, system_procs_aging_band, and applications_aging_band bands.
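The counting gate described above can be isolated into a few lines. This is a sketch with a hypothetical struct and helper names. It models only the comparison between kills and candidates, and leaves out the evaluation-period timestamp logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one Jetsam Loop Detection window. */
struct sketch_jld {
    int idle_kills;             /* idle-band processes killed in this window */
    int idle_kill_candidates;   /* idle-band processes counted at window start */
};

/* Start a new window: remember how many idle-band processes existed. */
static void sketch_jld_begin_window(struct sketch_jld *j, int idle_band_count)
{
    j->idle_kill_candidates = idle_band_count;
    j->idle_kills = 0;
}

/* Record one kill; only victims from the idle or aging bands are counted. */
static void sketch_jld_record_kill(struct sketch_jld *j, bool victim_was_idle_band)
{
    if (victim_was_idle_band) {
        j->idle_kills++;
    }
}

/* Escalate only once more idle-band kills happened than candidates were counted. */
static bool sketch_jld_should_escalate(const struct sketch_jld *j)
{
    return j->idle_kills > j->idle_kill_candidates;
}
```

Because the comparison is strict, killing exactly as many idle-band processes as were counted at the start of the window is not enough to escalate. One more idle-band kill within the same window is needed, which suggests idle processes are being relaunched and killed in a loop.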

Now let us look inside the branch:

if (*jld_idle_kills > jld_idle_kill_candidates) {
    jld_eval_aggressive_count++;

    if ((jld_eval_aggressive_count == memorystatus_jld_eval_aggressive_count) &&
        (total_corpses_count() > 0) && (*corpse_list_purged == FALSE)) {
        task_purge_all_corpses();
        *corpse_list_purged = TRUE;
    }
    else if (jld_eval_aggressive_count > memorystatus_jld_eval_aggressive_count) {
        if ((memorystatus_jld_eval_aggressive_priority_band_max < 0) ||
            (memorystatus_jld_eval_aggressive_priority_band_max >= MEMSTAT_BUCKET_COUNT)) {
            /* out of range: keep the default band */
        } else {
            jld_priority_band_max = memorystatus_jld_eval_aggressive_priority_band_max;
        }
    }

    // Kill background processes first
    /* Visit elevated processes first */
    while (elevated_bucket_count) {

        elevated_bucket_count--;

        os_reason_ref(jetsam_reason);
        killed = memorystatus_kill_elevated_process(
            cause,
            jetsam_reason,
            jld_eval_aggressive_count,
            &errors);

        if (killed) {
            *post_snapshot = TRUE;
            // If there is still pressure, keep killing apps
            if (memorystatus_avail_pages_below_pressure()) {
                /*
                 * Still under pressure.
                 * Find another pinned processes.
                 */
                continue;
            } else {
                return TRUE;
            }
        } else {
            break;
        }
    }

    // Then kill the lowest-priority process left, which may be the foreground app
    killed = memorystatus_kill_top_process_aggressive(
        kMemorystatusKilledVMThrashing,
        jld_eval_aggressive_count, 
        jld_priority_band_max, 
        &errors);
        
    if (killed) {
        /* Always generate logs after aggressive kill */
        *post_snapshot = TRUE;
        *jld_idle_kills = 0;
        return TRUE;
    } 
}

It first kills background processes via memorystatus_kill_elevated_process. After each kill it checks memory pressure, again using memorystatus_available_pages:

static boolean_t memorystatus_avail_pages_below_pressure(void) {
    return (memorystatus_available_pages <= memorystatus_available_pages_pressure);
}

If memorystatus_available_pages is still at or below the threshold, it goes on to kill the next process. Once all background processes have been killed and memory is still under pressure, memorystatus_kill_top_process_aggressive kills the lowest-priority process. This is the key to a FOOM: if the foreground app is already the lowest-priority process left, a FOOM occurs and the foreground app is killed.
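The kill-and-recheck loop over the elevated band can be sketched as follows. The function and its parameters are hypothetical, and it makes a simplifying assumption that is not taken from the kernel source: killing a process immediately returns all of its resident pages to the available pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Kill elevated (background) processes one at a time, rechecking pressure
 * after each kill. Returns true if pressure was relieved, false if every
 * elevated process was killed and the caller has to escalate further.
 * elevated_rpages[i] is the resident page count of the i-th victim.
 */
static bool sketch_kill_elevated(uint32_t *available_pages,
                                 uint32_t pressure_threshold,
                                 const uint32_t *elevated_rpages, size_t n,
                                 size_t *killed_out)
{
    size_t killed = 0;

    for (size_t i = 0; i < n; i++) {
        *available_pages += elevated_rpages[i];   /* assumed immediate reclaim */
        killed++;

        /* Same comparison as memorystatus_avail_pages_below_pressure(), negated. */
        if (*available_pages > pressure_threshold) {
            *killed_out = killed;
            return true;
        }
    }

    *killed_out = killed;
    return false;
}
```

When this returns false, the next step in the kernel flow described above is the aggressive kill, which is where a foreground app can become the victim.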

Computing memorystatus_available_pages

Whether a FOOM is triggered depends mainly on whether memorystatus_available_pages is at or below the threshold. So how is memorystatus_available_pages computed?

In the source we find:

#define VM_CHECK_MEMORYSTATUS do { \
    memorystatus_pages_update(      \
            vm_page_pageable_external_count + \
        vm_page_free_count +        \
            (VM_DYNAMIC_PAGING_ENABLED() ? 0 : vm_page_purgeable_count) \
        ); \
    } while(0)
    
void memorystatus_pages_update(unsigned int pages_avail)
{
    memorystatus_available_pages = pages_avail;
    ...
}

So memorystatus_available_pages = vm_page_pageable_external_count + vm_page_free_count + vm_page_purgeable_count, where the purgeable count is included only when VM_DYNAMIC_PAGING_ENABLED() is false.

  • vm_page_pageable_external_count: on iOS, the number of pages that already have a backing copy and can be reused when memory runs short
  • vm_page_free_count: the number of unused pages
  • vm_page_purgeable_count: the number of purgeable pages

In addition, memorystatus_available_pages_pressure is effectively 15% of the phone's total memory. In other words, an OOM is triggered once available memory drops to 15% of system memory or less.
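Putting the formula and the threshold together gives a compact check. The function names below are hypothetical, and the 15% figure is the value stated in this article rather than a constant read from the kernel source, so treat the threshold helper as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Available pages as computed by the VM_CHECK_MEMORYSTATUS macro. */
static uint32_t sketch_available_pages(uint32_t pageable_external,
                                       uint32_t free_pages,
                                       uint32_t purgeable,
                                       bool dynamic_paging_enabled)
{
    /* Purgeable pages count only when dynamic paging is disabled. */
    return pageable_external + free_pages +
           (dynamic_paging_enabled ? 0 : purgeable);
}

/* Pressure threshold, assuming it is 15% of total pages as stated in the text. */
static uint32_t sketch_pressure_threshold(uint32_t total_pages)
{
    return (uint32_t)((uint64_t)total_pages * 15 / 100);
}

/* Same comparison the kernel uses: at or below the threshold means pressure. */
static bool sketch_below_pressure(uint32_t available, uint32_t threshold)
{
    return available <= threshold;
}
```

For example, a device with 262144 pages (4 GB at 16 KB per page) would have a threshold of 39321 pages under this assumption, which is about 614 MB.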

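Logic summary

Looking at memorystatus_thread as a whole, the logic is:

  1. If kill_under_pressure_cause is
    kMemorystatusKilledVMThrashing,
    kMemorystatusKilledFCThrashing, or
    kMemorystatusKilledZoneMapExhaustion,
    or if the available memory memorystatus_available_pages is at or below the threshold memorystatus_available_pages_pressure, enter the OOM logic.
  2. Walk the processes and compare each one's phys_footprint with its limit. If a process is over its limit, kill it as a high-water kill, which triggers an OOM.
  3. If there are still processes in the JETSAM_PRIORITY_IDLE,
    system_procs_aging_band, or
    applications_aging_band bands, kill one lowest-priority process and go back to the check in step 1.
  4. Once all low-priority processes have been killed, if memorystatus_available_pages is still at or below the threshold, kill background processes first. After each kill, check memorystatus_available_pages again. Once it is no longer at or below the threshold, stop and go back to step 1.
  5. Once all background-priority processes have been killed, call memorystatus_kill_top_process_aggressive to kill the foreground process, then go back to step 1 again.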
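The five steps can be condensed into a toy simulation that kills one process per pass. Everything here is a hypothetical simplification: limits are expressed in pages, killing a process is assumed to return all of its resident pages at once, and steps 3 to 5 are collapsed into "kill the lowest band first", which yields the same idle, then background, then foreground order without modelling Jetsam Loop Detection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_reason { SKETCH_NO_KILL, SKETCH_HIGHWATER, SKETCH_PAGE_SHORTAGE };

struct sketch_proc {
    int pid;
    int priority;           /* jetsam band, lower bands are killed first */
    uint32_t rpages;        /* resident pages, freed when the process dies */
    uint32_t limit_pages;   /* 0 means no high-water limit */
    bool alive;
};

struct sketch_system {
    struct sketch_proc *procs;
    size_t nprocs;
    uint32_t available_pages;
    uint32_t pressure_pages;
};

/* One pass of the loop: kill at most one process and report why. */
static enum sketch_reason sketch_jetsam_pass(struct sketch_system *s, int *victim_pid)
{
    struct sketch_proc *over_limit = NULL;
    struct sketch_proc *lowest = NULL;

    /* Step 1: act only when available pages are at or below the threshold. */
    if (s->available_pages > s->pressure_pages) {
        return SKETCH_NO_KILL;
    }

    for (size_t i = 0; i < s->nprocs; i++) {
        struct sketch_proc *p = &s->procs[i];
        if (!p->alive) {
            continue;
        }
        /* Step 2 candidate: lowest-band process above its own limit. */
        if (p->limit_pages > 0 && p->rpages > p->limit_pages &&
            (over_limit == NULL || p->priority < over_limit->priority)) {
            over_limit = p;
        }
        /* Steps 3 to 5 candidate: lowest band, larger footprint breaks ties. */
        if (lowest == NULL || p->priority < lowest->priority ||
            (p->priority == lowest->priority && p->rpages > lowest->rpages)) {
            lowest = p;
        }
    }

    struct sketch_proc *victim = (over_limit != NULL) ? over_limit : lowest;
    if (victim == NULL) {
        return SKETCH_NO_KILL;
    }

    victim->alive = false;
    s->available_pages += victim->rpages;
    *victim_pid = victim->pid;
    return (over_limit != NULL) ? SKETCH_HIGHWATER : SKETCH_PAGE_SHORTAGE;
}
```

Calling sketch_jetsam_pass repeatedly on a system with an idle daemon, two background apps, and a foreground app kills them in that order until the available page count rises above the threshold. The foreground app dies only if everything below it was not enough.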

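Summary

According to the source, there are three ways a foreground OOM can be triggered:

  1. A synchronous kill is triggered directly, for example a kMemorystatusKilledPerProcessLimit OOM. Explaining this would take another article, so it is outside the scope of this one.
  2. footprint_in_bytes > memlimit_in_bytes, which triggers a high-water OOM. So far I have not seen this type of OOM on my own phone.
  3. After all background processes have been killed, memorystatus_available_pages <= memorystatus_available_pages_pressure still holds, so the system kills our app.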



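References:

OOM探究:XNU 内存状态管理 - 简书 (Exploring OOM: XNU memory status management, Jianshu)

带你打造一套 APM 监控系统 之 OOM 问题 (Building an APM monitoring system: the OOM problem)

iOS性能优化实践:头条抖音如何实现OOM崩溃率下降50% (iOS performance optimization in practice: how Toutiao and Douyin cut their OOM crash rate by 50%)

iOS内存abort(Jetsam) 原理探究 (Exploring how iOS memory abort (Jetsam) works)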
