How the process scheduling function scheduler_tick() is triggered: the periodic (PERIODIC) timer

Posted lingjiajun

References:

https://www.jb51.net/article/133579.htm

https://blog.csdn.net/flaoter/article/details/77509553

https://www.cnblogs.com/arnoldlu/p/7078204.html and the related time-subsystem series on the same blog, which cover the topic in detail.

Main source file locations:

kernel/msm-4.9/kernel/time/: tick-common.c, tick-sched.c, timer.c, hrtimer.c, clockevents.c and related source files

kernel/msm-4.9/drivers/clocksource.c

Some of the source files under the kernel/msm-4.9/drivers/clocksource directory

kernel/sched/core.c

 

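Overview

scheduler_tick() is the most important function in process scheduling, and most of the scheduler's periodic work starts from it. So what calls scheduler_tick() itself?

scheduler_tick() is the parent of the periodic scheduling routines, and it is called from the tick_device of the Linux time subsystem. The tick_device acts as a periodic timer with a period of one tick; each time its interrupt fires, the interrupt handler calls scheduler_tick().

Once tickless operation (dynamic tick) is enabled, the tick_device switches to oneshot mode, and the oneshot handler takes over the job of calling scheduler_tick().

This article briefly explains how this works.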

 

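NO_HZ dynamic ticks & hrtimer high-resolution timers

Because the tick is fired periodically by the tick device, it should be turned off while the system is idle in order to save power. This is why NO_HZ dynamic ticks were introduced: the periodic tick is stopped when the CPU is idle and only the idle task is running, and it is restarted when the CPU leaves the idle task.

For details on NO_HZ, see: https://www.kernel.org/doc/Documentation/timers/NO_HZ.txt

The time subsystem on the SDM845 platform is built on NO_HZ dynamic ticks and high-resolution timers. The corresponding .config options are: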

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

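---CONFIG_NO_HZ_IDLE (no tick while idle, normal tick otherwise)------------this is the option used on the current platform

---CONFIG_NO_HZ_FULL (the tick is stopped when the CPU is idle or has only one runnable task; normal tick otherwise)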

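With NO_HZ there are three modes:

  • dynamic tick not yet active (NOHZ_MODE_INACTIVE)
  • low-resolution mode (NOHZ_MODE_LOWRES)
  • high-resolution mode (NOHZ_MODE_HIGHRES)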
enum tick_nohz_mode {
    NOHZ_MODE_INACTIVE,
    NOHZ_MODE_LOWRES,
    NOHZ_MODE_HIGHRES,
};

/**
 * struct tick_sched - sched tick emulation and no idle tick control/stats
 * @sched_timer:    hrtimer to schedule the periodic tick in high
 *            resolution mode
 * @last_tick:        Store the last tick expiry time when the tick
 *            timer is modified for nohz sleeps. This is necessary
 *            to resume the tick timer operation in the timeline
 *            when the CPU returns from nohz sleep.
 * @tick_stopped:    Indicator that the idle tick has been stopped
 * @idle_jiffies:    jiffies at the entry to idle for idle time accounting
 * @idle_calls:        Total number of idle calls
 * @idle_sleeps:    Number of idle calls, where the sched tick was stopped
 * @idle_entrytime:    Time when the idle call was entered
 * @idle_waketime:    Time when the idle was interrupted
 * @idle_exittime:    Time when the idle state was left
 * @idle_sleeptime:    Sum of the time slept in idle with sched tick stopped
 * @iowait_sleeptime:    Sum of the time slept in idle with sched tick stopped, with IO outstanding
 * @sleep_length:    Duration of the current idle sleep
 * @do_timer_lst:    CPU was the last one doing do_timer before going idle
 */
struct tick_sched {
    struct hrtimer            sched_timer;
    unsigned long            check_clocks;
    enum tick_nohz_mode        nohz_mode;
    ktime_t                last_tick;
    int                inidle;
    int                tick_stopped;
    unsigned long            idle_jiffies;
    unsigned long            idle_calls;
    unsigned long            idle_sleeps;
    int                idle_active;
    ktime_t                idle_entrytime;
    ktime_t                idle_waketime;
    ktime_t                idle_exittime;
    ktime_t                idle_sleeptime;
    ktime_t                iowait_sleeptime;
    ktime_t                sleep_length;
    unsigned long            last_jiffies;
    u64                next_timer;
    ktime_t                idle_expires;
    int                do_timer_last;
    atomic_t            tick_dep_mask;
};

 

tick_device

The data structures related to tick_device are shown below.

1. A tick_device supports two operating modes: periodic and oneshot.

enum tick_device_mode {
    TICKDEV_MODE_PERIODIC,
    TICKDEV_MODE_ONESHOT,
};

struct tick_device {
    struct clock_event_device *evtdev;
    enum tick_device_mode mode;
};

A tick device is created through the tick_check_new_device function:

/*
 * Check, if the new registered device should be used. Called with
 * clockevents_lock held and interrupts disabled.
 */
void tick_check_new_device(struct clock_event_device *newdev)
{
    struct clock_event_device *curdev;
    struct tick_device *td;
    int cpu;

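    cpu = smp_processor_id();       // get the current CPU id
    td = &per_cpu(tick_cpu_device, cpu);    // get the tick device structure of the current CPU
    curdev = td->evtdev;

    /* cpu local device ? */
    if (!tick_check_percpu(curdev, newdev, cpu))    // check whether newdev can serve as this CPU's local device; if not, take the out_bc branch below and try it as the broadcast device
        goto out_bc;

    /* Preference decision */
    if (!tick_check_preferred(curdev, newdev))      // prefer a oneshot-capable device; if a tick device already exists, pick the one with the higher rating, except that a CPU-local device with a lower rating still wins over a non-CPU-local device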
        goto out_bc;

    if (!try_module_get(newdev->owner))
        return;

    /*
     * Replace the eventually existing device by the new
     * device. If the current device is the broadcast device, do
     * not give it back to the clockevents layer !
     */
    if (tick_is_broadcast_device(curdev)) {
        clockevents_shutdown(curdev);
        curdev = NULL;
    }
    clockevents_exchange_device(curdev, newdev);    // swap in the new clock_event_device
    tick_setup_device(td, newdev, cpu, cpumask_of(cpu));    // set up the tick device, analyzed in detail below
    if (newdev->features & CLOCK_EVT_FEAT_ONESHOT)
        tick_oneshot_notify();
    return;

out_bc:
    /*
     * Can the new device be used as a broadcast device ?
     */
    tick_install_broadcast_device(newdev);
}

tick_setup_device() then initializes the tick device further. On the first setup for a CPU, the mode can only be periodic.

/*
 * Setup the tick device
 */
static void tick_setup_device(struct tick_device *td,
                  struct clock_event_device *newdev, int cpu,
                  const struct cpumask *cpumask)
{
    ktime_t next_event;
    void (*handler)(struct clock_event_device *) = NULL;

    /*
     * First device setup ?
     */
    if (!td->evtdev) {      // this is the first tick device registered for the current CPU
        /*
         * If no cpu took the do_timer update, assign it to
         * this cpu:
         */
        if (tick_do_timer_cpu == TICK_DO_TIMER_BOOT) {  // this CPU's tick device takes over updating global time state such as jiffies
            if (!tick_nohz_full_cpu(cpu))
                tick_do_timer_cpu = cpu;
            else
                tick_do_timer_cpu = TICK_DO_TIMER_NONE;
            tick_next_period = ktime_get();
            tick_period = ktime_set(0, NSEC_PER_SEC / HZ);  // HZ is the number of ticks per second, so this is the length of one tick in ns
        }

        /*
         * Startup in periodic mode first.
         */
        td->mode = TICKDEV_MODE_PERIODIC;       // the first tick device set up on a CPU defaults to periodic mode
    } else {
        handler = td->evtdev->event_handler;
        next_event = td->evtdev->next_event;
        td->evtdev->event_handler = clockevents_handle_noop;
    }

    td->evtdev = newdev;      // point the tick device's evtdev at the clock_event_device; this is the key step that binds the tick device to its hardware timer

    /*
     * When the device is not per cpu, pin the interrupt to the
     * current cpu:
     */
    if (!cpumask_equal(newdev->cpumask, cpumask))
        irq_set_affinity(newdev->irq, cpumask);

    /*
     * When global broadcasting is active, check if the current
     * device is registered as a placeholder for broadcast mode.
     * This allows us to handle this x86 misfeature in a generic
     * way. This function also returns !=0 when we keep the
     * current active broadcast state for this CPU.
     */
    if (tick_device_uses_broadcast(newdev, cpu))
        return;

    if (td->mode == TICKDEV_MODE_PERIODIC)
        tick_setup_periodic(newdev, 0);
    else
        tick_setup_oneshot(newdev, handler, next_event);
}

In tick_setup_periodic(), the interrupt handler is installed first, and then the timer is started.

/*
 * Setup the device for a periodic tick
 */
void tick_setup_periodic(struct clock_event_device *dev, int broadcast)
{
    tick_set_periodic_handler(dev, broadcast);      // install the handler

    /* Broadcast setup ? */
    if (!tick_device_is_functional(dev))
        return;

    if ((dev->features & CLOCK_EVT_FEAT_PERIODIC) &&
        !tick_broadcast_oneshot_active()) {
        clockevents_switch_state(dev, CLOCK_EVT_STATE_PERIODIC);        // set the operating state of the clock event device
    } else {
        unsigned long seq;
        ktime_t next;

        do {
            seq = read_seqbegin(&jiffies_lock);
            next = tick_next_period;
        } while (read_seqretry(&jiffies_lock, seq));

        clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT);

        for (;;) {
            if (!clockevents_program_event(dev, next, false))
                return;
            next = ktime_add(next, tick_period);
        }
    }
}

The interrupt handler used here is the non-broadcast one.

/*
 * Set the periodic handler depending on broadcast on/off
 */
void tick_set_periodic_handler(struct clock_event_device *dev, int broadcast)
{
    if (!broadcast)
        dev->event_handler = tick_handle_periodic;      // non-broadcast
    else
        dev->event_handler = tick_handle_periodic_broadcast;
}

 

/**
 * clockevents_switch_state - set the operating state of a clock event device
 * @dev:    device to modify
 * @state:    new state
 *
 * Must be called with interrupts disabled !
 */
void clockevents_switch_state(struct clock_event_device *dev,
                  enum clock_event_state state)
{
    if (clockevent_get_state(dev) != state) {
        if (__clockevents_switch_state(dev, state)) // switch the operating state
            return;

        clockevent_set_state(dev, state);

        /*
         * A nsec2cyc multiplicator of 0 is invalid and we'd crash
         * on it, so fix it up and emit a warning:
         */
        if (clockevent_state_oneshot(dev)) {
            if (unlikely(!dev->mult)) {
                dev->mult = 1;
                WARN_ON(1);
            }
        }
    }
}

This finally calls the device-specific state callback:

static int __clockevents_switch_state(struct clock_event_device *dev,
                      enum clock_event_state state)
{
    if (dev->features & CLOCK_EVT_FEAT_DUMMY)
        return 0;

    /* Transition with new state-specific callbacks */
    switch (state) {
    case CLOCK_EVT_STATE_DETACHED:
        /* The clockevent device is getting replaced. Shut it down. */

    case CLOCK_EVT_STATE_SHUTDOWN:
        if (dev->set_state_shutdown)
            return dev->set_state_shutdown(dev);
        return 0;

    case CLOCK_EVT_STATE_PERIODIC:
        /* Core internal bug */
        if (!(dev->features & CLOCK_EVT_FEAT_PERIODIC))
            return -ENOSYS;
        if (dev->set_state_periodic)
            return dev->set_state_periodic(dev);    // call the device's set_state_periodic callback; the current platform does not provide one, so this would fall through to return 0
        return 0;

    case CLOCK_EVT_STATE_ONESHOT:
        /* Core internal bug */
        if (!(dev->features & CLOCK_EVT_FEAT_ONESHOT))
            return -ENOSYS;
        if (dev->set_state_oneshot)
            return dev->set_state_oneshot(dev);
        return 0;

    case CLOCK_EVT_STATE_ONESHOT_STOPPED:
        /* Core internal bug */
        if (WARN_ONCE(!clockevent_state_oneshot(dev),
                  "Current state: %d\n",
                  clockevent_get_state(dev)))
            return -EINVAL;

        if (dev->set_state_oneshot_stopped)
            return dev->set_state_oneshot_stopped(dev);
        else
            return -ENOSYS;

    default:
        return -ENOSYS;
    }
}

As noted above, set_state_periodic is not defined. So which path is taken? Recall the check in tick_setup_periodic():

    if ((dev->features & CLOCK_EVT_FEAT_PERIODIC) &&
        !tick_broadcast_oneshot_active())

 

The clock event device on the current platform does not support CLOCK_EVT_FEAT_PERIODIC, so the code takes the else branch and emulates a periodic tick:

    if ((dev->features & CLOCK_EVT_FEAT_PERIODIC) &&
        !tick_broadcast_oneshot_active()) {
        clockevents_switch_state(dev, CLOCK_EVT_STATE_PERIODIC);
    } else {    // take the else branch and emulate a periodic tick
        unsigned long seq;
        ktime_t next;

        do {
            seq = read_seqbegin(&jiffies_lock);
            next = tick_next_period;
        } while (read_seqretry(&jiffies_lock, seq));

        clockevents_switch_state(dev, CLOCK_EVT_STATE_ONESHOT);  // there is no set_state_oneshot callback either, so this simply returns 0

        for (;;) {
            if (!clockevents_program_event(dev, next, false))   // an infinite loop, but normally clockevents_program_event() returns 0 and we return on the first pass
                return;
            next = ktime_add(next, tick_period);
        }
    }

 

/**
 * clockevents_program_event - Reprogram the clock event device.
 * @dev:    device to program
 * @expires:    absolute expiry time (monotonic clock)
 * @force:    program minimum delay if expires can not be set
 *
 * Returns 0 on success, -ETIME when the event is in the past.
 */
int clockevents_program_event(struct clock_event_device *dev, ktime_t expires,
                  bool force)
{
    unsigned long long clc;
    int64_t delta;
    int rc;

    if (unlikely(expires.tv64 < 0)) {
        WARN_ON_ONCE(1);
        return -ETIME;
    }

    dev->next_event = expires;

    if (clockevent_state_shutdown(dev))
        return 0;

    /* We must be in ONESHOT state here */
    WARN_ONCE(!clockevent_state_oneshot(dev), "Current state: %d\n",
          clockevent_get_state(dev));

    /* Shortcut for clockevent devices that can deal with ktime. */
    if (dev->features & CLOCK_EVT_FEAT_KTIME)
        return dev->set_next_ktime(expires, dev);

    delta = ktime_to_ns(ktime_sub(expires, ktime_get()));
    if (delta <= 0)
        return force ? clockevents_program_min_delta(dev) : -ETIME;

    delta = min(delta, (int64_t) dev->max_delta_ns);
    delta = max(delta, (int64_t) dev->min_delta_ns);

    clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
    rc = dev->set_next_event((unsigned long) clc, dev);

    return (rc && force) ? clockevents_program_min_delta(dev) : rc;
}

 

 

/**
 * clockevents_program_min_delta - Set clock event device to the minimum delay.
 * @dev:    device to program
 *
 * Returns 0 on success, -ETIME when the retry loop failed.
 */
static int clockevents_program_min_delta(struct clock_event_device *dev)
{
    unsigned long long clc;
    int64_t delta;
    int i;

    for (i = 0;;) {
        delta = dev->min_delta_ns;
        dev->next_event = ktime_add_ns(ktime_get(), delta);

        if (clockevent_state_shutdown(dev))
            return 0;

        dev->retries++;
        clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
        if (dev->set_next_event((unsigned long) clc, dev) == 0)
            return 0;

        if (++i > 2) {
            /*
             * We tried 3 times to program the device with the
             * given min_delta_ns. Try to increase the minimum
             * delta, if that fails as well get out of here.
             */
            if (clockevents_increase_min_delta(dev))
                return -ETIME;
            i = 0;
        }
    }
}

 

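When the timer fires, the interrupt handler tick_handle_periodic() is called. The second half of the function reprograms the next expiry of the timer, so the next expiry again enters the handler. Repeating this emulates a periodic tick.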

/*
 * Event handler for periodic ticks
 */
void tick_handle_periodic(struct clock_event_device *dev)
{
    int cpu = smp_processor_id();
    ktime_t next = dev->next_event;

    tick_periodic(cpu);        // updates wall time etc. and calls update_process_times()

#if defined(CONFIG_HIGH_RES_TIMERS) || defined(CONFIG_NO_HZ_COMMON)
    /*
     * The cpu might have transitioned to HIGHRES or NOHZ mode via
     * update_process_times() -> run_local_timers() ->
     * hrtimer_run_queues().
     */
    if (dev->event_handler != tick_handle_periodic)
        return;
#endif

    if (!clockevent_state_oneshot(dev))
        return;
    for (;;) {
        /*
         * Setup the next period for devices, which do not have
         * periodic mode:
         */
        next = ktime_add(next, tick_period);

        if (!clockevents_program_event(dev, next, false))
            return;
        /*
         * Have to be careful here. If we're in oneshot mode,
         * before we call tick_periodic() in a loop, we need
         * to be sure we're using a real hardware clocksource.
         * Otherwise we could get trapped in an infinite
         * loop, as the tick_periodic() increments jiffies,
         * which then will increment time, possibly causing
         * the loop to trigger again and again.
         */
        if (timekeeping_valid_for_hres())
            tick_periodic(cpu);
    }
}

 

tick_periodic() looks like this:

/*
 * Periodic tick
 */
static void tick_periodic(int cpu)
{
    if (tick_do_timer_cpu == cpu) {
        write_seqlock(&jiffies_lock);

        /* Keep track of the next tick event */
        tick_next_period = ktime_add(tick_next_period, tick_period);

        do_timer(1);
        write_sequnlock(&jiffies_lock);
        update_wall_time();  // update wall time
    }

    update_process_times(user_mode(get_irq_regs()));
    profile_tick(CPU_PROFILING);  // code profiler hook
}

 

 

/*
 * Called from the timer interrupt handler to charge one tick to the current
 * process.  user_tick is 1 if the tick is user time, 0 for system.
 */
void update_process_times(int user_tick)
{
    struct task_struct *p = current;

    /* Note: this timer irq context must be accounted for as well. */
    account_process_tick(p, user_tick);
    run_local_timers();
    rcu_check_callbacks(user_tick);
#ifdef CONFIG_IRQ_WORK
    if (in_irq())
        irq_work_tick();
#endif
    scheduler_tick();    // call scheduler_tick()
    run_posix_cpu_timers(p);
}

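A brief note on the code profiler hook, profile_tick():

        profile_tick() collects data for the code profiler. On uniprocessor systems it is called by do_timer_interrupt() (that is, by the global timer interrupt handler); on multiprocessor systems it is called by smp_local_timer_interrupt() (that is, by the local timer interrupt handler).

        To enable the code profiler, the string parameter "profile=N" must be passed to the Linux kernel at boot, where 2^N is the size of the code fragments to be profiled. The collected data can be read from the /proc/profile file. Writing to this file resets the counters; on multiprocessor systems, writing to it can also change the sampling frequency. Kernel developers usually do not access /proc/profile directly, however, but use the readprofile system command.

        The Linux 2.6 kernel also includes another profiler called oprofile. Besides being more flexible and customizable than readprofile, oprofile can also be used to find hot spots in kernel code, user-mode applications, and system libraries. When oprofile is in use, profile_tick() calls timer_notify() to collect the data used by this profiler.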

 
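Back to the original question. scheduler_tick() is reached through one of two call chains: tick interrupt -> tick_periodic() -> update_process_times() -> scheduler_tick(), or tick interrupt -> tick_sched_handle() -> update_process_times() -> scheduler_tick(). This article analyzed the former; readers interested in the latter can follow the code themselves.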

 

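The rest of this article supplements the above with background on the Linux time subsystem: how the dev structure is initialized and filled in, including where clock_event_device->features is set to CLOCK_EVT_FEAT_ONESHOT.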

 

clock source & clock event device

The DTS contains two timer nodes.

DTS configuration:

    timer {
        compatible = "arm,armv8-timer";
        interrupts = <1 1 0xf08>,
                 <1 2 0xf08>,
                 <1 3 0xf08>,
                 <1 0 0xf08>;
        clock-frequency = <19200000>;
    };

    timer@0x17C90000{
        #address-cells = <1>;
        #size-cells = <1>;
        ranges;
        compatible = "arm,armv7-timer-mem";
        reg = <0x17C90000 0x1000>;
        clock-frequency = <19200000>;
           ..... 
    };

./kernel/msm-4.9/drivers/clocksource/arm_arch_timer.c initializes the clock source based on these two timers.

CLOCKSOURCE_OF_DECLARE(armv8_arch_timer, "arm,armv8-timer", arch_timer_of_init);

static int __init arch_timer_of_init(struct device_node *np)
{
    int i;

    if (arch_timers_present & ARCH_CP15_TIMER) {
        pr_warn("arch_timer: multiple nodes in dt, skipping\n");
        return 0;
    }

    arch_timers_present |= ARCH_CP15_TIMER;                      // CP15 timer
    for (i = PHYS_SECURE_PPI; i < MAX_TIMER_PPI; i++)
        arch_timer_ppi[i] = irq_of_parse_and_map(np, i);

    arch_timer_detect_rate(NULL, np);                            // read the frequency from the DTS: 19.2 MHz

    arch_timer_c3stop = !of_property_read_bool(np, "always-on");

#ifdef CONFIG_FSL_ERRATUM_A008585
    if (fsl_a008585_enable < 0)
        fsl_a008585_enable = of_property_read_bool(np, "fsl,erratum-a008585");
    if (fsl_a008585_enable) {
        static_branch_enable(&arch_timer_read_ool_enabled);
        pr_info("Enabling workaround for FSL erratum A-008585\n");
    }
#endif

    /*
     * If we cannot rely on firmware initializing the timer registers then
     * we should use the physical timers instead.
     */
    if (IS_ENABLED(CONFIG_ARM) &&
        of_property_read_bool(np, "arm,cpu-registers-not-fw-configured"))
        arch_timer_uses_ppi = PHYS_SECURE_PPI;

    /* On some systems, the counter stops ticking when in suspend. */
    arch_counter_suspend_stop = of_property_read_bool(np,
                             "arm,no-tick-in-suspend");

    return arch_timer_init();                     // continue with the rest of the initialization
}

 

static int __init arch_timer_init(void)
{
    int ret;
    /*
     * If HYP mode is available, we know that the physical timer
     * has been configured to be accessible from PL1. Use it, so
     * that a guest can use the virtual timer instead.
     *
     * If no interrupt provided for virtual timer, we'll have to
     * stick to the physical timer. It'd better be accessible...
     *
     * On ARMv8.1 with VH extensions, the kernel runs in HYP. VHE
     * accesses to CNTP_*_EL1 registers are silently redirected to
     * their CNTHP_*_EL2 counterparts, and use a different PPI
     * number.
     */
    if (is_hyp_mode_available() || !arch_timer_ppi[VIRT_PPI]) {
        bool has_ppi;

        if (is_kernel_in_hyp_mode()) {
            arch_timer_uses_ppi = HYP_PPI;
            has_ppi = !!arch_timer_ppi[HYP_PPI];
        } else {
            arch_timer_uses_ppi = PHYS_SECURE_PPI;
            has_ppi = (!!arch_timer_ppi[PHYS_SECURE_PPI] ||
                   !!arch_timer_ppi[PHYS_NONSECURE_PPI]);
        }

        if (!has_ppi) {
            pr_warn("arch_timer: No interrupt available, giving up\n");
            return -EINVAL;
        }
    }

    ret = arch_timer_register();               // (1) register the timer
    if (ret)
        return ret;

    ret = arch_timer_common_init();             // (2) timer-related initialization
    if (ret)
        return ret;

    arch_timer_kvm_info.virtual_irq = arch_timer_ppi[VIRT_PPI];

    return 0;
}

 

(1) Registering the timer:

static int __init arch_timer_register(void)
{
    int err;
    int ppi;

    arch_timer_evt = alloc_percpu(struct clock_event_device);
    if (!arch_timer_evt) {
        err = -ENOMEM;
        goto out;
    }

    ppi = arch_timer_ppi[arch_timer_uses_ppi];
    switch (arch_timer_uses_ppi) {
    case VIRT_PPI:
        err = request_percpu_irq(ppi, arch_timer_handler_virt,          // only requests the per-CPU irq (private to each CPU, not shared between CPUs) without enabling it (it is enabled in the startup callback); arch_timer_handler_virt is the interrupt handler
                     "arch_timer", arch_timer_evt);
        break;
    case PHYS_SECURE_PPI:
    case PHYS_NONSECURE_PPI:
        err = request_percpu_irq(ppi, arch_timer_handler_phys,
                     "arch_timer", arch_timer_evt);
        if (!err && arch_timer_ppi[PHYS_NONSECURE_PPI]) {
            ppi = arch_timer_ppi[PHYS_NONSECURE_PPI];
            err = request_percpu_irq(ppi, arch_timer_handler_phys,
                         "arch_timer", arch_timer_evt);
            if (err)
                free_percpu_irq(arch_timer_ppi[PHYS_SECURE_PPI],
                        arch_timer_evt);
        }
        break;
    case HYP_PPI:
        err = request_percpu_irq(ppi, arch_timer_handler_phys,
                     "arch_timer", arch_timer_evt);
        break;
    default:
        BUG();
    }

    if (err) {
        pr_err("arch_timer: can't register interrupt %d (%d)\n",
               ppi, err);
        goto out_free;
    }

    err = arch_timer_cpu_pm_init();                                    // register notifiers for CPUs and CPU clusters entering/leaving low power states
    if (err)
        goto out_unreg_notify;


    /* Register and immediately configure the timer on the boot CPU */
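    err = cpuhp_setup_state(CPUHP_AP_ARM_ARCH_TIMER_STARTING,         // (1.1) register callbacks for the TIMER_STARTING hotplug state and configure the timer on the boot CPU right away; the last two arguments are the callbacks run when a CPU is brought up / taken down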
                "AP_ARM_ARCH_TIMER_STARTING",
                arch_timer_starting_cpu, arch_timer_dying_cpu);
    if (err)
        goto out_unreg_cpupm;
    return 0;

out_unreg_cpupm:
    arch_timer_cpu_pm_deinit();

out_unreg_notify:
    free_percpu_irq(arch_timer_ppi[arch_timer_uses_ppi], arch_timer_evt);
    if (arch_timer_has_nonsecure_ppi())
        free_percpu_irq(arch_timer_ppi[PHYS_NONSECURE_PPI],
                arch_timer_evt);

out_free:
    free_percpu(arch_timer_evt);
out:
    return err;
}

(1.1) __cpuhp_setup_state() registers and invokes arch_timer_starting_cpu:

/**
 * __cpuhp_setup_state - Setup the callbacks for an hotplug machine state
 * @state:    The state to setup
 * @invoke:    If true, the startup function is invoked for cpus where
 *        cpu state >= @state
 * @startup:    startup callback function
 * @teardown:    teardown callback function
 *
 * Returns 0 if successful, otherwise a proper error code
 */
int __cpuhp_setup_state(enum cpuhp_state state,
            const char *name, bool invoke,
            int (*startup)(unsigned int cpu),
            int (*teardown)(unsigned int cpu),
            bool multi_instance)
{
    int cpu, ret = 0;
    int dyn_state = 0;

    if (cpuhp_cb_check(state) || !name)
        return -EINVAL;

    get_online_cpus();
    mutex_lock(&cpuhp_state_mutex);

    /* currently assignments for the ONLINE state are possible */
    if (state == CPUHP_AP_ONLINE_DYN) {
        dyn_state = 1;
        ret = cpuhp_reserve_state(state);
        if (ret < 0)
            goto out;
        state = ret;
    }

    cpuhp_store_callbacks(state, name, startup, teardown, multi_instance);    // store the callbacks: sp->startup.single = startup; sp->teardown.single = teardown;

    if (!invoke || !startup)
        goto out;

    /*
     * Try to call the startup callback for each present cpu
     * depending on the hotplug state of the cpu.
     */
    for_each_present_cpu(cpu) {
        struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
        int cpustate = st->state;

        if (cpustate < state)
            continue;

        ret = cpuhp_issue_call(cpu, state, true, NULL);               // (1.1.1) invoke the startup callback on each CPU that has already reached this state
        if (ret) {
            if (teardown)
                cpuhp_rollback_install(cpu, state, NULL);
            cpuhp_store_callbacks(state, NULL, NULL, NULL, false);
            goto out;
        }
    }
out:
    mutex_unlock(&cpuhp_state_mutex);

    put_online_cpus();
    if (!ret && dyn_state)
        return state;
    return ret;
}

 

(1.1.1) Invoking the startup callback to configure the clock event device

...
        cb = bringup ? step->startup.single : step->teardown.single;
        if (!cb)
            return 0;
        ret = cb(cpu);
...

 

static int arch_timer_starting_cpu(unsigned int cpu)
{
    struct clock_event_device *clk = this_cpu_ptr(arch_timer_evt);
    u32 flags;

    __arch_timer_setup(ARCH_CP15_TIMER, clk);                      // (1.1.1.1) set up and configure the clock event device

    flags = check_ppi_trigger(arch_timer_ppi[arch_timer_uses_ppi]);
    enable_percpu_irq(arch_timer_ppi[arch_timer_uses_ppi], flags);          // this is where the timer's per-CPU irq is actually enabled

    if (arch_timer_has_nonsecure_ppi()) {
        flags = check_ppi_trigger(arch_timer_ppi[PHYS_NONSECURE_PPI]);
        enable_percpu_irq(arch_timer_ppi[PHYS_NONSECURE_PPI], flags);
    }

    arch_counter_set_user_access();                            // configure user-space access: the timers and the physical counter are not accessible from user space, only the virtual counter is
    if (evtstrm_enable)
        arch_timer_configure_evtstream();

    return 0;
}

 

(1.1.1.1) Configuring the clock event device

Here arch_sys_timer is the tick_device and arch_mem_timer is the broadcast device.

The one being set up here is arch_sys_timer.

static void __arch_timer_setup(unsigned type,
                   struct clock_event_device *clk)
{
    clk->features = CLOCK_EVT_FEAT_ONESHOT;

    if (type == ARCH_CP15_TIMER) {
        if (arch_timer_c3stop)
            clk->features |= CLOCK_EVT_FEAT_C3STOP;
        clk->name = "arch_sys_timer";
        clk->rating = 450;
        clk->cpumask = cpumask_of(smp_processor_id());
        clk->irq = arch_timer_ppi[arch_timer_uses_ppi];
        switch (arch_timer_uses_ppi) {
        case VIRT_PPI:
            clk->set_state_shutdown = arch_timer_shutdown_virt;        // these are the callbacks of the clock event device; note that set_state_periodic is indeed not set
            clk->set_state_oneshot_stopped = arch_timer_shutdown_virt;
            clk->set_next_event = arch_timer_set_next_event_virt;
            break;
        case PHYS_SECURE_PPI:
        case PHYS_NONSECURE_PPI:
        case HYP_PPI:
            clk->set_state_shutdown = arch_timer_shutdown_phys;
            clk->set_state_oneshot_stopped = arch_timer_shutdown_phys;
            clk->set_next_event = arch_timer_set_next_event_phys;
            break;
        default:
            BUG();
        }

        fsl_a008585_set_sne(clk);
    } else {
        clk->features |= CLOCK_EVT_FEAT_DYNIRQ;
        clk->name = "arch_mem_timer";
        clk->rating = 400;
        clk->cpumask = cpu_all_mask;
        if (arch_timer_mem_use_virtual) {
            clk->set_state_shutdown = arch_timer_shutdown_virt_mem;
            clk->set_state_oneshot_stopped = arch_timer_shutdown_virt_mem;
            clk->set_next_event =
                arch_timer_set_next_event_virt_mem;
        } else {
            clk->set_state_shutdown = arch_timer_shutdown_phys_mem;
            clk->set_state_oneshot_stopped = arch_timer_shutdown_phys_mem;
            clk->set_next_event =
                arch_timer_set_next_event_phys_mem;
        }
    }

    clk->set_state_shutdown(clk);                          // shut the clock event device down first

    clockevents_config_and_register(clk, arch_timer_rate, 0xf, 0x7fffffff); // finish configuring the clock event device and register it with the system
}

/**
 * clockevents_config_and_register - Configure and register a clock event device
 * @dev:    device to register
 * @freq:    The clock frequency
 * @min_delta:    The minimum clock ticks to program in oneshot mode
 * @max_delta:    The maximum clock ticks to program in oneshot mode
 *
 * min/max_delta can be 0 for devices which do not support oneshot mode.
 */
void clockevents_config_and_register(struct clock_event_device *dev,
                     u32 freq, unsigned long min_delta,
                     unsigned long max_delta)
{
    dev->min_delta_ticks = min_delta;
    dev->max_delta_ticks = max_delta;
    clockevents_config(dev, freq);          // configure for the 19.2 MHz clock and derive the maximum sleep time from the max ticks
    clockevents_register_device(dev);        // register the device
}
EXPORT_SYMBOL_GPL(clockevents_config_and_register);

 

 

/**
 * clockevents_register_device - register a clock event device
 * @dev:    device to register
 */
void clockevents_register_device(struct clock_event_device *dev)
{
    unsigned long flags;

    /* Initialize state to DETACHED */
    clockevent_set_state(dev, CLOCK_EVT_STATE_DETACHED);  // initialize the state

    if (!dev->cpumask) {
        WARN_ON(num_possible_cpus() > 1);
        dev->cpumask = cpumask_of(smp_processor_id());
    }

    raw_spin_lock_irqsave(&clockevents_lock, flags);

    list_add(&dev->list, &clockevent_devices);         // add to the list of clock event devices
    tick_check_new_device(dev);                  // back to the tick device creation analyzed at the start
    clockevents_notify_released();

    raw_spin_unlock_irqrestore(&clockevents_lock, flags);
}
EXPORT_SYMBOL_GPL(clockevents_register_device);

 

 

(2) Timer-related initialization:

static int __init arch_timer_common_init(void)
{
    unsigned mask = ARCH_CP15_TIMER | ARCH_MEM_TIMER;

    /* Wait until both nodes are probed if we have two timers */
    if ((arch_timers_present & mask) != mask) {                                          // wait here until both "arm,armv7-timer-mem" (analyzed below) and "arm,armv8-timer" have been probed before going on
        if (arch_timer_needs_probing(ARCH_MEM_TIMER, arch_timer_mem_of_match))
            return 0;
        if (arch_timer_needs_probing(ARCH_CP15_TIMER, arch_timer_of_match))
            return 0;
    }

    arch_timer_banner(arch_timers_present);            // print key timer debug information. LOG: 03-09 04:17:39.725  root     0     0 I arm_arch_timer: Architected cp15 and mmio timer(s) running at 19.20MHz (virt/virt).
    arch_counter_register(arch_timers_present);        // (2.1) register and initialize the counter
    clocksource_select_force();              // select the clock source, i.e. the arch_sys_counter added to the clocksource list in the previous step
    return arch_timer_arch_init();            // configure and register the delay timer
}

 

(2.1) Counter registration and initialization

static struct clocksource clocksource_counter = {
	.name	= "arch_sys_counter",
	.rating	= 400,
	.read	= arch_counter_read,
	.mask	= CLOCKSOURCE_MASK(56),
	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
};

static void __init arch_counter_register(unsigned type)
{
    u64 start_count;

    /* Register the CP15 based counter if we have one */
    if (type & ARCH_CP15_TIMER) {
        if (IS_ENABLED(CONFIG_ARM64) || arch_timer_uses_ppi == VIRT_PPI)
            arch_timer_read_counter = arch_counter_get_cntvct;    // provide the read interface
        else
            arch_timer_read_counter = arch_counter_get_cntpct;

        clocksource_counter.archdata.vdso_direct = true;

#ifdef CONFIG_FSL_ERRATUM_A008585
        /*
         * Don't use the vdso fastpath if errata require using
         * the out-of-line counter accessor.
         */
        if (static_branch_unlikely(&arch_timer_read_ool_enabled))
            clocksource_counter.archdata.vdso_direct = false;
#endif
    } else {
        arch_timer_read_counter = arch_counter_get_cntvct_mem;
    }

    if (!arch_counter_suspend_stop)
        clocksource_counter.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
    start_count = arch_timer_read_counter();
    clocksource_register_hz(&clocksource_counter, arch_timer_rate);    // install the clocksource (19.2 MHz): add it to the clocksource list and compute mult and shift. LOG: 03-09 04:17:39.725 root 0 0 I clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x46d987e47, max_idle_ns: 440795202767 ns
    cyclecounter.mult = clocksource_counter.mult;
    cyclecounter.shift = clocksource_counter.shift;
    timecounter_init(&arch_timer_kvm_info.timecounter,    // initialize the timecounter with the computed mult and shift
             &cyclecounter, start_count);

    /* 56 bits minimum, so we assume worst case rollover */
    sched_clock_register(arch_timer_read_counter, 56, arch_timer_rate);    // (2.1.1) register the sched clock source
}

 

(2.1.1) Registering the sched clock source

void __init
sched_clock_register(u64 (*read)(void), int bits, unsigned long rate)
{
    u64 res, wrap, new_mask, new_epoch, cyc, ns;
    u32 new_mult, new_shift;
    unsigned long r;
    char r_unit;
    struct clock_read_data rd;

    if (cd.rate > rate)
        return;

    WARN_ON(!irqs_disabled());

    /* Calculate the mult/shift to convert counter ticks to ns. */
    clocks_calc_mult_shift(&new_mult, &new_shift, rate, NSEC_PER_SEC, 3600);    // compute mult/shift, used to convert a tick count to ns

    new_mask = CLOCKSOURCE_MASK(bits);
    cd.rate = rate;

    /* Calculate how many nanosecs until we risk wrapping */
    wrap = clocks_calc_max_nsecs(new_mult, new_shift, 0, new_mask, NULL);      // compute how many ns may elapse before the counter risks wrapping
    cd.wrap_kt = ns_to_ktime(wrap);

    rd = cd.read_data[0];

    /* Update epoch for new counter and update 'epoch_ns' from old counter */
    new_epoch = read();
    cyc = cd.actual_read_sched_clock();
    ns = rd.epoch_ns + cyc_to_ns((cyc - rd.epoch_cyc) & rd.sched_clock_mask, rd.mult, rd.shift);
    cd.actual_read_sched_clock = read;

    rd.read_sched_clock    = read;
    rd.sched_clock_mask    = new_mask;
    rd.mult            = new_mult;
    rd.shift        = new_shift;
    rd.epoch_cyc        = new_epoch;
    rd.epoch_ns        = ns;

    update_clock_read_data(&rd);                            // install the new clock read data

    if (sched_clock_timer.function != NULL) {
        /* update timeout for clock wrap */
        hrtimer_start(&sched_clock_timer, cd.wrap_kt, HRTIMER_MODE_REL);
    }

    r = rate;
    if (r >= 4000000) {
        r /= 1000000;
        r_unit = 'M';
    } else {
        if (r >= 1000) {
            r /= 1000;
            r_unit = 'k';
        } else {
            r_unit = ' ';
        }
    }

    /* Calculate the ns resolution of this counter */
    res = cyc_to_ns(1ULL, new_mult, new_shift);

    // LOG: 03-09 04:17:39.725  root     0     0 I sched_clock: 56 bits at 19MHz, resolution 52ns, wraps every 4398046511078ns
    pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lluns\n",
        bits, r, r_unit, res, wrap);

    /* Enable IRQ time accounting if we have a fast enough sched_clock() */
    if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
        enable_sched_clock_irqtime();

    pr_debug("Registered %pF as sched_clock source\n", read);
}
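
To make the mult/shift step concrete, here is a minimal userspace sketch of the conversion that `sched_clock_register()` sets up. It is an illustration only: `calc_mult_shift` is a hypothetical name, written from my reading of how `clocks_calc_mult_shift()` searches for the pair, and `cyc_to_ns` here is a plain re-statement of the `(cyc * mult) >> shift` formula rather than the kernel's inline.

```c
#include <stdint.h>

/*
 * Hypothetical userspace sketch: find a (mult, shift) pair so that
 * ns = (cycles * mult) >> shift converts `from` Hz ticks to `to` Hz
 * units without overflowing 64 bits over `maxsec` seconds.
 */
static void calc_mult_shift(uint32_t *mult, uint32_t *shift,
                            uint32_t from, uint32_t to, uint32_t maxsec)
{
    uint64_t tmp;
    uint32_t sft, sftacc = 32;

    /* Each bit needed above 32 for maxsec * from costs one bit of mult. */
    tmp = ((uint64_t)maxsec * from) >> 32;
    while (tmp) {
        tmp >>= 1;
        sftacc--;
    }

    /* Take the largest shift whose rounded mult still fits in sftacc bits. */
    for (sft = 32; sft > 0; sft--) {
        tmp = (uint64_t)to << sft;
        tmp += from / 2;
        tmp /= from;
        if ((tmp >> sftacc) == 0)
            break;
    }
    *mult = (uint32_t)tmp;
    *shift = sft;
}

/* Convert a cycle count to nanoseconds with a precomputed pair. */
static uint64_t cyc_to_ns(uint64_t cyc, uint32_t mult, uint32_t shift)
{
    return (cyc * mult) >> shift;
}
```

Called with `from = 19200000`, `to = 1000000000` and `maxsec = 3600`, the sketch converts a single cycle to 52 ns, which agrees with the "resolution 52ns" value in the sched_clock log line quoted above.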

The following is part of the mem_timer flow; interested readers can trace the code themselves.

CLOCKSOURCE_OF_DECLARE(armv7_arch_timer_mem, "arm,armv7-timer-mem",
               arch_timer_mem_init);

static int __init arch_timer_mem_init(struct device_node *np)
{
    struct device_node *frame, *best_frame = NULL;
    void __iomem *cntctlbase, *base;
    unsigned int irq, ret = -EINVAL;
    u32 cnttidr;

    arch_timers_present |= ARCH_MEM_TIMER;
    cntctlbase = of_iomap(np, 0);
    if (!cntctlbase) {
        pr_err("arch_timer: Can't find CNTCTLBase\n");
        return -ENXIO;
    }

    cnttidr = readl_relaxed_no_log(cntctlbase + CNTTIDR);

    /*
     * Try to find a virtual capable frame. Otherwise fall back to a
     * physical capable frame.
     */
    for_each_available_child_of_node(np, frame) {
        int n;
        u32 cntacr;

        if (of_property_read_u32(frame, "frame-number", &n)) {
            pr_err("arch_timer: Missing frame-number\n");
            of_node_put(frame);
            goto out;
        }

        /* Try enabling everything, and see what sticks */
        cntacr = CNTACR_RFRQ | CNTACR_RWPT | CNTACR_RPCT |
             CNTACR_RWVT | CNTACR_RVOFF | CNTACR_RVCT;
        writel_relaxed(cntacr, cntctlbase + CNTACR(n));
        cntacr = readl_relaxed(cntctlbase + CNTACR(n));

        if ((cnttidr & CNTTIDR_VIRT(n)) &&
            !(~cntacr & (CNTACR_RWVT | CNTACR_RVCT))) {
            of_node_put(best_frame);
            best_frame = frame;
            arch_timer_mem_use_virtual = true;
            break;
        }

        if (~cntacr & (CNTACR_RWPT | CNTACR_RPCT))
            continue;

        of_node_put(best_frame);
        best_frame = of_node_get(frame);
    }

    ret= -ENXIO;
    base = arch_counter_base = of_iomap(best_frame, 0);
    if (!base) {
        pr_err("arch_timer: Can't map frame's registers\n");
        goto out;
    }

    if (arch_timer_mem_use_virtual)
        irq = irq_of_parse_and_map(best_frame, 1);
    else
        irq = irq_of_parse_and_map(best_frame, 0);

    ret = -EINVAL;
    if (!irq) {
        pr_err("arch_timer: Frame missing %s irq",
               arch_timer_mem_use_virtual ? "virt" : "phys");
        goto out;
    }

    arch_timer_detect_rate(base, np);
    ret = arch_timer_mem_register(base, irq);
    if (ret)
        goto out;

    return arch_timer_common_init();
out:
    iounmap(cntctlbase);
    of_node_put(best_frame);
    return ret;
}

 

Debug information via adb

This can be confirmed via adb: arch_sys_timer is the tick_device, and arch_mem_timer is the broadcast device:

cat /sys/devices/system/clocksource/clocksource0/available_clocksource 
arch_sys_counter
cat /sys/devices/system/clockevents/clockevent*/current_device
arch_sys_timer
arch_sys_timer
arch_sys_timer
arch_sys_timer
arch_sys_timer
arch_sys_timer
arch_sys_timer
arch_sys_timer

The tick device list can also be confirmed via adb:

tc_ocla1_sprout:/ # cat /proc/timer_list                                        
Timer List Version: v0.8
HRTIMER_MAX_CLOCK_BASES: 4
now at 102343778233457 nsecs

cpu: 0
 clock 0:
  .base:       0000000000000000
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102343782447797-102343782497797 nsecs [in 4214340 to 4264340 nsecs]
 #1: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102343784131582-102343784181582 nsecs [in 5898125 to 5948125 nsecs]
 #2: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102343746625332-102343786625332 nsecs [in -31608125 to 8391875 nsecs]
 #3: <0000000000000000>, tick_sched_timer, S:03
 # expires at 102343790000000-102343790000000 nsecs [in 11766543 to 11766543 nsecs]
 #4: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102344559131374-102344559181374 nsecs [in 780897917 to 780947917 nsecs]
 #5: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102344624344916-102344625344906 nsecs [in 846111459 to 847111449 nsecs]
 #6: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102344865701377-102344865751377 nsecs [in 1087467920 to 1087517920 nsecs]
 #7: <0000000000000000>, timerfd_tmrproc, S:01
 # expires at 102346297934997-102346297934997 nsecs [in 2519701540 to 2519701540 nsecs]
 #8: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102350106997217-102350146997217 nsecs [in 6328763760 to 6368763760 nsecs]
 #9: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102362935108406-102362975108406 nsecs [in 19156874949 to 19196874949 nsecs]
 #10: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102399338762364-102399394359362 nsecs [in 55560528907 to 55616125905 nsecs]
 #11: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102585376718491-102585416718491 nsecs [in 241598485034 to 241638485034 nsecs]
 #12: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102608578134731-102608578184731 nsecs [in 264799901274 to 264799951274 nsecs]
 #13: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102627061046224-102627161046224 nsecs [in 283282812767 to 283382812767 nsecs]
 #14: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102634070817794-102634170817794 nsecs [in 290292584337 to 290392584337 nsecs]
 #15: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102705307211360-102705407211360 nsecs [in 361528977903 to 361628977903 nsecs]
 #16: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102733142273329-102733242273329 nsecs [in 389364039872 to 389464039872 nsecs]
 #17: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102917415334665-102917515334665 nsecs [in 573637101208 to 573737101208 nsecs]
 #18: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 105191125652957-105191225652957 nsecs [in 2847347419500 to 2847447419500 nsecs]
 #19: <0000000000000000>, sched_clock_poll, S:01
 # expires at 105359635153912-105359635153912 nsecs [in 3015856920455 to 3015856920455 nsecs]
 #20: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 150008206178411-150008206228411 nsecs [in 47664427944954 to 47664427994954 nsecs]
 #21: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 172812354468858-172812354518858 nsecs [in 70468576235401 to 70468576285401 nsecs]
 #22: <0000000000000000>, it_real_fn, S:01
 # expires at 172812376385680-172812376385680 nsecs [in 70468598152223 to 70468598152223 nsecs]
 #23: <0000000000000000>, timerfd_tmrproc, S:01
 # expires at 173057145132000-173057145132000 nsecs [in 70713366898543 to 70713366898543 nsecs]
 #24: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 500008591686350-500008591736350 nsecs [in 397664813452893 to 397664813502893 nsecs]
 clock 1:
  .base:       0000000000000000
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1577592197245305630 nsecs
active timers:
 clock 2:
  .base:       0000000000000000
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     5348254632 nsecs
active timers:
 #0: <0000000000000000>, alarmtimer_fired, S:01
 # expires at 102670651000000-102670651000000 nsecs [in 321524511911 to 321524511911 nsecs]
 #1: <0000000000000000>, timerfd_tmrproc, S:01
 # expires at 107028105000000-107028105000000 nsecs [in 4678978511911 to 4678978511911 nsecs]
 clock 3:
  .base:       0000000000000000
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     1577592197245305630 nsecs
active timers:
  .expires_next   : 102343782497797 nsecs
  .hres_active    : 1
  .nr_events      : 9716538
  .nr_retries     : 9612
  .nr_hangs       : 0
  .max_hang_time  : 0
  .nohz_mode      : 2
  .last_tick      : 102343760000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4305171671
  .idle_calls     : 15389066
  .idle_sleeps    : 11043177
  .idle_entrytime : 102343772001947 nsecs
  .idle_waketime  : 102343743066739 nsecs
  .idle_exittime  : 102343760567884 nsecs
  .idle_sleeptime : 91073881832478 nsecs
  .iowait_sleeptime: 126261797989 nsecs
  .last_jiffies   : 4305171673
  .next_timer     : 102343780000000
  .idle_expires   : 102344740000000 nsecs
jiffies: 4305171674

cpu: 1
 clock 0:
  .base:       0000000000000000
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <0000000000000000>, tick_sched_timer, S:03
 # expires at 102343790000000-102343790000000 nsecs [in 11766543 to 11766543 nsecs]
 #1: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102345616605957-102345616655957 nsecs [in 1838372500 to 1838422500 nsecs]
 #2: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102348461282998-102348491282997 nsecs [in 4683049541 to 4713049540 nsecs]
 #3: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102352914732997-102352914782997 nsecs [in 9136499540 to 9136549540 nsecs]
 #4: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102368297707318-102368397707318 nsecs [in 24519473861 to 24619473861 nsecs]
 #5: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102369592027095-102369632027095 nsecs [in 25813793638 to 25853793638 nsecs]
 #6: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102602692588558-102602692638558 nsecs [in 258914355101 to 258914405101 nsecs]
 #7: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 120558704386589-120558804386589 nsecs [in 18214926153132 to 18215026153132 nsecs]
 #8: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 172847398703681-172847498703681 nsecs [in 70503620470224 to 70503720470224 nsecs]
 clock 1:
  .base:       0000000000000000
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1577592197245305630 nsecs
active timers:
 clock 2:
  .base:       0000000000000000
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     5348254632 nsecs
active timers:
 clock 3:
  .base:       0000000000000000
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     1577592197245305630 nsecs
active timers:
  .expires_next   : 102343790000000 nsecs
  .hres_active    : 1
  .nr_events      : 8525252
  .nr_retries     : 6736
  .nr_hangs       : 0
  .max_hang_time  : 0
  .nohz_mode      : 2
  .last_tick      : 102343750000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4305171670
  .idle_calls     : 714821
  .idle_sleeps    : 515264
  .idle_entrytime : 102343745435957 nsecs
  .idle_waketime  : 102343745435957 nsecs
  .idle_exittime  : 102343745472051 nsecs
  .idle_sleeptime : 5177799957603 nsecs
  .iowait_sleeptime: 12700959475 nsecs
  .last_jiffies   : 4305171670
  .next_timer     : 102343830000000
  .idle_expires   : 102343830000000 nsecs
jiffies: 4305171674

cpu: 2
 clock 0:
  .base:       0000000000000000
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <0000000000000000>, tick_sched_timer, S:03
 # expires at 102343790000000-102343790000000 nsecs [in 11766543 to 11766543 nsecs]
 #1: <0000000000000000>, sched_rt_period_timer, S:03
 # expires at 102344000000000-102344000000000 nsecs [in 221766543 to 221766543 nsecs]
 #2: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102349155736740-102349165736738 nsecs [in 5377503283 to 5387503281 nsecs]
 #3: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102353327602474-102353427602474 nsecs [in 9549369017 to 9649369017 nsecs]
 #4: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102354056718424-102354056768424 nsecs [in 10278484967 to 10278534967 nsecs]
 #5: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102356502485854-102356517402853 nsecs [in 12724252397 to 12739169396 nsecs]
 #6: <0000000000000000>, posix_timer_fn, S:01
 # expires at 102380890693049-102380890693049 nsecs [in 37112459592 to 37112459592 nsecs]
 #7: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102396512824866-102396512874866 nsecs [in 52734591409 to 52734641409 nsecs]
 #8: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 104019335142591-104019335192591 nsecs [in 1675556909134 to 1675556959134 nsecs]
 clock 1:
  .base:       0000000000000000
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1577592197245305630 nsecs
active timers:
 clock 2:
  .base:       0000000000000000
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     5348254632 nsecs
active timers:
 #0: <0000000000000000>, alarmtimer_fired, S:01
 # expires at 102392489700397-102392489700397 nsecs [in 43363212308 to 43363212308 nsecs]
 #1: <0000000000000000>, alarmtimer_fired, S:01
 # expires at 102624969000000-102624969000000 nsecs [in 275842511911 to 275842511911 nsecs]
 clock 3:
  .base:       0000000000000000
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     1577592197245305630 nsecs
active timers:
  .expires_next   : 102343790000000 nsecs
  .hres_active    : 1
  .nr_events      : 8791822
  .nr_retries     : 27696
  .nr_hangs       : 0
  .max_hang_time  : 0
  .nohz_mode      : 2
  .last_tick      : 102343680000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4305171663
  .idle_calls     : 806233
  .idle_sleeps    : 590526
  .idle_entrytime : 102343780041634 nsecs
  .idle_waketime  : 102343653006739 nsecs
  .idle_exittime  : 102343673157676 nsecs
  .idle_sleeptime : 5080392137502 nsecs
  .iowait_sleeptime: 5705048172 nsecs
  .last_jiffies   : 4305171673
  .next_timer     : 102343780000000
  .idle_expires   : 102344990000000 nsecs
jiffies: 4305171674

cpu: 3
 clock 0:
  .base:       0000000000000000
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <0000000000000000>, tick_sched_timer, S:03
 # expires at 102343790000000-102343790000000 nsecs [in 11766543 to 11766543 nsecs]
 #1: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102344414214447-102344415214442 nsecs [in 635980990 to 636980985 nsecs]
 #2: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102397177072627-102397277072627 nsecs [in 53398839170 to 53498839170 nsecs]
 #3: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 102632303944714-102632403944714 nsecs [in 288525711257 to 288625711257 nsecs]
 #4: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 104414961740811-104415061740811 nsecs [in 2071183507354 to 2071283507354 nsecs]
 #5: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 108022052013874-108022152013874 nsecs [in 5678273780417 to 5678373780417 nsecs]
 #6: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 110014003359763-110014103359763 nsecs [in 7670225126306 to 7670325126306 nsecs]
 #7: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 9223372036854775807-9223372036854775807 nsecs [in 9223269693076542350 to 9223269693076542350 nsecs]
 clock 1:
  .base:       0000000000000000
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1577592197245305630 nsecs
active timers:
 clock 2:
  .base:       0000000000000000
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     5348254632 nsecs
active timers:
 clock 3:
  .base:       0000000000000000
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     1577592197245305630 nsecs
active timers:
  .expires_next   : 102343790000000 nsecs
  .hres_active    : 1
  .nr_events      : 8028797
  .nr_retries     : 16288
  .nr_hangs       : 0
  .max_hang_time  : 0
  .nohz_mode      : 2
  .last_tick      : 102343780000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4305171673
  .idle_calls     : 776457
  .idle_sleeps    : 558645
  .idle_entrytime : 102343780298822 nsecs
  .idle_waketime  : 102343709265801 nsecs
  .idle_exittime  : 102343780298822 nsecs
  .idle_sleeptime : 5070930572889 nsecs
  .iowait_sleeptime: 8522453661 nsecs
  .last_jiffies   : 4305171673
  .next_timer     : 102344650000000
  .idle_expires   : 102344650000000 nsecs
jiffies: 4305171674

cpu: 4
 clock 0:
  .base:       0000000000000000
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <0000000000000000>, hrtimer_wakeup, S:01
 # expires at 111408145011166-111408245011166 nsecs [in 9064366777709 to 9064466777709 nsecs]
 clock 1:
  .base:       0000000000000000
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1577592197245305630 nsecs
active timers:
 clock 2:
  .base:       0000000000000000
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     5348254632 nsecs
active timers:
 clock 3:
  .base:       0000000000000000
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     1577592197245305630 nsecs
active timers:
  .expires_next   : 111408245011166 nsecs
  .hres_active    : 1
  .nr_events      : 51287
  .nr_retries     : 39
  .nr_hangs       : 0
  .max_hang_time  : 0
  .nohz_mode      : 2
  .last_tick      : 102343760000000 nsecs
  .tick_stopped   : 1
  .idle_jiffies   : 4305171671
  .idle_calls     : 16077
  .idle_sleeps    : 15389
  .idle_entrytime : 102343780268353 nsecs
  .idle_waketime  : 102343751119343 nsecs
  .idle_exittime  : 102343749631009 nsecs
  .idle_sleeptime : 5771415901159 nsecs
  .iowait_sleeptime: 91151360 nsecs
  .last_jiffies   : 4305171671
  .next_timer     : 9223372036854775807
  .idle_expires   : 9223372036854775807 nsecs
jiffies: 4305171674

cpu: 5
 clock 0:
  .base:       0000000000000000
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 clock 1:
  .base:       0000000000000000
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1577592197245305630 nsecs
active timers:
 clock 2:
  .base:       0000000000000000
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     5348254632 nsecs
active timers:
 clock 3:
  .base:       0000000000000000
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     1577592197245305630 nsecs
active timers:
  .expires_next   : 9223372036854775807 nsecs
  .hres_active    : 1
  .nr_events      : 32597
  .nr_retries     : 26
  .nr_hangs       : 0
  .max_hang_time  : 0
  .nohz_mode      : 2
  .last_tick      : 102343760000000 nsecs
  .tick_stopped   : 1
  .idle_jiffies   : 4305171671
  .idle_calls     : 11180
  .idle_sleeps    : 10973
  .idle_entrytime : 102343780269447 nsecs
  .idle_waketime  : 102343751066895 nsecs
  .idle_exittime  : 102343749803353 nsecs
  .idle_sleeptime : 5773496930467 nsecs
  .iowait_sleeptime: 25364688 nsecs
  .last_jiffies   : 4305171671
  .next_timer     : 9223372036854775807
  .idle_expires   : 9223372036854775807 nsecs
jiffies: 4305171674

cpu: 6
 clock 0:
  .base:       0000000000000000
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 clock 1:
  .base:       0000000000000000
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1577592197245305630 nsecs
active timers:
 clock 2:
  .base:       0000000000000000
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     5348254632 nsecs
active timers:
 clock 3:
  .base:       0000000000000000
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     1577592197245305630 nsecs
active timers:
  .expires_next   : 9223372036854775807 nsecs
  .hres_active    : 1
  .nr_events      : 77050
  .nr_retries     : 18
  .nr_hangs       : 0
  .max_hang_time  : 0
  .nohz_mode      : 2
  .last_tick      : 102343760000000 nsecs
  .tick_stopped   : 1
  .idle_jiffies   : 4305171671
  .idle_calls     : 9996
  .idle_sleeps    : 9851
  .idle_entrytime : 102343780270280 nsecs
  .idle_waketime  : 102343761084603 nsecs
  .idle_exittime  : 102343750332936 nsecs
  .idle_sleeptime : 5772812686518 nsecs
  .iowait_sleeptime: 10439268 nsecs
  .last_jiffies   : 4305171672
  .next_timer     : 9223372036854775807
  .idle_expires   : 9223372036854775807 nsecs
jiffies: 4305171674

cpu: 7
 clock 0:
  .base:       0000000000000000
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 clock 1:
  .base:       0000000000000000
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1577592197245305630 nsecs
active timers:
 clock 2:
  .base:       0000000000000000
  .index:      2
  .resolution: 1 nsecs
  .get_time:   ktime_get_boottime
  .offset:     5348254632 nsecs
active timers:
 clock 3:
  .base:       0000000000000000
  .index:      3
  .resolution: 1 nsecs
  .get_time:   ktime_get_clocktai
  .offset:     1577592197245305630 nsecs
active timers:
  .expires_next   : 9223372036854775807 nsecs
  .hres_active    : 1
  .nr_events      : 44091
  .nr_retries     : 13
  .nr_hangs       : 0
  .max_hang_time  : 0
  .nohz_mode      : 2
  .last_tick      : 102343680000000 nsecs
  .tick_stopped   : 1
  .idle_jiffies   : 4305171663
  .idle_calls     : 7542
  .idle_sleeps    : 7490
  .idle_entrytime : 102343780271114 nsecs
  .idle_waketime  : 102343680010332 nsecs
  .idle_exittime  : 102343679531114 nsecs
  .idle_sleeptime : 5773215745442 nsecs
  .iowait_sleeptime: 18214117 nsecs
  .last_jiffies   : 4305171664
  .next_timer     : 9223372036854775807
  .idle_expires   : 9223372036854775807 nsecs
jiffies: 4305171674

Tick Device: mode:     1
Broadcast device
Clock Event Device: arch_mem_timer
 max_delta_ns:   111848106728
 min_delta_ns:   1000
 mult:           82463372
 shift:          32
 mode:           3
 next_event:     102343880000000 nsecs
 set_next_event: arch_timer_set_next_event_virt_mem
 shutdown: arch_timer_shutdown_virt_mem
 event_handler:  tick_handle_oneshot_broadcast
 retries:        52148

tick_broadcast_mask: 00
tick_broadcast_oneshot_mask: fc

Tick Device: mode:     1
Per CPU device: 0
Clock Event Device: arch_sys_timer
 max_delta_ns:   111848106728
 min_delta_ns:   1000
 mult:           82463372
 shift:          32
 mode:           3
 next_event:     102343782497797 nsecs
 set_next_event: arch_timer_set_next_event_virt
 shutdown: arch_timer_shutdown_virt
 event_handler:  hrtimer_interrupt
 retries:        4949

Tick Device: mode:     1
Per CPU device: 1
Clock Event Device: arch_sys_timer
 max_delta_ns:   111848106728
 min_delta_ns:   1000
 mult:           82463372
 shift:          32
 mode:           3
 next_event:     102343790000000 nsecs
 set_next_event: arch_timer_set_next_event_virt
 shutdown: arch_timer_shutdown_virt
 event_handler:  hrtimer_interrupt
 retries:        4068

Tick Device: mode:     1
Per CPU device: 2
Clock Event Device: arch_sys_timer
 max_delta_ns:   111848106728
 min_delta_ns:   1000
 mult:           82463372
 shift:          32
 mode:           1
 next_event:     102343880000000 nsecs
 set_next_event: arch_timer_set_next_event_virt
 shutdown: arch_timer_shutdown_virt
 event_handler:  hrtimer_interrupt
 retries:        3241

Tick Device: mode:     1
Per CPU device: 3
Clock Event Device: arch_sys_timer
 max_delta_ns:   111848106728
 min_delta_ns:   1000
 mult:           82463372
 shift:          32
 mode:           1
 next_event:     102344415214442 nsecs
 set_next_event: arch_timer_set_next_event_virt
 shutdown: arch_timer_shutdown_virt
 event_handler:  hrtimer_interrupt
 retries:        2861

Tick Device: mode:     1
Per CPU device: 4
Clock Event Device: arch_sys_timer
 max_delta_ns:   111848106728
 min_delta_ns:   1000
 mult:           82463372
 shift:          32
 mode:           1
 next_event:     111408245011166 nsecs
 set_next_event: arch_timer_set_next_event_virt
 shutdown: arch_timer_shutdown_virt
 event_handler:  hrtimer_interrupt
 retries:        67

Tick Device: mode:     1
Per CPU device: 5
Clock Event Device: arch_sys_timer
 max_delta_ns:   111848106728
 min_delta_ns:   1000
 mult:           82463372
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: arch_timer_set_next_event_virt
 shutdown: arch_timer_shutdown_virt
 event_handler:  hrtimer_interrupt
 retries:        54

Tick Device: mode:     1
Per CPU device: 6
Clock Event Device: arch_sys_timer
 max_delta_ns:   111848106728
 min_delta_ns:   1000
 mult:           82463372
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: arch_timer_set_next_event_virt
 shutdown: arch_timer_shutdown_virt
 event_handler:  hrtimer_interrupt
 retries:        54

Tick Device: mode:     1
Per CPU device: 7
Clock Event Device: arch_sys_timer
 max_delta_ns:   111848106728
 min_delta_ns:   1000
 mult:           82463372
 shift:          32
 mode:           1
 next_event:     9223372036854775807 nsecs
 set_next_event: arch_timer_set_next_event_virt
 shutdown: arch_timer_shutdown_virt
 event_handler:  hrtimer_interrupt
 retries:        60
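
The numeric fields in the dump above are easier to read once decoded. The sketch below is a set of hypothetical helpers, not kernel code. The `nohz_mode` values follow the `enum tick_nohz_mode` quoted earlier in this article; the tick device mode and clock event state values are assumptions based on my reading of the 4.9-era headers (`TICKDEV_MODE_*` and `CLOCK_EVT_STATE_*`), so check them against your own tree.

```c
#include <stdint.h>

/* "Tick Device: mode" (assumed: 0 = periodic, 1 = oneshot). */
static const char *tick_device_mode_name(int mode)
{
    switch (mode) {
    case 0: return "periodic";
    case 1: return "oneshot";
    default: return "unknown";
    }
}

/* Clock event device " mode:" field (assumed to print the device state). */
static const char *clock_event_state_name(int state)
{
    switch (state) {
    case 0: return "detached";
    case 1: return "shutdown";
    case 2: return "periodic";
    case 3: return "oneshot";
    case 4: return "oneshot-stopped";
    default: return "unknown";
    }
}

/* ".nohz_mode" field, per enum tick_nohz_mode. */
static const char *nohz_mode_name(int mode)
{
    switch (mode) {
    case 0: return "inactive";
    case 1: return "lowres";
    case 2: return "highres";
    default: return "unknown";
    }
}

/* Clock event devices use mult/shift in the ns -> cycles direction. */
static uint64_t clockevent_ns_to_cycles(uint64_t ns, uint32_t mult, uint32_t shift)
{
    return (ns * mult) >> shift;
}
```

Read this way, every tick device in the dump is in oneshot mode and `.nohz_mode : 2` is the high-resolution mode. The `tick_sched_timer` expiries are multiples of 10,000,000 ns, which suggests a 10 ms tick; with `mult = 82463372` and `shift = 32` that interval converts to roughly 192000 cycles, consistent with the 19.2 MHz timer rate.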

Call stacks captured with ftrace:

          <idle>-0     [003] d.h2 98348.520507: scheduler_tick <-update_process_times
          <idle>-0     [003] d.h2 98348.520534: <stack trace>
 => tick_sched_timer
 => __hrtimer_run_queues
 => hrtimer_interrupt
 => tick_handle_oneshot_broadcast
 => arch_timer_handler_virt_mem
 => handle_irq_event_percpu
 => handle_irq_event
 => handle_fasteoi_irq
 => generic_handle_irq
 => __handle_domain_irq
 => gic_handle_irq
 => el1_irq
 => lpm_cpuidle_enter
 => cpuidle_enter_state
 => cpuidle_enter
 => cpu_startup_entry
 => secondary_start_kernel
 => 
          <idle>-0     [003] d.h2 98348.540237: scheduler_tick <-update_process_times
          <idle>-0     [003] d.h2 98348.540267: <stack trace>
 => tick_sched_timer
 => __hrtimer_run_queues
 => hrtimer_interrupt
 => arch_timer_handler_virt
 => handle_percpu_devid_irq
 => generic_handle_irq
 => __handle_domain_irq
 => gic_handle_irq
 => el1_irq
 => lpm_cpuidle_enter
 => cpuidle_enter_state
 => cpuidle_enter
 => cpu_startup_entry
 => secondary_start_kernel
 => 
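
The two stack traces above differ only in how the interrupt enters: either through the per-CPU timer, or through the broadcast device, which then invokes the per-CPU device's handler. The toy model below sketches that dispatch in userspace. It reuses the kernel function names purely as labels for stub functions; the bodies are assumptions made for illustration and contain none of the real logic.

```c
#include <stddef.h>

struct clock_event_device;
typedef void (*event_handler_t)(struct clock_event_device *dev);

/* Toy stand-in for the kernel structure: a name and a handler pointer. */
struct clock_event_device {
    const char *name;
    event_handler_t event_handler;
};

static int scheduler_tick_calls;    /* counts how often the tick fired */

static void scheduler_tick(void)       { scheduler_tick_calls++; }
static void update_process_times(void) { scheduler_tick(); }
static void tick_sched_timer(void)     { update_process_times(); }

/* Toy assumption: the only expired hrtimer is the tick timer. */
static void hrtimer_run_queues(void)   { tick_sched_timer(); }

/* High-resolution mode: the per-CPU device's handler runs the hrtimers. */
static void hrtimer_interrupt(struct clock_event_device *dev)
{
    (void)dev;
    hrtimer_run_queues();
}

static struct clock_event_device percpu_dev = {
    "arch_sys_timer", hrtimer_interrupt
};

/* Broadcast path: forward the event to the per-CPU device's handler. */
static void tick_handle_oneshot_broadcast(struct clock_event_device *bc)
{
    (void)bc;
    percpu_dev.event_handler(&percpu_dev);
}

static struct clock_event_device broadcast_dev = {
    "arch_mem_timer", tick_handle_oneshot_broadcast
};

/* Interrupt entry points corresponding to the two traces. */
static void arch_timer_handler_virt(void)
{
    percpu_dev.event_handler(&percpu_dev);
}

static void arch_timer_handler_virt_mem(void)
{
    broadcast_dev.event_handler(&broadcast_dev);
}
```

Calling either entry point ends in `scheduler_tick()`, which is the point of the traces: the tick is delivered by `tick_sched_timer` running as an hrtimer, regardless of which hardware timer raised the interrupt.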

 
