Netty HashedWheelTimer Source Code Analysis

Posted 2021-04-30 code4m

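Netty provides a simple timing-wheel timer based on the paper "Hashed and Hierarchical Timing Wheels: data structures to efficiently implement a timer facility". It can be used for timeout detection. This article analyzes the implementation from the source code.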

Source Code Analysis

Netty Version 4.0.24

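How It Works

A HashedWheelTimer consists of ticksPerWheel buckets arranged in a ring. Each time a user submits a timeout, the timer computes which bucket it belongs to and places it there. Internally, a tick counter advances once every tickDuration. On each tick, only the timeouts in a single bucket are checked for expiry; when the next tick arrives, the check moves on to the next bucket.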
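To make the ring concrete, here is a minimal sketch of the idea in plain Java. It is not Netty's code: ToyWheel, Entry, schedule, and advance are hypothetical names, time is counted in logical ticks rather than real time, and the wheel size is assumed to be a power of two.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy timing wheel driven by logical ticks (no threads, no real clock). */
final class ToyWheel {
    /** One scheduled entry: a name plus the number of full wheel turns still to wait. */
    static final class Entry {
        final String name;
        long remainingRounds;

        Entry(String name, long remainingRounds) {
            this.name = name;
            this.remainingRounds = remainingRounds;
        }
    }

    private final List<List<Entry>> buckets = new ArrayList<>();
    private final int mask;
    private long tick; // number of ticks processed so far

    /** ticksPerWheel is assumed to be a power of two so that (tick & mask) works as a modulo. */
    ToyWheel(int ticksPerWheel) {
        for (int i = 0; i < ticksPerWheel; i++) {
            buckets.add(new ArrayList<>());
        }
        mask = ticksPerWheel - 1;
    }

    /** Schedules an entry to fire delayTicks ticks from now. */
    void schedule(String name, long delayTicks) {
        long target = tick + delayTicks;
        long rounds = delayTicks / buckets.size();
        buckets.get((int) (target & mask)).add(new Entry(name, rounds));
    }

    /** Advances one tick: inspects exactly one bucket and returns the names that fired. */
    List<String> advance() {
        List<String> fired = new ArrayList<>();
        Iterator<Entry> it = buckets.get((int) (tick & mask)).iterator();
        while (it.hasNext()) {
            Entry e = it.next();
            if (e.remainingRounds <= 0) {
                fired.add(e.name);
                it.remove();
            } else {
                e.remainingRounds--; // not due yet: wait one more full turn
            }
        }
        tick++;
        return fired;
    }
}
```

With an 8-bucket wheel, an entry scheduled 3 ticks ahead and one scheduled 11 ticks ahead both land in bucket 3; the first fires on the first visit to that bucket and the second fires one full turn later.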

Initialization

    /**
     * tickDuration determines the timer's precision: the smaller the value, the higher the precision (default: 100 ms).
     * ticksPerWheel determines the number of buckets: the smaller the value, the more timeouts may share a bucket (default: 512).
     */
    public HashedWheelTimer(
            ThreadFactory threadFactory,
            long tickDuration, TimeUnit unit, int ticksPerWheel) {

        if (threadFactory == null) {
            throw new NullPointerException("threadFactory");
        }
        if (unit == null) {
            throw new NullPointerException("unit");
        }
        if (tickDuration <= 0) {
            throw new IllegalArgumentException("tickDuration must be greater than 0: " + tickDuration);
        }
        if (ticksPerWheel <= 0) {
            throw new IllegalArgumentException("ticksPerWheel must be greater than 0: " + ticksPerWheel);
        }

        // Normalize ticksPerWheel to power of two and initialize the wheel.
        // The number of buckets is fixed, so they are all created up front.
        wheel = createWheel(ticksPerWheel);
        mask = wheel.length - 1;

        // Convert tickDuration to nanos.
        this.tickDuration = unit.toNanos(tickDuration);

        // Prevent overflow.
        if (this.tickDuration >= Long.MAX_VALUE / wheel.length) {
            throw new IllegalArgumentException(String.format(
                    "tickDuration: %d (expected: 0 < tickDuration in nanos < %d",
                    tickDuration, Long.MAX_VALUE / wheel.length));
        }
        // The internal timeout-detection thread; there is only one.
        workerThread = threadFactory.newThread(worker);

        leak = leakDetector.open(this);
    }
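The constructor comment says ticksPerWheel is normalized to a power of two, but createWheel itself is not listed in this article. The following is only a sketch of how such rounding could be done and of why mask = wheel.length - 1 then works as a cheap modulo. WheelSizing, roundUpToPowerOfTwo, and bucketIndex are hypothetical names, not Netty methods.

```java
final class WheelSizing {
    /** Rounds n up to the nearest power of two (n is assumed to be in 1..2^30). */
    static int roundUpToPowerOfTwo(int n) {
        int size = 1;
        while (size < n) {
            size <<= 1;
        }
        return size;
    }

    /**
     * Maps a tick counter to a bucket index. When wheelLength is a power of two,
     * (tick & (wheelLength - 1)) gives the same result as (tick % wheelLength)
     * for non-negative ticks, without a division.
     */
    static int bucketIndex(long tick, int wheelLength) {
        return (int) (tick & (wheelLength - 1));
    }
}
```

For example, a requested size of 500 would become 512 under this scheme, and tick 515 on a 512-bucket wheel maps to bucket 3.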

Creating a Timeout Task

When a timeout task is created, it is not added to its bucket immediately. It is first placed in the timeouts queue and is moved to its bucket when the next tick arrives.

    public Timeout newTimeout(TimerTask task, long delay, TimeUnit unit) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        if (unit == null) {
            throw new NullPointerException("unit");
        }
        // start() blocks on a countdown latch until the timer's start time has been initialized.
        start();

        // Add the timeout to the timeout queue which will be processed on the next tick.
        // During processing all the queued HashedWheelTimeouts will be added to the correct HashedWheelBucket.
        long deadline = System.nanoTime() + unit.toNanos(delay) - startTime;
        HashedWheelTimeout timeout = new HashedWheelTimeout(this, task, deadline);
        timeouts.add(timeout);
        return timeout;
    }

    public void start() {
        switch (WORKER_STATE_UPDATER.get(this)) {
            case WORKER_STATE_INIT:
                if (WORKER_STATE_UPDATER.compareAndSet(this, WORKER_STATE_INIT, WORKER_STATE_STARTED)) {
                    workerThread.start();
                }
                break;
            case WORKER_STATE_STARTED:
                break;
            case WORKER_STATE_SHUTDOWN:
                throw new IllegalStateException("cannot be started once stopped");
            default:
                throw new Error("Invalid WorkerState");
        }

        // Wait until the startTime is initialized by the worker.
        while (startTime == 0) {
            try {
                startTimeInitialized.await();
            } catch (InterruptedException ignore) {
                // Ignore - it will be ready very soon.
            }
        }
    }
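The hand-off described at the start of this section (callers enqueue, the worker picks the entries up on its next tick) can be sketched without Netty. This is an illustration only: PendingQueue, submit, and drain are hypothetical names, and ConcurrentLinkedQueue is an assumed stand-in for whatever queue type the timer actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Many threads submit; a single worker thread drains a bounded batch per tick. */
final class PendingQueue<T> {
    private final Queue<T> pending = new ConcurrentLinkedQueue<>();

    /** Called from any thread; never touches the wheel itself. */
    void submit(T item) {
        pending.add(item);
    }

    /** Called only by the worker thread; takes at most maxPerTick items in FIFO order. */
    List<T> drain(int maxPerTick) {
        List<T> batch = new ArrayList<>();
        for (int i = 0; i < maxPerTick; i++) {
            T item = pending.poll();
            if (item == null) {
                break; // queue is empty
            }
            batch.add(item);
        }
        return batch;
    }
}
```

Because only the worker ever moves items from the queue into buckets, the buckets themselves need no synchronization; the cost is that a new timeout becomes visible to the wheel up to one tick late.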

Worker

Worker is an inner class that implements Runnable. It is the task the timeout-detection thread actually executes, and its main entry point is the run method.

    public void run() {
        // Initialize the startTime.
        startTime = System.nanoTime();
        if (startTime == 0) {
            // We use 0 as an indicator for the uninitialized value here, so make sure it's not 0 when initialized.
            startTime = 1;
        }

        // Notify the other threads waiting for the initialization at start().
        // The countdown latch signals that startTime is set; until then, every thread creating a timeout is blocked.
        startTimeInitialized.countDown();

        do {
            // Wait for the next tick using Thread.sleep.
            final long deadline = waitForNextTick();
            if (deadline > 0) {
                int idx = (int) (tick & mask);
                // First process the cancelled timeouts.
                processCancelledTasks();
                HashedWheelBucket bucket = wheel[idx];
                // Move newly created timeouts into their buckets.
                transferTimeoutsToBuckets();
                // Then expire the timeouts in the current bucket.
                bucket.expireTimeouts(deadline);
                // Advance the tick counter by one.
                tick++;
            }
        } while (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_STARTED);

        // Fill the unprocessedTimeouts so we can return them from stop() method.
        for (HashedWheelBucket bucket: wheel) {
            bucket.clearTimeouts(unprocessedTimeouts);
        }
        for (;;) {
            HashedWheelTimeout timeout = timeouts.poll();
            if (timeout == null) {
                break;
            }
            if (!timeout.isCancelled()) {
                unprocessedTimeouts.add(timeout);
            }
        }
        processCancelledTasks();
    }

    private void processCancelledTasks() {
        for (;;) {
            // Drain the cancellation queue and run each task (which removes the timeout from its bucket).
            Runnable task = cancelledTimeouts.poll();
            if (task == null) {
                // all processed
                break;
            }
            try {
                task.run();
            } catch (Throwable t) {
                if (logger.isWarnEnabled()) {
                    logger.warn("An exception was thrown while process a cancellation task", t);
                }
            }
        }
    }
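run() relies on waitForNextTick(), which this article does not list; the comment only says that it sleeps until the next tick. The following is a rough sketch of the arithmetic such a method might use, assuming that tick number (tick + 1) is due tickDuration * (tick + 1) nanoseconds after startTime. TickWait and sleepMillis are hypothetical names.

```java
final class TickWait {
    /**
     * Returns how many whole milliseconds to sleep before the next tick is due,
     * or 0 if it is already due. elapsedNanos is the time since the timer started.
     */
    static long sleepMillis(long tickDurationNanos, long tick, long elapsedNanos) {
        long deadline = tickDurationNanos * (tick + 1);
        long remaining = deadline - elapsedNanos;
        if (remaining <= 0) {
            return 0; // the tick is due (or overdue): do not sleep
        }
        // Round up to the next millisecond so the thread does not wake just before the deadline.
        return (remaining + 999_999) / 1_000_000;
    }
}
```

With a 100 ms tick, a worker that is 30 ms into tick 0 would sleep 70 ms, and one that is 99.5 ms in would still sleep 1 ms rather than 0.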

Newly created timeouts waiting in the queue are periodically transferred to their buckets.

    private void transferTimeoutsToBuckets() {
        // transfer only max. 100000 timeouts per tick to prevent a thread to stale the workerThread when it just
        // adds new timeouts in a loop.
        // At most 100,000 timeouts are transferred per call; the rest wait for the next call (normally the next tick).
        for (int i = 0; i < 100000; i++) {
            HashedWheelTimeout timeout = timeouts.poll();
            if (timeout == null) {
                // all processed
                break;
            }
            // Skip timeouts that were cancelled before reaching a bucket.
            if (timeout.state() == HashedWheelTimeout.ST_CANCELLED) {
                // Was cancelled in the meantime.
                continue;
            }

            long calculated = timeout.deadline / tickDuration;
            timeout.remainingRounds = (calculated - tick) / wheel.length;

            final long ticks = Math.max(calculated, tick); // Ensure we don't schedule for past.
            // Compute the owning bucket.
            int stopIndex = (int) (ticks & mask);

            HashedWheelBucket bucket = wheel[stopIndex];
            // Append to the bucket's internal linked list.
            bucket.addTimeout(timeout);
        }
    }
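The slot arithmetic in transferTimeoutsToBuckets is easier to follow with numbers. The helper below simply repeats those three lines as a pure function so they can be tried in isolation; SlotMath and place are hypothetical names, and the sample values are chosen for illustration.

```java
final class SlotMath {
    /**
     * Returns {remainingRounds, bucketIndex} for a timeout.
     * deadlineNanos is relative to the timer's start time, and wheelLength must be a power of two.
     */
    static long[] place(long deadlineNanos, long tickDurationNanos, long currentTick, int wheelLength) {
        long calculated = deadlineNanos / tickDurationNanos;            // tick at which the timeout is due
        long remainingRounds = (calculated - currentTick) / wheelLength; // full wheel turns still to wait
        long ticks = Math.max(calculated, currentTick);                  // never schedule into the past
        long index = ticks & (wheelLength - 1);
        return new long[] {remainingRounds, index};
    }
}
```

With a 100 ms tick, 512 buckets, and the worker at tick 10, a deadline of 60 s maps to tick 600, giving remainingRounds = 1 and bucket 88. A deadline of 0.5 s maps to tick 5, which is already in the past, so remainingRounds is 0 and the timeout goes into the current bucket, 10.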

HashedWheelTimeout

HashedWheelTimeout is an inner class that wraps the user-submitted TimerTask. Its main operations are expiration and cancellation.

Expiration

    public void expire() {
        if (!compareAndSetState(ST_INIT, ST_EXPIRED)) {
            return;
        }

        try {
            // Run the user's timeout callback, catching every exception so that a user error cannot kill the worker thread.
            task.run(this);
        } catch (Throwable t) {
            if (logger.isWarnEnabled()) {
                logger.warn("An exception was thrown by " + TimerTask.class.getSimpleName() + '.', t);
            }
        }
    }

Cancellation

Cancellation is usually initiated by the user, and the cancelling thread is different from the timer's internal timeout-detection thread. If the user thread removed the HashedWheelTimeout from its bucket directly, a lock would be needed to keep the bucket thread-safe. Netty avoids this: it changes the HashedWheelTimeout state with a CAS operation and adds the cancelled task to a queue, and the worker removes it from the bucket when the next tick arrives, so no lock is required.

    public boolean cancel() {
        // only update the state it will be removed from HashedWheelBucket on next tick.
        if (!compareAndSetState(ST_INIT, ST_CANCELLED)) {
            return false;
        }
        // If a task should be canceled we create a new Runnable for this to another queue which will
        // be processed on each tick. So this means that we will have a GC latency of max. 1 tick duration
        // which is good enough. This way we can make again use of our MpscLinkedQueue and so minimize the
        // locking / overhead as much as possible.
        //
        // It is important that we not just add the HashedWheelTimeout itself again as it extends
        // MpscLinkedQueueNode and so may still be used as tombstone.
        timer.cancelledTimeouts.add(new Runnable() {
            @Override
            public void run() {
                HashedWheelBucket bucket = HashedWheelTimeout.this.bucket;
                if (bucket != null) {
                    bucket.remove(HashedWheelTimeout.this);
                }
            }
        });
        return true;
    }
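The pattern in cancel() (a CAS decides the winner, the structural removal is deferred to the owning thread) can be reproduced in a few lines of plain Java. This is a simplified sketch, not Netty's classes: CancellableSet, Handle, add, processCancelled, and liveCount are hypothetical names, a HashSet stands in for the bucket, and ConcurrentLinkedQueue stands in for the cancellation queue.

```java
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class CancellableSet {
    static final int ST_INIT = 0;
    static final int ST_CANCELLED = 1;
    static final int ST_EXPIRED = 2;

    private final Set<Handle> live = new HashSet<>();                        // touched only by the owner thread
    private final Queue<Runnable> cancelled = new ConcurrentLinkedQueue<>(); // written by any thread

    final class Handle {
        private final AtomicInteger state = new AtomicInteger(ST_INIT);

        /** Safe from any thread: flips the state and queues the removal instead of doing it. */
        boolean cancel() {
            if (!state.compareAndSet(ST_INIT, ST_CANCELLED)) {
                return false; // already cancelled or already expired
            }
            cancelled.add(() -> live.remove(this));
            return true;
        }

        /** Owner thread: succeeds only if nobody cancelled first. */
        boolean expire() {
            return state.compareAndSet(ST_INIT, ST_EXPIRED);
        }
    }

    /** Owner thread: registers a new handle. */
    Handle add() {
        Handle h = new Handle();
        live.add(h);
        return h;
    }

    /** Owner thread, once per tick: performs the removals queued by cancel(). */
    void processCancelled() {
        Runnable r;
        while ((r = cancelled.poll()) != null) {
            r.run();
        }
    }

    int liveCount() {
        return live.size();
    }
}
```

After cancel() returns true, the handle is still present in the set until the owner calls processCancelled(); in the meantime expire() fails on it because the state is no longer ST_INIT.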

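HashedWheelBucket

HashedWheelBucket is an inner class that stores every HashedWheelTimeout assigned to the bucket.
The operations on a bucket are mainly adding and removing HashedWheelTimeout instances, with no need to look one up, so the underlying storage is a linked list. Each bucket holds two member variables, head and tail.

    private HashedWheelTimeout head;
    private HashedWheelTimeout tail;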

Adding an Element

A plain linked-list append: the new HashedWheelTimeout is added to the tail of the list.

    public void addTimeout(HashedWheelTimeout timeout) {
        assert timeout.bucket == null;
        timeout.bucket = this;
        if (head == null) {
            head = tail = timeout;
        } else {
            tail.next = timeout;
            timeout.prev = tail;
            tail = timeout;
        }
    }

Removing an Element

    public void remove(HashedWheelTimeout timeout) {
        HashedWheelTimeout next = timeout.next;
        // remove timeout that was either processed or cancelled by updating the linked-list
        if (timeout.prev != null) {
            timeout.prev.next = next;
        }
        if (timeout.next != null) {
            timeout.next.prev = timeout.prev;
        }

        if (timeout == head) {
            // if timeout is also the tail we need to adjust the entry too
            if (timeout == tail) {
                tail = null;
                head = null;
            } else {
                head = next;
            }
        } else if (timeout == tail) {
            // if the timeout is the tail modify the tail to be the prev node.
            tail = timeout.prev;
        }
        // null out prev, next and bucket to allow for GC.
        timeout.prev = null;
        timeout.next = null;
        timeout.bucket = null;
    }

Timeout Detection

The method traverses every element in the list and checks each one for expiry. Elements that have expired or have been cancelled are removed from the list.

    public void expireTimeouts(long deadline) {
        HashedWheelTimeout timeout = head;

        // process all timeouts
        // Walk every element.
        while (timeout != null) {
            boolean remove = false;
            // remainingRounds decides whether the timeout is due.
            if (timeout.remainingRounds <= 0) {
                if (timeout.deadline <= deadline) {
                    timeout.expire();
                } else {
                    // The timeout was placed into a wrong slot. This should never happen.
                    throw new IllegalStateException(String.format(
                            "timeout.deadline (%d) > deadline (%d)", timeout.deadline, deadline));
                }
                remove = true;
            } else if (timeout.isCancelled()) {
                remove = true;
            } else {
                timeout.remainingRounds --;
            }
            // store reference to next as we may null out timeout.next in the remove block.
            HashedWheelTimeout next = timeout.next;
            if (remove) {
                remove(timeout);
            }
            timeout = next;
        }
    }

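Q&A

Q: What happens if a TimerTask callback takes a long time to run?
A: As the source shows, a single thread checks the buckets in order and invokes the expired callbacks one after another. A slow callback can therefore delay the checks that follow it and reduce their precision, so callbacks should avoid time-consuming work.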
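One common way to follow this advice is to have the callback do nothing but hand the real work to a separate thread pool. The sketch below shows the idea with a plain Runnable so that it runs without Netty; Offload and offloadTo are hypothetical names, and in a real TimerTask the same execute call would sit inside its run method.

```java
import java.util.concurrent.ExecutorService;

final class Offload {
    /**
     * Wraps slowWork so that the thread invoking the returned Runnable (the timer's
     * worker thread in this scenario) only pays for handing the work to the pool.
     */
    static Runnable offloadTo(ExecutorService pool, Runnable slowWork) {
        return () -> pool.execute(slowWork);
    }
}
```

The trade-off is that the work no longer runs at the moment of expiry but whenever the pool gets to it, and a saturated pool may reject it, so the pool's size and rejection policy become part of the timeout behavior.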

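Further Thoughts

As the source shows, every call to HashedWheelBucket#expireTimeouts traverses all elements in the bucket, which is inevitably inefficient. One option worth considering is to keep the list sorted by remainingRounds and, instead of comparing remainingRounds with 0, compare it directly with the number of rounds the wheel has already completed. Then, once one element is found not to be due, none of the elements after it need to be checked.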
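A rough sketch of that idea follows. It departs from the proposal in one respect: it uses a java.util.PriorityQueue instead of a sorted linked list, purely to keep the example short. SortedBucket, Entry, fireRound, add, and expire are hypothetical names, and each entry stores the absolute round in which it is due rather than a countdown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

final class SortedBucket {
    static final class Entry {
        final String name;
        final long fireRound; // absolute wheel round in which this entry is due

        Entry(String name, long fireRound) {
            this.name = name;
            this.fireRound = fireRound;
        }
    }

    // Ordered by fireRound, so the earliest-due entry is always at the head.
    private final PriorityQueue<Entry> entries =
            new PriorityQueue<Entry>((a, b) -> Long.compare(a.fireRound, b.fireRound));

    void add(String name, long fireRound) {
        entries.add(new Entry(name, fireRound));
    }

    /** Pops entries that are due in currentRound and stops at the first one that is not. */
    List<String> expire(long currentRound) {
        List<String> fired = new ArrayList<>();
        while (!entries.isEmpty() && entries.peek().fireRound <= currentRound) {
            fired.add(entries.poll().name);
        }
        return fired;
    }
}
```

The early exit comes at a price: insertion is no longer constant time, and removing a cancelled entry from the middle of a PriorityQueue is linear, whereas the existing doubly linked list removes a node in constant time. Whether the change pays off would depend on how many entries per bucket typically span multiple rounds.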
