My leader said to use HttpClient's retry, but I didn't, because I had a better solution.

Posted 2021-12-09 程序员石磊

Last week's summary promised an analysis of timeout-retry options; this post walks through the ones that are actually usable.

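1. Background

Our customers demand very high availability from our systems: no lower than 99%. To monitor availability without intruding on the code of each sub-product, we built a separate service that performs liveness probing.
Put simply, liveness probing means calling each system's interfaces on a schedule and checking whether the service responds. The systems expose their interfaces over different protocols, mainly: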

websocket
http
git ssh
git http
linux shell (ssh)

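In a k8s cluster the request path is interleaved with all kinds of proxy and forwarding middleware, so the link is very long. A single network jitter or request timeout got a service misjudged as unavailable, which made the metrics look bad. So the team suggested adding a retry policy: only judge a service unavailable once it has failed more than 3 times.

2. Analyzing how to implement retries

My leader said: just use HttpClient's retry.
I thought the suggestion was a good one, but it only covers HTTP. What about the other protocols? So I started wondering whether there was a simpler approach, and spelled out my requirements to myself:

Is there a general-purpose solution?
Implementing retries separately for every protocol is a huge amount of work. I shouldn't get lost in the details; I should abstract the common pattern out of them and then solve the problem with something like a design pattern.
It must not intrude on the business logic.

Thinking a bit further:

The system was abstracted with the strategy pattern from the very beginning.
It also uses the proxy pattern, so the controller layer never calls the strategies directly. A proxy is just a middleman, and the middleman is free to do whatever it wants before executing a strategy.

Good, that's roughly the solution: do the work in the proxy.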
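
The idea of retrying inside the proxy can be sketched roughly as follows. This is a minimal illustration under my own assumptions, not the actual project code: the names ProbeStrategy and RetryingProbeProxy and the attempt limit are hypothetical.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical strategy interface: one implementation per protocol (http, websocket, ssh, ...). */
interface ProbeStrategy {
    boolean probe() throws Exception;
}

/** Hypothetical proxy: wraps any strategy and retries it, so the protocol code stays untouched. */
final class RetryingProbeProxy implements ProbeStrategy {
    private final ProbeStrategy delegate;
    private final int maxAttempts;

    RetryingProbeProxy(ProbeStrategy delegate, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.delegate = delegate;
        this.maxAttempts = maxAttempts;
    }

    @Override
    public boolean probe() {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (delegate.probe()) {
                    return true; // the service answered, stop retrying
                }
            } catch (Exception e) {
                // a timeout or network jitter counts as one failed attempt
            }
        }
        return false; // every attempt failed: report the service as unavailable
    }
}

final class RetryingProbeDemo {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated jitter: fails twice, then succeeds on the third call.
        ProbeStrategy flaky = () -> {
            if (calls.incrementAndGet() < 3) {
                throw new IOException("timeout");
            }
            return true;
        };
        boolean available = new RetryingProbeProxy(flaky, 3).probe();
        System.out.println("available=" + available + ", calls=" + calls.get());
    }
}
```

Because the proxy implements the same interface as the strategies, the caller does not need to know whether retries are happening, and no protocol-specific class has to change.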

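3. Surveying retry solutions

Before building anything I like to check whether a mature solution already exists; time was tight and there was none to spare for reinventing the wheel. I found two common options:

3.1. Spring-Retry

Spring Retry provides the ability to automatically re-invoke a failed operation. This helps when the error may be transient, such as a momentary network failure. It offers declarative control over the process and policy-based behavior, and is easy to extend and customize.

3.1.1. Add the dependencies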

<dependency>
    <groupId>org.springframework.retry</groupId>
    <artifactId>spring-retry</artifactId>
</dependency>

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-aop</artifactId>
</dependency>

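3.1.2. Annotate the method

import java.time.LocalTime;

import lombok.extern.slf4j.Slf4j;
import org.springframework.remoting.RemoteAccessException;
import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Recover;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class RemoteService {

    /**
     * Retried when RemoteAccessException is thrown: up to 5 attempts (the default is 3),
     * waiting 2000 ms before the first retry and 1.5 times longer before each later one.
     * When the result is not acceptable, throw the exception yourself to trigger a retry.
     *
     * @param count request parameter; 10 simulates a failing remote call
     * @return "SUCCESS" when the call goes through
     * @throws Exception when the remote call fails
     */
    @Retryable(value = {RemoteAccessException.class}, maxAttempts = 5, backoff = @Backoff(delay = 2000, multiplier = 1.5))
    public String call(Integer count) throws Exception {
        if (count == 10) {
            log.info("Remote RPC call do something... {}", LocalTime.now());
            throw new RemoteAccessException("RPC call failed");
        }
        return "SUCCESS";
    }

    /**
     * Fallback invoked once the retries are exhausted. The exception type and the
     * return type must match those of the retried method.
     *
     * @param e the exception from the last failed attempt
     * @return the fallback result
     */
    @Recover
    public String recover(RemoteAccessException e) {
        log.info("Remote RPC Call fail", e);
        return "recover SUCCESS";
    }
}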

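The main parameters of @Retryable are:

value: the recoverable exception types, that is, the exceptions that trigger a retry.
maxAttempts: the maximum number of attempts, 3 by default.
exclude: exceptions that are not retried; empty by default.
include: exceptions that are retried; when it is empty (and exclude is empty too), all exceptions are retried.
backoff: the wait policy between attempts. If you leave it out, or write @Backoff without properties such as (delay = 3000L), the wait is a fixed 1000 ms. Whether to lengthen it or make it grow depends on the business scenario, but some delay is usually better than retrying immediately.

@Backoff

delay: how long to wait before retrying, in milliseconds.
multiplier: the factor applied to the delay on each retry. For example, with delay = 1000L and multiplier = 2, the first retry comes after 1 second, the second after 2 seconds and the third after 4 seconds.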
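
To make the delay and multiplier arithmetic concrete, here is a small plain-Java sketch that prints the waits between attempts. It does not call Spring Retry; BackoffSchedule is a hypothetical helper, and it ignores any maximum-delay cap a real backoff policy may apply.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists the waits implied by a delay and a multiplier. */
final class BackoffSchedule {

    /** Waits in ms between attempts; maxAttempts attempts need maxAttempts - 1 waits. */
    static List<Long> delays(long delayMs, double multiplier, int maxAttempts) {
        List<Long> result = new ArrayList<>();
        double current = delayMs;
        for (int i = 1; i < maxAttempts; i++) {
            result.add((long) current);
            current *= multiplier; // each later wait is multiplier times the previous one
        }
        return result;
    }

    public static void main(String[] args) {
        // The settings used in the @Retryable example: delay = 2000, multiplier = 1.5, 5 attempts.
        System.out.println(delays(2000, 1.5, 5)); // [2000, 3000, 4500, 6750]
        // The example from the @Backoff description: delay = 1000, multiplier = 2.
        System.out.println(delays(1000, 2, 4));   // [1000, 2000, 4000]
    }
}
```

With the annotation settings above, a call that keeps failing therefore spends roughly 16 seconds waiting before the fifth and final attempt.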

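Notes:

If the @Retryable method lives in the Service layer, call it from another bean such as a Controller. Calling it from within the same class does not trigger @Retryable, because Spring wraps the original bean in a proxy and the retry logic lives in that proxy; an internal call never passes through it. This is also how the retry works: the call goes through the proxy to the real method, the proxy catches the exceptions listed in value, and it re-invokes the method according to the @Retryable configuration.
A @Retryable method must let the exception propagate. If you catch it inside the method, Retry never sees it.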
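
The self-invocation pitfall can be reproduced without Spring. The sketch below uses a plain JDK dynamic proxy as a stand-in for Spring's proxy; the names Remote, RemoteImpl and SelfInvocationDemo are hypothetical, and the interceptor only counts calls instead of retrying.

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;

interface Remote {
    String call();
    String callViaSelf();
}

final class RemoteImpl implements Remote {
    @Override
    public String call() {
        return "SUCCESS";
    }

    // Self-invocation: this.call() goes straight to the target object, not through the proxy.
    @Override
    public String callViaSelf() {
        return this.call();
    }
}

final class SelfInvocationDemo {

    /** Wraps the target in a JDK dynamic proxy that counts every intercepted call(). */
    static Remote proxied(Remote target, AtomicInteger intercepted) {
        return (Remote) Proxy.newProxyInstance(
                Remote.class.getClassLoader(),
                new Class<?>[] {Remote.class},
                (proxy, method, args) -> {
                    if (method.getName().equals("call")) {
                        intercepted.incrementAndGet(); // a retry interceptor would sit here
                    }
                    return method.invoke(target, args);
                });
    }

    public static void main(String[] args) {
        AtomicInteger intercepted = new AtomicInteger();
        Remote remote = proxied(new RemoteImpl(), intercepted);
        remote.call();        // passes through the proxy: counted
        remote.callViaSelf(); // the inner this.call() bypasses the proxy: not counted
        System.out.println("intercepted call() " + intercepted.get() + " time(s)");
    }
}
```

Only the first call is counted, which is the same reason an internal call to a @Retryable method is never retried.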

3.1.3. Calling it

@RestController
@RequestMapping("/retry")
@Slf4j
public class RetryController {

    @Autowired
    private RemoteService remoteService;

    @RequestMapping("/show/{count}")
    public String show(@PathVariable Integer count){
        try {
            return remoteService.call(count);
        } catch (Exception e) {
            log.error("RetryController.show Exception",e);
            return "Hello SUCCESS";
        }
    }
}

3.1.4. Enable retry

@SpringBootApplication
@EnableRetry // enable retry
public class Application {

    public static void main(String[] args) {

        SpringApplication.run(Application.class,args);
    }
}

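3.2. Guava Retrying

guava-retrying is a thread-safe Java retry library. It offers a general way to retry arbitrary code, with flexible control over the number of retries, when to retry, how often, and when to stop, and it also handles exceptions.

3.2.1. Add the dependency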

<dependency>
      <groupId>com.github.rholder</groupId>
      <artifactId>guava-retrying</artifactId>
      <version>2.0.0</version>
</dependency>

3.2.2. Getting-started demo

Callable<Boolean> callable = new Callable<Boolean>() {
    public Boolean call() throws Exception {
        return true; // do something useful here
    }
};

Retryer<Boolean> retryer = RetryerBuilder.<Boolean>newBuilder()
    .retryIfResult(Predicates.<Boolean>isNull()) // retry when the callable returns null
    .retryIfExceptionOfType(IOException.class) // retry when the callable throws IOException
    .retryIfRuntimeException() // retry when the callable throws a RuntimeException
    .withStopStrategy(StopStrategies.stopAfterAttempt(3)) // stop after 3 attempts
    .build();
try {
    retryer.call(callable);
} catch (RetryException e) {
    e.printStackTrace();
} catch (ExecutionException e) {
    e.printStackTrace();
}

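Some details:

RetryerBuilder is a factory-style builder. It lets you configure one or more retry sources, a maximum number of attempts or a time limit, and the wait interval, and then creates a Retryer instance.
The retry sources can be exceptions or custom predicates, set through retryIfException and retryIfResult; several can be configured and they work together.
retryIfException retries on both runtime and checked exceptions, but not on an Error.
retryIfRuntimeException retries only on runtime exceptions; checked exceptions and Errors are not retried.
retryIfExceptionOfType retries only on a specific exception type, for example NullPointerException or IllegalStateException (both runtime exceptions); a custom Error type can be specified as well.

3.2.3. WaitStrategies: how long to wait between retries

ExponentialWaitStrategy: exponential wait

Exponential backoff algorithm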

.withWaitStrategy(WaitStrategies.exponentialWait(100, 5, TimeUnit.MINUTES))

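Combined with a never-stop strategy, this creates a retryer that retries forever, waiting an exponentially growing time after each failed attempt, up to at most 5 minutes. After that it retries every 5 minutes. In this example:

After the first failure the waits are, in milliseconds: 2^1 * 100; 2^2 * 100; 2^3 * 100; ...

The source of ExponentialWaitStrategy that computes the wait from the attempt number is worth a look: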

@Override
public long computeSleepTime(Attempt failedAttempt) {
    double exp = Math.pow(2, failedAttempt.getAttemptNumber());
    long result = Math.round(multiplier * exp);
    if (result > maximumWait) {
        result = maximumWait;
    }
    return result >= 0L ? result : 0L;
}

If a similar need comes up later, we can write these algorithms ourselves. For more on exponential backoff, see https://en.wikipedia.org/wiki/Exponential_backoff
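
As a sketch of what writing it ourselves could look like, the snippet below repeats the formula from the source excerpt above as a standalone method and prints the waits for exponentialWait(100, 5, TimeUnit.MINUTES). ExponentialWaitSketch and its parameter names are hypothetical; it does not depend on guava-retrying.

```java
/** Hypothetical standalone version of the exponential wait formula quoted above. */
final class ExponentialWaitSketch {

    /** Wait in ms after the given failed attempt (1-based), capped at maximumWaitMs. */
    static long sleepTimeMs(long multiplierMs, long maximumWaitMs, long attemptNumber) {
        double exp = Math.pow(2, attemptNumber);
        long result = Math.round(multiplierMs * exp);
        if (result > maximumWaitMs) {
            result = maximumWaitMs; // never wait longer than the configured maximum
        }
        return Math.max(result, 0L);
    }

    public static void main(String[] args) {
        long maximumWaitMs = 5 * 60 * 1000L; // 5 minutes
        for (long attempt = 1; attempt <= 4; attempt++) {
            // prints 200, 400, 800, 1600
            System.out.println(sleepTimeMs(100, maximumWaitMs, attempt));
        }
        // 2^12 * 100 = 409600 ms exceeds the cap, so this prints 300000
        System.out.println(sleepTimeMs(100, maximumWaitMs, 12));
    }
}
```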

FibonacciWaitStrategy: Fibonacci wait

Fibonacci backoff algorithm

.withWaitStrategy(WaitStrategies.fibonacciWait(100, 2, TimeUnit.MINUTES))

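Combined with a never-stop strategy, this creates a retryer that retries forever, with the wait after each failed attempt following the Fibonacci sequence, up to at most 2 minutes. After that it retries every 2 minutes. In this example:

After the first failure the waits are, in milliseconds: 1*100; 1*100; 2*100; 3*100; 5*100; ...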
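
The sequence above can be reproduced with a few lines of plain Java. This is my own sketch that mirrors the numbers listed here, not the library's source; FibonacciWaitSketch and its parameters are hypothetical.

```java
/** Hypothetical standalone version of a Fibonacci wait calculation. */
final class FibonacciWaitSketch {

    /** n-th Fibonacci number with fib(1) = fib(2) = 1, computed iteratively. */
    static long fib(long n) {
        long prev = 0L;
        long current = 1L;
        for (long i = 2; i <= n; i++) {
            long next = prev + current;
            prev = current;
            current = next;
        }
        return n == 0 ? 0L : current;
    }

    /** Wait in ms after the given failed attempt (1-based), capped at maximumWaitMs. */
    static long sleepTimeMs(long multiplierMs, long maximumWaitMs, long attemptNumber) {
        long result = multiplierMs * fib(attemptNumber);
        if (result > maximumWaitMs || result < 0L) {
            result = maximumWaitMs; // cap the wait, also guarding against overflow
        }
        return result;
    }

    public static void main(String[] args) {
        long maximumWaitMs = 2 * 60 * 1000L; // 2 minutes
        for (long attempt = 1; attempt <= 5; attempt++) {
            // prints 100, 100, 200, 300, 500
            System.out.println(sleepTimeMs(100, maximumWaitMs, attempt));
        }
    }
}
```

Compared with the exponential strategy, the waits grow more gently, which suits probes that should keep a fairly steady rhythm.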

FixedWaitStrategy: fixed wait

.withWaitStrategy(WaitStrategies.fixedWait(10, TimeUnit.SECONDS))

After a failure, waits a fixed amount of time before retrying.

RandomWaitStrategy: random wait

.withWaitStrategy(WaitStrategies.randomWait(10, TimeUnit.SECONDS))
.withWaitStrategy(WaitStrategies.randomWait(1, TimeUnit.SECONDS, 10, TimeUnit.SECONDS))

Waits a random amount of time. You can set just a maximum, or both a minimum and a maximum.

IncrementingWaitStrategy: incrementing wait

.withWaitStrategy(WaitStrategies.incrementingWait(1, TimeUnit.SECONDS, 5, TimeUnit.SECONDS))

The wait grows by a fixed increment from an initial value. In this example:

After the first failure the waits are 1s; 6s (1+5); 11s (1+5+5); 16s (1+5+5+5); ...

ExceptionWaitStrategy: wait chosen by exception

.withWaitStrategy(WaitStrategies.exceptionWait(ArithmeticException.class, e -> 1000L))

Chooses the wait according to the exception that occurred; if the exception does not match, the wait is 0.

CompositeWaitStrategy: composite wait

.withWaitStrategy(WaitStrategies.join(WaitStrategies.exceptionWait(ArithmeticException.class, e -> 1000L),WaitStrategies.fixedWait(5, TimeUnit.SECONDS)))

Combines several wait strategies: the total wait is the sum of the waits computed by each of them.

3.2.4. StopStrategies: when to stop retrying

NeverStopStrategy

.withStopStrategy(StopStrategies.neverStop())

Never stops; keeps retrying forever.

StopAfterAttemptStrategy

.withStopStrategy(StopStrategies.stopAfterAttempt(3))

Stops once the number of attempts reaches the maximum.

StopAfterDelayStrategy

.withStopStrategy(StopStrategies.stopAfterDelay(3, TimeUnit.MINUTES))

Stops once the configured total time has elapsed, no matter how many attempts have run.
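
The three stop conditions are easy to picture as small predicates checked after every failed attempt. The sketch below is a hand-rolled illustration under that assumption; StopRule and StopRules are hypothetical names, not guava-retrying types.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical stop rule: decides after a failed attempt whether to give up. */
interface StopRule {
    boolean shouldStop(int attemptNumber, long elapsedMs);
}

final class StopRules {

    static StopRule never() {
        return (attempt, elapsedMs) -> false;
    }

    static StopRule afterAttempt(int maxAttempts) {
        return (attempt, elapsedMs) -> attempt >= maxAttempts;
    }

    static StopRule afterDelay(long maxMs) {
        return (attempt, elapsedMs) -> elapsedMs >= maxMs;
    }

    /** Retries the action until it returns true or the rule says stop; returns whether it succeeded. */
    static boolean retry(BooleanSupplier action, StopRule rule) {
        long start = System.nanoTime();
        for (int attempt = 1; ; attempt++) {
            if (action.getAsBoolean()) {
                return true;
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            if (rule.shouldStop(attempt, elapsedMs)) {
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // An action that always fails is attempted exactly 3 times.
        boolean ok = retry(() -> { calls[0]++; return false; }, afterAttempt(3));
        System.out.println("ok=" + ok + ", calls=" + calls[0]);
    }
}
```

For the liveness probe, a count-based rule matches the requirement of giving up after a fixed number of failures, while a time-based rule bounds how long one probe cycle can take.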

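BlockStrategies: how to block between attempts

Only one blocking strategy is provided by default, ThreadSleepStrategy, which is implemented with Thread.sleep(sleepTime). That leaves plenty of room to implement our own.

AttemptTimeLimiters: time limit for a single attempt

This limits how long a single attempt may run; an attempt that exceeds the limit is aborted.

NoAttemptTimeLimit: no limit

.withAttemptTimeLimiter(AttemptTimeLimiters.noTimeLimit())

As the name says, no time limit: each attempt runs to completion before the retry strategy is applied.

FixedAttemptTimeLimit

.withAttemptTimeLimiter(AttemptTimeLimiters.fixedTimeLimit(10, TimeUnit.SECONDS))
.withAttemptTimeLimiter(AttemptTimeLimiters.fixedTimeLimit(10, TimeUnit.SECONDS, Executors.newCachedThreadPool()))

Sets a time limit for each attempt. To keep thread management under control, it is best to pass in a thread pool.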
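
To show why a thread pool is involved at all, here is a plain-JDK sketch of a per-attempt time limit: the attempt runs on an executor and the caller waits only up to the limit. It illustrates the general technique, not guava-retrying's implementation; AttemptTimeLimitSketch and callWithLimit are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class AttemptTimeLimitSketch {

    /** Runs one attempt on the executor and gives up on it after timeoutMs. */
    static <T> T callWithLimit(Callable<T> task, long timeoutMs, ExecutorService executor)
            throws Exception {
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it does not keep running
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            System.out.println(callWithLimit(() -> "fast", 1000, executor));
            // This attempt sleeps far longer than its 100 ms limit.
            callWithLimit(() -> { Thread.sleep(5000); return "slow"; }, 100, executor);
        } catch (TimeoutException e) {
            System.out.println("attempt timed out");
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Passing the executor in, rather than creating one per call, is what keeps the number of worker threads under the caller's control.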

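3.2.5. Retry listeners

If something extra has to happen when a retry occurs, such as sending an e-mail notification, use a RetryListener. Guava Retryer calls the listener back after every attempt, and several listeners can be registered.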

@Slf4j
class DiyRetryListener implements RetryListener {
    @Override
    public <V> void onRetry(Attempt<V> attempt) {
        log.info("Attempt number: {}", attempt.getAttemptNumber());
        log.info("Delay since the first attempt: {} ms", attempt.getDelaySinceFirstAttempt());
        if (attempt.hasException()) {
            log.error("Cause of failure:", attempt.getExceptionCause());
        } else {
            log.info("Result: {}", attempt.getResult());
        }
    }
}

After defining the listener, register it on the Retryer.

Retryer<Boolean> retryer = RetryerBuilder.<Boolean>newBuilder()
    .retryIfResult(Predicates.<Boolean>isNull()) // retry when the callable returns null
    .retryIfExceptionOfType(IOException.class) // retry when the callable throws IOException
    .retryIfRuntimeException() // retry when the callable throws a RuntimeException
    .withStopStrategy(StopStrategies.stopAfterAttempt(3)) // stop after 3 attempts
    .withRetryListener(new DiyRetryListener()) // register the listener
    .build();