A Record of Troubleshooting HttpClient Timeouts in Production

Posted by 程序员技术漫谈



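Recently, SkyWalking (a monitoring tool I recommend) showed that one of our public-facing production services was hitting timeouts when calling another service through HttpClient: roughly 10 timeout exceptions per 1,000 calls, about 1%. The team that owns the downstream service said they apply no limits on their side, and our own production concurrency was not high either. Here is the exception from production: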

message: Timeout waiting for connection from pool
stack: org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:254)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:231)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:173)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:195)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:86)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:108)
    at org.apache.http.impl.client.InternalHttpClient.doExecute$original$359BcZ03(InternalHttpClient.java:184)
    at org.apache.http.impl.client.InternalHttpClient.doExecute$original$359BcZ03$accessor$kwq4tKpV(InternalHttpClient.java)
    at org.apache.http.impl.client.InternalHttpClient$auxiliary$UOMniH0V.call(Unknown Source)
    at org.apache.skywalking.apm.agent.core.plugin.interceptor.enhance.InstMethodsInter.intercept(InstMethodsInter.java:93)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:106)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57)
    ....
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
    at java.lang.Thread.run(Thread.java:745)

Next, the code:

private HttpClient httpClient;

private HttpClient getHttpClient() {
    if (null == httpClient) {
        synchronized (this) {
            if (null == httpClient) {
                RequestConfig requestConfig = RequestConfig.custom()
                        // max time (ms) to establish the connection to the remote host
                        .setConnectTimeout(5000)
                        // max time (ms) to wait for a connection from the pool
                        .setConnectionRequestTimeout(5000)
                        .build();
                httpClient = HttpClientBuilder.create()
                        .setDefaultRequestConfig(requestConfig)
                        .build();
            }
        }
    }
    return httpClient;
}

public String xxx(xxx,xxx){
    String url = "xxxxx";
    try {
        HttpPost httpPost = new HttpPost(url);
        HttpResponse response = getHttpClient().execute(httpPost);
        System.out.println(EntityUtils.toString(response.getEntity(), "utf-8"));
    } catch (IOException e) {
        e.printStackTrace();
    }
    ....
}

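So the httpClient singleton is created lazily with double-checked locking (DCL), a colleague showing off a bit. Nothing looks obviously wrong at first glance, so let's look at the httpClient instance in the debugger: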

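So that's it: by default the client's connection pool allows at most 20 connections in total (maxTotal) and at most 2 per route (defaultMaxPerRoute). A route is, roughly, a target host (plus any proxy in between), so the per-route limit caps how many connections this client can hold to any single downstream host at once, while maxTotal caps the connections to all hosts combined. With the defaults, at most 2 requests to the same host can be in flight at the same time. Any further request waits for a connection to be returned to the pool and, if none is released within connectionRequestTimeout (5000 ms here), fails with ConnectionPoolTimeoutException. Combined with what SkyWalking showed, the cause became clear: production traffic occasionally spikes, and the default pool limits are too small for those spikes. Enlarging the pool reduces the timeouts, so we set maxTotal to 200 and defaultMaxPerRoute to 10.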
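To make the effect of a small pool concrete, here is a minimal model of it using only the JDK. It is a sketch under stated assumptions, not HttpClient's actual implementation: the pool is reduced to a single Semaphore, and the class name PoolLeaseModel, the method countLeaseTimeouts, and all the numbers are made up for illustration. Each simulated caller waits a bounded time for a connection (standing in for connectionRequestTimeout) and then holds it for a while (standing in for the downstream call).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical model of a bounded connection pool; not part of HttpClient.
class PoolLeaseModel {

    /**
     * Runs `callers` concurrent requests against a pool of `poolSize` connections.
     * Each request waits at most leaseTimeoutMs for a connection, then holds it
     * for holdMs. Returns how many requests gave up waiting for a connection.
     */
    static int countLeaseTimeouts(int poolSize, int callers, long leaseTimeoutMs, long holdMs) {
        Semaphore pool = new Semaphore(poolSize, true);
        ExecutorService executor = Executors.newFixedThreadPool(callers);
        List<Callable<Boolean>> tasks = new ArrayList<>();
        for (int i = 0; i < callers; i++) {
            tasks.add(() -> {
                // Stand-in for "Timeout waiting for connection from pool"
                if (!pool.tryAcquire(leaseTimeoutMs, TimeUnit.MILLISECONDS)) {
                    return true;
                }
                try {
                    Thread.sleep(holdMs); // simulated downstream call
                    return false;
                } finally {
                    pool.release();       // connection goes back to the pool
                }
            });
        }
        try {
            int timeouts = 0;
            for (Future<Boolean> result : executor.invokeAll(tasks)) {
                if (result.get()) {
                    timeouts++;
                }
            }
            return timeouts;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("simulation failed", e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        // 6 concurrent callers, 100 ms lease timeout, 300 ms per call
        System.out.println("pool=2:  " + countLeaseTimeouts(2, 6, 100, 300) + " lease timeouts");
        System.out.println("pool=10: " + countLeaseTimeouts(10, 6, 100, 300) + " lease timeouts");
    }
}
```

With a pool of 2, only two of the six callers can hold a connection during the burst, and the rest should give up after their 100 ms wait. With a pool of 10, every caller gets a connection immediately. The same mechanism explains why short traffic spikes produced a small percentage of pool timeouts in production.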

Note: timeouts can also be reduced by retrying when one occurs. Our business logic does not need retries, so enlarging the pool was enough.
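For readers who do want the retry route, a minimal sketch of the idea follows. It is not what we deployed, and the names RetryOnTimeout, IoCall and callWithRetry are hypothetical helpers written for this example. The sketch retries only on InterruptedIOException, which in HttpClient 4.x is a superclass of ConnectionPoolTimeoutException, and wraps the final failure in UncheckedIOException.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.UncheckedIOException;

// Hypothetical retry helper; not part of HttpClient.
class RetryOnTimeout {

    /** A call that may fail with an I/O error. */
    interface IoCall<T> {
        T call() throws IOException;
    }

    /**
     * Invokes the call up to maxAttempts times, retrying only on timeouts
     * and sleeping backoffMs * attempt between attempts.
     */
    static <T> T callWithRetry(IoCall<T> call, int maxAttempts, long backoffMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (InterruptedIOException timeout) {
                if (attempt >= maxAttempts) {
                    throw new UncheckedIOException(timeout); // out of attempts
                }
                sleep(backoffMs * attempt);
            } catch (IOException other) {
                throw new UncheckedIOException(other);       // not a timeout: do not retry
            }
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while backing off", e);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated call: times out twice, then succeeds
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new InterruptedIOException("Timeout waiting for connection from pool");
            }
            return "ok";
        }, 3, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A pool lease timeout happens before anything is sent to the downstream service, so retrying it does not risk a duplicate request. Retrying after a read timeout is a different matter, especially for a POST like the one above, and is only safe if the downstream operation is idempotent.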

The code after the change:

private HttpClient httpClient;

private HttpClient getHttpClient() {
    if (null == httpClient) {
        synchronized (this) {
            if (null == httpClient) {
                RequestConfig requestConfig = RequestConfig.custom()
                        .setConnectTimeout(5000)
                        .setConnectionRequestTimeout(5000)
                        .build();
                // the change: raise the pool limits
                httpClient = HttpClientBuilder.create()
                        .setDefaultRequestConfig(requestConfig)
                        .setMaxConnTotal(200)
                        .setMaxConnPerRoute(10)
                        .build();
            }
        }
    }
    return httpClient;
}
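A side note on the lazy initialisation itself: getHttpClient() keeps httpClient in a plain field. Under the Java memory model, double-checked locking is only guaranteed to be safe when the field is declared volatile, otherwise another thread may in principle observe a reference to a not yet fully constructed object. This was not the cause of the timeouts, but it is cheap to fix. The sketch below shows the usual shape with a hypothetical generic LazyHolder class; in the real code the equivalent change would be adding volatile to the httpClient field.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical holder illustrating double-checked locking with a volatile field.
class LazyHolder<T> {

    // volatile gives the safe-publication guarantee that DCL relies on
    private volatile T instance;
    private final Supplier<T> factory;

    LazyHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T local = instance;              // one volatile read on the fast path
        if (local == null) {
            synchronized (this) {
                local = instance;        // re-check under the lock
                if (local == null) {
                    local = factory.get();
                    instance = local;    // publish the fully built object
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        LazyHolder<Object> holder = new LazyHolder<>(() -> {
            created.incrementAndGet();
            return new Object();
        });
        Object first = holder.get();
        Object second = holder.get();
        System.out.println("same instance: " + (first == second)
                + ", factory calls: " + created.get());
    }
}
```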

