HttpClient Made Easy: Wrapping an HttpClient Utility Class, Optimizing How the HTTP Connection Pool Is Enabled

Posted by 龙轩


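After I wrote the HttpClient utility class, people kept asking me how to enable an HTTP connection pool, which I honestly had not considered. The question came back to me in a spare moment, so I decided to work it out.
I had never used one, but the idea is not hard to grasp. A coder who has never used an HTTP connection pool has certainly used a database connection pool. The two serve the same purpose: saving the time spent setting up and tearing down connections. As is well known, the TCP connection underneath an HTTP request is established with a three-way handshake, and those who have dug a little deeper know that closing it takes a four-way handshake. These steps happen automatically, and avoiding them by reusing connections is very useful for large batches of HTTP requests (a crawler, for example).
For how to enable connection pooling in HttpClient itself, see this article: http://www.cnblogs.com/likaitai/p/5431246.html. It explains how to obtain connections from the pool and how to finish with a connection so that it is not closed outright.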
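To make the pooling idea concrete, here is a minimal JDK-only sketch. It is not HttpClient's real pool: `ToyConnectionPool`, `Conn`, `lease` and `release` are hypothetical names invented for illustration, and a `Conn` merely stands in for a TCP connection whose setup and teardown we would like to avoid repeating.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy pool (hypothetical, for illustration only): reuses idle connections instead of opening new ones. */
final class ToyConnectionPool {

    /** Stand-in for a TCP connection; constructing one represents the expensive handshake. */
    static final class Conn {
        final int id;
        Conn(int id) { this.id = id; }
    }

    private final BlockingQueue<Conn> idle;
    private final AtomicInteger opened = new AtomicInteger();
    private final int maxTotal;

    ToyConnectionPool(int maxTotal) {
        this.maxTotal = maxTotal;
        this.idle = new ArrayBlockingQueue<>(maxTotal);
    }

    /** Returns an idle connection if one exists, opens a new one while under the limit, otherwise waits. */
    Conn lease() {
        Conn c = idle.poll();
        if (c != null) {
            return c;
        }
        while (true) {
            int n = opened.get();
            if (n >= maxTotal) {
                try {
                    return idle.take(); // at capacity: wait until someone releases a connection
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting for a connection", e);
                }
            }
            if (opened.compareAndSet(n, n + 1)) {
                return new Conn(n + 1); // the "handshake" happens only here
            }
        }
    }

    /** Hands the connection back for reuse instead of closing it. */
    void release(Conn c) {
        idle.offer(c);
    }

    int openedCount() {
        return opened.get();
    }

    public static void main(String[] args) {
        ToyConnectionPool pool = new ToyConnectionPool(2);
        for (int i = 0; i < 100; i++) {
            Conn c = pool.lease();
            pool.release(c);
        }
        System.out.println("requests=100, connections opened=" + pool.openedCount());
    }
}
```

Run sequentially like this, the hundred "requests" share a single connection, which is exactly the saving a pool is meant to provide.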
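With that said, on to the main topic: how does the HttpClient utility class enable an HTTP connection pool? Only the method that creates the client needs to change:
Previously, the utility class's core execute method obtained its HttpClient object by calling create(String url), which returned a default HttpClient object. To enable the connection pool, this method has to change. The class that configures the pool is HCB, while execute takes an HttpConfig parameter, so the first step is to add an HCB object to HttpConfig and the second is to modify create. The details are as follows: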
	/**
	 * HCB object, used to create the HttpClient object
	 */
	private HCB hcb;

	public HCB hcb() {
		return hcb;
	}

	/**
	 * HCB object, used to obtain the HttpClient object from the connection pool automatically<br>
	 * <font color="red"><b>Call hcb.pool to configure the connection pool</b></font>
	 * @throws HttpProcessException 
	 */
	public HttpConfig hcb(HCB hcb) throws HttpProcessException {
		this.hcb = hcb;
		return this;
	}

	/**
	 * Checks whether the connection pool is enabled and whether the url is http or https <br>
	 * 		If the pool is enabled, calls build to obtain the client object from the pool<br>
	 * 		Otherwise, falls back to the matching default client object<br>
	 * 
	 * @throws HttpProcessException 
	 */
	private static void create(HttpConfig config) throws HttpProcessException {
		if(config.hcb()!=null && config.hcb().isSetPool) { // an hcb object is set and a pool is configured: take the client from the pool
			if(config.url().toLowerCase().startsWith("https://")) {
				config.client(config.hcb().ssl().build());
			} else {
				config.client(config.hcb().build());
			}
		} else {
			if(config.client()==null) { // no client supplied: use the default client object
				if(config.url().toLowerCase().startsWith("https://")) {
					config.client(client4HTTPS);
				} else {
					config.client(client4HTTP);
				}
			}
		}
	}
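The branching in create boils down to a small decision table. The sketch below restates it with JDK types only, so it can be run without the utility library. `ClientSelectionSketch`, `Choice` and `choose` are hypothetical names made up for this illustration, and the two boolean parameters stand in for `config.hcb().isSetPool` and `config.client() != null`.

```java
import java.util.Locale;

/** Hypothetical restatement of the client-selection rules, for illustration only. */
final class ClientSelectionSketch {

    enum Choice { POOLED_HTTPS, POOLED_HTTP, CALLER_SUPPLIED, DEFAULT_HTTPS, DEFAULT_HTTP }

    /** Case-insensitive scheme check, mirroring the toLowerCase().startsWith("https://") test. */
    static boolean isHttps(String url) {
        return url.toLowerCase(Locale.ROOT).startsWith("https://");
    }

    /**
     * A configured pool always wins; otherwise a client already set by the caller is kept;
     * otherwise the default client for the URL scheme is used.
     */
    static Choice choose(String url, boolean poolConfigured, boolean clientAlreadySet) {
        if (poolConfigured) {
            return isHttps(url) ? Choice.POOLED_HTTPS : Choice.POOLED_HTTP;
        }
        if (clientAlreadySet) {
            return Choice.CALLER_SUPPLIED;
        }
        return isHttps(url) ? Choice.DEFAULT_HTTPS : Choice.DEFAULT_HTTP;
    }

    public static void main(String[] args) {
        System.out.println(choose("HTTPS://example.com/a", true, false));
        System.out.println(choose("http://example.com/a", false, true));
        System.out.println(choose("http://example.com/a", false, false));
    }
}
```

One consequence worth noticing: once a pool is configured on the HCB, a client set directly on the config is replaced by the pooled one.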
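As for closing: in the fmt2String and fmt2Stream methods, EntityUtils.toString and EntityUtils.consume already close the input stream and release the resources. Finally, close(HttpClient) is called, which executes client.close(). This way the connection is not closed outright; the connection pool reclaims it automatically and reuses it.
Finally, here is a test class that measures, in groups, how long GET requests and Down (download) operations take with the HTTP connection pool enabled and disabled. The code is as follows: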
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.http.Header;

import com.tgb.ccl.http.common.HttpConfig;
import com.tgb.ccl.http.common.HttpHeader;
import com.tgb.ccl.http.exception.HttpProcessException;
import com.tgb.ccl.http.httpclient.HttpClientUtil;
import com.tgb.ccl.http.httpclient.builder.HCB;

/**
 * Tests enabling the http connection pool
 * 
 * @author arron
 * @date November 7, 2016, 1:08:07 PM
 * @version 1.0
 */
public class TestHttpPool {
	
	// Header settings
	private static final Header[] headers = HttpHeader.custom().userAgent("Mozilla/5.0").from("http://blog.csdn.net/newest.html").build();
	
	// URL list for the GET requests
	private static final String[] urls = {
			"http://blog.csdn.net/xiaoxian8023/article/details/49883113",
			"http://blog.csdn.net/xiaoxian8023/article/details/49909359",
			"http://blog.csdn.net/xiaoxian8023/article/details/49910127",
			"http://blog.csdn.net/xiaoxian8023/article/details/49910885",
			"http://blog.csdn.net/xiaoxian8023/article/details/51606865",
	};
	
	// Image URL list for the Down operation
	private static final String[] imgurls = {
			"http://ss.bdimg.com/static/superman/img/logo/logo_white_fe6da1ec.png",
			"https://scontent-hkg3-1.xx.fbcdn.net/hphotos-xaf1/t39.2365-6/11057093_824152007634067_1766252919_n.png"
	};
	
	private static StringBuffer buf1=new StringBuffer();
	private static StringBuffer buf2=new StringBuffer();
	
	// Multi-threaded GET requests
	public static void testMultiGet(HttpConfig cfg, int count) throws HttpProcessException{
		try {
			int pagecount = urls.length;
			ExecutorService executors = Executors.newFixedThreadPool(pagecount);
			CountDownLatch countDownLatch = new CountDownLatch(count);   
			// Start the fetch tasks
			for(int i = 0; i< count;i++){
				executors.execute(new GetRunnable(countDownLatch,cfg.headers(headers).url(urls[i%pagecount])));
			}
			countDownLatch.await();
			executors.shutdown();
		} catch (InterruptedException e) {
			e.printStackTrace();
		}
	}
	
	// Multi-threaded downloads
	public static void testMultiDown(HttpConfig cfg, int count) throws HttpProcessException{
		try {
			int pagecount = imgurls.length;
			ExecutorService executors = Executors.newFixedThreadPool(pagecount);
			CountDownLatch countDownLatch = new CountDownLatch(count);   
			// Start the download tasks
			for(int i = 0; i< count;i++){
				executors.execute(new GetRunnable(countDownLatch, cfg.url(imgurls[i%pagecount]), new FileOutputStream(new File("d://aaa//"+(i+1)+".png"))));
			}
			countDownLatch.await();
			executors.shutdown();
		} catch (FileNotFoundException e) {
			e.printStackTrace();
		} catch (InterruptedException e) {
			e.printStackTrace();
		}
	}
	
	static class GetRunnable implements Runnable {
		private CountDownLatch countDownLatch;
		private HttpConfig config = null;
		private FileOutputStream out = null;

		public GetRunnable(CountDownLatch countDownLatch,HttpConfig config){
			this(countDownLatch, config, null);
		}
		public GetRunnable(CountDownLatch countDownLatch,HttpConfig config,FileOutputStream out){
			this.countDownLatch = countDownLatch;
			this.config = config;
			this.out = out;
		}
		
		@Override
		public void run() {
			try {
				config.out(out);
				if(config.out()==null){
					// No output stream: plain GET, report the body length
					String response = HttpClientUtil.get(config);
					System.out.println(Thread.currentThread().getName()+"--content length: "+response.length());
				}else{
					// Output stream present: download into it, then flush and close it
					HttpClientUtil.down(config);
					try {
						config.out().flush();
						config.out().close();
					} catch (IOException e) {
						e.printStackTrace();
					}
					System.out.println(Thread.currentThread().getName()+"--download finished");
				}
			} catch (HttpProcessException e) {
				e.printStackTrace();
			} finally {
				countDownLatch.countDown();
			}
		}
	}

	/**
	 * Measures the execution time without the http connection pool, e.g. 100 GETs and 20 downloads
	 * @throws HttpProcessException
	 */
	private static void testNoPool(int getCount, int downCount) throws HttpProcessException {
		long start = System.currentTimeMillis();

		if(getCount>0){
			HttpConfig cfg1 = HttpConfig.custom().client(HCB.custom().build()).headers(headers);
			testMultiGet(cfg1, getCount);
		}
		if(downCount>0){
			HttpConfig cfg2 = HttpConfig.custom().client(HCB.custom().build());
			testMultiDown(cfg2, downCount);
		}
		
		System.out.println("-----All threads finished!------");
		long end = System.currentTimeMillis();
		System.out.println("Total time (ms): -> " + (end - start));
		buf1.append("\t").append((end-start));
	}

	/**
	 * Measures the execution time with the http connection pool, e.g. 100 GETs and 20 downloads
	 * @throws HttpProcessException
	 */
	private static void testByPool(int getCount, int downCount) throws HttpProcessException {
		long start = System.currentTimeMillis();
		
		HCB hcb= HCB.custom().timeout(10000).pool(10, 10).ssl();
		
		if(getCount>0){
			HttpConfig cfg3 = HttpConfig.custom().hcb(hcb);
			testMultiGet(cfg3, getCount);
		}
		if(downCount>0){
			HttpConfig cfg4 = HttpConfig.custom().hcb(hcb);
			testMultiDown(cfg4, downCount);
		}

		System.out.println("-----All threads finished!------");
		long end = System.currentTimeMillis();
		System.out.println("Total time (ms): -> " + (end - start));
		buf2.append("\t").append((end-start));
	}
	
	public static void main(String[] args) throws Exception {
		// Make sure the download directory exists
		File file = new File("d://aaa");
		if(!file.exists()){
			file.mkdir();
		}
		
		//-------------------------------------------
		//  The two methods below are each run with
		//  these (GET count, Down count) pairs:
		//  100,0,200,0,500,0,1000,0
		//  0,10,0,20,0,50,0,100
		//  100,10,200,20,500,50,1000,100
		//-------------------------------------------
		
		int[][] times1 = {{100,0}, {200,0}, {500,0}, {1000,0}};
		int[][] times2 = {{0,10}, {0,20}, {0,50}, {0,100}};
		int[][] times3 = {{100,10}, {200,20}, {500,50}, {1000,100}};
		List<int[][]> list = Arrays.asList(times1,times2,times3);
		int n=5; // repetitions per (GET count, Down count) pair
		
		// Test without the http connection pool
		for (int[][] time : list) {
			buf1.append("\n");
			for (int i = 0; i < time.length; i++) {
				for (int j = 0; j < n; j++) {
					testNoPool(time[i][0],time[i][1]);
					Thread.sleep(100);
					System.gc();
					Thread.sleep(100);
				}
				buf1.append("\n");
			}
		}

		// Test with the http connection pool
		for (int[][] time : list) {
			buf2.append("\n");
			for (int i = 0; i < time.length; i++) {
				for (int j = 0; j < n; j++) {
					testByPool(time[i][0],time[i][1]);
					Thread.sleep(100);
					System.gc();
					Thread.sleep(100);
				}
				buf2.append("\n");
			}
		}
		
		// Print the results to the console, one no-pool line followed by the matching pool line
		String[] results1 = buf1.toString().split("\n");
		String[] results2 = buf2.toString().split("\n");
		
		for (int i = 0; i < results1.length && i < results2.length; i++) {
			System.out.println(results1[i]);
			System.out.println(results2[i]);
		}
	}
}
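The test results are as follows (times in milliseconds):

| Operation | Requests | Pool enabled | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Average | Gain with pool |
|-----------|----------|--------------|-------|-------|-------|-------|-------|---------|----------------|
| GET       | 100      | No           | 4801  | 4807  | 4853  | 4810  | 4522  | 4758.6  | 52.89%         |
|           |          | Yes          | 2146  | 1989  | 2302  | 2355  | 2416  | 2241.6  |                |
|           | 200      | No           | 9222  | 9519  | 9085  | 9196  | 8908  | 9186    | 43.15%         |
|           |          | Yes          | 4992  | 4711  | 4863  | 7001  | 4545  | 5222.4  |                |
|           | 500      | No           | 23727 | 23082 | 23762 | 23427 | 23117 | 23423   | 45.88%         |
|           |          | Yes          | 12146 | 12557 | 12581 | 13121 | 12979 | 12676.8 |                |
|           | 1000     | No           | 47518 | 72445 | 45028 | 52860 | 55764 | 54723   | 48.62%         |
|           |          | Yes          | 25073 | 25067 | 39550 | 26014 | 24888 | 28118.4 |                |
| Down      | 10       | No           | 10605 | 8273  | 9440  | 7774  | 8740  | 8966.4  | 4.37%          |
|           |          | Yes          | 10415 | 7249  | 7331  | 8554  | 9325  | 8574.8  |                |
|           | 20       | No           | 17306 | 18455 | 18811 | 19294 | 15430 | 17859.2 | 2.67%          |
|           |          | Yes          | 17234 | 16028 | 18152 | 17530 | 17971 | 17383   |                |
|           | 50       | No           | 46873 | 41528 | 51085 | 49900 | 40666 | 46010.4 | -2.93%         |
|           |          | Yes          | 44941 | 50376 | 46759 | 43774 | 50951 | 47360.2 |                |
|           | 100      | No           | 89909 | 93065 | 98297 | 88440 | 92010 | 92344.2 | -0.93%         |
|           |          | Yes          | 91420 | 96388 | 94635 | 88424 | 95152 | 93203.8 |                |
| GET, Down | 100, 10  | No           | 15913 | 13465 | 14167 | 15607 | 11566 | 14143.6 | 27.42%         |
|           |          | Yes          | 11805 | 10800 | 8322  | 10735 | 9668  | 10266   |                |
|           | 200, 20  | No           | 26579 | 28744 | 27791 | 29712 | 32360 | 29037.2 | 25.76%         |
|           |          | Yes          | 20891 | 24664 | 19319 | 19511 | 23394 | 21555.8 |                |
|           | 500, 50  | No           | 71462 | 72694 | 74285 | 76207 | 72574 | …       | …              |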
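The gain figures in the results appear to follow the formula (average without pool - average with pool) / average without pool. The sketch below checks that reading against the five GET/100 runs and the Down/50 averages. `PoolGain` and its methods are hypothetical helper names, not part of the utility library.

```java
import java.util.Locale;

/** Hypothetical helper that recomputes the "gain with pool" percentage from the timings. */
final class PoolGain {

    /** Arithmetic mean of the run times in milliseconds. */
    static double average(long... runsMillis) {
        long sum = 0;
        for (long r : runsMillis) {
            sum += r;
        }
        return sum / (double) runsMillis.length;
    }

    /** Relative time saved by pooling, in percent; negative means pooling was slower. */
    static double gainPercent(double avgNoPool, double avgWithPool) {
        return (avgNoPool - avgWithPool) / avgNoPool * 100.0;
    }

    /** Formats with two decimals and a fixed locale so the decimal separator is always a dot. */
    static String format(double percent) {
        return String.format(Locale.ROOT, "%.2f%%", percent);
    }

    public static void main(String[] args) {
        double noPool = average(4801, 4807, 4853, 4810, 4522);   // GET x100, pool disabled
        double withPool = average(2146, 1989, 2302, 2355, 2416); // GET x100, pool enabled
        System.out.println(format(gainPercent(noPool, withPool)));   // prints 52.89%
        System.out.println(format(gainPercent(46010.4, 47360.2)));   // Down x50: prints -2.93%
    }
}
```

Read this way, pooling roughly halves the time for the GET batches, while the download batches show differences of only a few percent in either direction.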
