HttpHelper

Posted 12不懂三


using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;
using System.Threading.Tasks;

public class HttpHelper
{
    public static async Task<T> GetAsync<T>(string url, string postData = null, string contentType = null, int timeOut = 30, Dictionary<string, string> headers = null)
    {
        return await RequestAsync<T>(url, "GET", postData, contentType, timeOut, headers);
    }

    public static async Task<T> PostAsync<T>(string url, string postData = null, string contentType = null, int timeOut = 30, Dictionary<string, string> headers = null)
    {
        return await RequestAsync<T>(url, "POST", postData, contentType, timeOut, headers);
    }

    public static async Task<T> PutAsync<T>(string url, string postData = null, string contentType = null, int timeOut = 30, Dictionary<string, string> headers = null)
    {
        return await RequestAsync<T>(url, "PUT", postData, contentType, timeOut, headers);
    }

    public static async Task<T> DeleteAsync<T>(string url, string postData = null, string contentType = null, int timeOut = 30, Dictionary<string, string> headers = null)
    {
        return await RequestAsync<T>(url, "DELETE", postData, contentType, timeOut, headers);
    }

    // Sends the request and deserializes the JSON response body into T.
    private static async Task<T> RequestAsync<T>(string url, string method, string postData = null, string contentType = null, int timeOut = 30,
        Dictionary<string, string> headers = null)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = method;
        // timeOut is given in seconds; Timeout expects milliseconds.
        // On .NET Framework this setting is documented as not applying to
        // asynchronous requests, so it may not bound GetResponseAsync.
        request.Timeout = timeOut * 1000;

        if (!string.IsNullOrEmpty(contentType))
        {
            request.ContentType = contentType;
        }

        if (headers != null)
        {
            foreach (var header in headers)
            {
                request.Headers[header.Key] = header.Value;
            }
        }

        // A request body is only written for POST and PUT.
        if (!string.IsNullOrEmpty(postData) && (method == "POST" || method == "PUT"))
        {
            byte[] dataBytes = Encoding.UTF8.GetBytes(postData);
            request.ContentLength = dataBytes.Length;

            using (Stream stream = await request.GetRequestStreamAsync())
            {
                await stream.WriteAsync(dataBytes, 0, dataBytes.Length);
            }
        }

        using (HttpWebResponse response = (HttpWebResponse)await request.GetResponseAsync())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            string responseBody = await reader.ReadToEndAsync();
            return Newtonsoft.Json.JsonConvert.DeserializeObject<T>(responseBody);
        }
    }
}

 

Scraping a Million Zhihu Users: The Evolution of HttpHelper

 

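What is HttpHelper?

    HttpHelper is a ready-made utility class for fetching resources over the network. It speaks HTTP, hence the name.

Why HttpHelper exists

  WebClient makes it easy to fetch a resource from the network, for example:

    WebClient client = new WebClient();
    string html = client.DownloadString("https://www.baidu.com/");

That is enough to get the source of the Baidu home page. But WebClient hides so much that it is sometimes not flexible enough, and when you need finer control over the lower-level details it is time to build your own fetching tool.

HttpHelper: the basics

  Let's start building our own download tool. At first it looks like this: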

public class HttpHelp
{
    public static string DownLoadString(string url)
    {
        string Source = string.Empty;
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            using (Stream stream = response.GetResponseStream())
            {
                using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
                {
                    Source = reader.ReadToEnd();
                }
            }
        }
        return Source;
    }
}
Programs always run into exceptions of one kind or another, so the next step is to add a try/catch:
public class HttpHelp
{
    public static string DownLoadString(string url)
    {
        string Source = string.Empty;
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                using (Stream stream = response.GetResponseStream())
                {
                    using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
                    {
                        Source = reader.ReadToEnd();
                    }
                }
            }
        }
        catch
        {
            // Any failure is logged and an empty string is returned.
            Console.WriteLine("Request failed, URL: {0}", url);
        }
        return Source;
    }
}

Fetching resources is I/O-bound and slow, so the call should be made asynchronous:
public static async Task<string> DownLoadString(string url)
{
    // Task.Run moves the blocking request onto a thread-pool thread.
    return await Task.Run(() =>
    {
        string Source = string.Empty;
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            {
                using (Stream stream = response.GetResponseStream())
                {
                    using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
                    {
                        Source = reader.ReadToEnd();
                    }
                }
            }
        }
        catch
        {
            Console.WriteLine("Request failed, URL: {0}", url);
        }
        return Source;
    });
}
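
Wrapping a blocking GetResponse call in Task.Run still occupies a thread-pool thread for the whole request. As a minimal sketch of an alternative, the same method can be written on top of the awaitable calls that WebRequest and StreamReader already provide (GetResponseAsync and ReadToEndAsync). The class name AsyncHttpHelp and the method name DownLoadStringAsync are assumptions made for illustration, not part of the original helper.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Threading.Tasks;

// Hypothetical class name, used only for this sketch.
public static class AsyncHttpHelp
{
    public static async Task<string> DownLoadStringAsync(string url)
    {
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            // Awaiting the response does not block a thread while the server works.
            using (WebResponse response = await request.GetResponseAsync())
            using (Stream stream = response.GetResponseStream())
            using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
            {
                return await reader.ReadToEndAsync();
            }
        }
        catch (Exception ex)
        {
            // Same policy as the synchronous version: log and return an empty string.
            Console.WriteLine("Request failed, URL: {0} ({1})", url, ex.Message);
            return string.Empty;
        }
    }
}
```

A caller would then write `string html = await AsyncHttpHelp.DownLoadStringAsync(url);`.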

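Refining HttpHelper
To fool the server into treating the request as if it came from a browser, set a User-Agent:

   request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0";

 Some resources require authorization, so the request has to pose as a particular user. HTTP is stateless and the identifying information lives in the cookie, so attach a cookie to the request:

    request.Headers.Add("Cookie", "paste the cookie copied from your browser here");

 One more refinement: set a timeout (in milliseconds).

   request.Timeout = 5000;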
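
The three settings above can be gathered into one place so every request is configured the same way. The following is a minimal sketch; the RequestFactory class, its Create method, and the parameter names are hypothetical and only illustrate the idea. No network traffic happens until the request is actually sent.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of the original HttpHelper.
public static class RequestFactory
{
    public static HttpWebRequest Create(string url, string cookie = null, int timeoutMs = 5000)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        // Present the request as coming from a desktop browser.
        request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0";
        // Timeout is expressed in milliseconds.
        request.Timeout = timeoutMs;
        // The cookie string is whatever was copied from a logged-in browser session.
        if (!string.IsNullOrEmpty(cookie))
        {
            request.Headers.Add("Cookie", cookie);
        }
        return request;
    }
}
```

For example, `RequestFactory.Create("https://www.baidu.com/", "name=value")` returns a request that already carries the User-Agent, cookie, and 5-second timeout.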

 

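Some sites serve GZIP-compressed content to save bandwidth, so add an Accept-Encoding request header listing the encodings the helper can decode:
     request.Headers.Add("Accept-Encoding", "gzip, deflate");
The response stream then has to be decompressed to match, and HttpHelper now looks like this: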
public static string DownLoadString(string url)
{
    string Source = string.Empty;
    try
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0";
        request.Headers.Add("Cookie", "your cookie here");
        // Only advertise encodings that are decoded below.
        request.Headers.Add("Accept-Encoding", "gzip, deflate");
        request.KeepAlive = true; // enable persistent connections
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            using (Stream dataStream = response.GetResponseStream())
            {
                string contentEncoding = (response.ContentEncoding ?? string.Empty).ToLower();
                if (contentEncoding.Contains("gzip")) // decompress gzip
                {
                    using (GZipStream stream = new GZipStream(dataStream, CompressionMode.Decompress))
                    {
                        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
                        {
                            Source = reader.ReadToEnd();
                        }
                    }
                }
                else if (contentEncoding.Contains("deflate")) // decompress deflate
                {
                    using (DeflateStream stream = new DeflateStream(dataStream, CompressionMode.Decompress))
                    {
                        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
                        {
                            Source = reader.ReadToEnd();
                        }
                    }
                }
                else // uncompressed body
                {
                    using (StreamReader reader = new StreamReader(dataStream, Encoding.UTF8))
                    {
                        Source = reader.ReadToEnd();
                    }
                }
            }
        }
        request.Abort();
    }
    catch
    {
        Console.WriteLine("Request failed, URL: {0}", url);
    }
    return Source;
}
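
The decompression step can also be isolated from the networking code, which makes it easy to check without a server. Below is a minimal sketch under that assumption; BodyDecoder and Decode are hypothetical names introduced here, and the method takes the raw response stream plus the value of the Content-Encoding header.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper, used only to illustrate the decompression step.
public static class BodyDecoder
{
    public static string Decode(Stream body, string contentEncoding)
    {
        string encoding = (contentEncoding ?? string.Empty).ToLowerInvariant();

        // Wrap the raw stream in the matching decompressor, if any.
        Stream decoded = body;
        if (encoding.Contains("gzip"))
        {
            decoded = new GZipStream(body, CompressionMode.Decompress);
        }
        else if (encoding.Contains("deflate"))
        {
            decoded = new DeflateStream(body, CompressionMode.Decompress);
        }

        // Disposing the reader also disposes the wrapped streams.
        using (StreamReader reader = new StreamReader(decoded, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Inside the helper this would be called as `BodyDecoder.Decode(response.GetResponseStream(), response.ContentEncoding)`. HttpWebRequest also has an AutomaticDecompression property (for example `DecompressionMethods.GZip | DecompressionMethods.Deflate`) that asks the framework to do this work itself.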

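Requests sent too frequently get rejected by the server with a 429 status. This is where a proxy comes in: our request is submitted to the proxy server, the proxy server requests the target server, and the proxy relays the response back to us. As long as the program keeps switching proxies, the server will not reject its requests for being too frequent.
   var proxy = new WebProxy("Address", 8080); // second argument is the port
   request.Proxy = proxy; // route this HttpWebRequest through the proxy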
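
Switching proxies can be as simple as cycling through a fixed list. The following is a minimal round-robin sketch; the ProxyRotator class is hypothetical, and the proxy addresses a caller passes in are placeholders that have to be replaced with proxies you actually control.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Threading;

// Hypothetical helper that hands out proxies in round-robin order.
public sealed class ProxyRotator
{
    private readonly List<WebProxy> _proxies = new List<WebProxy>();
    private int _next = -1;

    public ProxyRotator(IEnumerable<string> addresses)
    {
        foreach (string address in addresses)
        {
            // Each address is a URI string such as "http://host:port".
            _proxies.Add(new WebProxy(address));
        }
        if (_proxies.Count == 0)
        {
            throw new ArgumentException("At least one proxy is required.", nameof(addresses));
        }
    }

    public WebProxy Next()
    {
        // Interlocked keeps the counter consistent when requests run concurrently.
        int index = Interlocked.Increment(ref _next);
        // The unsigned cast keeps the index valid even after the counter wraps around.
        return _proxies[(int)((uint)index % (uint)_proxies.Count)];
    }
}
```

Each request then picks up the next proxy with `request.Proxy = rotator.Next();`.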

 

As for how to obtain proxies, see a later blog post.


 

 

 

 

 

 

 

 

   

 
