C# Crawlers, The Way of Breaking Through: Realm One, Crawler Principles. Section 3: WebResponse
Posted by mikecheers
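In Section 2 we introduced WebRequest, which lets us send a request. As the saying goes, "it is impolite not to reciprocate": once the other side has received our request, it would hardly be proper for it to send nothing back (although some thick-skinned servers do exactly that :P).
Next, let's take a close look at what the reply we receive actually looks like.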
[Code 1.3.1]
//
// Summary:
//     Provides a response from a Uniform Resource Identifier (URI). This is an abstract
//     class.
public abstract class WebResponse : MarshalByRefObject, ISerializable, IDisposable
{
    //
    // Summary:
    //     Initializes a new instance of the System.Net.WebResponse class.
    protected WebResponse();
    //
    // Summary:
    //     Initializes a new instance of the System.Net.WebResponse class from the specified
    //     instances of the System.Runtime.Serialization.SerializationInfo and System.Runtime.Serialization.StreamingContext
    //     classes.
    //
    // Parameters:
    //   serializationInfo:
    //     An instance of the System.Runtime.Serialization.SerializationInfo class that
    //     contains the information required to serialize the new System.Net.WebRequest
    //     instance.
    //
    //   streamingContext:
    //     An instance of the System.Runtime.Serialization.StreamingContext class that indicates
    //     the source of the serialized stream that is associated with the new System.Net.WebRequest
    //     instance.
    //
    // Exceptions:
    //   T:System.NotSupportedException:
    //     Any attempt is made to access the constructor, when the constructor is not overridden
    //     in a descendant class.
    protected WebResponse(SerializationInfo serializationInfo, StreamingContext streamingContext);

    //
    // Summary:
    //     Gets a System.Boolean value that indicates whether this response was obtained
    //     from the cache.
    //
    // Returns:
    //     true if the response was taken from the cache; otherwise, false.
    public virtual bool IsFromCache { get; }
    //
    // Summary:
    //     Gets a System.Boolean value that indicates whether mutual authentication occurred.
    //
    // Returns:
    //     true if both client and server were authenticated; otherwise, false.
    public virtual bool IsMutuallyAuthenticated { get; }
    //
    // Summary:
    //     When overridden in a descendant class, gets or sets the content length of data
    //     being received.
    //
    // Returns:
    //     The number of bytes returned from the Internet resource.
    //
    // Exceptions:
    //   T:System.NotSupportedException:
    //     Any attempt is made to get or set the property, when the property is not overridden
    //     in a descendant class.
    public virtual long ContentLength { get; set; }
    //
    // Summary:
    //     When overridden in a derived class, gets or sets the content type of the data
    //     being received.
    //
    // Returns:
    //     A string that contains the content type of the response.
    //
    // Exceptions:
    //   T:System.NotSupportedException:
    //     Any attempt is made to get or set the property, when the property is not overridden
    //     in a descendant class.
    public virtual string ContentType { get; set; }
    //
    // Summary:
    //     When overridden in a derived class, gets the URI of the Internet resource that
    //     actually responded to the request.
    //
    // Returns:
    //     An instance of the System.Uri class that contains the URI of the Internet resource
    //     that actually responded to the request.
    //
    // Exceptions:
    //   T:System.NotSupportedException:
    //     Any attempt is made to get or set the property, when the property is not overridden
    //     in a descendant class.
    public virtual Uri ResponseUri { get; }
    //
    // Summary:
    //     When overridden in a derived class, gets a collection of header name-value pairs
    //     associated with this request.
    //
    // Returns:
    //     An instance of the System.Net.WebHeaderCollection class that contains header
    //     values associated with this response.
    //
    // Exceptions:
    //   T:System.NotSupportedException:
    //     Any attempt is made to get or set the property, when the property is not overridden
    //     in a descendant class.
    public virtual WebHeaderCollection Headers { get; }
    //
    // Summary:
    //     Gets a value that indicates if headers are supported.
    //
    // Returns:
    //     Returns System.Boolean. true if headers are supported; otherwise, false.
    public virtual bool SupportsHeaders { get; }

    //
    // Summary:
    //     When overridden by a descendant class, closes the response stream.
    //
    // Exceptions:
    //   T:System.NotSupportedException:
    //     Any attempt is made to access the method, when the method is not overridden in
    //     a descendant class.
    public virtual void Close();
    //
    // Summary:
    //     Releases the unmanaged resources used by the System.Net.WebResponse object.
    public void Dispose();
    //
    // Summary:
    //     When overridden in a descendant class, returns the data stream from the Internet
    //     resource.
    //
    // Returns:
    //     An instance of the System.IO.Stream class for reading data from the Internet
    //     resource.
    //
    // Exceptions:
    //   T:System.NotSupportedException:
    //     Any attempt is made to access the method, when the method is not overridden in
    //     a descendant class.
    public virtual Stream GetResponseStream();
    //
    // Summary:
    //     Releases the unmanaged resources used by the System.Net.WebResponse object, and
    //     optionally disposes of the managed resources.
    //
    // Parameters:
    //   disposing:
    //     true to release both managed and unmanaged resources; false to releases only
    //     unmanaged resources.
    protected virtual void Dispose(bool disposing);
    //
    // Summary:
    //     Populates a System.Runtime.Serialization.SerializationInfo with the data that
    //     is needed to serialize the target object.
    //
    // Parameters:
    //   serializationInfo:
    //     The System.Runtime.Serialization.SerializationInfo to populate with data.
    //
    //   streamingContext:
    //     A System.Runtime.Serialization.StreamingContext that specifies the destination
    //     for this serialization.
    protected virtual void GetObjectData(SerializationInfo serializationInfo, StreamingContext streamingContext);
}
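Compared with WebRequest, WebResponse is much simpler, so this time we won't break it into fragments and go through them one by one. Let's stick to the facts and not judge a book by its cover :D
First, like WebRequest, it is an abstract class with the same overall shape: two constructors, a batch of virtual properties (most of them read-only), and a batch of virtual methods.
Second, unlike WebRequest, it implements the System.IDisposable interface, a reminder that we must release it once we are done with it.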
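To make the IDisposable point concrete, here is a minimal offline sketch. `TrackedResponse` and `DisposeSketch` are hypothetical names invented for this illustration; they are not part of the framework. The sketch assumes that the base class's `Dispose` forwards to the virtual `Close`, which is how the framework implementations I am aware of behave, so wrapping a response in `using` should be enough to release it.

```csharp
using System;
using System.Net;

// Hypothetical stand-in for a real response, used only to observe disposal.
public sealed class TrackedResponse : WebResponse
{
    public bool Closed { get; private set; }

    // Close is the virtual hook that concrete responses override
    // to release their underlying connection and stream.
    public override void Close()
    {
        Closed = true;
    }
}

public static class DisposeSketch
{
    // Returns whether Close was reached by the time the using block ended.
    public static bool ClosedAfterUsing()
    {
        var response = new TrackedResponse();
        using (response)
        {
            // Work with the response here.
        }
        return response.Closed;
    }
}
```

With a real request the same pattern would read `using (WebResponse response = request.GetResponse()) { ... }`, so the response is released even if the code inside the block throws.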
A few properties
[Code 1.3.2]
/// <summary>
/// Read-only. Indicates whether this response was obtained from the cache.
/// </summary>
public virtual bool IsFromCache { get; }
/// <summary>
/// Read-only. Indicates whether mutual authentication occurred.
/// </summary>
public virtual bool IsMutuallyAuthenticated { get; }
/// <summary>
/// The length of the data being received.
/// </summary>
public virtual long ContentLength { get; set; }
/// <summary>
/// The content type of the data being received.
/// </summary>
public virtual string ContentType { get; set; }
/// <summary>
/// Read-only. The URI of the resource that actually responded to the request.
/// </summary>
public virtual Uri ResponseUri { get; }
/// <summary>
/// Read-only. The header collection, a key part of the HTTP protocol.
/// </summary>
public virtual WebHeaderCollection Headers { get; }
/// <summary>
/// Read-only. Indicates whether the header collection is supported.
/// </summary>
public virtual bool SupportsHeaders { get; }
The important properties are Headers, ContentType, and ContentLength. All three carry real weight when analyzing the data.
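As a rough illustration of why these three properties matter, the sketch below prints them and pulls the charset out of a Content-Type value. `ResponseInfoSketch`, `GetCharset`, and `PrintSummary` are hypothetical helpers written for this example, and the charset parsing is deliberately simplistic (it only looks for a `charset=` parameter).

```csharp
using System;
using System.Net;

public static class ResponseInfoSketch
{
    // Hypothetical helper: extracts the charset parameter from a Content-Type value,
    // for example "text/html; charset=utf-8". Returns null when none is declared.
    public static string GetCharset(string contentType)
    {
        if (string.IsNullOrEmpty(contentType)) return null;
        foreach (string part in contentType.Split(';'))
        {
            string trimmed = part.Trim();
            if (trimmed.StartsWith("charset=", StringComparison.OrdinalIgnoreCase))
            {
                // Drop the "charset=" prefix and any surrounding quotes.
                return trimmed.Substring("charset=".Length).Trim('"', ' ');
            }
        }
        return null;
    }

    // Prints the properties highlighted in the text for any WebResponse.
    public static void PrintSummary(WebResponse response)
    {
        Console.WriteLine("ContentType:   " + response.ContentType);
        Console.WriteLine("ContentLength: " + response.ContentLength);
        Console.WriteLine("Charset:       " + (GetCharset(response.ContentType) ?? "(not declared)"));

        // Not every WebResponse subclass exposes headers, so check first.
        if (response.SupportsHeaders)
        {
            foreach (string name in response.Headers.AllKeys)
            {
                Console.WriteLine(name + ": " + response.Headers[name]);
            }
        }
    }
}
```

A crawler would typically call something like `PrintSummary(response)` inside the `using` block that owns the response, then use the charset to decide how to decode the body.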
A few methods
[Code 1.3.3]
/// <summary>
/// Gets the data stream of the target resource.
/// </summary>
public virtual Stream GetResponseStream();
/// <summary>
/// Closes the response stream.
/// </summary>
public virtual void Close();
/// <summary>
/// Releases the WebResponse object.
/// </summary>
public void Dispose();
/// <summary>
/// Releases the WebResponse object; disposing indicates whether managed resources are released as well.
/// </summary>
protected virtual void Dispose(bool disposing);
/// <summary>
/// Populates a System.Runtime.Serialization.SerializationInfo with the data needed to serialize the target object.
/// </summary>
protected virtual void GetObjectData(SerializationInfo serializationInfo, StreamingContext streamingContext);
The important methods are GetResponseStream and Close. Both are essential for retrieving the data.
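All in all, WebResponse is fairly simple and leaves us few pitfalls. As with WebRequest, dry explanation is tasteless to chew on, so let's get a real feel for it in the worked examples :P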