How to solve the 403 Forbidden error in C# when downloading HTML code from web pages


I want to download the HTML code of 10,000 web pages, but some of them return error 403 Forbidden. What code can I add to solve this?

My code:
// Requires: using System; using System.IO; using System.Net;
string url = "http://en.wiktionary.org/wiki/ce";
HttpWebRequest request;
HttpWebResponse response;
StreamReader sr = null;
string htmlCode = "";
try
{
    request = (HttpWebRequest)WebRequest.Create(url);

    response = (HttpWebResponse)request.GetResponse();
    sr = new StreamReader(response.GetResponseStream());
    htmlCode = sr.ReadToEnd();
    if (htmlCode == "")
        Console.Write("cannot get HTMLCode.\n");
    else
    {
        Console.Write("get HTMLCode, code length = " + htmlCode.Length + "\n");
        Console.Write(htmlCode);
    }
    response.Close();
}
catch (Exception e) { Console.Write(e.Message); }

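I think adding a few statements in the middle that set the browser information should be enough. High points offered for help.
Once it is fixed, please test it with these links: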
http://en.wiktionary.org/wiki/ce
http://en.wiktionary.org/wiki/ne
http://en.wiktionary.org/wiki/sont
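
One way to read the "set the browser" idea in the question is to send a User-Agent header: HttpWebRequest sends none unless `UserAgent` is assigned, and some sites refuse requests that arrive without one. The following is only a minimal sketch under the assumption that a missing User-Agent is what triggers the 403 here. The class name `PageFetcher`, its method names, and the User-Agent string are hypothetical placeholders, not part of the original code.

```csharp
using System;
using System.IO;
using System.Net;

public static class PageFetcher
{
    // Hypothetical identifier; replace it with a string that describes your client.
    public const string UserAgent = "MyHtmlDownloader/1.0";

    // Builds the request without sending it, so the settings can be inspected.
    public static HttpWebRequest CreateRequest(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = UserAgent; // sent as the User-Agent header
        request.Timeout = 20000;       // milliseconds
        return request;
    }

    // Sends the request and returns the response body as text.
    public static string Download(string url)
    {
        HttpWebRequest request = CreateRequest(url);
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling `PageFetcher.Download("http://en.wiktionary.org/wiki/ce")` follows the same path as the code in the question, with the header added. Whether that clears the 403 depends on why the server refused the request in the first place.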

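Reference answer A: A simple approach is to check whether the returned string contains "error 403 Forbidden".

Reference answer B: Get the status of the response.
HttpWebResponse.StatusCode
A 403 error is HttpStatusCode.Forbidden
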
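Reference answer C:
// Returns the HTTP status code carried by a WebException, or 500 if there is none.
public virtual int GetResponseCode(WebException exception)
{
    if (exception.Status == WebExceptionStatus.ProtocolError && exception.Response != null &&
        exception.Response is HttpWebResponse)
    {
        return (int)(((HttpWebResponse)exception.Response).StatusCode);
    }

    return 500;
}

Catch the WebException with try/catch, then check the code.

Reference: I'm new here, any pointers are welcome.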
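
The try/catch approach described in answer C can be sketched as follows for a batch download, where a 403 on one page should not stop the rest. `GetResponse` throws a `WebException` for error statuses such as 403, so the status has to be read from the exception. The names `StatusHelper`, `GetStatus`, and `TryDownload` are hypothetical, and returning null on failure is just one possible convention.

```csharp
using System;
using System.IO;
using System.Net;

public static class StatusHelper
{
    // Returns the HTTP status carried by the exception, or null when the
    // failure was not an HTTP response (DNS failure, timeout, and so on).
    public static HttpStatusCode? GetStatus(WebException ex)
    {
        var response = ex.Response as HttpWebResponse;
        if (ex.Status == WebExceptionStatus.ProtocolError && response != null)
            return response.StatusCode;
        return null;
    }

    // Tries one URL; returns the HTML, or null when the download failed.
    public static string TryDownload(string url)
    {
        try
        {
            var request = (HttpWebRequest)WebRequest.Create(url);
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                return reader.ReadToEnd();
            }
        }
        catch (WebException ex)
        {
            HttpStatusCode? status = GetStatus(ex);
            if (status == HttpStatusCode.Forbidden)
            {
                Console.WriteLine("403 Forbidden, skipping: " + url);
            }
            else
            {
                string reason = status.HasValue
                    ? ((int)status.Value).ToString()
                    : ex.Status.ToString();
                Console.WriteLine("Failed (" + reason + "): " + url);
            }
            return null;
        }
    }
}
```

A loop over the 10,000 URLs could call `StatusHelper.TryDownload(url)` and simply continue when it gets null back. Unlike answer C, this sketch reports "no HTTP status" as null rather than as 500, so network failures are not mistaken for server errors.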

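Reference answer D:
// Download a web page (body of a WinForms event handler)
if (this.textBox1.Text == "" || this.textBox2.Text == "")
    return;
string FileName = this.textBox2.Text.Trim();
string URL = this.textBox1.Text.Trim();
// Prepend "http://" when the URL has no scheme
if (!URL.StartsWith("http://", StringComparison.OrdinalIgnoreCase) &&
    !URL.StartsWith("https://", StringComparison.OrdinalIgnoreCase))
{
    URL = @"http://" + URL;
}
HttpWebResponse MyResponse = null;
Stream MyInStream = null;
FileStream MyFileStream = null;
try
{
    HttpWebRequest MyRequest = (HttpWebRequest)WebRequest.Create(URL);
    // Send the request and get the response; a 403 throws here and is reported by the catch block
    MyResponse = (HttpWebResponse)MyRequest.GetResponse();
    MyInStream = MyResponse.GetResponseStream();
    long fileSizeInBytes = MyResponse.ContentLength;
    // Create the file stream object
    MyFileStream = new FileStream(FileName, FileMode.OpenOrCreate, FileAccess.Write);
    int length = 1024;
    byte[] buffer = new byte[length];
    int bytesread = 0;
    string strtemp = "";
    // Read data from the network
    while ((bytesread = MyInStream.Read(buffer, 0, length)) > 0)
    {
        // Write the data to the file
        MyFileStream.Write(buffer, 0, bytesread);
        strtemp += System.Text.Encoding.Default.GetString(buffer, 0, bytesread);
    }
    MessageBox.Show("Page downloaded successfully!", "Information", MessageBoxButtons.OK, MessageBoxIcon.Information);
}
catch (Exception Err)
{
    MessageBox.Show("Failed to download the page! The error is: " + Err.Message, "Information", MessageBoxButtons.OK, MessageBoxIcon.Information);
}
finally
{
    // Close the streams and the response
    if (MyInStream != null)
    {
        MyInStream.Close();
    }
    if (MyFileStream != null)
    {
        MyFileStream.Close();
    }
    if (MyResponse != null)
    {
        MyResponse.Close();
    }
}

Adapt it yourself; I'm off to bed. Nobody is more of a night owl than me~!!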

Getting a page's source code with HttpWebRequest in ASP.NET C#

// Returns the page HTML, or an empty string on any failure or non-200 status.
public string GetPage(string url)
{
    HttpWebRequest request = null;
    HttpWebResponse response = null;
    StreamReader reader = null;

    try
    {
        request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = 20000;            // milliseconds
        request.AllowAutoRedirect = false;  // redirects are not followed

        response = (HttpWebResponse)request.GetResponse();

        // Only accept 200 OK responses whose declared length is under 1 MB
        if (response.StatusCode == HttpStatusCode.OK && response.ContentLength < 1024 * 1024)
        {
            reader = new StreamReader(response.GetResponseStream(), System.Text.Encoding.Default);
            string html = reader.ReadToEnd();
            return html;
        }
    }
    catch
    {
        // Errors (including a 403 WebException) are swallowed; the caller gets an empty string
    }
    finally
    {
        if (response != null)
        {
            response.Close();
            response = null;
        }

        if (reader != null)
            reader.Close();

        if (request != null)
            request = null;
    }

    return string.Empty;
}
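
`GetPage` decodes every page with `System.Text.Encoding.Default`, which depends on the machine it runs on and may not match the page. As a hedged alternative, the encoding could be chosen from the charset the server declares (available as `HttpWebResponse.CharacterSet`). The helper below is a sketch only: `CharsetHelper` and `Resolve` are hypothetical names, and falling back to UTF-8 is an assumption rather than something the original code does.

```csharp
using System;
using System.Text;

public static class CharsetHelper
{
    // Maps a charset name (for example HttpWebResponse.CharacterSet) to an
    // Encoding, falling back to UTF-8 when the name is missing or unknown.
    public static Encoding Resolve(string charset)
    {
        if (string.IsNullOrWhiteSpace(charset))
            return Encoding.UTF8;
        try
        {
            // Some servers wrap the charset name in quotes; strip them first.
            return Encoding.GetEncoding(charset.Trim().Trim('"'));
        }
        catch (ArgumentException)
        {
            // Unknown or unsupported charset name
            return Encoding.UTF8;
        }
    }
}
```

Inside `GetPage`, the reader could then be built with `CharsetHelper.Resolve(response.CharacterSet)` in place of `System.Text.Encoding.Default`. The declared charset can itself be absent or wrong, so this reduces garbled text rather than ruling it out.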
