How to copy all text from a web page and save it to Notepad in C#

Posted 2023-03-06

【Title】: how to copy all text from a certain webpage and save it to notepad C# 【Published】: 2012-10-29 09:13:24 【Question】:

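I have a C# Windows Forms application that opens a web page when certain conditions are met.

Now I want the application to automatically copy all the text from that page (it is in CSV format), paste it into Notepad, and save it.

Here is a link to a sample of the data that needs to be copied: http://www.wunderground.com/history/airport/FAJS/2012/10/28/DailyHistory.html?req_city=Johannesburg&req_state=&req_statename=South+Africa&format=1

Any help would be greatly appreciated.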

【Discussion】:

【Answer 1】:

You can use HttpClient, the new toy in .NET 4.5. For example, to fetch the Google page and save it:

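 // Requires: using System.IO; using System.Net.Http;
 var httpClient = new HttpClient();
 // .Result blocks the calling thread until the download completes.
 File.WriteAllText("C:\\google.txt",
                   httpClient.GetStringAsync("http://www.google.com").Result);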
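
Blocking on `.Result` ties up the calling thread, which in a Windows Forms application is usually the UI thread. A minimal sketch of the same download with `async`/`await` might look like the following. The names `PageSaver` and `SavePageAsync` are hypothetical and chosen only for illustration, and the sketch assumes .NET 4.5 or later.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original answer.
public static class PageSaver
{
    // A single HttpClient instance is reused for all requests.
    private static readonly HttpClient Client = new HttpClient();

    // Downloads the response body at 'url' as a string and writes it to 'path'.
    public static async Task SavePageAsync(string url, string path)
    {
        string body = await Client.GetStringAsync(url);
        File.WriteAllText(path, body);
    }
}
```

From an event handler marked `async`, this could be called as `await PageSaver.SavePageAsync(url, @"C:\Temp\weather.txt");`, where the output path is again only an example.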

【Discussion】:

@Mr_Green: Yes, it still works, but you need to install a library that contains HttpClient from NuGet.

【Answer 2】:

http://msdn.microsoft.com/en-us/library/fhd1f0sw.aspx combined with http://www.dotnetspider.com/resources/21720-Writing-string-content-file.aspx

// Requires: using System.IO; using System.Net; using System.Text;
public static void DownloadString()
{
    WebClient client = new WebClient();
    string reply = client.DownloadString("http://www.wunderground.com/history/airport/FAJS/2012/10/28/DailyHistory.html?req_city=Johannesburg&req_state=&req_statename=South+Africa&format=1");

    // A string cannot be assigned to a StringBuilder directly; pass it to the constructor.
    StringBuilder stringData = new StringBuilder(reply);
    FileStream fs = new FileStream(@"C:\Temp\tmp.txt", FileMode.Create);
    byte[] buffer = new byte[stringData.Length];
    for (int i = 0; i < stringData.Length; i++)
    {
        // The cast keeps only the low 8 bits of each char, so this is safe for ASCII text only.
        buffer[i] = (byte)stringData[i];
    }
    fs.Write(buffer, 0, buffer.Length);
    fs.Close();
}

Edit: Adil uses the WriteAllText method, which works better. So you end up with something like this:

WebClient client = new WebClient();
string reply = client.DownloadString("http://www.wunderground.com/history/airport/FAJS/2012/10/28/DailyHistory.html?req_city=Johannesburg&req_state=&req_statename=South+Africa&format=1");
System.IO.File.WriteAllText (@"C:\Temp\tmp.txt", reply);
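
The question also asks for the text to end up in Notepad. One way to sketch that, assuming a Windows machine where `notepad.exe` is on the path, is to save the file first and then launch Notepad on it. The class name `NotepadExample` and the output file name are hypothetical; the URL is the one from the question.

```csharp
using System.Diagnostics;
using System.IO;
using System.Net;

// Hypothetical example class, not part of the original answer.
public static class NotepadExample
{
    public static void Main()
    {
        // URL taken from the question; the output location is an assumption.
        string url = "http://www.wunderground.com/history/airport/FAJS/2012/10/28/DailyHistory.html?req_city=Johannesburg&req_state=&req_statename=South+Africa&format=1";
        string path = Path.Combine(Path.GetTempPath(), "weather.txt");

        // Download the page body and save it in one step.
        using (var client = new WebClient())
        {
            File.WriteAllText(path, client.DownloadString(url));
        }

        // Open the saved file in Notepad; the path is quoted in case it contains spaces.
        Process.Start("notepad.exe", "\"" + path + "\"");
    }
}
```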

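【Discussion】:

Thank you very much. This is actually the simplest and quickest way. Thanks.

【Answer 3】:

The simple way: use WebClient.DownloadFile and save the result as a .txt file: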

var webClient = new WebClient();
webClient.DownloadFile("http://www.google.com",@"c:\google.txt");

【Discussion】:

【Answer 4】:

You need WebRequest to read the response stream into a string. You can then use File.WriteAllText to write that string to a text file.

// Requires: using System; using System.IO; using System.Net;
WebRequest request = WebRequest.Create("http://www.contoso.com/default.html");
request.Credentials = CredentialCache.DefaultCredentials;
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
Console.WriteLine(response.StatusDescription);
Stream dataStream = response.GetResponseStream();
StreamReader reader = new StreamReader(dataStream);
string responseFromServer = reader.ReadToEnd();
// Release the reader and the underlying connection once the body has been read.
reader.Close();
response.Close();
System.IO.File.WriteAllText(@"D:\path.txt", responseFromServer);
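
The WebRequest steps above can also be wrapped in `using` blocks so that the response, stream, and reader are released even if reading fails. This is a minimal sketch under that assumption; `RequestExample` and `Fetch` are hypothetical names, and the output file name is made up for illustration.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical example class, not part of the original answer.
public static class RequestExample
{
    // Downloads 'url' and returns the response body as a string.
    public static string Fetch(string url)
    {
        WebRequest request = WebRequest.Create(url);
        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        try
        {
            string body = Fetch("http://www.contoso.com/default.html");
            File.WriteAllText(Path.Combine(Path.GetTempPath(), "page.txt"), body);
        }
        catch (WebException ex)
        {
            // Thrown for DNS failures, timeouts, and non-success HTTP status codes.
            Console.WriteLine("Download failed: " + ex.Message);
        }
    }
}
```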

【Discussion】:

【Answer 5】:

You can use a WebClient to do this:

System.Net.WebClient wc = new System.Net.WebClient();
byte[] raw = wc.DownloadData("http://www.wunderground.com/history/airport/FAJS/2012/10/28/DailyHistory.html?req_city=Johannesburg&req_state=&req_statename=South+Africa&format=1");

string webData = System.Text.Encoding.UTF8.GetString(raw);

The string webData then contains the full text of the web page.
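
Because DownloadData returns raw bytes, the decoding step decides how non-ASCII characters come out. The sketch below uses a hard-coded sample string in place of a download (an assumption made so it runs offline) to contrast `Encoding.UTF8` with the per-character `(byte)` cast used in an earlier answer. The class name, method names, and sample text are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical example class, not part of the original answer.
public static class EncodingDemo
{
    // Decodes raw bytes as UTF-8, as the answer above does.
    public static string DecodeUtf8(byte[] raw)
    {
        return Encoding.UTF8.GetString(raw);
    }

    // Narrows each char to a single byte, like the manual loop in an earlier answer.
    public static byte[] NarrowToBytes(string text)
    {
        var buffer = new byte[text.Length];
        for (int i = 0; i < text.Length; i++)
        {
            buffer[i] = unchecked((byte)text[i]); // keeps only the low 8 bits
        }
        return buffer;
    }

    public static void Main()
    {
        // Made-up sample containing a degree sign and a euro sign.
        string sample = "Temp 21 \u00B0C, \u20AC5";
        byte[] utf8 = Encoding.UTF8.GetBytes(sample);

        // Encoding and decoding with the same encoding preserves the text.
        Console.WriteLine(DecodeUtf8(utf8) == sample);                  // True
        // The narrowing cast damages the non-ASCII characters.
        Console.WriteLine(DecodeUtf8(NarrowToBytes(sample)) == sample); // False
    }
}
```

For plain ASCII CSV data both approaches give the same bytes, which is why the difference only shows up once the page contains characters outside that range.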

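【Discussion】:

You can also use the DownloadString method.

I thought StringBuilder was preferred because of large data? Just asking, I'm new.

OK, thanks.. I was thinking of a different way @JPHellemons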
