Using boost asio to get a web page

Posted: 2017-12-08 02:33:33

Question:

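I'm trying to build a program that takes a stock ticker, runs a Google search, and outputs the data (current price, high, low, percent change, etc.). I'm trying to use boost asio, but it doesn't return any data from the server.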

#include "stdafx.h"
#include <iostream>
#include <istream>
#include <ostream>
#include <string>
#include <boost/asio.hpp>

std::string getStockPage(std::string ticker) {
    boost::asio::ip::tcp::iostream stream;

    stream.connect("www.google.com", "http");
    std::cout << "connected\n";
    stream << "GET /search?q=" << ticker << " HTTP/1.1\r\n";
    stream << "Host: www.google.com\r\n";
    stream << "Cache-Control: no-cache\r\n";
    //stream << "Content-Type: application/x-www-form-urlencoded\r\n\r\n";
    stream << "Connection: close\r\n\r\n";
    std::cout << "sent\n";

    std::ostringstream os;
    //os << stream.rdbuf();
    char buffer[100];
    os << stream.readsome(buffer, 100);
    return std::string(buffer, 100);
}

int main() {
    std::cout << getStockPage("$tsla");
    std::cout << "done\n";
    std::string temp;
    std::getline(std::cin, temp);
    return 0;
}


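I tried reading only the first 100 characters to see whether the problem was in printing the response, but it only outputs null characters. I want it to output the entire Google page for "www.google.com/search?q=$tsla".

Any help would be greatly appreciated!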

Comments:

boost.org/doc/libs/1_65_1/doc/html/boost_asio/example/cpp03/…

Possible duplicate of "Sending http GET request using boost::asio, similar to cURL"

Answer 1:

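std::istream::readsome is allowed to always return 0 bytes. When that happens, it looks as if you received NUL bytes, because you return the entire 100-byte buffer regardless of how much was actually read: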

return std::string(buffer, 100);

instead of

return std::string(buffer, stream.gcount());

Really, just use the other approach:

std::ostringstream os;
os << stream.rdbuf();
return os.str();
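The same three lines can be wrapped in a small function that accepts any std::istream, which makes the idea testable without a network connection. This is only a sketch; the name slurp is hypothetical:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: drain everything that remains in `in` into a string.
std::string slurp(std::istream& in) {
    std::ostringstream os;
    // Streaming the rdbuf copies characters until the source reports end-of-file.
    os << in.rdbuf();
    return os.str();
}
```

With a socket stream, end-of-file only arrives once the server closes the connection, which is why the request sends "Connection: close".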

The rdbuf approach worked for me when I tested it. Note that you can also add a flush:

stream << "Connection: close\r\n\r\n" << std::flush;

The resulting program

#include <boost/asio.hpp>
#include <iostream>
#include <sstream>
#include <string>

std::string getStockPage(std::string const& ticker) {
    boost::asio::ip::tcp::iostream stream;

    stream.connect("www.google.com", "http");
    stream    << "GET /search?q=" << ticker << " HTTP/1.1\r\n";
    stream    << "Host: www.google.com\r\n";
    stream    << "Cache-Control: no-cache\r\n";
    // stream << "Content-Type: application/x-www-form-urlencoded\r\n\r\n";
    stream    << "Connection: close\r\n\r\n" << std::flush;

    // Read until the server closes the connection
    std::ostringstream os;
    os << stream.rdbuf();
    return os.str();
}

int main() {
    std::cout << getStockPage("$tsla");
}

prints

HTTP/1.1 302 Found
Location: http://www.google.nl/search?q=%24tsla&gws_rd=cr&dcr=0&ei=3EMqWrKxCILUwAKv9LqICg
Cache-Control: private
Content-Type: text/html; charset=UTF-8
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Date: Fri, 08 Dec 2017 07:48:44 GMT
Server: gws
Content-Length: 288
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Set-Cookie: NID=118=MsVZZpoZFEz4mQDqDuuWFRViB8v8yEQju7FPdOw8Rr7ViQ1cJtF6ZeN9u-dSRhGMT4x8F8yDilk9FqsoTkO8IsoQX-YvHXRcCoHcOLk0p4VOTn8AZoldKeh84Ryl0bM0; expires=Sat, 09-Jun-2018 07:48:44 GMT; path=/; domain=.google.com; HttpOnly
Connection: close

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.nl/search?q=%24tsla&amp;gws_rd=cr&amp;dcr=0&amp;ei=3EMqWrKxCILUwAKv9LqICg">here</A>.
</BODY></HTML>

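The response above is a 302 redirect rather than the search page, so a caller would need to read the status code and the Location header before deciding what to fetch next. The following is a rough sketch of that step only; ResponseInfo and parse_response_head are hypothetical names, and the code assumes the header is spelled exactly "Location: " as in the output shown (HTTP header names are case-insensitive in general, which this sketch does not handle):

```cpp
#include <sstream>
#include <string>

// Hypothetical result type for the sketch below.
struct ResponseInfo {
    int status = 0;          // stays 0 if the status line cannot be parsed
    std::string location;    // empty if no Location header is present
};

// Hypothetical helper: extract the status code and Location header
// from a raw HTTP response string.
ResponseInfo parse_response_head(const std::string& response) {
    ResponseInfo info;
    std::istringstream in(response);
    std::string line;

    // Status line, e.g. "HTTP/1.1 302 Found"
    if (std::getline(in, line)) {
        std::istringstream status_line(line);
        std::string http_version;
        status_line >> http_version >> info.status;
    }

    // Header lines run until the first empty line.
    const std::string key = "Location: ";
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) break;
        if (line.compare(0, key.size(), key) == 0) {
            info.location = line.substr(key.size());
        }
    }
    return info;
}
```

For the response printed above, this would yield status 302 and the google.nl URL from the Location header, which could then be requested in a second call.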