Java (JRE 1.8.0_141) - Error 405 for GET request

Posted: 2018-01-14 07:00:11

Question:

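I am using Java JRE 1.8.0_141 and trying to access a specific URL and store the HTML in a string so I can manipulate the data later in my code, but I get a 405 error whenever I call getInputStream().

The code seems to work with other URLs without any problem. The failing URL is: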

http://www.streeteasy.com/for-rent/nyc/status:open%7Cprice:1750-2900%7Carea:104,116,119,143,141%7Camenities:pool?page=2&refined_search=true

Here is the exact error from Eclipse 4.6.3:

<terminated, exit value: 1>C:\Program Files\Java\jre1.8.0_141\bin\javaw.exe (Aug 6, 2017, 10:53:37 PM)  

Exception in thread "main" java.lang.RuntimeException: java.io.IOException: Server returned HTTP response code: 405 for URL: http://www.streeteasy.com/for-rent/nyc/status:open%7Cprice:1750-2900%7Carea:104,116,119,143,141%7Camenities:pool?page=2&refined_search=true
    at RunMe.getHTMLFromURL(RunMe.java:52)
    at RunMe.main(RunMe.java:18)
Caused by: java.io.IOException: Server returned HTTP response code: 405 for URL: http://www.streeteasy.com/for-rent/nyc/status:open%7Cprice:1750-2900%7Carea:104,116,119,143,141%7Camenities:pool?page=2&refined_search=true
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
    at RunMe.getHTMLFromURL(RunMe.java:36)
    ... 1 more

My RunMe.java code is as follows:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.LinkedList;

public class RunMe {

    public static void main(String[] args) throws IOException {
        // TODO Auto-generated method stub

        System.out.println(getHTMLFromURL("http://www.streeteasy.com/for-rent/nyc/status:open%7Cprice:1750-2900%7Carea:104,116,119,143,141%7Camenities:pool?page=2&refined_search=true"));
    }

    public static String getHTMLFromURL(String url) {
        try {
            URL urlObj = new URL(url);
            URLConnection con = urlObj.openConnection();
            con.setDoOutput(false);
            con.connect();

            BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
            // CODE FAILS HERE ^

            StringBuilder response = new StringBuilder();
            String inputLine;

            String newLine = System.getProperty("line.separator");
            while ((inputLine = in.readLine()) != null) {
                response.append(inputLine + newLine);
            }

            in.close();

            return response.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}

Any idea how I can extract the HTML from this URL, if not through this method? Thanks in advance!

Comments:

I think the problem is the headers. Try sending these headers: 'User-Agent': 'request', 'Accept': 'text/html;q=0.9,*/*;q=0.8'

Not sure how to do that! Any suggestions? If it helps, System.out.println(con.getHeaderFields()) gives me: Transfer-Encoding=[chunked], null=[HTTP/1.1 405 Not Allowed], X-DZ=[50.4.77.193], Cache-Control=[private, no-cache, no-store, must-revalidate], Server=[nginx], Edge-Control=[no-store, bypass-cache], Connection=[keep-alive], Surrogate-Control=[no-store, bypass-cache], Expires=[Thu, 01 Jan 1970 00:00:01 GMT], Date=[Mon, 07 Aug 2017 03:55:51 GMT], Content-Type=[text/html]

Did you send the headers as I mentioned?

@FadySaad I don't think that is the problem; see my answer.

Sorry for being unclear. I don't understand how to follow your instructions. What specific code edits are you suggesting?

Answer 1:

I ran a curl command against the URL, and the site appears to be trying to run JavaScript to render the page.

curl -v -L -H "User-Agent: Mozilla/5.0" -H "Accept: text/html" "http://www.streeteasy.com/for-rent/nyc/status:open%7Cprice:1750-2900%7Carea:104,116,119,143,141%7Camenities:pool?page=2"

> GET /for-rent/nyc/status:open%7Cprice:1750-2900%7Carea:104,116,119,143,141%7Camenities:pool?page=2 HTTP/1.1
> Host: www.streeteasy.com
> User-Agent: Mozilla/5.0
> Accept: text/html
> 
< HTTP/1.1 405 Not Allowed

// elided

<h1>Pardon Our Interruption...</h1>
<p>As you were browsing <strong>www.streeteasy.com</strong> something about your browser made us think you were a bot. There are a few reasons this might happen:</p>
<ul>
    <li>You're a power user moving through this website with super-human speed.</li>
    <li>You've disabled JavaScript in your web browser.</li>
    <li>A third-party browser plugin, such as Ghostery or NoScript, is preventing JavaScript from running. Additional information is available in this <a title='Third party browser plugins that block javascript' href='http://ds.tl/help-third-party-plugins' target='_blank'>support article</a>.</li>
</ul>

<p>After completing the CAPTCHA below, you will immediately regain access to www.streeteasy.com.</p>

Unless you can fill in the CAPTCHA programmatically, you may be out of luck.
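For anyone wondering how to send these headers from Java and still see the page that comes back with the 405, here is a minimal sketch. It assumes the same two headers used in the curl command; the class name FetchWithHeaders and the helpers fetch and readAll are made up for illustration. It does not get past the bot check. It only makes the response body visible, because getInputStream() throws on 4xx/5xx while the body is available through getErrorStream().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original question's code.
public class FetchWithHeaders {

    /** Status code plus whatever body the server sent, even for 4xx/5xx. */
    static final class Result {
        final int status;
        final String body;

        Result(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    static Result fetch(String url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        con.setRequestMethod("GET");
        // Same headers as the curl command; the values are assumptions.
        con.setRequestProperty("User-Agent", "Mozilla/5.0");
        con.setRequestProperty("Accept", "text/html");

        int status = con.getResponseCode();
        // getInputStream() throws for error statuses; the error page, if any,
        // is exposed through getErrorStream() (which may be null).
        InputStream stream = status >= 400 ? con.getErrorStream() : con.getInputStream();
        return new Result(status, readAll(stream));
    }

    static String readAll(InputStream stream) throws IOException {
        if (stream == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line).append(System.lineSeparator());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FetchWithHeaders <url>");
            return;
        }
        Result r = fetch(args[0]);
        System.out.println("HTTP " + r.status);
        System.out.println(r.body);
    }
}
```

Run against the failing URL, this should print the status line and the HTML of the interruption page instead of throwing, which is usually the quickest way to learn why a server rejected a request.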

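Edit

The problem is apparently cookies, as the discussion below shows.

Comments:

If it helps, I can browse the site in Chrome/Firefox with JavaScript disabled. Alternatively, is it possible to read the file indirectly with Java, perhaps by downloading the rendered HTML, reading that source, and then deleting the file?

@sweebez I don't know what you mean by "read the file indirectly with Java". There is no file on your client; if the server refuses to respond, there is nothing you can do short of asking them nicely. Client-server 101. I suggest you try to find out which headers the browser sends with the request.

I guess I'm confused about why Java can't access the site while my browser (with JS disabled) has no problem.

@sweebez Because a Java client is not a browser. Try it from a browser with cookies disabled; that is the problem.

I think you're right: with cookies disabled, the "Pardon Our Interruption" page is triggered. Is this a terminal diagnosis?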
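If cookies are indeed what the site checks, one thing to try from Java is installing a default CookieManager before opening any connection, so that HttpURLConnection stores cookies from Set-Cookie response headers and sends them back on later requests. The sketch below is an assumption-laden illustration: the class name CookieSetup and the example.com cookie are made up, and nothing in this thread confirms that accepting cookies is enough to satisfy this particular site's bot detection.

```java
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of the original question's code.
public class CookieSetup {

    /**
     * Installs a JVM-wide cookie handler. Call this before opening any
     * connection; HttpURLConnection consults the default handler.
     */
    static CookieManager enableCookies() {
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ALL);
        CookieHandler.setDefault(manager);
        return manager;
    }

    public static void main(String[] args) throws Exception {
        CookieManager manager = enableCookies();

        // Simulate what happens on a real response: the handler is given the
        // response headers and keeps any cookies they set.
        // The host and cookie below are placeholders for illustration only.
        Map<String, List<String>> responseHeaders = new HashMap<>();
        responseHeaders.put("Set-Cookie", Arrays.asList("session=abc; Path=/"));
        manager.put(URI.create("http://www.example.com/"), responseHeaders);

        // On the next request to the same host, the stored cookie is offered
        // back as a Cookie request header.
        Map<String, List<String>> requestHeaders = manager.get(
                URI.create("http://www.example.com/page"),
                new HashMap<String, List<String>>());
        System.out.println(requestHeaders);
    }
}
```

In the question's code, the equivalent change would be a single call to enableCookies() at the top of main, before getHTMLFromURL runs. A first request may still return the interruption page, since the cookie can only be sent back on a later request.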
