Threads hanging in an infinite wait when downloading files with URLConnection

Posted 2022-02-02 一步一步往上爬的小蜗牛

Preface

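Before the Qingming holiday we needed to run a batch job: download videos from Tencent Cloud, compress and transcode them with a Python script, and upload the results to another Tencent Cloud bucket. After writing the code I ran it and watched it for a while, and everything looked normal.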

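When I came back, however, the thread was stuck and there was no log output at all, so I had no idea where the problem was. After I restarted the container the code ran normally again, but a few hours later the hang reappeared.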

Troubleshooting

The code was as follows:

public boolean downloadNet(String videoUrl, String filePath) throws Exception {
        URL url = new URL(videoUrl);
        URLConnection conn = null;
        try {
            conn = url.openConnection();
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        }

        try {
            InputStream inStream = conn.getInputStream();
            // Download the video from the third-party URL to a local file
            return FileUtil.videoDownLoadToFile(inStream, filePath);
        } catch (IOException e) {
            e.printStackTrace();
        }

        return false;
}

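First, the basic logic is clearly sound, because the code behaved as expected for the first few hours. The problem only appeared as the run went on, and the console showed no error messages. My guesses were:

① a thread deadlock;

② resource exhaustion, for example network connections that were never properly released;

③ some other cause.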

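As for cause ①, the code involves no multithreading and no contention for shared resources, and a jstack thread dump showed no deadlock either, so it can be ruled out.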
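The deadlock check that jstack performs can also be run from inside the JVM. The following is only a sketch of that idea using the standard ThreadMXBean API; the class name DeadlockCheck and the method findDeadlocks are hypothetical and were not part of the original script.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration; not part of the original script.
class DeadlockCheck {

    /** Returns information about deadlocked threads, or an empty array if there are none. */
    static ThreadInfo[] findDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads() returns null when no deadlock is detected.
        long[] ids = threads.findDeadlockedThreads();
        if (ids == null) {
            return new ThreadInfo[0];
        }
        // Integer.MAX_VALUE asks for the full stack trace of each thread.
        return threads.getThreadInfo(ids, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        ThreadInfo[] deadlocked = findDeadlocks();
        if (deadlocked.length == 0) {
            System.out.println("No deadlocked threads found");
        }
        for (ThreadInfo info : deadlocked) {
            System.out.println(info);
        }
    }
}
```

An empty result only rules out deadlocks. A thread blocked in a socket read, as turned out to be the case here, is not a deadlock and will not be reported.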

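As for cause ②, the script downloads in a loop and never explicitly releases the connection. Could that be it? I went into the URLConnection source and found that it declares no close or disconnect method, but it does carry this comment: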

* Invoking the {@code close()} methods on the {@code InputStream} or {@code OutputStream} of an
* {@code URLConnection} after a request may free network resources associated with this
* instance, unless particular protocol specifications specify different behaviours
* for it.

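In other words, calling close on the URLConnection's input or output stream may release the associated network resources. I do close the input stream once the file has been saved locally, so this was not the problem either. So where was it?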
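For illustration, closing the stream reliably could look like the sketch below. The original FileUtil.videoDownLoadToFile is not shown in this post, so StreamToFile.save is a hypothetical stand-in and not the author's implementation; in the download method the argument would be conn.getInputStream().

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical stand-in for the author's FileUtil helper.
class StreamToFile {

    /** Copies the stream to target and returns the number of bytes written. */
    static long save(InputStream in, Path target) throws IOException {
        // try-with-resources closes the stream even when the copy fails,
        // which is what allows the connection's network resources to be freed.
        try (InputStream body = in) {
            return Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Closing the stream covers resource cleanup only. It does nothing for a read that never returns, because close is not reached until the copy finishes.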

I kept reading the source of the abstract class URLConnection and found this:

/**
 * Returns setting for connect timeout.
 * <p>
 * 0 return implies that the option is disabled
 * (i.e., timeout of infinity).
 *
 * @return an {@code int} that indicates the connect timeout
 *         value in milliseconds
 * @see #setConnectTimeout(int)
 * @see #connect()
 * @since 1.5
 */
public int getConnectTimeout() {
    return connectTimeout;
}


/**
 * Sets the read timeout to a specified timeout, in
 * milliseconds. A non-zero value specifies the timeout when
 * reading from Input stream when a connection is established to a
 * resource. If the timeout expires before there is data available
 * for read, a java.net.SocketTimeoutException is raised. A
 * timeout of zero is interpreted as an infinite timeout.
 *
 *<p> Some non-standard implementation of this method ignores the
 * specified timeout. To see the read timeout set, please call
 * getReadTimeout().
 *
 * @param timeout an {@code int} that specifies the timeout
 * value to be used in milliseconds
 * @throws IllegalArgumentException if the timeout parameter is negative
 *
 * @see #getReadTimeout()
 * @see InputStream#read()
 * @since 1.5
 */
public void setReadTimeout(int timeout) {
    if (timeout < 0) {
        throw new IllegalArgumentException("timeout can not be negative");
    }
    readTimeout = timeout;
}

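In other words, when connectTimeout and readTimeout are not set, they keep their default value of 0, which means that both connecting to the host and reading data from it wait with no time limit.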
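The defaults can be checked directly, since openConnection() only creates the connection object and does not touch the network yet. This is a small sketch; the class name TimeoutDefaults and the example URL are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper for illustration; the URL used in main is a placeholder.
class TimeoutDefaults {

    /** Returns {connectTimeout, readTimeout} in milliseconds for a freshly opened connection. */
    static int[] defaults(String url) throws IOException {
        // No network I/O happens here; the socket is opened lazily on connect().
        URLConnection conn = URI.create(url).toURL().openConnection();
        return new int[] { conn.getConnectTimeout(), conn.getReadTimeout() };
    }

    public static void main(String[] args) throws IOException {
        int[] t = defaults("http://example.com/video.mp4");
        System.out.println("connect=" + t[0] + " ms, read=" + t[1] + " ms");
    }
}
```

On a JVM where no default timeouts have been configured through system properties, both values should be 0, that is, no timeout at all.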

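Solution

At this point the problem could be pinned down: during one of the connects or reads a network fault occurred, and the call blocked forever.

The fix is simple: set a connect timeout and a read timeout on the connection.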

public boolean downloadNet(String videoUrl, String filePath) throws Exception {
        URL url = new URL(videoUrl);
        if ("https".equalsIgnoreCase(url.getProtocol())) {
            SslUtils.ignoreSsl();
        }
        URLConnection conn = null;
        try {
            conn = url.openConnection();
            // Connect timeout: 10 s
            conn.setConnectTimeout(10 * 1000);
            // Read timeout: 2 min
            conn.setReadTimeout(120 * 1000);
        } catch (IOException e) {
            e.printStackTrace();
            return false;
        }

        try {
            InputStream inStream = conn.getInputStream();
            return FileUtil.videoDownLoadToFile(inStream, filePath);
        } catch (IOException e) {
            e.printStackTrace();
        }

        return false;
}

With connectTimeout and readTimeout configured, the script worked through all the data and never blocked again.
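The effect of the read timeout can be reproduced locally. The sketch below is an assumed test setup, not something from the original job: it opens a listening socket on the loopback interface that never answers, so the HTTP request gets no response. The names ReadTimeoutDemo and timesOut are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical demo; the silent local server stands in for a stalled remote host.
class ReadTimeoutDemo {

    /** Returns true if reading from a silent local server fails with SocketTimeoutException. */
    static boolean timesOut(int readTimeoutMillis) throws IOException {
        // The OS completes the TCP handshake for a listening socket, but accept()
        // is never called, so nothing ever replies to the HTTP request.
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"))) {
            URLConnection conn = URI.create("http://127.0.0.1:" + silent.getLocalPort() + "/")
                    .toURL()
                    .openConnection(Proxy.NO_PROXY); // bypass any configured proxy
            conn.setConnectTimeout(1_000);
            conn.setReadTimeout(readTimeoutMillis);
            try (InputStream in = conn.getInputStream()) {
                in.read();
                return false; // a response arrived, which this setup should not allow
            } catch (SocketTimeoutException e) {
                return true; // the read gave up after readTimeoutMillis
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + timesOut(500));
    }
}
```

With the read timeout left at 0, the same call would block until the server socket is closed, which matches the hang described above.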
