Basic HTTP file download and saving to disk in Python?

Posted 2021-04-03

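I'm new to Python and have been going through the Q&A on this site looking for an answer to my question. However, I'm a beginner and I find some of the solutions hard to understand. I need a very basic solution.

Could someone explain a simple way to "download a file over HTTP" and "save it to disk on Windows"?

I'm also not sure how to use the shutil and os modules.

The file I want to download is under 500 MB and is a .gz archive. If someone could also explain how to extract the archive and use the files in it, that would be great!

Here is a partial solution that I put together from various answers: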

import requests
import os
import shutil

global dump

def download_file():
    global dump
    url = "http://randomsite.com/file.gz"
    file = requests.get(url, stream=True)
    dump = file.raw

def save_file():
    global dump
    location = os.path.abspath("D:folderfile.gz")
    with open("file.gz", 'wb') as location:
        shutil.copyfileobj(dump, location)
    del dump

Could someone point out the errors (at a beginner level) and explain any easier ways to do this?

Thanks!

Answer

A clean way to download a file is:

import urllib

testfile = urllib.URLopener()
testfile.retrieve("http://randomsite.com/file.gz", "file.gz")

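This downloads the file from the website and names it file.gz. It is one of my favorite solutions, taken from "Downloading a picture via urllib and python".

This example uses the urllib library, which retrieves the file directly from the source.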
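
Note that `urllib.URLopener` is the Python 2 spelling. On Python 3 the closest standard-library equivalent is `urllib.request.urlretrieve`. Below is a minimal sketch assuming Python 3; the wrapper name `fetch_to_disk` is made up for illustration and is not part of urllib.

```python
from urllib.request import urlretrieve


def fetch_to_disk(url, filename):
    """Download `url`, save it as `filename`, and return the local path."""
    # urlretrieve returns a (local_path, headers) tuple; only the path is used here.
    local_path, _headers = urlretrieve(url, filename)
    return local_path


# Example call (placeholder URL taken from the question):
# fetch_to_disk("http://randomsite.com/file.gz", "file.gz")
```

The file is written to the current working directory unless `filename` contains a full path.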

Another answer

As mentioned here:

import urllib
urllib.urlretrieve("http://randomsite.com/file.gz", "file.gz")

EDIT: If you still want to use requests, take a look at this question or this one.
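
For the requests route, here is a minimal sketch of what the code in the question seems to be aiming for. Two things stand out in the original: the file is opened as "file.gz" in the current directory, so the `location` path computed on the previous line is replaced by the file handle and never used; and handing the response around through a global is unnecessary if one function does both steps. The 30-second timeout and the helper name `gunzip_file` are my own assumptions. The second function covers the extraction part of the question with the standard gzip module, assuming a plain .gz file rather than a .tar.gz.

```python
import gzip
import shutil

import requests


def download_file(url, dest_path):
    """Stream the response body at `url` straight into `dest_path`."""
    # stream=True avoids loading the whole file (up to ~500 MB) into memory.
    with requests.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()  # fail early on 4xx/5xx responses
        with open(dest_path, "wb") as out_file:
            # response.raw is a file-like object, so copyfileobj can read from it.
            shutil.copyfileobj(response.raw, out_file)
    return dest_path


def gunzip_file(gz_path, dest_path):
    """Decompress a single-file .gz archive into `dest_path`."""
    with gzip.open(gz_path, "rb") as src, open(dest_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dest_path
```

On Windows, pass the destination as a raw string, for example `download_file(url, r"D:\folder\file.gz")`, so the backslashes are not treated as escape sequences.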

Another answer

I use wget.

It's a simple and good library. Here is an example:

import wget

file_url = 'http://johndoe.com/download.zip'

file_name = wget.download(file_url)

The wget module supports both Python 2 and Python 3.

Another answer

Four methods using wget, urllib and requests.

#!/usr/bin/python
import requests
from StringIO import StringIO
from PIL import Image
import profile
import urllib
import wget


url = 'https://tinypng.com/images/social/website.jpg'

def testRequest():
    image_name = 'test1.jpg'
    r = requests.get(url, stream=True)
    with open(image_name, 'wb') as f:
        for chunk in r.iter_content():
            f.write(chunk)

def testRequest2():
    image_name = 'test2.jpg'
    r = requests.get(url)
    i = Image.open(StringIO(r.content))
    i.save(image_name)

def testUrllib():
    image_name = 'test3.jpg'
    testfile = urllib.URLopener()
    testfile.retrieve(url, image_name)

def testwget():
    image_name = 'test4.jpg'
    wget.download(url, image_name)

if __name__ == '__main__':
    profile.run('testRequest()')
    profile.run('testRequest2()')
    profile.run('testUrllib()')
    profile.run('testwget()')

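testRequest - 4469882 function calls (4469842 primitive calls) in 20.236 seconds

testRequest2 - 8580 function calls (8574 primitive calls) in 0.072 seconds

testUrllib - 3836 function calls (3775 primitive calls) in 0.036 seconds

testwget - 3489 function calls in 0.020 seconds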
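
A likely reason testRequest is so far behind: `iter_content()` called without arguments uses `chunk_size=1`, so the loop writes the file one byte at a time. Below is a rough sketch of the same approach with a larger chunk size; the function name `download_in_chunks`, the 64 KiB chunk size and the timeout are arbitrary choices of mine, not something measured in the benchmark above.

```python
import requests


def download_in_chunks(url, dest_path, chunk_size=64 * 1024):
    """Write the body of `url` to `dest_path`, `chunk_size` bytes at a time."""
    with requests.get(url, stream=True, timeout=30) as response:
        response.raise_for_status()
        with open(dest_path, "wb") as out_file:
            # Each iteration now yields up to chunk_size bytes instead of 1 byte.
            for chunk in response.iter_content(chunk_size=chunk_size):
                out_file.write(chunk)
    return dest_path
```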

Another answer

An exotic Windows solution

import subprocess

subprocess.run("powershell Invoke-WebRequest {} -OutFile {}".format(your_url, filename), shell=True)

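Another answer

I started down this path because ESXi's wget is not compiled with SSL, and I wanted to download an OVA from a vendor's website directly onto an ESXi host on the other side of the world.

I had to either disable the firewall (the lazy way) or enable outbound https by editing the rules (the proper way).

I then created this Python script: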

import ssl
import shutil
import urllib.request

# Note: an unverified context skips TLS certificate checks
context = ssl._create_unverified_context()

dlurl = 'https://somesite/path/whatever'
with urllib.request.urlopen(dlurl, context=context) as response:
    with open("file.ova", 'wb') as tmp_file:
        shutil.copyfileobj(response, tmp_file)

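The ESXi libraries are somewhat pared down, but the open source weasel installer appeared to use urllib for https, which is what inspired me to go down this path.

Another answer

Another clean way to save a file is: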
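
If the host has a usable CA bundle, the same pattern should work with certificate verification left on, which is generally the safer choice. This is only a sketch under that assumption; the function name `download_verified` and the 60-second timeout are mine.

```python
import shutil
import ssl
import urllib.request


def download_verified(url, dest_path):
    """Copy the response body for `url` into `dest_path`, verifying TLS certificates."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    with urllib.request.urlopen(url, context=context, timeout=60) as response:
        with open(dest_path, "wb") as out_file:
            shutil.copyfileobj(response, out_file)
    return dest_path
```

Because `shutil.copyfileobj` copies in blocks, a large OVA is not held in memory all at once.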

import urllib.request

# urlretrieve lives in urllib.request on Python 3 (plain urllib.urlretrieve on Python 2)
urllib.request.urlretrieve("your url goes here", "output.csv")
