Python Receive HTTP file via POST

Posted: 2021-10-01 10:16:15

Question:

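I am trying to create a Python web server that can receive files. Someone should be able to visit the website, click the upload button on a form, and have the file sent to the server and stored locally on the server.

Here is the content of index.html: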

<form enctype="multipart/form-data" action="" method="POST">
    <input type="hidden" name="MAX_FILE_SIZE" value="8000000" />
    <input name="uploadedfile" type="file" /><br />
    <input type="submit" value="Upload File" />
</form>

Contents of Server.py:

import socket

class server():
    def __init__(self):
        self.host_ip = socket.gethostbyname(socket.gethostname())
        self.host_port = 81
        self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.data_recv_size = 1024

    def get_data(self, conn):
        """ gets the data from client """
        data = b""
        while b"\r\n\r\n" not in data:
            data += conn.recv(self.data_recv_size)
        return data

    def server(self):
        """ main method starts the server """
        print(f"[+] Server started listening on port {self.host_port}!")
        print(f"[+] Server Ip: {self.host_ip}")
        self.s.bind((self.host_ip, self.host_port))
        self.s.listen()

        while True:
            conn, addr = self.s.accept()
            with conn:
                data = self.get_data(conn)
                
                # GET request
                if data[0:5] == b"GET /":
                    index = open("index.html", "rb").read()
                    conn.sendall(b"HTTP/1.0 200 OK\nContent-Type: text/html\n\n" + index)
                    print("[+] Responded to GET request")

                # POST request
                elif data[0:4] == b"POST":
                    with open("output.txt", "ab") as file:
                        file.write(data)
                        print(f"{len(data)} bytes received from post!")
                        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/html")

s = server()
s.server()

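The GET part of the server works correctly: when I visit the website, index.html is displayed in my web browser and I can see the file upload form.

EDIT: I updated the form to a maximum file size of 8 million (name="MAX_FILE_SIZE" value="8000000"). The POST request the server receives is now much larger (I have updated it below), but it still does not seem to contain the file contents.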

POST / HTTP/1.1
Host: 169.254.126.211:81
Connection: keep-alive
Content-Length: 2857323
Cache-Control: max-age=0
Origin: http://169.254.126.211:81
Upgrade-Insecure-Requests: 1
DNT: 1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryjbf7KaGShYBQ75wT
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Referer: http://169.254.126.211:81/
Accept-Encoding: gzip, deflate
Accept-Language: en-GB,en-US;q=0.9,en;q=0.8,ru;q=0.7

------WebKitFormBoundaryjbf7KaGShYBQ75wT
Content-Disposition: form-data; name="MAX_FILE_SIZE"

8000000
------WebKitFormBoundaryjbf7KaGShYBQ75wT
Content-Disposition: form-data; name="uploadedfile"; filename="IMG_20210131_165637.jpg"
Content-Type: image/jpeg

ÿØÿá„ÙExif  MM *         @      
°         ö       ¶       ¾POST / HTTP/1.1
Host: 169.254.126.211:81
Connection: keep-alive
Content-Length: 2857323
Cache-Control: max-age=0
Origin: http://169.254.126.211:81
Upgrade-Insecure-Requests: 1
DNT: 1
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryjbf7KaGShYBQ75wT
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Referer: http://169.254.126.211:81/
Accept-Encoding: gzip, deflate
Accept-Language: en-GB,en-US;q=0.9,en;q=0.8,ru;q=0.7

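Screenshot showing the Python IDLE output when the script is run.

EDIT: It only says 1024 bytes received from post!, so it looks like the complete file is not being sent.

How can I send a file from a web browser via POST and receive it on the server?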

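Comments:

I think you need to increase the maximum post size on the form and data_recv_size in the script. The Content-Length shows 2804304 bytes, but the data may not be saved because of the size limit.

Where do you see 2804304 bytes? When I run the script it prints 674 bytes received from post!

It is in your headers (Content-Length: 2804304). Is the file you are trying to upload about 2.8 MB?

Yes, I am trying to upload a 2.8 MB photo to test whether server.py works.

Try increasing the limits set in the script and in the upload form.

Answer 1: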

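With the help of furas's answer, trial and error, and a lot of research online, I was able to create a working web server. I am posting the finished script here because it may be useful to others who stumble upon this question in the future while trying to create a file upload server.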

import socket, re

class Server():
    def __init__(self):
        self.host_ip = "localhost"
        self.host_port = 81
        self.s = socket.socket()
        self.s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.data_recv_size = 1024
        self.form = b"""<form enctype="multipart/form-data" action="" method="POST">
                <input type="hidden" name="MAX_FILE_SIZE" value="8000000" />
                <input name="uploadedfile" type="file" /><br />
                <input type="submit" value="Upload File" />
            </form>"""

    def get_data(self, conn):
        data = b""
        while True:
            chunk = conn.recv(self.data_recv_size)
            if len(chunk) < self.data_recv_size:
                return data + chunk
            else:
                data += chunk

    def save_file(self, packet):
        name = re.compile(b'name="uploadedfile"; filename="(.+)"').search(packet).group(1)
        data = re.compile(b"WebKitFormBoundary((\n|.)*)Content-Type.+\n.+?\n((\n|.)*)([\-]+WebKitFormBoundary)?")
        with open(name, "wb") as file:
            file.write(data.search(packet).group(3))

    def run(self):
        print(f"[+] Server: http://{self.host_ip}:{self.host_port}")
        self.s.bind((self.host_ip, self.host_port))
        self.s.listen()

        while True:
            conn,addr = self.s.accept()
            request = self.get_data(conn)

            # GET request
            if request[0:5] == b"GET /":
                conn.sendall(b"HTTP/1.0 200 OK\nContent-Type: text/html\n\n"+self.form)
                print("[+] Responded to GET request!")

            # POST request
            elif request[0:4] == b"POST":
                packet = self.get_data(conn)
                self.save_file(packet)
                ok_response = b"Successfully uploaded %d bytes to the server!" % len(packet)
                conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n"+ok_response)
                print(f"[+] {len(packet)} bytes received from POST!")


s = Server()
s.run()

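Here is a screenshot showing the script running.

Here is a screenshot of the directory where the image file was saved after being uploaded through the server.

So everything seems to be working now.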

Answer 2:

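In get_data() you check while b"\r\n\r\n" not in data:, so you read only the head (with the headers), not the body (with the posted file).

You have to get the value of the Content-Length header and use it to read the rest of the data, the body.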

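But your recv(1024) may already have read some part of the body, which can cause problems. You should read byte by byte (recv(1)) until you get b"\r\n\r\n", and then use Content-Length to read the rest of the data.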
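As an alternative to reading one byte at a time, the following is a minimal sketch (not the code from this answer) that reads in larger chunks and treats whatever arrived after the blank line as the start of the body. The function name read_request and its chunk_size parameter are made up for illustration, and the sketch assumes the client sends a Content-Length header rather than chunked encoding.

```python
import socket


def read_request(conn, chunk_size=1024):
    """Return (head, body) for one HTTP request read from `conn`."""
    buffer = b""
    # Read in chunks until the blank line that ends the headers shows up.
    while b"\r\n\r\n" not in buffer:
        chunk = conn.recv(chunk_size)
        if not chunk:  # client closed the connection early
            raise ConnectionError("connection closed before end of headers")
        buffer += chunk

    # Anything after the blank line already belongs to the body.
    head, _, body = buffer.partition(b"\r\n\r\n")

    # Header names are case-insensitive, so compare them lowercased.
    content_length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            content_length = int(value.strip())

    # Keep reading until the whole body has arrived.
    while len(body) < content_length:
        chunk = conn.recv(chunk_size)
        if not chunk:
            raise ConnectionError("connection closed before end of body")
        body += chunk
    return head, body[:content_length]
```

The key point is the partition on b"\r\n\r\n": the bytes that recv() delivered past the headers are kept instead of being lost, so the loop only has to fetch what is still missing.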


Minimal working code with the HTML included in the code, so everyone can simply copy and run it.

import socket

class Server():  # PEP8: `CamelCaseName` for classes
    
    def __init__(self):
        #self.host_ip = socket.gethostbyname(socket.gethostname())
        self.host_ip = '0.0.0.0'  # universal IP for server - to connect from other computers
        self.host_port = 8000  # 81 was restricted on my computer

        #self.s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.s = socket.socket()  # default values are `socket.AF_INET, socket.SOCK_STREAM`
        self.s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # solution for '[Error 89] Address already in use'. Use before bind()

        self.data_recv_size = 1024

    def get_head(self, conn):
        """ gets headers from client """
        data = b""
        while not data.endswith(b"\r\n\r\n"):
            data += conn.recv(1)
        return data

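    def get_body(self, conn, size):
        """ gets the body (`size` bytes, taken from Content-Length) from client """
        data = b""
        while len(data) < size:
            chunk = conn.recv(self.data_recv_size)
            if not chunk:  # client closed the connection
                break
            data += chunk
        return data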

    def run(self):
        """ main method starts the server """
        print(f"[+] Server: http://{self.host_ip}:{self.host_port}")
        self.s.bind((self.host_ip, self.host_port))
        self.s.listen()
        try:
            while True:
                conn, addr = self.s.accept()
                with conn:
                    head = self.get_head(conn)
                    
                    # todo: parse headers
                    lines = head.strip().splitlines()
                    request = lines[0]
                    headers = lines[1:]
                    headers = list(line.split(b': ', 1) for line in headers)  # split only at the first ': '
                    #print(headers)
                    headers = dict(headers)
                    for key, value in headers.items():
                        print(f'{key.decode()}: {value.decode()}')
                    
                    # GET request
                    if request[0:5] == b"GET /":
                        #html = open("index.html", "rb").read()
                        
                        html = '''<form enctype="multipart/form-data" action="" method="POST">
                                    <input type="hidden" name="MAX_FILE_SIZE" value="8000000" />
                                    <input name="uploadedfile" type="file" /><br />
                                    <input type="submit" value="Upload File" />
                                </form>'''

                        conn.sendall(b"HTTP/1.0 200 OK\nContent-Type: text/html\n\n" + html.encode())
                        print("[+] Responded to GET request")
    
                    # POST request
                    elif request[0:4] == b"POST":
                        size = int(headers[b'Content-Length'].decode())
                        body = self.get_body(conn, size)
                        with open("output.txt", "ab") as file:
                            file.write(head)
                            file.write(body)
                        total_size = len(head)+len(body)    
                        print(f"{total_size} bytes received from POST")
                        html = f"OK: {total_size} bytes"
                        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n" + html.encode())
                        print("[+] Responded to POST request")
    
        except KeyboardInterrupt:
            print("[+] Stopped by Ctrl+C")
        finally:
            self.s.close()

# --- main ---

s = Server()
s.run()
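The code above appends the raw request to output.txt. To store the uploaded file itself, the multipart body still has to be split on its boundary. A minimal sketch of one way to do that, assuming the standard-library email package is acceptable for the job (it parses MIME multipart, which is the same layout browsers use for form uploads). The function name extract_files is hypothetical, and the whole body is held in memory, so this is only meant for small uploads.

```python
from email import message_from_bytes


def extract_files(content_type, body):
    """Return {filename: file_bytes} for every file part in a multipart body.

    content_type: value of the request's Content-Type header, as bytes
    body: the raw request body, as bytes
    """
    # Prepend a minimal header so the email parser knows the boundary.
    raw = (b"MIME-Version: 1.0\r\nContent-Type: " + content_type
           + b"\r\n\r\n" + body)
    message = message_from_bytes(raw)

    files = {}
    if not message.is_multipart():
        return files
    for part in message.get_payload():
        filename = part.get_filename()
        if filename:  # plain form fields such as MAX_FILE_SIZE have no filename
            files[filename] = part.get_payload(decode=True)
    return files
```

In the server above this could be called as extract_files(headers[b'Content-Type'], body). The filename comes from the client, so it is safer to pass it through os.path.basename() before using it in open(name, "wb").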

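BTW:

I display http://0.0.0.0:8000 in the console because in some consoles you can click the URL to open it in the browser.

I use the universal address 0.0.0.0 so the server can receive connections from all NICs (Network Interface Controller/Card) at the same time, which means LAN cable, WiFi and other connections.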
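A short sketch of the difference between the two bind addresses. The helper name open_listener is made up for illustration, and port 0 is used only so the operating system picks a free port for the demonstration.

```python
import socket


def open_listener(host, port=0):
    """Bind a TCP listening socket; port 0 lets the OS pick a free port."""
    s = socket.socket()
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))
    s.listen()
    return s


# Reachable only from this machine (loopback interface).
local_only = open_listener("127.0.0.1")
# Reachable through every IPv4 interface of this machine.
all_interfaces = open_listener("0.0.0.0")
print(local_only.getsockname(), all_interfaces.getsockname())
local_only.close()
all_interfaces.close()
```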

PEP 8 -- Style Guide for Python Code


EDIT:

It can be simpler with Flask.

I use a form to upload 3 files at once.

import os
from flask import Flask, request

# create folder for uploaded data
FOLDER = 'uploaded'
os.makedirs(FOLDER, exist_ok=True)

app = Flask(__name__)

@app.route('/', methods=['GET', 'POST'])
def index():

    if request.method == 'GET':
        return '''<form enctype="multipart/form-data" action="" method="POST">
    <input type="hidden" name="MAX_FILE_SIZE" value="8000000" />
    <input name="uploadedfile1" type="file" /><br />
    <input name="uploadedfile2" type="file" /><br />
    <input name="uploadedfile3" type="file" /><br />
    <input type="submit" value="Upload File" />
</form>'''
    
    if request.method == 'POST':
        for field, data in request.files.items():
            print('field:', field)
            print('filename:', data.filename)
            if data.filename:
                data.save(os.path.join(FOLDER, data.filename))
        return "OK"

if __name__ == '__main__':
    app.run(port=80)

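Comments:

Thanks for your help. I ran into a problem with your code because it still did not receive the whole file, but I managed to fix it by combining the get_head and get_body methods and then using an if statement to check the length of each chunk. I put the finished code in a separate answer.

My code receives the whole request, but to save the file separately it still needs to parse the data from get_head. It only shows that socket needs a lot of code and that this can be written more simply with other modules, i.e. http or Flask.

I added an example in Flask to show that it can be simpler with web frameworks such as Flask or Bottle.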
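For the http route mentioned in the comments, here is a minimal sketch using the standard-library http.server module. It is not code from either answer: the names UploadHandler and make_server are made up for illustration, and the handler only reports how many body bytes arrived instead of saving a file.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class UploadHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The headers are already parsed at this point; read exactly
        # Content-Length bytes of body from the buffered input stream.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)

        reply = f"OK: {len(body)} bytes".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)


def make_server(host="127.0.0.1", port=0):
    """Create the server without starting it; port 0 picks a free port."""
    return HTTPServer((host, port), UploadHandler)
```

Calling make_server(port=8000).serve_forever() would start it. Because rfile is a buffered stream, the head/body split that had to be handled by hand with raw sockets is not an issue here.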
