Python socket send/recv gets gradually slower

[Title]: Python socket send/recv gets gradually slower [Posted]: 2021-04-09 03:48:13 [Question]:

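I'm using Python (3.8) to share files over my network. This is done by a server/listener that sends chunks of data (when asked) and a client/receiver that downloads the data.

Everything works fine except for one thing: the download speed always drops, gradually but quickly, which is noticeable for 30 MB files and severe for 250 MB files.

Here is an example from downloading a 25 MB file: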

query for data
Downloading File [1] at 1.42  kByte/sec
Downloading File [2] at 265.98  kByte/sec
Downloading File [3] at 530.53  kByte/sec
Downloading File [4] at 795.08  kByte/sec
Downloading File [5] at 1056.0  kByte/sec
Downloading File [6] at 1319.12  kByte/sec
Downloading File [7] at 1582.25  kByte/sec
Downloading File [8] at 1845.38  kByte/sec
Downloading File [9] at 2108.51  kByte/sec
Downloading File [10] at 2368.0  kByte/sec
Downloading File [11] at 2635.4  kByte/sec
Downloading File [12] at 2898.53  kByte/sec
Downloading File [13] at 3165.94  kByte/sec
Downloading File [14] at 3424.0  kByte/sec
Downloading File [15] at 3682.85  kByte/sec
Downloading File [16] at 3947.4  kByte/sec
Downloading File [17] at 4220.51  kByte/sec
Downloading File [18] at 4475.08  kByte/sec
Downloading File [19] at 4736.0  kByte/sec
Downloading File [20] at 5010.53  kByte/sec
Downloading File [21] at 5266.53  kByte/sec
Downloading File [22] at 5274.69  kByte/sec
Downloading File [23] at 5227.19  kByte/sec
Downloading File [24] at 5173.18  kByte/sec
Downloading File [25] at 5109.13  kByte/sec
Downloading File [26] at 5040.12  kByte/sec
Downloading File [27] at 4976.76  kByte/sec
Downloading File [28] at 4916.04  kByte/sec
Downloading File [29] at 4839.46  kByte/sec
Downloading File [30] at 4779.34  kByte/sec
Downloading File [31] at 4717.97  kByte/sec
Downloading File [32] at 4654.68  kByte/sec
Downloading File [33] at 4587.11  kByte/sec
Downloading File [34] at 4521.86  kByte/sec
Downloading File [35] at 4489.31  kByte/sec
Downloading File [36] at 4463.14  kByte/sec
Downloading File [37] at 4447.0  kByte/sec
Downloading File [38] at 4429.8  kByte/sec
Downloading File [39] at 4408.23  kByte/sec
Downloading File [40] at 4385.68  kByte/sec
Downloading File [41] at 4362.61  kByte/sec
Downloading File [42] at 4332.12  kByte/sec
Downloading File [43] at 4277.88  kByte/sec
Downloading File [44] at 4241.96  kByte/sec
Downloading File [45] at 4214.6  kByte/sec
Downloading File [46] at 4188.76  kByte/sec
Downloading File [47] at 4161.43  kByte/sec
Downloading File [48] at 4122.81  kByte/sec
Downloading File [49] at 4078.92  kByte/sec
Downloading File [50] at 4038.91  kByte/sec
Downloading File [51] at 3995.1  kByte/sec
Downloading File [52] at 3946.54  kByte/sec
Downloading File [53] at 3905.08  kByte/sec
Downloading File [54] at 3862.33  kByte/sec
Downloading File [55] at 3818.92  kByte/sec
Downloading File [56] at 3778.95  kByte/sec
Downloading File [57] at 3736.93  kByte/sec
Downloading File [58] at 3698.62  kByte/sec
Downloading File [59] at 3669.39  kByte/sec
Downloading File [60] at 3638.99  kByte/sec
Downloading File [61] at 3611.71  kByte/sec
Downloading File [62] at 3576.03  kByte/sec
Downloading File [63] at 3546.88  kByte/sec
Downloading File [64] at 3516.09  kByte/sec
Downloading File [65] at 3483.13  kByte/sec
Downloading File [66] at 3451.92  kByte/sec
Downloading File [67] at 3419.35  kByte/sec
Downloading File [68] at 3392.87  kByte/sec
Downloading File [69] at 3366.28  kByte/sec
Downloading File [70] at 3337.75  kByte/sec
Downloading File [71] at 3306.12  kByte/sec
Downloading File [72] at 3279.61  kByte/sec
Downloading File [73] at 3248.65  kByte/sec
Downloading File [74] at 3222.84  kByte/sec
Downloading File [75] at 3191.29  kByte/sec
Downloading File [76] at 3159.18  kByte/sec
Downloading File [77] at 3127.02  kByte/sec
Downloading File [78] at 3099.15  kByte/sec
Downloading File [79] at 3070.14  kByte/sec
Downloading File [80] at 3033.71  kByte/sec
Downloading File [81] at 3007.82  kByte/sec
Downloading File [82] at 2978.38  kByte/sec
Downloading File [83] at 2950.2  kByte/sec
Downloading File [84] at 2921.61  kByte/sec
Downloading File [85] at 2889.32  kByte/sec
Downloading File [86] at 2860.66  kByte/sec
Downloading File [87] at 2833.2  kByte/sec
Downloading File [88] at 2805.48  kByte/sec
Downloading File [89] at 2775.55  kByte/sec
Downloading File [90] at 2749.85  kByte/sec
Downloading File [91] at 2722.94  kByte/sec
Downloading File [92] at 2696.21  kByte/sec
Downloading File [93] at 2670.54  kByte/sec
Downloading File [94] at 2643.62  kByte/sec
Downloading File [95] at 2620.01  kByte/sec
Downloading File [96] at 2596.48  kByte/sec
Downloading File [97] at 2573.56  kByte/sec
Downloading File [98] at 2550.22  kByte/sec
Downloading File [99] at 2525.19  kByte/sec
Downloading File [100] at 2503.39  kByte/sec
Downloading File done [100%] in around 10seconds

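As you can see, it takes some initial ramp-up to get up to speed, which is normal (and acceptable to me), but then, after peaking at around 5 MB/s, the speed slowly declines for no apparent reason.

For a larger file it just keeps dropping further and further until it is really crawling.

Interestingly, I can download two files at the same time without one interfering with the other. Say a 300 MB file is crawling along at 0.4 MB/s; another process will quickly jump to 5 MB/s (before it starts dropping as well), so it seems to be the repeated sending and/or receiving that somehow slows the socket down.

The code is pretty simple: the server sends data until everything is sent (with an ack between chunks), and the receiver just receives until a chunk is downloaded and then sends an ack, rinse and repeat until everything is downloaded. It works just fine.

Do I have to perform some magic on the socket, such as flushing it, or is Python uncomfortable handling lots of chunks of data (I send 32 kB chunks, so they do add up, but they aren't indexed or stored, just appended to the final result data)?

Any help is greatly appreciated!

Edit: watered-down versions of the send and receive functions: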

import errno
import socket
from struct import pack, unpack

# Encodes an unsigned long long (8 bytes) holding the size of the rest of the data,
# then sends the data in packets.
# Sends msg as a string, or opens the file if file != None, reads it and sends it off in chunks (so we won't fill up all the RAM)
def socket_secure_send(s, msg, file=None, extensive_logging=False):
    # As we all know, data sent over the internet might be split up, so send how many
    # bytes we'll try to push through here. An unsigned 8Byte integer should do the trick
    # hopefully for the foreseeable future:
    datasize = len(msg)
    lengthdata = pack('>Q', datasize)
    # Send the msg size as an unsigned long long; '>Q' is big-endian (network byte order)
    try:
        # Send size of data to come
        s.send(lengthdata)

        # Send off the actual data
        
        # send by packets
        max_packet_size = get_configuration_value('server_send_package_size', 32768)
        data_size = len(msg)
        left_to_send = data_size
        sent = 0
        while left_to_send > 0:
            package_size = min(max_packet_size, left_to_send)

            # What? Didn't we already send the size of the data earlier?
            # We sure did, but this is so when someone is downloading your 1.5TB
            # it won't choke the OS:s small buffers
            # So here we'll use a smaller 32bits unsigned integer,
            # but beware, buffers can be small so don't use numbers too big!
            lengthdata = pack('>L', package_size)
            s.send(lengthdata)

            # Now send a chunk of the data
            data_to_send = msg[sent:sent+package_size]
            data_to_send_len = len(data_to_send)
            a = s.send(data_to_send)

            sent += package_size
            left_to_send -= package_size

            # And wait for the little Ack!
            ack = s.recv(1)  # todo check the ack = b'1' or something

        # Check if the data went through, other socket sends b'0' for success and others for error
        ack = s.recv(1)

        return ack == b'0'
    except socket.error:
        print("SSS There was a problem in socket_secure_send")

    return False


def read_chunk(s, chunk_size):
    chunk = b''
    while chunk_size > 0:
        part = s.recv(chunk_size)  # try to recv the missing data in the chunk
        if part is None:
            s.close()
            return

        chunk += part
        chunk_size -= len(part)

    return chunk

# Receive data in packages
# Recv:s in memory, or to a file if file != None
def socket_secure_recv(s, max_read_size, file=None, extensive_logging=False):
    # Recv the long long size data
    try:
        tmp = s.recv(8)
    except socket.error as e:
        err = e.args[0]
        if err == errno.EAGAIN or err == errno.EWOULDBLOCK:
            if extensive_logging:
                if err == errno.EAGAIN:
                    print('s-s-r no (size) data avaliable: errno.EAGAIN')
                if err == errno.EWOULDBLOCK:
                    print('s-s-r no (size) data avaliable: errno.EWOULDBLOCK')
            # No data available
            return None
        else:
            # Real error:
            print("s-s-r error: ", e)
            return None

    if len(tmp) != 8:
        return None

    (to_read, ) = unpack('>Q', tmp)

    data = b''
    read = to_read

    while read > 0:
        # Recv a chunk:
        chunk_size_data = s.recv(4)  # A 4-byte unsigned int for the chunk size
        if len(chunk_size_data) != 4:
            return None

        (chunk_size,) = unpack('>L', chunk_size_data)

        # get a chunk:
        chunk = read_chunk(s, min(to_read, chunk_size))

        data = data + chunk

        # Send back ACK
        s.send(b'0')

        read = to_read - len(data)

    s.send(b'0')
    return data
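
A side note on the code above, offered as a minimal sketch rather than as the poster's method: on a TCP socket, `recv(n)` may return fewer than `n` bytes, and it returns `b''` (never `None`) once the peer has closed, while `send` may transmit only part of its argument. The helper names `recv_exact`, `send_msg` and `recv_msg` are hypothetical, and `socket.socketpair()` is used only to keep the example self-contained.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:  # recv() returns b'' when the peer has closed the connection
            raise ConnectionError("socket closed before all bytes arrived")
        buf += part  # a bytearray grows in place instead of being rebuilt
    return bytes(buf)


def send_msg(sock, payload):
    """Send an 8-byte big-endian length header, then the payload."""
    sock.sendall(struct.pack(">Q", len(payload)))  # sendall() retries partial sends
    sock.sendall(payload)


def recv_msg(sock):
    """Receive one message framed by send_msg."""
    (size,) = struct.unpack(">Q", recv_exact(sock, 8))
    return recv_exact(sock, size)


# Small demonstration over a connected pair of local sockets.
left, right = socket.socketpair()
send_msg(left, b"hello")
print(recv_msg(right))  # b'hello'
left.close()
right.close()
```

With a helper like this, the 8-byte size header and the 4-byte chunk header can both be read without assuming that a single `recv` call delivers them whole.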



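[Comments]:

You've added the python tag, but you haven't added any code to your question.

32 kB chunks are no problem for Python, and no socket flushing is needed. A client/server like this should run very fast, so the cause is something in your implementation. If you just read/write the chunks and then discard them, everything should be fine. Maybe the acks you've implemented, which I guess include some kind of command handshake, are the problem, but that's just a guess. You could add some logging with timestamps to your code and write it to a file, then use it later to analyse where time is being wasted.

@quamrana Yes, because this is a Python problem. The same behaviour works very well in C/C++, for example.

You're not keeping the 1000 data additions in memory before writing to the file? Still, if that is a list of 32k strings it is only about 32 MB plus some overhead. Your disk should be able to handle a 5 MB/s write rate, so that is no big deal.

You would be better off keeping a list, data_list.append(data), rather than data = data + chunk, because that copies the entire accumulated data every time you do the +. But you don't want to hold a 250M file in memory, and yet you seem to read the whole thing in the function. Another issue: tmp = s.recv(8) may end in tears, because TCP may give you, say, 7 bytes and then 1 byte. You could make a call that receives exactly what was requested.

[Answer 1]: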

You can use a list to buffer the data:

data = []
read = to_read

while read > 0:
    # Recv a chunk:
    chunk_size_data = s.recv(4)  # A 4-byte unsigned int for the chunk size
    if len(chunk_size_data) != 4:
        return None

    (chunk_size,) = unpack('>L', chunk_size_data)

    # get a chunk:
    chunk = read_chunk(s, min(to_read, chunk_size))

    data.append(chunk)

    # Send back ACK
    s.send(b'0')

    read -= len(chunk)  # data is a list now, so len(data) would count chunks, not bytes

s.send(b'0')
return b''.join(data)

"Getting slower" together with "slaps it onto a 'data' variable" got the old spidey sense tingling there.
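
To illustrate why the list version helps, here is a rough sketch that counts bytes copied under a simple model. The model is an assumption: each `data = data + chunk` on `bytes` builds a new object containing both operands, which is how CPython behaves as far as I know. The function names are illustrative and do not come from the question.

```python
CHUNK = 32 * 1024  # 32 KiB, the chunk size mentioned in the question


def bytes_copied_by_concat(num_chunks, chunk_size=CHUNK):
    """Bytes copied when accumulating with data = data + chunk.

    Step k produces a new object of k * chunk_size bytes, so the
    total grows quadratically with the number of chunks.
    """
    return sum(k * chunk_size for k in range(1, num_chunks + 1))


def bytes_copied_by_join(num_chunks, chunk_size=CHUNK):
    """Bytes copied by b''.join(parts): each chunk is copied once."""
    return num_chunks * chunk_size


def collect_with_join(chunks):
    """Accumulate chunks in a list and join them once at the end."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return b"".join(parts)


def collect_with_bytearray(chunks):
    """Alternative: append to a bytearray, which grows in place."""
    buf = bytearray()
    for chunk in chunks:
        buf += chunk
    return bytes(buf)


n = 8000  # 8000 chunks of 32 KiB is roughly a 250 MB file
print(bytes_copied_by_concat(n) // bytes_copied_by_join(n))
```

Under this model, a 250 MB transfer costs about 4000 times more copying with repeated `+` than with a single join, which is consistent with the download slowing down as more data accumulates.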

[Comments]:

Dude, you're the man, thank you! P.S. Extra points for b''.join(data). Cheers and thanks again!
