Python socket send/recv gets gradually slower
Posted: 2021-04-09 03:48:13
Question:
I'm using Python (3.8) to share files over my network. This is done by a server/listener that sends chunks of data (when asked to) and a client/receiver that downloads the data.
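Everything works fine except for one thing: the download speed always drops gradually but quickly, noticeably for a 30 MB file and severely for a 250 MB file.
Here is an example of downloading a 25 MB file: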
query for data
Downloading File [1] at 1.42 kByte/sec
Downloading File [2] at 265.98 kByte/sec
Downloading File [3] at 530.53 kByte/sec
Downloading File [4] at 795.08 kByte/sec
Downloading File [5] at 1056.0 kByte/sec
Downloading File [6] at 1319.12 kByte/sec
Downloading File [7] at 1582.25 kByte/sec
Downloading File [8] at 1845.38 kByte/sec
Downloading File [9] at 2108.51 kByte/sec
Downloading File [10] at 2368.0 kByte/sec
Downloading File [11] at 2635.4 kByte/sec
Downloading File [12] at 2898.53 kByte/sec
Downloading File [13] at 3165.94 kByte/sec
Downloading File [14] at 3424.0 kByte/sec
Downloading File [15] at 3682.85 kByte/sec
Downloading File [16] at 3947.4 kByte/sec
Downloading File [17] at 4220.51 kByte/sec
Downloading File [18] at 4475.08 kByte/sec
Downloading File [19] at 4736.0 kByte/sec
Downloading File [20] at 5010.53 kByte/sec
Downloading File [21] at 5266.53 kByte/sec
Downloading File [22] at 5274.69 kByte/sec
Downloading File [23] at 5227.19 kByte/sec
Downloading File [24] at 5173.18 kByte/sec
Downloading File [25] at 5109.13 kByte/sec
Downloading File [26] at 5040.12 kByte/sec
Downloading File [27] at 4976.76 kByte/sec
Downloading File [28] at 4916.04 kByte/sec
Downloading File [29] at 4839.46 kByte/sec
Downloading File [30] at 4779.34 kByte/sec
Downloading File [31] at 4717.97 kByte/sec
Downloading File [32] at 4654.68 kByte/sec
Downloading File [33] at 4587.11 kByte/sec
Downloading File [34] at 4521.86 kByte/sec
Downloading File [35] at 4489.31 kByte/sec
Downloading File [36] at 4463.14 kByte/sec
Downloading File [37] at 4447.0 kByte/sec
Downloading File [38] at 4429.8 kByte/sec
Downloading File [39] at 4408.23 kByte/sec
Downloading File [40] at 4385.68 kByte/sec
Downloading File [41] at 4362.61 kByte/sec
Downloading File [42] at 4332.12 kByte/sec
Downloading File [43] at 4277.88 kByte/sec
Downloading File [44] at 4241.96 kByte/sec
Downloading File [45] at 4214.6 kByte/sec
Downloading File [46] at 4188.76 kByte/sec
Downloading File [47] at 4161.43 kByte/sec
Downloading File [48] at 4122.81 kByte/sec
Downloading File [49] at 4078.92 kByte/sec
Downloading File [50] at 4038.91 kByte/sec
Downloading File [51] at 3995.1 kByte/sec
Downloading File [52] at 3946.54 kByte/sec
Downloading File [53] at 3905.08 kByte/sec
Downloading File [54] at 3862.33 kByte/sec
Downloading File [55] at 3818.92 kByte/sec
Downloading File [56] at 3778.95 kByte/sec
Downloading File [57] at 3736.93 kByte/sec
Downloading File [58] at 3698.62 kByte/sec
Downloading File [59] at 3669.39 kByte/sec
Downloading File [60] at 3638.99 kByte/sec
Downloading File [61] at 3611.71 kByte/sec
Downloading File [62] at 3576.03 kByte/sec
Downloading File [63] at 3546.88 kByte/sec
Downloading File [64] at 3516.09 kByte/sec
Downloading File [65] at 3483.13 kByte/sec
Downloading File [66] at 3451.92 kByte/sec
Downloading File [67] at 3419.35 kByte/sec
Downloading File [68] at 3392.87 kByte/sec
Downloading File [69] at 3366.28 kByte/sec
Downloading File [70] at 3337.75 kByte/sec
Downloading File [71] at 3306.12 kByte/sec
Downloading File [72] at 3279.61 kByte/sec
Downloading File [73] at 3248.65 kByte/sec
Downloading File [74] at 3222.84 kByte/sec
Downloading File [75] at 3191.29 kByte/sec
Downloading File [76] at 3159.18 kByte/sec
Downloading File [77] at 3127.02 kByte/sec
Downloading File [78] at 3099.15 kByte/sec
Downloading File [79] at 3070.14 kByte/sec
Downloading File [80] at 3033.71 kByte/sec
Downloading File [81] at 3007.82 kByte/sec
Downloading File [82] at 2978.38 kByte/sec
Downloading File [83] at 2950.2 kByte/sec
Downloading File [84] at 2921.61 kByte/sec
Downloading File [85] at 2889.32 kByte/sec
Downloading File [86] at 2860.66 kByte/sec
Downloading File [87] at 2833.2 kByte/sec
Downloading File [88] at 2805.48 kByte/sec
Downloading File [89] at 2775.55 kByte/sec
Downloading File [90] at 2749.85 kByte/sec
Downloading File [91] at 2722.94 kByte/sec
Downloading File [92] at 2696.21 kByte/sec
Downloading File [93] at 2670.54 kByte/sec
Downloading File [94] at 2643.62 kByte/sec
Downloading File [95] at 2620.01 kByte/sec
Downloading File [96] at 2596.48 kByte/sec
Downloading File [97] at 2573.56 kByte/sec
Downloading File [98] at 2550.22 kByte/sec
Downloading File [99] at 2525.19 kByte/sec
Downloading File [100] at 2503.39 kByte/sec
Downloading File done [100%] in around 10seconds
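As you can see, it takes some initial work to get up to speed, which is normal (and acceptable to me), but then, after peaking at around 5 MB/s, the speed slowly declines for no apparent reason.
For a larger file it just keeps dropping further and further until it really crawls.
Interestingly, I can download two files at the same time without one interfering with the other. Say a 300 MB file is crawling along at 0.4 MB/s; another process quickly jumps to 5 MB/s (before it starts dropping too), so it seems that the repeated sending and/or receiving is somehow slowing down the socket.
The code is pretty simple: the server sends data until everything is sent (with an ack between chunks), and the receiver just receives until a chunk is downloaded and then sends an ack, rinse and repeat until everything is downloaded. It works just fine.
Do I have to perform some magic on the socket, such as flushing it, or is Python uncomfortable handling many chunks of data? (I send 32 kB chunks, so they do add up, but they aren't indexed or stored, just appended to the final result data.)
Any help is much appreciated!
Edit: watered-down send and receive functions: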
import errno
import socket
from struct import pack, unpack

# Encodes a long long (8 bytes) for the size of the rest of the data,
# then sends it by packages.
# Sends msg as a string, or opens the file if file != None and reads it and sends it off by chunks (so we won't fill up all the RAM)
def socket_secure_send(s, msg, file=None, extensive_logging=False):
    # As we all know, data sent over the internet might be split up, so send how many
    # bytes we'll try to push through here. An unsigned 8-byte integer should do the trick
    # hopefully for the foreseeable future:
    datasize = len(msg)
    lengthdata = pack('>Q', datasize)
    # Send a long long, big-endian encoded ('>Q'), with the msg size
    try:
        # Send size of data to come
        s.send(lengthdata)
        # Send off the actual data
        # send by packets
        max_packet_size = get_configuration_value('server_send_package_size', 32768)
        data_size = len(msg)
        left_to_send = data_size
        sent = 0
        while left_to_send > 0:
            package_size = min(max_packet_size, left_to_send)
            # What? Didn't we already send the size of the data earlier?
            # We sure did, but this is so when someone is downloading your 1.5TB
            # it won't choke the OS:s small buffers
            # So here we'll use a smaller 32-bit unsigned integer,
            # but beware, buffers can be small so don't use numbers too big!
            lengthdata = pack('>L', package_size)
            s.send(lengthdata)
            # Now send a chunk of the data
            data_to_send = msg[sent:sent+package_size]
            data_to_send_len = len(data_to_send)
            a = s.send(data_to_send)
            sent += package_size
            left_to_send -= package_size
            # And wait for the little Ack!
            ack = s.recv(1)  # todo check the ack = b'1' or something
        # Check if the data went through, other socket sends b'0' for success and others for error
        ack = s.recv(1)
        return ack == b'0'
    except socket.error:
        print("SSS There was a problem in socket_secure_send")
        return False

def read_chunk(s, chunk_size):
    chunk = b''
    while chunk_size > 0:
        part = s.recv(chunk_size)  # try to recv the missing data in the chunk
        if part is None:
            s.close()
            return
        chunk += part
        chunk_size -= len(part)
    return chunk

# Receive data in packages
# Recv:s in memory, or to a file if file != None
def socket_secure_recv(s, max_read_size, file=None, extensive_logging=False):
    # Recv the long long size data
    try:
        tmp = s.recv(8)
    except socket.error as e:
        err = e.args[0]
        if err == errno.EAGAIN or err == errno.EWOULDBLOCK:
            if extensive_logging:
                if err == errno.EAGAIN:
                    print('s-s-r no (size) data available: errno.EAGAIN')
                if err == errno.EWOULDBLOCK:
                    print('s-s-r no (size) data available: errno.EWOULDBLOCK')
            # No data available
            return None
        else:
            # Real error:
            print("s-s-r error: ", e)
            return None
    if len(tmp) != 8:
        return None
    (to_read, ) = unpack('>Q', tmp)
    data = b''
    read = to_read
    while read > 0:
        # Recv a chunk:
        chunk_size_data = s.recv(4)  # A 4-byte unsigned int for chunk size
        if len(chunk_size_data) != 4:
            return None
        (chunk_size,) = unpack('>L', chunk_size_data)
        # get a chunk:
        chunk = read_chunk(s, min(to_read, chunk_size))
        data = data + chunk
        # Send back ACK
        s.send(b'0')
        read = to_read - len(data)
    s.send(b'0')
    return data
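Comments:
You have added the python tag, but you haven't added any code to your question.
32 kB chunks are no problem for Python, and no socket flushing is needed. A client/server like this should run really fast, so the cause lies somewhere in your implementation. If you just read/write these chunks and then discard them, everything should be fine. Maybe the acks you have implemented, including some kind of command handshake I guess, are the problem. But that is just a guess. You could add some logging with timestamps to your code and write it to a file, then use it later to analyze where the time is being lost.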
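The timestamp logging suggested in the comment above could look something like the following minimal sketch. The names `timed` and `demo` are hypothetical and not part of the original code, and an in-memory `io.StringIO` stands in for a real log file. Each wrapped step writes a wall-clock timestamp and its duration, so a step that keeps getting slower stands out in the log.

```python
import io
import time
from contextlib import contextmanager


@contextmanager
def timed(label, log):
    """Write a wall-clock timestamp and the duration of the wrapped block to log."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        log.write(f"{time.time():.3f} {label} took {elapsed_ms:.3f} ms\n")


def demo():
    """Time a few appends using the same accumulation pattern as the question."""
    log = io.StringIO()  # stand-in for a real log file opened with open(...)
    data = b""
    for i in range(5):
        with timed(f"append chunk {i}", log):
            data = data + b"x" * 32768
    return log.getvalue()


if __name__ == "__main__":
    print(demo())
```

In the real receive loop one would presumably wrap the `recv` call, the accumulation step and the ack separately, and compare their durations over the course of a download.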
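@quamrana Yes, because this is a Python problem. The same approach works perfectly well in C/C++, for example.
You aren't keeping thousands of appended chunks in memory before writing to the file, are you? Even if this is a list of 32k strings, that is still only about 32 MB plus a little overhead. Your disk should be able to handle a 5 MB/s write rate, so that is no big deal.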
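You would be better off keeping a list and calling data_list.append(data) instead of data = data + chunk, because the latter copies the entire accumulated data every time you do the +. But you don't want to hold a 250 MB file in memory, and yet you seem to read the whole thing inside the function. Another problem: tmp = s.recv(8) may end in tears, because TCP might give you, say, 7 bytes and then 1 byte. You can make a call that receives exactly the number of bytes requested.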
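A call that receives exactly the requested number of bytes, as the comment above suggests, might be sketched like this. The name `recv_exactly` is hypothetical, and returning `None` on an early close is just one possible convention. `socket.recv` returns an empty bytes object once the peer has closed the connection, so that is the condition checked inside the loop.

```python
import socket


def recv_exactly(sock, n):
    """Read exactly n bytes from sock, or return None if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:  # b'' means the connection was closed by the peer
            return None
        buf += part  # bytearray grows in place, no full copy per append
    return bytes(buf)


if __name__ == "__main__":
    # Local demonstration with a connected socket pair instead of a real server.
    left, right = socket.socketpair()
    left.sendall(b"1234567")
    left.sendall(b"8")
    print(recv_exactly(right, 8))
    left.close()
    right.close()
```

With such a helper, both the 8-byte total size and the 4-byte chunk size could be read reliably even when TCP delivers them in pieces.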
Answer 1:
You can use a list to buffer the data:
data = []
read = to_read
while read > 0:
    # Recv a chunk:
    chunk_size_data = s.recv(4)  # A 4-byte unsigned int for chunk size
    if len(chunk_size_data) != 4:
        return None
    (chunk_size,) = unpack('>L', chunk_size_data)
    # get a chunk:
    chunk = read_chunk(s, min(to_read, chunk_size))
    data.append(chunk)
    # Send back ACK
    s.send(b'0')
    # data is a list now, so count received bytes rather than using len(data)
    read -= len(chunk)
s.send(b'0')
return b''.join(data)
The phrases "getting slower" and "slaps it onto a 'data' variable" were tingling the old spidey sense there.
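To see why the accumulation pattern matters so much, here is a small back-of-the-envelope sketch. It assumes every `data = data + chunk` builds a new bytes object and copies everything received so far, which is how immutable bytes objects behave in CPython. The function name `bytes_copied_by_concat` is hypothetical and exists only for this illustration.

```python
def bytes_copied_by_concat(total_size, chunk_size):
    """Total bytes copied if each `data = data + chunk` copies the whole new result."""
    copied = 0
    accumulated = 0
    while accumulated < total_size:
        accumulated += min(chunk_size, total_size - accumulated)
        copied += accumulated  # the new bytes object holds everything so far
    return copied


if __name__ == "__main__":
    chunk = 32 * 1024
    for megabytes in (25, 250):
        total = megabytes * 1024 * 1024
        print(megabytes, "MiB file:", bytes_copied_by_concat(total, chunk), "bytes copied")
```

Under that assumption a 25 MiB file in 32 KiB chunks causes roughly 10.5 GB of copying, and a 250 MiB file roughly 1.05 TB, because the cost grows with the square of the number of chunks. Appending to a list and joining once at the end copies each byte only once in the final join.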
Comments:
Dude, you're the man, thanks! P.S. Bonus points for the b''.join(data). Cheers and thanks again!