A simple timer for comparing the speed of different ways of building iterables

Posted 吾码2016


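Background: if you are not familiar with iterators and generators, you may want to read the two earlier posts on those topics first.

Initial version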

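import time

reps = 1000
repslist = range(reps)


def timer(func, *pargs, **kargs):
    # The original used time.clock(), which was removed in Python 3.8;
    # time.perf_counter() is its replacement.
    start = time.perf_counter()
    # Call func reps times and measure the total elapsed time.
    for i in repslist:
        ret = func(*pargs, **kargs)
    elapsed = time.perf_counter() - start
    return (elapsed, ret)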

This is the initial version of the timer; save it as timer.py.

Let's start with a test run:

from timer import timer
import sys

reps = 100000
repslist = range(reps)

def forloop():
    res = []
    for x in repslist:
        res.append(abs(x))
    return res

def listComp():
    return [abs(x) for x in repslist]

def mapCall():
    return list(map(abs, repslist))

def genExpr():
    return list(abs(x) for x in repslist)

def genFunc():
    def gen():
        for x in repslist:
            yield abs(x)
    return list(gen())

print(sys.version)

# Time each variant and print its name, total time, and first/last items.
for test in (forloop, listComp, mapCall, genExpr, genFunc):
    elapsed, result = timer(test)
    print('-' * 33)
    print('%-9s:%.5f => [%s...%s]' % (test.__name__, elapsed, result[0], result[-1]))

The results are as follows:

C:\Anaconda3\python.exe C:/Users/Brady/PycharmProjects/FAQ/literor.py
3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
---------------------------------
forloop :11.40492 => [0...99999]
---------------------------------
listComp :7.58494 => [0...99999]
---------------------------------
mapCall :4.28971 => [0...99999]
---------------------------------
genExpr :10.49181 => [0...99999]
---------------------------------
genFunc :10.76498 => [0...99999]

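From these results we can see that:

  • map is faster than the list comprehension, and both are much faster than the for loop.
  • The generator expression and the generator function fall in between.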
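Before reading much into the timings, it can help to confirm that the five variants really build the same list. The sketch below is only an illustration: the functions are redefined to take the data as a parameter (the names for_loop, list_comp, and so on are not from the original code) so that the snippet runs on its own.

```python
def for_loop(data):
    res = []
    for x in data:
        res.append(abs(x))
    return res

def list_comp(data):
    return [abs(x) for x in data]

def map_call(data):
    return list(map(abs, data))

def gen_expr(data):
    return list(abs(x) for x in data)

def gen_func(data):
    def gen():
        for x in data:
            yield abs(x)
    return list(gen())

variants = (for_loop, list_comp, map_call, gen_expr, gen_func)
sample = range(-5, 5)  # includes negatives so abs() actually matters

# Run every variant on the same input and compare against the first result.
results = [variant(sample) for variant in variants]
all_equal = all(result == results[0] for result in results)
print(all_equal)
```

If the variants ever disagreed, comparing their speed would not mean much.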

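If we use a custom operation instead of a built-in function (here x+10, which map has to wrap in a lambda), the results get more interesting: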

from timer import timer
import sys

reps = 100000
repslist = range(reps)

def forloop():
    res = []
    for x in repslist:
        res.append(x + 10)
    return res

def listComp():
    return [x + 10 for x in repslist]

def mapCall():
    return list(map(lambda x: x + 10, repslist))

def genExpr():
    return list(x + 10 for x in repslist)

def genFunc():
    def gen():
        for x in repslist:
            yield x + 10
    return list(gen())

print(sys.version)

# Time each variant and print its name, total time, and first/last items.
for test in (forloop, listComp, mapCall, genExpr, genFunc):
    elapsed, result = timer(test)
    print('-' * 33)
    print('%-9s:%.5f => [%s...%s]' % (test.__name__, elapsed, result[0], result[-1]))

This gives the following results:

3.7.4 (default, Aug  9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
---------------------------------
forloop :26.69562 => [10...100009]
---------------------------------
listComp :16.46341 => [10...100009]
---------------------------------
mapCall :19.51527 => [10...100009]
---------------------------------
genExpr :10.53358 => [10...100009]
---------------------------------
genFunc :10.85899 => [10...100009]

Process finished with exit code 0

Honestly, this result is hard to explain... it looks like it contradicts what I just said...

So I ran it again, and got the following:

3.7.4 (default, Aug  9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
---------------------------------
forloop :11.92378 => [10...100009]
---------------------------------
listComp :7.27866 => [10...100009]
---------------------------------
mapCall :12.92113 => [10...100009]
---------------------------------
genExpr :10.50988 => [10...100009]
---------------------------------
genFunc :10.56482 => [10...100009]

Process finished with exit code 0

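This result is closer to what we expected:

  • With a custom function, map is slower than the for loop.
  • The list comprehension is the fastest.
  • The generator expression is slower than the list comprehension, but about the same as the generator function.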

Improved version

These results mainly come down to how the Python interpreter is implemented.
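One implementation detail that plausibly contributes: map with a lambda makes a Python-level function call for every element, while a list comprehension evaluates x + 10 inline. The sketch below counts those calls with sys.setprofile. The helper name count_python_calls and the input size of 1000 are made up for illustration, and this shows only one factor, not a full explanation of the timings.

```python
import sys
from collections import Counter

def count_python_calls(func):
    """Run func() and count Python-level calls, keyed by function name."""
    counts = Counter()

    def profiler(frame, event, arg):
        # "call" fires for Python functions; C functions report "c_call".
        if event == "call":
            counts[frame.f_code.co_name] += 1

    sys.setprofile(profiler)
    try:
        func()
    finally:
        sys.setprofile(None)
    return counts

data = range(1000)

def map_call():
    return list(map(lambda x: x + 10, data))

def list_comp():
    return [x + 10 for x in data]

map_counts = count_python_calls(map_call)
comp_counts = count_python_calls(list_comp)

print("lambda calls via map:          ", map_counts["<lambda>"])
print("lambda calls via comprehension:", comp_counts["<lambda>"])
```

With a built-in such as abs, map has no Python-level call to make per element, which fits with map coming out ahead in the first test.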

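It also shows that our timer is not rigorous enough.

So let's improve it:

  • Platform compatibility: on Unix-like systems, time.time gives better resolution than time.clock. (On Python 3.3 and later, time.perf_counter covers both cases.)
  • Random system load can cause fluctuations, so taking the shortest time across repeated runs is more reliable than taking the total running time.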
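The "take the minimum" idea is also what the standard library's timeit module supports through timeit.repeat. The following is a small sketch of that alternative, not the timer this post builds; the repeat and number values are arbitrary choices for illustration.

```python
import timeit

def list_comp():
    return [x + 10 for x in range(1000)]

# Take 5 separate measurements, each timing 100 calls of list_comp.
timings = timeit.repeat(list_comp, repeat=5, number=100)

# The minimum is the measurement least disturbed by other system activity.
best_time = min(timings)
total_time = sum(timings)
print(f"best of 5: {best_time:.5f}s, total: {total_time:.5f}s")
```

The total includes whatever noise hit each run, while the minimum approximates how fast the code can go on this machine.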

The revised timer

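import time
import sys

# Pick the higher-resolution clock for the platform. The original used
# time.clock on Windows, which was removed in Python 3.8;
# time.perf_counter is its replacement.
if sys.platform[:3] == 'win':
    timefunc = time.perf_counter
else:
    timefunc = time.time


def trace(*args):
    """
    Used for debugging.
    :param args:
    :return:
    """
    pass

def timer(func, *pargs, **kargs):
    # Total time for _reps calls of func (default 1000).
    _reps = kargs.pop('_reps', 1000)
    trace(func, pargs, kargs, _reps)
    repslist = range(_reps)
    start = timefunc()
    for i in repslist:
        ret = func(*pargs, **kargs)
    elapsed = timefunc() - start
    return (elapsed, ret)


def best(func, *pargs, **kargs):
    # Fastest single call out of _reps attempts (default 50).
    _reps = kargs.pop('_reps', 50)
    best_time = 2 ** 32
    for i in range(_reps):
        (elapsed, ret) = timer(func, *pargs, _reps=1, **kargs)
        if elapsed < best_time:
            best_time = elapsed
    return (best_time, ret)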

The revised test code

from timer import timer
from timer import best
import sys

reps = 100000
repslist = range(reps)

def forloop():
    res = []
    for x in repslist:
        res.append(x + 10)
    return res

def listComp():
    return [x + 10 for x in repslist]

def mapCall():
    return list(map(lambda x: x + 10, repslist))

def genExpr():
    return list(x + 10 for x in repslist)

def genFunc():
    def gen():
        for x in repslist:
            yield x + 10
    return list(gen())

print(sys.version)

# Run every test under both timing strategies: total time and best time.
for tester in (timer, best):
    print(f'<{tester.__name__}>')
    for test in (forloop, listComp, mapCall, genExpr, genFunc):
        elapsed, result = tester(test)
        print('-' * 35)
        print('%-9s:%.5f => [%s...%s]' % (test.__name__, elapsed, result[0], result[-1]))

Let's look at the results:

3.7.4 (default, Aug  9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
<timer>
-----------------------------------
forloop :11.18427 => [10...100009]
-----------------------------------
listComp :7.33068 => [10...100009]
-----------------------------------
mapCall :13.33474 => [10...100009]
-----------------------------------
genExpr :11.25375 => [10...100009]
-----------------------------------
genFunc :11.03975 => [10...100009]
<best>
-----------------------------------
forloop :0.00904 => [10...100009]
-----------------------------------
listComp :0.00525 => [10...100009]
-----------------------------------
mapCall :0.01133 => [10...100009]
-----------------------------------
genExpr :0.00845 => [10...100009]
-----------------------------------
genFunc :0.00785 => [10...100009]

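Judging by the best (shortest) run times, this fully matches the conclusions above:

  • The list comprehension is the fastest.
  • map is slower than a plain for loop.
  • The generator expression is faster than the for loop, and about the same speed as the generator function.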
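To make the ranking easier to read, the best times reported above can be expressed relative to the fastest variant. This is just a small illustration using the numbers from the <best> section of the output; the variable names are made up for the example.

```python
# Best times copied from the <best> section of the output above.
best_times = {
    "forloop": 0.00904,
    "listComp": 0.00525,
    "mapCall": 0.01133,
    "genExpr": 0.00845,
    "genFunc": 0.00785,
}

# Find the fastest variant and scale every time relative to it.
fastest = min(best_times, key=best_times.get)
relative = {name: t / best_times[fastest] for name, t in best_times.items()}

# Print from fastest to slowest.
for name, ratio in sorted(relative.items(), key=lambda item: item[1]):
    print(f"{name:<9}: {ratio:.2f}x")
```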

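Conclusion:

This post was written purely for fun. If you have chosen Python, don't obsess over execution speed; after all, Python's job is just to look pretty...

When optimizing Python code, think about readability and simplicity first, and only tune performance if you truly have nothing better to do.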
