Using the concurrent.futures module

Posted 2021-02-18 weixia-blog

Multithreading

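ThreadPoolExecutor.map makes it easy to run tasks on multiple threads. It returns a generator; each step of iteration, whether through the built-in next function or a for loop, yields the value that Future.result() would return for the next task, in the order the tasks were passed in.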

from concurrent import futures

def download_one(url):
    pass

def download_many(url_list):
    ...
    # thread_count is the maximum number of threads
    # url_list is the list of tasks

    with futures.ThreadPoolExecutor(thread_count) as executor:
        result = executor.map(download_one, url_list)

    ...
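To make the ordering behaviour of map concrete, here is a minimal runnable sketch. The names slow_square and square_many are hypothetical stand-ins for download_one and download_many, and the sleep only simulates work; none of this comes from the original post.

```python
import time
from concurrent import futures


def slow_square(n):
    # Hypothetical stand-in for download_one: smaller inputs sleep longer,
    # so they finish later than larger ones.
    time.sleep(0.05 * (5 - n))
    return n * n


def square_many(numbers, thread_count=4):
    with futures.ThreadPoolExecutor(max_workers=thread_count) as executor:
        # map returns a lazy iterator; list() collects the results
        # in the order of the inputs, not in order of completion.
        return list(executor.map(slow_square, numbers))


if __name__ == "__main__":
    print(square_many([1, 2, 3, 4]))  # [1, 4, 9, 16]
```

Even though the task for 4 finishes first here, the results come back in input order, which is the main difference from the as_completed approach.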

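Alternatively, we can create Future objects by hand with ThreadPoolExecutor.submit. Unlike ThreadPoolExecutor.map, it takes a single task rather than a list of tasks. We then use concurrent.futures.as_completed to wait for the tasks to finish, and call Future.result() on each future to get its result.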

from concurrent import futures

def download_one(url):
    pass

def download_many(url_list):
    ...
    # thread_count is the maximum number of threads
    # url_list is the list of tasks

    with futures.ThreadPoolExecutor(thread_count) as executor:
        future_list = []
        for url in url_list:
            # submit schedules one task and returns its Future
            future = executor.submit(download_one, url)
            future_list.append(future)

        # as_completed yields each future as soon as it finishes
        for future in futures.as_completed(future_list):
            result = future.result()

    ...
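Because as_completed yields futures in completion order, it is often useful to remember which input each future belongs to. The following is one possible sketch of that pattern; word_length and lengths_as_completed are hypothetical names used only for illustration, and the task is a toy replacement for download_one.

```python
from concurrent import futures


def word_length(word):
    # Hypothetical task standing in for download_one.
    if not word:
        raise ValueError("empty word")
    return len(word)


def lengths_as_completed(words, thread_count=4):
    results = {}
    errors = {}
    with futures.ThreadPoolExecutor(max_workers=thread_count) as executor:
        # Map each Future back to the input it was created for.
        future_to_word = {executor.submit(word_length, w): w for w in words}
        for future in futures.as_completed(future_to_word):
            word = future_to_word[future]
            try:
                results[word] = future.result()
            except ValueError as exc:
                # result() re-raises the exception raised inside the task.
                errors[word] = str(exc)
    return results, errors


if __name__ == "__main__":
    print(lengths_as_completed(["spam", "", "eggs!"]))
```

Keeping a dict from future to input is a design choice, not a requirement; a plain list works when the input is not needed afterwards.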

A concurrent.futures.Future instance is created only when something is handed to a concurrent.futures.Executor subclass for execution. For example, Executor.submit() takes a callable, schedules it to run, and returns a future.

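Future.done() does not block. It returns a Boolean indicating whether the callable linked to the Future has finished executing. Client code usually does not ask whether a Future is done; it waits to be notified instead.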

Future.add_done_callback() takes a single argument, a callable, which is invoked with the Future as its only argument once the Future has finished running.
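The behaviour of done() and add_done_callback() can be sketched as follows. The function callback_demo and the use of threading.Event to hold the task open are assumptions made for the demonstration, not part of the original post.

```python
import threading
from concurrent import futures


def callback_demo():
    release = threading.Event()  # lets the caller decide when the task ends
    finished = []

    def task():
        # Block until the caller sets the event (with a safety timeout).
        release.wait(timeout=5)
        return "done"

    with futures.ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(task)
        # The callback receives the finished Future as its only argument.
        future.add_done_callback(lambda f: finished.append(f.result()))
        done_before = future.done()  # non-blocking; the task has not finished yet
        release.set()                # allow the task to complete
    # Leaving the with block waits for the worker thread to finish.
    return done_before, future.done(), finished


if __name__ == "__main__":
    print(callback_demo())  # (False, True, ['done'])
```

Note that the callback runs in the worker thread that completed the task, so it should stay short and thread-safe.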

GIL

Because of the global interpreter lock (GIL), only one thread can execute Python bytecode at a time.

As a result, using multiple threads for CPU-bound tasks is no faster than running them sequentially.

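Every standard library function that performs blocking I/O releases the GIL while waiting for the operating system to return a result. This means the GIL does almost no harm to I/O-bound processing.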
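A rough way to see this is to time a batch of blocking waits run on threads. The sketch below uses time.sleep, which also releases the GIL, as a stand-in for real I/O; fake_io and timed_threads are hypothetical names, and the exact timings will vary by machine.

```python
import time
from concurrent import futures


def fake_io(seconds):
    # time.sleep releases the GIL while waiting, much like blocking I/O.
    time.sleep(seconds)
    return seconds


def timed_threads(task_count, seconds):
    # Run task_count blocking waits on task_count threads and time the batch.
    start = time.perf_counter()
    with futures.ThreadPoolExecutor(max_workers=task_count) as executor:
        list(executor.map(fake_io, [seconds] * task_count))
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = timed_threads(5, 0.2)
    # Sequential execution would need about 1.0 s; the threads overlap their waits.
    print(f"5 waits of 0.2 s took {elapsed:.2f} s")
```

Replacing fake_io with a pure-Python computation would be expected to remove most of this speed-up, since the threads would then compete for the GIL.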

Multiprocessing

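The ProcessPoolExecutor class distributes work across multiple Python processes. So for CPU-bound processing, using this class sidesteps the GIL and makes use of all available CPU cores.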

Both ProcessPoolExecutor and ThreadPoolExecutor implement the generic Executor interface, so concurrent.futures makes it particularly easy to switch a thread-based solution to a process-based one.

from concurrent import futures

def download_one(url):
    pass
    
def download_many(url_list):
    ...
    
    # url_list is the list of tasks
    
    with futures.ProcessPoolExecutor() as executor:
        result = executor.map(download_one, url_list)
    
    ...

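The number of processes is an optional argument to ProcessPoolExecutor and is usually left out: the default is the number of CPUs returned by os.cpu_count() (from Python 3.13 onward, os.process_cpu_count()).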
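Since both executors share the same interface, one way to illustrate the switch is to pass the executor class in as a parameter. This is only a sketch: sum_of_squares and run_all are hypothetical names, and the task is a toy CPU-bound computation rather than a download.

```python
import os
from concurrent import futures


def sum_of_squares(n):
    # Toy CPU-bound task. It is defined at module level so that
    # ProcessPoolExecutor can pickle it and send it to worker processes.
    return sum(i * i for i in range(n))


def run_all(executor_cls, inputs, max_workers=None):
    # Works with either executor class because both implement Executor.
    # max_workers=None lets each class pick its own default.
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(sum_of_squares, inputs))


if __name__ == "__main__":
    inputs = [10, 100, 1000]
    print("CPUs reported:", os.cpu_count())
    print(run_all(futures.ThreadPoolExecutor, inputs))
    print(run_all(futures.ProcessPoolExecutor, inputs))
```

The if __name__ == "__main__" guard matters for the process-based call: on platforms that start workers by re-importing the main module, unguarded code would be run again in every worker.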
