看了不少书和资料,自认为对于 python 中的线程、进程、协程等略知一二了。
想实现一个多线程池的模型,但是也不想用 queue 甚至是 celery 这些,查了很多资料,国内的原创的不多,并且基本都是停留在 python 2.7 的时代,而且国内的文章即便用 google 搜索,大部分文章也是互相转载。国外的资料比较好的还是在 stackoverflow,国内的简书上好文章不少。
搜了半天,在 python 的官方文档上赫然有着一个例子:
The concurrent.futures module provides a high-level interface for asynchronously executing callables.
The asynchronous execution can be performed with threads, using ThreadPoolExecutor, or separate processes, using ProcessPoolExecutor. Both implement the same interface, which is defined by the abstract Executor class.
用 concurrent.futures 即可,然后
ThreadPoolExecutor is an Executor subclass that uses a pool of threads to execute calls asynchronously.
找了好久差点想自己实现(担心自己写的很烂)的线程池就在这里,例子如下:
import urllib.request
URLS = ['http://www.foxnews.com/',
'http://www.cnn.com/',
'http://europe.wsj.com/',
'http://www.bbc.co.uk/',
'http://some-made-up-domain.com/']
# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
with urllib.request.urlopen(url, timeout=timeout) as conn:
return conn.read()
# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
# Start the load operations and mark each future with its URL
future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
for future in concurrent.futures.as_completed(future_to_url):
url = future_to_url[future]
try:
data = future.result()
except Exception as exc:
print('%r generated an exception: %s' % (url, exc))
else:
print('%r page is %d bytes' % (url, len(data)))
这个例子写得很清楚,可以直接运行,不过建议把那些网址换掉,因为 g*w 的关系。
没想到 python 的官方文档做的这么好,我准备从头到底先通读一遍。