如何在Python中使用线程？

我正在努力理解Python中的线程。我看过文档和示例，但坦率地说，许多示例过于复杂，我很难理解它们。

如何清楚地显示为多线程划分的任务？

当前回答

只需注意：线程不需要队列。

这是我可以想象的最简单的例子，它显示了10个并发运行的进程。

import threading
from random import randint
from time import sleep


def print_number(number):

    # Sleeps a random 1 to 10 seconds
    rand_int_var = randint(1, 10)
    sleep(rand_int_var)
    print "Thread " + str(number) + " slept for " + str(rand_int_var) + " seconds"

thread_list = []

for i in range(1, 10):

    # Instantiates the thread
    # (i) does not make a sequence, so (i,)
    t = threading.Thread(target=print_number, args=(i,))
    # Sticks the thread in a list so that it remains accessible
    thread_list.append(t)

# Starts threads
for thread in thread_list:
    thread.start()

# This blocks the calling thread until the thread whose join() method is called is terminated.
# From http://docs.python.org/2/library/threading.html#thread-objects
for thread in thread_list:
    thread.join()

# Demonstrates that the main process waited for threads to complete
print "Done"

2013-09-23 16:07:36

其他回答

作为第二个anwser的python3版本：

import queue as Queue
import threading
import urllib.request

# Called by each thread
def get_url(q, url):
    q.put(urllib.request.urlopen(url).read())

theurls = ["http://google.com", "http://yahoo.com", "http://www.python.org","https://wiki.python.org/moin/"]

q = Queue.Queue()
def thread_func():
    for u in theurls:
        t = threading.Thread(target=get_url, args = (q,u))
        t.daemon = True
        t.start()

    s = q.get()
    
def non_thread_func():
    for u in theurls:
        get_url(q,u)
        

    s = q.get()

您可以测试它：

start = time.time()
thread_func()
end = time.time()
print(end - start)

start = time.time()
non_thread_func()
end = time.time()
print(end - start)

non_thread_func（）花费的时间应该是thread_func（）的4倍

2021-06-28 11:44:33

注意：对于Python中的实际并行化，您应该使用多处理模块来分叉并行执行的多个进程（由于全局解释器锁，Python线程提供了交织，但实际上它们是串行执行的，而不是并行执行的，并且仅在交织I/O操作时有用）。

然而，如果您只是在寻找交错（或者正在执行可以并行化的I/O操作，尽管存在全局解释器锁），那么线程模块就是开始的地方。作为一个非常简单的例子，让我们考虑通过并行对子范围求和来对大范围求和的问题：

import threading

class SummingThread(threading.Thread):
     def __init__(self,low,high):
         super(SummingThread, self).__init__()
         self.low=low
         self.high=high
         self.total=0

     def run(self):
         for i in range(self.low,self.high):
             self.total+=i


thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join()  # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print result

请注意，以上是一个非常愚蠢的示例，因为它绝对没有I/O，并且由于全局解释器锁，虽然在CPython中交错执行（增加了上下文切换的开销），但仍将串行执行。

2010-05-17 04:35:11

Python 3具有启动并行任务的功能。这使我们的工作更容易。

它有线程池和进程池。

以下内容提供了一个见解：

ThreadPoolExecutor示例（源代码）

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

ProcessPoolExecutor（源）

import concurrent.futures
import math

PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

def is_prime(n):
    if n % 2 == 0:
        return False

    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
            print('%d is prime: %s' % (number, prime))

if __name__ == '__main__':
    main()

2017-07-20 11:17:23

这里是使用线程导入CSV的一个非常简单的示例。（图书馆的收录可能因不同的目的而有所不同。）

助手函数：

from threading import Thread
from project import app
import csv


def import_handler(csv_file_name):
    thr = Thread(target=dump_async_csv_data, args=[csv_file_name])
    thr.start()

def dump_async_csv_data(csv_file_name):
    with app.app_context():
        with open(csv_file_name) as File:
            reader = csv.DictReader(File)
            for row in reader:
                # DB operation/query

驾驶员功能：

import_handler(csv_file_name)

2018-05-03 08:53:30

自2010年提出这个问题以来，如何使用带有映射和池的Python进行简单的多线程处理已经得到了真正的简化。

下面的代码来自一篇文章/博客文章，您应该明确查看（没有从属关系）-一行中的并行性：一个更好的日常线程任务模型。我将在下面总结一下——它最终只是几行代码：

from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
results = pool.map(my_function, my_array)

以下是多线程版本：

results = []
for item in my_array:
    results.append(my_function(item))

描述

Map是一个很酷的小函数，是将并行性轻松注入Python代码的关键。对于那些不熟悉的人来说，map是从Lisp这样的函数语言中提取出来的。它是一个将另一个函数映射到序列上的函数。Map为我们处理序列上的迭代，应用函数，并在最后将所有结果存储在一个方便的列表中。

实施

map函数的并行版本由两个库提供：multiprocessing，以及它鲜为人知但同样神奇的stepchild：multiprocessing.dummy。

multiprocessing.dummy与多处理模块完全相同，但使用线程（一个重要的区别-对CPU密集型任务使用多个进程；对I/O（和在I/O期间）使用线程）：

multiprocessing.dummy复制了多处理的API，但它不过是线程模块的包装器。

import urllib2
from multiprocessing.dummy import Pool as ThreadPool

urls = [
  'http://www.python.org',
  'http://www.python.org/about/',
  'http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html',
  'http://www.python.org/doc/',
  'http://www.python.org/download/',
  'http://www.python.org/getit/',
  'http://www.python.org/community/',
  'https://wiki.python.org/moin/',
]

# Make the Pool of workers
pool = ThreadPool(4)

# Open the URLs in their own threads
# and return the results
results = pool.map(urllib2.urlopen, urls)

# Close the pool and wait for the work to finish
pool.close()
pool.join()

计时结果：

Single thread:   14.4 seconds
       4 Pool:   3.1 seconds
       8 Pool:   1.4 seconds
      13 Pool:   1.3 seconds

传递多个参数（仅在Python 3.3及更高版本中如此）：

要传递多个数组，请执行以下操作：

results = pool.starmap(function, zip(list_a, list_b))

或者传递常量和数组：

results = pool.starmap(function, zip(itertools.repeat(constant), list_a))

如果您使用的是早期版本的Python，可以通过此解决方法传递多个参数）。

（感谢user136036提供的有用评论。）

2015-02-11 19:53:42

如何在Python中使用线程？

推荐文章

最新文章

标签