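How do I use threading in Python?

I am trying to understand threading in Python. I've looked at the documentation and examples, but quite frankly, many examples are overly sophisticated and I'm having trouble understanding them.

How do you clearly show tasks being divided for multi-threading?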

Current answer

Given a function f, run it in a thread like this:

import threading
threading.Thread(target=f).start()

To pass arguments to f:

threading.Thread(target=f, args=(a,b,c)).start()
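
As a minimal sketch of the two calls above, the following starts a thread with both positional and keyword arguments and then waits for it with join(). The function f, its parameters, and the results list are hypothetical and exist only so the effect of the thread can be observed.

```python
import threading

results = []  # hypothetical list, used only to observe the thread's work

def f(a, b, c=0):
    # Runs in the new thread and records the sum of its arguments
    results.append(a + b + c)

t = threading.Thread(target=f, args=(1, 2), kwargs={"c": 3})
t.start()   # begin running f(1, 2, c=3) in a new thread
t.join()    # block until the thread has finished
print(results)
```

Keeping a reference to the Thread object (instead of calling .start() directly on the constructor) is what makes the later join() possible.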

2017-03-16 16:07:46

Other answers

The code below runs 10 threads that concurrently print the numbers from 0 to 99:

from threading import Thread

def test():
    for i in range(0, 100):
        print(i)

thread_list = []

for _ in range(0, 10):
    thread = Thread(target=test)
    thread_list.append(thread)

for thread in thread_list:
    thread.start()

for thread in thread_list:
    thread.join()

The code below is a shorthand for the loop version above; it also runs 10 threads that concurrently print the numbers from 0 to 99:

from threading import Thread

def test():
    [print(i) for i in range(0, 100)]

thread_list = [Thread(target=test) for _ in range(0, 10)]

[thread.start() for thread in thread_list]

[thread.join() for thread in thread_list]

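Both versions print each number from 0 to 99 ten times; how the threads' output interleaves can vary from run to run.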

2022-11-05 00:07:22

This is very easy to understand. Here are two simple ways to do threading.

import time
from concurrent.futures import ThreadPoolExecutor, as_completed
import threading

def a(a=1, b=2):
    print(a)
    time.sleep(5)
    print(b)
    return a+b

def b(**kwargs):
    if "a" in kwargs:
        print("am b")
    else:
        print("nothing")
        
to_do=[]
executor = ThreadPoolExecutor(max_workers=4)
ex1=executor.submit(a)
to_do.append(ex1)
ex2=executor.submit(b, **{"a":1})
to_do.append(ex2)

for future in as_completed(to_do):
    print("Future {} and Future Return is {}\n".format(future, future.result()))

print("threading")

to_do=[]
to_do.append(threading.Thread(target=a))
to_do.append(threading.Thread(target=b, kwargs={"a":1}))

for threads in to_do:
    threads.start()
    
for threads in to_do:
    threads.join()
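
The pool in the example above is never shut down explicitly. As a hedged variation, ThreadPoolExecutor can also be used as a context manager, and its map() method collects results in input order. The square function below is a hypothetical stand-in for real work.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    # Hypothetical stand-in for a task worth running in a thread
    return n * n

# Leaving the with-block shuts the pool down and waits for pending work.
with ThreadPoolExecutor(max_workers=4) as executor:
    # map() yields results in the order of the inputs, not completion order
    squares = list(executor.map(square, range(5)))

print(squares)
```

This form suits the case where every task calls the same function; submit() with as_completed(), as used above, fits better when tasks differ or when results should be handled as soon as each one finishes.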

2021-08-28 13:09:15

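Note: For actual parallelization in Python, you should use the multiprocessing module to fork multiple processes that execute in parallel (due to the global interpreter lock, Python threads provide interleaving, but they are in fact executed serially, not in parallel, and are only useful when interleaving I/O operations).

However, if you are merely looking for interleaving (or are doing I/O operations that can be parallelized despite the global interpreter lock), then the threading module is the place to start. As a really simple example, let's consider the problem of summing a large range by summing subranges in parallel: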

import threading

class SummingThread(threading.Thread):
     def __init__(self,low,high):
         super(SummingThread, self).__init__()
         self.low=low
         self.high=high
         self.total=0

     def run(self):
         for i in range(self.low,self.high):
             self.total+=i


thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join()  # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print(result)

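Note that the above is a very contrived example, as it does absolutely no I/O and, in CPython, will be executed serially albeit interleaved (with the added overhead of context switching) due to the global interpreter lock.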
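
For comparison, here is a rough sketch of the same summation spread over processes, which is the approach the note at the top of this answer recommends for CPU-bound work. It uses ProcessPoolExecutor from concurrent.futures rather than the multiprocessing module directly; the helper names sum_range, split_range, and parallel_sum are made up for this illustration.

```python
from concurrent.futures import ProcessPoolExecutor

def sum_range(bounds):
    # Sum the integers in the half-open range [low, high)
    low, high = bounds
    return sum(range(low, high))

def split_range(low, high, parts):
    # Split [low, high) into `parts` contiguous sub-ranges
    step = (high - low) // parts
    edges = [low + i * step for i in range(parts)] + [high]
    return list(zip(edges[:-1], edges[1:]))

def parallel_sum(low, high, workers=2):
    # Each sub-range is summed in a separate worker process
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_range, split_range(low, high, workers)))

if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing this module
    print(parallel_sum(0, 1000000))
```

Whether this is actually faster than a plain loop depends on the size of the range, since starting processes and passing data between them has its own cost.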

2010-05-17 04:35:11

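I would like to contribute a simple example, along with the explanations I found useful when I had to tackle this problem myself.

In this answer you will find some information about Python's GIL (global interpreter lock) and a simple day-to-day example written using multiprocessing.dummy, plus some simple benchmarks.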

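Global Interpreter Lock (GIL)

Python doesn't allow multi-threading in the truest sense of the word. It has a multi-threading package, but if you want to multi-thread to speed your code up, then it's usually not a good idea to use it.

Python has a construct called the global interpreter lock (GIL). The GIL makes sure that only one of your "threads" can execute at any one time. A thread acquires the GIL, does a little work, then passes the GIL on to the next thread.

This happens very quickly, so to the human eye it may seem like your threads are executing in parallel, but they are really just taking turns using the same CPU core.

All this GIL passing adds overhead to execution. This means that if you want to make your code run faster, then using the threading package often isn't a good idea.

There are reasons to use Python's threading package. If you want to run some things simultaneously and efficiency is not a concern, then it's totally fine and convenient. Or if you are running code that needs to wait for something (such as some I/O), then it could make a lot of sense. But the threading library won't let you use extra CPU cores.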
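
To illustrate the "waiting for something" case, here is a small sketch in which time.sleep stands in for a blocking I/O call. The function name fake_io and the delay values are assumptions made for the example; the point is only that the waits overlap, so the total time is close to the longest single wait rather than the sum of all of them.

```python
import threading
import time

def fake_io(delay, results, index):
    # time.sleep stands in for a blocking I/O call; the GIL is released while waiting
    time.sleep(delay)
    results[index] = delay

delays = [0.2, 0.2, 0.2, 0.2]
results = [None] * len(delays)

start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=(d, results, i))
           for i, d in enumerate(delays)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print("waited on {} tasks in {:.2f}s".format(len(delays), elapsed))
```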

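Multi-threading can be outsourced to the operating system (by doing multi-processing), to some external application that calls your Python code (for example, Spark or Hadoop), or to some code that your Python code calls (for example, you could have your Python code call a C function that does the expensive multi-threaded work).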

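Why This Matters

Because lots of people spend a lot of time trying to find bottlenecks in their multi-threaded Python code before they learn what the GIL is.

Once this information is clear, here's my code: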

#!/bin/python
from multiprocessing.dummy import Pool
from subprocess import PIPE,Popen
import time
import os

# In the variable pool_size we define the "parallelness".
# Note that multiprocessing.dummy.Pool is backed by threads, not processes.
# For CPU-bound tasks, it doesn't make sense to create more Pool workers
# than you have cores to run them on.
#
# On the other hand, if you are using I/O-bound tasks, it may make sense
# to create quite a few more Pool workers than cores, since the workers
# will probably spend most of their time blocked (waiting for I/O to complete).
pool_size = 8

def do_ping(ip):
    if os.name == 'nt':
        print ("Using Windows Ping to " + ip)
        proc = Popen(['ping', ip], stdout=PIPE)
        return proc.communicate()[0]
    else:
        print ("Using Linux / Unix Ping to " + ip)
        proc = Popen(['ping', ip, '-c', '4'], stdout=PIPE)
        return proc.communicate()[0]


os.system('cls' if os.name=='nt' else 'clear')
print ("Running using threads\n")
start_time = time.time()
pool = Pool(pool_size)
website_names = ["www.google.com","www.facebook.com","www.pinterest.com","www.microsoft.com"]
result = {}
for website_name in website_names:
    result[website_name] = pool.apply_async(do_ping, args=(website_name,))
pool.close()
pool.join()
print ("\n--- Execution took {} seconds ---".format((time.time() - start_time)))

# Now we do the same without threading, just to compare time
print ("\nRunning NOT using threads\n")
start_time = time.time()
for website_name in website_names:
    do_ping(website_name)
print ("\n--- Execution took {} seconds ---".format((time.time() - start_time)))

# Here's one way to print the final output from the threads
output = {}
for key, value in result.items():
    output[key] = value.get()
print ("\nOutput aggregated in a Dictionary:")
print (output)
print ("\n")

print ("\nPretty printed output: ")
for key, value in output.items():
    print (key + "\n")
    print (value)

2019-08-07 06:59:20

Here's a simple example: you need to try a few alternative URLs and return the contents of the first one to respond.

import queue
import threading
import urllib.request

# Called by each thread
def get_url(q, url):
    q.put(urllib.request.urlopen(url).read())

theurls = ["http://google.com", "http://yahoo.com"]

q = queue.Queue()

for u in theurls:
    t = threading.Thread(target=get_url, args=(q, u))
    t.daemon = True
    t.start()

s = q.get()
print(s)

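This is a case where threading is used as a simple optimization: each subthread waits for a URL to resolve and respond, then puts its contents on the queue; each thread is a daemon thread (it won't keep the process up if the main thread ends, which is more common than not); the main thread starts all subthreads, does a get on the queue to wait until one of them has done a put, then emits the result and terminates (which takes down any subthreads that might still be running, since they're daemon threads).

Proper use of threads in Python is invariably connected to I/O operations (since CPython doesn't use multiple cores to run CPU-bound tasks anyway, the only reason for threading is not blocking the process while there's a wait for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work's results, by the way, and they're intrinsically thread-safe, so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.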
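
As a rough sketch of using queues both to farm out work and to collect results, the following runs a few worker threads that pull numbers from one queue and push doubled values onto another. The worker function, the doubling "work", and the None sentinel convention are illustrative choices, not something prescribed by the queue module.

```python
import queue
import threading

def worker(tasks, results):
    while True:
        item = tasks.get()
        if item is None:        # sentinel: no more work for this worker
            break
        results.put(item * 2)   # hypothetical "work": double the number

tasks = queue.Queue()
results = queue.Queue()
num_workers = 3

threads = [threading.Thread(target=worker, args=(tasks, results))
           for _ in range(num_workers)]
for t in threads:
    t.start()

for n in range(10):
    tasks.put(n)
for _ in threads:
    tasks.put(None)             # one sentinel per worker
for t in threads:
    t.join()

# Workers finish in no particular order, so sort before displaying
doubled = sorted(results.get() for _ in range(10))
print(doubled)
```

No explicit lock appears anywhere in the sketch; the two Queue objects handle all coordination between the main thread and the workers.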

2010-05-17 04:36:05
