在Python多处理库中,是否有支持多个参数的pool.map变体?
import multiprocessing
text = "test"
def harvester(text, case):
X = case[0]
text + str(X)
if __name__ == '__main__':
pool = multiprocessing.Pool(processes=6)
case = RAW_DATASET
pool.map(harvester(text, case), case, 1)
pool.close()
pool.join()
您可以使用以下两个函数,以避免为每个新函数编写包装器:
import itertools
from multiprocessing import Pool
def universal_worker(input_pair):
function, args = input_pair
return function(*args)
def pool_args(function, *args):
return zip(itertools.repeat(function), zip(*args))
将函数函数与参数arg_0、arg_1和arg_2的列表一起使用,如下所示:
pool = Pool(n_core)
list_model = pool.map(universal_worker, pool_args(function, arg_0, arg_1, arg_2)
pool.close()
pool.join()
将Python 3.3+与pool.starmap()一起使用:
from multiprocessing.dummy import Pool as ThreadPool
def write(i, x):
print(i, "---", x)
a = ["1","2","3"]
b = ["4","5","6"]
pool = ThreadPool(2)
pool.starmap(write, zip(a,b))
pool.close()
pool.join()
结果:
1 --- 4
2 --- 5
3 --- 6
如果您喜欢,还可以zip()更多参数:zip(a,b,c,d,e)
如果希望将常量值作为参数传递:
import itertools
zip(itertools.repeat(constant), a)
如果您的函数应该返回以下内容:
results = pool.starmap(write, zip(a,b))
这将提供一个包含返回值的列表。
import time
from multiprocessing import Pool
def f1(args):
vfirst, vsecond, vthird = args[0] , args[1] , args[2]
print(f'First Param: {vfirst}, Second value: {vsecond} and finally third value is: {vthird}')
pass
if __name__ == '__main__':
p = Pool()
result = p.map(f1, [['Dog','Cat','Mouse']])
p.close()
p.join()
print(result)
另一个简单的选择是将函数参数包装在元组中,然后包装应该在元组中传递的参数。在处理大量数据时,这可能并不理想。我相信它会为每个元组创建副本。
from multiprocessing import Pool
def f((a,b,c,d)):
print a,b,c,d
return a + b + c +d
if __name__ == '__main__':
p = Pool(10)
data = [(i+0,i+1,i+2,i+3) for i in xrange(10)]
print(p.map(f, data))
p.close()
p.join()
以某种随机顺序给出输出:
0 1 2 3
1 2 3 4
2 3 4 5
3 4 5 6
4 5 6 7
5 6 7 8
7 8 9 10
6 7 8 9
8 9 10 11
9 10 11 12
[6, 10, 14, 18, 22, 26, 30, 34, 38, 42]