不能pickle <type 'instancemethod'>当使用多处理Pool.map()

我试图使用multiprocessing的Pool.map()函数来同时划分工作。当我使用以下代码时，它工作得很好:

import multiprocessing

def f(x):
    return x*x

def go():
    pool = multiprocessing.Pool(processes=4)        
    print pool.map(f, range(10))


if __name__== '__main__' :
    go()

然而，当我在更面向对象的方法中使用它时，它就不起作用了。它给出的错误信息是:

PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup
__builtin__.instancemethod failed

这发生时，以下是我的主程序:

import someClass

if __name__== '__main__' :
    sc = someClass.someClass()
    sc.go()

下面是我的someClass类:

import multiprocessing

class someClass(object):
    def __init__(self):
        pass

    def f(self, x):
        return x*x

    def go(self):
        pool = multiprocessing.Pool(processes=4)       
        print pool.map(self.f, range(10))

有人知道问题是什么吗，或者有什么简单的解决方法吗?

当前回答

我遇到了同样的问题，但发现有一个JSON编码器可以用来在进程之间移动这些对象。

from pyVmomi.VmomiSupport import VmomiJSONEncoder

用这个来创建你的列表:

jsonSerialized = json.dumps(pfVmomiObj, cls=VmomiJSONEncoder)

然后在mapped函数中，使用this来恢复对象:

pfVmomiObj = json.loads(jsonSerialized)

2019-07-28 19:52:41

其他回答

为什么不使用单独的函数?

def func(*args, **kwargs):
    return inst.method(args, kwargs)

print pool.map(func, arr)

2018-05-17 14:33:11

感伤。多重处理对我很有用。

与多处理不同，它有一个池方法，可以序列化所有东西

import pathos.multiprocessing as mp
pool = mp.Pool(processes=2)

2021-11-29 11:28:43

但Steven Bethard的解决方案存在一些局限性:

When you register your class method as a function, the destructor of your class is surprisingly called every time your method processing is finished. So if you have 1 instance of your class that calls n times its method, members may disappear between 2 runs and you may get a message malloc: *** error for object 0x...: pointer being freed was not allocated (e.g. open member file) or pure virtual method called, terminate called without an active exception (which means than the lifetime of a member object I used was shorter than what I thought). I got this when dealing with n greater than the pool size. Here is a short example :

from multiprocessing import Pool, cpu_count
from multiprocessing.pool import ApplyResult

# --------- see Stenven's solution above -------------
from copy_reg import pickle
from types import MethodType

def _pickle_method(method):
    func_name = method.im_func.__name__
    obj = method.im_self
    cls = method.im_class
    return _unpickle_method, (func_name, obj, cls)

def _unpickle_method(func_name, obj, cls):
    for cls in cls.mro():
        try:
            func = cls.__dict__[func_name]
        except KeyError:
            pass
        else:
            break
    return func.__get__(obj, cls)


class Myclass(object):

    def __init__(self, nobj, workers=cpu_count()):

        print "Constructor ..."
        # multi-processing
        pool = Pool(processes=workers)
        async_results = [ pool.apply_async(self.process_obj, (i,)) for i in range(nobj) ]
        pool.close()
        # waiting for all results
        map(ApplyResult.wait, async_results)
        lst_results=[r.get() for r in async_results]
        print lst_results

    def __del__(self):
        print "... Destructor"

    def process_obj(self, index):
        print "object %d" % index
        return "results"

pickle(MethodType, _pickle_method, _unpickle_method)
Myclass(nobj=8, workers=3)
# problem !!! the destructor is called nobj times (instead of once)

输出:

Constructor ...
object 0
object 1
object 2
... Destructor
object 3
... Destructor
object 4
... Destructor
object 5
... Destructor
object 6
... Destructor
object 7
... Destructor
... Destructor
... Destructor
['results', 'results', 'results', 'results', 'results', 'results', 'results', 'results']
... Destructor

__call__方法不是那么等价的，因为[None，…]]的结果:

from multiprocessing import Pool, cpu_count
from multiprocessing.pool import ApplyResult

class Myclass(object):

    def __init__(self, nobj, workers=cpu_count()):

        print "Constructor ..."
        # multiprocessing
        pool = Pool(processes=workers)
        async_results = [ pool.apply_async(self, (i,)) for i in range(nobj) ]
        pool.close()
        # waiting for all results
        map(ApplyResult.wait, async_results)
        lst_results=[r.get() for r in async_results]
        print lst_results

    def __call__(self, i):
        self.process_obj(i)

    def __del__(self):
        print "... Destructor"

    def process_obj(self, i):
        print "obj %d" % i
        return "result"

Myclass(nobj=8, workers=3)
# problem !!! the destructor is called nobj times (instead of once), 
# **and** results are empty !

所以这两种方法都不令人满意……

2011-09-05 14:50:52

我遇到了同样的问题，但发现有一个JSON编码器可以用来在进程之间移动这些对象。

from pyVmomi.VmomiSupport import VmomiJSONEncoder

用这个来创建你的列表:

jsonSerialized = json.dumps(pfVmomiObj, cls=VmomiJSONEncoder)

然后在mapped函数中，使用this来恢复对象:

pfVmomiObj = json.loads(jsonSerialized)

2019-07-28 19:52:41

在这个简单的例子中，someClass。F没有从类中继承任何数据，也没有向类中附加任何数据，一个可能的解决方案是将F分离出来，这样它就可以被pickle:

import multiprocessing


def f(x):
    return x*x


class someClass(object):
    def __init__(self):
        pass

    def go(self):
        pool = multiprocessing.Pool(processes=4)       
        print pool.map(f, range(10))

2018-03-23 18:14:11

不能pickle <type 'instancemethod'>当使用多处理Pool.map()

推荐文章

最新文章

标签