librelist archives

Parallel for loop with non-pickle-able instancemethod

From:
Gustavo Goretkin
Date:
2012-09-09 @ 16:57
I have a pool of objects and I want to run the same method on all of them
in parallel, like this:

Parallel(n_jobs=4)(delayed(obj.method)(100) for obj in obj_pool)

but I cannot Pickle obj.

I tried a wrapper like

def apply(f, arg):
    return f(arg)

Parallel(n_jobs=4)(delayed(apply)(obj.method,100) for obj in obj_pool)

But then the arguments need to be Pickled too.

Any suggestions? I will consider instantiating a new object, but then I
would not be able to access it.
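
To illustrate why the wrapper above does not help: pickling a bound method means pickling the instance it is bound to. A minimal sketch, assuming Python 3 and a hypothetical Worker class that holds a threading.Lock as a stand-in for whatever makes obj unpicklable (neither name is from the original post):

```python
import pickle
import threading


class Worker:
    """Hypothetical stand-in for an object that cannot be pickled."""

    def __init__(self):
        self.lock = threading.Lock()  # locks cannot be pickled

    def method(self, n):
        return n * 2


def can_pickle(value):
    """Return True if pickle.dumps(value) succeeds, False otherwise."""
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


obj = Worker()
print(can_pickle(obj))         # the instance itself
print(can_pickle(obj.method))  # the bound method drags the instance along
print(can_pickle(100))         # the plain integer argument
```

Passing obj.method to a wrapper only moves the unpicklable instance from the function slot to the argument slot, so the same failure is expected either way.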

Re: [joblib] Parallel for loop with non-pickle-able instancemethod

From:
Olivier Grisel
Date:
2012-09-09 @ 17:15
I am currently working on a pull request to extend the Parallel API to
be able to use custom pickling methods. The main motivation was
support for numpy.memmap objects without memory copies, but it could be
leveraged for any other type.

See: https://github.com/joblib/joblib/pull/44

The documentation is still lacking, but assuming you want to register
a custom pickler / unpickler for a class called CustomType, you could
do the following.

def customtype_builder(attribute_1, attribute_2):
    # Rebuild the object from its picklable attributes.
    obj = CustomType()
    obj.attribute_1 = attribute_1
    obj.attribute_2 = attribute_2
    return obj

def reduce_customtype(obj):
    return (customtype_builder, (obj.attribute_1, obj.attribute_2))

Parallel(12, forward_reducers=(CustomType, reduce_customtype))(...)

You can also register custom backward_reducers for the communication
between the child subprocess and the parent process.
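
The builder/reducer pair described above follows the same (callable, args) convention as the standard pickle protocol, so it can be tried out without joblib. The sketch below does not use the forward_reducers API from the pull request; it registers the reducer globally with the standard library's copyreg module instead. The CustomType definition, with a lock as its unpicklable part, is a hypothetical example.

```python
import copyreg
import pickle
import threading


class CustomType:
    """Hypothetical class: two plain attributes plus an unpicklable lock."""

    def __init__(self):
        self.attribute_1 = None
        self.attribute_2 = None
        self.lock = threading.Lock()


def customtype_builder(attribute_1, attribute_2):
    # Rebuild a fresh instance on the receiving side.
    obj = CustomType()
    obj.attribute_1 = attribute_1
    obj.attribute_2 = attribute_2
    return obj


def reduce_customtype(obj):
    # Send only the picklable attributes; the builder recreates the lock.
    return (customtype_builder, (obj.attribute_1, obj.attribute_2))


# Register the reducer globally for the standard pickle module.
copyreg.pickle(CustomType, reduce_customtype)

original = CustomType()
original.attribute_1 = "spam"
original.attribute_2 = 42

clone = pickle.loads(pickle.dumps(original))
print(clone.attribute_1, clone.attribute_2)
```

Note that the receiving side gets a rebuilt copy, not the original object, so changes made to the copy in a worker process are not visible in the parent unless they are sent back.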


-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

Re: [joblib] Parallel for loop with non-pickle-able instancemethod

From:
Olivier Grisel
Date:
2012-09-09 @ 17:16
BTW, as this Pull Request has not yet been reviewed, this API is
subject to change before getting merged in the joblib master repo.


Re: [joblib] Parallel for loop with non-pickle-able instancemethod

From:
Olivier Grisel
Date:
2012-09-09 @ 17:32
2012/9/9 Olivier Grisel <olivier.grisel@ensta.org>:
> Parallel(12, forward_reducers=(CustomType, reduce_customtype))(...)

There is a typo here: the pair is not wrapped in a sequence. It should be:

reducers = [(CustomType, reduce_customtype)]
Parallel(12, forward_reducers=reducers)(...)

reducers should be a sequence of tuples (pairs) of (type, reduce_func).
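
A sequence of (type, reduce_func) pairs like this can be applied to a single pickler through the standard library's Pickler.dispatch_table attribute. The sketch below is only an illustration of that idea, not the implementation in the pull request; Point, build_point, reduce_point and dumps_with_reducers are hypothetical names.

```python
import copyreg
import io
import pickle


class Point:
    """Hypothetical type used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def build_point(x, y):
    return Point(x, y)


def reduce_point(p):
    return (build_point, (p.x, p.y))


def dumps_with_reducers(obj, reducers):
    """Pickle obj with extra (type, reduce_func) pairs for this call only."""
    buffer = io.BytesIO()
    pickler = pickle.Pickler(buffer)
    # Start from the global table so existing registrations keep working.
    pickler.dispatch_table = copyreg.dispatch_table.copy()
    for type_, reduce_func in reducers:
        pickler.dispatch_table[type_] = reduce_func
    pickler.dump(obj)
    return buffer.getvalue()


reducers = [(Point, reduce_point)]
payload = dumps_with_reducers([Point(1, 2), Point(3, 4)], reducers)
points = pickle.loads(payload)
print([(p.x, p.y) for p in points])
```

In this sketch, passing the bare pair (Point, reduce_point) instead of a list of pairs fails with a TypeError, because iterating over it yields the class itself rather than a (type, reduce_func) pair.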


Re: [joblib] Parallel for loop with non-pickle-able instancemethod

From:
Gael Varoquaux
Date:
2012-09-09 @ 17:14
It's not possible with Python parallelism. You can only run parallel
computing with objects that can be pickled. Sorry.
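
The pickling requirement comes from sending work to other processes. As a possible alternative that was not suggested in this thread, threads share the parent's memory, so nothing is pickled. A minimal sketch using the standard library's ThreadPoolExecutor rather than joblib, with a hypothetical Worker class; note that for CPU-bound pure-Python code the global interpreter lock may remove most of the speedup.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Worker:
    """Hypothetical unpicklable object (it holds a lock)."""

    def __init__(self, scale):
        self.scale = scale
        self.lock = threading.Lock()
        self.calls = 0

    def method(self, n):
        with self.lock:
            self.calls += 1
        return self.scale * n


obj_pool = [Worker(scale) for scale in range(1, 5)]

# Threads share the parent's memory, so nothing is pickled and the
# original objects remain accessible (and mutated) afterwards.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(lambda obj: obj.method(100), obj_pool))

print(results)
print([obj.calls for obj in obj_pool])
```

Because the workers operate on the original objects, any state they change (here the calls counter) can still be read from obj_pool afterwards.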