librelist archives

Numba numpy optimization

From:
federico vaggi
Date:
2012-03-29 @ 18:21
Hi everyone,

Thanks for all the work in setting up Numba. It looks really exciting, and
getting JIT compilation plus vectorization 'for free' is awesome.
I am porting a bunch of numerical code that I wrote in MATLAB over to
Python, and have been trying out the tools available for fast
numerical computing.

The first thing I ported was a pairwise distance function that calculates
the haversine distance between all pairs of points in two arrays (Nx2 and
Mx2).


https://bitbucket.org/FedericoV/numpy-tip-complex-modeling/src/d83acacabda1/src/simulations

The haversine distance is not available in the scipy.spatial package, so I
implemented it in several ways:

- Naively, as a list of lists
- Using vectorized NumPy syntax
- Using Cython
- Compiling the list-of-lists version with Shed Skin
- Using numexpr
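
For readers without access to the repository, the naive list-of-lists
variant can be sketched roughly as follows. This is a minimal illustration
and not the code from the linked repository: the function names, the
(latitude, longitude) column order in degrees, and the Earth radius constant
are all assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not taken from the thread


def haversine(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine term: hav(dphi) + cos(phi1) * cos(phi2) * hav(dlam)
    h = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2)
    # Clamp to 1.0 so rounding error cannot push asin out of its domain.
    return 2.0 * radius * math.asin(min(1.0, math.sqrt(h)))


def pairwise_haversine(points_a, points_b):
    """Return an N x M list of lists of distances.

    points_a holds N (lat, lon) pairs and points_b holds M (lat, lon) pairs.
    """
    return [[haversine(lat_a, lon_a, lat_b, lon_b)
             for (lat_b, lon_b) in points_b]
            for (lat_a, lon_a) in points_a]


distances = pairwise_haversine([(0.0, 0.0), (0.0, 90.0)],
                               [(0.0, 0.0), (0.0, 180.0), (90.0, 0.0)])
```

The double loop is what makes this version slow under CPython and, at the
same time, easy for a tracing JIT or a static compiler to optimize.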

So far, this is what I get using CPython:

List Implementation: 2.00559
C-List Implementation: 0.28070
Numpy Implementation: 0.29622
Cython Implementation: 0.19165
Numexpr Implementation: 0.09320

Using PyPy:
List Implementation: 0.27367
Numpy Implementation: 4.48617


It's interesting that the list-of-lists implementation run under PyPy beats
both the NumPy version and the compiled version (using Shed Skin) under
CPython.

I also tried using Numba on the NumPy version, but got a few errors about
shape not being available.  I moved the reshaping of the arrays outside the
function, but I still get errors:

Traceback (most recent call last):
  File "Benchmark.py", line 66, in <module>
    t = Translate(arc_distance_numba)
  File "/home/federicov/epd-python/lib/python2.7/site-packages/numba/translate.py", line 244, in __init__
    self.setup_func()
  File "/home/federicov/epd-python/lib/python2.7/site-packages/numba/translate.py", line 263, in setup_func
    self.lfunc.args[i].name = name
IndexError: list index out of range
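
To give an idea of the kind of function involved, a vectorized NumPy version
with the reshaping done by broadcasting might look like the sketch below. It
is not the arc_distance_numba function from the traceback; the name
pairwise_haversine_np, the (latitude, longitude) column order in degrees, and
the default radius are assumptions made for illustration.

```python
import numpy as np


def pairwise_haversine_np(a, b, radius=6371.0):
    """Haversine distances between every row of a (N x 2) and b (M x 2).

    Columns are assumed to be (latitude, longitude) in degrees; the
    default radius (km) is an assumed mean Earth radius.
    """
    a = np.radians(np.asarray(a, dtype=float))
    b = np.radians(np.asarray(b, dtype=float))
    # Column vectors (N, 1) against row vectors (1, M) broadcast to (N, M).
    lat1, lon1 = a[:, 0][:, None], a[:, 1][:, None]
    lat2, lon2 = b[:, 0][None, :], b[:, 1][None, :]
    h = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    # Clip guards arcsin against values marginally above 1 from rounding.
    return 2.0 * radius * np.arcsin(np.sqrt(np.clip(h, 0.0, 1.0)))


d = pairwise_haversine_np([[0.0, 0.0], [0.0, 90.0]],
                          [[0.0, 0.0], [0.0, 180.0], [90.0, 0.0]])
```

Each arithmetic step here allocates a full N x M temporary array, which is
the overhead that tools such as numexpr are designed to avoid.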

Any help/feedback is welcome.  All code is here:

https://bitbucket.org/FedericoV/numpy-tip-complex-modeling

Cheers,

Federico

Re: [numba] Numba numpy optimization

From:
Travis Oliphant
Date:
2012-03-30 @ 15:02
This is a good use case, but Numba is not yet ready to do this sort of
calculation.  However, it would be good to add it to the list of examples
of the kind of function that Numba should be able to speed up.  Right now
Numba is at a very early stage.  Give it a few months to mature to enable
this sort of thing.

Right now, Numba only supports simple element-by-element calculations
which can become "ufuncs" in NumPy.
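
As a rough illustration of what "element-by-element calculation that becomes
a ufunc" means, the sketch below wraps a scalar function with NumPy's own
np.frompyfunc. This is not Numba's API and gives no speedup, since the
function still runs in the interpreter; it only shows the shape of kernel
involved. The function hav is a hypothetical example.

```python
import math

import numpy as np


def hav(theta):
    """Scalar kernel: one float in, one float out, no array logic."""
    return math.sin(theta / 2.0) ** 2


# Wrap the scalar function as a ufunc with 1 input and 1 output.
hav_ufunc = np.frompyfunc(hav, 1, 1)

angles = np.array([0.0, math.pi / 2.0, math.pi])
# frompyfunc returns object arrays, so convert back to float.
result = hav_ufunc(angles).astype(float)

# Like any ufunc, it broadcasts: (3, 1) combined with (1, 3) gives (3, 3).
grid = hav_ufunc(angles[:, None] - angles[None, :]).astype(float)
```

A pairwise distance function does more than this: it combines several inputs
and builds an N x M result, which is why it falls outside the simple ufunc
case.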

Thanks for your interest and for your test-case.   The comparisons are 
very interesting. 

Best, 

-Travis




Re: [numba] Numba numpy optimization

From:
federico vaggi
Date:
2012-03-30 @ 16:14
Hi Travis,

Thanks for the heads up.  I presume taking the existing sin/cos functions
from NumPy and running them through Numba doesn't really result in any
speed increase?

Still, it's a very interesting project so far, and I will follow the
progress closely!

Federico


Re: [numba] Numba numpy optimization

From:
Travis Oliphant
Date:
2012-03-30 @ 17:30
On Mar 30, 2012, at 11:14 AM, federico vaggi wrote:

> Hi Travis,
> 
> Thanks for the heads up.  I presume taking the existing sin/cos
> functions from numpy and running them through Numba doesn't really result
> in any speed increase?

Not really.  Numba translates Python code to LLVM code, which then gets
compiled to machine code dynamically.  It uses the LLVM intrinsics for
sin and cos.

Thanks for your interest.

-Travis
