librelist archives

How to spell an array argument

From:
Travis Oliphant
Date:
2012-06-25 @ 20:07
Hello,

We need a way to spell an array type in the decorator.   Mark Wiebe 
suggested a good way to use the character code for the type in the array 
and then brackets with ':' and ellipsis, and integers to spell how many 
dimensions. 

Examples: 

1-d array of float32:   f[:]
1-d array of float64:   d[:]
1-d array of float64 with 10 elements:  d[10]

2-d array of float32:  f[:,:]
2-d array of float64 (with 10 elements in the last-dimension):   d[:,10]

3-D array of double:  d[:, :, :]

arbitrary number of dimensions for float64:   d[...]

I like this much better. 

Thanks, 

-Travis
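
To make the spellings above concrete, here is a minimal sketch of how a decorator might parse them. The name parse_array_spec and the (dtype, ndim, shape) return layout are assumptions made for illustration, not Numba's API, and only the 'f' and 'd' codes from the examples are handled.

```python
import re

# Character codes taken from the examples in the message; nothing else is assumed.
DTYPE_CODES = {"f": "float32", "d": "float64"}


def parse_array_spec(spec):
    """Parse a spelling such as 'd[:,10]' into (dtype, ndim, shape).

    ndim and shape are None for an ellipsis (any number of dimensions).
    Otherwise shape holds an int for a fixed extent and None for ':'.
    """
    match = re.fullmatch(r"\s*(\w)\s*\[(.*)\]\s*", spec)
    if match is None:
        raise ValueError("not an array spelling: %r" % spec)
    code, body = match.groups()
    if code not in DTYPE_CODES:
        raise ValueError("unknown type code: %r" % code)
    dims = [dim.strip() for dim in body.split(",")]
    if dims == ["..."]:
        return DTYPE_CODES[code], None, None
    shape = []
    for dim in dims:
        if dim == ":":
            shape.append(None)        # any extent along this axis
        elif dim.isdigit():
            shape.append(int(dim))    # fixed extent along this axis
        else:
            raise ValueError("bad dimension %r in %r" % (dim, spec))
    return DTYPE_CODES[code], len(shape), tuple(shape)
```

For example, parse_array_spec("d[:,10]") yields the dtype name "float64", two dimensions, and a shape whose last extent is fixed at 10.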

Re: [numba] How to spell an array argument

From:
mark florisson
Date:
2012-06-25 @ 20:59
I don't think types should be specified explicitly at all. It would be
much more convenient for users to have Numba specialize functions when
they are called based on their input types (and reuse cached versions
automatically). If it shouldn't be a generic/templated function for
whatever reason, then what about having a numba.dtypes module, where
you can immediately slice the types, e.g. numba.dtypes.double[:, :]
specifies a 2D (strided) double array. numba.dtypes.double[::1, :]
specifies a Fortran contiguous array and double[:, ::1] a C contiguous
array (as James suggests, stride information is probably most
important here).
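
A rough sketch of what sliceable type objects could look like. ScalarType and ArrayType are hypothetical names, not the proposed numba.dtypes module and not Cython's implementation; the sketch only shows how ':' and '::1' can be read off the slice objects passed to __getitem__, and how such a type turns into a string only when printed.

```python
class ArrayType:
    """Hypothetical array type: element type name plus a per-axis layout."""

    def __init__(self, dtype, axes):
        self.dtype = dtype
        self.axes = tuple(axes)  # each entry is "strided" or "contiguous"

    @property
    def ndim(self):
        return len(self.axes)

    @property
    def layout(self):
        marked = [i for i, axis in enumerate(self.axes) if axis == "contiguous"]
        if marked == [self.ndim - 1]:
            return "C"           # '::1' on the last axis (also the 1-D case)
        if marked == [0]:
            return "F"           # '::1' on the first axis
        return "strided"

    def __repr__(self):
        dims = ", ".join("::1" if axis == "contiguous" else ":"
                         for axis in self.axes)
        return "%s[%s]" % (self.dtype, dims)


class ScalarType:
    """Hypothetical scalar type that can be sliced into an array type."""

    def __init__(self, name):
        self.name = name

    def __getitem__(self, index):
        slices = index if isinstance(index, tuple) else (index,)
        axes = []
        for s in slices:
            if (not isinstance(s, slice) or s.start is not None
                    or s.stop is not None or s.step not in (None, 1)):
                raise TypeError("expected ':' or '::1', got %r" % (s,))
            axes.append("strided" if s.step is None else "contiguous")
        marked = [i for i, axis in enumerate(axes) if axis == "contiguous"]
        if len(marked) > 1 or (marked and marked[0] not in (0, len(axes) - 1)):
            raise ValueError("'::1' may mark only the first or the last axis")
        return ArrayType(self.name, axes)

    def __repr__(self):
        return self.name


double = ScalarType("double")
```

With these definitions, double[::1, :] reports layout "F", double[:, ::1] reports "C", and double[:, :] reports "strided". The ellipsis form is left out of this sketch.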

Re: [numba] How to spell an array argument

From:
Dag Sverre Seljebotn
Date:
2012-06-25 @ 21:20

I am puzzled too, I thought the whole idea of numba (rather than just 
using Cython) was to generate code dynamically, when the exact 
types/arguments are available. (otherwise just improving the cython.inline
mode and feeding the cython code through clang for compilation would give 
quicker results per developer hour)

At any rate, if numba wants static typing, please make it something both 
numba and Cython (and Jython, IronPython,...) can support. It's just a 
namespace.

(numba could just define it and Cython could use it...as long as there 
isn't 'numba' in the namespace name there doesn't need to be any red tape 
and design by committee)

Eg:

from types import float64, scalar
arrtype1 = float64[:,:] # specific
arrtype2 = scalar[...] # generic


Dag


Re: [numba] How to spell an array argument

From:
mark florisson
Date:
2012-06-25 @ 21:48
On 25 June 2012 22:20, Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
> At any rate, if numba wants static typing, please make it something both
> numba and Cython (and Jython, IronPython,...) can support. It's just a
> namespace.
>
> (numba could just define it and Cython could use it...as long as there
> isn't 'numba' in the namespace name there doesn't need to be any red tape
> and design by committee)
>
> Eg:
>
> from types import float64, scalar
> arrtype1 = float64[:,:] # specific
> arrtype2 = scalar[...] # generic
>

Ah, I was thinking to have the same for Cython but just change the
imports. It would indeed be more elegant to share it. In any case,
types is already in the stdlib.


Re: [numba] How to spell an array argument

From:
Travis Oliphant
Date:
2012-06-25 @ 21:48
The type specification is not the end-goal.   But, it is a useful 
intermediate plateau and for that plateau we should use something better 
than the '[[[d]]]' syntax being used right now in Numba. 

Ideally, there is a "call-site" that specializes as necessary at run-time,
yes --- so you have no need to actually indicate the types in practice.   

If Mark and Jon can get us there by the end of summer, I will be very 
happy and very impressed.

-Travis
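
A call site that specializes at run time might be sketched like this. The specialize decorator and its cache attribute are made up for illustration, and "compilation" is a stand-in that simply reuses the Python function; the point is only that each tuple of argument types is handled once and then reused.

```python
import functools


def specialize(func):
    """Keep one 'compiled' version of func per tuple of argument types."""
    cache = {}

    @functools.wraps(func)
    def call_site(*args):
        signature = tuple(type(arg) for arg in args)
        compiled = cache.get(signature)
        if compiled is None:
            # Stand-in for code generation: a real JIT would emit machine
            # code for this signature here instead of reusing func.
            compiled = cache[signature] = func
        return compiled(*args)

    call_site.cache = cache  # exposed so the bookkeeping can be inspected
    return call_site


@specialize
def add(a, b):
    return a + b
```

Calling add(1, 2) and then add(3, 4) creates a single cache entry for (int, int); a later add(1.0, 2.0) adds a second entry for (float, float).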



Re: [numba] How to spell an array argument

From:
James Bergstra
Date:
2012-06-25 @ 22:03
On Mon, Jun 25, 2012 at 5:48 PM, Travis Oliphant <travis@continuum.io> wrote:
> The type specification is not the end-goal.   But, it is a useful
> intermediate plateau and for that plateau we should use something better
> than the '[[[d]]]' syntax being used right now in Numba.
>
> Ideally, there is a "call-site" that specializes as necessary at
> run-time, yes --- so you have no need to actually indicate the types in
> practice.
>
> If Mark and Jon can get us there by the end of summer, I will be very
> happy and very impressed.
>

Yeah it would be really amazing. Sorry if my comment was off topic, I
should have tried to get more context. Keep up the good work :)

Re: [numba] How to spell an array argument

From:
James Bergstra
Date:
2012-06-25 @ 21:34
I also anticipated using a wrapper around this syntax that would
translate calls JIT-style as Mark suggests... but it actually leads to
another API  problem.

If the JIT runtime has *all* the information about an array, then a
programmer will want to guide the code generator, by telling it how
much of that information to use. For example, should  the code
generator specialize to:

* exactly the dtype and physical memory addresses of all the arguments, or
* just the shape
* just the strides
* just the strides and shapes of some of the dimensions
* just the dtype

The code generator may not even support all the possibilities for
shape, stride, and dtype, and the user should certainly not be forced
to be too specific. But at the same time, if the user doesn't
intervene then performance might be hurt by either too many cache
misses, poorly optimized code, and/or a complex and mysterious cache
mechanism.

For different (application, code-generator) pairs, different levels of
specialization are appropriate, and the performance penalty associated
with choosing badly can be from nil to huge.

Also, somewhat problematically, the programmer's hints should be
associated with a *call* location, rather than with the function
definition itself, as would be suggested by the use of a decorator.
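
One way to picture the choice described above is a cache key built from only the parts of the array description that the programmer asks for. ArrayInfo and specialization_key are hypothetical names used for illustration; per-dimension choices (strides and shapes of only some dimensions) are not covered by this sketch.

```python
from collections import namedtuple

# Hypothetical description of an argument as a JIT runtime might see it.
ArrayInfo = namedtuple("ArrayInfo", "dtype shape strides")


def specialization_key(info, fields):
    """Build a cache key from only the requested parts of `info`.

    `fields` may contain "dtype", "ndim", "shape" and "strides". Anything
    left out is not part of the key, so generated code may not rely on it.
    """
    parts = []
    for field in fields:
        if field == "ndim":
            parts.append(("ndim", len(info.shape)))
        elif field in ("dtype", "shape", "strides"):
            parts.append((field, getattr(info, field)))
        else:
            raise ValueError("unknown field: %r" % field)
    return tuple(parts)
```

Two float64 arrays of shapes (3, 10) and (5, 10) share a key when only "dtype" and "ndim" are requested, and get different keys as soon as "shape" is included. The coarser key means fewer compiled versions but less information for the code generator.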




Re: [numba] How to spell an array argument

From:
mark florisson
Date:
2012-06-25 @ 21:44
I don't think this is a problem: you never want to specialize for
shape or specific strides. What you do want to specialize for
is the general ordering for strides, contiguity and differing stride
patterns, as well as the dtype and ndims. In minivect I specialize
for:

    - contiguous (C or Fortran have the same specialization)
    - inner dimension contiguous (C or Fortran, i.e. last dimension or
      first dimension)
    - Tiled C and Tiled Fortran in case of differing contiguity or
      striding patterns
    - C or Fortran ordered strided arrays

Whether those specializations will be useful for Numba will depend on
the code in question. But if the JIT is going to see many
combinations, it can keep the number of versions in check with
bookkeeping: collapse several specializations into one generic
implementation when needed, and generate faster specializations for
the most frequently called versions.
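
As an illustration of specializing on layout rather than on exact strides, the sketch below sorts an array description into three coarse categories. It is not minivect's code: classify_layout is a hypothetical helper, strides are assumed to be in bytes as in NumPy, and the tiled cases are left out.

```python
def classify_layout(shape, strides, itemsize):
    """Return 'contiguous', 'inner-contiguous' or 'strided'."""

    def packed(order):
        # Walk the axes from fastest- to slowest-varying and check that
        # each stride equals the size of everything packed before it.
        expected = itemsize
        for axis in order:
            if shape[axis] != 1 and strides[axis] != expected:
                return False
            expected *= shape[axis]
        return True

    ndim = len(shape)
    c_order = range(ndim - 1, -1, -1)  # last axis varies fastest
    f_order = range(ndim)              # first axis varies fastest
    if packed(c_order) or packed(f_order):
        return "contiguous"            # C and Fortran share one category
    if ndim and (strides[-1] == itemsize or strides[0] == itemsize):
        return "inner-contiguous"
    return "strided"
```

A C-ordered 3x4 array of 8-byte elements has strides (32, 8) and is classified as contiguous. Taking its first two columns gives shape (3, 2) with the same strides, which is only inner-contiguous, and taking every other column gives strides (32, 16), which is strided.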


Re: [numba] How to spell an array argument

From:
James Bergstra
Date:
2012-06-25 @ 22:06
On Mon, Jun 25, 2012 at 5:44 PM, mark florisson
<markflorisson88@gmail.com> wrote:
> I don't think this is a problem, you never want to specialize for
> shape or specific strides. What you do want to specialize for
> is the general ordering for strides, contiguity and differing stride
> patterns, as well as the dtype and ndims.

Well, except for unrolling - for that you need shape (!)

Re: [numba] How to spell an array argument

From:
Dag Sverre Seljebotn
Date:
2012-06-25 @ 22:24

James Bergstra <james.bergstra@gmail.com> wrote:

>On Mon, Jun 25, 2012 at 5:44 PM, mark florisson
><markflorisson88@gmail.com> wrote:
>> I don't think this is a problem, you never want to specialize for
>> shape or specific strides. What you do want to specialize for
>> is the general ordering for strides, contiguity and differing stride
>> patterns, as well as the dtype and ndims.
>
>Well, except for unrolling - for that you need shape (!)

For tiny axis lengths.

I don't see this as a real problem...yes, there is a combinatorially 
exploding space of possible instantiations, but very little (no?) code 
explores any significant portion of it. I feel that even a completely dumb
instantiation and caching mechanism (no merging of seldom used 
instantiations) would make numba a very useful tool.

Anyway, I guess I didn't have the context either. Even for the repr() used
during debugging and development a type syntax is good.

Dag


Re: [numba] How to spell an array argument

From:
mark florisson
Date:
2012-06-25 @ 22:15
On 25 June 2012 23:06, James Bergstra <james.bergstra@gmail.com> wrote:
> On Mon, Jun 25, 2012 at 5:44 PM, mark florisson
> <markflorisson88@gmail.com> wrote:
>> I don't think this is a problem, you never want to specialize for
>> shape or specific strides. What you do want to specialize for
>> is the general ordering for strides, contiguity and differing stride
>> patterns, as well as the dtype and ndims.
>
> Well, except for unrolling - for that you need shape (!)

Hm, valid point, it would aid in perfect unrolling, but you could
always unroll and adjust the loop step and add a fixup loop at the end
(which is slightly less efficient).

Does Theano do this kind of thing? Has it been found to be beneficial?
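
The unroll-and-fixup approach mentioned in this message can be sketched as follows, in plain Python for readability; a code generator would emit the equivalent low-level loops. The unroll factor of 4 and the name unrolled_sum are arbitrary choices for the example.

```python
def unrolled_sum(values):
    """Sum `values` with a main loop unrolled by 4 plus a fixup loop."""
    n = len(values)
    main_end = n - n % 4  # largest multiple of 4 that does not exceed n
    total = 0.0
    i = 0
    # Main loop: the step is adjusted to 4 and the body handles 4 elements.
    while i < main_end:
        total += values[i] + values[i + 1] + values[i + 2] + values[i + 3]
        i += 4
    # Fixup loop: the 0 to 3 trailing elements the main loop did not cover.
    while i < n:
        total += values[i]
        i += 1
    return total
```

Because the length is only known at run time, the fixup loop is always emitted; with a known shape the generator could unroll exactly and drop it.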

Re: [numba] How to spell an array argument

From:
mark florisson
Date:
2012-06-25 @ 21:00
On 25 June 2012 21:59, mark florisson <markflorisson88@gmail.com> wrote:
> I don't think types should be specified explicitly at all. It would be
> much more convenient for users to have Numba specialize functions when
> they are called based on their input types (and reuse cached versions
> automatically). If it shouldn't be a generic/templated function for
> whatever reason, then what about having a numba.dtypes module, where
> you can immediately slice the types, e.g. numba.dtypes.double[:, :]
> specifies a 2D (strided) double array. numba.dtypes.double[::1, :]
> specifies a Fortran contiguous array and double[:, ::1] a C contiguous
> array (as James suggests, stride information is probably most
> important here).

(This is also how Cython memoryviews work, which is very similar to
how Fortran arrays are specified).

Re: [numba] How to spell an array argument

From:
Jon Riehl
Date:
2012-06-26 @ 01:08
This is a better syntax than the ad hoc one I made up (didn't want to
write a parser until I had some buy in on one).  I'll add support for
handling these declarations shortly.

Please note there may still be a lag between when the decorator
supports this syntax and when the code generator uses the information
presented.

Regarding call time translation: I would imagine that keeping some
decoration is useful for run-time type checks, and maybe to guide any
loop unrolling.  Perhaps there should be additional markup (say "_")
that indicates to the translator that the given type or value should
be determined at call-specialization-time. Some quick examples:

* 1-d array of float32, where code should be generated for each
observed value of shape[0]: f[_]
* 1-d array of call-site specialized datatype, indexed by observed
datatype: _[:]
* 1 or higher dimensional array of float64 with shape[-1] == 10,
indexed by the observed shape[:-1] tuple: d[_..., 10]

It might also be useful to have a C/FORTRAN contiguous flag.  Straw
men: "f[:,:]_c"/"d[...]_f"/etc.

Thanks,
-Jon
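
One possible reading of the "_" markup is that it names the parts of the observed argument that go into the specialization cache key, while everything else is either fixed or ignored. The sketch below follows that reading; call_site_key is a hypothetical helper, and the dtype is represented by its character code.

```python
def call_site_key(spec, dtype_code, shape):
    """Return the cache key a spelling like 'd[_..., 10]' implies.

    '_' as the type code puts the observed dtype code in the key, '_' as a
    dimension puts the observed extent in the key, and a leading '_...'
    puts the observed leading shape tuple in the key. ':' accepts any
    extent without keying on it, and an integer must match exactly.
    """
    code, _, rest = spec.partition("[")
    code = code.strip()
    dims = [dim.strip() for dim in rest.rstrip().rstrip("]").split(",")]
    key = []
    if code == "_":
        key.append(dtype_code)
    elif code != dtype_code:
        raise TypeError("expected dtype %r, got %r" % (code, dtype_code))
    if dims[0] == "_...":
        fixed = dims[1:]
        if len(shape) < len(fixed):
            raise TypeError("too few dimensions for %r" % spec)
        leading = tuple(shape[:len(shape) - len(fixed)])
        key.append(leading)
        observed = shape[len(leading):]
    else:
        fixed = dims
        if len(shape) != len(fixed):
            raise TypeError("expected %d dimensions" % len(fixed))
        observed = shape
    for dim, extent in zip(fixed, observed):
        if dim == "_":
            key.append(extent)
        elif dim != ":" and int(dim) != extent:
            raise TypeError("expected extent %s, got %d" % (dim, extent))
    return tuple(key)
```

Under this reading, "f[_]" with an observed shape of (7,) is keyed on 7, "_[:]" is keyed on the observed dtype code, and "d[_..., 10]" with shape (3, 4, 10) is keyed on the tuple (3, 4).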


Re: [numba] How to spell an array argument

From:
Travis Oliphant
Date:
2012-06-26 @ 01:51
On Jun 25, 2012, at 8:08 PM, Jon Riehl wrote:

> This is a better syntax than the ad hoc one I made up (didn't want to
> write a parser until I had some buy in on one).  I'll add support for
> handling these declarations shortly.
> 
> Please note there may still be a lag between when the decorator
> supports this syntax and when the code generator uses the information
> presented.
> 
> Regarding call time translation: I would imagine that keeping some
> decoration is useful for run-time type checks, and maybe to guide any
> loop unrolling.  Perhaps there should be additional markup (say "_")
> that indicates to the translator that the given type or value should
> be determined at call-specialization-time. Some quick examples:
> 
> * 1-d array of float32, where code should be generated for each
> observed value of shape[0]: f[_]
> * 1-d array of call-site specialized datatype, indexed by observed
> datatype: _[:]
> * 1 or higher dimensional array of float64 with shape[-1] == 10,
> indexed by the observed shape[:-1] tuple: d[_..., 10]
> 
> It might also be useful to have a C/FORTRAN contiguous flag.  Straw
> men: "f[:,:]_c"/"d[...]_f"/etc.

I like your thoughts here.   The "_" syntax makes some sense and is kind 
of nice.   Other opinions? 

-Travis

Re: [numba] How to spell an array argument

From:
James Bergstra
Date:
2012-06-26 @ 04:14
On Mon, Jun 25, 2012 at 9:51 PM, Travis Oliphant <travis@continuum.io> wrote:
>
> On Jun 25, 2012, at 8:08 PM, Jon Riehl wrote:
>
>> This is a better syntax than the ad hoc one I made up (didn't want to
>> write a parser until I had some buy in on one).  I'll add support for
>> handling these declarations shortly.
>>
>> Please note there may still be a lag between when the decorator
>> supports this syntax and when the code generator uses the information
>> presented.

For sure! I believe there's a lot to be said for planning for the
future, and then only doing it when really necessary... or quite
possibly never.

>> Regarding call time translation: I would imagine that keeping some
>> decoration is useful for run-time type checks, and maybe to guide any
>> loop unrolling.  Perhaps there should be additional markup (say "_")
>> that indicates to the translator that the given type or value should
>> be determined at call-specialization-time. Some quick examples:
>>
>> * 1-d array of float32, where code should be generated for each
>> observed value of shape[0]: f[_]
>> * 1-d array of call-site specialized datatype, indexed by observed
>> datatype: _[:]

I'm with you.

>> * 1 or higher dimensional array of float64 with shape[-1] == 10,
>> indexed by the observed shape[:-1] tuple: d[_..., 10]

This one took me a bit of thinking, but then I got it and it made
perfect sense.... in theory anyway. I'm reading this as "somehow,
you've managed to write a code generator that specializes to *all*
observed shapes and only works for inputs with shape[-1] == 10."

>>
>> It might also be useful to have a C/FORTRAN contiguous flag.  Straw
>> men: "f[:,:]_c"/"d[...]_f"/etc.
>
> I like your thoughts here.   The "_" syntax makes some sense and is kind
> of nice.   Other opinions?

I'm not sure I get this. Above, syntax like d[10] means that the code
generator works for double vectors of len 10. It's providing static
type information -- expresses what logical input shapes and dtypes are
acceptable to an implementation. It seems to me that stride
restrictions (e.g. C/FORTRAN) should also fall into this category,
it's natural to write an implementation that only works for e.g.
contiguous data. In this view, it makes sense to write something like
"d[10]c" to mean you require 10 contiguous float64s, and "d[10]_"
would mean that any stride is OK, because the specific stride
descriptor 'c' was replaced by the fill-in-at-runtime blank character
'_'

You might imagine a more expressive language describing the striding
of memory than a crude 3-way enum (N/C/F) for (Non-contiguous,
C-contiguous, F-contiguous) but the 3-way enum would be a useful
start. Later, maybe the string might look like 'd[:,:,:]s???', in
which the question-marks get replaced by some description of the
striding structure.

- James

Re: [numba] How to spell an array argument

From:
Dag Sverre Seljebotn
Date:
2012-06-26 @ 07:46
On 06/26/2012 03:08 AM, Jon Riehl wrote:
> It might also be useful to have a C/FORTRAN contiguous flag.  Straw
> men: "f[:,:]_c"/"d[...]_f"/etc.


So Cython has a "DSL" for this already. It's complicated a bit by 
supporting the indirect feature of PEP 3118, which NumPy doesn't 
support, but "strided" vs. "contiguous", and "int[::1, :, :]" should 
relate to Numba as well.

http://docs.cython.org/src/userguide/memoryviews.html#specifying-data-layout

Dag



Re: [numba] How to spell an array argument

From:
mark florisson
Date:
2012-06-26 @ 09:31
On 26 June 2012 08:46, Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
> On 06/26/2012 03:08 AM, Jon Riehl wrote:
>> This is a better syntax than the ad hoc one I made up (didn't want to
>> write a parser until I had some buy in on one).  I'll add support for
>> handling these declarations shortly.
>>
>> Please note there may still be a lag between when the decorator
>> supports this syntax and when the code generator uses the information
>> presented.
>>
>> Regarding call time translation: I would imagine that keeping some
>> decoration is useful for run-time type checks, and maybe to guide any
>> loop unrolling.  Perhaps there should be additional markup (say "_")
>> that indicates to the translator that the given type or value should
>> be determined at call-specialization-time. Some quick examples:
>>
>> * 1-d array of float32, where code should be generated for each
>> observed value of shape[0]: f[_]
>> * 1-d array of call-site specialized datatype, indexed by observed
>> datatype: _[:]
>> * 1 or higher dimensional array of float64 with shape[-1] == 10,
>> indexed by the observed shape[:-1] tuple: d[_..., 10]
>>
>> It might also be useful to have a C/FORTRAN contiguous flag.  Straw
>> men: "f[:,:]_c"/"d[...]_f"/etc.
>
>
> So Cython has a "DSL" for this already. It's complicated a bit by
> supporting the indirect feature of PEP 3118, which NumPy doesn't
> support, but "strided" vs. "contiguous", and "int[::1, :, :]" should
> relate to Numba as well.
>
> http://docs.cython.org/src/userguide/memoryviews.html#specifying-data-layout
>
> Dag
>

As a final note, I don't think you want your type system to be strings
which you have to parse. They should be objects to begin with, and
should only become strings when printed.

That said, the above syntax supports all the slicing variants, but not
the placeholder (you'd need something like numba._, which may be
confused with gettext's _). But then, I have heard no compelling
argument as to why you would specialize on shape and not on strides:
you can unroll anyway, probably with near-negligible runtime overhead,
and keep full generality.
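
A minimal sketch of what "types as objects" could look like, assuming
hypothetical ScalarType and ArrayType classes (neither is Numba's actual
API). It only handles ':' and '::1' per axis; integers, '...' and the
"_" placeholder are deliberately left out.

```python
class ScalarType:
    """Hypothetical scalar type object; slicing it yields an array type."""

    def __init__(self, name):
        self.name = name

    def __getitem__(self, key):
        # double[:] passes a single slice; double[:, :] passes a tuple.
        axes = key if isinstance(key, tuple) else (key,)
        for axis in axes:
            plain = (isinstance(axis, slice)
                     and axis.start is None and axis.stop is None
                     and axis.step in (None, 1))
            if not plain:
                raise TypeError("this sketch only accepts ':' and '::1'")
        # '::1' marks an axis as contiguous, as in Cython's memoryviews.
        contiguous = tuple(axis.step == 1 for axis in axes)
        return ArrayType(self, contiguous)

    def __repr__(self):
        return self.name


class ArrayType:
    """Hypothetical array type: element type plus per-axis layout."""

    def __init__(self, dtype, contiguous):
        self.dtype = dtype
        self.contiguous = contiguous
        self.ndim = len(contiguous)

    def __repr__(self):
        # The type only becomes a string when it is printed.
        axes = ", ".join("::1" if c else ":" for c in self.contiguous)
        return f"{self.dtype!r}[{axes}]"


double = ScalarType("double")
print(double[:, :])      # 2-D strided array of double
print(double[::1, :])    # first axis contiguous
```

Because Python itself parses the subscript, no string parser is needed;
the decorator would simply receive ArrayType instances.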



Re: [numba] How to spell an array argument

From:
mark florisson
Date:
2012-06-26 @ 11:38
On 26 June 2012 10:31, mark florisson <markflorisson88@gmail.com> wrote:
> On 26 June 2012 08:46, Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
>> On 06/26/2012 03:08 AM, Jon Riehl wrote:
>>> This is a better syntax than the ad hoc one I made up (didn't want to
>>> write a parser until I had some buy in on one).  I'll add support for
>>> handling these declarations shortly.
>>>
>>> Please note there may still be a lag between when the decorator
>>> supports this syntax and when the code generator uses the information
>>> presented.
>>>
>>> Regarding call time translation: I would imagine that keeping some
>>> decoration is useful for run-time type checks, and maybe to guide any
>>> loop unrolling.  Perhaps there should be additional markup (say "_")
>>> that indicates to the translator that the given type or value should
>>> be determined at call-specialization-time. Some quick examples:
>>>
>>> * 1-d array of float32, where code should be generated for each
>>> observed value of shape[0]: f[_]
>>> * 1-d array of call-site specialized datatype, indexed by observed
>>> datatype: _[:]
>>> * 1 or higher dimensional array of float64 with shape[-1] == 10,
>>> indexed by the observed shape[:-1] tuple: d[_..., 10]
>>>
>>> It might also be useful to have a C/FORTRAN contiguous flag.  Straw
>>> men: "f[:,:]_c"/"d[...]_f"/etc.
>>
>>
>> So Cython has a "DSL" for this already. It's complicated a bit by
>> supporting the indirect feature of PEP 3118, which NumPy doesn't
>> support, but "strided" vs. "contiguous", and "int[::1, :, :]" should
>> relate to Numba as well.
>>
>> http://docs.cython.org/src/userguide/memoryviews.html#specifying-data-layout
>>
>> Dag
>>
>
> As a final note, I don't think you want your type system to be strings
> which you have to parse. They should be objects to begin with, and
> should only become strings when printed.
>
> That said, the above syntax supports all the slicing variants, but not
> the placeholder (you'd need numba._, but it may be confused with
> gettext). But then, I have heard no compelling argument as to why you
> would specialize on shape and not strides (since you can unroll
> anyway, with a probably near-negligible runtime overhead, but with
> full generality).
>

Can I go ahead and implement a proper type system? I see that type
parsing already occurs independently in several places (printing,
promotion, conversion to LLVM types); this really calls for an
object-oriented approach.
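
As a rough illustration of the point, here is a sketch in which printing,
promotion, and conversion to an LLVM type name all live on one
hypothetical NumericType class instead of being re-parsed from strings in
each place. The class, its methods, and the promotion rule are
assumptions for illustration, not Numba code; the rule is intentionally
simpler than NumPy's.

```python
class NumericType:
    """Hypothetical scalar type shared by printing, promotion, codegen."""

    def __init__(self, name, kind, bits):
        self.name = name
        self.kind = kind    # "int" or "float"
        self.bits = bits

    def __str__(self):
        # Printing: the only place the type turns into a string.
        return self.name

    def llvm_name(self):
        # Textual LLVM IR names: iN for integers, float/double otherwise.
        if self.kind == "int":
            return "i%d" % self.bits
        return {32: "float", 64: "double"}[self.bits]

    def promote(self, other):
        # Simplified rule (an assumption): floats beat ints, then the
        # wider type wins. NumPy's real promotion table is richer.
        def rank(t):
            return (t.kind == "float", t.bits)
        return max(self, other, key=rank)


int32 = NumericType("int32", "int", 32)
int64 = NumericType("int64", "int", 64)
float32 = NumericType("float32", "float", 32)
float64 = NumericType("float64", "float", 64)

print(int32.promote(float64), float64.llvm_name())
```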


Re: [numba] How to spell an array argument

From:
Travis Oliphant
Date:
2012-06-26 @ 13:23
On Jun 26, 2012, at 6:38 AM, mark florisson wrote:

> On 26 June 2012 10:31, mark florisson <markflorisson88@gmail.com> wrote:
>> On 26 June 2012 08:46, Dag Sverre Seljebotn <d.s.seljebotn@astro.uio.no> wrote:
>>> On 06/26/2012 03:08 AM, Jon Riehl wrote:
>>>> This is a better syntax than the ad hoc one I made up (didn't want to
>>>> write a parser until I had some buy in on one).  I'll add support for
>>>> handling these declarations shortly.
>>>> 
>>>> Please note there may still be a lag between when the decorator
>>>> supports this syntax and when the code generator uses the information
>>>> presented.
>>>> 
>>>> Regarding call time translation: I would imagine that keeping some
>>>> decoration is useful for run-time type checks, and maybe to guide any
>>>> loop unrolling.  Perhaps there should be additional markup (say "_")
>>>> that indicates to the translator that the given type or value should
>>>> be determined at call-specialization-time. Some quick examples:
>>>> 
>>>> * 1-d array of float32, where code should be generated for each
>>>> observed value of shape[0]: f[_]
>>>> * 1-d array of call-site specialized datatype, indexed by observed
>>>> datatype: _[:]
>>>> * 1 or higher dimensional array of float64 with shape[-1] == 10,
>>>> indexed by the observed shape[:-1] tuple: d[_..., 10]
>>>> 
>>>> It might also be useful to have a C/FORTRAN contiguous flag.  Straw
>>>> men: "f[:,:]_c"/"d[...]_f"/etc.
>>> 
>>> 
>>> So Cython has a "DSL" for this already. It's complicated a bit by
>>> supporting the indirect feature of PEP 3118, which NumPy doesn't
>>> support, but "strided" vs. "contiguous", and "int[::1, :, :]" should
>>> relate to Numba as well.
>>> 
>>> http://docs.cython.org/src/userguide/memoryviews.html#specifying-data-layout
>>> 
>>> Dag
>>> 
>> 
>> As a final note, I don't think you want your type system to be strings
>> which you have to parse. They should be objects to begin with, and
>> should only become strings when printed.
>> 
>> That said, the above syntax supports all the slicing variants, but not
>> the placeholder (you'd need numba._, but it may be confused with
>> gettext). But then, I have heard no compelling argument as to why you
>> would specialize on shape and not strides (since you can unroll
>> anyway, with a probably near-negligible runtime overhead, but with
>> full generality).
>> 
> 
> Can I go ahead and implement a proper type system? I see that type
> parsing already occurs independently in several places (printing,
> promotion, conversion to llvm types), this really begs for an object
> oriented approach.

This is definitely in scope and very likely necessary, especially to
implement support for unsigned, boolean, and complex numbers. Coordinate
with Jon, who has already been thinking about such things, I'm sure.

Thanks,

-Travis





Re: [numba] How to spell an array argument

From:
James Bergstra
Date:
2012-06-25 @ 20:17
On Mon, Jun 25, 2012 at 4:07 PM, Travis Oliphant <travis@continuum.io> wrote:
> Hello,
>
> We need a way to spell an array type in the decorator.   Mark Wiebe
> suggested a good way to use the character code for the type in the array
> and then brackets with ':' and ellipsis, and integers to spell how many
> dimensions.
>
> Examples:
>
> 1-d array of float32:   f[:]
> 1-d array of float64:   d[:]
> 1-d array of float64 with 10 elements:  d[10]
>
> 2-d array of float32:  f[:,:]
> 2-d array of float64 (with 10 elements in the last-dimension):   d[:,10]
>
> 3-D array of double:  d[:, :, :]
>
> arbitrary number of dimensions for float64:   f[...]
You meant d[...] here, right?

> I like this much better.
>
> Thanks,
>

Looks good to me.

Twists that come to mind:

- f[10, :, :, 3] stands for 10 RGB images of unknown size, which makes sense.

- Does it make sense to combine it all, like this: f[10, ..., :, :, 3]?

Is it important / desirable that this decorator be programmed with
strides / contiguity information as well as shape information? Strides
are at least as important for efficient iteration as shape.
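
To make the strides point concrete, here is a small pure-Python sketch
(the helper names c_strides and f_strides are made up for illustration).
It computes the byte strides of the same shape in C and Fortran order,
showing that a shape-only spelling cannot tell the two layouts apart.

```python
def c_strides(shape, itemsize):
    """Byte strides of a C-contiguous (row-major) array."""
    strides = []
    step = itemsize
    for extent in reversed(shape):
        strides.append(step)    # last axis varies fastest
        step *= extent
    return tuple(reversed(strides))


def f_strides(shape, itemsize):
    """Byte strides of a Fortran-contiguous (column-major) array."""
    strides = []
    step = itemsize
    for extent in shape:
        strides.append(step)    # first axis varies fastest
        step *= extent
    return tuple(strides)


# 10 RGB images of 480 x 640 float32 pixels, i.e. f[10, :, :, 3]
shape = (10, 480, 640, 3)
print(c_strides(shape, 4))
print(f_strides(shape, 4))
```

An inner loop over the last axis steps 4 bytes at a time in the C layout
but jumps across the whole image stack in the Fortran layout, which is
why the loop order a code generator picks depends on strides and not
only on shape.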