librelist archives


Trying to understand the bytecode vs. AST business

From:
James Bergstra
Date:
2012-07-02 @ 23:06
A couple of people have posted recently about the merit / importance /
necessity of working with an AST as opposed to bytecode. If anyone
has time to go into a little more detail on the rationale for me, I'd
appreciate it. I still think numba is great for putting bytecode VMing
on my radar, and I think there are some advantages to bytecode that
haven't been mentioned in the recent emails on ASTs.

It seems to me that there are at least two potentially-different
applications of an llvm decorator: let's call the two I have in mind
the low-level and high-level decorators.

The low-level decorator to my mind is essentially like cython.  You
write something like

@lowlevel
def foo(x):
   rval = np.zeros(len(x))
   for i, x_i in enumerate(x):
      rval[i] = (x_i ** 2).sum()
   return rval

This is a useful thing, as evidenced by several projects that do this
sort of thing in various ways.  Maybe not all language features are
supported here, and that makes the job of compiling it more
manageable. What distinguishes this lowlevel decorator from the
high-level one is that static analysis + JIT specialization makes
sense in this case, because most of the logic of the function is
actually *in* the function.

Contrast that code above with this:

@highlevel
def bar(x, f):
   rval = np.zeros(len(x))
   for i, x_i in enumerate(x):
      rval[i] = f(x_i)
   return rval

bar(rand(10, 5), lambda x_i: (x_i ** 2).sum())


In this case, I think the decorator should be inlining the lambda.
That would make it a powerful tool. One of the great strengths of
using Python instead of C is the possibility of refactoring code in
more natural ways. The bar function is a limited case of the
generalized ufunc, or a numpy.map. It seems to me that AST analysis is
of limited use here, because if I understand correctly you can't get
an AST for a lambda (or a function with a closure? Are there other
such cases?).  Working with bytecode is sometimes painful, but at
least it's always there.
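A quick stdlib illustration of the asymmetry I mean (nothing numba-specific here):

```python
import dis
import inspect

f = lambda x_i: (x_i ** 2).sum()

# The compiled bytecode is always attached to the function object:
ops = [ins.opname for ins in dis.get_instructions(f)]
assert any("RETURN" in op for op in ops)

# Source recovery, by contrast, is best-effort: inspect may return the
# whole enclosing statement (or fail outright, e.g. in a REPL):
try:
    print(inspect.getsource(f).strip())
except OSError:
    print("no source available")
```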

Using bytecode would make it possible in many cases to llvm-compile
3rd party code... like maybe even stuff like `bar(rand(100, 100),
lambda x_i: skimage.canny(x_i) ** 2)`.

If that kind of thing can be done via AST analysis without bytecode
then that's awesome, but I'm just saying it might be a shame to give
up this sort of feature.

- James

Re: [numba] Trying to understand the bytecode vs. AST business

From:
mark florisson
Date:
2012-07-03 @ 09:10
On 3 July 2012 00:06, James Bergstra <james.bergstra@gmail.com> wrote:

That is a valid point. The point I was making initially is subtly
different, though. It was that Numba should operate *internally* on an
AST, and not bytecode. You could always reconstruct an AST from the
given bytecode and operate on that. Going from bytecode immediately to
code generation, or having multiple bytecode passes with state between
them, is what you want to avoid.
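To make the reconstruction idea concrete, here is a toy sketch (stdlib only, and nothing like what a real pipeline would look like) that symbolically re-executes straight-line bytecode to rebuild the expression a function returns:

```python
import dis

def to_expr(func):
    """Toy decompiler: symbolically re-execute straight-line bytecode,
    rebuilding a string for the expression the function returns."""
    stack = []
    for ins in dis.get_instructions(func):
        op = ins.opname
        if op.startswith("LOAD_"):
            # 3.13+ superinstructions (e.g. LOAD_FAST_LOAD_FAST) carry a
            # tuple of names; plain loads carry a single name or constant.
            vals = ins.argval if isinstance(ins.argval, tuple) else (ins.argval,)
            stack.extend(str(v) for v in vals)
        elif op == "BINARY_OP":            # Python 3.11+
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(f"({lhs} {ins.argrepr} {rhs})")
        elif op.startswith("BINARY_"):     # Python <= 3.10
            sym = {"BINARY_ADD": "+", "BINARY_SUBTRACT": "-",
                   "BINARY_MULTIPLY": "*", "BINARY_POWER": "**"}[op]
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(f"({lhs} {sym} {rhs})")
        elif op == "RETURN_VALUE":
            return stack.pop()
        # RESUME etc. (3.11+) are irrelevant to the value stack here.
    return None

def f(a, b):
    return a * b + 1

print(to_expr(f))  # -> ((a * b) + 1)
```

A real reconstruction of course has to handle control flow, not just straight-line expressions, which is where a dedicated package earns its keep.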

On the other hand, in these cases (lambdas and such) you could mandate
that people use actual def functions whose source numba can retrieve
(via the inspect module). Then, when a numba-annotated function gets
called, it resolves its dependencies; if you want those inlined, or at
least natively called, they must also be numba functions.
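The stdlib already gives you that much; a minimal sketch:

```python
import ast
import inspect

def f(x_i):
    return x_i ** 2

# For a def'd function the source -- and hence a genuine AST -- is
# recoverable, unlike the general lambda/closure case:
src = inspect.getsource(f)
tree = ast.parse(src)
print(type(tree.body[0]).__name__)  # -> FunctionDef
```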

Or you could even rely on SEP 201, with the functions you call
compiled up front (at runtime), or create a follow-up SEP and compile
new ones on the fly when you need them
(https://github.com/numfocus/sep/blob/master/sep201.rst).

The question is, how many cases do you want to handle? If you're going
to use the result of some unknown function, will you start
just-in-time specializations for individual code blocks instead of
entire functions?

Re: [numba] Trying to understand the bytecode vs. AST business

From:
Travis Oliphant
Date:
2012-07-03 @ 15:20
In this context, it is useful to know about meta which is Sean Ross-Ross's
package for converting from bytecode to AST.

-Travis



Re: [numba] Trying to understand the bytecode vs. AST business

From:
Ondřej Čertík
Date:
2012-07-03 @ 18:56
On Tue, Jul 3, 2012 at 8:20 AM, Travis Oliphant <travis@continuum.io> wrote:
> In this context, it is useful to know about meta which is Sean Ross-Ross's
> package for converting from bytecode to AST.

Ha -- I didn't know that was possible!

Here is the package: https://github.com/srossross/Meta
Note that it needs Python 2.7 (there were some syntax errors in 2.6).

I tested it on simple things and indeed it works like a charm.
So this gives a lot more possibilities (e.g. to go from bytecode to
AST and then from there).
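(The "from there" half is plain stdlib, for what it's worth: an AST, however obtained, compiles back down to executable code.)

```python
import ast

# An AST -- whether parsed from source, decompiled from bytecode, or
# produced by a transformation pass -- can be compiled and executed:
tree = ast.parse("def g(x):\n    return x * x")
code = compile(tree, "<generated>", "exec")
ns = {}
exec(code, ns)
print(ns["g"](7))  # -> 49
```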

Ondrej

Re: [numba] Trying to understand the bytecode vs. AST business

From:
Dag Sverre Seljebotn
Date:
2012-07-03 @ 18:57
On 07/03/2012 08:56 PM, Ondřej Čertík wrote:

I actually took this for granted (as did Mark Florisson I guess).

Dag

Re: [numba] Trying to understand the bytecode vs. AST business

From:
James Bergstra
Date:
2012-07-03 @ 15:32
On Tue, Jul 3, 2012 at 5:10 AM, mark florisson
<markflorisson88@gmail.com> wrote:
> That is a valid point. The point I was making initially is subtly
> different, though. It was that Numba should operate *internally* on an
> AST, and not bytecode. You could always reconstruct an AST from the
> given bytecode and operate on that. Going from bytecode immediately to
> code generation, or having multiple bytecode passes with state between
> them, is what you want to avoid.

This makes perfect sense. So the numba internal-use AST/IR in question
isn't necessarily the original Python AST?

Update: Travis's comment about "meta" is interesting. I had thought
that information was lost in rendering the AST as bytecode, but it
seems I was mistaken (so much the better!). His one-line comment
admittedly makes the whole reply below off topic, but I still wanted
to share it, so I'm sending it anyway.

> On the other hand, you could mandate in these cases (of lambdas and
> such), that people use actual def functions of which numba can get the
> source (that is, the inspect module), and when some numba-annotated
> function gets called, it resolves the dependencies, which if you want
> them inlined or at least natively called, they must also be numba
> functions.
>
> Or you could even rely on SEP 201 with functions you call compiled up
> front (at runtime), or create a followup and compile new ones on the
> fly when you need them
> (https://github.com/numfocus/sep/blob/master/sep201.rst).
>
> The question is, how many cases do you want to handle? If you're going
> to use the result of some unknown function, will you start
> just-in-time specializations for individual code blocks instead of
> entire functions?

I think this is one of the key design decision points that I was
curious about. The requirement that all functions within the call be
numba functions (while tempting!) gives up a lot of functionality. For
another example beyond my foo() and bar(), consider just about any
object-oriented design in which some of the math functionality lives
in a member method of an argument to bar. Doing the dynamic
introspection is a hugely useful feature.

I guess I'm thinking more about the call-site specialization than the
decorator-site specialization as the focal point of a future numba
API. For example in my autodiff code I have an implementation of some
iterative function minimization routines. This might be an unusual
case, but it's very clear how it should work:

1. user code calls fmin(f, *otherstuff)
2. fmin calls f once to get a ballpark runtime estimate
3. fmin optionally specializes the heck out of f using numba
4. fmin calls f hundreds / thousands of times
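To sketch it (pure Python; `specialize` here is a hypothetical stand-in for handing f to numba, and the search itself is a toy coordinate search, not my actual autodiff code):

```python
import time

def specialize(f, sample_args):
    # Hypothetical hook: a real implementation would hand f (plus the
    # argument types just observed) to a JIT here.  This sketch is a no-op.
    return f

def fmin(f, x0, n_iter=200, jit_threshold_s=1e-4):
    # Steps 1-2: call f once, both for a starting value and a ballpark
    # estimate of the per-call cost.
    t0 = time.perf_counter()
    best_x, best_y = x0, f(x0)
    cost = time.perf_counter() - t0
    # Step 3: only bother specializing if f is expensive enough to matter.
    if cost * n_iter > jit_threshold_s:
        f = specialize(f, (x0,))
    # Step 4: the hot loop -- call f many times.
    step = 1.0
    for _ in range(n_iter):
        improved = False
        for cand in (best_x - step, best_x + step):
            y = f(cand)
            if y < best_y:
                best_x, best_y, improved = cand, y, True
        if not improved:
            step *= 0.5
    return best_x

x = fmin(lambda x: (x - 3.0) ** 2, x0=0.0)
print(x)  # -> 3.0
```

The specialized version of f never escapes fmin, which is exactly the point: specialization is an internal optimization, not part of the user-facing API.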

So in this case an explicit specialization makes syntactic sense, and
the JIT-specialized implementation is meant to be used internally by
fmin, and then thrown away. The user here is a perfectly happy numpy
user, and he/she is never meant to even see numba in this use case.
He/she puts in normal Python numpy-using code, and gets back an
ndarray with the argmin of f(). If minimizable functions had to be
decorated with numba, and could only use other functions decorated
with numba, it would be a major abstraction leak and an awkward
limitation.


Thanks for pointing out that SEP. We came up against this issue in
Theano and didn't deal with it in a way that is reusable between
projects the way this proposal would be. Theano dealt with it by
replacing the Python interpreter with custom code that steps through
the elements of the Theano computation graph. That makes a big
difference in performance when the elements themselves are small C
functions. Theano would almost certainly switch to using this SEP
proposal once it's fleshed out a little.

- James