librelist archives

composing grammars?

From:
Ant Skelton
Date:
2011-05-07 @ 14:25
Hi,

I have a complex application that originally included two treetop grammars;
one for a C like language, and another for a Ruty style templating engine. I
have split the former into a language grammar and an infix notation
expression grammar, so I need to "compose" these, by which I mean include
the expression grammar into the language grammar. My ultimate goal is to
reuse the expression grammar in the templating grammar also, to give my
template syntax more oomph.

The approved way to do this, as per the rdocs, is something like:

class FooBarParser < Parslet::Parser
    root(:foobar)

    rule(:foobar) do
        FooParser.new.as(:foo) >> BarParser.new.as(:bar)
    end
end

But I was wondering if there were any disadvantages to reusing the same
parser object, like this:

class BarFooParser < Parslet::Parser

    def initialize(*args)
        super(*args)
        @foo = FooParser.new
        @bar = BarParser.new
    end

    root(:barfoo)

    rule(:barfoo) do
        @bar.as(:bar) >> @foo.as(:foo)
    end
end

or even this:

class FooBazParser < Parslet::Parser
    class << self; attr_accessor :foo, :baz; end

    def initialize(*args)
        super(*args)
        self.class.foo ||= FooParser.new
        self.class.baz ||= BazParser.new
    end

    root(:foobaz)

    rule(:foobaz) do
        self.class.foo.as(:foo) >> self.class.baz.as(:baz)
    end
end

These all work ok in my simple test cases, but I'm wondering if parsers
store any state which might make this approach go wrong once the parser
instance is referenced in multiple rules, or on multiple branches of an
ordered choice expression? And what's the overhead of a Parser.new ? Is it
even worth bothering?

cheers
ant

Re: [ruby.parslet] composing grammars?

From:
Jason Garber
Date:
2011-05-07 @ 20:27
I'm curious about this too. I have a grammar within a grammar in RedCloth.
As far as I can tell, there's very little overhead compared to just
including the other rules. Another one of the brilliances of Parslet.
Interested in hearing an official answer, though. :)

Jason


Re: composing grammars?

From:
Kaspar Schiess
Date:
2011-05-08 @ 11:44
Hi Ant,

> The approved way to do this, as per the rdocs, is something like:
>
>     class FooBarParser < Parslet::Parser
>         root(:foobar)
>
>         rule(:foobar) do
>             FooParser.new.as(:foo) >> BarParser.new.as(:bar)
>         end
>     end

This should work well: all parsers are atoms as well, so you just 
concatenate them. And since rules memoize the parsers generated anyway, 
you'll only create one instance per FooBarParser instance. This means 
that the following tricks are not needed!
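
[Editor's sketch] The point that "all parsers are atoms as well" can be illustrated with a toy model in plain Ruby. Every class below (ToyAtom, ToyStr, ToySequence, ToyParser) is hypothetical and written only for this illustration; it is not Parslet's code and leaves out memoization, error reporting, and result trees. It only shows why a parser that responds to the same interface as an atom can be concatenated with `>>` like any other atom.

```ruby
# Toy model only: these classes are hypothetical and are not Parslet's code.

# Base class for anything that can consume a prefix of the input.
class ToyAtom
  # Sequence this atom with another one, mirroring the `>>` used in the thread.
  def >>(other)
    ToySequence.new(self, other)
  end
end

# Matches a literal string at the start of the input.
class ToyStr < ToyAtom
  def initialize(text)
    @text = text
  end

  # Returns the unconsumed remainder on success, nil on failure.
  def try(input)
    input.start_with?(@text) ? input[@text.size..-1] : nil
  end
end

# Runs each part in order, feeding the remainder to the next part.
class ToySequence < ToyAtom
  def initialize(*parts)
    @parts = parts
  end

  def try(input)
    @parts.reduce(input) { |rest, part| rest && part.try(rest) }
  end
end

# A parser is itself an atom: trying it means trying its root.
class ToyParser < ToyAtom
  def try(input)
    root.try(input)
  end

  # True only if the whole input is consumed.
  def parse(input)
    try(input) == ""
  end
end

class FooParser < ToyParser
  def root
    ToyStr.new("foo")
  end
end

class BarParser < ToyParser
  def root
    ToyStr.new("bar")
  end
end

# Composition: two whole parsers are sequenced exactly like two atoms.
class FooBarParser < ToyParser
  def root
    FooParser.new >> BarParser.new
  end
end

FooBarParser.new.parse("foobar")  # true
FooBarParser.new.parse("barfoo")  # false
```

Because ToyParser inherits from ToyAtom, no special glue is needed to embed one grammar in another, which is the property the first pattern in the thread relies on.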


> But I was wondering if there were any disadvantages to reusing the same
> parser object, like this:
>
[snipped: Instance variable example]
[snipped: Class variable example]

> These all work ok in my simple test cases, but I'm wondering if parsers
> store any state which might make this approach go wrong once the parser
> instance is referenced in multiple rules, or on multiple branches of an
> ordered choice expression? And what's the overhead of a Parser.new ? Is
> it even worth bothering?

Parsers (Parslet::Parser) have practically no state unless used. Then 
they keep a memoize structure of <rule_name, parser atom> tuples around. 
Parslet atoms have some state, like the last error that occurred in 
them. This is very limited and only created once a parse is attempted. 
Nothing that prohibits reuse of instances. Most state is passed through 
the parser tree during the parse and thrown away afterwards.

You can look at it this way: if creating parsers were costly, you could
not create hundreds of them on the first parse... every rule creates a
parser object.

Parsers can also more clumsily be composed through module inclusion, see 
ip_address.rb in the examples directory.

Generally speaking, I would go with the pattern that you cite first and 
not worry too much. If you hit a bottleneck, post the code here and 
we'll try to find the problem. Elegant and readable should be good as 
well with parslet.

best regards,
kaspar


Re: [ruby.parslet] Re: composing grammars?

From:
Ant Skelton
Date:
2011-05-08 @ 15:19
Hi,

> This should work well: all parsers are atoms as well, so you just
> concatenate them. And since rules memoize the parsers generated anyway,
> you'll only create one instance per FooBarParser instance. This means
> that the following tricks are not needed!
>

OK, thanks for the info!

On a related note, I have been trying to embed my expression-related
transformations into my expression grammar and have them automatically
applied when the parse() method is called. I want to encapsulate the
transformations along with the grammar rules, so that my AST is always
returned, not the PORO tree.

 I tried this:

class ExpressionParser < Parslet::Parser
    def parse(*args)
       self.transform(super(*args))
    end
end

This works fine when you call the parse() method on an instance of this
class, but it doesn't get called when this grammar is composed into another
parent grammar. I suspect that this is because it is invoked by its try()
method, so instead I tried this:

  def try(source, context)
      result = super(source, context)

      # Pass failures through unchanged; otherwise this method returns nil on error.
      return result if result.error?

      success(transform(flatten(result.result)))
  end

Which works, but feels evil. Is there an easier way to do this? I want to
avoid the "do all your parsing first, then do all your transforms" approach,
because it feels right that my Expression class should own everything to do
with expressions.

cheers

ant

Re: [ruby.parslet] composing grammars?

From:
Jonathan Rochkind
Date:
2011-05-09 @ 14:59
For that matter, I've been wondering. Is:

rule :foobar do
    stuff
end

Really just syntactic sugar for exactly:

def foobar
   stuff
end

?

Either way, can you compose simply using ruby modules instead?

module Foo
    rule :foobar do
      stuff
    end
end

class FooBar < Parslet::Parser
    include Foo
....
end

?

I guess that would require using non-conflicting names for the rules in
Foo and Bar, if it works at all.



Re: composing grammars?

From:
Kaspar Schiess
Date:
2011-05-10 @ 17:33
Hei!

> Really just syntactic sugar for exactly:
>
> def foobar
>    stuff
> end
>
> ?

No. More like

def foobar
   stuff
end
memoize :foobar
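
[Editor's sketch] `memoize` is not a Ruby built-in, so the snippet above is shorthand. One way such a class macro could be written in plain Ruby is shown below. The names ToyParser, SubParser, and the `memoize` helper itself are assumptions made for this illustration and are not Parslet's implementation. The sketch also demonstrates the earlier claim that an expression such as SubParser.new inside a rule body is evaluated once per parser instance.

```ruby
# Hypothetical stand-in for a parser base class; NOT Parslet's implementation.
class ToyParser
  # Class macro: replace an existing instance method with a version that
  # computes its result once per instance and returns the cached value after.
  def self.memoize(name)
    original = instance_method(name)
    define_method(name) do
      @memo ||= {}
      @memo.fetch(name) { @memo[name] = original.bind(self).call }
    end
  end
end

# Hypothetical sub-parser that counts how often it is instantiated.
class SubParser
  class << self; attr_accessor :created; end
  self.created = 0

  def initialize
    self.class.created += 1
  end
end

class FooBarParser < ToyParser
  def foobar
    SubParser.new
  end
  memoize :foobar
end

parser = FooBarParser.new
first  = parser.foobar
second = parser.foobar   # served from the per-instance cache

first.equal?(second)     # true: the same object both times
SubParser.created        # 1: the method body ran only once for this instance
```

The cache lives in an instance variable, so a second FooBarParser instance would build its own SubParser.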

> Either way, can you compose simply using ruby modules instead?
>
> module Foo
>     rule :foobar do
>       stuff
>     end
> end
Yes. See ip_address.rb...

> class FooBar < Parslet::Parser
>     include Foo
> ....
> end
>
> ?
>
> I guess that would require using non-conflicting rules for your names in
> Foo and Bar, if it works at all.
Yup. That's it.
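
[Editor's sketch] Since rules end up as methods, as described above, the name-conflict concern follows from ordinary Ruby method lookup. The plain-Ruby illustration below uses ordinary methods as stand-ins for rules; the modules and method names are hypothetical and no Parslet code is involved.

```ruby
# Plain-Ruby illustration; ordinary methods stand in for grammar rules.
module Foo
  def number
    "Foo#number"
  end

  def foo_only
    "Foo#foo_only"
  end
end

module Bar
  # Same name as Foo#number: this is the conflict to avoid.
  def number
    "Bar#number"
  end
end

class FooBar
  include Foo
  include Bar
end

fb = FooBar.new
fb.number                 # "Bar#number": the module included last is searched first
fb.foo_only               # "Foo#foo_only": non-conflicting names are unaffected
FooBar.ancestors.take(3)  # [FooBar, Bar, Foo]
```

No error is raised when the names collide; one definition silently shadows the other, which is why distinct rule names per module are the safer convention.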

regards,
kaspar