Re: [ruby.parslet] Re: issue #64 discussion (error trees)
- From:
- Jonathan Rochkind
- Date:
- 2012-04-17 @ 02:11
I've struggled with figuring out how to provide useful parse error
messages to users too.
If there are hints for how to do that with present parslet, please do
share examples.
If Parslet can be modified to make this easier, I'm all for it. If you
have to choose between error messages useful for grammar developers and
error messages useful for end-users entering strings to be parsed -- I'd
definitely choose in favor of the end-users.
The developer has other tools available to him or her, like a debugger,
and unit tests, and the ability to change the code and see what happens,
and the ability to send sub-strings through specific rules, which Parslet
makes so easy.
The end-user has nothing but the error messages we can provide them.
With Parslet it is super easy to debug grammars even without the error trees --
its pure-Ruby nature and the ability to take an individual rule and parse a
sub-string with it make it easier than anything else I've used. I've
barely ever needed to use the 'error tree' to debug my grammars when
writing em.
But I've needed to figure out a way to provide good "can't parse" errors
to end-users, and not been able to figure one out.
Re: issue #64 discussion (error trees)
- From:
- Kaspar Schiess
- Date:
- 2012-04-17 @ 07:50
On 17.04.12 04:11, Jonathan Rochkind wrote:
> If Parslet can be modified to make this easier, I'm all for it. If you
> have to choose between error messages useful for grammar developers and
> error messages useful for end-users entering strings to be parsed -- I'd
> definitely choose in favor of the end-users.
I refuse the excluded middle here. Current parser errors are what makes
parslet cool to me and to people I've been getting feedback from. If you
want an engine that is just geared towards end users - parslet is not it.
The debugging argument you make is invalid. Treetop also generates Ruby
code and could be amenable to debugging with the techniques you describe
- and it's faster too! - but I don't see you using it...
This is a clear case of shifting needs. Need no. 1 is mine: how to debug
a parser engine? Need no. 2 is the grammar writer's need: how to debug
the grammar I am writing? Need no. 3 is the grammar user's need: how to
make my input conform to the grammar?
I am all for pluggable error engines (how cool would that be!), but I
will not support either/or decisions on this one.
My current plan is to fix the engine we have (it's broken a bit) and
then see how to make it pluggable. Then I will try to get jmettraux to
implement his strategy as a plugin. Let's see how that goes (unless
someone has a better idea that doesn't mean sacrifice).
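
A minimal sketch of what a pluggable error reporter could look like, assuming
the failure is available as a tree of causes. This is plain Ruby, not
parslet's API: Cause, TreeReporter and DeepestReporter are hypothetical names
used only for illustration. One reporter prints the whole tree for the grammar
writer, the other picks the single deepest leaf for the end user.

```ruby
# Hypothetical failure record: a message, an input position, and child causes.
Cause = Struct.new(:message, :pos, :children)

# Reporter for grammar writers: prints the full tree, indented by depth.
class TreeReporter
  def report(cause, depth = 0)
    own = "#{'  ' * depth}#{cause.message} at #{cause.pos}"
    ([own] + cause.children.map { |child| report(child, depth + 1) }).join("\n")
  end
end

# Reporter for end users: only the leaf failure that got furthest in the input.
class DeepestReporter
  def report(cause)
    leaf = leaves(cause).max_by(&:pos)
    "#{leaf.message} at #{leaf.pos}"
  end

  private

  # Collect all causes that have no children of their own.
  def leaves(cause)
    return [cause] if cause.children.empty?
    cause.children.flat_map { |child| leaves(child) }
  end
end

# A hand-built failure tree standing in for what an engine might produce.
failure = Cause.new('Failed to match statement', 0, [
  Cause.new('Failed to match assignment', 0, [Cause.new("Expected '='", 5, [])]),
  Cause.new("Expected 'print'", 0, [])
])

puts TreeReporter.new.report(failure)
puts DeepestReporter.new.report(failure)
```

Both reporters consume the same failure data, so choosing one is a decision
made by the caller rather than by the engine.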
kaspar
Re: [ruby.parslet] Re: issue #64 discussion (error trees)
- From:
- Jonathan Rochkind
- Date:
- 2012-04-17 @ 15:11
Obviously it's your code, you can do what you want with it!
The reason I don't use treetop is because it's horribly painful to use,
and I could never get a grammar to work with it. Parslet is a dream, so
much of a dream that I personally never even had to use the error_tree
to debug my grammars and get them working.
But if you or other users find the current error_tree useful, then of
course, that's that, I'm just me.
Suggestions or examples of how to provide useful end-user parse error
messages with Parslet would definitely be welcome though, if anyone has
them.
Re: issue #64 discussion (error trees)
- From:
- Kaspar Schiess
- Date:
- 2012-04-18 @ 09:38
On 17.04.12 17:11, Jonathan Rochkind wrote:
> Suggestions or examples of how to provide useful end-user parse error
> messages with Parslet would definitely be welcome though, if anyone has
> them.
I think such suggestions will be a result of the current discussion,
once we've implemented all the things we would like to. I even have some
new ideas that I want to try out first.
If anyone knows about parser engines that do well in this respect,
please share! We'll blatantly copy what they do.
k
Re: [ruby.parslet] Re: issue #64 discussion (error trees)
- From:
- Nigel Thorne
- Date:
- 2012-04-19 @ 22:43
I just wanted to say I love the idea of parsers having different pluggable
error reporters for different kinds of feedback.
Great idea guys!
---
"No man is an island... except Philip"
Re: [ruby.parslet] Re: issue #64 discussion (error trees)
- From:
- John Mettraux
- Date:
- 2012-04-17 @ 08:10
On Tue, Apr 17, 2012 at 09:50:16AM +0200, Kaspar Schiess wrote:
>
> My current plan is to fix the engine we have (it's broken a bit) and
> then see how to make it pluggable. Then I will try to get jmettraux to
> implement his strategy as a plugin. Let's see how that goes (unless
> someone has a better idea that doesn't mean sacrifice).
Hello Kaspar,
sounds great.
The technique I was using until my fork was to catch the ParseFailed then
mine the info in the exception and the parser to re-raise an exception that I
deemed more appropriate for my users. I opened issue #64 when I realized that
some of the required information got discarded.
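
A rough sketch of the catch-and-re-raise technique described above, written
against a stand-in parser so that it runs on its own. ToyParseFailed,
UserSyntaxError, low_level_parse and parse_for_user are hypothetical names;
with parslet itself one would rescue the library's ParseFailed instead and
mine whatever information it exposes.

```ruby
# Stand-in for the library's exception: a low-level message plus a raw offset.
class ToyParseFailed < StandardError
  attr_reader :offset

  def initialize(message, offset)
    super(message)
    @offset = offset
  end
end

# The exception we want end users to see.
class UserSyntaxError < StandardError; end

# Hypothetical low-level parser: accepts only lines of the form key=value.
def low_level_parse(source)
  offset = 0
  source.each_line do |line|
    unless line.chomp.match?(/\A\w+=\w+\z/)
      raise ToyParseFailed.new('Failed to match sequence (KEY "=" VALUE)', offset)
    end
    offset += line.size
  end
  :ok
end

# Translate a raw character offset into a 1-based line and column.
def line_and_column(source, offset)
  before = source[0, offset]
  line = before.count("\n") + 1
  column = offset - (before.rindex("\n") || -1)
  [line, column]
end

# Catch the low-level failure and re-raise something aimed at the end user.
def parse_for_user(source)
  low_level_parse(source)
rescue ToyParseFailed => e
  line, column = line_and_column(source, e.offset)
  snippet = source.lines[line - 1].to_s.chomp
  raise UserSyntaxError, "syntax error at line #{line}, column #{column}: #{snippet.inspect}"
end

begin
  parse_for_user("a=1\nb 2\n")
rescue UserSyntaxError => e
  puts e.message
end
```

The translation step can only be as precise as the information carried by the
original exception, which is the limitation the message above refers to.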
Thanks in advance,
John
Re: issue #64 discussion (error trees)
- From:
- Kaspar Schiess
- Date:
- 2012-04-16 @ 17:14
Hei John,
Just reviewing your latest set of changes on the arboriculture branch of
your parslet repository. What you basically propose is a different
approach to what errors matter and how they should be displayed. Let me
try to explain to you your own approach and see if I get your idea
straight. Only then will I try to do a critique of it and then maybe we
can find convergence.
You use the concept of deepest error, which is the error that happened
at the parse position that was most advanced in the source file. The
changes you propose would completely remove the stack-trace like
error-trees and replace them with just one error message that is
associated with that deepest parse. It would also be mostly associated
with the most concrete parse at that position: the error would not say
that a high level rule failed, but that a rule like match() or str() failed.
If the above doesn't capture your idea, consider what is below
irrelevant to discussion. Feel free to set me right.
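
To make the "deepest error" notion concrete, here is a small sketch under
heavy assumptions: a hand-written toy matcher, not parslet and not the code in
the arboriculture branch. DeepestErrorTracker, match_str and parse_statement
are hypothetical names. Each failing atom reports its position, and only the
failure furthest into the input is kept.

```ruby
# Keeps only the failure that happened furthest into the input.
class DeepestErrorTracker
  attr_reader :pos, :message

  def initialize
    @pos = -1
    @message = nil
  end

  # Record a failure only if it is deeper than anything seen so far.
  def report(pos, message)
    return if pos <= @pos
    @pos = pos
    @message = message
  end
end

# Try to match the literal +expected+ at +pos+; return the new position or nil.
def match_str(input, pos, expected, tracker)
  if input[pos, expected.size] == expected
    pos + expected.size
  else
    tracker.report(pos, "expected #{expected.inspect} at #{pos}")
    nil
  end
end

# Toy grammar with two alternatives: 'let x=1' or 'print x'.
def parse_statement(input, tracker)
  pos = match_str(input, 0, 'let ', tracker)
  pos &&= match_str(input, pos, 'x', tracker)
  pos &&= match_str(input, pos, '=', tracker)
  pos &&= match_str(input, pos, '1', tracker)
  return pos if pos

  pos = match_str(input, 0, 'print ', tracker)
  pos &&= match_str(input, pos, 'x', tracker)
  pos
end

tracker = DeepestErrorTracker.new
result = parse_statement('let x:1', tracker)
puts tracker.message unless result
```

The failure of the second alternative at position 0 is dropped because the
first alternative already failed further in, at the concrete str-like atom.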
This approach seems to work well with the grammar you use. Have you
thought about how this generalizes? It seems really easy to construct a
pathological grammar where the deepest error carries no meaning to the
user of your language.
How does the grammar writer know how the deepest error relates to the
grammar? What should I fiddle with if I know the input is correct but
the grammar is not? It seems that we have two sets of needs here. As a
grammar writer I want to know how my grammar failed to parse X; as a
writer of X I might indeed just want to know about one position to
twiddle. The error tree anchors the errors back into the structure of
the grammar; but it leaves the problem of what to display to the user
(writer of X) completely unsolved. I know I've gone half way only and
solved my own problem there. Finally somebody notices.
Another concern I've been having (that you probably didn't think of
here) is the time parslet is spending in the management of all those
error objects. Even with efficient GC, constructing all those objects
takes a lot of time when we probably don't need half of them. Your
approach doesn't address the problem, it just filters what to keep
differently.
I am thinking: could we do a first parse for getting just results, and
once that fails, do a second parse that constructs error information
using a kind of aggregator? Aggregation could then implement either of
our ideas about what errors should look like... We might be winning on
more than one front at once. How does that sound?
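
The two-pass idea could be sketched roughly as follows. This is a toy parser
in plain Ruby, not a proposal for parslet's internals; NullReporter,
CollectingReporter, parse_sum and parse_with_errors are hypothetical names.
The first pass runs with a reporter that ignores failures, and only when it
fails is the input parsed again with an aggregator that builds error details.

```ruby
# Reporter for the fast pass: ignores failures and never builds a message.
class NullReporter
  def failure(_pos)
    nil
  end
end

# Aggregator for the second pass: evaluates the block to build each message.
class CollectingReporter
  attr_reader :errors

  def initialize
    @errors = []
  end

  def failure(pos)
    @errors << [pos, yield]
    nil
  end
end

# Toy grammar: single digits separated by '+'. Returns true on success,
# or whatever the reporter returns (nil here) on failure.
def parse_sum(input, reporter)
  pos = 0
  loop do
    unless input[pos].to_s.match?(/\d/)
      return reporter.failure(pos) { "expected a digit at #{pos}" }
    end
    pos += 1
    return true if pos == input.size
    unless input[pos] == '+'
      return reporter.failure(pos) { "expected '+' at #{pos}" }
    end
    pos += 1
  end
end

# Happy path first; pay for error construction only after a failure.
def parse_with_errors(input)
  return [true, []] if parse_sum(input, NullReporter.new)

  collector = CollectingReporter.new
  parse_sum(input, collector)
  [false, collector.errors]
end

ok, errors = parse_with_errors('1+2+x')
puts errors.map(&:last) unless ok
```

Because the message is built inside a block, the fast pass never evaluates it;
the cost is parsing failing input twice.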
We'd finally be comparing different kinds of apples when benchmarking
against Treetop, at least...
I will now try to hack your grammar to produce better error messages,
without changing parslet. Just because I think this might be doable ;)
I'll report back.
regards, kaspar
Re: [ruby.parslet] Re: issue #64 discussion (error trees)
- From:
- John Mettraux
- Date:
- 2012-04-16 @ 22:17
On Mon, Apr 16, 2012 at 07:14:19PM +0200, Kaspar Schiess wrote:
>
> Just reviewing your latest set of changes on the arboriculture branch of
> your parslet repository. What you basically propose is a different
> approach to what errors matter and how they should be displayed. Let me
> try to explain to you your own approach and see if I get your idea
> straight. Only then will I try to do a critique of it and then maybe we
> can find convergence.
Hello Kaspar,
great.
> You use the concept of deepest error, which is the error that happened
> at the parse position that was most advanced in the source file. The
> changes you propose would completely remove the stack-trace like
> error-trees and replace them with just one error message that is
> associated with that deepest parse. It would also be mostly associated
> with the most concrete parse at that position: the error would not say
> that a high level rule failed, but that a rule like match() or str() failed.
>
> If the above doesn't capture your idea, consider what is below
> irrelevant to discussion. Feel free to set me right.
You're right. My motivation is coming from "end users" complaining about
error messages pointing at the haystack rather than at the needle. I was
already offering some pinpointing by reading the error_tree, but it fell
short because the error_tree felt truncated (hence my opening of
https://github.com/kschiess/parslet/issues/64 ).
> This approach seems to work well with the grammar you use. Have you
> thought about how this generalizes? It seems really easy to construct a
> pathological grammar where the deepest error carries no meaning to the
> user of your language.
Granted, my vision is certainly limited to the walls of my "grammar cubicle".
I have a couple other grammars I use elsewhere but they're not as fat as the
one with which I'm striving now.
> How does the grammar writer know how the deepest error relates to the
> grammar? What should I fiddle with if I know the input is correct but
> the grammar is not? It seems that we have two sets of needs here. As a
> grammar writer I want to know how my grammar failed to parse X; as a
> writer of X I might indeed just want to know about one position to
> twiddle. The error tree anchors the errors back into the structure of
> the grammar; but it leaves the problem of what to display to the user
> (writer of X) completely unsolved. I know I've gone half way only and
> solved my own problem there. Finally somebody notices.
When developing the grammar, I used the error_tree a lot. Now that I'm
handing grammar, parser and transformer to the user, I need a helpful error;
the users aren't as patient as I am.
The parser I work with most of my time is the Ruby one. Most of the time it's
providing me with a decent error message pointing at my mistake, the rest of
the time
> Another concern I've been having (that you probably didn't think of
> here) is the time parslet is spending in the management of all those
> error objects. Even with efficient GC, constructing all those objects
> takes a lot of time when we probably don't need half of them. Your
> approach doesn't address the problem, it just filters what to keep
> differently.
Exactly.
"half of them": from what my exploration taught me, we could keep the error
with the deepest pos and discard errors (ie not instantiate them) with
smaller pos. Granted, some combinations of grammar and source could yield
"instantiate 90% of the errors" and the win would be worthless.
> I am thinking: could we do a first parse for getting just results, and
> once that fails, do a second parse that constructs error information
> using a kind of aggregator? Aggregation could then implement either of
> our ideas about what errors should look like... We might be winning on
> more than one front at once. How does that sound?
I like the idea a lot, sounds right, keep the happy path lean.
> We'd finally be comparing different kinds of apples when benchmarking
> against Treetop, at least...
>
> I will now try to hack your grammar to produce better error messages,
> without changing parslet. Just because I think this might be doable ;)
> I'll report back.
Looking forward to the results.
Please take some time to look at individual commits in my arboriculture fork,
it's not all misguided adventure ;-)
Thanks a ton!
John
Re: issue #64 discussion (error trees)
- From:
- Kaspar Schiess
- Date:
- 2012-04-17 @ 07:44
On 17.04.12 00:17, John Mettraux wrote:
> Please take some time to look at individual commits in my arboriculture fork,
> it's not all misguided adventure;-)
Believe me, I've been through the commits. As you say, it is not all
misguided (rather: it's not misguided at all), but the commits are large
and you mix code cleanup (boy scouting) with real changes - very hard to
pull. But go on exploring, I have a benevolent eye on the changes and
will prompt you for patches (if I may) later on, ok?
On the topic of what goes where: When we're discussing changes to
parslet, I'd like to do it here. Also problems with specific grammars.
When we're talking about bugs and remedies, we can use issues. Is that
illogical to you guys?
kaspar
Re: [ruby.parslet] Re: issue #64 discussion (error trees)
- From:
- John Mettraux
- Date:
- 2012-04-17 @ 08:06
On Tue, Apr 17, 2012 at 09:44:06AM +0200, Kaspar Schiess wrote:
> On 17.04.12 00:17, John Mettraux wrote:
> > Please take some time to look at individual commits in my arboriculture fork,
> > it's not all misguided adventure;-)
>
> Believe me, I've been through the commits. As you say, it is not all
> misguided (rather: it's not misguided at all), but the commits are large
> and you mix code cleanup (boy scouting) with real changes - very hard to
> pull. But go on exploring, I have a benevolent eye on the changes and
> will prompt you for patches (if I may) later on, ok?
Hello Kaspar,
pick what you want without asking. I was/am not really hoping for them to
get pulled in, just wanted to share the ideas in them.
> On the topic of what goes where: When we're discussing changes to
> parslet, I'd like to do it here. Also problems with specific grammars.
> When we're talking about bugs and remedies, we can use issues. Is that
> illogical to you guys?
Sounds good. Don't hesitate to take the lead and redirect a conversation from
the issue tracker and link to issues from the mailing list.
John