librelist archives

Lexing string with escaped quotes

From: Paul Harper
Date: 2012-01-11 @ 04:23

Hey all,

I am having trouble lexing escaped quotes in my strings. I want to allow
something like this: "as\"df".

Since "as\"df".match(/(\"|[^"])*/); seems to work just fine in JavaScript,
I hoped this simple test case would work:

%lex
%x string
%%

'"'                                     this.begin('string');
<string>'"'                             this.popState();
<string>(\"|[^"])*                      { return 'STRING'; }

<<EOF>> return 'EOF'

/lex

%start string

%%

string
  : STRING EOF { return $1; }
  ;

But it can't parse anything. If I change the pattern from (\"|[^"])* to
[^"]* it can parse strings without escaped quotes just fine. I don't quite
know where to go from here. How would this be done?

Re: [jison] Lexing string with escaped quotes

From: Zachary Carter
Date: 2012-01-11 @ 07:01

Short answer: you must escape the escape character.

The pattern (\"|[^"])* matches quotes or non-quote characters, not
escaped quotes or non-quote characters. In the pattern, \" is just an
escaped quote, so the backslash in front of the quote is stripped!

The JavaScript example is unaffected because the pattern never sees
the opening and closing quote marks, so matching all quote or
non-quote characters works.

In the lexer example, the pattern needs to distinguish between escaped
quotes and the closing quote, so the original pattern fails. The pattern
(\\\"|[^"])* should work as you intended.

Hope that helps,
Zach
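
Editor's note: the difference between the two patterns can be reproduced
in plain JavaScript. This is only a sketch of the regex behaviour, not of
Jison itself; the variable names (rawInput, body, loose, strict) are made
up for illustration, and the fixed pattern is written in JavaScript regex
syntax, where the quote needs no escaping.

```javascript
// The JS string literal "as\"df" holds no backslash: it is the 5 characters as"df.
const literal = "as\"df";
console.log(literal.length); // 5

// Raw source text as a lexer would receive it, backslash included: "as\"df"
const rawInput = String.raw`"as\"df"`;

// Assume the opening quote has already been consumed, as in the <string> state.
const body = rawInput.slice(1); // as\"df"

// Original pattern: \" is only a quote, so every character matches,
// including the closing quote.
const loose = /^(\"|[^"])*/;

// Fixed pattern: a backslash followed by a quote, or any non-quote character.
const strict = /^(\\"|[^"])*/;

const looseMatch = body.match(loose)[0];
const strictMatch = body.match(strict)[0];

console.log(looseMatch.length);  // 7: the closing quote was swallowed
console.log(strictMatch.length); // 6: stops right before the closing quote
```

With the loose pattern nothing is left for the closing-quote rule to
match, which is consistent with the grammar above failing to parse.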

Re: [jison] Lexing string with escaped quotes

From: Robert Plummer
Date: 2012-01-11 @ 13:25

Man, this is really helpful. Can you put this into the documentation if it
isn't already there?

-- 
Robert Plummer

Re: [jison] Lexing string with escaped quotes

From: Paul Harper
Date: 2012-01-12 @ 03:16

Very helpful indeed. Worked like a charm. Thanks!
