Hey all,
I am having trouble lexing escaped quotes in my strings. I want to allow
something like this: "as\"df".
Since "as\"df".match(/(\"|[^"])*/); seems to work just fine in JavaScript,
I hoped this simple test case would work:
%lex
%x string
%%
'"' this.begin('string');
<string>'"' this.popState();
<string>(\"|[^"])* { return 'STRING'; }
<<EOF>> return 'EOF'
/lex
%start string
%%
string
: STRING EOF { return $1; }
;
But it can't parse anything. If I change the pattern from (\"|[^"])* to
[^"]* it can parse strings without escaped quotes just fine. I don't quite
know where to go from here. How would this be done?
Short answer: you must escape the escape character.

The pattern (\"|[^"])* matches quotes or non-quote characters -- not
escaped quotes and non-quote characters. The backslash in front of the
quote is consumed as a regex escape, so \" just means a literal quote.

The JavaScript example is unaffected because the pattern never sees
the opening and closing quote marks (the string literal "as\"df"
evaluates to as"df), so matching all quote or non-quote characters works.

In the lexer example, the pattern needs to distinguish between escaped
quotes and the closing quote, so it fails. The pattern (\\\"|[^"])*
should work as you intended.

Hope that helps,
Zach

On Tue, Jan 10, 2012 at 11:23 PM, Paul Harper <benekastah@gmail.com> wrote:
> I am having trouble lexing escaped quotes in my strings. I want to allow
> something like this: "as\"df".
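For anyone who wants to check the difference outside of Jison, here is a minimal JavaScript sketch. It is not from the thread: the variable names are made up for illustration, and it assumes the lexer hands the pattern the raw input text, with the backslash and the closing quote still in it.

```javascript
// Raw text the <string> state would see after the opening quote:
//   as\"df"
// (the final " is the closing quote of the string being lexed)
const rest = 'as\\"df"';

// Original pattern: \" is just a quote, so this matches quote OR non-quote,
// i.e. every character, and runs straight past the closing quote.
const original = /^(\"|[^"])*/;

// Suggested pattern: \\ is a literal backslash and \" is a quote, so the
// first alternative matches an escaped quote; anything else must be a non-quote.
const escaped = /^(\\\"|[^"])*/;

const originalMatch = rest.match(original)[0];
const escapedMatch = rest.match(escaped)[0];

console.log(originalMatch); // as\"df"  (closing quote swallowed)
console.log(escapedMatch);  // as\"df   (stops before the closing quote)
```

With the original pattern nothing is left for the <string>'"' rule to match, which fits the "can't parse anything" symptom.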
Man, this is really helpful. Can you put this into the documentation if it
isn't already?

On Wed, Jan 11, 2012 at 2:01 AM, Zachary Carter <zack.carter@gmail.com> wrote:
> Short answer: you must escape the escape character.
> [...] The pattern (\\\"|[^"])* should work as you intended.

--
Robert Plummer
Very helpful indeed. Worked like a charm. Thanks!

On Wed, Jan 11, 2012 at 6:25 AM, Robert Plummer <robertleeplummerjr@gmail.com> wrote:
> Man, this is really helpful, can you put this into the documentation if it
> isn't already?