Issues with the grammar and reference implementation
- From: Petri Lehtinen
- Date: 2010-02-26 @ 19:07
I found a few more issues.
The grammar is a bit ambiguous regarding unquoted property names and
numbers, as both may start with a dash '-'. The reference
implementation doesn't allow property names starting with a dash, and
I personally think that the lexical analysis would be simpler if this
ambiguity were removed. The grammar should be like this:
    property_name
        json_string
        [A-Za-z_][A-Za-z_\-]*
Moreover, I don't see why digits should be prohibited after the
initial alpha or underscore. Allowing digits would make the property
names more like identifiers in popular programming languages (although
a dash is not usually allowed in identifiers, as it is also the
subtraction operator, so one option would be to prohibit dashes in
property names completely).
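As a rough illustration of how the proposed rule removes the ambiguity,
here is a small Python sketch. It is not taken from the reference
implementation: the names PROPERTY_NAME, PROPERTY_NAME_WITH_DIGITS,
NUMBER and classify are made up for this example, the number pattern is
deliberately simplified, and the rule is assumed to repeat its second
character class zero or more times.

```python
import re

# Proposed rule: first character is a letter or underscore, later
# characters may also be a dash (repetition is an assumption).
PROPERTY_NAME = re.compile(r"[A-Za-z_][A-Za-z_\-]*")

# Variant that also allows digits after the first character.
PROPERTY_NAME_WITH_DIGITS = re.compile(r"[A-Za-z_][A-Za-z0-9_\-]*")

# Simplified number pattern, for illustration only.
NUMBER = re.compile(r"-?[0-9]+(\.[0-9]+)?")


def classify(token):
    """Classify an unquoted token as 'property_name', 'number' or 'invalid'."""
    if PROPERTY_NAME_WITH_DIGITS.fullmatch(token):
        return "property_name"
    if NUMBER.fullmatch(token):
        return "number"
    return "invalid"
```

With these patterns a leading dash can only ever begin a number, so the
lexer can decide between the two token types by looking at the first
character alone.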
Another problem with property names is that the keywords ('any',
'array', 'boolean', 'integer', 'null', 'number', 'object', 'string' or
'union') are not explicitly prohibited, i.e. the following orderly
schema is valid:
object { integer integer; }
This might be a bit confusing for the user. Moreover, the reference
implementation doesn't allow this, as it makes a clear separation
between keywords and property names on the lexical level.
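A minimal Python sketch of such a lexical separation follows. It only
illustrates the idea and is not the reference implementation's code;
KEYWORDS, token_kind and check_property_name are hypothetical names.

```python
# The keywords listed above.
KEYWORDS = frozenset([
    "any", "array", "boolean", "integer", "null",
    "number", "object", "string", "union",
])


def token_kind(word):
    """Return 'keyword' for reserved words, 'property_name' otherwise."""
    return "keyword" if word in KEYWORDS else "property_name"


def check_property_name(word):
    """Reject an unquoted property name that collides with a keyword."""
    if word in KEYWORDS:
        raise ValueError(f"'{word}' is a keyword and must be quoted")
    return word
```

Under this scheme the second 'integer' in the schema above would be
lexed as a keyword, so the schema would be rejected unless the property
name is written as a quoted string.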
In addition to the more complex ones above, I found some minor bugs in
the reference implementation:
* Empty union yields an empty JSON schema:
      union {} --> {}
  This is incorrect because, according to the JSON schema
  specification, a union must hold at least two sub-schemas, and an
  empty schema accepts any JSON value.
* Negative values for ranges are accepted:
      string{-5,} --> {'type': 'string', 'minLength': -5}
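The two checks that seem to be missing could look roughly like the
Python sketch below. The function names check_union and string_range
are invented for this example, and the output dictionary only mirrors
the 'minLength' output shown above ('maxLength' is assumed to be its
upper-bound counterpart).

```python
def check_union(sub_schemas):
    """Reject unions with fewer than two sub-schemas."""
    if len(sub_schemas) < 2:
        raise ValueError("a union needs at least two sub-schemas")
    return list(sub_schemas)


def string_range(minimum=None, maximum=None):
    """Build a string schema, rejecting negative length bounds."""
    schema = {"type": "string"}
    for key, value in (("minLength", minimum), ("maxLength", maximum)):
        if value is None:
            continue  # bound omitted, as in string{5,}
        if value < 0:
            raise ValueError(f"{key} must not be negative, got {value}")
        schema[key] = value
    return schema
```

With checks like these, both 'union {}' and 'string{-5,}' would be
reported as errors instead of being translated.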
Petri Lehtinen