A "left-recursive" grammar means that the parser can get into a loop in the parsing rules without making any progress consuming the input. Imagine that each production is a subroutine that might eat some tokens or call some other subroutines. For example, consider this production:
expr → number '+' number
This means that the expr
parser will first call the number
parser (which might consume some tokens), then try to eat a +
token (and fail if the next token is not of that type) and then call the number
parser again to consume more tokens. If that is all successful, the expr
parser will yield success.
Now consider this left-recursive production:
expr → expr '+' expr
The first thing the expr
parser will do is to call itself recursively, putting itself into an unproductive infinite loop.
In contrast, consider this production, which is not left-recursive:
expr → '(' expr ')'
The first thing the parser will do is try to consume an open parenthesis token, and it will call itself recursively only if that is successful. It can never get into an infinite loop doing this because there are only so many parentheses and it has to eat one each time before it can recurse; eventually it must run out of parentheses and then the recursion will stop. Your Expr = [0-9]+ / '(' Expr ')'
example is similarly safe.
Here is a portion of the left-recursive grammar you showed:
expr → [0-9]+ | sum sum → expr '*' expr
If the next token is not [0-9]+
, then the expr
parser will call sum
, which will immediately call expr
again, and the parser will be in an infinite loop.
Usually you can rewrite rules like sum
in a non-left-recursive form. In this case it will look something like this:
expr → term more_terms more_terms → '*' term more_terms | '+' term more_terms | term → [0-9]+ | '(' expr ')'
This is not left-recursive because the more_terms
parser must consume a *
or a +
token before it can call itself recursively; it is impossible for the parser to get into an infinite loop by doing this. Similarly the expr
parser calls term
, but term
will not call back to expr
without consuming a parenthesis.
In general the transformation says to turn a left-recursive grammar like this:
S → S 'p' | 'q'
into a non-left-recursive one like this:
S → 'q' ps ps → 'p' ps |
I don't know if there are any good online resources (I suppose there must be) but if you like Perl, chapter 8 of Higher-Order Perl is available online and discusses precisely this point.
Also, I hope it is not insulting to suggest that you read the Wikipedia article about left recursion, and perhaps the article about recursive descent parsing, which is the name for the technique your parser generator is trying to use.