2.3. Using Precedences

Going back to our earlier expression-parsing example, wouldn't it be nicer if we didn't have to explicitly separate the expressions into terms and factors, merely to make it clear that '*' and '/' operators bind more tightly than '+' and '-'?

We could just change the grammar as follows (making the appropriate changes to the expression datatype too):

Exp   : let var '=' Exp in Exp  { Let $2 $4 $6 }
      | Exp '+' Exp             { Plus $1 $3 }
      | Exp '-' Exp             { Minus $1 $3 }
      | Exp '*' Exp             { Times $1 $3 }
      | Exp '/' Exp             { Div $1 $3 }
      | '(' Exp ')'             { Brack $2 }
      | int                     { Int $1 }
      | var                     { Var $1 }

but now Happy will complain that there are shift/reduce conflicts because the grammar is ambiguous - we haven't specified whether e.g. 1 + 2 * 3 is to be parsed as 1 + (2 * 3) or (1 + 2) * 3. Happy allows these ambiguities to be resolved by specifying the precedences of the operators involved using directives in the header[1]:

...
%right in
%left '+' '-'
%left '*' '/'
%%
...

The %left or %right directive is followed by a list of terminals, and declares all these tokens to be left or right-associative respectively. The precedence of these tokens with respect to other tokens is established by the order of the %left and %right directives: earlier means lower precedence. A higher precedence causes an operator to bind more tightly; in our example above, because '*' has a higher precedence than '+', the expression 1 + 2 * 3 will parse as 1 + (2 * 3).

What happens when two operators have the same precedence? This is when the associativity comes into play. Operators specified as left associative will cause expressions like 1 + 2 - 3 to parse as (1 + 2) - 3, whereas right-associative operators would parse as 1 + (2 - 3). There is also a %nonassoc directive which indicates that the specified operators may not be used together. For example, if we add the comparison operators '>' and '<' to our grammar, then we would probably give their precedence as:

...
%nonassoc '>' '<'
%right in
%left '+' '-'
%left '*' '/'
%%
...

which indicates that '>' and '<' bind less tightly than the other operators, and the non-associativity causes expressions such as 1 > 2 > 3 to be disallowed.

2.3.1. How precedence works

The precedence directives, %left, %right and %nonassoc, assign precedence levels to the tokens in the declaration. A rule in the grammar may also have a precedence: if the last terminal in the left hand side of the rule has a precedence, then this is the precedence of the whole rule.

The precedences are used to resolve ambiguities in the grammar. If there is a shift/reduce conflict, then the precedence of the rule and the lookahead token are examined in order to resolve the conflict:

2.3.2. Context-dependent Precedence

The precedence of an individual rule can be overriden, using context precedence. This is useful when, for example, a particular token has a different precedence depending on the context. A common example is the minus sign: it has high precedence when used as prefix negation, but a lower precedence when used as binary subtraction.

We can implement this in Happy as follows:

%nonassoc '>' '<'
%right in
%left '+' '-'
%left '*' '/'
%left NEG
%%

Exp   : let var '=' Exp in Exp  { Let $2 $4 $6 }
      | Exp '+' Exp             { Plus $1 $3 }
      | Exp '-' Exp             { Minus $1 $3 }
      | Exp '*' Exp             { Times $1 $3 }
      | Exp '/' Exp             { Div $1 $3 }
      | '(' Exp ')'             { Brack $2 }
      | '-' Exp %prec NEG       { Negate $2 }
      | int                     { Int $1 }
      | var                     { Var $1 }

We invent a new token NEG as a placeholder for the precedence of our prefix negation rule. The NEG token doesn't need to appear in a %token directive. The prefix negation rule has a %prec NEG directive attached, which overrides the default precedence for the rule (which would normally be the precedence of '-') with the precedence of NEG.

Notes

[1]

Users of yacc will find this familiar, Happy's precedence scheme works in exactly the same way.