2.3. Using Precedences

2.3. Using Precedences
Prev	Chapter 2. Using Happy	Next

Going back to our earlier expression-parsing example, wouldn't it be nicer if we didn't have to explicitly separate the expressions into terms and factors, merely to make it clear that '*' and '/' operators bind more tightly than '+' and '-'?

We could just change the grammar as follows (making the appropriate changes to the expression datatype too):

Exp   : let var '=' Exp in Exp  { Let $2 $4 $6 }
      | Exp '+' Exp             { Plus $1 $3 }
      | Exp '-' Exp             { Minus $1 $3 }
      | Exp '*' Exp             { Times $1 $3 }
      | Exp '/' Exp             { Div $1 $3 }
      | '(' Exp ')'             { Brack $2 }
      | int                     { Int $1 }
      | var                     { Var $1 }

but now Happy will complain that there are shift/reduce conflicts because the grammar is ambiguous - we haven't specified whether e.g. 1 + 2 * 3 is to be parsed as 1 + (2 * 3) or (1 + 2) * 3. Happy allows these ambiguities to be resolved by specifying the precedences of the operators involved using directives in the header^[2]:

...
%right in
%left '+' '-'
%left '*' '/'
%%
...

The %left or %right directive is followed by a list of terminals, and declares all these tokens to be left or right-associative respectively. The precedence of these tokens with respect to other tokens is established by the order of the %left and %right directives: earlier means lower precedence. A higher precedence causes an operator to bind more tightly; in our example above, because '*' has a higher precedence than '+', the expression 1 + 2 * 3 will parse as 1 + (2 * 3).

What happens when two operators have the same precedence? This is when the associativity comes into play. Operators specified as left associative will cause expressions like 1 + 2 - 3 to parse as (1 + 2) - 3, whereas right-associative operators would parse as 1 + (2 - 3). There is also a %nonassoc directive which indicates that the specified operators may not be used together. For example, if we add the comparison operators '>' and '<' to our grammar, then we would probably give their precedence as:

...
%right in
%nonassoc '>' '<'
%left '+' '-'
%left '*' '/'
%%
...

which indicates that '>' and '<' bind less tightly than the other operators, and the non-associativity causes expressions such as 1 > 2 > 3 to be disallowed.

2.3.1. How precedence works

The precedence directives, %left, %right and %nonassoc, assign precedence levels to the tokens in the declaration. A rule in the grammar may also have a precedence: if the last terminal in the left hand side of the rule has a precedence, then this is the precedence of the whole rule.

The precedences are used to resolve ambiguities in the grammar. If there is a shift/reduce conflict, then the precedence of the rule and the lookahead token are examined in order to resolve the conflict:

If the precedence of the rule is higher, then the conflict is resolved as a reduce.
If the precedence of the lookahead token is higher, then the conflict is resolved as a shift.
If the precedences are equal, then
- If the token is left-associative, then reduce
- If the token is right-associative, then shift
- If the token is non-associative, then fail
If either the rule or the token has no precedence, then the default is to shift (these conflicts are reported by Happy, whereas ones that are automatically resolved by the precedence rules are not).

2.3.2. Context-dependent Precedence

The precedence of an individual rule can be overriden, using context precedence. This is useful when, for example, a particular token has a different precedence depending on the context. A common example is the minus sign: it has high precedence when used as prefix negation, but a lower precedence when used as binary subtraction.

We can implement this in Happy as follows:

%right in
%nonassoc '>' '<'
%left '+' '-'
%left '*' '/'
%left NEG
%%

Exp   : let var '=' Exp in Exp  { Let $2 $4 $6 }
      | Exp '+' Exp             { Plus $1 $3 }
      | Exp '-' Exp             { Minus $1 $3 }
      | Exp '*' Exp             { Times $1 $3 }
      | Exp '/' Exp             { Div $1 $3 }
      | '(' Exp ')'             { Brack $2 }
      | '-' Exp %prec NEG       { Negate $2 }
      | int                     { Int $1 }
      | var                     { Var $1 }

We invent a new token NEG as a placeholder for the precedence of our prefix negation rule. The NEG token doesn't need to appear in a %token directive. The prefix negation rule has a %prec NEG directive attached, which overrides the default precedence for the rule (which would normally be the precedence of '-') with the precedence of NEG.

^[2]Users of yacc will find this familiar, Happy's precedence scheme works in exactly the same way.

Prev	Up	Next
2.2. Parsing sequences	Home	2.4. Type Signatures