The presence of one or more %attribute directives indicates that a grammar is an attribute grammar. Attributes are calculated properties that are associated with the non-terminals in a parse tree. Each %attribute directive generates a field in the attributes record with the given name and type.
The first %attribute directive in a grammar defines the default attribute. The default attribute is distinguished in two ways: 1) if no attribute specifier is given on an attribute reference, the default attribute is assumed (see Section 4.2.2, “Semantic Rules”) and 2) the value for the default attribute of the starting non-terminal becomes the return value of the parse.
Optionally, one may specify a type declaration for the attribute record using the %attributetype declaration. This allows you to define the type given to the attribute record and, more importantly, allows you to introduce type variables that can be subsequently used in %attribute declarations. If the %attributetype directive is given without any %attribute declarations, then the %attributetype declaration has no effect.
For example, the following declarations:
%attributetype { MyAttributes a }
%attribute value { a }
%attribute num   { Int }
%attribute label { String }
would generate this attribute record declaration in the parser:
data MyAttributes a = HappyAttributes { value :: a, num :: Int, label :: String }
and value would be the default attribute.
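As a rough illustration of the second property of the default attribute, the sketch below writes the record out by hand and reads the parse result from the root node. The names parseResult and rootAttrs are hypothetical and are not generated by Happy; they only model the idea that the parse returns the default attribute of the starting non-terminal.

```haskell
-- The attribute record from the example, written out by hand.
data MyAttributes a = HappyAttributes
  { value :: a
  , num   :: Int
  , label :: String
  }

-- Hypothetical helper (not part of Happy): the result of a parse is the
-- default attribute, i.e. the first declared field, of the root node.
parseResult :: MyAttributes a -> a
parseResult = value

-- Hypothetical attribute record for the root of some parse tree.
rootAttrs :: MyAttributes String
rootAttrs = HappyAttributes { value = "abc", num = 3, label = "root" }
```

Here parseResult rootAttrs yields the value field and ignores num and label, which remain available to semantic rules but are not returned to the caller.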
In an ordinary Happy grammar, a production consists of a list of terminals and/or non-terminals followed by an uninterpreted code fragment enclosed in braces. With an attribute grammar, the format is very similar, but the braces enclose a set of semantic rules rather than uninterpreted Haskell code. Each semantic rule is either an attribute calculation or a conditional, and rules are separated by semicolons[3].
Both attribute calculations and conditionals may contain attribute references and/or terminal references. Just like regular Happy grammars, the tokens $1 through $<n>, where n is the number of symbols in the production, refer to subtrees of the parse. If the referenced symbol is a terminal, then the value of the reference is just the value of the terminal, the same way as in a regular Happy grammar. If the referenced symbol is a non-terminal, then the reference may be followed by an attribute specifier, which is a dot followed by an attribute name. If the attribute specifier is omitted, then the default attribute is assumed (the default attribute is the first attribute appearing in an %attribute declaration).
The special reference $$ references the attributes of the current node in the parse tree; it behaves exactly like the numbered references. Additionally, the reference $> always references the rightmost symbol in the production.
An attribute calculation rule is of the form:
<attribute reference> = <Haskell expression>
A rule of this form defines the value of an attribute, possibly as a function of the attributes of $$ (inherited attributes), the attributes of non-terminals in the production (synthesized attributes), or the values of terminals in the production. The value for an attribute can only be defined once for a particular production.
The following rule calculates the default attribute of the current production in terms of the first and second items of the production (a synthesized attribute):
$$ = $1 : $2
This rule calculates the length attribute of the first non-terminal in the production in terms of the length attribute of the current non-terminal (an inherited attribute):
$1.length = $$.length + 1
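The direction of flow in these two rules can be sketched in plain Haskell. This is only a hand-written model, not what Happy generates: the ListTree type and the functions synthValue and lengthsSeen are hypothetical, and the tree stands in for a right-recursive list production. A synthesized attribute is computed from the children and passed up; an inherited attribute is computed from the parent and passed down.

```haskell
-- A right-nested list of characters, standing in for the parse tree of a
-- hypothetical production such as:  list : item list | {- empty -}
data ListTree = Empty | Node Char ListTree

-- Synthesized attribute: built from the children, flows up the tree.
-- Mirrors a rule like  $$ = $1 : $2
synthValue :: ListTree -> String
synthValue Empty       = []
synthValue (Node c cs) = c : synthValue cs

-- Inherited attribute: supplied by the parent, flows down the tree.
-- Mirrors a rule like  $2.length = $$.length + 1
-- Returns the length attribute seen at each node, root first.
lengthsSeen :: Int -> ListTree -> [Int]
lengthsSeen len Empty       = [len]
lengthsSeen len (Node _ cs) = len : lengthsSeen (len + 1) cs
```

For the tree Node 'a' (Node 'b' Empty), synthValue builds the string from the leaves upward, while lengthsSeen 0 hands each child a length one greater than its parent's.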
Conditional rules allow the rejection of strings due to context-sensitive properties. All conditional rules have the form:
where <Haskell expression>
For non-monadic parsers, all conditional expressions must be of the same (monomorphic) type. At the end of the parse, the conditionals will be reduced using seq, which gives the grammar an opportunity to call error with an informative message. For monadic parsers, all conditional statements must have type Monad m => m () where m is the monad in which the parser operates. All conditionals will be sequenced at the end of the parse, which allows the conditionals to call fail with an informative message.
The following conditional rule will cause the (non-monadic) parser to fail if the inherited length attribute is not 0.
where if $$.length == 0 then () else error "length not equal to 0"
This conditional is the monadic equivalent:
where unless ($$.length == 0) (fail "length not equal to 0")
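The difference between the two styles can be modeled in ordinary Haskell. The sketch below is an approximation under stated assumptions, not the code Happy emits: forceConditionals and sequenceConditionals are hypothetical drivers, and Either String with Left stands in for the parser monad and its fail.

```haskell
import Control.Monad (unless)

-- Non-monadic style: every conditional has the same type (here ()),
-- and evaluating it either yields () or calls error.
lengthIsZero :: Int -> ()
lengthIsZero len = if len == 0 then () else error "length not equal to 0"

-- Hypothetical driver: force each conditional with seq, then return the result.
forceConditionals :: [()] -> a -> a
forceConditionals conds result = foldr seq result conds

-- Monadic style: each conditional is an action of type m ().
-- Assumption: Either String models the parser monad; Left models fail.
lengthIsZeroM :: Int -> Either String ()
lengthIsZeroM len = unless (len == 0) (Left "length not equal to 0")

-- Hypothetical driver: run the conditionals in order, then return the result.
sequenceConditionals :: [Either String ()] -> a -> Either String a
sequenceConditionals conds result = sequence_ conds >> pure result
```

In the non-monadic model a failed conditional surfaces as a runtime error when forced, whereas in the monadic model the failure is an ordinary value that the caller can inspect.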
[3] Note that semantic rules must not rely on layout, because whitespace alignment is not guaranteed to be preserved.