LL(1) Grammar of ALGOL 60

Ego sum rex Romanus, et supra grammaticam.

(I am the Roman emperor, and above grammar.)

— Sigismund, Holy Roman Emperor

The grammar for ALGOL 60 is written in a way to relate to the semantics of the language. This leads to a lot of redundant BNF expressions. This grammar is also not immediately amenable to parsing by computer or by human.

This page describes an LL(1) (for split lexer+parser parsers) grammar for ALGOL 60 to make parsing easier. It still accepts all strings that ALGOL 60 accepts, but may accept strings that are non-sensical. Scannerless parsers will require a larger lookahead.

It is designed to be similar to how a recursive descent parser would be written, and condenses production rules that are redundantly named. If one of the alternative productions of a metavariable is ⟨empty⟩, then it must only be taken if the other productions do not parse.

This grammar encodes precedence as in the Report. This grammar is right recursive and the Report says nothing about the association of arithmetic operators, but a parser might want to rearange the parse tree to keep assocation correct as expected.

Programs

⟨block⟩ ::=
begin⟨declaration list⟩⟨statement⟩⟨statement list⟩
| ⟨identifier⟩:⟨block⟩
⟨declaration list⟩ ::=
⟨declaration⟩;⟨declaration list⟩
| ⟨empty⟩
⟨statement list⟩ ::=
;⟨statement⟩⟨statement list⟩
| end
⟨empty⟩ ::=

Declarations

To detect if a string is produced by ⟨declaration⟩, look ahead to see if the current token is produced by ⟨declaration first⟩.

⟨declaration first⟩ ::=
real
| integer
| Boolean
| own
| array
| switch
| procedure
⟨type name⟩ ::=
real
| integer
| Boolean
⟨declaration⟩ ::=
own⟨own declaration next⟩
| array⟨array declaration next⟩
| switch⟨switch declaration next⟩
| procedure⟨procedure declaration next⟩
| ⟨type name⟩⟨type name next⟩
⟨type name next⟩ ::=
procedure⟨procedure declaration next⟩
| array⟨array declaration next⟩
| ⟨type declaration next⟩
⟨own declaration next⟩ ::=
array⟨array declaration next⟩
| ⟨type name⟩⟨maybe array declaration next⟩
⟨maybe array declaration next⟩ ::=
array⟨array declaration next⟩
| ⟨type declaration next⟩

Array declarations

⟨array declaration next⟩ ::=
⟨identifier⟩⟨array segment list⟩
⟨array segment list⟩ ::=
[⟨expression⟩:⟨expression⟩⟨array bounds continue⟩⟨next array segment or empty⟩
| ,⟨identifier⟩⟨array segment list⟩
⟨next array segment or empty⟩ ::=
,⟨identifier⟩⟨array segment list⟩
| ⟨empty⟩
⟨array bounds continue⟩ ::=
,⟨expression⟩:⟨expression⟩⟨array bounds continue⟩
| ]

Switch declarations

⟨switch declaration next⟩ ::=
⟨identifier⟩:=⟨expression⟩⟨switch list⟩
⟨switch list⟩ ::=
⟨empty⟩
| ,⟨expression⟩⟨switch list⟩

Type declarations

⟨type declaration next⟩ ::=
⟨identifier⟩⟨type declaration list⟩
⟨type declaration list⟩ ::=
,⟨identifier⟩⟨type declaration list⟩
| ⟨empty⟩

Procedure declarations

⟨procedure declaration next⟩ ::=
⟨identifier⟩⟨formal parameter part⟩⟨value part⟩⟨specification part⟩⟨statement⟩
⟨formal parameter part⟩ ::=
(⟨identifier⟩⟨identifier list⟩⟨named parameter list⟩
| ;
⟨identifier list⟩ ::=
,⟨identifier⟩⟨identifier list⟩
| )
⟨named parameter list⟩ ::=
⟨identifier⟩:(⟨identifier⟩⟨identifier list⟩⟨named parameter list⟩
| ;
⟨value part⟩ ::=
value⟨identifier⟩⟨semicolon-terminated identifier list⟩
| ⟨empty⟩
⟨semicolon-terminated identifier list⟩ ::=
;
| ,⟨identifier⟩⟨semicolon-terminated identifier list⟩
⟨specification part⟩ ::=
⟨specifier⟩⟨identifier⟩⟨semicolon-terminated identifier list⟩⟨specification part⟩
| ⟨empty⟩
⟨specifier⟩ ::=
string
| ⟨type name⟩⟨specifier next⟩
| label
| switch
| procedure
| array
⟨specifier next⟩ ::=
array
| procedure
| ⟨empty⟩

Statements

To detect if a non-empty string is produced by ⟨statement⟩, look ahead to see if the current token is produced by ⟨statement first⟩. Statements can be empty (the dummy statement).

⟨statement first⟩ ::=
for
| if
| begin
| go to
| ⟨identifier⟩
⟨statement⟩ ::=
for⟨for statement continue⟩
| if⟨expression⟩⟨if statement continue⟩
| ⟨unconditional statement⟩
| ⟨identifier⟩:⟨statement⟩
⟨unconditional statement⟩ ::=
⟨block⟩
| go to⟨expression⟩
| ⟨identifier⟩⟨assignment or procedure statement next⟩
| ⟨empty⟩
⟨assignment or procedure statement next⟩ ::=
:=⟨left part list⟩
| (⟨procedure statment next⟩
| ⟨empty⟩

For statements

⟨for statement continue⟩ ::=
⟨identifier⟩:=⟨for element⟩⟨for element list⟩⟨statement⟩
⟨for element⟩ ::=
⟨expression⟩⟨for element next⟩
⟨for element next⟩ ::=
step⟨expression⟩until⟨expression⟩
| while⟨expression⟩
| ⟨empty⟩
⟨for element list⟩ ::=
,⟨for element⟩⟨for element list⟩
| do

If statements

⟨if statement continue⟩ ::=
then⟨unconditional statement⟩⟨if statement next⟩
| for⟨for statement continue⟩
⟨if statement next⟩ ::=
else⟨statement⟩
| ⟨empty⟩

Assignment Statements

⟨left part list⟩ ::=
⟨variable⟩:=⟨left part list⟩
| ⟨expression⟩

Procedure Statements

⟨procedure statement next⟩ ::=
⟨actual parameter⟩⟨actual parameter list⟩⟨named expression list⟩
| ⟨empty⟩
⟨actual parameter list⟩ ::=
)
| ,⟨actual parameter⟩⟨actual parameter list⟩
⟨actual parameter⟩ ::=
⟨string⟩
| ⟨expression⟩
⟨named expression list⟩ ::=
⟨identifier⟩:(⟨actual parameter⟩⟨actual parameter list⟩⟨named expression list⟩
| ⟨empty⟩

Expressions

The grammar of expressions are simplified to allow for semantically invalid expressions. A separate semantic analysis pass should be done afterwards to reject semantically invalid programs.

To match an expression, look ahead enough tokens to match ⟨expression start⟩.

⟨expression start⟩ ::=
⟨identifier⟩
| +
| -
| ⟨number⟩
| (
| ¬
| if
| true
| false
⟨expression⟩ ::=
⟨primary expression⟩⟨expression next⟩
| if⟨expression⟩then⟨expression⟩⟨if expression next⟩
| ⟨adding operator⟩⟨term⟩
⟨expression next⟩ ::=
⟨adding operator⟩⟨term⟩
| ⟨multiplying operator⟩⟨factor⟩
| ⟨primary expression⟩
| ⟨implication⟩
| ⟨Boolean term⟩
| ⟨Boolean factor⟩
| ⟨Boolean secondary⟩
| ⟨relational operator⟩⟨simple arithmetic expression⟩
⟨primary expression⟩ ::=
⟨identifier⟩⟨identifier expression next⟩
| (⟨expression⟩)
| ¬⟨primary expression⟩
| true
| false
| ⟨number⟩

Relational expressions

⟨relational operator⟩ ::=
<
|
| =
|
| >
|

If expressions

⟨if expression next⟩ ::=
⟨empty⟩
| else⟨expression⟩

Variable expressions

⟨identifier expression next⟩ ::=
[⟨expression⟩⟨subscript list⟩
| (⟨procedure statement next⟩
| ⟨empty⟩
⟨subscript list⟩ ::=
]
| ,⟨expression⟩⟨subscript list⟩

Arithmetic expressions

⟨adding operator⟩ ::=
+
| -
⟨adding operator⟩ ::=
×
| /
| ÷
⟨simple arithmetic expression⟩ ::=
⟨adding operator⟩⟨term⟩⟨simple arithmetic expression next⟩
| ⟨term⟩⟨simple arithmetic expression next⟩
⟨simple arithmetic expression next⟩ ::=
⟨adding operator⟩⟨term⟩⟨simple arithmetic expression next⟩
| ⟨empty⟩
⟨term⟩ ::=
⟨factor⟩⟨term next⟩
⟨term next⟩ ::=
⟨multiplying operator⟩⟨factor⟩⟨term next⟩
| ⟨empty⟩
⟨factor⟩ ::=
⟨primary expression⟩⟨factor next⟩
⟨factor next⟩ ::=
⟨primary expression⟩⟨factor next⟩
| ⟨empty⟩

Boolean expressions

⟨implication⟩ ::=
⟨Boolean term⟩⟨implication next⟩
⟨implication next⟩ ::=
⟨Boolean term⟩⟨implication next⟩
| ⟨empty⟩
⟨Boolean term⟩ ::=
⟨Boolean factor⟩⟨Boolean term next⟩
⟨Boolean term next⟩ ::=
⟨Boolean factor⟩⟨Boolean term next⟩
| ⟨empty⟩
⟨Boolean factor⟩ ::=
⟨primary expression⟩⟨Boolean factor next⟩
⟨Boolean factor next⟩ ::=
⟨primary expression⟩⟨Boolean factor next⟩
| ⟨empty⟩

Identifiers, Strings, and Digits

⟨identifier⟩ ::=
⟨Omitted⟩
⟨digit⟩ ::=
⟨Omitted⟩
⟨string⟩ ::=
⟨Omitted⟩

These are deliberately omitted as they are too open-ended in the Report, and because it is trivial to define LL(1) grammars for parsing these objects.

Numbers

Lookahead is required to disambiguate a number with a fixed sign from ⟨adding operator⟩ next to a ⟨factor⟩.

⟨number⟩ ::=
⟨digit⟩⟨unsigned number next⟩
| ⟨adding operator⟩⟨unsigned number next⟩
⟨unsigned number next⟩ ::=
.⟨fractional part⟩
| ⟨integer⟩
| ⟨digit⟩⟨unsigned number next⟩
⟨fractional part⟩ ::=
⟨digit⟩⟨fracitonal part next⟩
⟨fractional part next⟩ ::=
⟨integer⟩
| ⟨digit⟩⟨fractional part next⟩
| ⟨empty⟩
⟨integer⟩ ::=
⟨adding operator⟩⟨digit⟩⟨unsigned integer next⟩
| ⟨digit⟩⟨unsigned integer next⟩
⟨unsigned integer next⟩ ::=
⟨digit⟩⟨unsigned integer next⟩
| ⟨empty⟩