Input File Grammar

The grammar is in EBNF format, with a|b meaning "a or b", {c} meaning 0 or more repetitions of c and [d] meaning 0 or 1 repetitions of d.

Vmgen input is not free-format, so you have to take care where you put newlines (and, in a few cases, white space).

description: {instruction|comment|eval-escape|c-escape}

instruction: simple-inst|super-inst

simple-inst: ident '(' stack-effect ')' newline c-code newline newline

stack-effect: {ident} '--' {ident}

super-inst: ident '=' ident {ident}

comment:      '\ '  text newline

eval-escape:  '\E ' text newline

c-escape:     '\C ' text newline

Note that the backslashes (\) in this grammar are meant literally, not as C-style escapes for non-printable characters.
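As an illustration of the three line-oriented productions, here is a minimal Python sketch that classifies one input line by its literal prefix. It is not part of Vmgen; the function name classify_line and the "other" category are assumptions made for this example, and the sketch deliberately ignores instructions and their C code.

```python
def classify_line(line):
    """Return (kind, text) for one input line given without its newline.

    kind is "comment", "eval-escape", "c-escape" or "other" (hypothetical
    labels for this sketch); text is the line with the prefix removed.
    """
    # Each prefix is a literal backslash, an optional letter, and one space,
    # exactly as written in the grammar.
    prefixes = (
        ("\\E ", "eval-escape"),
        ("\\C ", "c-escape"),
        ("\\ ", "comment"),
    )
    for prefix, kind in prefixes:
        if line.startswith(prefix):
            return kind, line[len(prefix):]
    return "other", line


if __name__ == "__main__":
    for sample in ["\\ just a comment", "\\C #if defined(DEBUG)", "add ( i1 i2 -- i )"]:
        print(classify_line(sample))
```

Because the trailing space is part of each prefix, a line such as \Cfoo is not treated as a c-escape in this sketch.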

There are two ways to delimit the C code in simple-inst:

The text in comment, eval-escape and c-escape must not contain a newline. Ident must conform to the usual conventions of C identifiers (otherwise the C compiler would choke on the Vmgen output), except that idents in stack-effect may have a stack prefix (for stack prefix syntax, see Eval escapes).
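The stack-effect rule and the identifier requirement can be sketched as follows. This is a hedged Python illustration, not Vmgen's own parser: the name parse_stack_effect is hypothetical, tokens are assumed to be separated by white space, and idents carrying a stack prefix are not handled, because the prefix syntax is described elsewhere.

```python
import re

# Plain C identifier: a letter or underscore, then letters, digits, underscores.
C_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def parse_stack_effect(effect):
    """Split the text of '{ident} -- {ident}' into (inputs, outputs)."""
    tokens = effect.split()
    if tokens.count("--") != 1:
        raise ValueError("expected exactly one '--' in stack effect")
    sep = tokens.index("--")
    inputs, outputs = tokens[:sep], tokens[sep + 1:]
    for ident in inputs + outputs:
        # Assumption of this sketch: stack-prefixed idents are rejected too.
        if not C_IDENT.match(ident):
            raise ValueError("not a plain C identifier: %r" % ident)
    return inputs, outputs


if __name__ == "__main__":
    print(parse_stack_effect("i1 i2 -- i"))
```

Both sides of '--' may be empty, so a stack effect consisting of '--' alone yields two empty lists.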

The c-escape passes the text through to each output file (without the \C). This is useful mainly for conditional compilation (i.e., you write \C #if ... etc.).

In addition to the syntax given in the grammar, Vmgen also processes sync lines (lines starting with #line), as produced by m4 -s (see Invoking m4) and similar tools. This allows associating C compiler error messages with the original source of the C code.
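To show what a sync line conveys, here is a small Python sketch that follows #line directives to recover the original file name and line number of each remaining line. How Vmgen itself does this is not described here; the function track_positions, its default file name, and the assumption that a sync line has the form #line NUMBER optionally followed by a quoted file name are all part of this illustration only.

```python
import re

# Assumed shape of a sync line: '#line 12 "file"' or just '#line 12'.
SYNC = re.compile(r'#line\s+(\d+)(?:\s+"([^"]*)")?\s*\Z')


def track_positions(lines, filename="<input>"):
    """Yield (file, line_number, text) for every line that is not a sync line."""
    lineno = 0
    for text in lines:
        m = SYNC.match(text)
        if m:
            # A sync line states the position of the line that follows it.
            lineno = int(m.group(1)) - 1
            if m.group(2) is not None:
                filename = m.group(2)
            continue
        lineno += 1
        yield filename, lineno, text


if __name__ == "__main__":
    sample = ['#line 40 "prims.m4"', "add ( i1 i2 -- i )", "i = i1+i2;"]
    for position in track_positions(sample):
        print(position)
```

A tool that copies C code into its output can emit the recovered positions as #line directives of its own, so that compiler diagnostics point back at the file the user actually edited.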

Vmgen understands a few extensions beyond the grammar given here, but these extensions are only useful for building Gforth. You can find a description of the format used for Gforth in prim.