gforth/doc/vmgen.texi - diff

Diff for /gforth/doc/vmgen.texi between versions 1.12 and 1.29

version 1.12, 2002/08/16 09:43:49	version 1.29, 2005/10/02 11:30:34
Line 10 This manual is for Vmgen	Line 10 This manual is for Vmgen
(version @value{VERSION}, @value{UPDATED}),	(version @value{VERSION}, @value{UPDATED}),
the virtual machine interpreter generator	the virtual machine interpreter generator

Copyright @copyright{} 2002 Free Software Foundation, Inc.	Copyright @copyright{} 2002,2003,2005 Free Software Foundation, Inc.

@quotation	@quotation
Permission is granted to copy, distribute and/or modify this document	Permission is granted to copy, distribute and/or modify this document
Line 27 Software Foundation raise funds for GNU	Line 27 Software Foundation raise funds for GNU
@end quotation	@end quotation
@end copying	@end copying

@dircategory GNU programming tools	@dircategory Software development
@direntry	@direntry
* Vmgen: (vmgen). Interpreter generator	* Vmgen: (vmgen). Virtual machine interpreter generator
@end direntry	@end direntry

@titlepage	@titlepage
Line 57 Software Foundation raise funds for GNU	Line 57 Software Foundation raise funds for GNU
* Invoking Vmgen::	* Invoking Vmgen::
* Example::	* Example::
* Input File Format::	* Input File Format::
	* Error messages:: reported by Vmgen
* Using the generated code::	* Using the generated code::
	* Hints:: VM architecture, efficiency
	* The future::
* Changes:: from earlier versions	* Changes:: from earlier versions
* Contact:: Bug reporting etc.	* Contact:: Bug reporting etc.
* Copying This Manual:: Manual License	* Copying This Manual:: Manual License
Line 82 Input File Format	Line 85 Input File Format
* Input File Grammar::	* Input File Grammar::
* Simple instructions::	* Simple instructions::
* Superinstructions::	* Superinstructions::
	* Store Optimization::
* Register Machines:: How to define register VM instructions	* Register Machines:: How to define register VM instructions

	Input File Grammar

	* Eval escapes:: what follows \E

Simple instructions	Simple instructions

	* Explicit stack access:: If the C code accesses a stack pointer
* C Code Macros:: Macros recognized by Vmgen	* C Code Macros:: Macros recognized by Vmgen
* C Code restrictions:: Vmgen makes assumptions about C code	* C Code restrictions:: Vmgen makes assumptions about C code
	* Stack growth direction:: is configurable per stack

Using the generated code	Using the generated code

Line 98 Using the generated code	Line 108 Using the generated code
* VM disassembler:: for debugging the front end	* VM disassembler:: for debugging the front end
* VM profiler:: for finding worthwhile superinstructions	* VM profiler:: for finding worthwhile superinstructions

	Hints

	* Floating point:: and stacks

Copying This Manual	Copying This Manual

* GNU Free Documentation License:: License for copying this manual.	* GNU Free Documentation License:: License for copying this manual.
Line 151 In this setup, Vmgen can generate most o	Line 165 In this setup, Vmgen can generate most o
machine instructions from a simple description of the virtual machine	machine instructions from a simple description of the virtual machine
instructions (@pxref{Input File Format}), in particular:	instructions (@pxref{Input File Format}), in particular:

@table @asis	@table @strong

@item VM instruction execution	@item VM instruction execution

Line 172 Useful for optimizing the VM interpreter	Line 186 Useful for optimizing the VM interpreter

@end table	@end table

	To create parts of the interpretive system that do not deal with VM
	instructions, you have to use other tools (e.g., @command{bison}) and/or
	hand-code them.

@cindex efficiency features overview	@cindex efficiency features overview
@noindent	@noindent
Vmgen supports efficient interpreters through various optimizations, in	Vmgen supports efficient interpreters through various optimizations, in
Line 209 offered by Vmgen.	Line 227 offered by Vmgen.

There are many potential uses of the instruction descriptions that are	There are many potential uses of the instruction descriptions that are
not implemented at the moment, but we are open for feature requests, and	not implemented at the moment, but we are open for feature requests, and
we will implement new features if someone asks for them; so the feature	we will consider new features if someone asks for them; so the feature
list above is not exhaustive.	list above is not exhaustive.

@c *********************************************************************	@c *********************************************************************
Line 300 interpreter, but some systems also suppo	Line 318 interpreter, but some systems also suppo
as an image file, or in a full-blown linkable file format (e.g., JVM).	as an image file, or in a full-blown linkable file format (e.g., JVM).
Vmgen currently has no special support for such features, but the	Vmgen currently has no special support for such features, but the
information in the instruction descriptions can be helpful, and we are	information in the instruction descriptions can be helpful, and we are
open for feature requests and suggestions.	open to feature requests and suggestions.

@c --------------------------------------------------------------------	@c --------------------------------------------------------------------
@node Data handling, Dispatch, Front end and VM interpreter, Concepts	@node Data handling, Dispatch, Front end and VM interpreter, Concepts
Line 310 open for feature requests and suggestion	Line 328 open for feature requests and suggestion
@cindex register machine	@cindex register machine
Most VMs use one or more stacks for passing temporary data between VM	Most VMs use one or more stacks for passing temporary data between VM
instructions. Another option is to use a register machine architecture	instructions. Another option is to use a register machine architecture
for the virtual machine; however, this option is either slower or	for the virtual machine; we believe that using a stack architecture is
	usually both simpler and faster.

	however, this option is slower or
significantly more complex to implement than a stack machine architecture.	significantly more complex to implement than a stack machine architecture.

Vmgen has special support and optimizations for stack VMs, making their	Vmgen has special support and optimizations for stack VMs, making their
Line 356 After executing one VM instruction, the	Line 377 After executing one VM instruction, the
the next VM instruction (Vmgen calls the dispatch routine @samp{NEXT}).	the next VM instruction (Vmgen calls the dispatch routine @samp{NEXT}).
Vmgen supports two methods of dispatch:	Vmgen supports two methods of dispatch:

@table @asis	@table @strong

@item switch dispatch	@item switch dispatch
@cindex switch dispatch	@cindex switch dispatch
Line 379 instruction. Threaded code cannot be im	Line 400 instruction. Threaded code cannot be im
be implemented using GNU C's labels-as-values extension (@pxref{Labels	be implemented using GNU C's labels-as-values extension (@pxref{Labels
as Values, , Labels as Values, gcc.info, GNU C Manual}).	as Values, , Labels as Values, gcc.info, GNU C Manual}).

	@c call threading
@end table	@end table
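As a rough illustration of the two dispatch methods, the following C sketch runs the same tiny program with switch dispatch and with dispatch through GNU C's labels-as-values. The three-instruction VM, the opcode names and both function names are hypothetical and are not Vmgen output. For brevity the second variant looks up the label from the opcode, whereas real threaded code stores the label addresses directly in the VM code. It needs GCC or a compatible compiler.

```c
/* Hypothetical three-instruction VM; all names are illustrative only. */
enum { OP_LIT, OP_ADD, OP_HALT };

/* Switch dispatch: every VM instruction returns to one shared switch. */
long run_switch(const long *ip)
{
    long stack[16], *sp = stack + 16;  /* grows towards lower addresses */
    for (;;) {
        switch (*ip++) {
        case OP_LIT:  *--sp = *ip++;         break;
        case OP_ADD:  sp[1] += sp[0]; sp++;  break;
        case OP_HALT: return sp[0];
        default:      return -1;             /* unknown opcode */
        }
    }
}

/* Dispatch with labels-as-values: each instruction jumps straight to
   the code of the next one (the role that NEXT plays). */
long run_threaded(const long *ip)
{
    static void *const labels[] = { &&lit, &&add, &&halt };
    long stack[16], *sp = stack + 16;
#define NEXT goto *labels[*ip++]
    NEXT;
lit:  *--sp = *ip++;         NEXT;
add:  sp[1] += sp[0]; sp++;  NEXT;
halt: return sp[0];
#undef NEXT
}
```

Calling either function on the program @code{OP_LIT 2 OP_LIT 3 OP_ADD OP_HALT} yields 5; the difference is only in how control reaches the next VM instruction.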

Threaded code can be twice as fast as switch dispatch, depending on the	Threaded code can be twice as fast as switch dispatch, depending on the
Line 392 interpreter, the benchmark, and the mach	Line 414 interpreter, the benchmark, and the mach
The usual way to invoke Vmgen is as follows:	The usual way to invoke Vmgen is as follows:

@example	@example
vmgen @var{infile}	vmgen @var{inputfile}
@end example	@end example

Here @var{infile} is the VM instruction description file, which usually	Here @var{inputfile} is the VM instruction description file, which
ends in @file{.vmg}. The output filenames are made by taking the	usually ends in @file{.vmg}. The output filenames are made by taking
basename of @file{infile} (i.e., the output files will be created in the	the basename of @file{inputfile} (i.e., the output files will be created
current working directory) and replacing @file{.vmg} with @file{-vm.i},	in the current working directory) and replacing @file{.vmg} with
@file{-disasm.i}, @file{-gen.i}, @file{-labels.i}, @file{-profile.i},	@file{-vm.i}, @file{-disasm.i}, @file{-gen.i}, @file{-labels.i},
and @file{-peephole.i}. E.g., @command{vmgen hack/foo.vmg} will create	@file{-profile.i}, and @file{-peephole.i}. E.g., @command{vmgen
@file{foo-vm.i} etc.	hack/foo.vmg} will create @file{foo-vm.i}, @file{foo-disasm.i},
	@file{foo-gen.i}, @file{foo-labels.i}, @file{foo-profile.i} and
	@file{foo-peephole.i}.

The command-line options supported by Vmgen are	The command-line options supported by Vmgen are

Line 563 sort -k 3 >mini-super.vmg #sort se	Line 587 sort -k 3 >mini-super.vmg #sort se
The file @file{peephole-blacklist} contains all instructions that	The file @file{peephole-blacklist} contains all instructions that
directly access a stack or stack pointer (for mini: @code{call},	directly access a stack or stack pointer (for mini: @code{call},
@code{return}); the sort step is necessary to ensure that prefixes	@code{return}); the sort step is necessary to ensure that prefixes
preceed larger superinstructions.	precede larger superinstructions.

Now you can create a version of mini with superinstructions by just	Now you can create a version of mini with superinstructions by just
saying @samp{make}	saying @samp{make}


@c ***************************************************************	@c ***************************************************************
@node Input File Format, Using the generated code, Example, Top	@node Input File Format, Error messages, Example, Top
@chapter Input File Format	@chapter Input File Format
@cindex input file format	@cindex input file format
@cindex format, input file	@cindex format, input file
Line 584 Most examples are taken from the example	Line 608 Most examples are taken from the example
* Input File Grammar::	* Input File Grammar::
* Simple instructions::	* Simple instructions::
* Superinstructions::	* Superinstructions::
	* Store Optimization::
* Register Machines:: How to define register VM instructions	* Register Machines:: How to define register VM instructions
@end menu	@end menu

Line 598 The grammar is in EBNF format, with @cod	Line 623 The grammar is in EBNF format, with @cod
of @var{c} and @code{[@var{d}]} meaning 0 or 1 repetitions of @var{d}.	of @var{c} and @code{[@var{d}]} meaning 0 or 1 repetitions of @var{d}.

@cindex free-format, not	@cindex free-format, not
	@cindex newlines, significance in syntax
Vmgen input is not free-format, so you have to take care where you put	Vmgen input is not free-format, so you have to take care where you put
spaces and especially newlines; it's not as bad as makefiles, though:	newlines (and, in a few cases, white space).
any sequence of spaces and tabs is equivalent to a single space.

@example	@example
description: @{instruction\|comment\|eval-escape@}	description: @{instruction\|comment\|eval-escape\|c-escape@}

instruction: simple-inst\|superinst	instruction: simple-inst\|superinst

simple-inst: ident ' (' stack-effect ' )' newline c-code newline newline	simple-inst: ident '(' stack-effect ')' newline c-code newline newline

stack-effect: @{ident@} ' --' @{ident@}	stack-effect: @{ident@} '--' @{ident@}

super-inst: ident ' =' ident @{ident@}	super-inst: ident '=' ident @{ident@}

comment: '\ ' text newline	comment: '\ ' text newline

eval-escape: '\e ' text newline	eval-escape: '\E ' text newline

	c-escape: '\C ' text newline
@end example	@end example
@c \+ \- \g \f \c	@c \+ \- \g \f \c

Note that the @code{\}s in this grammar are meant literally, not as	Note that the @code{\}s in this grammar are meant literally, not as
C-style encodings for non-printable characters.	C-style encodings for non-printable characters.

The C code in @code{simple-inst} must not contain empty lines (because	There are two ways to delimit the C code in @code{simple-inst}:
Vmgen would mistake that as the end of the simple-inst. The text in
@code{comment} and @code{eval-escape} must not contain a newline.	@itemize @bullet
@code{Ident} must conform to the usual conventions of C identifiers
(otherwise the C compiler would choke on the Vmgen output).	@item
	If you start it with a @samp{@{} at the start of a line (i.e., not even
	white space before it), you have to end it with a @samp{@}} at the start
	of a line (followed by a newline). In this case you may have empty
	lines within the C code (typically used between variable definitions and
	statements).

	@item
	You do not start it with @samp{@{}. Then the C code ends at the first
	empty line, so you cannot have empty lines within this code.

	@end itemize

	The text in @code{comment}, @code{eval-escape} and @code{c-escape} must
	not contain a newline. @code{Ident} must conform to the usual
	conventions of C identifiers (otherwise the C compiler would choke on
	the Vmgen output), except that idents in @code{stack-effect} may have a
	stack prefix (for stack prefix syntax, @pxref{Eval escapes}).

	@cindex C escape
	@cindex @code{\C}
	@cindex conditional compilation of Vmgen output
	The @code{c-escape} passes the text through to each output file (without
	the @samp{\C}). This is useful mainly for conditional compilation
	(i.e., you write @samp{\C #if ...} etc.).

	@cindex sync lines
	@cindex @code{#line}
	In addition to the syntax given in the grammar, Vmgen also processes
	sync lines (lines starting with @samp{#line}), as produced by @samp{m4
	-s} (@pxref{Invoking m4, , Invoking m4, m4.info, GNU m4}) and similar
	tools. This allows associating C compiler error messages with the
	original source of the C code.

Vmgen understands a few extensions beyond the grammar given here, but	Vmgen understands a few extensions beyond the grammar given here, but
these extensions are only useful for building Gforth. You can find a	these extensions are only useful for building Gforth. You can find a
description of the format used for Gforth in @file{prim}.	description of the format used for Gforth in @file{prim}.

	@menu
	* Eval escapes:: what follows \E
	@end menu

	@node Eval escapes, , Input File Grammar, Input File Grammar
@subsection Eval escapes	@subsection Eval escapes
@cindex escape to Forth	@cindex escape to Forth
@cindex eval escape	@cindex eval escape
	@cindex @code{\E}

@c woanders?	@c woanders?
The text in @code{eval-escape} is Forth code that is evaluated when	The text in @code{eval-escape} is Forth code that is evaluated when
Vmgen reads the line. If you do not know (and do not want to learn)	Vmgen reads the line. You will normally use this feature to define
Forth, you can build the text according to the following grammar; these	stacks and types.
rules are normally all Forth you need for using Vmgen:
	If you do not know (and do not want to learn) Forth, you can build the
	text according to the following grammar; these rules are normally all
	Forth you need for using Vmgen:

@example	@example
text: stack-decl\|type-prefix-decl\|stack-prefix-decl	text: stack-decl\|type-prefix-decl\|stack-prefix-decl\|set-flag

stack-decl: 'stack ' ident ident ident	stack-decl: 'stack ' ident ident ident
type-prefix-decl:	type-prefix-decl:
's" ' string '" ' ('single'\|'double') ident 'type-prefix' ident	's" ' string '" ' ('single'\|'double') ident 'type-prefix' ident
stack-prefix-decl: ident 'stack-prefix' string	stack-prefix-decl: ident 'stack-prefix' string
	set-flag: ('store-optimization'\|'include-skipped-insts') ('on'\|'off')
@end example	@end example

Note that the syntax of this code is not checked thoroughly (there are	Note that the syntax of this code is not checked thoroughly (there are
many other Forth program fragments that could be written there).	many other Forth program fragments that could be written in an
	eval-escape).

	A stack prefix can contain letters, digits, or @samp{:}, and may start
	with a @samp{#}; e.g., in Gforth the return stack has the stack prefix
	@samp{R:}. This restriction is not checked during the stack prefix
	definition, but it is enforced by the parsing rules for stack items
	later.

If you know Forth, the stack effects of the non-standard words involved	If you know Forth, the stack effects of the non-standard words involved
are:	are:
Line 661 are:	Line 737 are:
@findex single	@findex single
@findex double	@findex double
@findex stack-prefix	@findex stack-prefix
	@findex store-optimization
@example	@example
stack ( "name" "pointer" "type" -- )	stack ( "name" "pointer" "type" -- )
( name execution: -- stack )	( name execution: -- stack )
type-prefix ( addr u xt1 xt2 n stack "prefix" -- )	type-prefix ( addr u item-size stack "prefix" -- )
single ( -- xt1 xt2 n )	single ( -- item-size )
double ( -- xt1 xt2 n )	double ( -- item-size )
stack-prefix ( stack "prefix" -- )	stack-prefix ( stack "prefix" -- )
	store-optimization ( -- addr )
	include-skipped-insts ( -- addr )
@end example	@end example

	An @var{item-size} takes three cells on the stack.

@c --------------------------------------------------------------------	@c --------------------------------------------------------------------
@node Simple instructions, Superinstructions, Input File Grammar, Input File Format	@node Simple instructions, Superinstructions, Input File Grammar, Input File Format
Line 726 Before we can use @code{data-stack} in t	Line 806 Before we can use @code{data-stack} in t
@cindex stack basic type	@cindex stack basic type
@cindex basic type of a stack	@cindex basic type of a stack
@cindex type of a stack, basic	@cindex type of a stack, basic
@cindex stack growth direction
This line defines the stack @code{data-stack}, which uses the stack	This line defines the stack @code{data-stack}, which uses the stack
pointer @code{sp}, and each item has the basic type @code{Cell}; other	pointer @code{sp}, and each item has the basic type @code{Cell}; other
types have to fit into one or two @code{Cell}s (depending on whether the	types have to fit into one or two @code{Cell}s (depending on whether the
type is @code{single} or @code{double} wide), and are cast from and to	type is @code{single} or @code{double} wide), and are cast from and to
Cells on accessing the @code{data-stack} with type cast macros	Cells on accessing the @code{data-stack} with type cast macros
(@pxref{VM engine}). Stacks grow towards lower addresses in	(@pxref{VM engine}). By default, stacks grow towards lower addresses in
Vmgen-erated interpreters.	Vmgen-erated interpreters (@pxref{Stack growth direction}).

@cindex stack prefix	@cindex stack prefix
@cindex prefix, stack	@cindex prefix, stack
Line 751 name. Stack prefixes are defined like t	Line 830 name. Stack prefixes are defined like t

@example	@example
\E inst-stream stack-prefix #	\E inst-stream stack-prefix #
	\E data-stack stack-prefix S:
@end example	@end example

This definition defines that the stack prefix @code{#} specifies the	This definition defines that the stack prefix @code{#} specifies the
Line 767 If there are multiple instruction stream	Line 847 If there are multiple instruction stream
first one (just as the intuition suggests).	first one (just as the intuition suggests).

@menu	@menu
	* Explicit stack access:: If the C code accesses a stack pointer
* C Code Macros:: Macros recognized by Vmgen	* C Code Macros:: Macros recognized by Vmgen
* C Code restrictions:: Vmgen makes assumptions about C code	* C Code restrictions:: Vmgen makes assumptions about C code
	* Stack growth direction:: is configurable per stack
@end menu	@end menu

@c --------------------------------------------------------------------	@c --------------------------------------------------------------------
@node C Code Macros, C Code restrictions, Simple instructions, Simple instructions	@node Explicit stack access, C Code Macros, Simple instructions, Simple instructions
	@subsection Explicit stack access
	@cindex stack access, explicit
	@cindex Stack pointer access
	@cindex explicit stack access

	Not all stack effects can be specified using the stack effect
	specifications above. For VM instructions that have other stack
	effects, you can specify them explicitly by accessing the stack
	pointer in the C code; however, you have to notify Vmgen of such
	explicit stack accesses, otherwise Vmgen's optimizations could conflict
	with your explicit stack accesses.

	You notify Vmgen by putting @code{...} with the appropriate stack
	prefix into the stack comment. Then the VM instruction will first
	take the other stack items specified in the stack effect into C
	variables, then make sure that all other stack items for that stack
	are in memory, and that the stack pointer for the stack points to the
	top-of-stack (by default, unless you change the stack access
	transformation: @pxref{Stack growth direction}).

	The general rule is: If you mention a stack pointer in the C code of a
	VM instruction, you should put a @code{...} for that stack in the stack
	effect.

	Consider this example:

	@example
	return ( #iadjust S:... target afp i1 -- i2 )
	SET_IP(target);
	sp = (Cell *)(((char *)sp)+iadjust);
	fp = afp;
	i2=i1;
	@end example

	First the variables @code{target afp i1} are popped off the stack,
	then the stack pointer @code{sp} is set correctly for the new stack
	depth, then the C code changes the stack depth and does other things,
	and finally @code{i2} is pushed on the stack with the new depth.

	The position of the @code{...} within the stack effect does not
	matter. You can use several @code{...}s, for different stacks, and
	also several for the same stack (that has no additional effect). If
	you use @code{...} without a stack prefix, this specifies all the
	stacks except the instruction stream.

	You cannot use @code{...} for the instruction stream, but that is not
	necessary: At the start of the C code, @code{IP} points to the start
	of the next VM instruction (i.e., right beyond the end of the current
	VM instruction), and you can change the instruction pointer with
	@code{SET_IP} (@pxref{VM engine}).


	@c --------------------------------------------------------------------
	@node C Code Macros, C Code restrictions, Explicit stack access, Simple instructions
@subsection C Code Macros	@subsection C Code Macros
@cindex macros recognized by Vmgen	@cindex macros recognized by Vmgen
@cindex basic block, VM level	@cindex basic block, VM level
Line 793 level, this also sets the instruction po	Line 929 level, this also sets the instruction po
This ends a basic block (for profiling), even if the instruction	This ends a basic block (for profiling), even if the instruction
contains no @code{SET_IP}.	contains no @code{SET_IP}.

@item TAIL;	@item INST_TAIL;
@findex TAIL;	@findex INST_TAIL;
Vmgen replaces @samp{TAIL;} with code for ending a VM instruction and	Vmgen replaces @samp{INST_TAIL;} with code for ending a VM instruction and
dispatching the next VM instruction. Even without a @samp{TAIL;} this	dispatching the next VM instruction. Even without a @samp{INST_TAIL;} this
happens automatically when control reaches the end of the C code. If	happens automatically when control reaches the end of the C code. If
you want to have this in the middle of the C code, you need to use	you want to have this in the middle of the C code, you need to use
@samp{TAIL;}. A typical example is a conditional VM branch:	@samp{INST_TAIL;}. A typical example is a conditional VM branch:

@example	@example
if (branch_condition) @{	if (branch_condition) @{
SET_IP(target); TAIL;	SET_IP(target); INST_TAIL;
@}	@}
/* implicit tail follows here */	/* implicit tail follows here */
@end example	@end example

In this example, @samp{TAIL;} is not strictly necessary, because there	In this example, @samp{INST_TAIL;} is not strictly necessary, because there
is another one implicitly after the if-statement, but using it improves	is another one implicitly after the if-statement, but using it improves
branch prediction accuracy slightly and allows other optimizations.	branch prediction accuracy slightly and allows other optimizations.

Line 822 typical application is in conditional VM	Line 958 typical application is in conditional VM

@example	@example
if (branch_condition) @{	if (branch_condition) @{
SET_IP(target); TAIL; /* now this TAIL is necessary */	SET_IP(target); INST_TAIL; /* now this INST_TAIL is necessary */
@}	@}
SUPER_CONTINUE;	SUPER_CONTINUE;
@end example	@end example

	@item VM_JUMP
	@findex VM_JUMP
	@code{VM_JUMP(target)} is equivalent to @code{goto *(target)}, but
	allows Vmgen to do dynamic superinstructions and replication. You
	still need to say @code{SUPER_END}. Also, the goto only happens at
	the end (wherever the VM_JUMP is). Essentially, this just suppresses
	much of the ordinary dispatch mechanism.

@end table	@end table

Note that Vmgen is not smart about C-level tokenization, comments,	Note that Vmgen is not smart about C-level tokenization, comments,
strings, or conditional compilation, so it will interpret even a	strings, or conditional compilation, so it will interpret even a
commented-out SUPER_END as ending a basic block (or, e.g.,	commented-out SUPER_END as ending a basic block (or, e.g.,
@samp{RETAIL;} as @samp{TAIL;}). Conversely, Vmgen requires the literal	@samp{RESET_IP;} as @samp{SET_IP;}). Conversely, Vmgen requires the literal
presence of these strings; Vmgen will not see them if they are hiding in	presence of these strings; Vmgen will not see them if they are hiding in
a C preprocessor macro.	a C preprocessor macro.


@c --------------------------------------------------------------------	@c --------------------------------------------------------------------
@node C Code restrictions, , C Code Macros, Simple instructions	@node C Code restrictions, Stack growth direction, C Code Macros, Simple instructions
@subsection C Code restrictions	@subsection C Code restrictions
@cindex C code restrictions	@cindex C code restrictions
@cindex restrictions on C code	@cindex restrictions on C code
Line 879 The Vmgen-erated code loads the stack it	Line 1023 The Vmgen-erated code loads the stack it
memory into variables before the user-supplied C code, and stores them	memory into variables before the user-supplied C code, and stores them
from variables to stack-pointer-indexed memory afterwards. If you do	from variables to stack-pointer-indexed memory afterwards. If you do
any writes to the stack through its stack pointer in your C code, it	any writes to the stack through its stack pointer in your C code, it
will not affact the variables, and your write may be overwritten by the	will not affect the variables, and your write may be overwritten by the
stores after the C code. Similarly, a read from a stack using a stack	stores after the C code. Similarly, a read from a stack using a stack
pointer will not reflect computations of stack items in the same VM	pointer will not reflect computations of stack items in the same VM
instruction.	instruction.
Line 898 macros can be implemented in several way	Line 1042 macros can be implemented in several way
@samp{IP} points to the next instruction, and @samp{IPTOS} is its	@samp{IP} points to the next instruction, and @samp{IPTOS} is its
contents.	contents.

	@c --------------------------------------------------------------------
	@node Stack growth direction, , C Code restrictions, Simple instructions
	@subsection Stack growth direction
	@cindex stack growth direction

	@cindex @code{stack-access-transform}
	By default, the stacks grow towards lower addresses. You can change
	this for a stack by setting the @code{stack-access-transform} field of
	the stack to an xt @code{( itemnum -- index )} that performs the
	appropriate index transformation.

	E.g., if you want to let @code{data-stack} grow towards higher
	addresses, with the stack pointer always pointing just beyond the
	top-of-stack, use this right after defining @code{data-stack}:

	@example
	\E : sp-access-transform ( itemnum -- index ) negate 1- ;
	\E ' sp-access-transform ' data-stack >body stack-access-transform !
	@end example

	This means that @code{sp-access-transform} will be used to generate
	indexes for accessing @code{data-stack}. The definition of
	@code{sp-access-transform} above transforms n into -n-1, e.g., 1 into -2.
	This will access the 0th data-stack element (top-of-stack) at sp[-1],
	the 1st at sp[-2], etc., which is the typical way upward-growing
	stacks are used. If you need a different transform and do not know
	enough Forth to program it, let me know.
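The index transformation can also be pictured in C. This is only a sketch of the arithmetic, not code that Vmgen generates; the function names are made up for illustration.

```c
/* C equivalent of the Forth definition "negate 1-": n becomes -n-1. */
int sp_access_transform(int itemnum)
{
    return -itemnum - 1;
}

/* Read stack item number itemnum (0 is the top-of-stack) from a stack
   that grows towards higher addresses, where sp points just beyond
   the top-of-stack. */
long item_up(const long *sp, int itemnum)
{
    return sp[sp_access_transform(itemnum)];
}
```

With the three items 10, 20 and 30 stored in ascending memory order and sp pointing just beyond the 30, item 0 is read from sp[-1] (30) and item 1 from sp[-2] (20).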

@c --------------------------------------------------------------------	@c --------------------------------------------------------------------
@node Superinstructions, Register Machines, Simple instructions, Input File Format	@node Superinstructions, Store Optimization, Simple instructions, Input File Format
@section Superinstructions	@section Superinstructions
@cindex superinstructions, defining	@cindex superinstructions, defining
@cindex defining superinstructions	@cindex defining superinstructions
Line 957 accesses a stack pointer should not be u	Line 1128 accesses a stack pointer should not be u
does not check these restrictions, they just result in bugs in your	does not check these restrictions, they just result in bugs in your
interpreter.	interpreter.

	@cindex include-skipped-insts
	The Vmgen flag @code{include-skipped-insts} influences superinstruction
	code generation. Currently there is no support in the peephole
	optimizer for both variations, so leave this flag alone for now.

	@c -------------------------------------------------------------------
	@node Store Optimization, Register Machines, Superinstructions, Input File Format
	@section Store Optimization
	@cindex store optimization
	@cindex optimization, stack stores
	@cindex stack stores, optimization
	@cindex eliminating stack stores

	This minor optimization (0.6\%--0.8\% reduction in executed instructions
	for Gforth) puts additional requirements on the instruction descriptions
	and is therefore disabled by default.

	What does it do? Consider an instruction like

	@example
	dup ( n -- n n )
	@end example

	For simplicity, also assume that we are not caching the top-of-stack in
	a register. Now, the C code for dup first loads @code{n} from the
	stack, and then stores it twice to the stack, one time to the address
	where it came from; that time is unnecessary, but gcc does not optimize
	it away, so vmgen can do it instead (if you turn on the store
	optimization).
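The effect can be sketched in C as follows. This is a hand-written approximation of the code for dup, assuming a downward-growing stack without top-of-stack caching; it is not actual Vmgen output, and the function names are hypothetical.

```c
typedef long Cell;

/* dup ( n -- n n ) without store optimization: both output items
   are stored, although one slot already holds n. */
Cell *dup_plain(Cell *sp)
{
    Cell n = sp[0];  /* load the input item */
    sp -= 1;         /* one more item; stack grows towards lower addresses */
    sp[1] = n;       /* redundant store to the address n came from */
    sp[0] = n;       /* store of the new top-of-stack */
    return sp;
}

/* The same instruction with the redundant store left out. */
Cell *dup_optimized(Cell *sp)
{
    Cell n = sp[0];
    sp -= 1;
    sp[0] = n;
    return sp;
}
```

Both variants leave the stack in the same state; the second one just performs one store less.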

	Vmgen uses the stack item's name to determine if the stack item contains
	the same value as it did at the start. Therefore, if you use the store
	optimization, you have to ensure that stack items that have the same
	name on input and output also have the same value, and are not changed
	in the C code you supply. I.e., the following code could fail if you
	turn on the store optimization:

	@example
	add1 ( n -- n )
	n++;
	@end example

	Instead, you have to use different names, i.e.:

	@example
	add1 ( n1 -- n2 )
	n2=n1+1;
	@end example

	Similarly, the store optimization assumes that the stack pointer is only
	changed by Vmgen-erated code. If your C code changes the stack pointer,
	use different names in input and output stack items to avoid a (probably
	wrong) store optimization, or turn the store optimization off for this
	VM instruction.

	To turn on the store optimization, write

	@example
	\E store-optimization on
	@end example

	at the start of the file. You can turn this optimization on or off
	between any two VM instruction descriptions. For turning it off again,
	you can use

	@example
	\E store-optimization off
	@end example

@c -------------------------------------------------------------------	@c -------------------------------------------------------------------
@node Register Machines, , Superinstructions, Input File Format	@node Register Machines, , Store Optimization, Input File Format
@section Register Machines	@section Register Machines
@cindex Register VM	@cindex Register VM
@cindex Superinstructions for register VMs	@cindex Superinstructions for register VMs
Line 1013 VM interpreters. However, if you have i	Line 1253 VM interpreters. However, if you have i
direction, please let me know (@pxref{Contact}).	direction, please let me know (@pxref{Contact}).

@c ********************************************************************	@c ********************************************************************
@node Using the generated code, Changes, Input File Format, Top	@node Error messages, Using the generated code, Input File Format, Top
	@chapter Error messages
	@cindex error messages

	These error messages are created by Vmgen:

	@table @code

	@cindex @code{# can only be on the input side} error
	@item # can only be on the input side
	You have used an instruction-stream prefix (usually @samp{#}) after the
	@samp{--} (the output side); you can only use it before (the input
	side).

	@cindex @code{prefix for this combination must be defined earlier} error
	@item the prefix for this superinstruction must be defined earlier
	You have defined a superinstruction (e.g., @code{abc = a b c}) without
	defining its direct prefix (e.g., @code{ab = a b}).
	@xref{Superinstructions}.

	@cindex @code{sync line syntax} error
	@item sync line syntax
	If you are using a preprocessor (e.g., @command{m4}) to generate Vmgen
	input code, you may want to create @code{#line} directives (aka sync
	lines). This error indicates that such a line is not in the syntax
	expected by Vmgen (this should not happen; please report the offending
	line in a bug report).

	@cindex @code{syntax error, wrong char} error
	@item syntax error, wrong char
	A syntax error. If you do not see right away where the error is, it may
	be helpful to check the following: Did you put an empty line in a VM
	instruction where the C code is not delimited by braces (then the empty
	line ends the VM instruction)? If you used brace-delimited C code, did
	you put the delimiting braces (and only those) at the start of the line,
	without preceding white space? Did you forget a delimiting brace?

	@cindex @code{too many stacks} error
	@item too many stacks
	Vmgen currently supports 3 stacks (plus the instruction stream); if you
	need more, let us know.

	@cindex @code{unknown prefix} error
	@item unknown prefix
	The stack item does not match any defined type prefix (after stripping
	away any stack prefix). You should either declare the type prefix you
	want for that stack item, or use a different type prefix.

	@cindex @code{unknown primitive} error
	@item unknown primitive
	You have used the name of a simple VM instruction in a superinstruction
	definition without defining the simple VM instruction first.

	@end table

	In addition, the C compiler can report errors in the code produced by
	Vmgen, e.g., if you have not defined the type cast functions it needs.

	@c ********************************************************************
	@node Using the generated code, Hints, Error messages, Top
@chapter Using the generated code	@chapter Using the generated code
@cindex generated code, usage	@cindex generated code, usage
@cindex Using vmgen-erated code	@cindex Using vmgen-erated code

The easiest way to create a working VM interpreter with Vmgen is	The easiest way to create a working VM interpreter with Vmgen is
probably to start with @file{vmgen-ex}, and modify it for your purposes.	probably to start with @file{vmgen-ex}, and modify it for your purposes.
This chapter is just the reference manual for the macros etc. used by	This chapter explains what the various wrapper and generated files do.
the generated code, the other context expected by the generated code,	It also contains reference-manual style descriptions of the macros,
and what you can do with the various generated files.	variables etc. used by the generated code, and you can skip that on
	first reading.

@menu	@menu
* VM engine:: Executing VM code	* VM engine:: Executing VM code
Line 1059 In our example the engine function also	Line 1359 In our example the engine function also
@file{@var{name}-labels.i} (@pxref{VM instruction table}).	@file{@var{name}-labels.i} (@pxref{VM instruction table}).

@cindex tracing VM code	@cindex tracing VM code
	@cindex superinstructions and tracing
In addition to executing the code, the VM engine can optionally also	In addition to executing the code, the VM engine can optionally also
print out a trace of the executed instructions, their arguments and	print out a trace of the executed instructions, their arguments and
results. For superinstructions it prints the trace as if only component	results. For superinstructions it prints the trace as if only component
Line 1080 The following macros and variables are u	Line 1381 The following macros and variables are u
@item LABEL(@var{inst_name})	@item LABEL(@var{inst_name})
This is used just before each VM instruction to provide a jump or	This is used just before each VM instruction to provide a jump or
@code{switch} label (the @samp{:} is provided by Vmgen). For switch	@code{switch} label (the @samp{:} is provided by Vmgen). For switch
dispatch this should expand to @samp{case @var{label}}; for	dispatch this should expand to @samp{case @var{label}:}; for
threaded-code dispatch this should just expand to @samp{@var{label}}.	threaded-code dispatch this should just expand to @samp{@var{label}:}.
In either case @var{label} is usually the @var{inst_name} with some	In either case @var{label} is usually the @var{inst_name} with some
prefix or suffix to avoid naming conflicts.	prefix or suffix to avoid naming conflicts.

Line 1093 should expand to nothing.	Line 1394 should expand to nothing.
@findex NAME	@findex NAME
@item NAME(@var{inst_name_string})	@item NAME(@var{inst_name_string})
Called on entering a VM instruction with a string containing the name of	Called on entering a VM instruction with a string containing the name of
the VM instruction as parameter. In normal execution this should be a	the VM instruction as parameter. In normal execution this should
noop, but for tracing this usually prints the name, and possibly other	expand to nothing, but for tracing this usually prints the name, and
information (several VM registers in our example).	possibly other information (several VM registers in our example).
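
A minimal sketch of such a definition follows. Tying it to VM_DEBUG,
writing the trace to stderr, and the function add_inst are assumptions
made for illustration; the example engine prints more than the name.

```c
#include <stdio.h>

#ifdef VM_DEBUG
/* Tracing build: print the instruction name on entry. */
#define NAME(inst_name_string) fprintf(stderr, "%s\n", (inst_name_string));
#else
/* Normal build: expands to nothing. */
#define NAME(inst_name_string)
#endif

/* Hypothetical stand-in for the body of a VM instruction. */
static long add_inst(long a, long b)
{
    NAME("add")
    return a + b;
}
```

Compiling with -DVM_DEBUG turns the trace on; without it the macro
leaves no code behind.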

@findex DEF_CA	@findex DEF_CA
@item DEF_CA	@item DEF_CA
Line 1114 different ways for best performance on v	Line 1415 different ways for best performance on v
@samp{NEXT_P0} is invoked right at the start of the VM instruction (but	@samp{NEXT_P0} is invoked right at the start of the VM instruction (but
after @samp{DEF_CA}), @samp{NEXT_P1} right after the user-supplied C	after @samp{DEF_CA}), @samp{NEXT_P1} right after the user-supplied C
code, and @samp{NEXT_P2} at the end. The actual jump has to be	code, and @samp{NEXT_P2} at the end. The actual jump has to be
performed by @samp{NEXT_P2}.	performed by @samp{NEXT_P2} (if you did it earlier, important parts
	of the VM instruction would not be executed).

The simplest variant is if @samp{NEXT_P2} does everything and the other	The simplest variant is if @samp{NEXT_P2} does everything and the other
macros do nothing. Then also related macros like @samp{IP},	macros do nothing. Then also related macros like @samp{IP},
Line 1189 type. For @samp{inst-stream}, the name	Line 1491 type. For @samp{inst-stream}, the name
plain r-value; typically it is a macro that abstracts away the	plain r-value; typically it is a macro that abstracts away the
differences between the various implementations of @code{NEXT_P*}.	differences between the various implementations of @code{NEXT_P*}.

	@cindex IMM_ARG
	@findex IMM_ARG
	@item IMM_ARG(access,value)
	Define this to expand to ``(access)''. This is just a placeholder for
	future extensions.
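
In C this placeholder is a one-line macro. The helper read_immediate
below is hypothetical and only shows that the second argument is
ignored.

```c
/* Placeholder definition: expands to its first argument only. */
#define IMM_ARG(access, value) (access)

/* Hypothetical use: read an immediate argument through the macro. */
static long read_immediate(const long *ip)
{
    return IMM_ARG(ip[0], 0);
}
```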

@cindex top of stack caching	@cindex top of stack caching
@cindex stack caching	@cindex stack caching
@cindex TOS	@cindex TOS
Line 1218 profiling.	Line 1526 profiling.
@item SUPER_CONTINUE	@item SUPER_CONTINUE
This is just a hint to Vmgen and does nothing at the C level.	This is just a hint to Vmgen and does nothing at the C level.

	@findex MAYBE_UNUSED
	@item MAYBE_UNUSED
	This should be defined as @code{__attribute__((unused))} for gcc-2.7 and
	higher. It suppresses the warnings about unused variables in the code
	for superinstructions. You need to define this only if you are using
	superinstructions.
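
One possible way to write this definition is sketched below. The
fallback to an empty definition for other compilers and the function
twice are assumptions made for illustration.

```c
/* Use the attribute for gcc-2.7 and higher, nothing elsewhere. */
#if defined(__GNUC__) && \
    (__GNUC__ > 2 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 7))
#define MAYBE_UNUSED __attribute__((unused))
#else
#define MAYBE_UNUSED
#endif

/* Hypothetical example: scratch is never read, but the attribute
   keeps gcc from warning about it. */
static long twice(long x)
{
    long scratch MAYBE_UNUSED;
    return 2 * x;
}
```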

@findex VM_DEBUG	@findex VM_DEBUG
@item VM_DEBUG	@item VM_DEBUG
If this is defined, the tracing code will be compiled in (slower	If this is defined, the tracing code will be compiled in (slower
Line 1395 instruction instead of laying down @code	Line 1710 instruction instead of laying down @code

The code for peephole optimization is in @file{vmgen-ex/peephole.c}.	The code for peephole optimization is in @file{vmgen-ex/peephole.c}.
You can use this file almost verbatim. Vmgen generates	You can use this file almost verbatim. Vmgen generates
@file{@var{file}-peephole.i} which contains data for the peephoile	@file{@var{file}-peephole.i} which contains data for the peephole
optimizer.	optimizer.

@findex init_peeptable	@findex init_peeptable
Line 1541 it uses variables and functions defined	Line 1856 it uses variables and functions defined
plus @code{VM_IS_INST} already defined for the VM disassembler	plus @code{VM_IS_INST} already defined for the VM disassembler
(@pxref{VM disassembler}).	(@pxref{VM disassembler}).

	@c **********************************************************
	@node Hints, The future, Using the generated code, Top
	@chapter Hints
	@cindex hints

	@menu
	* Floating point:: and stacks
	@end menu

	@c --------------------------------------------------------------------
	@node Floating point, , Hints, Hints
	@section Floating point

	How should you deal with floating point values? Should you use the same
	stack as for integers/pointers, or a different one? This section
	discusses this issue with a view to execution speed.

	The simpler approach is to use a separate floating-point stack. This
	allows you to choose FP value size without considering the size of the
	integers/pointers, and you avoid a number of performance problems. The
	main downside is that this needs an FP stack pointer (and that may not
	fit in the register file on the 386 architecture, costing some
	performance, but comparatively little if you take the other option into
	account). If you use a separate FP stack (with stack pointer @code{fp}),
	using an fpTOS is helpful on most machines, but some spill the fpTOS
	register into memory, and fpTOS should not be used there.
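
The separate-stack layout can be sketched in C as follows. This is an
illustration under assumptions, not code from the example: both stacks
are assumed to grow towards lower addresses, no top-of-stack caching is
shown, and the names Cell, Float, VMState, f_add and int_to_float are
hypothetical.

```c
typedef long   Cell;    /* integer/pointer stack item */
typedef double Float;   /* FP stack item; its size is independent of Cell */

typedef struct {
    Cell  *sp;          /* integer/pointer stack pointer */
    Float *fp;          /* separate floating-point stack pointer */
} VMState;

/* f+ ( r1 r2 -- r ): touches only the FP stack. */
static void f_add(VMState *vm)
{
    Float r2 = vm->fp[0];
    Float r1 = vm->fp[1];
    vm->fp += 1;
    vm->fp[0] = r1 + r2;
}

/* i>f: pops an integer from one stack and pushes it as an FP value
   on the other. */
static void int_to_float(VMState *vm)
{
    Cell n = vm->sp[0];
    vm->sp += 1;
    vm->fp -= 1;
    vm->fp[0] = (Float)n;
}
```

Because each stack holds only one kind of value, neither stack pointer
nor any cached top-of-stack item has to move between integer and FP
registers.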

	The other approach is to share one stack (pointed to by, say, @code{sp})
	between integer/pointer and floating-point values. This is ok if you do
	not use @code{spTOS}. If you do use @code{spTOS}, the compiler has to
	decide whether to put that variable into an integer or a floating point
	register, and the other type of operation becomes quite expensive on
	most machines (because moving values between integer and FP registers is
	quite expensive). If a value of one type has to be synthesized out of
	two values of the other type (@code{double} types), things are even more
	interesting.

	One way around this problem would be to not use the @code{spTOS}
	supported by Vmgen, but to use explicit top-of-stack variables (one for
	integers, one for FP values), and to have a kind of accumulator+stack
	architecture (e.g., OCaml bytecode uses this approach); however, this is
	a major change, and its ramifications are not completely clear.

	@c **********************************************************
	@node The future, Changes, Hints, Top
	@chapter The future
	@cindex future ideas

	We have a number of ideas for future versions of Vmgen. However, there
	are so many possible things to do that we would like some feedback from
	you. What are you doing with Vmgen, what features are you missing, and
	why?

	One idea we are thinking about is to generate just one @file{.c} file
	instead of letting you copy and adapt all the wrapper files (you would
	still have to define stuff like the type-specific macros, and stack
	pointers etc. somewhere). The advantage would be that, if we change the
	wrapper files between versions, you would not need to integrate your
	changes and our changes to them; Vmgen would also be easier to use for
	beginners. The main disadvantage of that is that it would reduce the
	flexibility of Vmgen a little (well, those who like flexibility could
	still patch the resulting @file{.c} file, like they are now doing for
	the wrapper files). In any case, if you are doing things to the wrapper
	files that would cause problems in a generated-@file{.c}-file approach,
	please let us know.

@c **********************************************************	@c **********************************************************
@node Changes, Contact, Using the generated code, Top	@node Changes, Contact, The future, Top
@chapter Changes	@chapter Changes
@cindex Changes from old versions	@cindex Changes from old versions

	User-visible changes between 0.5.9-20020822 and 0.5.9-20020901:

	The store optimization is now disabled by default, but can be enabled by
	the user (@pxref{Store Optimization}). Documentation for this
	optimization is also new.

	User-visible changes between 0.5.9-20010501 and 0.5.9-20020822:

	There is now a manual (in info, HTML, PostScript, or plain text format).

	There is the vmgen-ex2 variant of the vmgen-ex example; the new
	variant uses a union type instead of lots of casting.

	Both variants of the example can now be compiled with an ANSI C compiler
	(using switch dispatch and losing quite a bit of performance); tested
	with @command{lcc}.

Users of the gforth-0.5.9-20010501 version of Vmgen need to change	Users of the gforth-0.5.9-20010501 version of Vmgen need to change
several things in their source code to use the current version. I	several things in their source code to use the current version. I
recommend keeping the gforth-0.5.9-20010501 version until you have	recommend keeping the gforth-0.5.9-20010501 version until you have
Line 1558 The required changes are:	Line 1955 The required changes are:

@table @code	@table @code

	@cindex @code{TAIL;}, changes
	@item TAIL;
	has been renamed to @code{INST_TAIL;} (less chance of an accidental
	match).

@cindex @code{vm_@var{A}2@var{B}}, changes	@cindex @code{vm_@var{A}2@var{B}}, changes
@item vm_@var{A}2@var{B}	@item vm_@var{A}2@var{B}
now takes two arguments.	now takes two arguments.
Line 1576 Also some new macros have to be defined,	Line 1978 Also some new macros have to be defined,
@node Contact, Copying This Manual, Changes, Top	@node Contact, Copying This Manual, Changes, Top
@chapter Contact	@chapter Contact

	To report a bug, use
	@url{https://savannah.gnu.org/bugs/?func=addbug&group_id=2672}.

	For discussion on Vmgen (e.g., how to use it), use the mailing list
	@email{bug-vmgen@@mail.freesoftware.fsf.org} (use
	@url{http://mail.gnu.org/mailman/listinfo/help-vmgen} to subscribe).

	You can find vmgen information at
	@url{http://www.complang.tuwien.ac.at/anton/vmgen/}.

@c ***********************************************************	@c ***********************************************************
@node Copying This Manual, Index, Contact, Top	@node Copying This Manual, Index, Contact, Top
@appendix Copying This Manual	@appendix Copying This Manual
