Thu Aug 1 21:14:25 2002 UTC (21 years, 8 months ago) by anton

@include version.texi

@c @ifnottex
This file documents vmgen (Gforth @value{VERSION}).

@chapter Introduction

Vmgen is a tool for writing efficient interpreters.  It takes a simple
virtual machine description and generates efficient C code for dealing
with the virtual machine code in various ways (in particular, executing
it).  The run-time efficiency of the resulting interpreters is usually
within a factor of 10 of machine code produced by an optimizing
compiler.

The interpreter design strategy supported by vmgen is to divide the
interpreter into two parts:

@itemize @bullet

@item The @emph{front end} takes the source code of the language to be
implemented, and translates it into virtual machine code.  This is
similar to an ordinary compiler front end; typically an interpreter
front-end performs no optimization, so it is relatively simple to
implement and runs fast.

@item The @emph{virtual machine interpreter} executes the virtual
machine code.

@end itemize

Such a division is usually used in interpreters, for modularity as well
as for efficiency reasons.  The virtual machine code is typically passed
between front end and virtual machine interpreter in memory, like in a
load-and-go compiler; this avoids the complexity and time cost of
writing the code to a file and reading it again.

A @emph{virtual machine} (VM) represents the program as a sequence of
@emph{VM instructions}, following each other in memory, similar to real
machine code.  Control flow occurs through VM branch instructions, like
in a real machine.
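
As a rough illustration of this model (not vmgen output), the following
C sketch stores VM instructions and their immediate arguments
sequentially in an array and executes them with plain @code{switch}
dispatch.  The opcode names, the @code{run} function and the stack size
are assumptions made for this sketch only.

```c
#include <assert.h>

typedef long Cell;

/* Hypothetical opcodes for this sketch; not the instruction set of mini. */
enum { OP_LIT, OP_ADD, OP_DUP, OP_BNZ, OP_HALT };

#define STACK_ITEMS 16

/* Executes VM code stored sequentially in memory and returns the
   top-of-stack item when OP_HALT is reached. */
Cell run(const Cell *code)
{
    Cell stack[STACK_ITEMS];
    Cell *sp = stack + STACK_ITEMS;  /* empty stack; grows towards lower addresses */
    const Cell *ip = code;           /* VM instruction pointer */

    for (;;) {
        switch (*ip++) {
        case OP_LIT:                 /* push the immediate argument */
            assert(sp > stack);
            *--sp = *ip++;
            break;
        case OP_ADD:                 /* replace the two top items by their sum */
            assert(stack + STACK_ITEMS - sp >= 2);
            sp[1] += sp[0];
            sp++;
            break;
        case OP_DUP:                 /* duplicate the top item */
            assert(sp > stack && sp < stack + STACK_ITEMS);
            sp[-1] = sp[0];
            sp--;
            break;
        case OP_BNZ: {               /* VM branch: pop, jump if nonzero */
            Cell target = *ip++;
            assert(sp < stack + STACK_ITEMS);
            if (*sp++ != 0)
                ip = code + target;
            break;
        }
        case OP_HALT:
            assert(sp < stack + STACK_ITEMS);
            return sp[0];
        default:
            assert(!"unknown opcode");
            return 0;
        }
    }
}
```

A program such as @code{OP_LIT 3 OP_LIT -1 OP_ADD OP_DUP OP_BNZ 2
OP_HALT} counts down from 3 and leaves 0 on the stack; the branch target
2 is an index into the code array.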

In this setup, vmgen can generate most of the code dealing with virtual
machine instructions from a simple description of the virtual machine
instructions (@pxref...), in particular:

@table @emph

@item VM instruction execution

@item VM code generation
Useful in the front end.

@item VM code decompiler
Useful for debugging the front end.

@item VM code tracing
Useful for debugging the front end and the VM interpreter.  You will
typically provide other means for debugging the user's programs at the
source level.

@item VM code profiling
Useful for optimizing the VM interpreter with superinstructions
(@pxref...).

@end table

Vmgen supports efficient interpreters through various optimizations, in
particular

@itemize

@item Threaded code

@item Caching the top-of-stack in a register

@item Combining VM instructions into superinstructions

@item
Replicating VM (super)instructions for better BTB prediction accuracy
(not yet in vmgen-ex, but already in Gforth).

@end itemize
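
To give an idea of what caching the top-of-stack item means, here is a
hand-written C sketch (not vmgen output).  It uses the names
@code{spTOS} and @code{IF_spTOS} in the way this manual describes them
later; the function @code{sum3} and the stack layout with one spare
slot are assumptions of the sketch.

```c
#include <assert.h>

typedef long Cell;

#define STACK_ITEMS 8

/* Top-of-stack caching is enabled in this sketch, so IF_spTOS(expr)
   simply performs expr. */
#define IF_spTOS(expr) expr

/* Pushes a, b and c and adds twice, keeping the top item in spTOS. */
Cell sum3(Cell a, Cell b, Cell c)
{
    Cell stack[STACK_ITEMS + 1];     /* spare slot backs the cache of an empty stack */
    Cell *sp = stack + STACK_ITEMS;
    Cell spTOS = 0;                  /* cached copy of sp[0]; ideally in a register */
    Cell in[3];
    int k;

    in[0] = a; in[1] = b; in[2] = c;
    for (k = 0; k < 3; k++) {        /* three pushes */
        IF_spTOS(sp[0] = spTOS);     /* spill the old top item to memory */
        sp -= 1;
        assert(sp >= stack);
        spTOS = in[k];               /* the new top item lives in the cache only */
    }
    for (k = 0; k < 2; k++) {        /* two additions */
        Cell i1 = sp[1];             /* second item comes from memory */
        Cell i2 = spTOS;             /* top item comes from the cache */
        sp += 1;
        spTOS = i1 + i2;             /* result stays in the cache */
    }
    IF_spTOS(sp[0] = spTOS);         /* flush before reading memory directly */
    return sp[0];
}
```

Each addition reads one item from memory instead of two and writes
nothing back, which is where the saving comes from.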

As a result, vmgen-based interpreters are only about an order of
magnitude slower than native code from an optimizing C compiler on small
benchmarks; on large benchmarks, which spend more time in the run-time
system, the slowdown is often less (e.g., the slowdown of a
Vmgen-generated JVM interpreter over the best JVM JIT compiler we
measured is only a factor of 2-3 for large benchmarks; some other JITs
and all other interpreters we looked at were slower than our
interpreter).

VMs are usually designed as stack machines (passing data between VM
instructions on a stack), and vmgen supports such designs especially
well; however, you can also use vmgen for implementing a register VM and
still benefit from most of the advantages offered by vmgen.

There are many potential uses of the instruction descriptions that are
not implemented at the moment, but we are open to feature requests, and
we will implement new features if someone asks for them; so the feature
list above is not exhaustive.

@c *********************************************************************
@chapter Why interpreters?

Interpreters are a popular language implementation technique because
they combine all three of the following advantages:

@itemize

@item Ease of implementation

@item Portability

@item Fast edit-compile-run cycle

@end itemize

The main disadvantage of interpreters is their run-time speed.  However,
there are huge differences between different interpreters in this area:
the slowdown over optimized C code on programs consisting of simple
operations is typically a factor of 10 for the more efficient
interpreters, and a factor of 1000 for the less efficient ones (the
slowdown for programs executing complex operations is less, because the
time spent in libraries for executing complex operations is the same in
all implementation strategies).

Vmgen makes it even easier to implement interpreters.  It also supports
techniques for building efficient interpreters.

@c ********************************************************************

@chapter Concepts

@c --------------------------------------------------------------------
@section Front-end and virtual machine interpreter

@cindex front-end
Interpretive systems are typically divided into a @emph{front end} that
parses the input language and produces an intermediate representation
for the program, and an interpreter that executes the intermediate
representation of the program.

@cindex virtual machine
@cindex VM
@cindex instruction, VM
For efficient interpreters the intermediate representation of choice is
virtual machine code (rather than, e.g., an abstract syntax tree).
@emph{Virtual machine} (VM) code consists of VM instructions arranged
sequentially in memory; they are executed in sequence by the VM
interpreter, except for VM branch instructions, which implement control
structures.  The conceptual similarity to real machine code results in
the name @emph{virtual machine}.

In this framework, vmgen supports building the VM interpreter and any
other component dealing with VM instructions.  It does not have any
support for the front end, apart from VM code generation support.  The
front end can be implemented with classical compiler front-end
techniques, supported by tools like @command{flex} and @command{bison}.

The intermediate representation is usually just internal to the
interpreter, but some systems also support saving it to a file, either
as an image file, or in a full-blown linkable file format (e.g., JVM).
Vmgen currently has no special support for such features, but the
information in the instruction descriptions can be helpful, and we are
open to feature requests and suggestions.

@section Data handling

@cindex stack machine
@cindex register machine
Most VMs use one or more stacks for passing temporary data between VM
instructions.  Another option is to use a register machine architecture
for the virtual machine; however, this option is either slower or
significantly more complex to implement than a stack machine architecture.

Vmgen has special support and optimizations for stack VMs, making their
implementation easy and efficient.

You can also implement a register VM with vmgen (@pxref{Register
Machines}), and you will still profit from most vmgen features.

@cindex stack item size
@cindex size, stack items
Stack items all have the same size, so they typically will be as wide as
an integer, pointer, or floating-point value.  Vmgen supports treating
two consecutive stack items as a single value, but anything larger is
best kept in some other memory area (e.g., the heap), with pointers to
the data on the stack.
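
The following C sketch illustrates how a value that is two stack items
wide could be split into and rebuilt from two consecutive items.  The
macro names follow the @code{vm_two@var{A}2@var{B}} pattern described
later in this manual; the 32-bit @code{Cell}, the type @code{DCell},
the prefix @code{d} and the choice of which item holds the upper half
are all assumptions of the sketch.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Cell;   /* hypothetical 32-bit stack item */
typedef int64_t  DCell;  /* hypothetical double-wide type with prefix d */

/* a1 holds the upper half and a2 the lower half (an assumption).  The
   conversion from uint64_t to int64_t relies on the usual
   two's-complement behaviour of common compilers. */
#define vm_twoCell2d(a1, a2, d) \
    ((d) = (DCell)(((uint64_t)(a1) << 32) | (uint64_t)(a2)))
#define vm_d2twoCell(d, a1, a2) \
    ((a1) = (Cell)((uint64_t)(d) >> 32), \
     (a2) = (Cell)((uint64_t)(d) & 0xffffffffu))

/* Stores d in two consecutive stack items and reads it back. */
DCell through_stack(DCell d)
{
    Cell stack[2];
    DCell out;

    assert(sizeof(DCell) == 2 * sizeof(Cell));
    vm_d2twoCell(d, stack[1], stack[0]);
    vm_twoCell2d(stack[1], stack[0], out);
    return out;
}
```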

@cindex instruction stream
@cindex immediate arguments
Another source of data is the immediate arguments of VM instructions (in
the VM instruction stream).  The VM instruction stream is handled
similarly to a stack in vmgen.

@cindex garbage collection
@cindex reference counting
Vmgen has neither built-in support for nor restrictions against @emph{garbage
collection}.  If you need garbage collection, you need to provide it in
your run-time libraries.  Using @emph{reference counting} is probably
harder, but might be possible (contact us if you are interested).
@c reference counting might be possible by including counting code in 
@c the conversion macros.

@c *************************************************************
@chapter Invoking vmgen

The usual way to invoke vmgen is as follows:

@example
vmgen @var{infile}
@end example

Here @var{infile} is the VM instruction description file, which usually
ends in @file{.vmg}.  The output filenames are made by taking the
basename of @var{infile} (i.e., the output files will be created in the
current working directory) and replacing @file{.vmg} with @file{-vm.i},
@file{-disasm.i}, @file{-gen.i}, @file{-labels.i}, @file{-profile.i},
and @file{-peephole.i}.  E.g., @command{vmgen hack/foo.vmg} will create
@file{foo-vm.i} etc.
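
The naming rule can be sketched in C as follows.  The helper
@code{output_name} is hypothetical and not part of vmgen; what happens
when the input name does not end in @file{.vmg} is not specified above,
so the sketch just appends the suffix in that case.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: derives an output file name from the input
   file name and a suffix such as "-vm.i".  Returns 1 on success and
   0 if the result does not fit into out. */
int output_name(const char *infile, const char *suffix,
                char *out, size_t outsize)
{
    const char *base;
    size_t len;
    int n;

    assert(infile != NULL && suffix != NULL && out != NULL);
    base = strrchr(infile, '/');          /* drop the directory part */
    base = base ? base + 1 : infile;
    len = strlen(base);
    if (len >= 4 && strcmp(base + len - 4, ".vmg") == 0)
        len -= 4;                         /* drop the .vmg extension */
    n = snprintf(out, outsize, "%.*s%s", (int)len, base, suffix);
    return n >= 0 && (size_t)n < outsize;
}
```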

The command-line options supported by vmgen are

@table @option

@cindex -h, command-line option
@cindex --help, command-line option
@item --help
@itemx -h
Print a message about the command-line options

@cindex -v, command-line option
@cindex --version, command-line option
@item --version
@itemx -v
Print version and exit
@end table

@c env vars GFORTHDIR GFORTHDATADIR

@c ****************************************************************
@chapter Example

@section Example overview

There are two versions of the same example for using vmgen:
@file{vmgen-ex} and @file{vmgen-ex2} (you can also see Gforth as
example, but it uses additional (undocumented) features, and also
differs in some other respects).  The example implements @emph{mini}, a
tiny Modula-2-like language with a small JavaVM-like virtual machine.
The difference between the examples is that @file{vmgen-ex} uses many
casts, and @file{vmgen-ex2} tries to avoid most casts and uses unions
instead.
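
The two styles can be contrasted with a small C sketch.  It is not
taken from the example directories: the type names @code{CastCell} and
@code{UnionCell}, the prefix @code{a} for addresses and the use of
@code{intptr_t} are assumptions made so that both variants fit into one
file.  The macros follow the @code{vm_@var{A}2@var{B}(a,b)} convention,
which assigns @code{a} to @code{b}.

```c
#include <assert.h>
#include <stdint.h>

/* Cast style: a stack item is an integer type and other types are cast. */
typedef intptr_t CastCell;
#define vm_CastCell2i(cell, var) ((var) = (long)(cell))
#define vm_i2CastCell(var, cell) ((cell) = (CastCell)(var))
#define vm_CastCell2a(cell, var) ((var) = (char *)(cell))
#define vm_a2CastCell(var, cell) ((cell) = (CastCell)(var))

/* Union style: a stack item is a union with one field per type. */
typedef union { long i; char *a; } UnionCell;
#define vm_UnionCell2i(cell, var) ((var) = (cell).i)
#define vm_i2UnionCell(var, cell) ((cell).i = (var))
#define vm_UnionCell2a(cell, var) ((var) = (cell).a)
#define vm_a2UnionCell(var, cell) ((cell).a = (var))

/* Moves an address into a stack item and back, in either style. */
char *roundtrip_address(char *a, int use_union)
{
    char *out;

    assert(sizeof(CastCell) >= sizeof(char *));
    if (use_union) {
        UnionCell c;
        vm_a2UnionCell(a, c);
        vm_UnionCell2a(c, out);
    } else {
        CastCell c;
        vm_a2CastCell(a, c);
        vm_CastCell2a(c, out);
    }
    return out;
}

/* Moves an integer into a stack item and back, in either style. */
long roundtrip_integer(long i, int use_union)
{
    long out;

    if (use_union) {
        UnionCell c;
        vm_i2UnionCell(i, c);
        vm_UnionCell2i(c, out);
    } else {
        CastCell c;
        vm_i2CastCell(i, c);
        vm_CastCell2i(c, out);
    }
    return out;
}
```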

The files provided with each example are:

@example
Makefile
README
disasm.c           wrapper file
engine.c           wrapper file
peephole.c         wrapper file
profile.c          wrapper file
mini-inst.vmg      simple VM instructions
mini-super.vmg     superinstructions (empty at first)
mini.h             common declarations
mini.l             scanner
mini.y             front end (parser, VM code generator)
support.c          main() and other support functions
fib.mini           example mini program
simple.mini        example mini program
test.mini          example mini program (tests everything)
test.out           test.mini output
stat.awk           script for aggregating profile information
peephole-blacklist list of instructions not allowed in superinstructions
seq2rule.awk       script for creating superinstructions
@end example

For your own interpreter, you would typically copy the following files
and change little, if anything:

@example
disasm.c           wrapper file
engine.c           wrapper file
peephole.c         wrapper file
profile.c          wrapper file
stat.awk           script for aggregating profile information
seq2rule.awk       script for creating superinstructions
@end example

You would typically change much in or replace the following files:

@example
Makefile
mini-inst.vmg      simple VM instructions
mini.h             common declarations
mini.l             scanner
mini.y             front end (parser, VM code generator)
support.c          main() and other support functions
peephole-blacklist list of instructions not allowed in superinstructions
@end example

You can build the example by @code{cd}ing into the example's directory,
and then typing @samp{make}; you can check that it works with @samp{make
check}.  You can run mini programs like this:

@example
./mini fib.mini
@end example

To learn about the options, type @samp{./mini -h}.

@section Using profiling to create superinstructions

I have not added rules for this in the @file{Makefile} (there are many
options for selecting superinstructions, and I did not want to hardcode
one into the @file{Makefile}), but there are some supporting scripts, and
here's an example:

If you want to use @file{fib.mini} and @file{test.mini} as training
programs, you get the profiles like this:

@example
make fib.prof test.prof #takes a few seconds
@end example

You can aggregate these profiles with @file{stat.awk}:

@example
awk -f stat.awk fib.prof test.prof
@end example

The result contains lines like:

@example
      2      16        36910041 loadlocal lit
@end example

This means that the sequence @code{loadlocal lit} statically occurs a
total of 16 times in 2 profiles, with a dynamic execution count of
36910041.

The numbers can be used in various ways to select superinstructions.
E.g., if you just want to select all sequences with a dynamic
execution count of at least 10000, you would use the following pipeline:

@example
awk -f stat.awk fib.prof test.prof|
awk '$3>=10000'|                #select sequences
fgrep -v -f peephole-blacklist| #eliminate wrong instructions
awk -f seq2rule.awk|      #transform sequences into superinstruction rules
sort -k 3 >mini-super.vmg       #sort sequences
@end example

The file @file{peephole-blacklist} contains all instructions that
directly access a stack or stack pointer (for mini: @code{call},
@code{return}); the sort step is necessary to ensure that prefixes
precede larger superinstructions.

Now you can create a version of mini with superinstructions by just
saying @samp{make}.

@c ***************************************************************
@chapter Input File Format

Vmgen takes as input a file containing specifications of virtual machine
instructions.  This file usually has a name ending in @file{.vmg}.

Most examples are taken from the example in @file{vmgen-ex}.

@section Input File Grammar

The grammar is in EBNF format, with @code{@var{a}|@var{b}} meaning
``@var{a} or @var{b}'', @code{@{@var{c}@}} meaning 0 or more repetitions
of @var{c} and @code{[@var{d}]} meaning 0 or 1 repetitions of @var{d}.

Vmgen input is not free-format, so you have to take care where you put
spaces and especially newlines; it's not as bad as makefiles, though:
any sequence of spaces and tabs is equivalent to a single space.

@example
description: @{instruction|comment|eval-escape@}

instruction: simple-inst|super-inst

simple-inst: ident " (" stack-effect " )" newline c-code newline newline

stack-effect: @{ident@} " --" @{ident@}

super-inst: ident " =" ident @{ident@}

comment:      "\ "  text newline

eval-escape:  "\E " text newline
@end example
@c \+ \- \g \f \c

Note that the @code{\}s in this grammar are meant literally, not as
C-style encodings for non-printable characters.

The C code in @code{simple-inst} must not contain empty lines (because
vmgen would mistake that as the end of the simple-inst).  The text in
@code{comment} and @code{eval-escape} must not contain a newline.
@code{Ident} must conform to the usual conventions of C identifiers
(otherwise the C compiler would choke on the vmgen output).

Vmgen understands a few extensions beyond the grammar given here, but
these extensions are only useful for building Gforth.  You can find a
description of the format used for Gforth in @file{prim}.

@subsection Eval escapes
@c elsewhere?
The text in @code{eval-escape} is Forth code that is evaluated when
vmgen reads the line.  If you do not know (and do not want to learn)
Forth, you can build the text according to the following grammar; these
rules are normally all the Forth you need for using vmgen:

@example
text: stack-decl|type-prefix-decl|stack-prefix-decl

stack-decl: "stack " ident ident ident
type-prefix-decl: 
    's" ' string '" ' ("single"|"double") ident "type-prefix" ident
stack-prefix-decl:  ident "stack-prefix" string
@end example

Note that the syntax of this code is not checked thoroughly (there are
many other Forth program fragments that could be written there).

If you know Forth, the stack effects of the non-standard words involved
are:

@example
stack        ( "name" "pointer" "type" -- )
             ( name execution: -- stack )
type-prefix  ( addr u xt1 xt2 n stack "prefix" -- )
single       ( -- xt1 xt2 n )
double       ( -- xt1 xt2 n )
stack-prefix ( stack "prefix" -- )
@end example


@section Simple instructions

We will use the following simple VM instruction description as example:

@example
sub ( i1 i2 -- i )
i = i1-i2;
@end example

The first line specifies the name of the VM instruction (@code{sub}) and
its stack effect (@code{i1 i2 -- i}).  The rest of the description is
just plain C code.

@cindex stack effect
The stack effect specifies that @code{sub} pulls two integers from the
data stack and puts them in the C variables @code{i1} and @code{i2} (with
the rightmost item (@code{i2}) taken from the top of stack) and later
pushes one integer (@code{i}) on the data stack (the rightmost item is
on the top afterwards).
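
Conceptually, the code surrounding the user-supplied C code of
@code{sub} behaves like the following hand-written C function.  This is
only an approximation for illustration, not the code vmgen actually
emits; the function form and the parameter passing are assumptions, and
conversion macros and instruction dispatch are left out.

```c
#include <assert.h>

typedef long Cell;

/* Approximation of "sub ( i1 i2 -- i )" for a data stack that grows
   towards lower addresses; returns the updated stack pointer. */
Cell *sub(Cell *sp)
{
    Cell i1, i2, i;

    assert(sp != 0);
    i2 = sp[0];        /* rightmost input item: top of stack */
    i1 = sp[1];        /* the item below it */
    sp += 1;           /* two items consumed, one produced */
    {
        i = i1-i2;     /* the user-supplied C code */
    }
    sp[0] = i;         /* the result is on top afterwards */
    return sp;
}
```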

How do we know the type and stack of the stack items?  Vmgen uses
prefixes, similar to Fortran; in contrast to Fortran, you have to
define the prefix first:

@example
\E s" Cell"   single data-stack type-prefix i
@end example

This defines the prefix @code{i} to refer to the type @code{Cell}
(defined as @code{long} in @file{mini.h}) and, by default, to the
@code{data-stack}.  It also specifies that this type takes one stack
item (@code{single}).  The type prefix is part of the variable name.

Before we can use @code{data-stack} in this way, we have to define it:

@example
\E stack data-stack sp Cell
@end example
@c !! use something other than Cell

This line defines the stack @code{data-stack}, which uses the stack
pointer @code{sp}, and each item has the basic type @code{Cell}; other
types have to fit into one or two @code{Cell}s (depending on whether the
type is @code{single} or @code{double} wide), and are converted from and
to Cells on accessing the @code{data-stack} with conversion macros
(@pxref{Conversion macros}).  Stacks grow towards lower addresses in
vmgen-erated interpreters.

We can override the default stack of a stack item by using a stack
prefix.  E.g., consider the following instruction:

@example
lit ( #i -- i )
@end example

The VM instruction @code{lit} takes the item @code{i} from the
instruction stream (indicated by the prefix @code{#}), and pushes it on
the (default) data stack.  The stack prefix is not part of the variable
name.  Stack prefixes are defined like this:

@example
\E inst-stream stack-prefix #
@end example

This definition specifies that the stack prefix @code{#} refers to the
``stack'' @code{inst-stream}.  Since the instruction stream behaves a
little differently than an ordinary stack, it is predefined, and you do
not need to define it.

The instruction stream contains instructions and their immediate
arguments, so specifying that an argument comes from the instruction
stream indicates an immediate argument.  Of course, instruction stream
arguments can only appear to the left of @code{--} in the stack effect.
If there are multiple instruction stream arguments, the leftmost is the
first one (just as the intuition suggests).

@subsection C Code Macros

Vmgen recognizes the following strings in the C code part of simple
instructions:

@table @samp

@item SET_IP
As far as vmgen is concerned, a VM instruction containing this ends a VM
basic block (used in profiling to delimit profiled sequences).  On the C
level, this also sets the instruction pointer.

@item SUPER_END
This ends a basic block (for profiling), without a SET_IP.

@item TAIL;
Vmgen replaces @samp{TAIL;} with code for ending a VM instruction and
dispatching the next VM instruction.  This happens automatically when
control reaches the end of the C code.  If you want to have this in the
middle of the C code, you need to use @samp{TAIL;}.  A typical example
is a conditional VM branch:

@example
if (branch_condition) @{
  SET_IP(target); TAIL;
@}
/* implicit tail follows here */
@end example

In this example, @samp{TAIL;} is not strictly necessary, because there
is another one implicitly after the if-statement, but using it improves
branch prediction accuracy slightly and allows other optimizations.

@item SUPER_CONTINUE
This indicates that the implicit tail at the end of the VM instruction
dispatches the sequentially next VM instruction even if there is a
@code{SET_IP} in the VM instruction.  This enables an optimization that
is not yet implemented in the vmgen-ex code (but in Gforth).  The
typical application is in conditional VM branches:

@example
if (branch_condition) @{
  SET_IP(target); TAIL; /* now this TAIL is necessary */
@}
SUPER_CONTINUE;
@end example

@end table

Note that vmgen is not smart about C-level tokenization, comments,
strings, or conditional compilation, so it will interpret even a
commented-out SUPER_END as ending a basic block (or, e.g.,
@samp{RETAIL;} as @samp{TAIL;}).  Conversely, vmgen requires the literal
presence of these strings; vmgen will not see them if they are hiding in
a C preprocessor macro.


@subsection C Code restrictions

Vmgen generates code and performs some optimizations under the
assumption that the user-supplied C code does not access the stack
pointers or stack items, and that accesses to the instruction pointer
only occur through special macros.  In general you should heed these
restrictions.  However, if you need to break these restrictions, read
the following.

Accessing a stack or stack pointer directly can be a problem for several
reasons: 

@itemize

@item
You may cache the top-of-stack item in a local variable (that is
allocated to a register).  This is the most frequent source of trouble.
You can deal with it either by not using top-of-stack caching (slowdown
factor 1-1.4, depending on machine), or by inserting flushing code
(e.g., @samp{IF_spTOS(sp[...] = spTOS);}) at the start and reloading
code (e.g., @samp{IF_spTOS(spTOS = sp[0]);}) at the end of problematic C
code.  Vmgen inserts a stack pointer update before the start of the
user-supplied C code, so the flushing code has to use an index that
corrects for that.  In the future, this flushing may be done
automatically by mentioning a special string in the C code.
@c sometimes flushing and/or reloading unnecessary

@item
The vmgen-erated code loads the stack items from stack-pointer-indexed
memory into variables before the user-supplied C code, and stores them
from variables to stack-pointer-indexed memory afterwards.  If you do
any writes to the stack through its stack pointer in your C code, it
will not affect the variables, and your write may be overwritten by the
stores after the C code.  Similarly, a read from a stack using a stack
pointer will not reflect computations of stack items in the same VM
instruction.

@item
Superinstructions keep stack items in variables across the whole
superinstruction.  So you should not include VM instructions that
access a stack or stack pointer as components of superinstructions.

@end itemize
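
The second problem can be reproduced with a hand-written C model of a
hypothetical instruction @code{inc ( i1 -- i2 )} whose C code also
writes to the stack directly.  The loads and stores around the user
code are an approximation of what the generated code does, not actual
vmgen output.

```c
#include <assert.h>

typedef long Cell;

/* Model of a rule-breaking "inc ( i1 -- i2 )"; returns the stack pointer. */
Cell *bad_inc(Cell *sp)
{
    Cell i1, i2;

    assert(sp != 0);
    i1 = sp[0];        /* before the user code: load the stack item */
    {                  /* user-supplied C code */
        sp[0] = 99;    /* against the rules: direct write to the stack */
        i2 = i1 + 1;
    }
    sp[0] = i2;        /* after the user code: the store overwrites the 99 */
    return sp;
}
```

After @code{bad_inc} has run on a stack whose top item is 5, the top
item is 6; the direct write of 99 has been lost.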

You should access the instruction pointer only through its special
macros (@samp{IP}, @samp{SET_IP}, @samp{IPTOS}); this ensures that these
macros can be implemented in several ways for best performance.
@samp{IP} points to the next instruction, and @samp{IPTOS} is its
contents.


@section Superinstructions

Here is an example of a superinstruction definition:

@example
lit_sub = lit sub
@end example

@code{lit_sub} is the name of the superinstruction, and @code{lit} and
@code{sub} are its components.  This superinstruction performs the same
action as the sequence @code{lit} and @code{sub}.  It is generated
automatically by the VM code generation functions whenever that sequence
occurs, so you only need to add this definition if you want to use this
superinstruction (and even that can be partially automated,
@pxref{...}).
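
The equivalence can be illustrated with a hand-written C sketch.  The
@code{VM} structure and the three functions are assumptions made for
this illustration and do not show the code vmgen generates; the point
is only that the combined version keeps the intermediate item in a C
variable instead of pushing and popping it.

```c
#include <assert.h>

typedef long Cell;

/* Hypothetical VM state for this sketch. */
typedef struct {
    const Cell *ip;   /* instruction stream, here only immediate arguments */
    Cell *sp;         /* data stack pointer; the stack grows downwards */
} VM;

/* lit ( #i -- i ) */
void lit(VM *vm)
{
    Cell i = *vm->ip++;
    *--vm->sp = i;
}

/* sub ( i1 i2 -- i ) */
void sub(VM *vm)
{
    Cell i2 = vm->sp[0];
    Cell i1 = vm->sp[1];
    vm->sp += 1;
    vm->sp[0] = i1 - i2;
}

/* lit_sub = lit sub: the literal never touches the stack memory. */
void lit_sub(VM *vm)
{
    Cell i2 = *vm->ip++;
    Cell i1 = vm->sp[0];
    vm->sp[0] = i1 - i2;
}

/* Runs either the sequence or the superinstruction on a stack that
   holds top, with immediate as the immediate argument. */
Cell run_pair(Cell top, Cell immediate, int use_super)
{
    Cell stack[4];
    VM vm;

    vm.ip = &immediate;
    vm.sp = stack + 3;
    vm.sp[0] = top;
    if (use_super) {
        lit_sub(&vm);
    } else {
        lit(&vm);
        sub(&vm);
    }
    assert(vm.sp == stack + 3);   /* same stack depth either way */
    return vm.sp[0];
}
```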

Vmgen requires that the component instructions are simple instructions
defined before superinstructions using the components.  Currently, vmgen
also requires that all the subsequences at the start of a
superinstruction (prefixes) must be defined as superinstructions before
the superinstruction.  I.e., if you want to define a superinstruction

@example
sumof5 = add add add add
@end example

you first have to define

@example
add ( n1 n2 -- n )
n = n1+n2;

sumof3 = add add
sumof4 = add add add
@end example

Here, @code{sumof4} is the longest prefix of @code{sumof5}, and @code{sumof3}
is the longest prefix of @code{sumof4}.

Note that vmgen assumes that only the code it generates accesses stack
pointers, the instruction pointer, and various stack items, and it
performs optimizations based on this assumption.  Therefore, VM
instructions that change the instruction pointer should only be used as
last component; a VM instruction that accesses a stack pointer should
not be used as component at all.  Vmgen does not check these
restrictions; violating them just results in bugs in your interpreter.

@c ********************************************************************
@chapter Using the generated code

The easiest way to create a working VM interpreter with vmgen is
probably to start with one of the examples, and modify it for your
purposes.  This chapter is just the reference manual for the macros
etc. used by the generated code, and the other context expected by the
generated code, and what you can do with the various generated files.

@section VM engine

The VM engine is the VM interpreter that executes the VM code.  It is
essential for an interpretive system.

The main file generated for the VM interpreter is
@file{@var{name}-vm.i}.  It uses the following macros and variables (and
you have to define them):

@table @code

@item LABEL(@var{inst_name})
This is used just before each VM instruction to provide a jump or
@code{switch} label (the @samp{:} is provided by vmgen).  For switch
dispatch this should expand to @samp{case @var{label}}; for
threaded-code dispatch this should just expand to @samp{@var{label}}.
In either case @var{label} is usually the @var{inst_name}
with some prefix or suffix to avoid naming conflicts.

@item NAME(@var{inst_name_string})
Called on entering a VM instruction with a string containing the name of
the VM instruction as parameter.  In normal execution this should be a
no-op, but for tracing this usually prints the name, and possibly other
information (several VM registers in our example).

@item DEF_CA
Usually empty.  Called just inside a new scope at the start of a VM
instruction.  Can be used to define variables that should be visible
during every VM instruction.  If you define this macro as non-empty, you
have to provide the finishing @samp{;} in the macro.

@item NEXT_P0 NEXT_P1 NEXT_P2
The three parts of instruction dispatch.  They can be defined in
different ways for best performance on various processors (see
@file{engine.c} in the example or @file{engine/threaded.h} in Gforth).
@samp{NEXT_P0} is invoked right at the start of the VM instruction (but
after @samp{DEF_CA}), @samp{NEXT_P1} right after the user-supplied C
code, and @samp{NEXT_P2} at the end.  The actual jump has to be
performed by @samp{NEXT_P2}.

The simplest variant is if @samp{NEXT_P2} does everything and the other
macros do nothing.  Then also related macros like @samp{IP},
@samp{SET_IP}, @samp{INC_IP} and @samp{IPTOS} are very
straightforward to define.  For switch dispatch this code consists just
of a jump to the dispatch code (@samp{goto next_inst;} in our example);
for direct threaded code it consists of something like
@samp{(@{cfa=*ip++; goto *cfa;@})}.

Pulling code (usually the @samp{cfa=*ip;}) up into @samp{NEXT_P1}
usually does not cause problems, but pulling things up into
@samp{NEXT_P0} usually requires changing the other macros (and, at least
for Gforth on Alpha, it does not buy much, because the compiler often
manages to schedule the relevant stuff up by itself).  An even more
extreme variant is to pull code up even further, into, e.g., NEXT_P1 of
the previous VM instruction (prefetching, useful on PowerPCs).

@item INC_IP(@var{n})
This increments IP by @var{n}.

@item vm_@var{A}2@var{B}(a,b)
Type casting macro that assigns @samp{a} (of type @var{A}) to @samp{b}
(of type @var{B}).  This is mainly used for getting stack items into
variables and back.  So you need to define macros for every combination
of stack basic type (@code{Cell} in our example) and type-prefix types
used with that stack (in both directions).  For the type-prefix type,
you use the type-prefix (not the C type string) as type name (e.g.,
@samp{vm_Cell2i}, not @samp{vm_Cell2Cell}).  In addition, you have to
define a vm_@var{X}2@var{X} macro for the stack basic type (used in
superinstructions).

The stack basic type for the predefined @samp{inst-stream} is
@samp{Cell}.  If you want a stack with the same item size, making its
basic type @samp{Cell} usually reduces the number of macros you have to
define.

Here our examples differ a lot: @file{vmgen-ex} uses casts in these
macros, whereas @file{vmgen-ex2} uses union-field selection (or
assignment to union fields).

@item vm_two@var{A}2@var{B}(a1,a2,b)
@itemx vm_@var{B}2two@var{A}(b,a1,a2)
Conversions between two stack items (@code{a1}, @code{a2}) and a
variable @code{b} of a type that takes two stack items.  This does not
occur in our small examples, but you can look at Gforth for examples.

@item @var{stackpointer}
For each stack used, the stackpointer name given in the stack
declaration is used.  For a regular stack this must be an l-value;
typically it is a variable declared as a pointer to the stack's basic
type.  For @samp{inst-stream}, the name is @samp{IP}, and it can be a
plain r-value; typically it is a macro that abstracts away the
differences between the various implementations of NEXT_P*.

@item @var{stackpointer}TOS
The top-of-stack for the stack pointed to by @var{stackpointer}.  If you
are using top-of-stack caching for that stack, this should be defined as
variable; if you are not using top-of-stack caching for that stack, this
should be a macro expanding to @samp{@var{stackpointer}[0]}.  The stack
pointer for the predefined @samp{inst-stream} is called @samp{IP}, so
the top-of-stack is called @samp{IPTOS}.

@item IF_@var{stackpointer}TOS(@var{expr})
Macro for executing @var{expr}, if top-of-stack caching is used for the
@var{stackpointer} stack.  I.e., this should do @var{expr} if there is
top-of-stack caching for @var{stackpointer}; otherwise it should do
nothing.

@item VM_DEBUG
If this is defined, the tracing code will be compiled in (slower
interpretation, but better debugging).  Our example compiles two
versions of the engine, a fast-running one that cannot trace, and one
with potential tracing and profiling.

@item vm_debug
Needed only if @samp{VM_DEBUG} is defined.  If this variable contains
true, the VM instructions produce trace output.  It can be turned on or
off at any time.

@item vm_out
Needed only if @samp{VM_DEBUG} is defined.  Specifies the file on which
to print the trace output (type @samp{FILE *}).

@item printarg_@var{type}(@var{value})
Needed only if @samp{VM_DEBUG} is defined.  Macro or function for
printing @var{value} in a way appropriate for the @var{type}.  This is
used for printing the values of stack items during tracing.  @var{Type}
is normally the type prefix specified in a @code{type-prefix} definition
(e.g., @samp{printarg_i}); in superinstructions it is currently the
basic type of the stack.

@end table
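
The simplest variant of the dispatch macros described above, with
@code{NEXT_P2} doing all the work, can be sketched for direct threaded
code as follows.  The sketch is hand-written rather than built from
vmgen output, and it relies on the GNU C labels-as-values extension, so
it needs GCC or a compatible compiler.  The instruction set, the
@code{Slot} union and the convention that a null @code{code} argument
makes @code{engine} report its label addresses are assumptions of this
sketch.

```c
#include <assert.h>

/* One slot of threaded code: a label address or an immediate argument. */
typedef union { void *inst; long n; } Slot;

enum { I_lit, I_add, I_halt, I_COUNT };   /* hypothetical instruction set */

#define NEXT_P0                           /* nothing */
#define NEXT_P1                           /* nothing */
#define NEXT_P2 do { cfa = (ip++)->inst; goto *cfa; } while (0)

/* Runs the threaded code at code and returns the top-of-stack item.
   If code is null, only copies the label addresses to labels. */
long engine(const Slot *code, void *labels[])
{
    static void *const table[I_COUNT] = { &&lit, &&add, &&halt };
    long stack[16];
    long *sp = stack + 16;                /* grows towards lower addresses */
    const Slot *ip = code;
    void *cfa;
    int k;

    if (code == 0) {
        assert(labels != 0);
        for (k = 0; k < I_COUNT; k++)
            labels[k] = table[k];
        return 0;
    }
    NEXT_P2;                              /* dispatch the first instruction */

lit:
    NEXT_P0;
    *--sp = (ip++)->n;                    /* push the immediate argument */
    NEXT_P1;
    NEXT_P2;
add:
    NEXT_P0;
    sp[1] += sp[0];
    sp++;
    NEXT_P1;
    NEXT_P2;
halt:
    return sp[0];
}
```

The front end first calls @code{engine} with a null @code{code}
argument to obtain the label addresses, and then stores them in the VM
code in place of opcodes.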

The file @file{@var{name}-labels.i} is used for enumerating or listing
all virtual machine instructions and uses the following macro:

@table @samp

@item INST_ADDR(@var{inst_name})
For switch dispatch, this is just the name of the switch label (the same
name as used in @samp{LABEL(@var{inst_name})}).  For threaded-code
dispatch, this is the address of the label defined in
@samp{LABEL(@var{inst_name})}; the address is taken with @samp{&&}
(@pxref{labels-as-values}).

@end table



@section Stacks, types, and prefixes



Invocation

Input Syntax

Concepts: Front end, VM, Stacks,  Types, input stream

Contact


Required changes:
vm_...2... -> two arguments
"vm_two...2...(arg1,arg2,arg3);" -> "vm_two...2...(arg3,arg1,arg2)" (no ";").
define INST_ADDR and LABEL
define VM_IS_INST also for disassembler
