Diff for /gforth/doc/vmgen.texi between versions 1.12 and 1.29

version 1.12, 2002/08/16 09:43:49 version 1.29, 2005/10/02 11:30:34
Line 10  This manual is for Vmgen Line 10  This manual is for Vmgen
 (version @value{VERSION}, @value{UPDATED}),  (version @value{VERSION}, @value{UPDATED}),
 the virtual machine interpreter generator  the virtual machine interpreter generator
   
 Copyright @copyright{} 2002 Free Software Foundation, Inc.  Copyright @copyright{} 2002,2003,2005 Free Software Foundation, Inc.
   
 @quotation  @quotation
 Permission is granted to copy, distribute and/or modify this document  Permission is granted to copy, distribute and/or modify this document
Line 27  Software Foundation raise funds for GNU Line 27  Software Foundation raise funds for GNU
 @end quotation  @end quotation
 @end copying  @end copying
   
 @dircategory GNU programming tools  @dircategory Software development
 @direntry  @direntry
 * Vmgen: (vmgen).               Interpreter generator  * Vmgen: (vmgen).               Virtual machine interpreter generator
 @end direntry  @end direntry
   
 @titlepage  @titlepage
Line 57  Software Foundation raise funds for GNU Line 57  Software Foundation raise funds for GNU
 * Invoking Vmgen::                * Invoking Vmgen::              
 * Example::                       * Example::                     
 * Input File Format::             * Input File Format::           
   * Error messages::              reported by Vmgen
 * Using the generated code::      * Using the generated code::    
   * Hints::                       VM architecture, efficiency
   * The future::                  
 * Changes::                     from earlier versions  * Changes::                     from earlier versions
 * Contact::                     Bug reporting etc.  * Contact::                     Bug reporting etc.
 * Copying This Manual::         Manual License  * Copying This Manual::         Manual License
Line 82  Input File Format Line 85  Input File Format
 * Input File Grammar::            * Input File Grammar::          
 * Simple instructions::           * Simple instructions::         
 * Superinstructions::             * Superinstructions::           
   * Store Optimization::          
 * Register Machines::           How to define register VM instructions  * Register Machines::           How to define register VM instructions
   
   Input File Grammar
   
   * Eval escapes::                what follows \E
   
 Simple instructions  Simple instructions
   
   * Explicit stack access::       If the C code accesses a stack pointer
 * C Code Macros::               Macros recognized by Vmgen  * C Code Macros::               Macros recognized by Vmgen
 * C Code restrictions::         Vmgen makes assumptions about C code  * C Code restrictions::         Vmgen makes assumptions about C code
   * Stack growth direction::      is configurable per stack
   
 Using the generated code  Using the generated code
   
Line 98  Using the generated code Line 108  Using the generated code
 * VM disassembler::             for debugging the front end  * VM disassembler::             for debugging the front end
 * VM profiler::                 for finding worthwhile superinstructions  * VM profiler::                 for finding worthwhile superinstructions
   
   Hints
   
   * Floating point::              and stacks
   
 Copying This Manual  Copying This Manual
   
 * GNU Free Documentation License::  License for copying this manual.  * GNU Free Documentation License::  License for copying this manual.
Line 151  In this setup, Vmgen can generate most o Line 165  In this setup, Vmgen can generate most o
 machine instructions from a simple description of the virtual machine  machine instructions from a simple description of the virtual machine
 instructions (@pxref{Input File Format}), in particular:  instructions (@pxref{Input File Format}), in particular:
   
 @table @asis  @table @strong
   
 @item VM instruction execution  @item VM instruction execution
   
Line 172  Useful for optimizing the VM interpreter Line 186  Useful for optimizing the VM interpreter
   
 @end table  @end table
   
   To create parts of the interpretive system that do not deal with VM
   instructions, you have to use other tools (e.g., @command{bison}) and/or
   hand-code them.
   
 @cindex efficiency features overview  @cindex efficiency features overview
 @noindent  @noindent
 Vmgen supports efficient interpreters through various optimizations, in  Vmgen supports efficient interpreters through various optimizations, in
Line 209  offered by Vmgen. Line 227  offered by Vmgen.
   
 There are many potential uses of the instruction descriptions that are  There are many potential uses of the instruction descriptions that are
 not implemented at the moment, but we are open for feature requests, and  not implemented at the moment, but we are open for feature requests, and
 we will implement new features if someone asks for them; so the feature  we will consider new features if someone asks for them; so the feature
 list above is not exhaustive.  list above is not exhaustive.
   
 @c *********************************************************************  @c *********************************************************************
Line 300  interpreter, but some systems also suppo Line 318  interpreter, but some systems also suppo
 as an image file, or in a full-blown linkable file format (e.g., JVM).  as an image file, or in a full-blown linkable file format (e.g., JVM).
 Vmgen currently has no special support for such features, but the  Vmgen currently has no special support for such features, but the
 information in the instruction descriptions can be helpful, and we are  information in the instruction descriptions can be helpful, and we are
 open for feature requests and suggestions.  open to feature requests and suggestions.
   
 @c --------------------------------------------------------------------  @c --------------------------------------------------------------------
 @node Data handling, Dispatch, Front end and VM interpreter, Concepts  @node Data handling, Dispatch, Front end and VM interpreter, Concepts
Line 310  open for feature requests and suggestion Line 328  open for feature requests and suggestion
 @cindex register machine  @cindex register machine
 Most VMs use one or more stacks for passing temporary data between VM  Most VMs use one or more stacks for passing temporary data between VM
 instructions.  Another option is to use a register machine architecture  instructions.  Another option is to use a register machine architecture
 for the virtual machine; however, this option is either slower or  for the virtual machine; we believe that using a stack architecture is
   usually both simpler and faster.
   
   however, this option is slower or
 significantly more complex to implement than a stack machine architecture.  significantly more complex to implement than a stack machine architecture.
   
 Vmgen has special support and optimizations for stack VMs, making their  Vmgen has special support and optimizations for stack VMs, making their
Line 356  After executing one VM instruction, the Line 377  After executing one VM instruction, the
 the next VM instruction (Vmgen calls the dispatch routine @samp{NEXT}).  the next VM instruction (Vmgen calls the dispatch routine @samp{NEXT}).
 Vmgen supports two methods of dispatch:  Vmgen supports two methods of dispatch:
   
 @table @asis  @table @strong
   
 @item switch dispatch  @item switch dispatch
 @cindex switch dispatch  @cindex switch dispatch
Line 379  instruction.  Threaded code cannot be im Line 400  instruction.  Threaded code cannot be im
 be implemented using GNU C's labels-as-values extension (@pxref{Labels  be implemented using GNU C's labels-as-values extension (@pxref{Labels
 as Values, , Labels as Values, gcc.info, GNU C Manual}).  as Values, , Labels as Values, gcc.info, GNU C Manual}).
   
   @c call threading
 @end table  @end table
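
The two dispatch methods in the table above can be sketched in plain C for a toy VM. This is a hand-written illustration, not Vmgen output: the opcodes (OP_LIT, OP_ADD, OP_HALT), the functions run_switch and run_threaded, and the instruction encoding are assumptions made for this example. The second variant needs GNU C's labels-as-values extension, so it is only compiled under __GNUC__; for brevity it indexes a label table by opcode, whereas real threaded code would typically store the label addresses in the VM code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy VM: opcodes and encoding are invented for this sketch. */
enum { OP_LIT, OP_ADD, OP_HALT };

/* Switch dispatch: fetch an opcode, branch through one shared switch. */
long run_switch(const long *code)
{
    long stack[16];
    long *sp = stack + 16;          /* stack grows towards lower addresses */
    const long *ip = code;

    assert(code != NULL);
    for (;;) {
        switch (*ip++) {
        case OP_LIT:  *--sp = *ip++;         break;
        case OP_ADD:  sp[1] += sp[0]; sp++;  break;
        case OP_HALT: return sp[0];
        default:      return -1;             /* unknown opcode */
        }
    }
}

#ifdef __GNUC__
/* Labels-as-values dispatch: each instruction ends with its own indirect
   jump (NEXT) instead of returning to a shared switch.
   No opcode validation is done here. */
long run_threaded(const long *code)
{
    static void *labels[] = { &&lit, &&add, &&halt };
    long stack[16];
    long *sp = stack + 16;
    const long *ip = code;

    assert(code != NULL);
#define NEXT goto *labels[*ip++]
    NEXT;
lit:  *--sp = *ip++;         NEXT;
add:  sp[1] += sp[0]; sp++;  NEXT;
halt: return sp[0];
#undef NEXT
}
#endif
```

Both functions run the same program representation, so they can be compared directly on a program such as "OP_LIT 2, OP_LIT 3, OP_ADD, OP_HALT".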
   
 Threaded code can be twice as fast as switch dispatch, depending on the  Threaded code can be twice as fast as switch dispatch, depending on the
Line 392  interpreter, the benchmark, and the mach Line 414  interpreter, the benchmark, and the mach
 The usual way to invoke Vmgen is as follows:  The usual way to invoke Vmgen is as follows:
   
 @example  @example
 vmgen @var{infile}  vmgen @var{inputfile}
 @end example  @end example
   
 Here @var{infile} is the VM instruction description file, which usually  Here @var{inputfile} is the VM instruction description file, which
 ends in @file{.vmg}.  The output filenames are made by taking the  usually ends in @file{.vmg}.  The output filenames are made by taking
 basename of @file{infile} (i.e., the output files will be created in the  the basename of @file{inputfile} (i.e., the output files will be created
 current working directory) and replacing @file{.vmg} with @file{-vm.i},  in the current working directory) and replacing @file{.vmg} with
 @file{-disasm.i}, @file{-gen.i}, @file{-labels.i}, @file{-profile.i},  @file{-vm.i}, @file{-disasm.i}, @file{-gen.i}, @file{-labels.i},
 and @file{-peephole.i}.  E.g., @command{vmgen hack/foo.vmg} will create  @file{-profile.i}, and @file{-peephole.i}.  E.g., @command{vmgen
 @file{foo-vm.i} etc.  hack/foo.vmg} will create @file{foo-vm.i}, @file{foo-disasm.i},
   @file{foo-gen.i}, @file{foo-labels.i}, @file{foo-profile.i} and
   @file{foo-peephole.i}.
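
The naming rule described above can be illustrated with a small C helper. This is only a sketch of the rule as stated in the text, not code from Vmgen (which is not written in C); the function name vmgen_output_name is hypothetical, and the sketch assumes '/' as the directory separator and rejects input names that do not end in .vmg, a case the text does not cover.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build an output name following the rule in the text: drop the directory
   part, then replace a trailing ".vmg" with `suffix` (e.g. "-vm.i").
   Returns 0 on success, -1 if the name does not end in ".vmg" or the
   buffer is too small.  Hypothetical helper, not part of Vmgen. */
int vmgen_output_name(const char *inputfile, const char *suffix,
                      char *out, size_t outsize)
{
    const char *base;
    size_t len;
    int n;

    assert(inputfile != NULL && suffix != NULL && out != NULL);
    base = strrchr(inputfile, '/');          /* assumed separator: '/' */
    base = base ? base + 1 : inputfile;
    len = strlen(base);
    if (len < 4 || strcmp(base + len - 4, ".vmg") != 0)
        return -1;
    n = snprintf(out, outsize, "%.*s%s", (int)(len - 4), base, suffix);
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

Calling the helper with "hack/foo.vmg" and "-vm.i" yields "foo-vm.i", matching the example in the text: the directory part is dropped, so the file lands in the current working directory.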
   
 The command-line options supported by Vmgen are  The command-line options supported by Vmgen are
   
Line 563  sort -k 3 >mini-super.vmg       #sort se Line 587  sort -k 3 >mini-super.vmg       #sort se
 The file @file{peephole-blacklist} contains all instructions that  The file @file{peephole-blacklist} contains all instructions that
 directly access a stack or stack pointer (for mini: @code{call},  directly access a stack or stack pointer (for mini: @code{call},
 @code{return}); the sort step is necessary to ensure that prefixes  @code{return}); the sort step is necessary to ensure that prefixes
 preceed larger superinstructions.  precede larger superinstructions.
   
 Now you can create a version of mini with superinstructions by just  Now you can create a version of mini with superinstructions by just
 saying @samp{make}  saying @samp{make}
   
   
 @c ***************************************************************  @c ***************************************************************
 @node Input File Format, Using the generated code, Example, Top  @node Input File Format, Error messages, Example, Top
 @chapter Input File Format  @chapter Input File Format
 @cindex input file format  @cindex input file format
 @cindex format, input file  @cindex format, input file
Line 584  Most examples are taken from the example Line 608  Most examples are taken from the example
 * Input File Grammar::            * Input File Grammar::          
 * Simple instructions::           * Simple instructions::         
 * Superinstructions::             * Superinstructions::           
   * Store Optimization::          
 * Register Machines::           How to define register VM instructions  * Register Machines::           How to define register VM instructions
 @end menu  @end menu
   
Line 598  The grammar is in EBNF format, with @cod Line 623  The grammar is in EBNF format, with @cod
 of @var{c} and @code{[@var{d}]} meaning 0 or 1 repetitions of @var{d}.  of @var{c} and @code{[@var{d}]} meaning 0 or 1 repetitions of @var{d}.
   
 @cindex free-format, not  @cindex free-format, not
   @cindex newlines, significance in syntax
 Vmgen input is not free-format, so you have to take care where you put  Vmgen input is not free-format, so you have to take care where you put
 spaces and especially newlines; it's not as bad as makefiles, though:  newlines (and, in a few cases, white space).
 any sequence of spaces and tabs is equivalent to a single space.  
   
 @example  @example
 description: @{instruction|comment|eval-escape@}  description: @{instruction|comment|eval-escape|c-escape@}
   
 instruction: simple-inst|super-inst  instruction: simple-inst|super-inst
   
 simple-inst: ident ' (' stack-effect ' )' newline c-code newline newline  simple-inst: ident '(' stack-effect ')' newline c-code newline newline
   
 stack-effect: @{ident@} ' --' @{ident@}  stack-effect: @{ident@} '--' @{ident@}
   
 super-inst: ident ' =' ident @{ident@}    super-inst: ident '=' ident @{ident@}  
   
 comment:      '\ '  text newline  comment:      '\ '  text newline
   
 eval-escape:  '\e ' text newline  eval-escape:  '\E ' text newline
   
   c-escape:     '\C ' text newline
 @end example  @end example
 @c \+ \- \g \f \c  @c \+ \- \g \f \c
   
 Note that the @code{\}s in this grammar are meant literally, not as  Note that the @code{\}s in this grammar are meant literally, not as
 C-style encodings for non-printable characters.  C-style encodings for non-printable characters.
   
 The C code in @code{simple-inst} must not contain empty lines (because  There are two ways to delimit the C code in @code{simple-inst}:
 Vmgen would mistake that as the end of the simple-inst.  The text in  
 @code{comment} and @code{eval-escape} must not contain a newline.  @itemize @bullet
 @code{Ident} must conform to the usual conventions of C identifiers  
 (otherwise the C compiler would choke on the Vmgen output).  @item
   If you start it with a @samp{@{} at the start of a line (i.e., not even
   white space before it), you have to end it with a @samp{@}} at the start
   of a line (followed by a newline).  In this case you may have empty
   lines within the C code (typically used between variable definitions and
   statements).
   
   @item
   You do not start it with @samp{@{}.  Then the C code ends at the first
   empty line, so you cannot have empty lines within this code.
   
   @end itemize
   
   The text in @code{comment}, @code{eval-escape} and @code{c-escape} must
   not contain a newline.  @code{Ident} must conform to the usual
   conventions of C identifiers (otherwise the C compiler would choke on
   the Vmgen output), except that idents in @code{stack-effect} may have a
   stack prefix (for stack prefix syntax, @pxref{Eval escapes}).
   
   @cindex C escape
   @cindex @code{\C}
   @cindex conditional compilation of Vmgen output
   The @code{c-escape} passes the text through to each output file (without
   the @samp{\C}).  This is useful mainly for conditional compilation
   (i.e., you write @samp{\C #if ...} etc.).
   
   @cindex sync lines
   @cindex @code{#line}
   In addition to the syntax given in the grammar, Vmgen also processes
   sync lines (lines starting with @samp{#line}), as produced by @samp{m4
   -s} (@pxref{Invoking m4, , Invoking m4, m4.info, GNU m4}) and similar
   tools.  This allows associating C compiler error messages with the
   original source of the C code.
   
 Vmgen understands a few extensions beyond the grammar given here, but  Vmgen understands a few extensions beyond the grammar given here, but
 these extensions are only useful for building Gforth.  You can find a  these extensions are only useful for building Gforth.  You can find a
 description of the format used for Gforth in @file{prim}.  description of the format used for Gforth in @file{prim}.
   
   @menu
   * Eval escapes::                what follows \E
   @end menu
   
   @node Eval escapes,  , Input File Grammar, Input File Grammar
 @subsection Eval escapes  @subsection Eval escapes
 @cindex escape to Forth  @cindex escape to Forth
 @cindex eval escape  @cindex eval escape
   @cindex @code{\E}
   
 @c woanders?  @c woanders?
 The text in @code{eval-escape} is Forth code that is evaluated when  The text in @code{eval-escape} is Forth code that is evaluated when
 Vmgen reads the line.  If you do not know (and do not want to learn)  Vmgen reads the line.  You will normally use this feature to define
 Forth, you can build the text according to the following grammar; these  stacks and types.
 rules are normally all Forth you need for using Vmgen:  
   If you do not know (and do not want to learn) Forth, you can build the
   text according to the following grammar; these rules are normally all
   Forth you need for using Vmgen:
   
 @example  @example
 text: stack-decl|type-prefix-decl|stack-prefix-decl  text: stack-decl|type-prefix-decl|stack-prefix-decl|set-flag
   
 stack-decl: 'stack ' ident ident ident  stack-decl: 'stack ' ident ident ident
 type-prefix-decl:   type-prefix-decl: 
     's" ' string '" ' ('single'|'double') ident 'type-prefix' ident      's" ' string '" ' ('single'|'double') ident 'type-prefix' ident
 stack-prefix-decl:  ident 'stack-prefix' string  stack-prefix-decl:  ident 'stack-prefix' string
   set-flag: ('store-optimization'|'include-skipped-insts') ('on'|'off')
 @end example  @end example
   
 Note that the syntax of this code is not checked thoroughly (there are  Note that the syntax of this code is not checked thoroughly (there are
 many other Forth program fragments that could be written there).  many other Forth program fragments that could be written in an
   eval-escape).
   
   A stack prefix can contain letters, digits, or @samp{:}, and may start
   with a @samp{#}; e.g., in Gforth the return stack has the stack prefix
   @samp{R:}.  This restriction is not checked during the stack prefix
   definition, but it is enforced by the parsing rules for stack items
   later.
   
 If you know Forth, the stack effects of the non-standard words involved  If you know Forth, the stack effects of the non-standard words involved
 are:  are:
Line 661  are: Line 737  are:
 @findex single  @findex single
 @findex double  @findex double
 @findex stack-prefix  @findex stack-prefix
   @findex store-optimization
 @example  @example
 stack        ( "name" "pointer" "type" -- )  stack                 ( "name" "pointer" "type" -- )
              ( name execution: -- stack )                        ( name execution: -- stack )
 type-prefix  ( addr u xt1 xt2 n stack "prefix" -- )  type-prefix           ( addr u item-size stack "prefix" -- )
 single       ( -- xt1 xt2 n )  single                ( -- item-size )
 double       ( -- xt1 xt2 n )  double                ( -- item-size )
 stack-prefix ( stack "prefix" -- )  stack-prefix          ( stack "prefix" -- )
   store-optimization    ( -- addr )
   include-skipped-insts ( -- addr )
 @end example  @end example
   
   An @var{item-size} takes three cells on the stack.
   
 @c --------------------------------------------------------------------  @c --------------------------------------------------------------------
 @node Simple instructions, Superinstructions, Input File Grammar, Input File Format  @node Simple instructions, Superinstructions, Input File Grammar, Input File Format
Line 726  Before we can use @code{data-stack} in t Line 806  Before we can use @code{data-stack} in t
 @cindex stack basic type  @cindex stack basic type
 @cindex basic type of a stack  @cindex basic type of a stack
 @cindex type of a stack, basic  @cindex type of a stack, basic
 @cindex stack growth direction  
 This line defines the stack @code{data-stack}, which uses the stack  This line defines the stack @code{data-stack}, which uses the stack
 pointer @code{sp}, and each item has the basic type @code{Cell}; other  pointer @code{sp}, and each item has the basic type @code{Cell}; other
 types have to fit into one or two @code{Cell}s (depending on whether the  types have to fit into one or two @code{Cell}s (depending on whether the
 type is @code{single} or @code{double} wide), and are cast from and to  type is @code{single} or @code{double} wide), and are cast from and to
 Cells on accessing the @code{data-stack} with type cast macros  Cells on accessing the @code{data-stack} with type cast macros
 (@pxref{VM engine}).  Stacks grow towards lower addresses in  (@pxref{VM engine}).  By default, stacks grow towards lower addresses in
 Vmgen-erated interpreters.  Vmgen-erated interpreters (@pxref{Stack growth direction}).
   
 @cindex stack prefix  @cindex stack prefix
 @cindex prefix, stack  @cindex prefix, stack
Line 751  name.  Stack prefixes are defined like t Line 830  name.  Stack prefixes are defined like t
   
 @example  @example
 \E inst-stream stack-prefix #  \E inst-stream stack-prefix #
   \E data-stack  stack-prefix S:
 @end example  @end example
   
 This definition defines that the stack prefix @code{#} specifies the  This definition defines that the stack prefix @code{#} specifies the
Line 767  If there are multiple instruction stream Line 847  If there are multiple instruction stream
 first one (just as the intuition suggests).  first one (just as the intuition suggests).
   
 @menu  @menu
   * Explicit stack access::       If the C code accesses a stack pointer
 * C Code Macros::               Macros recognized by Vmgen  * C Code Macros::               Macros recognized by Vmgen
 * C Code restrictions::         Vmgen makes assumptions about C code  * C Code restrictions::         Vmgen makes assumptions about C code
   * Stack growth direction::      is configurable per stack
 @end menu  @end menu
   
 @c --------------------------------------------------------------------  @c --------------------------------------------------------------------
 @node C Code Macros, C Code restrictions, Simple instructions, Simple instructions  @node  Explicit stack access, C Code Macros, Simple instructions, Simple instructions
   @subsection Explicit stack access
   @cindex stack access, explicit
   @cindex Stack pointer access
   @cindex explicit stack access
   
   Not all stack effects can be specified using the stack effect
   specifications above.  For VM instructions that have other stack
   effects, you can specify them explicitly by accessing the stack
   pointer in the C code; however, you have to notify Vmgen of such
   explicit stack accesses, otherwise Vmgen's optimizations could conflict
   with your explicit stack accesses.
   
   You notify Vmgen by putting @code{...} with the appropriate stack
   prefix into the stack comment.  Then the VM instruction will first
   take the other stack items specified in the stack effect into C
   variables, then make sure that all other stack items for that stack
   are in memory, and that the stack pointer for the stack points to the
   top-of-stack (by default, unless you change the stack access
   transformation: @pxref{Stack growth direction}).
   
   The general rule is: If you mention a stack pointer in the C code of a
   VM instruction, you should put a @code{...} for that stack in the stack
   effect.
   
   Consider this example:
   
   @example
   return ( #iadjust S:... target afp i1 -- i2 )
   SET_IP(target);
   sp = (Cell *)(((char *)sp)+iadjust);
   fp = afp;
   i2=i1;
   @end example
   
   First the variables @code{target afp i1} are popped off the stack,
   then the stack pointer @code{sp} is set correctly for the new stack
   depth, then the C code changes the stack depth and does other things,
   and finally @code{i2} is pushed on the stack with the new depth.
   
   The position of the @code{...} within the stack effect does not
   matter.  You can use several @code{...}s, for different stacks, and
   also several for the same stack (that has no additional effect).  If
   you use @code{...} without a stack prefix, this specifies all the
   stacks except the instruction stream.
   
   You cannot use @code{...} for the instruction stream, but that is not
   necessary: At the start of the C code, @code{IP} points to the start
   of the next VM instruction (i.e., right beyond the end of the current
   VM instruction), and you can change the instruction pointer with
   @code{SET_IP} (@pxref{VM engine}).
   
   
   @c --------------------------------------------------------------------
   @node C Code Macros, C Code restrictions, Explicit stack access, Simple instructions
 @subsection C Code Macros  @subsection C Code Macros
 @cindex macros recognized by Vmgen  @cindex macros recognized by Vmgen
 @cindex basic block, VM level  @cindex basic block, VM level
Line 793  level, this also sets the instruction po Line 929  level, this also sets the instruction po
 This ends a basic block (for profiling), even if the instruction  This ends a basic block (for profiling), even if the instruction
 contains no @code{SET_IP}.  contains no @code{SET_IP}.
   
 @item TAIL;  @item INST_TAIL;
 @findex TAIL;  @findex INST_TAIL;
 Vmgen replaces @samp{TAIL;} with code for ending a VM instruction and  Vmgen replaces @samp{INST_TAIL;} with code for ending a VM instruction and
 dispatching the next VM instruction.  Even without a @samp{TAIL;} this  dispatching the next VM instruction.  Even without a @samp{INST_TAIL;} this
 happens automatically when control reaches the end of the C code.  If  happens automatically when control reaches the end of the C code.  If
 you want to have this in the middle of the C code, you need to use  you want to have this in the middle of the C code, you need to use
 @samp{TAIL;}.  A typical example is a conditional VM branch:  @samp{INST_TAIL;}.  A typical example is a conditional VM branch:
   
 @example  @example
 if (branch_condition) @{  if (branch_condition) @{
   SET_IP(target); TAIL;    SET_IP(target); INST_TAIL;
 @}  @}
 /* implicit tail follows here */  /* implicit tail follows here */
 @end example  @end example
   
 In this example, @samp{TAIL;} is not strictly necessary, because there  In this example, @samp{INST_TAIL;} is not strictly necessary, because there
 is another one implicitly after the if-statement, but using it improves  is another one implicitly after the if-statement, but using it improves
 branch prediction accuracy slightly and allows other optimizations.  branch prediction accuracy slightly and allows other optimizations.
   
Line 822  typical application is in conditional VM Line 958  typical application is in conditional VM
   
 @example  @example
 if (branch_condition) @{  if (branch_condition) @{
   SET_IP(target); TAIL; /* now this TAIL is necessary */    SET_IP(target); INST_TAIL; /* now this INST_TAIL is necessary */
 @}  @}
 SUPER_CONTINUE;  SUPER_CONTINUE;
 @end example  @end example
   
   @item VM_JUMP
   @findex VM_JUMP
   @code{VM_JUMP(target)} is equivalent to @code{goto *(target)}, but
   allows Vmgen to do dynamic superinstructions and replication.  You
   still need to say @code{SUPER_END}.  Also, the goto only happens at
   the end of the VM instruction, wherever the @code{VM_JUMP} appears in
   the C code.  Essentially, this just suppresses much of the ordinary
   dispatch mechanism.
   
 @end table  @end table
   
 Note that Vmgen is not smart about C-level tokenization, comments,  Note that Vmgen is not smart about C-level tokenization, comments,
 strings, or conditional compilation, so it will interpret even a  strings, or conditional compilation, so it will interpret even a
 commented-out SUPER_END as ending a basic block (or, e.g.,  commented-out SUPER_END as ending a basic block (or, e.g.,
 @samp{RETAIL;} as @samp{TAIL;}).  Conversely, Vmgen requires the literal  @samp{RESET_IP;} as @samp{SET_IP;}).  Conversely, Vmgen requires the literal
 presence of these strings; Vmgen will not see them if they are hiding in  presence of these strings; Vmgen will not see them if they are hiding in
 a C preprocessor macro.  a C preprocessor macro.
   
   
 @c --------------------------------------------------------------------  @c --------------------------------------------------------------------
 @node C Code restrictions,  , C Code Macros, Simple instructions  @node C Code restrictions, Stack growth direction, C Code Macros, Simple instructions
 @subsection C Code restrictions  @subsection C Code restrictions
 @cindex C code restrictions  @cindex C code restrictions
 @cindex restrictions on C code  @cindex restrictions on C code
Line 879  The Vmgen-erated code loads the stack it Line 1023  The Vmgen-erated code loads the stack it
 memory into variables before the user-supplied C code, and stores them  memory into variables before the user-supplied C code, and stores them
 from variables to stack-pointer-indexed memory afterwards.  If you do  from variables to stack-pointer-indexed memory afterwards.  If you do
 any writes to the stack through its stack pointer in your C code, it  any writes to the stack through its stack pointer in your C code, it
 will not affact the variables, and your write may be overwritten by the  will not affect the variables, and your write may be overwritten by the
 stores after the C code.  Similarly, a read from a stack using a stack  stores after the C code.  Similarly, a read from a stack using a stack
 pointer will not reflect computations of stack items in the same VM  pointer will not reflect computations of stack items in the same VM
 instruction.  instruction.
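
The load/C-code/store sequence described above can be modelled by hand in C. The function below is an assumption-laden sketch for a hypothetical instruction "inc ( n1 -- n2 )", not actual Vmgen output; the name model_inc and the Cell typedef are chosen for this example only. It shows why a direct write through the stack pointer is lost.

```c
#include <assert.h>
#include <stddef.h>

typedef long Cell;   /* stand-in for the stack's basic type */

/* Hand-written model of a hypothetical instruction  inc ( n1 -- n2 ). */
Cell *model_inc(Cell *sp)
{
    Cell n1, n2;

    assert(sp != NULL);
    n1 = sp[0];        /* load: stack item goes into a variable          */

    /* --- the user-supplied C code would sit here --- */
    sp[0] = 99;        /* direct write through the stack pointer ...     */
    n2 = n1 + 1;       /* ... does not affect the variable n1            */
    /* ----------------------------------------------- */

    sp[0] = n2;        /* store after the C code overwrites the 99       */
    return sp;
}
```

Starting from a stack holding 41, the result is 42: the 99 written through sp never becomes visible.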
Line 898  macros can be implemented in several way Line 1042  macros can be implemented in several way
 @samp{IP} points to the next instruction, and @samp{IPTOS} is its  @samp{IP} points to the next instruction, and @samp{IPTOS} is its
 contents.  contents.
   
   @c --------------------------------------------------------------------
   @node Stack growth direction,  , C Code restrictions, Simple instructions
   @subsection Stack growth direction
   @cindex stack growth direction
   
   @cindex @code{stack-access-transform}
   By default, the stacks grow towards lower addresses.  You can change
   this for a stack by setting the @code{stack-access-transform} field of
   the stack to an xt @code{( itemnum -- index )} that performs the
   appropriate index transformation.
   
   E.g., if you want to let @code{data-stack} grow towards higher
   addresses, with the stack pointer always pointing just beyond the
   top-of-stack, use this right after defining @code{data-stack}:
   
   @example
   \E : sp-access-transform ( itemnum -- index ) negate 1- ;
   \E ' sp-access-transform ' data-stack >body stack-access-transform !
   @end example
   
   This means that @code{sp-access-transform} will be used to generate
   indexes for accessing @code{data-stack}.  The definition of
   @code{sp-access-transform} above transforms n into -n-1, e.g., 1 into -2.
   This will access the 0th data-stack element (top-of-stack) at sp[-1],
   the 1st at sp[-2], etc., which is the typical way upward-growing
   stacks are used.  If you need a different transform and do not know
   enough Forth to program it, let me know.
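
The index arithmetic of this transform can be checked with a small C sketch. The C functions below merely mirror the Forth definition for illustration; their names and the Cell typedef are assumptions of this example, and Vmgen itself applies the transform while generating code, not at run time.

```c
#include <assert.h>
#include <stddef.h>

typedef long Cell;   /* stand-in for the stack's basic type */

/* Same arithmetic as the Forth definition  negate 1- :  n -> -n-1. */
int sp_access_transform(int itemnum)
{
    return -itemnum - 1;
}

/* Read item number `itemnum` (0 = top-of-stack) of an upward-growing
   stack whose pointer `sp` points just beyond the top-of-stack. */
Cell stack_item(const Cell *sp, int itemnum)
{
    assert(sp != NULL && itemnum >= 0);
    return sp[sp_access_transform(itemnum)];
}
```

With three items 10, 20, 30 pushed in that order and sp pointing just past the 30, item 0 is read from sp[-1] (30) and item 2 from sp[-3] (10).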
   
 @c --------------------------------------------------------------------  @c --------------------------------------------------------------------
 @node Superinstructions, Register Machines, Simple instructions, Input File Format  @node Superinstructions, Store Optimization, Simple instructions, Input File Format
 @section Superinstructions  @section Superinstructions
 @cindex superinstructions, defining  @cindex superinstructions, defining
 @cindex defining superinstructions  @cindex defining superinstructions
Line 957  accesses a stack pointer should not be u Line 1128  accesses a stack pointer should not be u
 does not check these restrictions, they just result in bugs in your  does not check these restrictions, they just result in bugs in your
 interpreter.  interpreter.
   
   @cindex include-skipped-insts
   The Vmgen flag @code{include-skipped-insts} influences superinstruction
   code generation.  Currently there is no support in the peephole
   optimizer for both variations, so leave this flag alone for now.
   
   @c -------------------------------------------------------------------
   @node  Store Optimization, Register Machines, Superinstructions, Input File Format
   @section Store Optimization
   @cindex store optimization
   @cindex optimization, stack stores
   @cindex stack stores, optimization
   @cindex eliminating stack stores
   
   This minor optimization (0.6%--0.8% reduction in executed instructions
   for Gforth) puts additional requirements on the instruction descriptions
   and is therefore disabled by default.
   
   What does it do?  Consider an instruction like
   
   @example
   dup ( n -- n n )
   @end example
   
   For simplicity, also assume that we are not caching the top-of-stack in
   a register.  Now, the C code for dup first loads @code{n} from the
   stack, and then stores it twice to the stack, once to the address
   where it came from; that store is unnecessary, but gcc does not optimize
   it away, so Vmgen can do it instead (if you turn on the store
   optimization).
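   
   The difference can be sketched in C as follows.  This is a
   hand-written approximation, not actual Vmgen output: the function
   names, the @code{Cell} type and the upward-growing stack layout (top
   of stack at @code{sp[-1]}) are assumptions made for the illustration.

```c
#include <assert.h>

typedef long Cell;  /* hypothetical stack-item type */

/* dup without the store optimization: load n, then store both output
   items, including a redundant store to the slot n was loaded from. */
static Cell *dup_plain(Cell *sp)
{
  Cell n = sp[-1];  /* load input item */
  sp += 1;          /* one more item on the stack */
  sp[-2] = n;       /* redundant: this slot already holds n */
  sp[-1] = n;       /* the new top of stack */
  return sp;
}

/* dup with the store optimization: the redundant store is left out. */
static Cell *dup_store_opt(Cell *sp)
{
  Cell n = sp[-1];
  sp += 1;
  sp[-1] = n;       /* only the new item is stored */
  return sp;
}
```

   Both variants leave the stack in the same state; the second one just
   performs one store less.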
   
   Vmgen uses the stack item's name to determine if the stack item contains
   the same value as it did at the start.  Therefore, if you use the store
   optimization, you have to ensure that stack items that have the same
   name on input and output also have the same value, and are not changed
   in the C code you supply.  I.e., the following code could fail if you
   turn on the store optimization:
   
   @example
   add1 ( n -- n )
   n++;
   @end example
   
   Instead, you have to use different names, i.e.:
   
   @example
   add1 ( n1 -- n2 )
   n2=n1+1;
   @end example
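   
   To see why the first description can fail, here is a hand-written C
   sketch of what the two @code{add1} descriptions above might turn into
   when the store optimization is on.  It is not actual Vmgen output; the
   function names, the @code{Cell} type and the stack layout (top of
   stack at @code{sp[-1]}) are assumptions.

```c
#include <assert.h>

typedef long Cell;  /* hypothetical stack-item type */

/* add1 ( n -- n ) with the store optimization: input and output item
   share a name, so the store back to the stack is assumed to be
   unnecessary and is omitted; the increment is lost. */
static Cell *add1_same_name(Cell *sp)
{
  Cell n = sp[-1];
  n++;        /* user-supplied C code */
  (void)n;    /* no store follows: the stack still holds the old value */
  return sp;
}

/* add1 ( n1 -- n2 ): the names differ, so n2 is stored. */
static Cell *add1_distinct_names(Cell *sp)
{
  Cell n1 = sp[-1];
  Cell n2;
  n2 = n1 + 1;  /* user-supplied C code */
  sp[-1] = n2;  /* output item is written back */
  return sp;
}
```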
   
   Similarly, the store optimization assumes that the stack pointer is only
   changed by Vmgen-erated code.  If your C code changes the stack pointer,
   use different names in input and output stack items to avoid a (probably
   wrong) store optimization, or turn the store optimization off for this
   VM instruction.
   
   To turn on the store optimization, write
   
   @example
   \E store-optimization on
   @end example
   
   at the start of the file.  You can turn this optimization on or off
   between any two VM instruction descriptions.  For turning it off again,
   you can use
   
   @example
   \E store-optimization off
   @end example
   
 @c -------------------------------------------------------------------  @c -------------------------------------------------------------------
 @node Register Machines,  , Superinstructions, Input File Format  @node Register Machines,  , Store Optimization, Input File Format
 @section Register Machines  @section Register Machines
 @cindex Register VM  @cindex Register VM
 @cindex Superinstructions for register VMs  @cindex Superinstructions for register VMs
Line 1013  VM interpreters.  However, if you have i Line 1253  VM interpreters.  However, if you have i
 direction, please let me know (@pxref{Contact}).  direction, please let me know (@pxref{Contact}).
   
 @c ********************************************************************  @c ********************************************************************
 @node Using the generated code, Changes, Input File Format, Top  @node Error messages, Using the generated code, Input File Format, Top
   @chapter Error messages
   @cindex error messages
   
   These error messages are created by Vmgen:
   
   @table @code
   
   @cindex @code{# can only be on the input side} error
   @item # can only be on the input side
   You have used an instruction-stream prefix (usually @samp{#}) after the
   @samp{--} (the output side); you can only use it before (the input
   side).
   
   @cindex @code{prefix for this combination must be defined earlier} error
   @item the prefix for this superinstruction must be defined earlier
   You have defined a superinstruction (e.g. @code{abc = a b c}) without
   defining its direct prefix (e.g., @code{ab = a b}),
   @xref{Superinstructions}.
   
   @cindex @code{sync line syntax} error
   @item sync line syntax
   If you are using a preprocessor (e.g., @command{m4}) to generate Vmgen
   input code, you may want to create @code{#line} directives (aka sync
   lines).  This error indicates that such a line is not in the syntax
   expected by Vmgen (this should not happen; please report the offending
   line in a bug report).
   
   @cindex @code{syntax error, wrong char} error
   @item syntax error, wrong char
   A syntax error.  If you do not see right away where the error is, it may
   be helpful to check the following: Did you put an empty line in a VM
   instruction where the C code is not delimited by braces (then the empty
   line ends the VM instruction)?  If you used brace-delimited C code, did
   you put the delimiting braces (and only those) at the start of the line,
   without preceding white space?  Did you forget a delimiting brace?
   
   @cindex @code{too many stacks} error
   @item too many stacks
   Vmgen currently supports 3 stacks (plus the instruction stream); if you
   need more, let us know.
   
   @cindex @code{unknown prefix} error
   @item unknown prefix
   The stack item does not match any defined type prefix (after stripping
   away any stack prefix).  You should either declare the type prefix you
   want for that stack item, or use a different type prefix.
   
   @cindex @code{unknown primitive} error
   @item unknown primitive
   You have used the name of a simple VM instruction in a superinstruction
   definition without defining the simple VM instruction first.
   
   @end table
   
   In addition, the C compiler can produce errors due to code produced by
   Vmgen; e.g., you need to define type cast functions.
   
   @c ********************************************************************
   @node Using the generated code, Hints, Error messages, Top
 @chapter Using the generated code  @chapter Using the generated code
 @cindex generated code, usage  @cindex generated code, usage
 @cindex Using vmgen-erated code  @cindex Using vmgen-erated code
   
 The easiest way to create a working VM interpreter with Vmgen is  The easiest way to create a working VM interpreter with Vmgen is
 probably to start with @file{vmgen-ex}, and modify it for your purposes.  probably to start with @file{vmgen-ex}, and modify it for your purposes.
 This chapter is just the reference manual for the macros etc. used by  This chapter explains what the various wrapper and generated files do.
 the generated code, the other context expected by the generated code,  It also contains reference-manual style descriptions of the macros,
 and what you can do with the various generated files.  variables etc. used by the generated code, and you can skip that on
   first reading.
   
 @menu  @menu
 * VM engine::                   Executing VM code  * VM engine::                   Executing VM code
Line 1059  In our example the engine function also Line 1359  In our example the engine function also
 @file{@var{name}-labels.i} (@pxref{VM instruction table}).  @file{@var{name}-labels.i} (@pxref{VM instruction table}).
   
 @cindex tracing VM code  @cindex tracing VM code
   @cindex superinstructions and tracing
 In addition to executing the code, the VM engine can optionally also  In addition to executing the code, the VM engine can optionally also
 print out a trace of the executed instructions, their arguments and  print out a trace of the executed instructions, their arguments and
 results.  For superinstructions it prints the trace as if only component  results.  For superinstructions it prints the trace as if only component
Line 1080  The following macros and variables are u Line 1381  The following macros and variables are u
 @item LABEL(@var{inst_name})  @item LABEL(@var{inst_name})
 This is used just before each VM instruction to provide a jump or  This is used just before each VM instruction to provide a jump or
 @code{switch} label (the @samp{:} is provided by Vmgen).  For switch  @code{switch} label (the @samp{:} is provided by Vmgen).  For switch
 dispatch this should expand to @samp{case @var{label}}; for  dispatch this should expand to @samp{case @var{label}:}; for
 threaded-code dispatch this should just expand to @samp{@var{label}}.  threaded-code dispatch this should just expand to @samp{@var{label}:}.
 In either case @var{label} is usually the @var{inst_name} with some  In either case @var{label} is usually the @var{inst_name} with some
 prefix or suffix to avoid naming conflicts.  prefix or suffix to avoid naming conflicts.
   
Line 1093  should expand to nothing. Line 1394  should expand to nothing.
 @findex NAME  @findex NAME
 @item NAME(@var{inst_name_string})  @item NAME(@var{inst_name_string})
 Called on entering a VM instruction with a string containing the name of  Called on entering a VM instruction with a string containing the name of
 the VM instruction as parameter.  In normal execution this should be a  the VM instruction as parameter.  In normal execution this should
 noop, but for tracing this usually prints the name, and possibly other  expand to nothing, but for tracing this usually prints the name, and
 information (several VM registers in our example).  possibly other information (several VM registers in our example).
   
 @findex DEF_CA  @findex DEF_CA
 @item DEF_CA  @item DEF_CA
Line 1114  different ways for best performance on v Line 1415  different ways for best performance on v
 @samp{NEXT_P0} is invoked right at the start of the VM instruction (but  @samp{NEXT_P0} is invoked right at the start of the VM instruction (but
 after @samp{DEF_CA}), @samp{NEXT_P1} right after the user-supplied C  after @samp{DEF_CA}), @samp{NEXT_P1} right after the user-supplied C
 code, and @samp{NEXT_P2} at the end.  The actual jump has to be  code, and @samp{NEXT_P2} at the end.  The actual jump has to be
 performed by @samp{NEXT_P2}.  performed by @samp{NEXT_P2} (if you did it earlier, important parts
   of the VM instruction would not be executed).
   
 The simplest variant is if @samp{NEXT_P2} does everything and the other  The simplest variant is if @samp{NEXT_P2} does everything and the other
 macros do nothing.  Then also related macros like @samp{IP},  macros do nothing.  Then also related macros like @samp{IP},
Line 1189  type.  For @samp{inst-stream}, the name Line 1491  type.  For @samp{inst-stream}, the name
 plain r-value; typically it is a macro that abstracts away the  plain r-value; typically it is a macro that abstracts away the
 differences between the various implementations of @code{NEXT_P*}.  differences between the various implementations of @code{NEXT_P*}.
   
   @cindex IMM_ARG
   @findex IMM_ARG
   @item IMM_ARG(access,value)
   Define this to expand to ``(access)''.  This is just a placeholder for
   future extensions.
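   
   A minimal definition consistent with this description might look like
   the following sketch.  Only the macro itself follows from the text
   above; the helper @code{imm_arg_demo} and the way the second argument
   is filled in are hypothetical and serve only to exercise the macro.

```c
#include <assert.h>

/* Expands to the access expression; the second argument is ignored. */
#define IMM_ARG(access, value) (access)

/* Hypothetical helper: fetch an immediate argument from an
   instruction stream through IMM_ARG. */
static long imm_arg_demo(const long *ip)
{
  return IMM_ARG(ip[0], 0);
}
```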
   
 @cindex top of stack caching  @cindex top of stack caching
 @cindex stack caching  @cindex stack caching
 @cindex TOS  @cindex TOS
Line 1218  profiling. Line 1526  profiling.
 @item SUPER_CONTINUE  @item SUPER_CONTINUE
 This is just a hint to Vmgen and does nothing at the C level.  This is just a hint to Vmgen and does nothing at the C level.
   
   @findex MAYBE_UNUSED
   @item MAYBE_UNUSED
   This should be defined as @code{__attribute__((unused))} for gcc-2.7 and
   higher.  It suppresses the warnings about unused variables in the code
   for superinstructions.  You need to define this only if you are using
   superinstructions.
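   
   One possible way to write this definition so that it also compiles
   with other compilers is sketched below.  The version test and the
   empty fallback are assumptions of this sketch, and
   @code{maybe_unused_demo} is a hypothetical function that only shows
   where the macro goes in a declaration.

```c
#include <assert.h>

/* Use the gcc attribute where available (gcc-2.7 and higher),
   otherwise expand to nothing. */
#if defined(__GNUC__) && \
    (__GNUC__ > 2 || (__GNUC__ == 2 && __GNUC_MINOR__ >= 7))
#define MAYBE_UNUSED __attribute__((unused))
#else
#define MAYBE_UNUSED
#endif

/* Hypothetical example: scratch is declared but never used; with the
   attribute, gcc does not warn about it. */
static int maybe_unused_demo(int x)
{
  MAYBE_UNUSED int scratch;
  return x + 1;
}
```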
   
 @findex VM_DEBUG  @findex VM_DEBUG
 @item VM_DEBUG  @item VM_DEBUG
 If this is defined, the tracing code will be compiled in (slower  If this is defined, the tracing code will be compiled in (slower
Line 1395  instruction instead of laying down @code Line 1710  instruction instead of laying down @code
   
 The code for peephole optimization is in @file{vmgen-ex/peephole.c}.  The code for peephole optimization is in @file{vmgen-ex/peephole.c}.
 You can use this file almost verbatim.  Vmgen generates  You can use this file almost verbatim.  Vmgen generates
 @file{@var{file}-peephole.i} which contains data for the peephoile  @file{@var{file}-peephole.i} which contains data for the peephole
 optimizer.  optimizer.
   
 @findex init_peeptable  @findex init_peeptable
Line 1541  it uses variables and functions defined Line 1856  it uses variables and functions defined
 plus @code{VM_IS_INST} already defined for the VM disassembler  plus @code{VM_IS_INST} already defined for the VM disassembler
 (@pxref{VM disassembler}).  (@pxref{VM disassembler}).
   
   @c **********************************************************
   @node Hints, The future, Using the generated code, Top
   @chapter Hints
   @cindex hints
   
   @menu
   * Floating point::              and stacks
   @end menu
   
   @c --------------------------------------------------------------------
   @node Floating point,  , Hints, Hints
   @section Floating point
   
   How should you deal with floating point values?  Should you use the same
   stack as for integers/pointers, or a different one?  This section
   discusses this issue with a view to execution speed.
   
   The simpler approach is to use a separate floating-point stack.  This
   allows you to choose FP value size without considering the size of the
   integers/pointers, and you avoid a number of performance problems.  The
   main downside is that this needs an FP stack pointer (and that may not
   fit in the register file on the 386 architecture, costing some
   performance, but comparatively little if you take the other option into
   account).  If you use a separate FP stack (with stack pointer @code{fp}),
   using an fpTOS is helpful on most machines, but some spill the fpTOS
   register into memory, and fpTOS should not be used there.
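   
   The separate-stack layout can be sketched in C as follows.  This is an
   illustration under assumptions, not code from Vmgen or
   @file{vmgen-ex}: the types @code{Cell} and @code{Float}, the
   @code{VMState} structure and the two instruction functions are
   hypothetical, both stacks are assumed to grow upward, and no
   top-of-stack caching is shown.

```c
#include <assert.h>

typedef long   Cell;   /* hypothetical integer/pointer item type */
typedef double Float;  /* hypothetical FP item type, sized independently */

/* Hypothetical VM state with one stack pointer per stack. */
typedef struct {
  Cell  *sp;  /* data stack pointer, points just past the top item */
  Float *fp;  /* FP stack pointer, points just past the top item */
} VMState;

/* Integer add ( n1 n2 -- n ): touches only the data stack. */
static void vm_add(VMState *vm)
{
  Cell n2 = vm->sp[-1];
  Cell n1 = vm->sp[-2];
  vm->sp -= 1;
  vm->sp[-1] = n1 + n2;
}

/* FP add ( r1 r2 -- r ): touches only the FP stack. */
static void vm_fadd(VMState *vm)
{
  Float r2 = vm->fp[-1];
  Float r1 = vm->fp[-2];
  vm->fp -= 1;
  vm->fp[-1] = r1 + r2;
}
```

   Because each instruction reads and writes items of one type only, no
   value has to move between integer and FP registers.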
   
   The other approach is to share one stack (pointed to by, say, @code{sp})
   between integer/pointer and floating-point values.  This is ok if you do
   not use @code{spTOS}.  If you do use @code{spTOS}, the compiler has to
   decide whether to put that variable into an integer or a floating point
   register, and the other type of operation becomes quite expensive on
   most machines (because moving values between integer and FP registers is
   quite expensive).  If a value of one type has to be synthesized out of
   two values of the other type (@code{double} types), things are even more
   interesting.
   
   One way around this problem would be to not use the @code{spTOS}
   supported by Vmgen, but to use explicit top-of-stack variables (one for
   integers, one for FP values), and to have a kind of accumulator+stack
   architecture (e.g., Ocaml bytecode uses this approach); however, this is
   a major change, and its ramifications are not completely clear.
   
   @c **********************************************************
   @node The future, Changes, Hints, Top
   @chapter The future
   @cindex future ideas
   
   We have a number of ideas for future versions of Vmgen.  However, there
   are so many possible things to do that we would like some feedback from
   you.  What are you doing with Vmgen, what features are you missing, and
   why?
   
   One idea we are thinking about is to generate just one @file{.c} file
   instead of letting you copy and adapt all the wrapper files (you would
   still have to define stuff like the type-specific macros, and stack
   pointers etc. somewhere).  The advantage would be that, if we change the
   wrapper files between versions, you would not need to integrate your
   changes and our changes to them; Vmgen would also be easier to use for
   beginners.  The main disadvantage of that is that it would reduce the
   flexibility of Vmgen a little (well, those who like flexibility could
   still patch the resulting @file{.c} file, like they are now doing for
   the wrapper files).  In any case, if you are doing things to the wrapper
   files that would cause problems in a generated-@file{.c}-file approach,
   please let us know.
   
 @c **********************************************************  @c **********************************************************
 @node Changes, Contact, Using the generated code, Top  @node Changes, Contact, The future, Top
 @chapter Changes  @chapter Changes
 @cindex Changes from old versions  @cindex Changes from old versions
   
   User-visible changes between 0.5.9-20020822 and 0.5.9-20020901:
   
   The store optimization is now disabled by default, but can be enabled by
   the user (@pxref{Store Optimization}).  Documentation for this
   optimization is also new.
   
   User-visible changes between 0.5.9-20010501 and 0.5.9-20020822:
   
   There is now a manual (in info, HTML, PostScript, or plain text format).
   
   There is the vmgen-ex2 variant of the vmgen-ex example; the new
   variant uses a union type instead of lots of casting.
   
   Both variants of the example can now be compiled with an ANSI C compiler
   (using switch dispatch and losing quite a bit of performance); tested
   with @command{lcc}.
   
 Users of the gforth-0.5.9-20010501 version of Vmgen need to change  Users of the gforth-0.5.9-20010501 version of Vmgen need to change
 several things in their source code to use the current version.  I  several things in their source code to use the current version.  I
 recommend keeping the gforth-0.5.9-20010501 version until you have  recommend keeping the gforth-0.5.9-20010501 version until you have
Line 1558  The required changes are: Line 1955  The required changes are:
   
 @table @code  @table @code
   
   @cindex @code{TAIL;}, changes
   @item TAIL;
   has been renamed to @code{INST_TAIL;} (less chance of an accidental
   match).
   
 @cindex @code{vm_@var{A}2@var{B}}, changes  @cindex @code{vm_@var{A}2@var{B}}, changes
 @item vm_@var{A}2@var{B}  @item vm_@var{A}2@var{B}
 now takes two arguments.  now takes two arguments.
Line 1576  Also some new macros have to be defined, Line 1978  Also some new macros have to be defined,
 @node Contact, Copying This Manual, Changes, Top  @node Contact, Copying This Manual, Changes, Top
 @chapter Contact  @chapter Contact
   
   To report a bug, use
   @url{https://savannah.gnu.org/bugs/?func=addbug&group_id=2672}.
   
   For discussion on Vmgen (e.g., how to use it), use the mailing list
   @email{bug-vmgen@@mail.freesoftware.fsf.org} (use
   @url{http://mail.gnu.org/mailman/listinfo/help-vmgen} to subscribe).
   
   You can find vmgen information at
   @url{http://www.complang.tuwien.ac.at/anton/vmgen/}.
   
 @c ***********************************************************  @c ***********************************************************
 @node Copying This Manual, Index, Contact, Top  @node Copying This Manual, Index, Contact, Top
 @appendix Copying This Manual  @appendix Copying This Manual
