Diff for /gforth/doc/vmgen.texi between versions 1.12 and 1.13

version 1.12, 2002/08/16 09:43:49 version 1.13, 2002/08/19 07:38:16
Line 57  Software Foundation raise funds for GNU Line 57  Software Foundation raise funds for GNU
 * Invoking Vmgen::                * Invoking Vmgen::              
 * Example::                       * Example::                     
 * Input File Format::             * Input File Format::           
   * Error messages::              reported by Vmgen
 * Using the generated code::      * Using the generated code::    
   * Hints::                       VM architecture, efficiency
   * The future::                  
 * Changes::                     from earlier versions  * Changes::                     from earlier versions
 * Contact::                     Bug reporting etc.  * Contact::                     Bug reporting etc.
 * Copying This Manual::         Manual License  * Copying This Manual::         Manual License
Line 98  Using the generated code Line 101  Using the generated code
 * VM disassembler::             for debugging the front end  * VM disassembler::             for debugging the front end
 * VM profiler::                 for finding worthwhile superinstructions  * VM profiler::                 for finding worthwhile superinstructions
   
   Hints
   
   * Floating point::              and stacks
   
 Copying This Manual  Copying This Manual
   
 * GNU Free Documentation License::  License for copying this manual.  * GNU Free Documentation License::  License for copying this manual.
Line 151  In this setup, Vmgen can generate most o Line 158  In this setup, Vmgen can generate most o
 machine instructions from a simple description of the virtual machine  machine instructions from a simple description of the virtual machine
 instructions (@pxref{Input File Format}), in particular:  instructions (@pxref{Input File Format}), in particular:
   
 @table @asis  @table @strong
   
 @item VM instruction execution  @item VM instruction execution
   
Line 172  Useful for optimizing the VM interpreter Line 179  Useful for optimizing the VM interpreter
   
 @end table  @end table
   
   To create parts of the interpretive system that do not deal with VM
   instructions, you have to use other tools (e.g., @command{bison}) and/or
   hand-code them.
   
 @cindex efficiency features overview  @cindex efficiency features overview
 @noindent  @noindent
 Vmgen supports efficient interpreters through various optimizations, in  Vmgen supports efficient interpreters through various optimizations, in
Line 209  offered by Vmgen. Line 220  offered by Vmgen.
   
 There are many potential uses of the instruction descriptions that are  There are many potential uses of the instruction descriptions that are
 not implemented at the moment, but we are open for feature requests, and  not implemented at the moment, but we are open for feature requests, and
 we will implement new features if someone asks for them; so the feature  we will consider new features if someone asks for them; so the feature
 list above is not exhaustive.  list above is not exhaustive.
   
 @c *********************************************************************  @c *********************************************************************
Line 300  interpreter, but some systems also suppo Line 311  interpreter, but some systems also suppo
 as an image file, or in a full-blown linkable file format (e.g., JVM).  as an image file, or in a full-blown linkable file format (e.g., JVM).
 Vmgen currently has no special support for such features, but the  Vmgen currently has no special support for such features, but the
 information in the instruction descriptions can be helpful, and we are  information in the instruction descriptions can be helpful, and we are
 open for feature requests and suggestions.  open to feature requests and suggestions.
   
 @c --------------------------------------------------------------------  @c --------------------------------------------------------------------
 @node Data handling, Dispatch, Front end and VM interpreter, Concepts  @node Data handling, Dispatch, Front end and VM interpreter, Concepts
Line 310  open for feature requests and suggestion Line 321  open for feature requests and suggestion
 @cindex register machine  @cindex register machine
 Most VMs use one or more stacks for passing temporary data between VM  Most VMs use one or more stacks for passing temporary data between VM
 instructions.  Another option is to use a register machine architecture  instructions.  Another option is to use a register machine architecture
 for the virtual machine; however, this option is either slower or  for the virtual machine; we believe that using a stack architecture is
   usually both simpler and faster.
   
   however, this option is slower or
 significantly more complex to implement than a stack machine architecture.  significantly more complex to implement than a stack machine architecture.
   
 Vmgen has special support and optimizations for stack VMs, making their  Vmgen has special support and optimizations for stack VMs, making their
Line 356  After executing one VM instruction, the Line 370  After executing one VM instruction, the
 the next VM instruction (Vmgen calls the dispatch routine @samp{NEXT}).  the next VM instruction (Vmgen calls the dispatch routine @samp{NEXT}).
 Vmgen supports two methods of dispatch:  Vmgen supports two methods of dispatch:
   
 @table @asis  @table @strong
   
 @item switch dispatch  @item switch dispatch
 @cindex switch dispatch  @cindex switch dispatch
Line 379  instruction.  Threaded code cannot be im Line 393  instruction.  Threaded code cannot be im
 be implemented using GNU C's labels-as-values extension (@pxref{Labels  be implemented using GNU C's labels-as-values extension (@pxref{Labels
 as Values, , Labels as Values, gcc.info, GNU C Manual}).  as Values, , Labels as Values, gcc.info, GNU C Manual}).
   
   @c call threading
 @end table  @end table
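The two dispatch methods can be sketched in C as follows.  This is a hand-written illustration, not Vmgen output: the three-instruction VM, the names `I_PUSH1`, `I_ADD`, `I_HALT`, `run_switch` and `run_threaded` are all assumptions made for the example.  The threaded variant looks up label addresses in a table instead of storing them directly in the VM code, which keeps the sketch short; it needs GNU C's labels-as-values extension and is therefore guarded by `__GNUC__`.

```c
/* Hypothetical three-instruction VM, for illustration only. */
enum { I_PUSH1, I_ADD, I_HALT };

/* Switch dispatch: every instruction returns to the top of the loop,
   where a single switch selects the next instruction. */
static long run_switch(const int *ip)
{
    long stack[16], *sp = stack;
    for (;;) {
        switch (*ip++) {
        case I_PUSH1: *sp++ = 1; break;
        case I_ADD:   sp--; sp[-1] += sp[0]; break;
        case I_HALT:  return sp[-1];
        default:      return -1;   /* unknown opcode */
        }
    }
}

#ifdef __GNUC__
/* Threaded-style dispatch: each instruction ends with its own indirect
   jump to the next instruction (GNU C labels as values). */
static long run_threaded(const int *ip)
{
    static void *labels[] = { &&push1, &&add, &&halt };
    long stack[16], *sp = stack;
#define NEXT goto *labels[*ip++]
    NEXT;
push1: *sp++ = 1; NEXT;
add:   sp--; sp[-1] += sp[0]; NEXT;
halt:  return sp[-1];
#undef NEXT
}
#endif
```

With the program `I_PUSH1, I_PUSH1, I_ADD, I_HALT`, both functions leave the sum of the two pushed values on top of the stack and return it.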
   
 Threaded code can be twice as fast as switch dispatch, depending on the  Threaded code can be twice as fast as switch dispatch, depending on the
Line 392  interpreter, the benchmark, and the mach Line 407  interpreter, the benchmark, and the mach
 The usual way to invoke Vmgen is as follows:  The usual way to invoke Vmgen is as follows:
   
 @example  @example
 vmgen @var{infile}  vmgen @var{inputfile}
 @end example  @end example
   
 Here @var{infile} is the VM instruction description file, which usually  Here @var{inputfile} is the VM instruction description file, which
 ends in @file{.vmg}.  The output filenames are made by taking the  usually ends in @file{.vmg}.  The output filenames are made by taking
 basename of @file{infile} (i.e., the output files will be created in the  the basename of @file{inputfile} (i.e., the output files will be created
 current working directory) and replacing @file{.vmg} with @file{-vm.i},  in the current working directory) and replacing @file{.vmg} with
 @file{-disasm.i}, @file{-gen.i}, @file{-labels.i}, @file{-profile.i},  @file{-vm.i}, @file{-disasm.i}, @file{-gen.i}, @file{-labels.i},
 and @file{-peephole.i}.  E.g., @command{vmgen hack/foo.vmg} will create  @file{-profile.i}, and @file{-peephole.i}.  E.g., @command{vmgen
 @file{foo-vm.i} etc.  hack/foo.vmg} will create @file{foo-vm.i}, @file{foo-disasm.i},
   @file{foo-gen.i}, @file{foo-labels.i}, @file{foo-profile.i} and
   @file{foo-peephole.i}.
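The naming rule above can be sketched as a small C helper.  The function `vmgen_output_name` is hypothetical and not part of Vmgen; it only mirrors the rule described here (drop the directory part, replace `.vmg` with the given suffix).  What Vmgen does with an input name that does not end in `.vmg` is not stated above; this sketch simply appends the suffix in that case.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds an output name such as "foo-vm.i" from
   "hack/foo.vmg" and the suffix "-vm.i".  Returns 0 on success and -1
   if the result does not fit into buf. */
static int vmgen_output_name(const char *infile, const char *suffix,
                             char *buf, size_t bufsize)
{
    const char *base = strrchr(infile, '/');
    size_t len;
    int n;

    base = base ? base + 1 : infile;    /* drop the directory part */
    len = strlen(base);
    if (len >= 4 && strcmp(base + len - 4, ".vmg") == 0)
        len -= 4;                       /* drop the .vmg extension */
    n = snprintf(buf, bufsize, "%.*s%s", (int)len, base, suffix);
    return (n >= 0 && (size_t)n < bufsize) ? 0 : -1;
}
```

Calling it once per suffix (`-vm.i`, `-disasm.i`, `-gen.i`, `-labels.i`, `-profile.i`, `-peephole.i`) yields the six file names listed above.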
   
 The command-line options supported by Vmgen are  The command-line options supported by Vmgen are
   
Line 563  sort -k 3 >mini-super.vmg       #sort se Line 580  sort -k 3 >mini-super.vmg       #sort se
 The file @file{peephole-blacklist} contains all instructions that  The file @file{peephole-blacklist} contains all instructions that
 directly access a stack or stack pointer (for mini: @code{call},  directly access a stack or stack pointer (for mini: @code{call},
 @code{return}); the sort step is necessary to ensure that prefixes  @code{return}); the sort step is necessary to ensure that prefixes
 preceed larger superinstructions.  precede larger superinstructions.
   
 Now you can create a version of mini with superinstructions by just  Now you can create a version of mini with superinstructions by just
 saying @samp{make}  saying @samp{make}
   
   
 @c ***************************************************************  @c ***************************************************************
 @node Input File Format, Using the generated code, Example, Top  @node Input File Format, Error messages, Example, Top
 @chapter Input File Format  @chapter Input File Format
 @cindex input file format  @cindex input file format
 @cindex format, input file  @cindex format, input file
Line 615  super-inst: ident ' =' ident @{ident@} Line 632  super-inst: ident ' =' ident @{ident@}
   
 comment:      '\ '  text newline  comment:      '\ '  text newline
   
 eval-escape:  '\e ' text newline  eval-escape:  '\E ' text newline
 @end example  @end example
 @c \+ \- \g \f \c  @c \+ \- \g \f \c
   
Line 636  description of the format used for Gfort Line 653  description of the format used for Gfort
 @cindex escape to Forth  @cindex escape to Forth
 @cindex eval escape  @cindex eval escape
   
   
   
 @c woanders?  @c woanders?
 The text in @code{eval-escape} is Forth code that is evaluated when  The text in @code{eval-escape} is Forth code that is evaluated when
 Vmgen reads the line.  If you do not know (and do not want to learn)  Vmgen reads the line.  You will normally use this feature to define
 Forth, you can build the text according to the following grammar; these  stacks and types.
 rules are normally all Forth you need for using Vmgen:  
   If you do not know (and do not want to learn) Forth, you can build the
   text according to the following grammar; these rules are normally all
   Forth you need for using Vmgen:
   
 @example  @example
 text: stack-decl|type-prefix-decl|stack-prefix-decl  text: stack-decl|type-prefix-decl|stack-prefix-decl
Line 652  stack-prefix-decl:  ident 'stack-prefix' Line 674  stack-prefix-decl:  ident 'stack-prefix'
 @end example  @end example
   
 Note that the syntax of this code is not checked thoroughly (there are  Note that the syntax of this code is not checked thoroughly (there are
 many other Forth program fragments that could be written there).  many other Forth program fragments that could be written in an
   eval-escape).
   
 If you know Forth, the stack effects of the non-standard words involved  If you know Forth, the stack effects of the non-standard words involved
 are:  are:
Line 793  level, this also sets the instruction po Line 816  level, this also sets the instruction po
 This ends a basic block (for profiling), even if the instruction  This ends a basic block (for profiling), even if the instruction
 contains no @code{SET_IP}.  contains no @code{SET_IP}.
   
 @item TAIL;  @item INST_TAIL;
 @findex TAIL;  @findex INST_TAIL;
 Vmgen replaces @samp{TAIL;} with code for ending a VM instruction and  Vmgen replaces @samp{INST_TAIL;} with code for ending a VM instruction and
 dispatching the next VM instruction.  Even without a @samp{TAIL;} this  dispatching the next VM instruction.  Even without a @samp{INST_TAIL;} this
 happens automatically when control reaches the end of the C code.  If  happens automatically when control reaches the end of the C code.  If
 you want to have this in the middle of the C code, you need to use  you want to have this in the middle of the C code, you need to use
 @samp{TAIL;}.  A typical example is a conditional VM branch:  @samp{INST_TAIL;}.  A typical example is a conditional VM branch:
   
 @example  @example
 if (branch_condition) @{  if (branch_condition) @{
   SET_IP(target); TAIL;    SET_IP(target); INST_TAIL;
 @}  @}
 /* implicit tail follows here */  /* implicit tail follows here */
 @end example  @end example
   
 In this example, @samp{TAIL;} is not strictly necessary, because there  In this example, @samp{INST_TAIL;} is not strictly necessary, because there
 is another one implicitly after the if-statement, but using it improves  is another one implicitly after the if-statement, but using it improves
 branch prediction accuracy slightly and allows other optimizations.  branch prediction accuracy slightly and allows other optimizations.
   
Line 822  typical application is in conditional VM Line 845  typical application is in conditional VM
   
 @example  @example
 if (branch_condition) @{  if (branch_condition) @{
   SET_IP(target); TAIL; /* now this TAIL is necessary */    SET_IP(target); INST_TAIL; /* now this INST_TAIL is necessary */
 @}  @}
 SUPER_CONTINUE;  SUPER_CONTINUE;
 @end example  @end example
Line 832  SUPER_CONTINUE; Line 855  SUPER_CONTINUE;
 Note that Vmgen is not smart about C-level tokenization, comments,  Note that Vmgen is not smart about C-level tokenization, comments,
 strings, or conditional compilation, so it will interpret even a  strings, or conditional compilation, so it will interpret even a
 commented-out SUPER_END as ending a basic block (or, e.g.,  commented-out SUPER_END as ending a basic block (or, e.g.,
 @samp{RETAIL;} as @samp{TAIL;}).  Conversely, Vmgen requires the literal  @samp{RESET_IP;} as @samp{SET_IP;}).  Conversely, Vmgen requires the literal
 presence of these strings; Vmgen will not see them if they are hiding in  presence of these strings; Vmgen will not see them if they are hiding in
 a C preprocessor macro.  a C preprocessor macro.
   
Line 879  The Vmgen-erated code loads the stack it Line 902  The Vmgen-erated code loads the stack it
 memory into variables before the user-supplied C code, and stores them  memory into variables before the user-supplied C code, and stores them
 from variables to stack-pointer-indexed memory afterwards.  If you do  from variables to stack-pointer-indexed memory afterwards.  If you do
 any writes to the stack through its stack pointer in your C code, it  any writes to the stack through its stack pointer in your C code, it
 will not affact the variables, and your write may be overwritten by the  will not affect the variables, and your write may be overwritten by the
 stores after the C code.  Similarly, a read from a stack using a stack  stores after the C code.  Similarly, a read from a stack using a stack
 pointer will not reflect computations of stack items in the same VM  pointer will not reflect computations of stack items in the same VM
 instruction.  instruction.
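The effect described above can be imitated by hand.  The following sketch is not actual Vmgen output; it assumes a hypothetical instruction `inc ( n1 -- n2 )` and writes out the load, user code, and store steps explicitly, with a stray write through the stack pointer in the user code.

```c
/* Hand-written imitation of the load/code/store pattern for a
   hypothetical instruction "inc ( n1 -- n2 )". */
static long inc_with_stray_write(long *sp)
{
    long n1, n2;

    n1 = sp[0];     /* load the stack item into a variable */

    /* --- user-supplied C code --- */
    n2 = n1 + 1;
    sp[0] = 100;    /* stray write through the stack pointer */
    /* --- end of user-supplied C code --- */

    sp[0] = n2;     /* store after the C code: overwrites the stray write */
    return sp[0];
}
```

The value 100 never survives the instruction, because the store of `n2` comes after the user code.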
Line 1013  VM interpreters.  However, if you have i Line 1036  VM interpreters.  However, if you have i
 direction, please let me know (@pxref{Contact}).  direction, please let me know (@pxref{Contact}).
   
 @c ********************************************************************  @c ********************************************************************
 @node Using the generated code, Changes, Input File Format, Top  @node Error messages, Using the generated code, Input File Format, Top
   @chapter Error messages
   @cindex error messages
   
   These error messages are created by Vmgen:
   
   @table @code
   
   @cindex @code{# can only be on the input side} error
   @item # can only be on the input side
   You have used an instruction-stream prefix (usually @samp{#}) after the
   @samp{--} (the output side); you can only use it before (the input
   side).
   
   @cindex @code{prefix for this combination must be defined earlier} error
   @item the prefix for this combination must be defined earlier
   You have defined a superinstruction (e.g. @code{abc = a b c}) without
   defining its direct prefix (e.g., @code{ab = a b}).
   @xref{Superinstructions}.
   
   @cindex @code{sync line syntax} error
   @item sync line syntax
   If you are using a preprocessor (e.g., @command{m4}) to generate Vmgen
   input code, you may want to create @code{#line} directives (aka sync
   lines).  This error indicates that such a line is not in the syntax
   expected by Vmgen (this should not happen).
   
   @cindex @code{syntax error, wrong char} error
   @item syntax error, wrong char
   A syntax error.  Note that Vmgen is sometimes anal retentive about white
   space, especially about newlines.
   
   @cindex @code{too many stacks} error
   @item too many stacks
   Vmgen currently supports 4 stacks; if you need more, let us know.
   
   @cindex @code{unknown prefix} error
   @item unknown prefix
   The stack item does not match any defined type prefix (after stripping
   away any stack prefix).  You should either declare the type prefix you
   want for that stack item, or use a different type prefix.
   
   @cindex @code{unknown primitive} error
   @item unknown primitive
   You have used the name of a simple VM instruction in a superinstruction
   definition without defining the simple VM instruction first.
   
   @end table
   
   In addition, the C compiler can produce errors due to code produced by
   Vmgen; e.g., you need to define type cast functions.
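The ordering rule behind the error about a prefix that must be defined earlier can be checked with a few lines of C.  This is only a sketch of the rule as stated above, not Vmgen's own check: the function `first_missing_prefix` is hypothetical, and it assumes each superinstruction is given as a string of component names separated by single spaces (e.g., `"a b c"`), listed in definition order.

```c
#include <string.h>

/* Returns the index of the first superinstruction whose direct prefix
   (all components but the last) is not defined earlier in the list,
   or -1 if every direct prefix is defined earlier. */
static int first_missing_prefix(const char *const supers[], int n)
{
    int i, j;

    for (i = 0; i < n; i++) {
        const char *last = strrchr(supers[i], ' ');
        size_t plen;
        int found = 0;

        if (last == NULL)
            continue;           /* single component: nothing to check */
        plen = (size_t)(last - supers[i]);
        if (memchr(supers[i], ' ', plen) == NULL)
            continue;           /* prefix is a simple VM instruction */
        for (j = 0; j < i && !found; j++)
            found = strlen(supers[j]) == plen
                    && memcmp(supers[j], supers[i], plen) == 0;
        if (!found)
            return i;
    }
    return -1;
}
```

For the list `"a b"`, `"a b c"` the order is fine; with the two entries swapped, the check reports the first entry, because `a b` is not yet defined when `a b c` is seen.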
   
   @c ********************************************************************
   @node Using the generated code, Hints, Error messages, Top
 @chapter Using the generated code  @chapter Using the generated code
 @cindex generated code, usage  @cindex generated code, usage
 @cindex Using vmgen-erated code  @cindex Using vmgen-erated code
   
 The easiest way to create a working VM interpreter with Vmgen is  The easiest way to create a working VM interpreter with Vmgen is
 probably to start with @file{vmgen-ex}, and modify it for your purposes.  probably to start with @file{vmgen-ex}, and modify it for your purposes.
 This chapter is just the reference manual for the macros etc. used by  This chapter explains what the various wrapper and generated files do.
 the generated code, the other context expected by the generated code,  It also contains reference-manual style descriptions of the macros,
 and what you can do with the various generated files.  variables etc. used by the generated code, and you can skip that on
   first reading.
   
 @menu  @menu
 * VM engine::                   Executing VM code  * VM engine::                   Executing VM code
Line 1059  In our example the engine function also Line 1136  In our example the engine function also
 @file{@var{name}-labels.i} (@pxref{VM instruction table}).  @file{@var{name}-labels.i} (@pxref{VM instruction table}).
   
 @cindex tracing VM code  @cindex tracing VM code
   @cindex superinstructions and tracing
 In addition to executing the code, the VM engine can optionally also  In addition to executing the code, the VM engine can optionally also
 print out a trace of the executed instructions, their arguments and  print out a trace of the executed instructions, their arguments and
 results.  For superinstructions it prints the trace as if only component  results.  For superinstructions it prints the trace as if only component
Line 1080  The following macros and variables are u Line 1158  The following macros and variables are u
 @item LABEL(@var{inst_name})  @item LABEL(@var{inst_name})
 This is used just before each VM instruction to provide a jump or  This is used just before each VM instruction to provide a jump or
 @code{switch} label (the @samp{:} is provided by Vmgen).  For switch  @code{switch} label (the @samp{:} is provided by Vmgen).  For switch
 dispatch this should expand to @samp{case @var{label}}; for  dispatch this should expand to @samp{case @var{label}:}; for
 threaded-code dispatch this should just expand to @samp{@var{label}}.  threaded-code dispatch this should just expand to @samp{@var{label}:}.
 In either case @var{label} is usually the @var{inst_name} with some  In either case @var{label} is usually the @var{inst_name} with some
 prefix or suffix to avoid naming conflicts.  prefix or suffix to avoid naming conflicts.
   
Line 1093  should expand to nothing. Line 1171  should expand to nothing.
 @findex NAME  @findex NAME
 @item NAME(@var{inst_name_string})  @item NAME(@var{inst_name_string})
 Called on entering a VM instruction with a string containing the name of  Called on entering a VM instruction with a string containing the name of
 the VM instruction as parameter.  In normal execution this should be a  the VM instruction as parameter.  In normal execution this should
 noop, but for tracing this usually prints the name, and possibly other  expand to nothing, but for tracing this usually prints the name, and
 information (several VM registers in our example).  possibly other information (several VM registers in our example).
   
 @findex DEF_CA  @findex DEF_CA
 @item DEF_CA  @item DEF_CA
Line 1114  different ways for best performance on v Line 1192  different ways for best performance on v
 @samp{NEXT_P0} is invoked right at the start of the VM instruction (but  @samp{NEXT_P0} is invoked right at the start of the VM instruction (but
 after @samp{DEF_CA}), @samp{NEXT_P1} right after the user-supplied C  after @samp{DEF_CA}), @samp{NEXT_P1} right after the user-supplied C
 code, and @samp{NEXT_P2} at the end.  The actual jump has to be  code, and @samp{NEXT_P2} at the end.  The actual jump has to be
 performed by @samp{NEXT_P2}.  performed by @samp{NEXT_P2} (if you did it earlier, important parts
   of the VM instruction would not be executed).
   
 The simplest variant is if @samp{NEXT_P2} does everything and the other  The simplest variant is if @samp{NEXT_P2} does everything and the other
 macros do nothing.  Then also related macros like @samp{IP},  macros do nothing.  Then also related macros like @samp{IP},
Line 1541  it uses variables and functions defined Line 1620  it uses variables and functions defined
 plus @code{VM_IS_INST} already defined for the VM disassembler  plus @code{VM_IS_INST} already defined for the VM disassembler
 (@pxref{VM disassembler}).  (@pxref{VM disassembler}).
   
   @c **********************************************************
   @node Hints, The future, Using the generated code, Top
   @chapter Hints
   @cindex hints
   
   @menu
   * Floating point::              and stacks
   @end menu
   
   @c --------------------------------------------------------------------
   @node Floating point,  , Hints, Hints
   @section Floating point
   
   How should you deal with floating point values?  Should you use the same
   stack as for integers/pointers, or a different one?  This section
   discusses this issue with a view to execution speed.
   
   The simpler approach is to use a separate floating-point stack.  This
   allows you to choose FP value size without considering the size of the
   integers/pointers, and you avoid a number of performance problems.  The
   main downside is that this needs an FP stack pointer (and that may not
   fit in the register file on the 386 architecture, costing some
   performance, but comparatively little if you take the other option into
   account).  If you use a separate FP stack (with stack pointer @code{fp}),
   using an fpTOS is helpful on most machines, but some spill the fpTOS
   register into memory, and fpTOS should not be used there.
   
   The other approach is to share one stack (pointed to by, say, @code{sp})
   between integer/pointer and floating-point values.  This is ok if you do
   not use @code{spTOS}.  If you do use @code{spTOS}, the compiler has to
   decide whether to put that variable into an integer or a floating point
   register, and the other type of operation becomes quite expensive on
   most machines (because moving values between integer and FP registers is
   quite expensive).  If a value of one type has to be synthesized out of
   two values of the other type (@code{double} types), things are even more
   interesting.
   
   One way around this problem would be to not use the @code{spTOS}
   supported by Vmgen, but to use explicit top-of-stack variables (one for
   integers, one for FP values), and to have a kind of accumulator+stack
   architecture (e.g., Ocaml bytecode uses this approach); however, this is
   a major change, and its ramifications are not completely clear.
   
 @c **********************************************************  @c **********************************************************
 @node Changes, Contact, Using the generated code, Top  @node The future, Changes, Hints, Top
   @chapter The future
   @cindex future ideas
   
   We have a number of ideas for future versions of Gforth.  However, there
   are so many possible things to do that we would like some feedback from
   you.  What are you doing with Vmgen, what features are you missing, and
   why?
   
   One idea we are thinking about is to generate just one @file{.c} file
   instead of letting you copy and adapt all the wrapper files (you would
   still have to define stuff like the type-specific macros, and stack
   pointers etc. somewhere).  The advantage would be that, if we change the
   wrapper files between versions, you would not need to integrate your
   changes and our changes to them; Vmgen would also be easier to use for
   beginners.  The main disadvantage of that is that it would reduce the
   flexibility of Vmgen a little (well, those who like flexibility could
   still patch the resulting @file{.c} file, like they are now doing for
   the wrapper files).  In any case, if you are doing things to the wrapper
   files that would cause problems in a generated-@file{.c}-file approach,
   please let us know.
   
   @c **********************************************************
   @node Changes, Contact, The future, Top
 @chapter Changes  @chapter Changes
 @cindex Changes from old versions  @cindex Changes from old versions
   
Line 1558  The required changes are: Line 1702  The required changes are:
   
 @table @code  @table @code
   
   @cindex @code{TAIL;}, changes
   @item TAIL;
   has been renamed into @code{INST_TAIL;} (less chance of an accidental
   match).
   
 @cindex @code{vm_@var{A}2@var{B}}, changes  @cindex @code{vm_@var{A}2@var{B}}, changes
 @item vm_@var{A}2@var{B}  @item vm_@var{A}2@var{B}
 now takes two arguments.  now takes two arguments.
