Annotation of gforth/doc/vmgen.texi, revision 1.5

1.1       anton       1: @include version.texi
                      2: 
                      3: @c @ifnottex
                      4: This file documents vmgen (Gforth @value{VERSION}).
                      5: 
1.2       anton       6: @chapter Introduction
1.1       anton       7: 
                      8: Vmgen is a tool for writing efficient interpreters.  It takes a simple
                      9: virtual machine description and generates efficient C code for dealing
                     10: with the virtual machine code in various ways (in particular, executing
                     11: it).  The run-time efficiency of the resulting interpreters is usually
                     12: within a factor of 10 of machine code produced by an optimizing
                     13: compiler.
                     14: 
                     15: The interpreter design strategy supported by vmgen is to divide the
                     16: interpreter into two parts:
                     17: 
                     18: @itemize @bullet
                     19: 
                     20: @item The @emph{front end} takes the source code of the language to be
                     21: implemented, and translates it into virtual machine code.  This is
                     22: similar to an ordinary compiler front end; typically an interpreter
                     23: front-end performs no optimization, so it is relatively simple to
                     24: implement and runs fast.
                     25: 
                     26: @item The @emph{virtual machine interpreter} executes the virtual
                     27: machine code.
                     28: 
                     29: @end itemize
                     30: 
                     31: Such a division is usually used in interpreters, for modularity as well
                     32: as for efficiency reasons.  The virtual machine code is typically passed
                     33: between front end and virtual machine interpreter in memory, like in a
                     34: load-and-go compiler; this avoids the complexity and time cost of
                     35: writing the code to a file and reading it again.
                     36: 
                     37: A @emph{virtual machine} (VM) represents the program as a sequence of
                     38: @emph{VM instructions}, following each other in memory, similar to real
                     39: machine code.  Control flow occurs through VM branch instructions, like
                     40: in a real machine.
                     41: 
                     42: In this setup, vmgen can generate most of the code dealing with virtual
                     43: machine instructions from a simple description of the virtual machine
                     44: instructions (@pxref...), in particular:
                     45: 
                     46: @table @emph
                     47: 
                     48: @item VM instruction execution
                     49: 
                     50: @item VM code generation
                     51: Useful in the front end.
                     52: 
                     53: @item VM code decompiler
                     54: Useful for debugging the front end.
                     55: 
                     56: @item VM code tracing
                     57: Useful for debugging the front end and the VM interpreter.  You will
                     58: typically provide other means for debugging the user's programs at the
                     59: source level.
                     60: 
                     61: @item VM code profiling
                     62: Useful for optimizing the VM insterpreter with superinstructions
                     63: (@pxref...).
                     64: 
                     65: @end table
                     66: 
                     67: VMgen supports efficient interpreters though various optimizations, in
                     68: particular
                     69: 
                     70: @itemize
                     71: 
                     72: @item Threaded code
                     73: 
                     74: @item Caching the top-of-stack in a register
                     75: 
                     76: @item Combining VM instructions into superinstructions
                     77: 
                     78: @item
                     79: Replicating VM (super)instructions for better BTB prediction accuracy
                     80: (not yet in vmgen-ex, but already in Gforth).
                     81: 
                     82: @end itemize
                     83: 
                     84: As a result, vmgen-based interpreters are only about an order of
                     85: magintude slower than native code from an optimizing C compiler on small
                     86: benchmarks; on large benchmarks, which spend more time in the run-time
1.2       anton      87: system, the slowdown is often less (e.g., the slowdown of a
                     88: Vmgen-generated JVM interpreter over the best JVM JIT compiler we
                     89: measured is only a factor of 2-3 for large benchmarks; some other JITs
                     90: and all other interpreters we looked at were slower than our
                     91: interpreter).
1.1       anton      92: 
                     93: VMs are usually designed as stack machines (passing data between VM
                     94: instructions on a stack), and vmgen supports such designs especially
                     95: well; however, you can also use vmgen for implementing a register VM and
                     96: still benefit from most of the advantages offered by vmgen.
                     97: 
1.2       anton      98: There are many potential uses of the instruction descriptions that are
                     99: not implemented at the moment, but we are open for feature requests, and
                    100: we will implement new features if someone asks for them; so the feature
                    101: list above is not exhaustive.
1.1       anton     102: 
1.2       anton     103: @c *********************************************************************
                    104: @chapter Why interpreters?
                    105: 
                    106: Interpreters are a popular language implementation technique because
                    107: they combine all three of the following advantages:
                    108: 
                    109: @itemize
                    110: 
                    111: @item Ease of implementation
                    112: 
                    113: @item Portability
                    114: 
                    115: @item Fast edit-compile-run cycle
                    116: 
                    117: @end itemize
                    118: 
                    119: The main disadvantage of interpreters is their run-time speed.  However,
                    120: there are huge differences between different interpreters in this area:
                    121: the slowdown over optimized C code on programs consisting of simple
                    122: operations is typically a factor of 10 for the more efficient
                    123: interpreters, and a factor of 1000 for the less efficient ones (the
                    124: slowdown for programs executing complex operations is less, because the
                    125: time spent in libraries for executing complex operations is the same in
                    126: all implementation strategies).
                    127: 
                    128: Vmgen makes it even easier to implement interpreters.  It also supports
                    129: techniques for building efficient interpreters.
                    130: 
                    131: @c ********************************************************************
                    132: 
                    133: @chapter Concepts
                    134: 
                    135: @c --------------------------------------------------------------------
                    136: @section Front-end and virtual machine interpreter
                    137: 
                    138: @cindex front-end
                    139: Interpretive systems are typically divided into a @emph{front end} that
                    140: parses the input language and produces an intermediate representation
                    141: for the program, and an interpreter that executes the intermediate
                    142: representation of the program.
                    143: 
                    144: @cindex virtual machine
                    145: @cindex VM
                    146: @cindex instruction, VM
                    147: For efficient interpreters the intermediate representation of choice is
                    148: virtual machine code (rather than, e.g., an abstract syntax tree).
                    149: @emph{Virtual machine} (VM) code consists of VM instructions arranged
                    150: sequentially in memory; they are executed in sequence by the VM
                    151: interpreter, except for VM branch instructions, which implement control
                    152: structures.  The conceptual similarity to real machine code results in
                    153: the name @emph{virtual machine}.
                    154: 
                    155: In this framework, vmgen supports building the VM interpreter and any
                    156: other component dealing with VM instructions.  It does not have any
                    157: support for the front end, apart from VM code generation support.  The
                    158: front end can be implemented with classical compiler front-end
1.3       anton     159: techniques, supported by tools like @command{flex} and @command{bison}.
1.2       anton     160: 
                    161: The intermediate representation is usually just internal to the
                    162: interpreter, but some systems also support saving it to a file, either
                    163: as an image file, or in a full-blown linkable file format (e.g., JVM).
                    164: Vmgen currently has no special support for such features, but the
                    165: information in the instruction descriptions can be helpful, and we are
                    166: open for feature requests and suggestions.
1.3       anton     167: 
                    168: @section Data handling
                    169: 
                    170: @cindex stack machine
                    171: @cindex register machine
                    172: Most VMs use one or more stacks for passing temporary data between VM
                    173: instructions.  Another option is to use a register machine architecture
                    174: for the virtual machine; however, this option is either slower or
                    175: significantly more complex to implement than a stack machine architecture.
                    176: 
                    177: Vmgen has special support and optimizations for stack VMs, making their
                    178: implementation easy and efficient.
                    179: 
                    180: You can also implement a register VM with vmgen (@pxref{Register
                    181: Machines}), and you will still profit from most vmgen features.
                    182: 
                    183: @cindex stack item size
                    184: @cindex size, stack items
                    185: Stack items all have the same size, so they typically will be as wide as
                    186: an integer, pointer, or floating-point value.  Vmgen supports treating
                    187: two consecutive stack items as a single value, but anything larger is
                    188: best kept in some other memory area (e.g., the heap), with pointers to
                    189: the data on the stack.
                    190: 
                    191: @cindex instruction stream
                    192: @cindex immediate arguments
                    193: Another source of data is immediate arguments VM instructions (in the VM
                    194: instruction stream).  The VM instruction stream is handled similar to a
                    195: stack in vmgen.
                    196: 
                    197: @cindex garbage collection
                    198: @cindex reference counting
                    199: Vmgen has no built-in support for nor restrictions against @emph{garbage
                    200: collection}.  If you need garbage collection, you need to provide it in
                    201: your run-time libraries.  Using @emph{reference counting} is probably
                    202: harder, but might be possible (contact us if you are interested).
                    203: @c reference counting might be possible by including counting code in 
                    204: @c the conversion macros.
                    205: 
                    206: @c *************************************************************
                    207: @chapter Invoking vmgen
                    208: 
                    209: The usual way to invoke vmgen is as follows:
                    210: 
                    211: @example
                    212: vmgen @var{infile}
                    213: @end example
                    214: 
                    215: Here @var{infile} is the VM instruction description file, which usually
                    216: ends in @file{.vmg}.  The output filenames are made by taking the
                    217: basename of @file{infile} (i.e., the output files will be created in the
                    218: current working directory) and replacing @file{.vmg} with @file{-vm.i},
                    219: @file{-disasm.i}, @file{-gen.i}, @file{-labels.i}, @file{-profile.i},
                    220: and @file{-peephole.i}.  E.g., @command{bison hack/foo.vmg} will create
                    221: @file{foo-vm.i} etc.
                    222: 
                    223: The command-line options supported by vmgen are
                    224: 
                    225: @table @option
                    226: 
                    227: @cindex -h, command-line option
                    228: @cindex --help, command-line option
                    229: @item --help
                    230: @itemx -h
                    231: Print a message about the command-line options
                    232: 
                    233: @cindex -v, command-line option
                    234: @cindex --version, command-line option
                    235: @item --version
                    236: @itemx -v
                    237: Print version and exit
                    238: @end table
                    239: 
                    240: @c env vars GFORTHDIR GFORTHDATADIR
                    241: 
1.5     ! anton     242: @c ****************************************************************
        !           243: @chapter Example
        !           244: 
        !           245: @section Example overview
        !           246: 
        !           247: There are two versions of the same example for using vmgen:
        !           248: @file{vmgen-ex} and @file{vmgen-ex2} (you can also see Gforth as
        !           249: example, but it uses additional (undocumented) features, and also
        !           250: differs in some other respects).  The example implements @emph{mini}, a
        !           251: tiny Modula-2-like language with a small JavaVM-like virtual machine.
        !           252: The difference between the examples is that @file{vmgen-ex} uses many
        !           253: casts, and @file{vmgen-ex2} tries to avoids most casts and uses unions
        !           254: instead.
        !           255: 
        !           256: The files provided with each example are:
        !           257: 
        !           258: @example
        !           259: Makefile
        !           260: README
        !           261: disasm.c           wrapper file
        !           262: engine.c           wrapper file
        !           263: peephole.c         wrapper file
        !           264: profile.c          wrapper file
        !           265: mini-inst.vmg      simple VM instructions
        !           266: mini-super.vmg     superinstructions (empty at first)
        !           267: mini.h             common declarations
        !           268: mini.l             scanner
        !           269: mini.y             front end (parser, VM code generator)
        !           270: support.c          main() and other support functions
        !           271: fib.mini           example mini program
        !           272: simple.mini        example mini program
        !           273: test.mini          example mini program (tests everything)
        !           274: test.out           test.mini output
        !           275: stat.awk           script for aggregating profile information
        !           276: peephole-blacklist list of instructions not allowed in superinstructions
        !           277: seq2rule.awk       script for creating superinstructions
        !           278: @end example
        !           279: 
        !           280: For your own interpreter, you would typically copy the following files
        !           281: and change little, if anything:
        !           282: 
        !           283: @example
        !           284: disasm.c           wrapper file
        !           285: engine.c           wrapper file
        !           286: peephole.c         wrapper file
        !           287: profile.c          wrapper file
        !           288: stat.awk           script for aggregating profile information
        !           289: seq2rule.awk       script for creating superinstructions
        !           290: @end example
        !           291: 
        !           292: You would typically change much in or replace the following files:
        !           293: 
        !           294: @example
        !           295: Makefile
        !           296: mini-inst.vmg      simple VM instructions
        !           297: mini.h             common declarations
        !           298: mini.l             scanner
        !           299: mini.y             front end (parser, VM code generator)
        !           300: support.c          main() and other support functions
        !           301: peephole-blacklist list of instructions not allowed in superinstructions
        !           302: @end example
        !           303: 
        !           304: You can build the example by @code{cd}ing into the example's directory,
        !           305: and then typing @samp{make}; you can check that it works with @samp{make
        !           306: check}.  You can run run mini programs like this:
        !           307: 
        !           308: @example
        !           309: ./mini fib.mini
        !           310: @end example
        !           311: 
        !           312: To learn about the options, type @samp{./mini -h}.
        !           313: 
        !           314: @section Using profiling to create superinstructions
        !           315: 
        !           316: I have not added rules for this in the @file{Makefile} (there are many
        !           317: options for selecting superinstructions, and I did not want to hardcode
        !           318: one into the @file{Makefile}), but there are some supporting scripts, and
        !           319: here's an example:
        !           320: 
        !           321: Suppose you want to use @file{fib.mini} and @file{test.mini} as training
        !           322: programs, you get the profiles like this:
        !           323: 
        !           324: @example
        !           325: make fib.prof test.prof #takes a few seconds
        !           326: @end example
        !           327: 
        !           328: You can aggregate these profiles with @file{stat.awk}:
        !           329: 
        !           330: @example
        !           331: awk -f stat.awk fib.prof test.prof
        !           332: @end example
        !           333: 
        !           334: The result contains lines like:
        !           335: 
        !           336: @example
        !           337:       2      16        36910041 loadlocal lit
        !           338: @end example
        !           339: 
        !           340: This means that the sequence @code{loadlocal lit} statically occurs a
        !           341: total of 16 times in 2 profiles, with a dynamic execution count of
        !           342: 36910041.
        !           343: 
        !           344: The numbers can be used in various ways to select superinstructions.
        !           345: E.g., if you just want to select all sequences with a dynamic
        !           346: execution count exceeding 10000, you would use the following pipeline:
        !           347: 
        !           348: @example
        !           349: awk -f stat.awk fib.prof test.prof|
        !           350: awk '$3>=10000'|                #select sequences
        !           351: fgrep -v -f peephole-blacklist| #eliminate wrong instructions
        !           352: awk -f seq2rule.awk|      #transform sequences into superinstruction rules
        !           353: sort -k 3 >mini-super.vmg       #sort sequences
        !           354: @end example
        !           355: 
        !           356: The file @file{peephole-blacklist} contains all instructions that
        !           357: directly access a stack or stack pointer (for mini: @code{call},
        !           358: @code{return}); the sort step is necessary to ensure that prefixes
        !           359: preceed larger superinstructions.
        !           360: 
        !           361: Now you can create a version of mini with superinstructions by just
        !           362: saying @samp{make}
        !           363: 
1.3       anton     364: @c ***************************************************************
                    365: @chapter Input File Format
                    366: 
                    367: Vmgen takes as input a file containing specifications of virtual machine
                    368: instructions.  This file usually has a name ending in @file{.vmg}.
                    369: 
1.5     ! anton     370: Most examples are taken from the example in @file{vmgen-ex}.
1.3       anton     371: 
                    372: @section Input File Grammar
                    373: 
                    374: The grammar is in EBNF format, with @code{@var{a}|@var{b}} meaning
                    375: ``@var{a} or @var{b}'', @code{@{@var{c}@}} meaning 0 or more repetitions
                    376: of @var{c} and @code{[@var{d}]} meaning 0 or 1 repetitions of @var{d}.
                    377: 
                    378: Vmgen input is not free-format, so you have to take care where you put
                    379: spaces and especially newlines; it's not as bad as makefiles, though:
                    380: any sequence of spaces and tabs is equivalent to a single space.
                    381: 
                    382: @example
                    383: description: {instruction|comment|eval-escape}
                    384: 
                    385: instruction: simple-inst|superinst
                    386: 
                    387: simple-inst: ident " (" stack-effect " )" newline c-code newline newline
                    388: 
                    389: stack-effect: {ident} " --" {ident}
                    390: 
                    391: super-inst: ident " =" ident {ident}  
                    392: 
                    393: comment:      "\ "  text newline
                    394: 
                    395: eval-escape:  "\e " text newline
                    396: @end example
                    397: @c \+ \- \g \f \c
                    398: 
                    399: Note that the @code{\}s in this grammar are meant literally, not as
1.5     ! anton     400: C-style encodings for non-printable characters.
1.3       anton     401: 
                    402: The C code in @code{simple-inst} must not contain empty lines (because
                    403: vmgen would mistake that as the end of the simple-inst.  The text in
                    404: @code{comment} and @code{eval-escape} must not contain a newline.
                    405: @code{Ident} must conform to the usual conventions of C identifiers
                    406: (otherwise the C compiler would choke on the vmgen output).
                    407: 
                    408: Vmgen understands a few extensions beyond the grammar given here, but
                    409: these extensions are only useful for building Gforth.  You can find a
                    410: description of the format used for Gforth in @file{prim}.
                    411: 
                    412: @subsection
                    413: @c woanders?
                    414: The text in @code{eval-escape} is Forth code that is evaluated when
                    415: vmgen reads the line.  If you do not know (and do not want to learn)
                    416: Forth, you can build the text according to the following grammar; these
                    417: rules are normally all Forth you need for using vmgen:
                    418: 
                    419: @example
                    420: text: stack-decl|type-prefix-decl|stack-prefix-decl
                    421: 
                    422: stack-decl: "stack " ident ident ident
                    423: type-prefix-decl: 
                    424:     's" ' string '" ' ("single"|"double") ident "type-prefix" ident
                    425: stack-prefix-decl:  ident "stack-prefix" string
                    426: @end example
                    427: 
                    428: Note that the syntax of this code is not checked thoroughly (there are
                    429: many other Forth program fragments that could be written there).
                    430: 
                    431: If you know Forth, the stack effects of the non-standard words involved
                    432: are:
                    433: 
                    434: @example
                    435: stack        ( "name" "pointer" "type" -- )
                    436:              ( name execution: -- stack )
                    437: type-prefix  ( addr u xt1 xt2 n stack "prefix" -- )
                    438: single       ( -- xt1 xt2 n )
                    439: double       ( -- xt1 xt2 n )
                    440: stack-prefix ( stack "prefix" -- )
                    441: @end example
                    442: 
1.5     ! anton     443: 
1.3       anton     444: @section Simple instructions
                    445: 
                    446: We will use the following simple VM instruction description as example:
                    447: 
                    448: @example
                    449: sub ( i1 i2 -- i )
                    450: i = i1-i2;
                    451: @end example
                    452: 
                    453: The first line specifies the name of the VM instruction (@code{sub}) and
                    454: its stack effect (@code{i1 i2 -- i}).  The rest of the description is
                    455: just plain C code.
                    456: 
                    457: @cindex stack effect
                    458: The stack effect specifies that @code{sub} pulls two integers from the
1.5     ! anton     459: data stack and puts them in the C variables @code{i1} and @code{i2} (with
1.3       anton     460: the rightmost item (@code{i2}) taken from the top of stack) and later
                    461: pushes one integer (@code{i)) on the data stack (the rightmost item is
                    462: on the top afterwards).
                    463: 
                    464: How do we know the type and stack of the stack items?  Vmgen uses
                    465: prefixes, similar to Fortran; in contrast to Fortran, you have to
                    466: define the prefix first:
                    467: 
                    468: @example
                    469: \E s" Cell"   single data-stack type-prefix i
                    470: @end example
                    471: 
                    472: This defines the prefix @code{i} to refer to the type @code{Cell}
                    473: (defined as @code{long} in @file{mini.h}) and, by default, to the
                    474: @code{data-stack}.  It also specifies that this type takes one stack
                    475: item (@code{single}).  The type prefix is part of the variable name.
                    476: 
                    477: Before we can use @code{data-stack} in this way, we have to define it:
                    478: 
                    479: @example
                    480: \E stack data-stack sp Cell
                    481: @end example
                    482: @c !! use something other than Cell
                    483: 
                    484: This line defines the stack @code{data-stack}, which uses the stack
                    485: pointer @code{sp}, and each item has the basic type @code{Cell}; other
                    486: types have to fit into one or two @code{Cell}s (depending on whether the
                    487: type is @code{single} or @code{double} wide), and are converted from and
                    488: to Cells on accessing the @code{data-stack) with conversion macros
                    489: (@pxref{Conversion macros}).  Stacks grow towards lower addresses in
1.5     ! anton     490: vmgen-erated interpreters.
1.3       anton     491: 
                    492: We can override the default stack of a stack item by using a stack
                    493: prefix.  E.g., consider the following instruction:
                    494: 
                    495: @example
                    496: lit ( #i -- i )
                    497: @end example
                    498: 
                    499: The VM instruction @code{lit} takes the item @code{i} from the
1.5     ! anton     500: instruction stream (indicated by the prefix @code{#}), and pushes it on
1.3       anton     501: the (default) data stack.  The stack prefix is not part of the variable
                    502: name.  Stack prefixes are defined like this:
                    503: 
                    504: @example
                    505: \E inst-stream stack-prefix #
                    506: @end example
                    507: 
1.5     ! anton     508: This definition defines that the stack prefix @code{#} specifies the
1.3       anton     509: ``stack'' @code{inst-stream}.  Since the instruction stream behaves a
                    510: little differently than an ordinary stack, it is predefined, and you do
                    511: not need to define it.
                    512: 
                    513: The instruction stream contains instructions and their immediate
                    514: arguments, so specifying that an argument comes from the instruction
                    515: stream indicates an immediate argument.  Of course, instruction stream
                    516: arguments can only appear to the left of @code{--} in the stack effect.
                    517: If there are multiple instruction stream arguments, the leftmost is the
                    518: first one (just as the intuition suggests).
                    519: 
1.5     ! anton     520: @subsubsection C Code Macros
        !           521: 
        !           522: Vmgen recognizes the following strings in the C code part of simple
        !           523: instructions:
        !           524: 
        !           525: @table @samp
        !           526: 
        !           527: @item SET_IP
        !           528: As far as vmgen is concerned, a VM instruction containing this ends a VM
        !           529: basic block (used in profiling to delimit profiled sequences).  On the C
        !           530: level, this also sets the instruction pointer.
        !           531: 
        !           532: @item SUPER_END
        !           533: This ends a basic block (for profiling), without a SET_IP.
        !           534: 
        !           535: @item TAIL;
        !           536: Vmgen replaces @samp{TAIL;} with code for ending a VM instruction and
        !           537: dispatching the next VM instruction.  This happens automatically when
        !           538: control reaches the end of the C code.  If you want to have this in the
        !           539: middle of the C code, you need to use @samp{TAIL;}.  A typical example
        !           540: is a conditional VM branch:
        !           541: 
        !           542: @example
        !           543: if (branch_condition) {
        !           544:   SET_IP(target); TAIL;
        !           545: }
        !           546: /* implicit tail follows here */
        !           547: @end example
        !           548: 
        !           549: In this example, @samp{TAIL;} is not strictly necessary, because there
        !           550: is another one implicitly after the if-statement, but using it improves
        !           551: branch prediction accuracy slightly and allows other optimizations.
        !           552: 
        !           553: @item SUPER_CONTINUE
        !           554: This indicates that the implicit tail at the end of the VM instruction
        !           555: dispatches the sequentially next VM instruction even if there is a
        !           556: @code{SET_IP} in the VM instruction.  This enables an optimization that
        !           557: is not yet implemented in the vmgen-ex code (but in Gforth).  The
        !           558: typical application is in conditional VM branches:
        !           559: 
        !           560: @example
        !           561: if (branch_condition) {
        !           562:   SET_IP(target); TAIL; /* now this TAIL is necessary */
        !           563: }
        !           564: SUPER_CONTINUE;
        !           565: @end example
        !           566: 
        !           567: @end table
        !           568: 
        !           569: Note that vmgen is not smart about C-level tokenization, comments,
        !           570: strings, or conditional compilation, so it will interpret even a
        !           571: commented-out SUPER_END as ending a basic block (or, e.g.,
        !           572: @samp{RETAIL;} as @samp{TAIL;}).  Conversely, vmgen requires the literal
        !           573: presence of these strings; vmgen will not see them if they are hiding in
        !           574: a C preprocessor macro.
        !           575: 
        !           576: 
        !           577: @subsubsection C Code restrictions
        !           578: 
        !           579: Vmgen generates code and performs some optimizations under the
        !           580: assumption that the user-supplied C code does not access the stack
        !           581: pointers or stack items, and that accesses to the instruction pointer
        !           582: only occur through special macros.  In general you should heed these
        !           583: restrictions.  However, if you need to break these restrictions, read
        !           584: the following.
        !           585: 
        !           586: Accessing a stack or stack pointer directly can be a problem for several
        !           587: reasons: 
        !           588: 
        !           589: @itemize
        !           590: 
        !           591: @item
        !           592: You may cache the top-of-stack item in a local variable (that is
        !           593: allocated to a register).  This is the most frequent source of trouble.
        !           594: You can deal with it either by not using top-of-stack caching (slowdown
        !           595: factor 1-1.4, depending on machine), or by inserting flushing code
        !           596: (e.g., @samp{IF_spTOS(sp[...] = spTOS);}) at the start and reloading
        !           597: code (e.g., @samp{IF_spTOS(spTOS = sp[0])}) at the end of problematic C
        !           598: code.  Vmgen inserts a stack pointer update before the start of the
        !           599: user-supplied C code, so the flushing code has to use an index that
        !           600: corrects for that.  In the future, this flushing may be done
        !           601: automatically by mentioning a special string in the C code.
        !           602: @c sometimes flushing and/or reloading unnecessary
        !           603: 
        !           604: @item
        !           605: The vmgen-erated code loads the stack items from stack-pointer-indexed
        !           606: memory into variables before the user-supplied C code, and stores them
        !           607: from variables to stack-pointer-indexed memory afterwards.  If you do
        !           608: any writes to the stack through its stack pointer in your C code, it
        !           609: will not affact the variables, and your write may be overwritten by the
        !           610: stores after the C code.  Similarly, a read from a stack using a stack
        !           611: pointer will not reflect computations of stack items in the same VM
        !           612: instruction.
        !           613: 
        !           614: @item
        !           615: Superinstructions keep stack items in variables across the whole
        !           616: superinstruction.  So you should not include VM instructions, that
        !           617: access a stack or stack pointer, as components of superinstructions.
        !           618: 
        !           619: @end itemize
        !           620: 
        !           621: You should access the instruction pointer only through its special
        !           622: macros (@samp{IP}, @samp{SET_IP}, @samp{IPTOS}); this ensure that these
        !           623: macros can be implemented in several ways for best performance.
        !           624: @samp{IP} points to the next instruction, and @samp{IPTOS} is its
        !           625: contents.
        !           626: 
        !           627: 
1.3       anton     628: @section Superinstructions
1.5     ! anton     629: 
        !           630: Here is an example of a superinstruction definition:
        !           631: 
        !           632: @example
        !           633: lit_sub = lit sub
        !           634: @end example
        !           635: 
        !           636: @code{lit_sub} is the name of the superinstruction, and @code{lit} and
        !           637: @code{sub} are its components.  This superinstruction performs the same
        !           638: action as the sequence @code{lit} and @code{sub}.  It is generated
        !           639: automatically by the VM code generation functions whenever that sequence
        !           640: occurs, so you only need to add this definition if you want to use this
        !           641: superinstruction (and even that can be partially automatized,
        !           642: @pxref{...}).
        !           643: 
        !           644: Vmgen requires that the component instructions are simple instructions
        !           645: defined before superinstructions using the components.  Currently, vmgen
        !           646: also requires that all the subsequences at the start of a
        !           647: superinstruction (prefixes) must be defined as superinstruction before
        !           648: the superinstruction.  I.e., if you want to define a superinstruction
        !           649: 
        !           650: @example
        !           651: sumof5 = add add add add
        !           652: @end example
        !           653: 
        !           654: you first have to define
        !           655: 
        !           656: @example
        !           657: add ( n1 n2 -- n )
        !           658: n = n1+n2;
        !           659: 
        !           660: sumof3 = add add
        !           661: sumof4 = add add add
        !           662: @end example
        !           663: 
        !           664: Here, @code{sumof4} is the longest prefix of @code{sumof5}, and @code{sumof3}
        !           665: is the longest prefix of @code{sumof4}.
        !           666: 
        !           667: Note that vmgen assumes that only the code it generates accesses stack
        !           668: pointers, the instruction pointer, and various stack items, and it
        !           669: performs optimizations based on this assumption.  Therefore, VM
        !           670: instructions that change the instruction pointer should only be used as
        !           671: last component; a VM instruction that accesses a stack pointer should
        !           672: not be used as component at all.  Vmgen does not check these
        !           673: restrictions, they just result in bugs in your interpreter.
        !           674: 
        !           675: @c ********************************************************************
        !           676: @chapter Using the generated code
        !           677: 
        !           678: The easiest way to create a working VM interpreter with vmgen is
        !           679: probably to start with one of the examples, and modify it for your
        !           680: purposes.  This chapter is just the reference manual for the macros
        !           681: etc. used by the generated code, and the other context expected by the
        !           682: generated code, and what you can do with the various generated files.
        !           683: 
        !           684: @section VM engine
        !           685: 
        !           686: The VM engine is the VM interpreter that executes the VM code.  It is
        !           687: essential for an interpretive system.
        !           688: 
        !           689: The main file generated for the VM interpreter is
        !           690: @file{@var{name}-vm.i}.  It uses the following macros and variables (and
        !           691: you have to define them):
        !           692: 
        !           693: @table @code
        !           694: 
        !           695: @item LABEL(@var{inst_name})
        !           696: This is used just before each VM instruction to provide a jump or
        !           697: @code{switch} label (the @samp{:} is provided by vmgen).  For switch
        !           698: dispatch this should expand to @samp{case @var{label}}; for
        !           699: threaded-code dispatch this should just expand to @samp{case
        !           700: @var{label}}.  In either case @var{label} is usually the @var{inst_name}
        !           701: with some prefix or suffix to avoid naming conflicts.
        !           702: 
        !           703: @item NAME(@var{inst_name_string})
        !           704: Called on entering a VM instruction with a string containing the name of
        !           705: the VM instruction as parameter.  In normal execution this should be a
        !           706: noop, but for tracing this usually prints the name, and possibly other
        !           707: information (several VM registers in our example).
        !           708: 
        !           709: @item DEF_CA
        !           710: Usually empty.  Called just inside a new scope at the start of a VM
        !           711: instruction.  Can be used to define variables that should be visible
        !           712: during every VM instruction.  If you define this macro as non-empty, you
        !           713: have to provide the finishing @samp{;} in the macro.
        !           714: 
        !           715: @item NEXT_P0 NEXT_P1 NEXT_P2
        !           716: The three parts of instruction dispatch.  They can be defined in
        !           717: different ways for best performance on various processors (see
        !           718: @file{engine.c} in the example or @file{engine/threaded.h} in Gforth).
        !           719: @samp{NEXT_P0} is invoked right at the start of the VM isntruction (but
        !           720: after @samp{DEF_CA}), @samp{NEXT_P1} right after the user-supplied C
        !           721: code, and @samp{NEXT_P2} at the end.  The actual jump has to be
        !           722: performed by @samp{NEXT_P2}.
        !           723: 
        !           724: The simplest variant is if @samp{NEXT_P2} does everything and the other
        !           725: macros do nothing.  Then also related macros like @samp{IP},
        !           726: @samp{SET_IP}, @samp{IP}, @samp{INC_IP} and @samp{IPTOS} are very
        !           727: straightforward to define.  For switch dispatch this code consists just
        !           728: of a jump to the dispatch code (@samp{goto next_inst;} in our example;
        !           729: for direct threaded code it consists of something like
        !           730: @samp{({cfa=*ip++; goto *cfa;})}.
        !           731: 
        !           732: Pulling code (usually the @samp{cfa=*ip;}) up into @samp{NEXT_P1}
        !           733: usually does not cause problems, but pulling things up into
        !           734: @samp{NEXT_P0} usually requires changing the other macros (and, at least
        !           735: for Gforth on Alpha, it does not buy much, because the compiler often
        !           736: manages to schedule the relevant stuff up by itself).  An even more
        !           737: extreme variant is to pull code up even further, into, e.g., NEXT_P1 of
        !           738: the previous VM instruction (prefetching, useful on PowerPCs).
        !           739: 
        !           740: @item INC_IP(@var{n})
        !           741: This increments IP by @var{n}.
        !           742: 
        !           743: @item vm_@var{A}2@var{B}(a,b)
        !           744: Type casting macro that assigns @samp{a} (of type @var{A}) to @samp{b}
        !           745: (of type @var{B}).  This is mainly used for getting stack items into
        !           746: variables and back.  So you need to define macros for every combination
        !           747: of stack basic type (@code{Cell} in our example) and type-prefix types
        !           748: used with that stack (in both directions).  For the type-prefix type,
        !           749: you use the type-prefix (not the C type string) as type name (e.g.,
        !           750: @samp{vm_Cell2i}, not @samp{vm_Cell2Cell}).  In addition, you have to
        !           751: define a vm_@var{X}2@var{X} macro for the stack basic type (used in
        !           752: superinstructions).
        !           753: 
        !           754: The stack basic type for the predefined @samp{inst-stream} is
        !           755: @samp{Cell}.  If you want a stack with the same item size, making its
        !           756: basic type @samp{Cell} usually reduces the number of macros you have to
        !           757: define.
        !           758: 
        !           759: Here our examples differ a lot: @file{vmgen-ex} uses casts in these
        !           760: macros, whereas @file{vmgen-ex2} uses union-field selection (or
        !           761: assignment to union fields).
        !           762: 
        !           763: @item vm_two@var{A}2@var{B}(a1,a2,b)
        !           764: @item vm_@var{B}2two@var{A}(b,a1,a2)
        !           765: Conversions between two stack items (@code{a1}, @code{a2}) and a
        !           766: variable @code{b} of a type that takes two stack items.  This does not
        !           767: occur in our small examples, but you can look at Gforth for examples.
        !           768: 
        !           769: @item @var{stackpointer}
        !           770: For each stack used, the stackpointer name given in the stack
        !           771: declaration is used.  For a regular stack this must be an l-expression;
        !           772: typically it is a variable declared as a pointer to the stack's basic
        !           773: type.  For @samp{inst-stream}, the name is @samp{IP}, and it can be a
        !           774: plain r-value; typically it is a macro that abstracts away the
        !           775: differences between the various implementations of NEXT_P*.
        !           776: 
        !           777: @item @var{stackpointer}TOS
        !           778: The top-of-stack for the stack pointed to by @var{stackpointer}.  If you
        !           779: are using top-of-stack caching for that stack, this should be defined as
        !           780: variable; if you are not using top-of-stack caching for that stack, this
        !           781: should be a macro expanding to @samp{@var{stackpointer}[0]}.  The stack
        !           782: pointer for the predefined @samp{inst-stream} is called @samp{IP}, so
        !           783: the top-of-stack is called @samp{IPTOS}.
        !           784: 
        !           785: @item IF_@var{stackpointer}TOS(@var{expr})
        !           786: Macro for executing @var{expr}, if top-of-stack caching is used for the
        !           787: @var{stackpointer} stack.  I.e., this should do @var{expr} if there is
        !           788: top-of-stack caching for @var{stackpointer}; otherwise it should do
        !           789: nothing.
        !           790: 
        !           791: @item VM_DEBUG
        !           792: If this is defined, the tracing code will be compiled in (slower
        !           793: interpretation, but better debugging).  Our example compiles two
        !           794: versions of the engine, a fast-running one that cannot trace, and one
        !           795: with potential tracing and profiling.
        !           796: 
        !           797: @item vm_debug
        !           798: Needed only if @samp{VM_DEBUG} is defined.  If this variable contains
        !           799: true, the VM instructions produce trace output.  It can be turned on or
        !           800: off at any time.
        !           801: 
        !           802: @item vm_out
        !           803: Needed only if @samp{VM_DEBUG} is defined.  Specifies the file on which
        !           804: to print the trace output (type @samp{FILE *}).
        !           805: 
        !           806: @item printarg_@var{type}(@var{value})
        !           807: Needed only if @samp{VM_DEBUG} is defined.  Macro or function for
        !           808: printing @var{value} in a way appropriate for the @var{type}.  This is
        !           809: used for printing the values of stack items during tracing.  @var{Type}
        !           810: is normally the type prefix specified in a @code{type-prefix} definition
        !           811: (e.g., @samp{printarg_i}); in superinstructions it is currently the
        !           812: basic type of the stack.
        !           813: 
        !           814: @end table
        !           815: 
        !           816: The file @file{@var{name}-labels.i} is used for enumerating or listing
        !           817: all virtual machine instructions and uses the following macro:
        !           818: 
        !           819: @table @samp
        !           820: 
        !           821: @item INST_ADDR(@var{inst_name})
        !           822: For switch dispatch, this is just the name of the switch label (the same
        !           823: name as used in @samp{LABEL(@var{inst_name})}).  For threaded-code
        !           824: dispatch, this is the address of the label defined in
        !           825: @samp{LABEL(@var{inst_name})}); the address is taken with @samp{&&}
        !           826: (@pxref{labels-as-values}).
        !           827: 
        !           828: @end table
        !           829: 
        !           830: 
1.3       anton     831: 
                    832: @section Stacks, types, and prefixes
1.2       anton     833: 
                    834: 
                    835: 
                    836: Invocation
                    837: 
                    838: Input Syntax
                    839: 
                    840: Concepts: Front end, VM, Stacks,  Types, input stream
                    841: 
                    842: Contact
1.4       anton     843: 
                    844: 
                    845: Required changes:
                    846: vm_...2... -> two arguments
                    847: "vm_two...2...(arg1,arg2,arg3);" -> "vm_two...2...(arg3,arg1,arg2)" (no ";").
                    848: define INST_ADDR and LABEL
                    849: define VM_IS_INST also for disassembler

FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>