1: \input texinfo @c -*-texinfo-*-
2: @comment %**start of header
3: @setfilename vmgen.info
4: @include version.texi
5: @settitle Vmgen (Gforth @value{VERSION})
6: @c @syncodeindex pg cp
7: @comment %**end of header
8: @copying
9: This manual is for Vmgen
10: (version @value{VERSION}, @value{UPDATED}),
the virtual machine interpreter generator.
12:
13: Copyright @copyright{} 2002 Free Software Foundation, Inc.
14:
15: @quotation
16: Permission is granted to copy, distribute and/or modify this document
17: under the terms of the GNU Free Documentation License, Version 1.1 or
18: any later version published by the Free Software Foundation; with no
19: Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
20: and with the Back-Cover Texts as in (a) below. A copy of the
21: license is included in the section entitled ``GNU Free Documentation
22: License.''
23:
24: (a) The FSF's Back-Cover Text is: ``You have freedom to copy and modify
25: this GNU Manual, like GNU software. Copies published by the Free
26: Software Foundation raise funds for GNU development.''
27: @end quotation
28: @end copying
29:
30: @dircategory GNU programming tools
31: @direntry
32: * vmgen: (vmgen). Interpreter generator
33: @end direntry
34:
35: @titlepage
36: @title Vmgen
37: @subtitle for Gforth version @value{VERSION}, @value{UPDATED}
38: @author M. Anton Ertl (@email{anton@mips.complang.tuwien.ac.at})
39: @page
40: @vskip 0pt plus 1filll
41: @insertcopying
42: @end titlepage
43:
44: @contents
45:
46: @ifnottex
47: @node Top, Introduction, (dir), (dir)
48: @top Vmgen
49:
50: @insertcopying
51: @end ifnottex
52:
53: @menu
54: * Introduction:: What can Vmgen do for you?
55: * Why interpreters?:: Advantages and disadvantages
56: * Concepts:: VM interpreter background
57: * Invoking vmgen::
58: * Example::
59: * Input File Format::
60: * Using the generated code::
61: * Changes:: from earlier versions
62: * Contact:: Bug reporting etc.
63: * Copying This Manual:: Manual License
64: * Index::
65:
66: @detailmenu
67: --- The Detailed Node Listing ---
68:
69: Concepts
70:
71: * Front end and VM interpreter:: Modularizing an interpretive system
72: * Data handling:: Stacks, registers, immediate arguments
73: * Dispatch:: From one VM instruction to the next
74:
75: Example
76:
77: * Example overview::
78: * Using profiling to create superinstructions::
79:
80: Input File Format
81:
82: * Input File Grammar::
83: * Simple instructions::
84: * Superinstructions::
85:
86: Simple instructions
87:
88: * C Code Macros:: Macros recognized by Vmgen
89: * C Code restrictions:: Vmgen makes assumptions about C code
90:
91: Using the generated code
92:
93: * VM engine:: Executing VM code
94: * VM instruction table::
95: * VM code generation:: Creating VM code (in the front-end)
96: * Peephole optimization:: Creating VM superinstructions
97: * VM disassembler:: for debugging the front end
98: * VM profiler:: for finding worthwhile superinstructions
99:
100: Copying This Manual
101:
102: * GNU Free Documentation License:: License for copying this manual.
103:
104: @end detailmenu
105: @end menu
106:
107: @c @ifnottex
108: This file documents Vmgen (Gforth @value{VERSION}).
109:
110: @c ************************************************************
111: @node Introduction, Why interpreters?, Top, Top
112: @chapter Introduction
113:
114: Vmgen is a tool for writing efficient interpreters. It takes a simple
115: virtual machine description and generates efficient C code for dealing
116: with the virtual machine code in various ways (in particular, executing
117: it). The run-time efficiency of the resulting interpreters is usually
118: within a factor of 10 of machine code produced by an optimizing
119: compiler.
120:
121: The interpreter design strategy supported by vmgen is to divide the
122: interpreter into two parts:
123:
124: @itemize @bullet
125:
126: @item The @emph{front end} takes the source code of the language to be
127: implemented, and translates it into virtual machine code. This is
128: similar to an ordinary compiler front end; typically an interpreter
129: front-end performs no optimization, so it is relatively simple to
130: implement and runs fast.
131:
132: @item The @emph{virtual machine interpreter} executes the virtual
133: machine code.
134:
135: @end itemize
136:
137: Such a division is usually used in interpreters, for modularity as well
138: as for efficiency. The virtual machine code is typically passed between
139: front end and virtual machine interpreter in memory, like in a
140: load-and-go compiler; this avoids the complexity and time cost of
141: writing the code to a file and reading it again.
142:
143: A @emph{virtual machine} (VM) represents the program as a sequence of
144: @emph{VM instructions}, following each other in memory, similar to real
145: machine code. Control flow occurs through VM branch instructions, like
146: in a real machine.
147:
148: In this setup, vmgen can generate most of the code dealing with virtual
149: machine instructions from a simple description of the virtual machine
150: instructions (@pxref...), in particular:
151:
152: @table @emph
153:
154: @item VM instruction execution
155:
156: @item VM code generation
157: Useful in the front end.
158:
159: @item VM code decompiler
160: Useful for debugging the front end.
161:
162: @item VM code tracing
163: Useful for debugging the front end and the VM interpreter. You will
164: typically provide other means for debugging the user's programs at the
165: source level.
166:
167: @item VM code profiling
Useful for optimizing the VM interpreter with superinstructions
169: (@pxref...).
170:
171: @end table
172:
Vmgen supports efficient interpreters through various optimizations, in
particular:
175:
176: @itemize
177:
178: @item Threaded code
179:
180: @item Caching the top-of-stack in a register
181:
182: @item Combining VM instructions into superinstructions
183:
184: @item
185: Replicating VM (super)instructions for better BTB prediction accuracy
186: (not yet in vmgen-ex, but already in Gforth).
187:
188: @end itemize
189:
190: As a result, vmgen-based interpreters are only about an order of
magnitude slower than native code from an optimizing C compiler on small
192: benchmarks; on large benchmarks, which spend more time in the run-time
193: system, the slowdown is often less (e.g., the slowdown of a
194: Vmgen-generated JVM interpreter over the best JVM JIT compiler we
195: measured is only a factor of 2-3 for large benchmarks; some other JITs
196: and all other interpreters we looked at were slower than our
197: interpreter).
198:
199: VMs are usually designed as stack machines (passing data between VM
200: instructions on a stack), and vmgen supports such designs especially
201: well; however, you can also use vmgen for implementing a register VM and
202: still benefit from most of the advantages offered by vmgen.
203:
204: There are many potential uses of the instruction descriptions that are
205: not implemented at the moment, but we are open for feature requests, and
206: we will implement new features if someone asks for them; so the feature
207: list above is not exhaustive.
208:
209: @c *********************************************************************
210: @node Why interpreters?, Concepts, Introduction, Top
211: @chapter Why interpreters?
212:
213: Interpreters are a popular language implementation technique because
214: they combine all three of the following advantages:
215:
216: @itemize
217:
218: @item Ease of implementation
219:
220: @item Portability
221:
222: @item Fast edit-compile-run cycle
223:
224: @end itemize
225:
226: The main disadvantage of interpreters is their run-time speed. However,
227: there are huge differences between different interpreters in this area:
228: the slowdown over optimized C code on programs consisting of simple
229: operations is typically a factor of 10 for the more efficient
230: interpreters, and a factor of 1000 for the less efficient ones (the
231: slowdown for programs executing complex operations is less, because the
232: time spent in libraries for executing complex operations is the same in
233: all implementation strategies).
234:
235: Vmgen makes it even easier to implement interpreters. It also supports
236: techniques for building efficient interpreters.
237:
238: @c ********************************************************************
239: @node Concepts, Invoking vmgen, Why interpreters?, Top
240: @chapter Concepts
241:
242: @menu
243: * Front end and VM interpreter:: Modularizing an interpretive system
244: * Data handling:: Stacks, registers, immediate arguments
245: * Dispatch:: From one VM instruction to the next
246: @end menu
247:
248: @c --------------------------------------------------------------------
249: @node Front end and VM interpreter, Data handling, Concepts, Concepts
250: @section Front end and VM interpreter
251:
252: @cindex front-end
253: Interpretive systems are typically divided into a @emph{front end} that
254: parses the input language and produces an intermediate representation
255: for the program, and an interpreter that executes the intermediate
256: representation of the program.
257:
258: @cindex virtual machine
259: @cindex VM
260: @cindex instruction, VM
261: For efficient interpreters the intermediate representation of choice is
262: virtual machine code (rather than, e.g., an abstract syntax tree).
263: @emph{Virtual machine} (VM) code consists of VM instructions arranged
264: sequentially in memory; they are executed in sequence by the VM
265: interpreter, except for VM branch instructions, which implement control
266: structures. The conceptual similarity to real machine code results in
267: the name @emph{virtual machine}.
268:
269: In this framework, vmgen supports building the VM interpreter and any
270: other component dealing with VM instructions. It does not have any
271: support for the front end, apart from VM code generation support. The
272: front end can be implemented with classical compiler front-end
273: techniques, supported by tools like @command{flex} and @command{bison}.
274:
275: The intermediate representation is usually just internal to the
276: interpreter, but some systems also support saving it to a file, either
277: as an image file, or in a full-blown linkable file format (e.g., JVM).
278: Vmgen currently has no special support for such features, but the
279: information in the instruction descriptions can be helpful, and we are
280: open for feature requests and suggestions.
281:
282: @c --------------------------------------------------------------------
283: @node Data handling, Dispatch, Front end and VM interpreter, Concepts
284: @section Data handling
285:
286: @cindex stack machine
287: @cindex register machine
288: Most VMs use one or more stacks for passing temporary data between VM
289: instructions. Another option is to use a register machine architecture
290: for the virtual machine; however, this option is either slower or
291: significantly more complex to implement than a stack machine architecture.
292:
293: Vmgen has special support and optimizations for stack VMs, making their
294: implementation easy and efficient.
295:
296: You can also implement a register VM with vmgen (@pxref{Register
297: Machines}), and you will still profit from most vmgen features.
298:
299: @cindex stack item size
300: @cindex size, stack items
301: Stack items all have the same size, so they typically will be as wide as
302: an integer, pointer, or floating-point value. Vmgen supports treating
303: two consecutive stack items as a single value, but anything larger is
304: best kept in some other memory area (e.g., the heap), with pointers to
305: the data on the stack.
306:
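As a rough illustration of treating two consecutive stack items as a
single value, here is a minimal C sketch. It is not code produced by
Vmgen: the names @code{DCell}, @code{push_double}, and
@code{pop_double} are hypothetical, and the sketch assumes that a
@code{long long} fits into two @code{long}-sized stack items and that
the stack grows towards lower addresses.

@verbatim
#include <string.h>

typedef long Cell;          /* one stack item */
typedef long long DCell;    /* hypothetical value of up to two items */

/* Compile-time check: a DCell must fit into two Cells. */
typedef char dcell_fits[(sizeof(DCell) <= 2 * sizeof(Cell)) ? 1 : -1];

/* Push d as two consecutive stack items; returns the new stack pointer. */
Cell *push_double(Cell *sp, DCell d)
{
  sp -= 2;
  sp[0] = 0;
  sp[1] = 0;
  memcpy(sp, &d, sizeof d);
  return sp;
}

/* Pop two consecutive stack items into *d; returns the new stack pointer. */
Cell *pop_double(Cell *sp, DCell *d)
{
  memcpy(d, sp, sizeof *d);
  return sp + 2;
}
@end verbatim

With a stack pointer @code{sp}, @code{sp = push_double(sp, d)} uses up
two items, and @code{sp = pop_double(sp, &d)} releases them again.
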
307: @cindex instruction stream
308: @cindex immediate arguments
Another source of data is the immediate arguments of VM instructions (in
the VM instruction stream). The VM instruction stream is handled
similarly to a stack in vmgen.
312:
313: @cindex garbage collection
314: @cindex reference counting
315: Vmgen has no built-in support for nor restrictions against @emph{garbage
316: collection}. If you need garbage collection, you need to provide it in
317: your run-time libraries. Using @emph{reference counting} is probably
318: harder, but might be possible (contact us if you are interested).
319: @c reference counting might be possible by including counting code in
320: @c the conversion macros.
321:
322: @c --------------------------------------------------------------------
323: @node Dispatch, , Data handling, Concepts
324: @section Dispatch
325:
326: Understanding this section is probably not necessary for using vmgen,
but it may help. You may want to skip it now, and read it if you find
statements about dispatch methods confusing.
328:
329: After executing one VM instruction, the VM interpreter has to dispatch
330: the next VM instruction (vmgen calls the dispatch routine @samp{NEXT}).
331: Vmgen supports two methods of dispatch:
332:
@table @emph
334:
335: @item switch dispatch
336: In this method the VM interpreter contains a giant @code{switch}
337: statement, with one @code{case} for each VM instruction. The VM
338: instructions are represented by integers (e.g., produced by an
@code{enum}) in the VM code, and dispatch occurs by loading the next
340: integer from the VM code, @code{switch}ing on it, and continuing at the
341: appropriate @code{case}; after executing the VM instruction, jump back
342: to the dispatch code.
343:
344: @item threaded code
345: This method represents a VM instruction in the VM code by the address of
346: the start of the machine code fragment for executing the VM instruction.
347: Dispatch consists of loading this address, jumping to it, and
348: incrementing the VM instruction pointer. Typically the threaded-code
349: dispatch code is appended directly to the code for executing the VM
350: instruction. Threaded code cannot be implemented in ANSI C, but it can
351: be implemented using GNU C's labels-as-values extension (@pxref{labels
352: as values}).
353:
354: @end table
355:
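The two dispatch methods can be sketched in C as follows. This is a
hand-written illustration for a hypothetical three-instruction VM
(@code{OP_LIT}, @code{OP_ADD}, @code{OP_HALT}), not Vmgen output. The
function names are made up, the sketch assumes well-formed VM code that
ends in @code{OP_HALT}, the fixed-size stack is not checked for
overflow, and @code{run_threaded} needs a compiler that supports GNU C's
labels-as-values extension.

@verbatim
#include <stddef.h>

enum { OP_LIT, OP_ADD, OP_HALT };     /* hypothetical VM instructions */

/* Switch dispatch: the VM code is an array of small integers. */
long run_switch(const long *ip)
{
  long stack[16];
  long *sp = stack + 16;              /* grows towards lower addresses */

  for (;;) {
    switch (*ip++) {                  /* load next instruction, dispatch */
    case OP_LIT:  *--sp = *ip++;          break;  /* immediate argument */
    case OP_ADD:  sp[1] += sp[0]; sp++;   break;
    case OP_HALT: return sp[0];
    default:      return -1;          /* unknown instruction */
    }
  }
}

/* Threaded code (GNU C): each VM instruction is represented by the
   address of the C label that starts its implementation. */
long run_threaded(const long *code, size_t len)
{
  static void *labels[] = { &&lit, &&add, &&halt };
  union slot { void *addr; long n; } tc[64], *ip = tc;
  long stack[16];
  long *sp = stack + 16;
  size_t i;

  if (len == 0 || len > 64)
    return -1;
  /* Translate the integer codes into label addresses. */
  for (i = 0; i < len; i++) {
    long op = code[i];
    if (op < OP_LIT || op > OP_HALT)
      return -1;
    tc[i].addr = labels[op];
    if (op == OP_LIT && i + 1 < len) {  /* copy the immediate argument */
      i++;
      tc[i].n = code[i];
    }
  }

#define NEXT goto *(ip++)->addr       /* load address, advance, jump */
  NEXT;
lit:  *--sp = (ip++)->n;        NEXT;
add:  sp[1] += sp[0]; sp++;     NEXT;
halt: return sp[0];
#undef NEXT
}
@end verbatim

For the VM code @code{OP_LIT 3 OP_LIT 4 OP_ADD OP_HALT}, both functions
return 7. Note how, in the threaded version, the dispatch code
(@code{NEXT}) is appended to the code of every VM instruction instead of
being shared.
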
356: @c *************************************************************
357: @node Invoking vmgen, Example, Concepts, Top
358: @chapter Invoking vmgen
359:
360: The usual way to invoke vmgen is as follows:
361:
362: @example
363: vmgen @var{infile}
364: @end example
365:
366: Here @var{infile} is the VM instruction description file, which usually
367: ends in @file{.vmg}. The output filenames are made by taking the
basename of @var{infile} (i.e., the output files will be created in the
current working directory) and replacing @file{.vmg} with @file{-vm.i},
@file{-disasm.i}, @file{-gen.i}, @file{-labels.i}, @file{-profile.i},
and @file{-peephole.i}. E.g., @command{vmgen hack/foo.vmg} will create
@file{foo-vm.i} etc.
373:
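The naming rule can be sketched in C as follows. This only illustrates
the rule described above and is not how Vmgen itself computes the names;
the function @code{output_name} and its interface are hypothetical, and
what the sketch does for a name that does not end in @file{.vmg} (it
just appends the suffix) is an assumption.

@verbatim
#include <stdio.h>
#include <string.h>

/* Build the output file name for infile and one of the suffixes
   ("-vm.i", "-disasm.i", ...).  Returns 0 on success, -1 if the
   result does not fit into out. */
int output_name(const char *infile, const char *suffix,
                char *out, size_t outsize)
{
  const char *base = strrchr(infile, '/');
  size_t stem;
  int n;

  base = base ? base + 1 : infile;      /* drop the directory part */
  stem = strlen(base);
  if (stem >= 4 && strcmp(base + stem - 4, ".vmg") == 0)
    stem -= 4;                          /* drop the ".vmg" extension */
  n = snprintf(out, outsize, "%.*s%s", (int)stem, base, suffix);
  return (n >= 0 && (size_t)n < outsize) ? 0 : -1;
}
@end verbatim

For @file{hack/foo.vmg} and the suffix @file{-vm.i}, the sketch produces
@file{foo-vm.i}.
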
374: The command-line options supported by vmgen are
375:
376: @table @option
377:
378: @cindex -h, command-line option
379: @cindex --help, command-line option
380: @item --help
381: @itemx -h
Print a message about the command-line options.
383:
384: @cindex -v, command-line option
385: @cindex --version, command-line option
386: @item --version
387: @itemx -v
Print version and exit.
389: @end table
390:
391: @c env vars GFORTHDIR GFORTHDATADIR
392:
393: @c ****************************************************************
394: @node Example, Input File Format, Invoking vmgen, Top
395: @chapter Example
396:
397: @menu
398: * Example overview::
399: * Using profiling to create superinstructions::
400: @end menu
401:
402: @c --------------------------------------------------------------------
403: @node Example overview, Using profiling to create superinstructions, Example, Example
404: @section Example overview
405:
406: There are two versions of the same example for using vmgen:
@file{vmgen-ex} and @file{vmgen-ex2} (you can also see Gforth as an
example, but it uses additional (undocumented) features, and also
differs in some other respects). The example implements @emph{mini}, a
tiny Modula-2-like language with a small JavaVM-like virtual machine.
The difference between the examples is that @file{vmgen-ex} uses many
casts, and @file{vmgen-ex2} tries to avoid most casts and uses unions
instead.
414:
415: The files provided with each example are:
416:
417: @example
418: Makefile
419: README
420: disasm.c wrapper file
421: engine.c wrapper file
422: peephole.c wrapper file
423: profile.c wrapper file
424: mini-inst.vmg simple VM instructions
425: mini-super.vmg superinstructions (empty at first)
426: mini.h common declarations
427: mini.l scanner
428: mini.y front end (parser, VM code generator)
429: support.c main() and other support functions
430: fib.mini example mini program
431: simple.mini example mini program
432: test.mini example mini program (tests everything)
433: test.out test.mini output
434: stat.awk script for aggregating profile information
435: peephole-blacklist list of instructions not allowed in superinstructions
436: seq2rule.awk script for creating superinstructions
437: @end example
438:
439: For your own interpreter, you would typically copy the following files
440: and change little, if anything:
441:
442: @example
443: disasm.c wrapper file
444: engine.c wrapper file
445: peephole.c wrapper file
446: profile.c wrapper file
447: stat.awk script for aggregating profile information
448: seq2rule.awk script for creating superinstructions
449: @end example
450:
451: You would typically change much in or replace the following files:
452:
453: @example
454: Makefile
455: mini-inst.vmg simple VM instructions
456: mini.h common declarations
457: mini.l scanner
458: mini.y front end (parser, VM code generator)
459: support.c main() and other support functions
460: peephole-blacklist list of instructions not allowed in superinstructions
461: @end example
462:
463: You can build the example by @code{cd}ing into the example's directory,
464: and then typing @samp{make}; you can check that it works with @samp{make
check}. You can run mini programs like this:
466:
467: @example
468: ./mini fib.mini
469: @end example
470:
471: To learn about the options, type @samp{./mini -h}.
472:
473: @c --------------------------------------------------------------------
474: @node Using profiling to create superinstructions, , Example overview, Example
475: @section Using profiling to create superinstructions
476:
477: I have not added rules for this in the @file{Makefile} (there are many
478: options for selecting superinstructions, and I did not want to hardcode
479: one into the @file{Makefile}), but there are some supporting scripts, and
480: here's an example:
481:
If you want to use @file{fib.mini} and @file{test.mini} as training
programs, you get the profiles like this:
484:
485: @example
486: make fib.prof test.prof #takes a few seconds
487: @end example
488:
489: You can aggregate these profiles with @file{stat.awk}:
490:
491: @example
492: awk -f stat.awk fib.prof test.prof
493: @end example
494:
495: The result contains lines like:
496:
497: @example
498: 2 16 36910041 loadlocal lit
499: @end example
500:
501: This means that the sequence @code{loadlocal lit} statically occurs a
502: total of 16 times in 2 profiles, with a dynamic execution count of
503: 36910041.
504:
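If you want to process such lines in a program rather than with
@command{awk}, a sketch in C could look as follows. The structure
@code{SeqStat} and the function @code{parse_stat_line} are hypothetical,
and the sketch assumes that the three numbers and the sequence are
separated by white space, as in the line shown above.

@verbatim
#include <stdio.h>
#include <string.h>

typedef struct {
  int profiles;          /* number of profiles containing the sequence */
  long static_count;     /* static occurrences, summed over the profiles */
  long dynamic_count;    /* dynamic execution count */
  char sequence[128];    /* the VM instructions, separated by spaces */
} SeqStat;

/* Returns 1 on success, 0 if the line does not have the expected shape. */
int parse_stat_line(const char *line, SeqStat *s)
{
  int pos = 0;
  size_t n;

  if (sscanf(line, "%d %ld %ld %n", &s->profiles, &s->static_count,
             &s->dynamic_count, &pos) < 3 || pos == 0)
    return 0;
  n = strcspn(line + pos, "\n");        /* rest of the line */
  if (n == 0 || n >= sizeof s->sequence)
    return 0;
  memcpy(s->sequence, line + pos, n);
  s->sequence[n] = '\0';
  return 1;
}
@end verbatim

Selecting sequences by dynamic execution count then amounts to comparing
@code{dynamic_count} with a threshold of your choice.
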
505: The numbers can be used in various ways to select superinstructions.
E.g., if you just want to select all sequences with a dynamic
execution count of at least 10000, you would use the following pipeline:
508:
509: @example
510: awk -f stat.awk fib.prof test.prof|
511: awk '$3>=10000'| #select sequences
512: fgrep -v -f peephole-blacklist| #eliminate wrong instructions
513: awk -f seq2rule.awk| #transform sequences into superinstruction rules
514: sort -k 3 >mini-super.vmg #sort sequences
515: @end example
516:
517: The file @file{peephole-blacklist} contains all instructions that
518: directly access a stack or stack pointer (for mini: @code{call},
519: @code{return}); the sort step is necessary to ensure that prefixes
precede larger superinstructions.
521:
522: Now you can create a version of mini with superinstructions by just
saying @samp{make}.
524:
525:
526: @c ***************************************************************
527: @node Input File Format, Using the generated code, Example, Top
528: @chapter Input File Format
529:
530: Vmgen takes as input a file containing specifications of virtual machine
531: instructions. This file usually has a name ending in @file{.vmg}.
532:
533: Most examples are taken from the example in @file{vmgen-ex}.
534:
535: @menu
536: * Input File Grammar::
537: * Simple instructions::
538: * Superinstructions::
539: @end menu
540:
541: @c --------------------------------------------------------------------
542: @node Input File Grammar, Simple instructions, Input File Format, Input File Format
543: @section Input File Grammar
544:
545: The grammar is in EBNF format, with @code{@var{a}|@var{b}} meaning
546: ``@var{a} or @var{b}'', @code{@{@var{c}@}} meaning 0 or more repetitions
547: of @var{c} and @code{[@var{d}]} meaning 0 or 1 repetitions of @var{d}.
548:
549: Vmgen input is not free-format, so you have to take care where you put
550: spaces and especially newlines; it's not as bad as makefiles, though:
551: any sequence of spaces and tabs is equivalent to a single space.
552:
553: @example
description: @{instruction|comment|eval-escape@}

instruction: simple-inst|super-inst

simple-inst: ident " (" stack-effect " )" newline c-code newline newline

stack-effect: @{ident@} " --" @{ident@}

super-inst: ident " =" ident @{ident@}
563:
564: comment: "\ " text newline
565:
eval-escape: "\E " text newline
567: @end example
568: @c \+ \- \g \f \c
569:
570: Note that the @code{\}s in this grammar are meant literally, not as
571: C-style encodings for non-printable characters.
572:
The C code in @code{simple-inst} must not contain empty lines (because
vmgen would mistake that for the end of the simple-inst). The text in
575: @code{comment} and @code{eval-escape} must not contain a newline.
576: @code{Ident} must conform to the usual conventions of C identifiers
577: (otherwise the C compiler would choke on the vmgen output).
578:
579: Vmgen understands a few extensions beyond the grammar given here, but
580: these extensions are only useful for building Gforth. You can find a
581: description of the format used for Gforth in @file{prim}.
582:
583: @subsection Eval escapes
584: @c woanders?
585: The text in @code{eval-escape} is Forth code that is evaluated when
586: vmgen reads the line. If you do not know (and do not want to learn)
587: Forth, you can build the text according to the following grammar; these
rules are normally all the Forth you need for using vmgen:
589:
590: @example
591: text: stack-decl|type-prefix-decl|stack-prefix-decl
592:
593: stack-decl: "stack " ident ident ident
594: type-prefix-decl:
595: 's" ' string '" ' ("single"|"double") ident "type-prefix" ident
596: stack-prefix-decl: ident "stack-prefix" string
597: @end example
598:
599: Note that the syntax of this code is not checked thoroughly (there are
600: many other Forth program fragments that could be written there).
601:
602: If you know Forth, the stack effects of the non-standard words involved
603: are:
604:
605: @example
606: stack ( "name" "pointer" "type" -- )
607: ( name execution: -- stack )
608: type-prefix ( addr u xt1 xt2 n stack "prefix" -- )
609: single ( -- xt1 xt2 n )
610: double ( -- xt1 xt2 n )
611: stack-prefix ( stack "prefix" -- )
612: @end example
613:
614:
615: @c --------------------------------------------------------------------
616: @node Simple instructions, Superinstructions, Input File Grammar, Input File Format
617: @section Simple instructions
618:
We will use the following simple VM instruction description as an example:
620:
621: @example
622: sub ( i1 i2 -- i )
623: i = i1-i2;
624: @end example
625:
626: The first line specifies the name of the VM instruction (@code{sub}) and
627: its stack effect (@code{i1 i2 -- i}). The rest of the description is
628: just plain C code.
629:
630: @cindex stack effect
631: The stack effect specifies that @code{sub} pulls two integers from the
632: data stack and puts them in the C variables @code{i1} and @code{i2} (with
633: the rightmost item (@code{i2}) taken from the top of stack) and later
pushes one integer (@code{i}) on the data stack (the rightmost item is
635: on the top afterwards).
636:
637: How do we know the type and stack of the stack items? Vmgen uses
638: prefixes, similar to Fortran; in contrast to Fortran, you have to
639: define the prefix first:
640:
641: @example
642: \E s" Cell" single data-stack type-prefix i
643: @end example
644:
645: This defines the prefix @code{i} to refer to the type @code{Cell}
646: (defined as @code{long} in @file{mini.h}) and, by default, to the
647: @code{data-stack}. It also specifies that this type takes one stack
648: item (@code{single}). The type prefix is part of the variable name.
649:
650: Before we can use @code{data-stack} in this way, we have to define it:
651:
652: @example
653: \E stack data-stack sp Cell
654: @end example
655: @c !! use something other than Cell
656:
657: This line defines the stack @code{data-stack}, which uses the stack
658: pointer @code{sp}, and each item has the basic type @code{Cell}; other
659: types have to fit into one or two @code{Cell}s (depending on whether the
660: type is @code{single} or @code{double} wide), and are converted from and
to Cells on accessing the @code{data-stack} with conversion macros
662: (@pxref{Conversion macros}). Stacks grow towards lower addresses in
663: vmgen-erated interpreters.
664:
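To make the stack effect of @code{sub} concrete, the following
hand-written C sketch shows roughly what has to happen around the C code
of the instruction. It is not actual Vmgen output: the function name
@code{vm_sub} is hypothetical, the conversion macros are left out, and
the stack pointer is passed and returned explicitly only to keep the
sketch self-contained.

@verbatim
typedef long Cell;

/* Executes "sub ( i1 i2 -- i )" on a stack that grows towards lower
   addresses; returns the updated stack pointer. */
Cell *vm_sub(Cell *sp)
{
  Cell i1, i2, i;

  i2 = sp[0];        /* the rightmost input is the top of stack */
  i1 = sp[1];
  sp += 1;           /* two items consumed, one produced */
  i = i1 - i2;       /* the C code from the description */
  sp[0] = i;         /* the result is on top afterwards */
  return sp;
}
@end verbatim

With 10 and 3 on the stack (3 on top), the sketch leaves the single
item 7.
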
665: We can override the default stack of a stack item by using a stack
666: prefix. E.g., consider the following instruction:
667:
668: @example
669: lit ( #i -- i )
670: @end example
671:
672: The VM instruction @code{lit} takes the item @code{i} from the
673: instruction stream (indicated by the prefix @code{#}), and pushes it on
674: the (default) data stack. The stack prefix is not part of the variable
675: name. Stack prefixes are defined like this:
676:
677: @example
678: \E inst-stream stack-prefix #
679: @end example
680:
This definition specifies that the stack prefix @code{#} refers to the
``stack'' @code{inst-stream}. Since the instruction stream behaves a
683: little differently than an ordinary stack, it is predefined, and you do
684: not need to define it.
685:
686: The instruction stream contains instructions and their immediate
687: arguments, so specifying that an argument comes from the instruction
688: stream indicates an immediate argument. Of course, instruction stream
689: arguments can only appear to the left of @code{--} in the stack effect.
690: If there are multiple instruction stream arguments, the leftmost is the
691: first one (just as the intuition suggests).
692:
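As an illustration of an immediate argument, here is a minimal C sketch
of what executing @code{lit} involves. It is not Vmgen output: the
function name @code{vm_lit} is hypothetical, and the sketch assumes that
the instruction stream is an array of @code{Cell}s and that the
instruction pointer already points at the immediate argument.

@verbatim
typedef long Cell;

/* Executes "lit ( #i -- i )".  *ip points just past the lit
   instruction, i.e., at its immediate argument. */
Cell *vm_lit(const Cell **ip, Cell *sp)
{
  Cell i = *(*ip)++;   /* fetch i from the instruction stream, skip it */

  *--sp = i;           /* push i on the data stack */
  return sp;
}
@end verbatim

After the call, the instruction pointer points at whatever follows the
immediate argument in the instruction stream.
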
693: @menu
694: * C Code Macros:: Macros recognized by Vmgen
695: * C Code restrictions:: Vmgen makes assumptions about C code
696: @end menu
697:
698: @c --------------------------------------------------------------------
699: @node C Code Macros, C Code restrictions, Simple instructions, Simple instructions
700: @subsection C Code Macros
701:
702: Vmgen recognizes the following strings in the C code part of simple
703: instructions:
704:
705: @table @samp
706:
707: @item SET_IP
708: As far as vmgen is concerned, a VM instruction containing this ends a VM
709: basic block (used in profiling to delimit profiled sequences). On the C
710: level, this also sets the instruction pointer.
711:
712: @item SUPER_END
713: This ends a basic block (for profiling), without a SET_IP.
714:
715: @item TAIL;
716: Vmgen replaces @samp{TAIL;} with code for ending a VM instruction and
717: dispatching the next VM instruction. This happens automatically when
718: control reaches the end of the C code. If you want to have this in the
719: middle of the C code, you need to use @samp{TAIL;}. A typical example
720: is a conditional VM branch:
721:
722: @example
if (branch_condition) @{
  SET_IP(target); TAIL;
@}
726: /* implicit tail follows here */
727: @end example
728:
729: In this example, @samp{TAIL;} is not strictly necessary, because there
730: is another one implicitly after the if-statement, but using it improves
731: branch prediction accuracy slightly and allows other optimizations.
732:
733: @item SUPER_CONTINUE
734: This indicates that the implicit tail at the end of the VM instruction
735: dispatches the sequentially next VM instruction even if there is a
736: @code{SET_IP} in the VM instruction. This enables an optimization that
737: is not yet implemented in the vmgen-ex code (but in Gforth). The
738: typical application is in conditional VM branches:
739:
740: @example
if (branch_condition) @{
  SET_IP(target); TAIL; /* now this TAIL is necessary */
@}
744: SUPER_CONTINUE;
745: @end example
746:
747: @end table
748:
749: Note that vmgen is not smart about C-level tokenization, comments,
750: strings, or conditional compilation, so it will interpret even a
751: commented-out SUPER_END as ending a basic block (or, e.g.,
752: @samp{RETAIL;} as @samp{TAIL;}). Conversely, vmgen requires the literal
753: presence of these strings; vmgen will not see them if they are hiding in
754: a C preprocessor macro.
755:
756:
757: @c --------------------------------------------------------------------
758: @node C Code restrictions, , C Code Macros, Simple instructions
759: @subsection C Code restrictions
760:
761: Vmgen generates code and performs some optimizations under the
762: assumption that the user-supplied C code does not access the stack
763: pointers or stack items, and that accesses to the instruction pointer
764: only occur through special macros. In general you should heed these
765: restrictions. However, if you need to break these restrictions, read
766: the following.
767:
768: Accessing a stack or stack pointer directly can be a problem for several
769: reasons:
770:
771: @itemize
772:
773: @item
774: You may cache the top-of-stack item in a local variable (that is
775: allocated to a register). This is the most frequent source of trouble.
776: You can deal with it either by not using top-of-stack caching (slowdown
777: factor 1-1.4, depending on machine), or by inserting flushing code
778: (e.g., @samp{IF_spTOS(sp[...] = spTOS);}) at the start and reloading
779: code (e.g., @samp{IF_spTOS(spTOS = sp[0])}) at the end of problematic C
780: code. Vmgen inserts a stack pointer update before the start of the
781: user-supplied C code, so the flushing code has to use an index that
782: corrects for that. In the future, this flushing may be done
783: automatically by mentioning a special string in the C code.
784: @c sometimes flushing and/or reloading unnecessary
785:
786: @item
787: The vmgen-erated code loads the stack items from stack-pointer-indexed
788: memory into variables before the user-supplied C code, and stores them
789: from variables to stack-pointer-indexed memory afterwards. If you do
790: any writes to the stack through its stack pointer in your C code, it
will not affect the variables, and your write may be overwritten by the
792: stores after the C code. Similarly, a read from a stack using a stack
793: pointer will not reflect computations of stack items in the same VM
794: instruction.
795:
796: @item
797: Superinstructions keep stack items in variables across the whole
superinstruction. So you should not include VM instructions that
access a stack or stack pointer as components of superinstructions.
800:
801: @end itemize
802:
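The first problem can be illustrated with a small C model of
top-of-stack caching. This is a sketch under simplifying assumptions,
not the code that Vmgen generates: the type @code{VM} and the functions
are hypothetical, and in a real engine the stack pointer and the cached
item would be local variables of the engine function rather than members
of a structure.

@verbatim
typedef long Cell;

typedef struct {
  Cell *sp;      /* sp[0] is the memory slot of the top-of-stack item */
  Cell spTOS;    /* cached copy of the top-of-stack item */
} VM;

/* stack must have room for n items; one slot is sacrificed as the
   slot of the (not yet meaningful) initial top item. */
void vm_init(VM *vm, Cell *stack, int n)
{
  vm->sp = stack + n - 1;
  vm->spTOS = 0;
}

void vm_push(VM *vm, Cell x)
{
  vm->sp[0] = vm->spTOS;   /* the old top item moves to memory */
  vm->sp--;
  vm->spTOS = x;           /* the new top item lives only in the cache */
}

void vm_add(VM *vm)
{
  vm->spTOS = vm->sp[1] + vm->spTOS;
  vm->sp++;                /* sp[0] is now stale */
}

void vm_flush(VM *vm)  { vm->sp[0] = vm->spTOS; }  /* before direct access */
void vm_reload(VM *vm) { vm->spTOS = vm->sp[0]; }  /* after direct access */
@end verbatim

After pushing 3 and 4 and adding them, @code{spTOS} is 7, while
@code{sp[0]} still holds the stale 3 until @code{vm_flush} is called.
Code that reads or writes @code{sp[0]} directly therefore has to be
bracketed by flushing and reloading, in the spirit of the
@samp{IF_spTOS} code mentioned above.
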
803: You should access the instruction pointer only through its special
macros (@samp{IP}, @samp{SET_IP}, @samp{IPTOS}); this ensures that these
805: macros can be implemented in several ways for best performance.
806: @samp{IP} points to the next instruction, and @samp{IPTOS} is its
807: contents.
808:
809:
810: @c --------------------------------------------------------------------
811: @node Superinstructions, , Simple instructions, Input File Format
812: @section Superinstructions
813:
814: Note: don't invest too much work in (static) superinstructions; a future
815: version of vmgen will support dynamic superinstructions (see Ian
816: Piumarta and Fabio Riccardi, @cite{Optimizing Direct Threaded Code by
817: Selective Inlining}, PLDI'98), and static superinstructions have much
818: less benefit in that context.
819:
820: Here is an example of a superinstruction definition:
821:
822: @example
823: lit_sub = lit sub
824: @end example
825:
826: @code{lit_sub} is the name of the superinstruction, and @code{lit} and
827: @code{sub} are its components. This superinstruction performs the same
828: action as the sequence @code{lit} and @code{sub}. It is generated
829: automatically by the VM code generation functions whenever that sequence
830: occurs, so you only need to add this definition if you want to use this
superinstruction (and even that can be partially automated,
832: @pxref{...}).
833:
834: Vmgen requires that the component instructions are simple instructions
defined before superinstructions using the components. Currently, vmgen
also requires that all the subsequences at the start of a
superinstruction (prefixes) are defined as superinstructions before
the superinstruction itself. I.e., if you want to define a superinstruction
839:
840: @example
841: sumof5 = add add add add
842: @end example
843:
844: you first have to define
845:
846: @example
847: add ( n1 n2 -- n )
848: n = n1+n2;
849:
850: sumof3 = add add
851: sumof4 = add add add
852: @end example
853:
854: Here, @code{sumof4} is the longest prefix of @code{sumof5}, and @code{sumof3}
855: is the longest prefix of @code{sumof4}.
856:
857: Note that vmgen assumes that only the code it generates accesses stack
858: pointers, the instruction pointer, and various stack items, and it
859: performs optimizations based on this assumption. Therefore, VM
instructions that change the instruction pointer should only be used as
the last component; a VM instruction that accesses a stack pointer should
not be used as a component at all. Vmgen does not check these
restrictions; violating them just results in bugs in your interpreter.
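
Since vmgen does not verify the prefix requirement for you, a small
stand-alone check can help. The following C sketch is not part of vmgen;
the @code{Super} structure and the function names are assumptions made
for this illustration. It checks, in definition order, that the longest
proper prefix of every superinstruction was defined earlier; because
each earlier definition has passed the same check, all shorter prefixes
are then covered as well.

@verbatim
```c
#include <assert.h>
#include <string.h>

#define MAX_COMPONENTS 8

/* One superinstruction definition: name = c[0] c[1] ... c[n-1] */
typedef struct {
  const char *name;
  int n;
  const char *c[MAX_COMPONENTS];
} Super;

/* Is definition p exactly the first len components of s? */
static int is_prefix_def(const Super *p, const Super *s, int len)
{
  if (p->n != len)
    return 0;
  for (int i = 0; i < len; i++)
    if (strcmp(p->c[i], s->c[i]) != 0)
      return 0;
  return 1;
}

/* Returns the index of the first definition whose longest proper
   prefix is missing among the earlier definitions, or -1 if every
   definition is fine.  A prefix of length 1 is a simple instruction
   and needs no superinstruction definition. */
int first_missing_prefix(const Super *defs, int ndefs)
{
  for (int i = 0; i < ndefs; i++) {
    assert(defs[i].n >= 2 && defs[i].n <= MAX_COMPONENTS);
    int len = defs[i].n - 1;
    int found = (len < 2);
    for (int j = 0; j < i && !found; j++)
      found = is_prefix_def(&defs[j], &defs[i], len);
    if (!found)
      return i;
  }
  return -1;
}
```
@end verbatim

For the definitions @code{sumof3}, @code{sumof4}, @code{sumof5} given in
that order, the function reports no problem; if @code{sumof4} is left
out, it points at @code{sumof5}.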
864:
865: @c ********************************************************************
866: @node Using the generated code, Changes, Input File Format, Top
867: @chapter Using the generated code
868:
869: The easiest way to create a working VM interpreter with vmgen is
870: probably to start with one of the examples, and modify it for your
871: purposes. This chapter is just the reference manual for the macros
872: etc. used by the generated code, the other context expected by the
873: generated code, and what you can do with the various generated files.
874:
875: @menu
876: * VM engine:: Executing VM code
877: * VM instruction table::
878: * VM code generation:: Creating VM code (in the front-end)
879: * Peephole optimization:: Creating VM superinstructions
880: * VM disassembler:: for debugging the front end
881: * VM profiler:: for finding worthwhile superinstructions
882: @end menu
883:
884: @c --------------------------------------------------------------------
885: @node VM engine, VM instruction table, Using the generated code, Using the generated code
886: @section VM engine
887:
888: The VM engine is the VM interpreter that executes the VM code. It is
889: essential for an interpretive system.
890:
891: Vmgen supports two methods of VM instruction dispatch: @emph{threaded
892: code} (fast, but gcc-specific), and @emph{switch dispatch} (slow, but
893: portable across C compilers); you can use conditional compilation
894: (@samp{defined(__GNUC__)}) to choose between these methods, and our
895: example does so.
896:
897: For both methods, the VM engine is contained in a C-level function.
898: Vmgen generates most of the contents of the function for you
899: (@file{@var{name}-vm.i}), but you have to define this function, and
900: macros and variables used in the engine, and initialize the variables.
901: In our example the engine function also includes
902: @file{@var{name}-labels.i} (@pxref{VM instruction table}).
903:
904: The following macros and variables are used in @file{@var{name}-vm.i}:
905:
906: @table @code
907:
908: @item LABEL(@var{inst_name})
909: This is used just before each VM instruction to provide a jump or
@code{switch} label (the @samp{:} is provided by vmgen). For switch
dispatch this should expand to @samp{case @var{label}}; for
threaded-code dispatch this should just expand to @samp{@var{label}}.
In either case @var{label} is usually the @var{inst_name} with some
prefix or suffix to avoid naming conflicts.
915:
916: @item LABEL2(@var{inst_name})
917: This will be used for dynamic superinstructions; at the moment, this
918: should expand to nothing.
919:
920: @item NAME(@var{inst_name_string})
921: Called on entering a VM instruction with a string containing the name of
922: the VM instruction as parameter. In normal execution this should be a
923: noop, but for tracing this usually prints the name, and possibly other
924: information (several VM registers in our example).
925:
926: @item DEF_CA
927: Usually empty. Called just inside a new scope at the start of a VM
928: instruction. Can be used to define variables that should be visible
929: during every VM instruction. If you define this macro as non-empty, you
930: have to provide the finishing @samp{;} in the macro.
931:
932: @item NEXT_P0 NEXT_P1 NEXT_P2
933: The three parts of instruction dispatch. They can be defined in
934: different ways for best performance on various processors (see
935: @file{engine.c} in the example or @file{engine/threaded.h} in Gforth).
@samp{NEXT_P0} is invoked right at the start of the VM instruction (but
937: after @samp{DEF_CA}), @samp{NEXT_P1} right after the user-supplied C
938: code, and @samp{NEXT_P2} at the end. The actual jump has to be
939: performed by @samp{NEXT_P2}.
940:
The simplest variant is if @samp{NEXT_P2} does everything and the other
macros do nothing. Then also related macros like @samp{IP},
@samp{SET_IP}, @samp{INC_IP} and @samp{IPTOS} are very
straightforward to define. For switch dispatch this code consists just
of a jump to the dispatch code (@samp{goto next_inst;} in our example);
for direct threaded code it consists of something like
@samp{(@{cfa=*ip++; goto *cfa;@})}.
948:
949: Pulling code (usually the @samp{cfa=*ip;}) up into @samp{NEXT_P1}
950: usually does not cause problems, but pulling things up into
951: @samp{NEXT_P0} usually requires changing the other macros (and, at least
952: for Gforth on Alpha, it does not buy much, because the compiler often
953: manages to schedule the relevant stuff up by itself). An even more
extreme variant is to pull code up even further, into, e.g., @samp{NEXT_P1} of
955: the previous VM instruction (prefetching, useful on PowerPCs).
956:
957: @item INC_IP(@var{n})
958: This increments @code{IP} by @var{n}.
959:
960: @item SET_IP(@var{target})
961: This sets @code{IP} to @var{target}.
962:
963: @item vm_@var{A}2@var{B}(a,b)
964: Type casting macro that assigns @samp{a} (of type @var{A}) to @samp{b}
965: (of type @var{B}). This is mainly used for getting stack items into
966: variables and back. So you need to define macros for every combination
967: of stack basic type (@code{Cell} in our example) and type-prefix types
968: used with that stack (in both directions). For the type-prefix type,
969: you use the type-prefix (not the C type string) as type name (e.g.,
970: @samp{vm_Cell2i}, not @samp{vm_Cell2Cell}). In addition, you have to
971: define a vm_@var{X}2@var{X} macro for the stack basic type (used in
972: superinstructions).
973:
974: The stack basic type for the predefined @samp{inst-stream} is
975: @samp{Cell}. If you want a stack with the same item size, making its
976: basic type @samp{Cell} usually reduces the number of macros you have to
977: define.
978:
979: Here our examples differ a lot: @file{vmgen-ex} uses casts in these
980: macros, whereas @file{vmgen-ex2} uses union-field selection (or
981: assignment to union fields).
982:
983: @item vm_two@var{A}2@var{B}(a1,a2,b)
984: @item vm_@var{B}2two@var{A}(b,a1,a2)
985: Conversions between two stack items (@code{a1}, @code{a2}) and a
986: variable @code{b} of a type that takes two stack items. This does not
987: occur in our small examples, but you can look at Gforth for examples.
988:
989: @item @var{stackpointer}
990: For each stack used, the stackpointer name given in the stack
991: declaration is used. For a regular stack this must be an l-expression;
992: typically it is a variable declared as a pointer to the stack's basic
993: type. For @samp{inst-stream}, the name is @samp{IP}, and it can be a
994: plain r-value; typically it is a macro that abstracts away the
995: differences between the various implementations of NEXT_P*.
996:
997: @item @var{stackpointer}TOS
998: The top-of-stack for the stack pointed to by @var{stackpointer}. If you
are using top-of-stack caching for that stack, this should be defined as
a variable; if you are not using top-of-stack caching for that stack, this
1001: should be a macro expanding to @samp{@var{stackpointer}[0]}. The stack
1002: pointer for the predefined @samp{inst-stream} is called @samp{IP}, so
1003: the top-of-stack is called @samp{IPTOS}.
1004:
1005: @item IF_@var{stackpointer}TOS(@var{expr})
1006: Macro for executing @var{expr}, if top-of-stack caching is used for the
1007: @var{stackpointer} stack. I.e., this should do @var{expr} if there is
1008: top-of-stack caching for @var{stackpointer}; otherwise it should do
1009: nothing.
1010:
1011: @item SUPER_END
1012: This is used by the VM profiler (@pxref{VM profiler}); it should not do
1013: anything in normal operation, and call @code{vm_count_block(IP)} for
1014: profiling.
1015:
1016: @item SUPER_CONTINUE
1017: This is just a hint to vmgen and does nothing at the C level.
1018:
1019: @item VM_DEBUG
1020: If this is defined, the tracing code will be compiled in (slower
1021: interpretation, but better debugging). Our example compiles two
1022: versions of the engine, a fast-running one that cannot trace, and one
1023: with potential tracing and profiling.
1024:
1025: @item vm_debug
1026: Needed only if @samp{VM_DEBUG} is defined. If this variable contains
1027: true, the VM instructions produce trace output. It can be turned on or
1028: off at any time.
1029:
1030: @item vm_out
1031: Needed only if @samp{VM_DEBUG} is defined. Specifies the file on which
1032: to print the trace output (type @samp{FILE *}).
1033:
1034: @item printarg_@var{type}(@var{value})
1035: Needed only if @samp{VM_DEBUG} is defined. Macro or function for
1036: printing @var{value} in a way appropriate for the @var{type}. This is
1037: used for printing the values of stack items during tracing. @var{Type}
1038: is normally the type prefix specified in a @code{type-prefix} definition
1039: (e.g., @samp{printarg_i}); in superinstructions it is currently the
1040: basic type of the stack.
1041:
1042: @end table
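
To make the roles of these macros more concrete, here is a minimal
sketch of a switch-dispatch engine in the simplest variant, where
@code{NEXT_P2} does all the dispatch work and there is no top-of-stack
caching. The instruction bodies are hand-written approximations of what
the generated file would contribute, and the instruction set
(@code{lit}, @code{add}, @code{end}), the opcode names and
@code{run_demo} are assumptions made for this illustration, not the
code of our example.

@verbatim
```c
#include <assert.h>
#include <stdio.h>

typedef long Cell;
typedef Cell Inst;                    /* switch dispatch: integer opcodes */
enum { I_lit, I_add, I_end };         /* hypothetical instruction set */

static int vm_debug = 0;              /* trace output on/off */
static FILE *vm_out;                  /* trace destination */

#define LABEL(name)     case I_##name
#define LABEL2(name)
#define NAME(string)    if (vm_debug) fprintf(vm_out, "%s\n", string);
#define DEF_CA
#define NEXT_P0
#define NEXT_P1
#define NEXT_P2         goto next_inst
#define IP              ip
#define IPTOS           (*ip)
#define INC_IP(n)       (ip += (n))
#define SET_IP(target)  (ip = (target))
#define spTOS           (sp[0])       /* no top-of-stack caching */
#define IF_spTOS(expr)
#define vm_Cell2i(cell, i)  ((i) = (long)(cell))
#define vm_i2Cell(i, cell)  ((cell) = (Cell)(i))

Cell engine(Inst *ip0, Cell *sp)
{
  Inst *ip = ip0;
  assert(ip0 != NULL && sp != NULL);
 next_inst:
  switch (*ip++) {                    /* ip now points behind the opcode */
  LABEL(lit):                         /* lit ( #i -- i ) */
    NAME("lit")
    {
      DEF_CA
      long i;
      NEXT_P0;
      vm_Cell2i(IPTOS, i);            /* fetch the immediate argument */
      INC_IP(1);
      sp -= 1;
      NEXT_P1;
      vm_i2Cell(i, spTOS);
      NEXT_P2;
    }
  LABEL(add):                         /* add ( i1 i2 -- i ) */
    NAME("add")
    {
      DEF_CA
      long i1, i2, i;
      NEXT_P0;
      vm_Cell2i(sp[1], i1);
      vm_Cell2i(spTOS, i2);
      sp += 1;
      i = i1 + i2;                    /* the user-supplied C code */
      NEXT_P1;
      vm_i2Cell(i, spTOS);
      NEXT_P2;
    }
  LABEL(end):                         /* leave the engine */
    return spTOS;
  }
  return 0;                           /* unknown opcode */
}

/* Runs the VM code "lit 3 lit 4 add end" on a small stack. */
Cell run_demo(void)
{
  Inst code[] = { I_lit, 3, I_lit, 4, I_add, I_end };
  Cell stack[16];
  vm_out = stderr;
  return engine(code, stack + 16);
}
```
@end verbatim

Setting @code{vm_debug} to a non-zero value before calling
@code{engine} makes this sketch print the name of each executed VM
instruction to @code{vm_out}.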
1043:
1044:
1045: @c --------------------------------------------------------------------
1046: @node VM instruction table, VM code generation, VM engine, Using the generated code
1047: @section VM instruction table
1048:
1049: For threaded code we also need to produce a table containing the labels
1050: of all VM instructions. This is needed for VM code generation
1051: (@pxref{VM code generation}), and it has to be done in the engine
1052: function, because the labels are not visible outside. It then has to be
1053: passed outside the function (and assigned to @samp{vm_prim}), to be used
1054: by the VM code generation functions.
1055:
1056: This means that the engine function has to be called first to produce
1057: the VM instruction table, and later, after generating VM code, it has to
1058: be called again to execute the generated VM code (yes, this is ugly).
1059: In our example program, these two modes of calling the engine function
are differentiated by the value of the parameter @code{ip0} (if it equals 0,
1061: then the table is passed out, otherwise the VM code is executed); in our
1062: example, we pass the table out by assigning it to @samp{vm_prim} and
1063: returning from @samp{engine}.
1064:
1065: In our example, we also build such a table for switch dispatch; this is
1066: mainly done for uniformity.
1067:
1068: For switch dispatch, we also need to define the VM instruction opcodes
1069: used as case labels in an @code{enum}.
1070:
1071: For both purposes (VM instruction table, and enum), the file
1072: @file{@var{name}-labels.i} is generated by vmgen. You have to define
1073: the following macro used in this file:
1074:
1075: @table @samp
1076:
1077: @item INST_ADDR(@var{inst_name})
1078: For switch dispatch, this is just the name of the switch label (the same
1079: name as used in @samp{LABEL(@var{inst_name})}), for both uses of
1080: @file{@var{name}-labels.i}. For threaded-code dispatch, this is the
address of the label defined in @samp{LABEL(@var{inst_name})}; the
1082: address is taken with @samp{&&} (@pxref{labels-as-values}).
1083:
1084: @end table
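
The following sketch shows one way the two calling modes and
@code{INST_ADDR} might fit together. It is a simplified illustration
under assumptions, not the code of our example: the instruction bodies
are written by hand instead of being included from the generated files,
and the @code{DISPATCH} macro, the opcode names and @code{run_demo} are
made up here. With a GNU C compatible compiler it uses threaded code
and takes label addresses with @code{&&}; otherwise it falls back to
switch dispatch.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t Cell;
enum { I_lit, I_add, I_end };          /* positions in the table */

#if defined(__GNUC__)                  /* threaded code */
typedef void *Inst;
#define LABEL(name)      L_##name
#define INST_ADDR(name)  &&L_##name
#define DISPATCH         goto *(*ip++)
#else                                  /* switch dispatch */
typedef Cell Inst;
#define LABEL(name)      case I_##name
#define INST_ADDR(name)  I_##name
#define DISPATCH         goto next_inst
#endif

static Inst *vm_prim;                  /* the VM instruction table */

Cell engine(Inst *ip0, Cell *sp)
{
  /* The labels are only visible inside this function, so the table
     has to be built here. */
  static Inst labels[] = { INST_ADDR(lit), INST_ADDR(add), INST_ADDR(end) };
  Inst *ip = ip0;

  if (ip0 == NULL) {                   /* first mode: pass the table out */
    vm_prim = labels;
    return 0;
  }
#if defined(__GNUC__)
  DISPATCH;                            /* second mode: execute VM code */
#else
 next_inst:
  switch (*ip++) {
#endif
  LABEL(lit): *--sp = (Cell)*ip++; DISPATCH;      /* push immediate */
  LABEL(add): sp[1] += sp[0]; sp++; DISPATCH;     /* add top two items */
  LABEL(end): return sp[0];
#if !defined(__GNUC__)
  }
  return 0;                            /* unknown opcode */
#endif
}

/* Fetches the table, builds "lit 3 lit 4 add end", and runs it. */
Cell run_demo(void)
{
  Cell stack[8];
  engine(NULL, NULL);
  assert(vm_prim != NULL);
  Inst code[] = { vm_prim[I_lit], (Inst)(Cell)3, vm_prim[I_lit],
                  (Inst)(Cell)4, vm_prim[I_add], vm_prim[I_end] };
  return engine(code, stack + 8);
}
```
@end verbatim

The front end only ever sees the entries of @code{vm_prim}, so it does
not need to know which dispatch method the engine was compiled with.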
1085:
1086:
1087: @c --------------------------------------------------------------------
1088: @node VM code generation, Peephole optimization, VM instruction table, Using the generated code
1089: @section VM code generation
1090:
1091: Vmgen generates VM code generation functions in @file{@var{name}-gen.i}
1092: that the front end can call to generate VM code. This is essential for
1093: an interpretive system.
1094:
1095: For a VM instruction @samp{x ( #a b #c -- d )}, vmgen generates a
1096: function with the prototype
1097:
1098: @example
1099: void gen_x(Inst **ctp, a_type a, c_type c)
1100: @end example
1101:
1102: The @code{ctp} argument points to a pointer to the next instruction.
1103: @code{*ctp} is increased by the generation functions; i.e., you should
1104: allocate memory for the code to be generated beforehand, and start with
@code{*ctp} set at the start of this memory area. Before running out of
1106: memory, allocate a new area, and generate a VM-level jump to the new
1107: area (this is not implemented in our examples).
1108:
1109: The other arguments correspond to the immediate arguments of the VM
instruction (with their appropriate types as defined in the
@code{type-prefix} declaration).
1112:
1113: The following types, variables, and functions are used in
1114: @file{@var{name}-gen.i}:
1115:
1116: @table @samp
1117:
1118: @item Inst
1119: The type of the VM instruction; if you use threaded code, this is
1120: @code{void *}; for switch dispatch this is an integer type.
1121:
1122: @item vm_prim
1123: The VM instruction table (type: @code{Inst *}, @pxref{VM instruction table}).
1124:
1125: @item gen_inst(Inst **ctp, Inst i)
1126: This function compiles the instruction @code{i}. Take a look at it in
1127: @file{vmgen-ex/peephole.c}. It is trivial when you don't want to use
1128: superinstructions (just the last two lines of the example function), and
1129: slightly more complicated in the example due to its ability to use
1130: superinstructions (@pxref{Peephole optimization}).
1131:
1132: @item genarg_@var{type_prefix}(Inst **ctp, @var{type} @var{type_prefix})
1133: This compiles an immediate argument of @var{type} (as defined in a
1134: @code{type-prefix} definition). These functions are trivial to define
1135: (see @file{vmgen-ex/support.c}). You need one of these functions for
1136: every type that you use as immediate argument.
1137:
1138: @end table
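
As a sketch of how these pieces might fit together when no
superinstructions are used, consider a VM with the instructions
@samp{lit ( #i -- i )} and @samp{add ( i1 i2 -- i )}. The opcode values,
the table positions @code{N_lit} and @code{N_add}, and the hand-written
@code{gen_lit} and @code{gen_add} (standing in for what vmgen would
generate) are assumptions made for this illustration.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t Cell;
typedef Cell Inst;                        /* switch dispatch assumed */

enum { N_lit, N_add, N_INSTS };           /* positions in the table */
static Inst prims[N_INSTS] = { 100, 101 };  /* hypothetical opcodes */
static Inst *vm_prim = prims;

/* Trivial variant: lay down the instruction and advance *ctp. */
void gen_inst(Inst **ctp, Inst i)
{
  assert(ctp != NULL && *ctp != NULL);
  **ctp = i;
  (*ctp)++;
}

/* Lay down an immediate argument with type prefix i. */
void genarg_i(Inst **ctp, long i)
{
  *((Cell *)*ctp) = (Cell)i;
  (*ctp)++;
}

/* Stand-ins for generated functions: one parameter per immediate
   argument of the VM instruction, none for stack arguments. */
void gen_lit(Inst **ctp, long i)
{
  gen_inst(ctp, vm_prim[N_lit]);
  genarg_i(ctp, i);
}

void gen_add(Inst **ctp)
{
  gen_inst(ctp, vm_prim[N_add]);
}

/* Generates "lit 3 lit 4 add" into code; returns the cells used. */
size_t gen_demo(Inst *code)
{
  Inst *cp = code;              /* memory is allocated by the caller */
  gen_lit(&cp, 3);
  gen_lit(&cp, 4);
  gen_add(&cp);
  return (size_t)(cp - code);
}
```
@end verbatim

The caller owns the code area; this sketch does no bounds checking, so
a real front end would have to watch the remaining space itself.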
1139:
1140: In addition to using these functions to generate code, you should call
1141: @code{BB_BOUNDARY} at every basic block entry point if you ever want to
1142: use superinstructions (or if you want to use the profiling supported by
1143: vmgen; however, this is mainly useful for selecting superinstructions).
1144: If you use @code{BB_BOUNDARY}, you should also define it (take a look at
1145: its definition in @file{vmgen-ex/mini.y}).
1146:
1147: You do not need to call @code{BB_BOUNDARY} after branches, because you
1148: will not define superinstructions that contain branches in the middle
(and if you did, and it worked, there would be no reason to end the
1150: superinstruction at the branch), and because the branches announce
1151: themselves to the profiler.
1152:
1153:
1154: @c --------------------------------------------------------------------
1155: @node Peephole optimization, VM disassembler, VM code generation, Using the generated code
1156: @section Peephole optimization
1157:
1158: You need peephole optimization only if you want to use
1159: superinstructions. But having the code for it does not hurt much if you
1160: do not use superinstructions.
1161:
1162: A simple greedy peephole optimization algorithm is used for
1163: superinstruction selection: every time @code{gen_inst} compiles a VM
instruction, it checks whether it can combine it with the last VM instruction
1165: (which may also be a superinstruction resulting from a previous peephole
1166: optimization); if so, it changes the last instruction to the combined
1167: instruction instead of laying down @code{i} at the current @samp{*ctp}.
1168:
1169: The code for peephole optimization is in @file{vmgen-ex/peephole.c}.
1170: You can use this file almost verbatim. Vmgen generates
@file{@var{file}-peephole.i} which contains data for the peephole
1172: optimizer.
1173:
1174: You have to call @samp{init_peeptable()} after initializing
1175: @samp{vm_prim}, and before compiling any VM code to initialize data
1176: structures for peephole optimization. After that, compiling with the VM
1177: code generation functions will automatically combine VM instructions
1178: into superinstructions. Since you do not want to combine instructions
1179: across VM branch targets (otherwise there will not be a proper VM
1180: instruction to branch to), you have to call @code{BB_BOUNDARY}
1181: (@pxref{VM code generation}) at branch targets.
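
The greedy combination step can be sketched as follows. This is a
simplified model and not the code of @file{vmgen-ex/peephole.c}: the
combination table is a hand-written array searched linearly (standing
in for the generated data), @code{BB_BOUNDARY} only resets the
optimizer state, and the opcode names and @code{compile_lit_sub} are
assumptions made for this illustration.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

typedef long Inst;
enum { I_lit, I_sub, I_lit_sub };        /* hypothetical opcodes */

/* Stand-in for the generated data: first + second -> combined. */
typedef struct { Inst first, second, combined; } Combination;
static const Combination peeptable[] = {
  { I_lit, I_sub, I_lit_sub },
};
#define N_COMBINATIONS (sizeof peeptable / sizeof peeptable[0])

/* Opcode cell of the last compiled instruction, or NULL if nothing
   may be combined with it (start of a basic block). */
static Inst *last_compiled = NULL;
#define BB_BOUNDARY (last_compiled = NULL)

void gen_inst(Inst **ctp, Inst i)
{
  assert(ctp != NULL && *ctp != NULL);
  if (last_compiled != NULL) {
    for (size_t k = 0; k < N_COMBINATIONS; k++) {
      if (peeptable[k].first == *last_compiled &&
          peeptable[k].second == i) {
        /* Rewrite the last instruction instead of laying down i. */
        *last_compiled = peeptable[k].combined;
        return;
      }
    }
  }
  last_compiled = *ctp;
  **ctp = i;
  (*ctp)++;
}

void genarg_i(Inst **ctp, long i)
{
  **ctp = (Inst)i;
  (*ctp)++;
}

/* Compiles "lit 5 sub"; if branch_target is non-zero, a basic block
   boundary sits between lit and sub.  Returns the cells used. */
int compile_lit_sub(Inst *code, int branch_target)
{
  Inst *cp = code;
  BB_BOUNDARY;
  gen_inst(&cp, I_lit);
  genarg_i(&cp, 5);
  if (branch_target)
    BB_BOUNDARY;
  gen_inst(&cp, I_sub);
  return (int)(cp - code);
}
```
@end verbatim

Without the boundary the result is the two cells @code{lit_sub 5}; with
it, @code{sub} stays a separate instruction that a branch can target.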
1182:
1183:
1184: @c --------------------------------------------------------------------
1185: @node VM disassembler, VM profiler, Peephole optimization, Using the generated code
1186: @section VM disassembler
1187:
1188: A VM code disassembler is optional for an interpretive system, but
1189: highly recommended during its development and maintenance, because it is
1190: very useful for detecting bugs in the front end (and for distinguishing
1191: them from VM interpreter bugs).
1192:
1193: Vmgen supports VM code disassembling by generating
1194: @file{@var{file}-disasm.i}. This code has to be wrapped into a
function, as is done in @file{vmgen-ex/disasm.c}. You can use this file
1196: almost verbatim. In addition to @samp{vm_@var{A}2@var{B}(a,b)},
1197: @samp{vm_out}, @samp{printarg_@var{type}(@var{value})}, which are
1198: explained above, the following macros and variables are used in
1199: @file{@var{file}-disasm.i} (and you have to define them):
1200:
1201: @table @samp
1202:
1203: @item ip
1204: This variable points to the opcode of the current VM instruction.
1205:
1206: @item IP IPTOS
1207: @samp{IPTOS} is the first argument of the current VM instruction, and
1208: @samp{IP} points to it; this is just as in the engine, but here
1209: @samp{ip} points to the opcode of the VM instruction (in contrast to the
1210: engine, where @samp{ip} points to the next cell, or even one further).
1211:
1212: @item VM_IS_INST(Inst i, int n)
1213: Tests if the opcode @samp{i} is the same as the @samp{n}th entry in the
1214: VM instruction table.
1215:
1216: @end table
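
A wrapper function in the spirit of this description could look like
the following sketch. The chain of @code{VM_IS_INST} tests is written by
hand here in place of the generated file, and the instruction set, the
opcode values, the output format and the name @code{vm_disassemble} are
assumptions made for this illustration.

@verbatim
```c
#include <assert.h>
#include <stdio.h>

typedef long Cell;
typedef Cell Inst;

enum { N_lit, N_add, N_end, N_INSTS };            /* table positions */
static Inst vm_prim[N_INSTS] = { 100, 101, 102 }; /* hypothetical opcodes */
static FILE *vm_out;                              /* output destination */

#define VM_IS_INST(inst, n)  ((inst) == vm_prim[n])
#define IP     (ip + 1)      /* points to the first argument */
#define IPTOS  (*IP)         /* the first argument itself */

/* Prints the VM code between start and end to vm_out and returns the
   number of VM instructions seen. */
int vm_disassemble(Inst *start, Inst *end)
{
  Inst *ip;                  /* points to the opcode being decoded */
  int count = 0;

  assert(start <= end);
  if (vm_out == NULL)
    vm_out = stdout;
  for (ip = start; ip < end; count++) {
    fprintf(vm_out, "%p: ", (void *)ip);
    if (VM_IS_INST(*ip, N_lit)) {          /* lit ( #i -- i ) */
      fprintf(vm_out, "lit %ld\n", (long)IPTOS);
      ip += 2;                             /* opcode plus one argument */
    } else if (VM_IS_INST(*ip, N_add)) {
      fputs("add\n", vm_out);
      ip += 1;
    } else if (VM_IS_INST(*ip, N_end)) {
      fputs("end\n", vm_out);
      ip += 1;
    } else {
      fputs("unknown instruction\n", vm_out);
      ip += 1;
    }
  }
  return count;
}
```
@end verbatim

Because the opcodes are compared against @code{vm_prim}, the same
structure works whether the table holds integers (switch dispatch) or
label addresses (threaded code), as long as @code{VM_IS_INST} is
defined accordingly.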
1217:
1218:
1219: @c --------------------------------------------------------------------
1220: @node VM profiler, , VM disassembler, Using the generated code
1221: @section VM profiler
1222:
The VM profiler is designed for getting execution and occurrence counts
for VM instruction sequences, and these counts can then be used for
selecting sequences as superinstructions. The VM profiler is probably
not useful as a profiling tool for the interpretive system. I.e., the VM
1227: profiler is useful for the developers, but not the users of the
1228: interpretive system.
1229:
1230: The output of the profiler is: for each basic block (executed at least
1231: once), it produces the dynamic execution count of that basic block and
1232: all its subsequences; e.g.,
1233:
1234: @example
1235: 9227465 lit storelocal
1236: 9227465 storelocal branch
1237: 9227465 lit storelocal branch
1238: @end example
1239:
1240: I.e., a basic block consisting of @samp{lit storelocal branch} is
1241: executed 9227465 times.
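
The enumeration of subsequences can be sketched in a few lines of C.
This is an illustration modelled on the example output above, not the
profiler's actual code: the function name, the column width and the
ordering (shorter sequences first) are assumptions.

@verbatim
```c
#include <assert.h>
#include <stdio.h>

/* Prints one line for every contiguous subsequence of at least two
   VM instructions of a basic block, each with the execution count of
   the block.  Returns the number of lines printed. */
int print_block_profile(FILE *out, long count,
                        const char *const *insts, int n)
{
  int lines = 0;

  assert(out != NULL && n >= 0);
  for (int len = 2; len <= n; len++) {
    for (int start = 0; start + len <= n; start++) {
      fprintf(out, "%10ld", count);
      for (int k = 0; k < len; k++)
        fprintf(out, " %s", insts[start + k]);
      fputc('\n', out);
      lines++;
    }
  }
  return lines;
}
```
@end verbatim

For the block @samp{lit storelocal branch} this prints the two pairs
followed by the complete block, i.e., three lines.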
1242:
1243: This output can be combined in various ways. E.g.,
@file{vmgen/stat.awk} adds up the occurrences of a given sequence wrt
dynamic execution, static occurrence, and per-program occurrence. E.g.,
1246:
1247: @example
1248: 2 16 36910041 loadlocal lit
1249: @end example
1250:
1251: indicates that the sequence @samp{loadlocal lit} occurs in 2 programs,
1252: in 16 places, and has been executed 36910041 times. Now you can select
1253: superinstructions in any way you like (note that compile time and space
1254: typically limit the number of superinstructions to 100--1000). After
1255: you have done that, @file{vmgen/seq2rule.awk} turns lines of the form
1256: above into rules for inclusion in a vmgen input file. Note that this
1257: script does not ensure that all prefixes are defined, so you have to do
1258: that in other ways. So, an overall script for turning profiles into
1259: superinstructions can look like this:
1260:
1261: @example
1262: awk -f stat.awk fib.prof test.prof|
1263: awk '$3>=10000'| #select sequences
1264: fgrep -v -f peephole-blacklist| #eliminate wrong instructions
1265: awk -f seq2rule.awk| #turn into superinstructions
1266: sort -k 3 >mini-super.vmg #sort sequences
1267: @end example
1268:
1269: Here the dynamic count is used for selecting sequences (preliminary
1270: results indicate that the static count gives better results, though);
the third line eliminates sequences containing instructions that must not
1272: occur in a superinstruction, because they access a stack directly. The
1273: dynamic count selection ensures that all subsequences (including
1274: prefixes) of longer sequences occur (because subsequences have at least
1275: the same count as the longer sequences); the sort in the last line
1276: ensures that longer superinstructions occur after their prefixes.
1277:
1278: But before using it, you have to have the profiler. Vmgen supports its
1279: creation by generating @file{@var{file}-profile.i}; you also need the
1280: wrapper file @file{vmgen-ex/profile.c} that you can use almost verbatim.
1281:
1282: The profiler works by recording the targets of all VM control flow
1283: changes (through @code{SUPER_END} during execution, and through
1284: @code{BB_BOUNDARY} in the front end), and counting (through
1285: @code{SUPER_END}) how often they were targeted. After the program run,
1286: the numbers are corrected such that each VM basic block has the correct
1287: count (originally entering a block without executing a branch does not
1288: increase the count), then the subsequences of all basic blocks are
1289: printed. To get all this, you just have to define @code{SUPER_END} (and
1290: @code{BB_BOUNDARY}) appropriately, and call @code{vm_print_profile(FILE
1291: *file)} when you want to output the profile on @code{file}.
1292:
The @file{@var{file}-profile.i} is similar to the disassembler file, and
1294: it uses variables and functions defined in @file{vmgen-ex/profile.c},
1295: plus @code{VM_IS_INST} already defined for the VM disassembler
1296: (@pxref{VM disassembler}).
1297:
1298:
1299: @c **********************************************************
1300: @node Changes, Contact, Using the generated code, Top
1301: @chapter Changes
1302:
1303: Users of the gforth-0.5.9-20010501 version of vmgen need to change
1304: several things in their source code to use the current version. I
1305: recommend keeping the gforth-0.5.9-20010501 version until you have
1306: completed the change (note that you can have several versions of Gforth
1307: installed at the same time). I hope to avoid such incompatible changes
1308: in the future.
1309:
1310: The required changes are:
1311:
1312: @table @code
1313:
1314: @item vm_@var{A}2@var{B}
1315: now takes two arguments.
1316:
1317: @item vm_two@var{A}2@var{B}(b,a1,a2);
1318: changed to vm_two@var{A}2@var{B}(a1,a2,b) (note the absence of the @samp{;}).
1319:
1320: @end table
1321:
1322: Also some new macros have to be defined, e.g., @code{INST_ADDR}, and
1323: @code{LABEL}; some macros have to be defined in new contexts, e.g.,
1324: @code{VM_IS_INST} is now also needed in the disassembler.
1325:
1326: @node Contact, Copying This Manual, Changes, Top
1327: @chapter Contact
1328:
1329: @node Copying This Manual, Index, Contact, Top
1330: @appendix Copying This Manual
1331:
1332: @menu
1333: * GNU Free Documentation License:: License for copying this manual.
1334: @end menu
1335:
1336: @include fdl.texi
1337:
1338:
1339: @node Index, , Copying This Manual, Top
1340: @unnumbered Index
1341:
1342: @printindex cp
1343:
1344: @bye