
File: [gforth] / gforth / doc / vmgen.texi
Revision 1.12
Fri Aug 16 09:43:49 2002 UTC (21 years, 8 months ago) by anton
Branches: MAIN
CVS tags: HEAD

Documentation changes

\input texinfo   @c -*-texinfo-*-
@comment %**start of header
@setfilename vmgen.info
@include version.texi
@settitle Vmgen (Gforth @value{VERSION})
@c @syncodeindex pg cp
@comment %**end of header
@copying
This manual is for Vmgen
(version @value{VERSION}, @value{UPDATED}),
the virtual machine interpreter generator

Copyright @copyright{} 2002 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below.  A copy of the
license is included in the section entitled ``GNU Free Documentation
License.''

(a) The FSF's Back-Cover Text is: ``You have freedom to copy and modify
this GNU Manual, like GNU software.  Copies published by the Free
Software Foundation raise funds for GNU development.''
@end quotation
@end copying

@dircategory GNU programming tools
@direntry
* Vmgen: (vmgen).               Interpreter generator
@end direntry

@titlepage
@title Vmgen
@subtitle for Gforth version @value{VERSION}, @value{UPDATED}
@author M. Anton Ertl (@email{anton@@mips.complang.tuwien.ac.at})
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@contents

@ifnottex
@node Top, Introduction, (dir), (dir)
@top Vmgen

@insertcopying
@end ifnottex

@menu
* Introduction::                What can Vmgen do for you?
* Why interpreters?::           Advantages and disadvantages
* Concepts::                    VM interpreter background
* Invoking Vmgen::
* Example::
* Input File Format::
* Using the generated code::
* Changes::                     from earlier versions
* Contact::                     Bug reporting etc.
* Copying This Manual::         Manual License
* Index::

@detailmenu
 --- The Detailed Node Listing ---

Concepts

* Front end and VM interpreter::  Modularizing an interpretive system
* Data handling::               Stacks, registers, immediate arguments
* Dispatch::                    From one VM instruction to the next

Example

* Example overview::
* Using profiling to create superinstructions::

Input File Format

* Input File Grammar::
* Simple instructions::
* Superinstructions::
* Register Machines::           How to define register VM instructions

Simple instructions

* C Code Macros::               Macros recognized by Vmgen
* C Code restrictions::         Vmgen makes assumptions about C code

Using the generated code

* VM engine::                   Executing VM code
* VM instruction table::
* VM code generation::          Creating VM code (in the front-end)
* Peephole optimization::       Creating VM superinstructions
* VM disassembler::             for debugging the front end
* VM profiler::                 for finding worthwhile superinstructions

Copying This Manual

* GNU Free Documentation License::  License for copying this manual.

@end detailmenu
@end menu

@c @ifnottex
@c This file documents Vmgen (Gforth @value{VERSION}).

@c ************************************************************
@node Introduction, Why interpreters?, Top, Top
@chapter Introduction

Vmgen is a tool for writing efficient interpreters.  It takes a simple
virtual machine description and generates efficient C code for dealing
with the virtual machine code in various ways (in particular, executing
it).  The run-time efficiency of the resulting interpreters is usually
within a factor of 10 of machine code produced by an optimizing
compiler.

The interpreter design strategy supported by Vmgen is to divide the
interpreter into two parts:

@itemize @bullet

@item The @emph{front end} takes the source code of the language to be
implemented, and translates it into virtual machine code.  This is
similar to an ordinary compiler front end; typically an interpreter
front-end performs no optimization, so it is relatively simple to
implement and runs fast.

@item The @emph{virtual machine interpreter} executes the virtual
machine code.

@end itemize

Such a division is usually used in interpreters, for modularity as well
as for efficiency.  The virtual machine code is typically passed between
front end and virtual machine interpreter in memory, like in a
load-and-go compiler; this avoids the complexity and time cost of
writing the code to a file and reading it again.

A @emph{virtual machine} (VM) represents the program as a sequence of
@emph{VM instructions}, following each other in memory, similar to real
machine code.  Control flow occurs through VM branch instructions, like
in a real machine.

@cindex functionality features overview
In this setup, Vmgen can generate most of the code dealing with virtual
machine instructions from a simple description of the virtual machine
instructions (@pxref{Input File Format}), in particular:

@table @asis

@item VM instruction execution

@item VM code generation
Useful in the front end.

@item VM code decompiler
Useful for debugging the front end.

@item VM code tracing
Useful for debugging the front end and the VM interpreter.  You will
typically provide other means for debugging the user's programs at the
source level.

@item VM code profiling
Useful for optimizing the VM interpreter with superinstructions
(@pxref{VM profiler}).

@end table

@cindex efficiency features overview
@noindent
Vmgen supports efficient interpreters through various optimizations, in
particular

@itemize @bullet

@item Threaded code

@item Caching the top-of-stack in a register

@item Combining VM instructions into superinstructions

@item
Replicating VM (super)instructions for better BTB prediction accuracy
(not yet in vmgen-ex, but already in Gforth).

@end itemize

@cindex speed for JVM
As a result, Vmgen-based interpreters are only about an order of
magnitude slower than native code from an optimizing C compiler on small
benchmarks; on large benchmarks, which spend more time in the run-time
system, the slowdown is often less (e.g., the slowdown of a
Vmgen-generated JVM interpreter over the best JVM JIT compiler we
measured is only a factor of 2-3 for large benchmarks; some other JITs
and all other interpreters we looked at were slower than our
interpreter).

VMs are usually designed as stack machines (passing data between VM
instructions on a stack), and Vmgen supports such designs especially
well; however, you can also use Vmgen for implementing a register VM
(@pxref{Register Machines}) and still benefit from most of the advantages
offered by Vmgen.

There are many potential uses of the instruction descriptions that are
not implemented at the moment, but we are open to feature requests, and
we will implement new features if someone asks for them; so the feature
list above is not exhaustive.

@c *********************************************************************
@node Why interpreters?, Concepts, Introduction, Top
@chapter Why interpreters?
@cindex interpreters, advantages
@cindex advantages of interpreters
@cindex advantages of vmgen

Interpreters are a popular language implementation technique because
they combine all three of the following advantages:

@itemize @bullet

@item Ease of implementation

@item Portability

@item Fast edit-compile-run cycle

@end itemize

Vmgen makes it even easier to implement interpreters.

@cindex speed of interpreters
The main disadvantage of interpreters is their run-time speed.  However,
there are huge differences between different interpreters in this area:
the slowdown over optimized C code on programs consisting of simple
operations is typically a factor of 10 for the more efficient
interpreters, and a factor of 1000 for the less efficient ones (the
slowdown for programs executing complex operations is less, because the
time spent in libraries for executing complex operations is the same in
all implementation strategies).

Vmgen supports techniques for building efficient interpreters.

@c ********************************************************************
@node Concepts, Invoking Vmgen, Why interpreters?, Top
@chapter Concepts

@menu
* Front end and VM interpreter::  Modularizing an interpretive system
* Data handling::               Stacks, registers, immediate arguments
* Dispatch::                    From one VM instruction to the next
@end menu

@c --------------------------------------------------------------------
@node Front end and VM interpreter, Data handling, Concepts, Concepts
@section Front end and VM interpreter
@cindex modularization of interpreters

@cindex front-end
Interpretive systems are typically divided into a @emph{front end} that
parses the input language and produces an intermediate representation
for the program, and an interpreter that executes the intermediate
representation of the program.

@cindex virtual machine
@cindex VM
@cindex VM instruction
@cindex instruction, VM
@cindex VM branch instruction
@cindex branch instruction, VM
@cindex VM register
@cindex register, VM
@cindex opcode, VM instruction
@cindex immediate argument, VM instruction
For efficient interpreters the intermediate representation of choice is
virtual machine code (rather than, e.g., an abstract syntax tree).
@emph{Virtual machine} (VM) code consists of VM instructions arranged
sequentially in memory; they are executed in sequence by the VM
interpreter, but VM branch instructions can change the control flow and
are used for implementing control structures.  The conceptual similarity
to real machine code results in the name @emph{virtual machine}.
Various terms similar to terms for real machines are used; e.g., there
are @emph{VM registers} (like the instruction pointer and stack
pointer(s)), and the VM instruction consists of an @emph{opcode} and
@emph{immediate arguments}.
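
To make these terms concrete, here is a minimal hand-written sketch of
VM code and a loop that executes it.  It is not code generated by Vmgen:
the opcode names (@code{OP_LIT} etc.), the function @code{run_vm}, and
the fixed 16-item stack are assumptions made up for this illustration,
and the sketch does not check for stack overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for illustration; not part of Vmgen or vmgen-ex. */
enum { OP_LIT, OP_ADD, OP_BRANCH, OP_HALT };

/* Execute VM code laid out sequentially in memory.
   Returns the top-of-stack item when OP_HALT is reached. */
long run_vm(const long *code)
{
  const long *ip = code;   /* VM register: instruction pointer */
  long stack[16];
  long *sp = stack + 16;   /* VM register: stack pointer, stack grows down */

  for (;;) {
    switch (*ip++) {       /* load the opcode of the next VM instruction */
    case OP_LIT:           /* one immediate argument follows the opcode */
      *--sp = *ip++;
      break;
    case OP_ADD:           /* no immediate arguments */
      sp[1] += sp[0];
      sp++;
      break;
    case OP_BRANCH:        /* immediate argument: index of the branch target */
      ip = code + *ip;
      break;
    case OP_HALT:
      assert(sp < stack + 16);   /* there must be a result on the stack */
      return sp[0];
    default:
      assert(0 && "unknown opcode");
      return 0;
    }
  }
}
```

For example, the VM code @code{OP_LIT 2, OP_BRANCH 6, OP_LIT 100, OP_LIT
3, OP_ADD, OP_HALT} skips the instruction that pushes 100 and leaves 5
on the stack.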

In this framework, Vmgen supports building the VM interpreter and any
other component dealing with VM instructions.  It does not have any
support for the front end, apart from VM code generation support.  The
front end can be implemented with classical compiler front-end
techniques, supported by tools like @command{flex} and @command{bison}.

The intermediate representation is usually just internal to the
interpreter, but some systems also support saving it to a file, either
as an image file, or in a full-blown linkable file format (e.g., JVM).
Vmgen currently has no special support for such features, but the
information in the instruction descriptions can be helpful, and we are
open to feature requests and suggestions.

@c --------------------------------------------------------------------
@node Data handling, Dispatch, Front end and VM interpreter, Concepts
@section Data handling

@cindex stack machine
@cindex register machine
Most VMs use one or more stacks for passing temporary data between VM
instructions.  Another option is to use a register machine architecture
for the virtual machine; however, this option is either slower or
significantly more complex to implement than a stack machine architecture.

Vmgen has special support and optimizations for stack VMs, making their
implementation easy and efficient.

You can also implement a register VM with Vmgen (@pxref{Register
Machines}), and you will still profit from most Vmgen features.

@cindex stack item size
@cindex size, stack items
Stack items all have the same size, so they typically will be as wide as
an integer, pointer, or floating-point value.
Vmgen supports treating
two consecutive stack items as a single value, but anything larger is
best kept in some other memory area (e.g., the heap), with pointers to
the data on the stack.

@cindex instruction stream
@cindex immediate arguments
Another source of data is the immediate arguments of VM instructions (in
the VM instruction stream).  The VM instruction stream is handled
similarly to a stack in Vmgen.

@cindex garbage collection
@cindex reference counting
Vmgen has no built-in support for, nor restrictions against
@emph{garbage collection}.  If you need garbage collection, you need to
provide it in your run-time libraries.  Using @emph{reference counting}
is probably harder, but might be possible (contact us if you are
interested).
@c reference counting might be possible by including counting code in
@c the conversion macros.

@c --------------------------------------------------------------------
@node Dispatch,  , Data handling, Concepts
@section Dispatch
@cindex Dispatch of VM instructions
@cindex main interpreter loop

Understanding this section is probably not necessary for using Vmgen,
but it may help.  You may want to skip it now, and read it if you find
statements about dispatch methods confusing.

After executing one VM instruction, the VM interpreter has to dispatch
the next VM instruction (Vmgen calls the dispatch routine @samp{NEXT}).
Vmgen supports two methods of dispatch:

@table @asis

@item switch dispatch
@cindex switch dispatch
In this method the VM interpreter contains a giant @code{switch}
statement, with one @code{case} for each VM instruction.
The VM
instruction opcodes are represented by integers (e.g., produced by an
@code{enum}) in the VM code, and dispatch occurs by loading the next
opcode, @code{switch}ing on it, and continuing at the appropriate
@code{case}; after executing the VM instruction, the VM interpreter
jumps back to the dispatch code.

@item threaded code
@cindex threaded code
This method represents a VM instruction opcode by the address of the
start of the machine code fragment for executing the VM instruction.
Dispatch consists of loading this address, jumping to it, and
incrementing the VM instruction pointer.  Typically the threaded-code
dispatch code is appended directly to the code for executing the VM
instruction.  Threaded code cannot be implemented in ANSI C, but it can
be implemented using GNU C's labels-as-values extension (@pxref{Labels
as Values, , Labels as Values, gcc.info, GNU C Manual}).

@end table

Threaded code can be twice as fast as switch dispatch, depending on the
interpreter, the benchmark, and the machine.

@c *************************************************************
@node Invoking Vmgen, Example, Concepts, Top
@chapter Invoking Vmgen
@cindex Invoking Vmgen

The usual way to invoke Vmgen is as follows:

@example
vmgen @var{infile}
@end example

Here @var{infile} is the VM instruction description file, which usually
ends in @file{.vmg}.  The output filenames are made by taking the
basename of @file{infile} (i.e., the output files will be created in the
current working directory) and replacing @file{.vmg} with @file{-vm.i},
@file{-disasm.i}, @file{-gen.i}, @file{-labels.i}, @file{-profile.i},
and @file{-peephole.i}.  E.g., @command{vmgen hack/foo.vmg} will create
@file{foo-vm.i} etc.
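
The naming rule above can be sketched in C as follows.  This is only an
illustration of the rule as stated, not code taken from Vmgen; the
helper @code{output_name} is hypothetical, and appending the suffix
unchanged when the input name does not end in @file{.vmg} is an
assumption of this sketch, not documented Vmgen behaviour.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of Vmgen: derive one output file name
   from the input file name and a suffix such as "-vm.i".
   Returns 0 on success, -1 if the result does not fit into out. */
int output_name(const char *infile, const char *suffix,
                char *out, size_t outsize)
{
  const char *base;
  size_t len;
  int n;

  assert(infile != NULL && suffix != NULL && out != NULL);

  /* basename: drop any leading directories, so that the output file
     ends up in the current working directory */
  base = strrchr(infile, '/');
  base = base ? base + 1 : infile;

  /* drop a trailing ".vmg" (assumption: other names are kept whole) */
  len = strlen(base);
  if (len >= 4 && strcmp(base + len - 4, ".vmg") == 0)
    len -= 4;

  n = snprintf(out, outsize, "%.*s%s", (int)len, base, suffix);
  return (n >= 0 && (size_t)n < outsize) ? 0 : -1;
}
```

Calling @code{output_name("hack/foo.vmg", "-vm.i", buf, sizeof buf)}
stores @file{foo-vm.i} in @code{buf}, matching the example in the text.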

The command-line options supported by Vmgen are

@table @option

@cindex -h, command-line option
@cindex --help, command-line option
@item --help
@itemx -h
Print a message about the command-line options

@cindex -v, command-line option
@cindex --version, command-line option
@item --version
@itemx -v
Print version and exit
@end table

@c env vars GFORTHDIR GFORTHDATADIR

@c ****************************************************************
@node Example, Input File Format, Invoking Vmgen, Top
@chapter Example
@cindex example of a Vmgen-based interpreter

@menu
* Example overview::
* Using profiling to create superinstructions::
@end menu

@c --------------------------------------------------------------------
@node Example overview, Using profiling to create superinstructions, Example, Example
@section Example overview
@cindex example overview
@cindex @file{vmgen-ex}
@cindex @file{vmgen-ex2}

There are two versions of the same example for using Vmgen:
@file{vmgen-ex} and @file{vmgen-ex2} (you can also see Gforth as an
example, but it uses additional (undocumented) features, and also
differs in some other respects).  The example implements @emph{mini}, a
tiny Modula-2-like language with a small JavaVM-like virtual machine.

The difference between the examples is that @file{vmgen-ex} uses many
casts, and @file{vmgen-ex2} tries to avoid most casts and uses unions
instead.  In the rest of this manual we usually mention just files in
@file{vmgen-ex}; if you want to use unions, use the equivalent file in
@file{vmgen-ex2}.
@cindex unions example
@cindex casts example

The files provided with each example are:
@cindex example files

@example
Makefile
README
disasm.c           wrapper file
engine.c           wrapper file
peephole.c         wrapper file
profile.c          wrapper file
mini-inst.vmg      simple VM instructions
mini-super.vmg     superinstructions (empty at first)
mini.h             common declarations
mini.l             scanner
mini.y             front end (parser, VM code generator)
support.c          main() and other support functions
fib.mini           example mini program
simple.mini        example mini program
test.mini          example mini program (tests everything)
test.out           test.mini output
stat.awk           script for aggregating profile information
peephole-blacklist list of instructions not allowed in superinstructions
seq2rule.awk       script for creating superinstructions
@end example

For your own interpreter, you would typically copy the following files
and change little, if anything:
@cindex wrapper files

@example
disasm.c           wrapper file
engine.c           wrapper file
peephole.c         wrapper file
profile.c          wrapper file
stat.awk           script for aggregating profile information
seq2rule.awk       script for creating superinstructions
@end example

@noindent
You would typically change much in or replace the following files:

@example
Makefile
mini-inst.vmg      simple VM instructions
mini.h             common declarations
mini.l             scanner
mini.y             front end (parser, VM code generator)
support.c          main() and other support functions
peephole-blacklist list of instructions not allowed in superinstructions
@end example

You can build the example by @code{cd}ing into the example's directory,
and then typing @code{make}; you can check that it works with @code{make
check}.
You can run mini programs like this:

@example
./mini fib.mini
@end example

To learn about the options, type @code{./mini -h}.

@c --------------------------------------------------------------------
@node Using profiling to create superinstructions,  , Example overview, Example
@section Using profiling to create superinstructions
@cindex profiling example
@cindex superinstructions example

I have not added rules for this in the @file{Makefile} (there are many
options for selecting superinstructions, and I did not want to hardcode
one into the @file{Makefile}), but there are some supporting scripts, and
here's an example:

If you want to use @file{fib.mini} and @file{test.mini} as training
programs, you get the profiles like this:

@example
make fib.prof test.prof #takes a few seconds
@end example

You can aggregate these profiles with @file{stat.awk}:

@example
awk -f stat.awk fib.prof test.prof
@end example

The result contains lines like:

@example
2 16 36910041 loadlocal lit
@end example

This means that the sequence @code{loadlocal lit} statically occurs a
total of 16 times in 2 profiles, with a dynamic execution count of
36910041.

The numbers can be used in various ways to select superinstructions.
E.g., if you just want to select all sequences with a dynamic
execution count of at least 10000, you would use the following pipeline:

@example
awk -f stat.awk fib.prof test.prof|
awk '$3>=10000'|                #select sequences
fgrep -v -f peephole-blacklist| #eliminate wrong instructions
awk -f seq2rule.awk|  #transform sequences into superinstruction rules
sort -k 3 >mini-super.vmg       #sort sequences
@end example

The file @file{peephole-blacklist} contains all instructions that
directly access a stack or stack pointer (for mini: @code{call},
@code{return}); the sort step is necessary to ensure that prefixes
precede larger superinstructions.

Now you can create a version of mini with superinstructions by just
saying @samp{make}


@c ***************************************************************
@node Input File Format, Using the generated code, Example, Top
@chapter Input File Format
@cindex input file format
@cindex format, input file

Vmgen takes as input a file containing specifications of virtual machine
instructions.  This file usually has a name ending in @file{.vmg}.

Most examples are taken from the example in @file{vmgen-ex}.

@menu
* Input File Grammar::
* Simple instructions::
* Superinstructions::
* Register Machines::           How to define register VM instructions
@end menu

@c --------------------------------------------------------------------
@node Input File Grammar, Simple instructions, Input File Format, Input File Format
@section Input File Grammar
@cindex grammar, input file
@cindex input file grammar

The grammar is in EBNF format, with @code{@var{a}|@var{b}} meaning
``@var{a} or @var{b}'', @code{@{@var{c}@}} meaning 0 or more repetitions
of @var{c} and @code{[@var{d}]} meaning 0 or 1 repetitions of @var{d}.

@cindex free-format, not
Vmgen input is not free-format, so you have to take care where you put
spaces and especially newlines; it's not as bad as makefiles, though:
any sequence of spaces and tabs is equivalent to a single space.

@example
description: @{instruction|comment|eval-escape@}

instruction: simple-inst|super-inst

simple-inst: ident ' (' stack-effect ' )' newline c-code newline newline

stack-effect: @{ident@} ' --' @{ident@}

super-inst: ident ' =' ident @{ident@}

comment:      '\ '  text newline

eval-escape:  '\E ' text newline
@end example
@c \+ \- \g \f \c

Note that the @code{\}s in this grammar are meant literally, not as
C-style encodings for non-printable characters.

The C code in @code{simple-inst} must not contain empty lines (because
Vmgen would mistake that as the end of the simple-inst).  The text in
@code{comment} and @code{eval-escape} must not contain a newline.
@code{Ident} must conform to the usual conventions of C identifiers
(otherwise the C compiler would choke on the Vmgen output).

Vmgen understands a few extensions beyond the grammar given here, but
these extensions are only useful for building Gforth.  You can find a
description of the format used for Gforth in @file{prim}.

@subsection Eval escapes
@cindex escape to Forth
@cindex eval escape

@c woanders?
The text in @code{eval-escape} is Forth code that is evaluated when
Vmgen reads the line.
If you do not know (and do not want to learn)
Forth, you can build the text according to the following grammar; these
rules are normally all the Forth you need for using Vmgen:

@example
text: stack-decl|type-prefix-decl|stack-prefix-decl

stack-decl: 'stack ' ident ident ident
type-prefix-decl:
    's" ' string '" ' ('single'|'double') ident 'type-prefix' ident
stack-prefix-decl:  ident 'stack-prefix' string
@end example

Note that the syntax of this code is not checked thoroughly (there are
many other Forth program fragments that could be written there).

If you know Forth, the stack effects of the non-standard words involved
are:
@findex stack
@findex type-prefix
@findex single
@findex double
@findex stack-prefix
@example
stack        ( "name" "pointer" "type" -- )
             ( name execution: -- stack )
type-prefix  ( addr u xt1 xt2 n stack "prefix" -- )
single       ( -- xt1 xt2 n )
double       ( -- xt1 xt2 n )
stack-prefix ( stack "prefix" -- )
@end example


@c --------------------------------------------------------------------
@node Simple instructions, Superinstructions, Input File Grammar, Input File Format
@section Simple instructions
@cindex simple VM instruction
@cindex instruction, simple VM

We will use the following simple VM instruction description as example:

@example
sub ( i1 i2 -- i )
i = i1-i2;
@end example

The first line specifies the name of the VM instruction (@code{sub}) and
its stack effect (@code{i1 i2 -- i}).  The rest of the description is
just plain C code.

@cindex stack effect
@cindex effect, stack
The stack effect specifies that @code{sub} pulls two integers from the
data stack and puts them in the C variables @code{i1} and @code{i2}
(with the rightmost item (@code{i2}) taken from the top of stack;
intuition: if you push @code{i1}, then @code{i2} on the stack, the
resulting stack picture is @code{i1 i2}) and later pushes one integer
(@code{i}) on the data stack (the rightmost item is on the top
afterwards).

@cindex prefix, type
@cindex type prefix
@cindex default stack of a type prefix
How do we know the type and stack of the stack items?  Vmgen uses
prefixes, similar to Fortran; in contrast to Fortran, you have to
define the prefix first:

@example
\E s" Cell"   single data-stack type-prefix i
@end example

This defines the prefix @code{i} to refer to the type @code{Cell}
(defined as @code{long} in @file{mini.h}) and, by default, to the
@code{data-stack}.  It also specifies that this type takes one stack
item (@code{single}).  The type prefix is part of the variable name.

@cindex stack definition
@cindex defining a stack
Before we can use @code{data-stack} in this way, we have to define it:

@example
\E stack data-stack sp Cell
@end example
@c !! use something other than Cell

@cindex stack basic type
@cindex basic type of a stack
@cindex type of a stack, basic
@cindex stack growth direction
This line defines the stack @code{data-stack}, which uses the stack
pointer @code{sp}, and each item has the basic type @code{Cell}; other
types have to fit into one or two @code{Cell}s (depending on whether the
type is @code{single} or @code{double} wide), and are cast from and to
Cells on accessing the @code{data-stack} with type cast macros
(@pxref{VM engine}).
Stacks grow towards lower addresses in
Vmgen-erated interpreters.

@cindex stack prefix
@cindex prefix, stack
We can override the default stack of a stack item by using a stack
prefix.  E.g., consider the following instruction:

@example
lit ( #i -- i )
@end example

The VM instruction @code{lit} takes the item @code{i} from the
instruction stream (indicated by the prefix @code{#}), and pushes it on
the (default) data stack.  The stack prefix is not part of the variable
name.  Stack prefixes are defined like this:

@example
\E inst-stream stack-prefix #
@end example

This definition defines that the stack prefix @code{#} specifies the
``stack'' @code{inst-stream}.  Since the instruction stream behaves a
little differently than an ordinary stack, it is predefined, and you do
not need to define it.

@cindex instruction stream
The instruction stream contains instructions and their immediate
arguments, so specifying that an argument comes from the instruction
stream indicates an immediate argument.  Of course, instruction stream
arguments can only appear to the left of @code{--} in the stack effect.
If there are multiple instruction stream arguments, the leftmost is the
first one (just as the intuition suggests).
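
As a rough illustration of what the stack effects of @code{sub} and
@code{lit} amount to, here is a hand-written C sketch.  It is not the
literal output of Vmgen (the real code uses type cast macros and may
cache the top of stack); the function names @code{sketch_sub},
@code{sketch_lit} and @code{demo_seven_minus_three} are made up for this
example, and only @code{Cell}, @code{sp} and the variable names
@code{i1}, @code{i2}, @code{i} come from the text.

```c
#include <assert.h>

typedef long Cell;   /* mini.h defines Cell as long, according to the text */

/* sub ( i1 i2 -- i ): approximation of the code around the C fragment */
Cell *sketch_sub(Cell *sp)
{
  Cell i1, i2, i;
  i2 = sp[0];        /* rightmost input item is the top of stack */
  i1 = sp[1];
  sp += 1;           /* two items in, one out; the stack grows down */
  i = i1 - i2;       /* the user-supplied C code */
  sp[0] = i;         /* store the result item */
  return sp;
}

/* lit ( #i -- i ): the input comes from the instruction stream */
Cell *sketch_lit(Cell *sp, const Cell **ip)
{
  Cell i = *(*ip)++; /* fetch the immediate argument, advance past it */
  sp -= 1;           /* no stack inputs, one output */
  sp[0] = i;
  return sp;
}

/* Hypothetical usage: push 7, push 3, subtract. */
Cell demo_seven_minus_three(void)
{
  Cell stack[4];
  Cell *sp = stack + 4;             /* empty stack */
  const Cell immediates[] = { 7, 3 };
  const Cell *ip = immediates;

  sp = sketch_lit(sp, &ip);         /* stack picture: 7 */
  sp = sketch_lit(sp, &ip);         /* stack picture: 7 3 */
  sp = sketch_sub(sp);              /* stack picture: 4 */
  assert(sp == stack + 3);          /* exactly one item is left */
  return sp[0];
}
```

Note how the order of the loads mirrors the stack picture: @code{i2} is
at @code{sp[0]} because it was pushed last.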

@menu
* C Code Macros::               Macros recognized by Vmgen
* C Code restrictions::         Vmgen makes assumptions about C code
@end menu

@c --------------------------------------------------------------------
@node C Code Macros, C Code restrictions, Simple instructions, Simple instructions
@subsection C Code Macros
@cindex macros recognized by Vmgen
@cindex basic block, VM level

Vmgen recognizes the following strings in the C code part of simple
instructions:

@table @code

@item SET_IP
@findex SET_IP
As far as Vmgen is concerned, a VM instruction containing this ends a VM
basic block (used in profiling to delimit profiled sequences).  On the C
level, this also sets the instruction pointer.

@item SUPER_END
@findex SUPER_END
This ends a basic block (for profiling), even if the instruction
contains no @code{SET_IP}.

@item TAIL;
@findex TAIL;
Vmgen replaces @samp{TAIL;} with code for ending a VM instruction and
dispatching the next VM instruction.  Even without a @samp{TAIL;} this
happens automatically when control reaches the end of the C code.  If
you want to have this in the middle of the C code, you need to use
@samp{TAIL;}.  A typical example is a conditional VM branch:

@example
if (branch_condition) @{
  SET_IP(target); TAIL;
@}
/* implicit tail follows here */
@end example

In this example, @samp{TAIL;} is not strictly necessary, because there
is another one implicitly after the if-statement, but using it improves
branch prediction accuracy slightly and allows other optimizations.

@item SUPER_CONTINUE
@findex SUPER_CONTINUE
This indicates that the implicit tail at the end of the VM instruction
dispatches the sequentially next VM instruction even if there is a
@code{SET_IP} in the VM instruction.
This enables an optimization that 820: is not yet implemented in the vmgen-ex code (but in Gforth). The 821: typical application is in conditional VM branches: 822: 823: @example 824: if (branch_condition) @{ 825: SET_IP(target); TAIL; /* now this TAIL is necessary */ 826: @} 827: SUPER_CONTINUE; 828: @end example 829: 830: @end table 831: 832: Note that Vmgen is not smart about C-level tokenization, comments, 833: strings, or conditional compilation, so it will interpret even a 834: commented-out SUPER_END as ending a basic block (or, e.g., 835: @samp{RETAIL;} as @samp{TAIL;}). Conversely, Vmgen requires the literal 836: presence of these strings; Vmgen will not see them if they are hiding in 837: a C preprocessor macro. 838: 839: 840: @c -------------------------------------------------------------------- 841: @node C Code restrictions, , C Code Macros, Simple instructions 842: @subsection C Code restrictions 843: @cindex C code restrictions 844: @cindex restrictions on C code 845: @cindex assumptions about C code 846: 847: @cindex accessing stack (pointer) 848: @cindex stack pointer, access 849: @cindex instruction pointer, access 850: Vmgen generates code and performs some optimizations under the 851: assumption that the user-supplied C code does not access the stack 852: pointers or stack items, and that accesses to the instruction pointer 853: only occur through special macros. In general you should heed these 854: restrictions. However, if you need to break these restrictions, read 855: the following. 856: 857: Accessing a stack or stack pointer directly can be a problem for several 858: reasons: 859: @cindex stack caching, restriction on C code 860: @cindex superinstructions, restrictions on components 861: 862: @itemize @bullet 863: 864: @item 865: Vmgen optionally supports caching the top-of-stack item in a local 866: variable (that is allocated to a register). This is the most frequent 867: source of trouble. 
You can deal with it either by not using 868: top-of-stack caching (slowdown factor 1-1.4, depending on machine), or 869: by inserting flushing code (e.g., @samp{IF_spTOS(sp[...] = spTOS);}) at 870: the start and reloading code (e.g., @samp{IF_spTOS(spTOS = sp[0])}) at 871: the end of problematic C code. Vmgen inserts a stack pointer update 872: before the start of the user-supplied C code, so the flushing code has 873: to use an index that corrects for that. In the future, this flushing 874: may be done automatically by mentioning a special string in the C code. 875: @c sometimes flushing and/or reloading unnecessary 876: 877: @item 878: The Vmgen-erated code loads the stack items from stack-pointer-indexed 879: memory into variables before the user-supplied C code, and stores them 880: from variables to stack-pointer-indexed memory afterwards. If you do 881: any writes to the stack through its stack pointer in your C code, they 882: will not affect the variables, and your write may be overwritten by the 883: stores after the C code. Similarly, a read from a stack using a stack 884: pointer will not reflect computations of stack items in the same VM 885: instruction. 886: 887: @item 888: Superinstructions keep stack items in variables across the whole 889: superinstruction. So you should not include VM instructions that 890: access a stack or stack pointer as components of superinstructions 891: (@pxref{VM profiler}). 892: 893: @end itemize 894: 895: You should access the instruction pointer only through its special 896: macros (@samp{IP}, @samp{SET_IP}, @samp{IPTOS}); this ensures that these 897: macros can be implemented in several ways for best performance. 898: @samp{IP} points to the next instruction, and @samp{IPTOS} is its 899: contents. 
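The flushing and reloading described for top-of-stack caching can be sketched in plain C as follows. This is only an illustration under stated assumptions: the function @code{sum_top} and the switch @code{USE_TOS_CACHE} are hypothetical, the cached item is passed in as a parameter rather than living in an engine-local variable, and the code has no stack effect, so the flush index is 0 (in a real instruction the index has to correct for the stack pointer update that Vmgen inserts before the C code).

```c
#include <assert.h>

typedef long Cell;

#define USE_TOS_CACHE 1            /* hypothetical switch for this sketch */

#if USE_TOS_CACHE
#define IF_spTOS(x) x              /* top-of-stack caching: execute x */
#else
#define IF_spTOS(x)                /* no caching: x disappears */
#endif

/* Adds up the n topmost stack items by indexing through sp directly,
   which is the kind of access the restrictions above warn about. */
Cell sum_top(Cell *sp, Cell spTOS, int n)
{
    Cell sum = 0;

    assert(n >= 1);
    IF_spTOS(sp[0] = spTOS);       /* flush: make memory agree with the cached item */
    for (int k = 0; k < n; k++)
        sum += sp[k];              /* direct stack access now sees the right value */
    IF_spTOS(spTOS = sp[0]);       /* reload: refresh the cache after the direct access */
    (void)spTOS;                   /* the reloaded value is unused in this sketch */
    return sum;
}
```

Without the flush, the loop would read whatever stale value happens to be in @code{sp[0]} instead of the cached top-of-stack item.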
900: 901: 902: @c -------------------------------------------------------------------- 903: @node Superinstructions, Register Machines, Simple instructions, Input File Format 904: @section Superinstructions 905: @cindex superinstructions, defining 906: @cindex defining superinstructions 907: 908: Note: don't invest too much work in (static) superinstructions; a future 909: version of Vmgen will support dynamic superinstructions (see Ian 910: Piumarta and Fabio Riccardi, @cite{Optimizing Direct Threaded Code by 911: Selective Inlining}, PLDI'98), and static superinstructions have much 912: less benefit in that context (preliminary results indicate only a factor 913: 1.1 speedup). 914: 915: Here is an example of a superinstruction definition: 916: 917: @example 918: lit_sub = lit sub 919: @end example 920: 921: @code{lit_sub} is the name of the superinstruction, and @code{lit} and 922: @code{sub} are its components. This superinstruction performs the same 923: action as the sequence @code{lit} and @code{sub}. It is generated 924: automatically by the VM code generation functions whenever that sequence 925: occurs, so if you want to use this superinstruction, you just need to 926: add this definition (and even that can be partially automatized, 927: @pxref{VM profiler}). 928: 929: @cindex prefixes of superinstructions 930: Vmgen requires that the component instructions are simple instructions 931: defined before superinstructions using the components. Currently, Vmgen 932: also requires that all the subsequences at the start of a 933: superinstruction (prefixes) must be defined as superinstruction before 934: the superinstruction. 
I.e., if you want to define a superinstruction 935: 936: @example 937: foo4 = load add sub mul 938: @end example 939: 940: you first have to define @code{load}, @code{add}, @code{sub} and 941: @code{mul}, plus 942: 943: @example 944: foo2 = load add 945: foo3 = load add sub 946: @end example 947: 948: Here, @code{foo3} is the longest prefix of @code{foo4}, and @code{foo2} 949: is the longest prefix of @code{foo3}. 950: 951: Note that Vmgen assumes that only the code it generates accesses stack 952: pointers, the instruction pointer, and various stack items, and it 953: performs optimizations based on this assumption. Therefore, VM 954: instructions where your C code changes the instruction pointer should 955: only be used as the last component; a VM instruction where your C code 956: accesses a stack pointer should not be used as a component at all. Vmgen 957: does not check these restrictions; violating them just results in bugs in your 958: interpreter. 959: 960: @c ------------------------------------------------------------------- 961: @node Register Machines, , Superinstructions, Input File Format 962: @section Register Machines 963: @cindex Register VM 964: @cindex Superinstructions for register VMs 965: @cindex tracing of register VMs 966: 967: If you want to implement a register VM rather than a stack VM with 968: Vmgen, there are two ways to do it: directly and through 969: superinstructions. 970: 971: If you use the direct way, you define instructions that take the 972: register numbers as immediate arguments, like this: 973: 974: @example 975: add3 ( #src1 #src2 #dest -- ) 976: reg[dest] = reg[src1]+reg[src2]; 977: @end example 978: 979: A disadvantage of this method is that during tracing you only see the 980: register numbers, but not the register contents. 
Actually, with an 981: appropriate definition of @code{printarg_src} (@pxref{VM engine}), you 982: can print the values of the source registers on entry, but you cannot 983: print the value of the destination register on exit. 984: 985: If you use superinstructions to define a register VM, you define simple 986: instructions that use a stack, and then define superinstructions that 987: have no overall stack effect, like this: 988: 989: @example 990: loadreg ( #src -- n ) 991: n = reg[src]; 992: 993: storereg ( n #dest -- ) 994: reg[dest] = n; 995: 996: adds ( n1 n2 -- n ) 997: n = n1+n2; 998: 999: add3 = loadreg loadreg adds storereg 1000: @end example 1001: 1002: An advantage of this method is that you see the values and not just the 1003: register numbers in tracing. A disadvantage of this method is that 1004: currently you cannot generate superinstructions directly, but only 1005: through generating a sequence of simple instructions (we might change 1006: this in the future if there is demand). 1007: 1008: Could the register VM support be improved, apart from the issues 1009: mentioned above? It is hard to see how to do it in a general way, 1010: because there are a number of different designs that different people 1011: mean when they use the term @emph{register machine} in connection with 1012: VM interpreters. However, if you have ideas or requests in that 1013: direction, please let me know (@pxref{Contact}). 1014: 1015: @c ******************************************************************** 1016: @node Using the generated code, Changes, Input File Format, Top 1017: @chapter Using the generated code 1018: @cindex generated code, usage 1019: @cindex Using vmgen-erated code 1020: 1021: The easiest way to create a working VM interpreter with Vmgen is 1022: probably to start with @file{vmgen-ex}, and modify it for your purposes. 1023: This chapter is just the reference manual for the macros etc. 
used by 1024: the generated code, the other context expected by the generated code, 1025: and what you can do with the various generated files. 1026: 1027: @menu 1028: * VM engine:: Executing VM code 1029: * VM instruction table:: 1030: * VM code generation:: Creating VM code (in the front-end) 1031: * Peephole optimization:: Creating VM superinstructions 1032: * VM disassembler:: for debugging the front end 1033: * VM profiler:: for finding worthwhile superinstructions 1034: @end menu 1035: 1036: @c -------------------------------------------------------------------- 1037: @node VM engine, VM instruction table, Using the generated code, Using the generated code 1038: @section VM engine 1039: @cindex VM instruction execution 1040: @cindex engine 1041: @cindex executing VM code 1042: @cindex @file{engine.c} 1043: @cindex @file{-vm.i} output file 1044: 1045: The VM engine is the VM interpreter that executes the VM code. It is 1046: essential for an interpretive system. 1047: 1048: Vmgen supports two methods of VM instruction dispatch: @emph{threaded 1049: code} (fast, but gcc-specific), and @emph{switch dispatch} (slow, but 1050: portable across C compilers); you can use conditional compilation 1051: (@samp{defined(__GNUC__)}) to choose between these methods, and our 1052: example does so. 1053: 1054: For both methods, the VM engine is contained in a C-level function. 1055: Vmgen generates most of the contents of the function for you 1056: (@file{@var{name}-vm.i}), but you have to define this function, and 1057: macros and variables used in the engine, and initialize the variables. 1058: In our example the engine function also includes 1059: @file{@var{name}-labels.i} (@pxref{VM instruction table}). 1060: 1061: @cindex tracing VM code 1062: In addition to executing the code, the VM engine can optionally also 1063: print out a trace of the executed instructions, their arguments and 1064: results. 
For superinstructions it prints the trace as if only component 1065: instructions were executed; this allows you to introduce new 1066: superinstructions while keeping the traces comparable to old ones 1067: (important for regression tests). 1068: 1069: It costs significant performance to check in each instruction whether to 1070: print trace output, so we recommend producing two copies of the engine: 1071: one for fast execution, and one for tracing. See the rules for 1072: @file{engine.o} and @file{engine-debug.o} in @file{vmgen-ex/Makefile} 1073: for an example. 1074: 1075: The following macros and variables are used in @file{@var{name}-vm.i}: 1076: 1077: @table @code 1078: 1079: @findex LABEL 1080: @item LABEL(@var{inst_name}) 1081: This is used just before each VM instruction to provide a jump or 1082: @code{switch} label (the @samp{:} is provided by Vmgen). For switch 1083: dispatch this should expand to @samp{case @var{label}}; for 1084: threaded-code dispatch this should just expand to @samp{@var{label}}. 1085: In either case @var{label} is usually the @var{inst_name} with some 1086: prefix or suffix to avoid naming conflicts. 1087: 1088: @findex LABEL2 1089: @item LABEL2(@var{inst_name}) 1090: This will be used for dynamic superinstructions; at the moment, this 1091: should expand to nothing. 1092: 1093: @findex NAME 1094: @item NAME(@var{inst_name_string}) 1095: Called on entering a VM instruction with a string containing the name of 1096: the VM instruction as parameter. In normal execution this should be a 1097: no-op, but for tracing this usually prints the name, and possibly other 1098: information (several VM registers in our example). 1099: 1100: @findex DEF_CA 1101: @item DEF_CA 1102: Usually empty. Called just inside a new scope at the start of a VM 1103: instruction. Can be used to define variables that should be visible 1104: during every VM instruction. 
If you define this macro as non-empty, you 1105: have to provide the finishing @samp{;} in the macro. 1106: 1107: @findex NEXT_P0 1108: @findex NEXT_P1 1109: @findex NEXT_P2 1110: @item NEXT_P0 NEXT_P1 NEXT_P2 1111: The three parts of instruction dispatch. They can be defined in 1112: different ways for best performance on various processors (see 1113: @file{engine.c} in the example or @file{engine/threaded.h} in Gforth). 1114: @samp{NEXT_P0} is invoked right at the start of the VM instruction (but 1115: after @samp{DEF_CA}), @samp{NEXT_P1} right after the user-supplied C 1116: code, and @samp{NEXT_P2} at the end. The actual jump has to be 1117: performed by @samp{NEXT_P2}. 1118: 1119: In the simplest variant, @samp{NEXT_P2} does everything and the other 1120: macros do nothing. Then the related macros (@samp{IP}, 1121: @samp{SET_IP}, @samp{INC_IP} and @samp{IPTOS}) are also very 1122: straightforward to define. For switch dispatch this code consists just 1123: of a jump to the dispatch code (@samp{goto next_inst;} in our example); 1124: for direct threaded code it consists of something like 1125: @samp{(@{cfa=*ip++; goto *cfa;@})}. 1126: 1127: Pulling code (usually the @samp{cfa=*ip++;}) up into @samp{NEXT_P1} 1128: usually does not cause problems, but pulling things up into 1129: @samp{NEXT_P0} usually requires changing the other macros (and, at least 1130: for Gforth on Alpha, it does not buy much, because the compiler often 1131: manages to schedule the relevant stuff up by itself). An even more 1132: extreme variant is to pull code up even further, into, e.g., @samp{NEXT_P1} of 1133: the previous VM instruction (prefetching, useful on PowerPCs). 1134: 1135: @findex INC_IP 1136: @item INC_IP(@var{n}) 1137: This increments @code{IP} by @var{n}. 1138: 1139: @findex SET_IP 1140: @item SET_IP(@var{target}) 1141: This sets @code{IP} to @var{target}. 
1142: 1143: @cindex type cast macro 1144: @findex vm_@var{A}2@var{B} 1145: @item vm_@var{A}2@var{B}(a,b) 1146: Type casting macro that assigns @samp{a} (of type @var{A}) to @samp{b} 1147: (of type @var{B}). This is mainly used for getting stack items into 1148: variables and back. So you need to define macros for every combination 1149: of stack basic type (@code{Cell} in our example) and type-prefix types 1150: used with that stack (in both directions). For the type-prefix type, 1151: you use the type-prefix (not the C type string) as type name (e.g., 1152: @samp{vm_Cell2i}, not @samp{vm_Cell2Cell}). In addition, you have to 1153: define a vm_@var{X}2@var{X} macro for the stack's basic type @var{X} 1154: (used in superinstructions). 1155: 1156: @cindex instruction stream, basic type 1157: The stack basic type for the predefined @samp{inst-stream} is 1158: @samp{Cell}. If you want a stack with the same item size, making its 1159: basic type @samp{Cell} usually reduces the number of macros you have to 1160: define. 1161: 1162: @cindex unions in type cast macros 1163: @cindex casts in type cast macros 1164: @cindex type casting between floats and integers 1165: Here our examples differ a lot: @file{vmgen-ex} uses casts in these 1166: macros, whereas @file{vmgen-ex2} uses union-field selection (or 1167: assignment to union fields). Note that casting floats into integers and 1168: vice versa changes the bit pattern (and you do not want that). In this 1169: case your options are to use a (temporary) union, or to take the address 1170: of the value, cast the pointer, and dereference that (not always 1171: possible, and sometimes expensive). 1172: 1173: @findex vm_two@var{A}2@var{B} 1174: @findex vm_@var{B}2two@var{A} 1175: @item vm_two@var{A}2@var{B}(a1,a2,b) 1176: @item vm_@var{B}2two@var{A}(b,a1,a2) 1177: Type casting between two stack items (@code{a1}, @code{a2}) and a 1178: variable @code{b} of a type that takes two stack items. 
This does not 1179: occur in our small examples, but you can look at Gforth for examples 1180: (see @code{vm_twoCell2d} in @file{engine/forth.h}). 1181: 1182: @cindex stack pointer definition 1183: @cindex instruction pointer definition 1184: @item @var{stackpointer} 1185: For each stack used, the stackpointer name given in the stack 1186: declaration is used. For a regular stack this must be an l-expression; 1187: typically it is a variable declared as a pointer to the stack's basic 1188: type. For @samp{inst-stream}, the name is @samp{IP}, and it can be a 1189: plain r-value; typically it is a macro that abstracts away the 1190: differences between the various implementations of @code{NEXT_P*}. 1191: 1192: @cindex top of stack caching 1193: @cindex stack caching 1194: @cindex TOS 1195: @findex IPTOS 1196: @item @var{stackpointer}TOS 1197: The top-of-stack for the stack pointed to by @var{stackpointer}. If you 1198: are using top-of-stack caching for that stack, this should be defined as 1199: variable; if you are not using top-of-stack caching for that stack, this 1200: should be a macro expanding to @samp{@var{stackpointer}[0]}. The stack 1201: pointer for the predefined @samp{inst-stream} is called @samp{IP}, so 1202: the top-of-stack is called @samp{IPTOS}. 1203: 1204: @findex IF_@var{stackpointer}TOS 1205: @item IF_@var{stackpointer}TOS(@var{expr}) 1206: Macro for executing @var{expr}, if top-of-stack caching is used for the 1207: @var{stackpointer} stack. I.e., this should do @var{expr} if there is 1208: top-of-stack caching for @var{stackpointer}; otherwise it should do 1209: nothing. 1210: 1211: @findex SUPER_END 1212: @item SUPER_END 1213: This is used by the VM profiler (@pxref{VM profiler}); it should not do 1214: anything in normal operation, and call @code{vm_count_block(IP)} for 1215: profiling. 1216: 1217: @findex SUPER_CONTINUE 1218: @item SUPER_CONTINUE 1219: This is just a hint to Vmgen and does nothing at the C level. 
1220: 1221: @findex VM_DEBUG 1222: @item VM_DEBUG 1223: If this is defined, the tracing code will be compiled in (slower 1224: interpretation, but better debugging). Our example compiles two 1225: versions of the engine, a fast-running one that cannot trace, and one 1226: with potential tracing and profiling. 1227: 1228: @findex vm_debug 1229: @item vm_debug 1230: Needed only if @samp{VM_DEBUG} is defined. If this variable contains 1231: true, the VM instructions produce trace output. It can be turned on or 1232: off at any time. 1233: 1234: @findex vm_out 1235: @item vm_out 1236: Needed only if @samp{VM_DEBUG} is defined. Specifies the file on which 1237: to print the trace output (type @samp{FILE *}). 1238: 1239: @findex printarg_@var{type} 1240: @item printarg_@var{type}(@var{value}) 1241: Needed only if @samp{VM_DEBUG} is defined. Macro or function for 1242: printing @var{value} in a way appropriate for the @var{type}. This is 1243: used for printing the values of stack items during tracing. @var{Type} 1244: is normally the type prefix specified in a @code{type-prefix} definition 1245: (e.g., @samp{printarg_i}); in superinstructions it is currently the 1246: basic type of the stack. 1247: 1248: @end table 1249: 1250: 1251: @c -------------------------------------------------------------------- 1252: @node VM instruction table, VM code generation, VM engine, Using the generated code 1253: @section VM instruction table 1254: @cindex instruction table 1255: @cindex opcode definition 1256: @cindex labels for threaded code 1257: @cindex @code{vm_prim}, definition 1258: @cindex @file{-labels.i} output file 1259: 1260: For threaded code we also need to produce a table containing the labels 1261: of all VM instructions. This is needed for VM code generation 1262: (@pxref{VM code generation}), and it has to be done in the engine 1263: function, because the labels are not visible outside. 
It then has to be 1264: passed outside the function (and assigned to @samp{vm_prim}), to be used 1265: by the VM code generation functions. 1266: 1267: This means that the engine function has to be called first to produce 1268: the VM instruction table, and later, after generating VM code, it has to 1269: be called again to execute the generated VM code (yes, this is ugly). 1270: In our example program, these two modes of calling the engine function 1271: are differentiated by the value of the parameter @code{ip0} (if it equals 0, 1272: then the table is passed out, otherwise the VM code is executed); 1273: we pass the table out by assigning it to @samp{vm_prim} and 1274: returning from @samp{engine}. 1275: 1276: In our example (@file{vmgen-ex/engine.c}), we also build such a table for 1277: switch dispatch; this is mainly done for uniformity. 1278: 1279: For switch dispatch, we also need to define the VM instruction opcodes 1280: used as case labels in an @code{enum}. 1281: 1282: For both purposes (VM instruction table, and enum), the file 1283: @file{@var{name}-labels.i} is generated by Vmgen. You have to define 1284: the following macro used in this file: 1285: 1286: @table @code 1287: 1288: @findex INST_ADDR 1289: @item INST_ADDR(@var{inst_name}) 1290: For switch dispatch, this is just the name of the switch label (the same 1291: name as used in @samp{LABEL(@var{inst_name})}), for both uses of 1292: @file{@var{name}-labels.i}. For threaded-code dispatch, this is the 1293: address of the label defined in @samp{LABEL(@var{inst_name})}; the 1294: address is taken with @samp{&&} (@pxref{Labels as Values, , Labels as 1295: Values, gcc.info, GNU C Manual}). 
1296: 1297: @end table 1298: 1299: 1300: @c -------------------------------------------------------------------- 1301: @node VM code generation, Peephole optimization, VM instruction table, Using the generated code 1302: @section VM code generation 1303: @cindex VM code generation 1304: @cindex code generation, VM 1305: @cindex @file{-gen.i} output file 1306: 1307: Vmgen generates VM code generation functions in @file{@var{name}-gen.i} 1308: that the front end can call to generate VM code. This is essential for 1309: an interpretive system. 1310: 1311: @findex gen_@var{inst} 1312: For a VM instruction @samp{x ( #a b #c -- d )}, Vmgen generates a 1313: function with the prototype 1314: 1315: @example 1316: void gen_x(Inst **ctp, a_type a, c_type c) 1317: @end example 1318: 1319: The @code{ctp} argument points to a pointer to the next instruction. 1320: @code{*ctp} is advanced by the generation functions; i.e., you should 1321: allocate memory for the code to be generated beforehand, and start with 1322: @code{*ctp} set to the start of this memory area. Before running out of 1323: memory, allocate a new area, and generate a VM-level jump to the new 1324: area (this overflow handling is not implemented in our examples). 1325: 1326: @cindex immediate arguments, VM code generation 1327: The other arguments correspond to the immediate arguments of the VM 1328: instruction (with their appropriate types as defined in the 1329: @code{type-prefix} declaration). 1330: 1331: The following types, variables, and functions are used in 1332: @file{@var{name}-gen.i}: 1333: 1334: @table @code 1335: 1336: @findex Inst 1337: @item Inst 1338: The type of the VM instruction; if you use threaded code, this is 1339: @code{void *}; for switch dispatch this is an integer type. 1340: 1341: @cindex @code{vm_prim}, use 1342: @item vm_prim 1343: The VM instruction table (type: @code{Inst *}, @pxref{VM instruction table}). 
1344: 1345: @findex gen_inst 1346: @item gen_inst(Inst **ctp, Inst i) 1347: This function compiles the instruction @code{i}. Take a look at it in 1348: @file{vmgen-ex/peephole.c}. It is trivial when you don't want to use 1349: superinstructions (just the last two lines of the example function), and 1350: slightly more complicated in the example due to its ability to use 1351: superinstructions (@pxref{Peephole optimization}). 1352: 1353: @findex genarg_@var{type_prefix} 1354: @item genarg_@var{type_prefix}(Inst **ctp, @var{type} @var{type_prefix}) 1355: This compiles an immediate argument of @var{type} (as defined in a 1356: @code{type-prefix} definition). These functions are trivial to define 1357: (see @file{vmgen-ex/support.c}). You need one of these functions for 1358: every type that you use as immediate argument. 1359: 1360: @end table 1361: 1362: @findex BB_BOUNDARY 1363: In addition to using these functions to generate code, you should call 1364: @code{BB_BOUNDARY} at every basic block entry point if you ever want to 1365: use superinstructions (or if you want to use the profiling supported by 1366: Vmgen; but this support is also useful mainly for selecting 1367: superinstructions). If you use @code{BB_BOUNDARY}, you should also 1368: define it (take a look at its definition in @file{vmgen-ex/mini.y}). 1369: 1370: You do not need to call @code{BB_BOUNDARY} after branches, because you 1371: will not define superinstructions that contain branches in the middle 1372: (and if you did, and it would work, there would be no reason to end the 1373: superinstruction at the branch), and because the branches announce 1374: themselves to the profiler. 
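As a rough sketch of how these pieces fit together when no superinstructions are used, the following C fragment hand-writes the kind of functions a front end ends up calling. It assumes switch dispatch (so @code{Inst} is an integer type); the instruction numbers @code{N_lit} and @code{N_add}, the opcode values in the table, and the helper @code{compile_sum} are hypothetical and only serve the illustration. The @code{gen_lit} and @code{gen_add} functions here are written by hand in the style of the prototype shown above and are not actual Vmgen output.

```c
#include <assert.h>
#include <stddef.h>

typedef long Cell;
typedef Cell Inst;                       /* switch dispatch: an integer type */

enum { N_lit, N_add };                   /* hypothetical instruction numbers */
static Inst vm_prim_table[] = { 100, 101 };  /* hypothetical opcodes */
Inst *vm_prim = vm_prim_table;           /* the VM instruction table */

/* Trivial gen_inst: lay down the instruction and advance *ctp. */
void gen_inst(Inst **ctp, Inst i)
{
    **ctp = i;
    (*ctp)++;
}

/* Compile an immediate argument of type-prefix i. */
void genarg_i(Inst **ctp, Cell i)
{
    *(Cell *)*ctp = i;
    (*ctp)++;
}

/* Hand-written stand-ins for generated functions:
   lit ( #i -- i ) and add ( i1 i2 -- i ). */
void gen_lit(Inst **ctp, Cell i)
{
    gen_inst(ctp, vm_prim[N_lit]);
    genarg_i(ctp, i);
}

void gen_add(Inst **ctp)
{
    gen_inst(ctp, vm_prim[N_add]);
}

/* Front-end helper: compile "a b add" into a preallocated area and
   return the number of cells used. */
size_t compile_sum(Inst *area, size_t size, Cell a, Cell b)
{
    Inst *ctp = area;                    /* start at the beginning of the area */

    assert(size >= 5);                   /* no overflow handling in this sketch */
    gen_lit(&ctp, a);
    gen_lit(&ctp, b);
    gen_add(&ctp);
    return (size_t)(ctp - area);
}
```

The caller allocates the code area beforehand, as described above; the sketch only checks that the area is large enough instead of chaining to a new area.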
1375: 1376: 1377: @c -------------------------------------------------------------------- 1378: @node Peephole optimization, VM disassembler, VM code generation, Using the generated code 1379: @section Peephole optimization 1380: @cindex peephole optimization 1381: @cindex superinstructions, generating 1382: @cindex @file{peephole.c} 1383: @cindex @file{-peephole.i} output file 1384: 1385: You need peephole optimization only if you want to use 1386: superinstructions. But having the code for it does not hurt much if you 1387: do not use superinstructions. 1388: 1389: A simple greedy peephole optimization algorithm is used for 1390: superinstruction selection: every time @code{gen_inst} compiles a VM 1391: instruction, it checks if it can combine it with the last VM instruction 1392: (which may also be a superinstruction resulting from a previous peephole 1393: optimization); if so, it changes the last instruction to the combined 1394: instruction instead of laying down @code{i} at the current @samp{*ctp}. 1395: 1396: The code for peephole optimization is in @file{vmgen-ex/peephole.c}. 1397: You can use this file almost verbatim. Vmgen generates 1398: @file{@var{file}-peephole.i} which contains data for the peephole 1399: optimizer. 1400: 1401: @findex init_peeptable 1402: You have to call @samp{init_peeptable()} after initializing 1403: @samp{vm_prim}, and before compiling any VM code to initialize data 1404: structures for peephole optimization. After that, compiling with the VM 1405: code generation functions will automatically combine VM instructions 1406: into superinstructions. Since you do not want to combine instructions 1407: across VM branch targets (otherwise there will not be a proper VM 1408: instruction to branch to), you have to call @code{BB_BOUNDARY} 1409: (@pxref{VM code generation}) at branch targets. 
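The greedy combination step can be sketched as follows. This is not the code in @file{vmgen-ex/peephole.c}; it is a simplified illustration that assumes integer opcodes, a tiny hand-written combination table searched linearly, and a @code{BB_BOUNDARY} that just forgets the last compiled instruction. The names @code{I_LIT}, @code{I_SUB}, @code{I_LIT_SUB}, @code{peeptable} and @code{last_compiled} are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef int Inst;

enum { I_LIT, I_SUB, I_LIT_SUB };        /* hypothetical opcodes */

/* One entry per rule: "combined = first second". */
struct combination { Inst first, second, combined; };
static const struct combination peeptable[] = {
    { I_LIT, I_SUB, I_LIT_SUB },         /* lit_sub = lit sub */
};

static Inst *last_compiled = NULL;       /* last instruction, or NULL at a block start */

/* Forget the last instruction so nothing is combined across this point. */
#define BB_BOUNDARY (last_compiled = NULL)

void gen_inst(Inst **ctp, Inst i)
{
    if (last_compiled != NULL) {
        for (size_t k = 0; k < sizeof peeptable / sizeof peeptable[0]; k++) {
            if (peeptable[k].first == *last_compiled &&
                peeptable[k].second == i) {
                /* Rewrite the last instruction in place; do not lay down i. */
                *last_compiled = peeptable[k].combined;
                return;
            }
        }
    }
    last_compiled = *ctp;                /* remember where this instruction lives */
    **ctp = i;
    (*ctp)++;
}
```

Because the last instruction is rewritten in place, its immediate arguments stay where they are, and a combined instruction can itself be combined again with the next one if a matching rule exists. Calling @code{BB_BOUNDARY} before an instruction guarantees that it is laid down as a separate instruction and can serve as a branch target.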
1410: 1411: 1412: @c -------------------------------------------------------------------- 1413: @node VM disassembler, VM profiler, Peephole optimization, Using the generated code 1414: @section VM disassembler 1415: @cindex VM disassembler 1416: @cindex disassembler, VM code 1417: @cindex @file{disasm.c} 1418: @cindex @file{-disasm.i} output file 1419: 1420: A VM code disassembler is optional for an interpretive system, but 1421: highly recommended during its development and maintenance, because it is 1422: very useful for detecting bugs in the front end (and for distinguishing 1423: them from VM interpreter bugs). 1424: 1425: Vmgen supports VM code disassembling by generating 1426: @file{@var{file}-disasm.i}. This code has to be wrapped into a 1427: function, as is done in @file{vmgen-ex/disasm.c}. You can use this file 1428: almost verbatim. In addition to @samp{vm_@var{A}2@var{B}(a,b)}, 1429: @samp{vm_out}, and @samp{printarg_@var{type}(@var{value})}, which are 1430: explained above, the following macros and variables are used in 1431: @file{@var{file}-disasm.i} (and you have to define them): 1432: 1433: @table @code 1434: 1435: @item ip 1436: This variable points to the opcode of the current VM instruction. 1437: 1438: @cindex @code{IP}, @code{IPTOS} in disassembler 1439: @item IP IPTOS 1440: @samp{IPTOS} is the first argument of the current VM instruction, and 1441: @samp{IP} points to it; this is just as in the engine, but here 1442: @samp{ip} points to the opcode of the VM instruction (in contrast to the 1443: engine, where @samp{ip} points to the next cell, or even one further). 1444: 1445: @findex VM_IS_INST 1446: @item VM_IS_INST(Inst i, int n) 1447: Tests if the opcode @samp{i} is the same as the @samp{n}th entry in the 1448: VM instruction table. 
1449: 1450: @end table 1451: 1452: 1453: @c -------------------------------------------------------------------- 1454: @node VM profiler, , VM disassembler, Using the generated code 1455: @section VM profiler 1456: @cindex VM profiler 1457: @cindex profiling for selecting superinstructions 1458: @cindex superinstructions and profiling 1459: @cindex @file{profile.c} 1460: @cindex @file{-profile.i} output file 1461: 1462: The VM profiler is designed for getting execution and occurrence counts 1463: for VM instruction sequences, and these counts can then be used for 1464: selecting sequences as superinstructions. The VM profiler is probably 1465: not useful as a profiling tool for the interpretive system. I.e., the VM 1466: profiler is useful for the developers, but not the users of the 1467: interpretive system. 1468: 1469: The output of the profiler is: for each basic block (executed at least 1470: once), it produces the dynamic execution count of that basic block and 1471: all its subsequences; e.g., 1472: 1473: @example 1474: 9227465 lit storelocal 1475: 9227465 storelocal branch 1476: 9227465 lit storelocal branch 1477: @end example 1478: 1479: I.e., a basic block consisting of @samp{lit storelocal branch} is 1480: executed 9227465 times. 1481: 1482: @cindex @file{stat.awk} 1483: @cindex @file{seq2rule.awk} 1484: This output can be combined in various ways. E.g., 1485: @file{vmgen-ex/stat.awk} adds up the occurrences of a given sequence with respect to 1486: dynamic execution, static occurrence, and per-program occurrence. E.g., 1487: 1488: @example 1489: 2 16 36910041 loadlocal lit 1490: @end example 1491: 1492: @noindent 1493: indicates that the sequence @samp{loadlocal lit} occurs in 2 programs, 1494: in 16 places, and has been executed 36910041 times. Now you can select 1495: superinstructions in any way you like (note that compile time and space 1496: typically limit the number of superinstructions to 100--1000). 
After 1497: you have done that, @file{vmgen/seq2rule.awk} turns lines of the form 1498: above into rules for inclusion in a Vmgen input file. Note that this 1499: script does not ensure that all prefixes are defined, so you have to do 1500: that in other ways. So, an overall script for turning profiles into 1501: superinstructions can look like this: 1502: 1503: @example 1504: awk -f stat.awk fib.prof test.prof| 1505: awk '$3>=10000'| #select sequences 1506: fgrep -v -f peephole-blacklist| #eliminate wrong instructions 1507: awk -f seq2rule.awk| #turn into superinstructions 1508: sort -k 3 >mini-super.vmg #sort sequences 1509: @end example 1510: 1511: Here the dynamic count is used for selecting sequences (preliminary 1512: results indicate that the static count gives better results, though); 1513: the third line eliminates sequences containing instructions that must not 1514: occur in a superinstruction, because they access a stack directly. The 1515: dynamic count selection ensures that all subsequences (including 1516: prefixes) of longer sequences occur (because subsequences have at least 1517: the same count as the longer sequences); the sort in the last line 1518: ensures that longer superinstructions occur after their prefixes. 1519: 1520: But before using this, you have to have the profiler. Vmgen supports its 1521: creation by generating @file{@var{file}-profile.i}; you also need the 1522: wrapper file @file{vmgen-ex/profile.c} that you can use almost verbatim. 1523: 1524: @cindex @code{SUPER_END} in profiling 1525: @cindex @code{BB_BOUNDARY} in profiling 1526: The profiler works by recording the targets of all VM control flow 1527: changes (through @code{SUPER_END} during execution, and through 1528: @code{BB_BOUNDARY} in the front end), and counting (through 1529: @code{SUPER_END}) how often they were targeted. 
After the program run,
the numbers are corrected such that each VM basic block has the correct
count (entering a block without executing a branch does not increase the
count, and the correction fixes that), then the subsequences of all
basic blocks are printed.  To get all this, you just have to define
@code{SUPER_END} (and @code{BB_BOUNDARY}) appropriately, and call
@code{vm_print_profile(FILE *file)} when you want to output the profile
on @code{file}.

@cindex @code{VM_IS_INST} in profiling
The @file{@var{file}-profile.i} file is similar to the disassembler file,
and it uses variables and functions defined in @file{vmgen-ex/profile.c},
plus @code{VM_IS_INST} already defined for the VM disassembler
(@pxref{VM disassembler}).


@c **********************************************************
@node Changes, Contact, Using the generated code, Top
@chapter Changes
@cindex Changes from old versions

Users of the gforth-0.5.9-20010501 version of Vmgen need to change
several things in their source code to use the current version.  I
recommend keeping the gforth-0.5.9-20010501 version until you have
completed the change (note that you can have several versions of Gforth
installed at the same time).  I hope to avoid such incompatible changes
in the future.

The required changes are:

@table @code

@cindex @code{vm_@var{A}2@var{B}}, changes
@item vm_@var{A}2@var{B}
now takes two arguments.

@cindex @code{vm_two@var{A}2@var{B}}, changes
@item vm_two@var{A}2@var{B}(b,a1,a2);
changed to vm_two@var{A}2@var{B}(a1,a2,b) (note the absence of the @samp{;}).

@end table

Also, some new macros have to be defined, e.g., @code{INST_ADDR} and
@code{LABEL}; some macros have to be defined in new contexts, e.g.,
@code{VM_IS_INST} is now also needed in the disassembler.

@c *********************************************************
@node Contact, Copying This Manual, Changes, Top
@chapter Contact

@c ***********************************************************
@node Copying This Manual, Index, Contact, Top
@appendix Copying This Manual

@menu
* GNU Free Documentation License::  License for copying this manual.
@end menu

@include fdl.texi


@node Index, , Copying This Manual, Top
@unnumbered Index

@printindex cp

@bye