--- gforth/doc/vmgen.texi 2002/08/07 22:19:44 1.6 +++ gforth/doc/vmgen.texi 2002/08/09 09:42:35 1.9 @@ -658,6 +658,12 @@ contents. @section Superinstructions +Note: don't invest too much work in (static) superinstructions; a future +version of vmgen will support dynamic superinstructions (see Ian +Piumarta and Fabio Riccardi, @cite{Optimizing Direct Threaded Code by +Selective Inlining}, PLDI'98), and static superinstructions have much +less benefit in that context. + Here is an example of a superinstruction definition: @example @@ -743,6 +749,10 @@ threaded-code dispatch this should just @var{label}}. In either case @var{label} is usually the @var{inst_name} with some prefix or suffix to avoid naming conflicts. +@item LABEL2(@var{inst_name}) +This will be used for dynamic superinstructions; at the moment, this +should expand to nothing. + @item NAME(@var{inst_name_string}) Called on entering a VM instruction with a string containing the name of the VM instruction as parameter. In normal execution this should be a @@ -781,7 +791,10 @@ extreme variant is to pull code up even the previous VM instruction (prefetching, useful on PowerPCs). @item INC_IP(@var{n}) -This increments IP by @var{n}. +This increments @code{IP} by @var{n}. + +@item SET_IP(@var{target}) +This sets @code{IP} to @var{target}. @item vm_@var{A}2@var{B}(a,b) Type casting macro that assigns @samp{a} (of type @var{A}) to @samp{b} @@ -831,6 +844,14 @@ Macro for executing @var{expr}, if top-o top-of-stack caching for @var{stackpointer}; otherwise it should do nothing. +@item SUPER_END +This is used by the VM profiler (@pxref{VM profiler}); it should not do +anything in normal operation, and call @code{vm_count_block(IP)} for +profiling. + +@item SUPER_CONTINUE +This is just a hint to vmgen and does nothing at the C level. + @item VM_DEBUG If this is defined, the tracing code will be compiled in (slower interpretation, but better debugging). 
Our example compiles two @@ -1023,22 +1044,107 @@ VM instruction table. @end table +@section VM profiler +The VM profiler is designed for getting execution and occurrence counts +for VM instruction sequences, and these counts can then be used for +selecting sequences as superinstructions. The VM profiler is probably +not useful as a profiling tool for the interpretive system. I.e., the VM +profiler is useful for the developers, but not the users of the +interpretive system. + +The output of the profiler is: for each basic block (executed at least +once), it produces the dynamic execution count of that basic block and +all its subsequences; e.g., +@example + 9227465 lit storelocal + 9227465 storelocal branch + 9227465 lit storelocal branch +@end example +I.e., a basic block consisting of @samp{lit storelocal branch} is +executed 9227465 times. +This output can be combined in various ways. E.g., +@file{vmgen/stat.awk} adds up the occurrences of a given sequence wrt +dynamic execution, static occurrence, and per-program occurrence. E.g., -Invocation +@example + 2 16 36910041 loadlocal lit +@end example -Input Syntax +indicates that the sequence @samp{loadlocal lit} occurs in 2 programs, +in 16 places, and has been executed 36910041 times. Now you can select +superinstructions in any way you like (note that compile time and space +typically limit the number of superinstructions to 100--1000). After +you have done that, @file{vmgen/seq2rule.awk} turns lines of the form +above into rules for inclusion in a vmgen input file. Note that this +script does not ensure that all prefixes are defined, so you have to do +that in other ways.
So, an overall script for turning profiles into +superinstructions can look like this: -Concepts: Front end, VM, Stacks, Types, input stream +@example +awk -f stat.awk fib.prof test.prof| +awk '$3>=10000'| #select sequences +fgrep -v -f peephole-blacklist| #eliminate wrong instructions +awk -f seq2rule.awk| #turn into superinstructions +sort -k 3 >mini-super.vmg #sort sequences +@end example + +Here the dynamic count is used for selecting sequences (preliminary +results indicate that the static count gives better results, though); +the third line eliminates sequences containing instructions that must not +occur in a superinstruction, because they access a stack directly. The +dynamic count selection ensures that all subsequences (including +prefixes) of longer sequences occur (because subsequences have at least +the same count as the longer sequences); the sort in the last line +ensures that longer superinstructions occur after their prefixes. + +But before using it, you have to have the profiler. Vmgen supports its +creation by generating @file{@var{file}-profile.i}; you also need the +wrapper file @file{vmgen-ex/profile.c} that you can use almost verbatim. + +The profiler works by recording the targets of all VM control flow +changes (through @code{SUPER_END} during execution, and through +@code{BB_BOUNDARY} in the front end), and counting (through +@code{SUPER_END}) how often they were targeted. After the program run, +the numbers are corrected such that each VM basic block has the correct +count (originally entering a block without executing a branch does not +increase the count), then the subsequences of all basic blocks are +printed. To get all this, you just have to define @code{SUPER_END} (and +@code{BB_BOUNDARY}) appropriately, and call @code{vm_print_profile(FILE +*file)} when you want to output the profile on @code{file}.
+ +The @file{@var{file}-profile.i} is similar to the disassembler file, and +it uses variables and functions defined in @file{vmgen-ex/profile.c}, +plus @code{VM_IS_INST} already defined for the VM disassembler +(@pxref{VM disassembler}). + +@chapter Changes + +Users of the gforth-0.5.9-20010501 version of vmgen need to change +several things in their source code to use the current version. I +recommend keeping the gforth-0.5.9-20010501 version until you have +completed the change (note that you can have several versions of Gforth +installed at the same time). I hope to avoid such incompatible changes +in the future. + +The required changes are: + +@table @code + +@item vm_@var{A}2@var{B} +now takes two arguments. + +@item vm_two@var{A}2@var{B}(b,a1,a2); +changed to vm_two@var{A}2@var{B}(a1,a2,b) (note the absence of the @samp{;}). + +@end table -Contact +Also some new macros have to be defined, e.g., @code{INST_ADDR}, and +@code{LABEL}; some macros have to be defined in new contexts, e.g., +@code{VM_IS_INST} is now also needed in the disassembler. +@chapter Contact -Required changes: -vm_...2... -> two arguments -"vm_two...2...(arg1,arg2,arg3);" -> "vm_two...2...(arg3,arg1,arg2)" (no ";"). -define INST_ADDR and LABEL -define VM_IS_INST also for disassembler
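
Note on the profiler output format described in the new VM profiler section: each executed basic block is reported together with all its subsequences, and every reported line carries the block's execution count. The following is a minimal sketch of that enumeration, not code from vmgen or from vmgen-ex/profile.c. The function name `print_block_subsequences`, its parameters, and the column width are hypothetical and chosen only for illustration; it assumes that "subsequences" means contiguous runs of at least two instructions, which is what the three-line example in the section shows.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of vmgen): print one line for every
   contiguous subsequence of length >= 2 of the basic block
   names[0..n-1], each prefixed with the block's execution count.
   Shorter subsequences are printed first, left to right.
   Returns the number of lines written. */
int print_block_subsequences(FILE *out, unsigned long count,
                             const char *const names[], int n)
{
    int lines = 0;

    assert(out != NULL && n >= 0);
    for (int len = 2; len <= n; len++) {
        for (int start = 0; start + len <= n; start++) {
            /* Column width 10 is an arbitrary choice for this sketch. */
            fprintf(out, "%10lu", count);
            for (int i = start; i < start + len; i++)
                fprintf(out, " %s", names[i]);
            fputc('\n', out);
            lines++;
        }
    }
    return lines;
}
```

Called with the count 9227465 and the block `lit storelocal branch`, this writes the sequences `lit storelocal`, `storelocal branch`, and `lit storelocal branch`, in that order. A block of n instructions yields n(n-1)/2 lines, which is one reason the aggregated profiles grow quickly and are then filtered before being turned into superinstruction rules.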