Annotation of gforth/prof-inline.fs, revision 1.2

1.1       anton       1: \ get some data on potential (partial) inlining
                      2: 
                      3: \ Copyright (C) 2004 Free Software Foundation, Inc.
                      4: 
                      5: \ This file is part of Gforth.
                      6: 
                      7: \ Gforth is free software; you can redistribute it and/or
                      8: \ modify it under the terms of the GNU General Public License
                      9: \ as published by the Free Software Foundation; either version 2
                     10: \ of the License, or (at your option) any later version.
                     11: 
                     12: \ This program is distributed in the hope that it will be useful,
                     13: \ but WITHOUT ANY WARRANTY; without even the implied warranty of
                     14: \ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
                     15: \ GNU General Public License for more details.
                     16: 
                     17: \ You should have received a copy of the GNU General Public License
                     18: \ along with this program; if not, write to the Free Software
                     19: \ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.
                     20: 
                     21: 
                     22: \ relies on some Gforth internals
                     23: 
                     24: \ !! assumption: each file is included only once; otherwise you get
                     25: \ the counts for just one of the instances of the file.  This can be
                     26: \ fixed by making sure that every source position occurs only once as
                     27: \ a profile point.
                     28: 
                     29: true constant count-calls? \ do some profiling of colon definitions etc.
                     30: 
                     31: \ for true COUNT-CALLS?:
                     32: 
                     33: \ What data do I need for evaluating the effectiveness of (partial) inlining?
                     34: 
                     35: \ static and dynamic counts of everything:
                     36: 
                     37: \ original BB length (histogram and average)
                     38: \ BB length with partial inlining (histogram and average)
                     39: \   since we cannot partially inline library calls, we use a parameter
                     40: \   that represents the amount of partial inlining we can expect there.
                     41: \ number of tail calls (original and after partial inlining)
                     42: \ number of calls (original and after partial inlining)
                     43: \ reason for BB end: call, return, execute, branch
                     44: 
                     45: \ how many static calls are there to a word?  How many of the dynamic
                     46: \ calls call just a single word?
                     47: 
1.2     ! anton      48: \ how much does inlining called-once words help?
        !            49: \ how much does inlining words without control flow help?
        !            50: \ how much does partial inlining help?
        !            51: \ what's the overlap?
        !            52: \ optimizing return-to-returns (tail calls), return-to-calls, call-to-calls
        !            53: 
1.1       anton      54: struct
1.2     ! anton      55:     cell% list-next
        !            56: end-struct list%
        !            57: 
        !            58: list%
1.1       anton      59:     cell% 2* field profile-count
                     60:     cell% 2* field profile-sourcepos
                     61:     cell%    field profile-char \ character position in line
                     62:     count-calls? [if]
                     63:        cell% field profile-colondef? \ is this a colon definition start
1.2     ! anton      64:        cell% field profile-calls \ static calls to the colon def (calls%)
1.1       anton      65:        cell% field profile-straight-line \ may contain calls, but no other CF
                     66:        cell% field profile-calls-from \ static calls in the colon def
                     67:     [endif]
                     68: end-struct profile% \ profile point
                     69: 
1.2     ! anton      70: list%
        !            71:     cell% field calls%-call \ ptr to profile point of bb containing the call
        !            72: end-struct calls%
        !            73: 
1.1       anton      74: variable profile-points \ linked list of profile%
                     75: 0 profile-points !
                     76: variable next-profile-point-p \ the address where the next pp will be stored
                     77: profile-points next-profile-point-p !
                     78: count-calls? [if]
                     79:     variable last-colondef-profile \ pointer to the pp of last colon definition
                     80: [endif]
1.2     ! anton      81: 
        !            82: \ list stuff
        !            83: 
        !            84: 
        !            85: 
        !            86: \ profile-point stuff   
        !            87: 
1.1       anton      88: : new-profile-point ( -- addr )
                     89:     profile% %alloc >r
                     90:     0. r@ profile-count 2!
                     91:     current-sourcepos r@ profile-sourcepos 2!
                     92:     >in @ r@ profile-char !
                     93:     [ count-calls? ] [if]
                     94:        r@ profile-colondef? off
                     95:        0 r@ profile-calls !
                     96:        r@ profile-straight-line on
                     97:        0 r@ profile-calls-from !
                     98:     [endif]
1.2     ! anton      99:     0 r@ list-next !
1.1       anton     100:     r@ next-profile-point-p @ !
1.2     ! anton     101:     r@ list-next next-profile-point-p !
1.1       anton     102:     r> ;
                    103: 
                    104: : print-profile ( -- )
                    105:     profile-points @ begin
                    106:        dup while
                    107:            dup >r
                    108:            r@ profile-sourcepos 2@ .sourcepos ." :"
                    109:            r@ profile-char @ 0 .r ." : "
                    110:            r@ profile-count 2@ 0 d.r cr
1.2     ! anton     111:            r> list-next @
1.1       anton     112:     repeat
                    113:     drop ;
                    114: 
                    115: : print-profile-coldef ( -- )
                    116:     profile-points @ begin
                    117:        dup while
                    118:            dup >r
                    119:            r@ profile-colondef? @ if
                    120:                r@ profile-sourcepos 2@ .sourcepos ." :"
                    121:                r@ profile-char @ 3 .r ." : "
                    122:                r@ profile-count 2@ 10 d.r
                    123:                r@ profile-straight-line @ space 2 .r
                    124:                r@ profile-calls @ 4 .r
                    125:                cr
                    126:            endif
1.2     ! anton     127:            r> list-next @
1.1       anton     128:     repeat
                    129:     drop ;
                    130: 
                    131: : dinc ( profilep -- )
                    132:     \ increment the double-cell PROFILE-COUNT of profile point profilep
                    133:     profile-count dup 2@ 1. d+ rot 2! ;
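
                         \ Illustrative sketch, not part of the original file: the same
                         \ double-cell increment applied to a standalone counter.  DEMO-COUNT
                         \ and DEMO-DINC are hypothetical names; change FALSE to TRUE to try it.
                         false [if]
                             2variable demo-count  0. demo-count 2!
                             : demo-dinc ( d-addr -- ) \ add 1 to the double stored at d-addr
                                 dup 2@ 1. d+ rot 2! ;
                             demo-count demo-dinc  demo-count 2@ d. \ prints 1
                         [endif]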
                    134: 
                    135: : profile-this ( -- )
                    136:     new-profile-point POSTPONE literal POSTPONE dinc ;
                    137: 
                    138: \ Various words trigger PROFILE-THIS.  In order to avoid getting
                    139: \ several calls to PROFILE-THIS from a compiling word (like ?EXIT), we
                    140: \ just wait until the next word is parsed by the text interpreter (in
                    141: \ compile state) and call PROFILE-THIS only once then.  The whole
                    142: \ BEFORE-WORD hooking etc. is there for this.
                    143: 
                    144: \ We do this because we use the source position for
                    145: \ the profiling information, and there's only one source position for
                    146: \ ?EXIT.  If we used the threaded code position instead, we would see
                    147: \ that ?EXIT compiles to several threaded-code words, and could use
                    148: \ different profile points for them.  However, usually dealing with
                    149: \ the source is more practical.
                    150: 
                    151: \ Another benefit is that we can ask for profiling anywhere in a
                    152: \ control-flow word (even before it compiles its own stuff).
                    153: 
                    154: \ Potential problem: Consider "COMPILING ] [" where COMPILING compiles
                    155: \ a whole colon definition (and triggers our profiler), but during the
                    156: \ compilation of the colon definition there is no parsing.  Afterwards
                    157: \ you get interpret state at first (no profiling, either), but after
                    158: \ the "]" you get parsing in compile state, and PROFILE-THIS gets
                    159: \ called (and compiles code that is never executed).  It would be
                    160: \ better if we had a way of knowing whether we are in a colon def or
                    161: \ not (and used that knowledge instead of STATE).
                    162: 
                    163: Defer before-word-profile ( -- )
                    164: ' noop IS before-word-profile
                    165: 
                    166: : before-word1 ( -- )
                    167:     before-word-profile defers before-word ;
                    168: 
                    169: ' before-word1 IS before-word
                    170: 
                    171: : profile-this-compiling ( -- )
                    172:     state @ if
                    173:        profile-this
                    174:        ['] noop IS before-word-profile
                    175:     endif ;
                    176: 
                    177: : cock-profiler ( -- )
                    178:     \ as in cocking a gun: arm now, the next parsed word pulls the trigger
                    179:     ['] profile-this-compiling IS before-word-profile
                    180:     [ count-calls? ] [if] \ we are at a non-colondef profile point
                    181:        last-colondef-profile @ profile-straight-line off
                    182:     [endif]
                    183: ;
                    184: 
                    185: : hook-profiling-into ( "name" -- )
                    186:     \ make (deferred word) "name" call cock-profiler, too
                    187:     ' >body >r :noname
                    188:     POSTPONE cock-profiler
                    189:     r@ @ compile, \ old hook behaviour
                    190:     POSTPONE ;
                    191:     r> ! ; \ change hook behaviour
                    192: 
                    193: : note-execute ( -- )
                    194:     \ end of BB due to execute
                    195: ;
                    196: 
                    197: : note-call ( addr -- )
                    198:     \ addr is the body address of a called colon def or does handler
                    199:     dup 3 cells + @ ['] dinc >body = if \ does the body start with PROFILE-THIS code?
                    200:        1 over  cell+ @ profile-calls +! \ cell 1 holds the profile point address
                    201:     endif
                    202:     drop ;
                    203:     
                    204: : prof-compile, ( xt -- )
                    205:     dup >does-code if
                    206:        dup >does-code note-call
                    207:     then
                    208:     dup >code-address CASE
                    209:        docol:   OF dup >body note-call ENDOF
                    210:        dodefer: OF note-execute ENDOF
                    211:        dofield: OF >body @ ['] lit+ peephole-compile, , EXIT ENDOF
                    212:        \ dofield: OF >body @ POSTPONE literal ['] + peephole-compile, EXIT ENDOF
                    213:        \ code words and ;code-defined words (code words could be optimized):
                    214:        dup in-dictionary? IF drop POSTPONE literal ['] execute peephole-compile, EXIT THEN
                    215:     ENDCASE
                    216:     DEFERS compile, ;
                    217: 
                    218: \ hook-profiling-into then-like
                    219: \ \ hook-profiling-into if-like    \ subsumed by other-control-flow
                    220: \ \ hook-profiling-into ahead-like \ subsumed by other-control-flow
                    221: \ hook-profiling-into other-control-flow
                    222: \ hook-profiling-into begin-like
                    223: \ hook-profiling-into again-like
                    224: \ hook-profiling-into until-like
                    225: 
                    226: : :-hook-profile ( -- )
                    227:     defers :-hook
                    228:     next-profile-point-p @
                    229:     profile-this
                    230:     @ dup last-colondef-profile !
                    231:     profile-colondef? on ;
                    232: 
                    233: ' :-hook-profile IS :-hook
                    234: ' prof-compile, IS compile,
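
                         \ Usage sketch (an assumption, not taken from the original file):
                         \ load this file before the program to be profiled, run the program,
                         \ then print the per-colon-definition counts.  "myprog.fs" and MAIN
                         \ are hypothetical names for the profiled program and its entry word.
                         \   include prof-inline.fs
                         \   include myprog.fs
                         \   main
                         \   print-profile-coldef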
