Annotation of gforth/prof-inline.fs, revision 1.5

1.1       anton       1: \ get some data on potential (partial) inlining
                      2: 
                      3: \ Copyright (C) 2004 Free Software Foundation, Inc.
                      4: 
                      5: \ This file is part of Gforth.
                      6: 
                      7: \ Gforth is free software; you can redistribute it and/or
                      8: \ modify it under the terms of the GNU General Public License
                      9: \ as published by the Free Software Foundation; either version 2
                     10: \ of the License, or (at your option) any later version.
                     11: 
                     12: \ This program is distributed in the hope that it will be useful,
                     13: \ but WITHOUT ANY WARRANTY; without even the implied warranty of
                     14: \ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
                     15: \ GNU General Public License for more details.
                     16: 
                     17: \ You should have received a copy of the GNU General Public License
                     18: \ along with this program; if not, write to the Free Software
                     19: \ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.
                     20: 
                     21: 
                     22: \ relies on some Gforth internals
                     23: 
                     24: \ !! assumption: each file is included only once; otherwise you get
                     25: \ the counts for just one of the instances of the file.  This can be
                     26: \ fixed by making sure that every source position occurs only once as
                     27: \ a profile point.
                     28: 
                     29: true constant count-calls? \ do some profiling of colon definitions etc.
                     30: 
                     31: \ for true COUNT-CALLS?:
                     32: 
                     33: \ What data do I need for evaluating the effectiveness of (partial) inlining?
                     34: 
                     35: \ static and dynamic counts of everything:
                     36: 
                     37: \ original BB length (histogram and average)
                     38: \ BB length with partial inlining (histogram and average)
                     39: \   since we cannot partially inline library calls, we use a parameter
                     40: \   that represents the amount of partial inlining we can expect there.
                     41: \ number of tail calls (original and after partial inlining)
                     42: \ number of calls (original and after partial inlining)
                     43: \ reason for BB end: call, return, execute, branch
                     44: 
                     45: \ how many static calls are there to a word?  How many of the dynamic
                     46: \ calls call just a single word?
                     47: 
1.2       anton      48: \ how much does inlining called-once words help?
                     49: \ how much does inlining words without control flow help?
                     50: \ how much does partial inlining help?
                     51: \ what's the overlap?
                     52: \ optimizing return-to-returns (tail calls), return-to-calls, call-to-calls
                     53: 
1.1       anton      54: struct
1.3       anton      55:     cell% field list-next
1.2       anton      56: end-struct list%
                     57: 
                     58: list%
1.1       anton      59:     cell% 2* field profile-count
                     60:     cell% 2* field profile-sourcepos
                     61:     cell%    field profile-char \ character position in line
                     62:     count-calls? [if]
                     63:        cell% field profile-colondef? \ is this a colon definition start
1.2       anton      64:        cell% field profile-calls \ static calls to the colon def (calls%)
1.1       anton      65:        cell% field profile-straight-line \ may contain calls, but no other CF
                     66:        cell% field profile-calls-from \ static calls in the colon def
                     67:     [endif]
                     68: end-struct profile% \ profile point
                     69: 
1.2       anton      70: list%
1.3       anton      71:     cell% field calls-call \ ptr to profile point of bb containing the call
1.2       anton      72: end-struct calls%
                     73: 
1.1       anton      74: variable profile-points \ linked list of profile%
                     75: 0 profile-points !
                     76: variable next-profile-point-p \ the address where the next pp will be stored
                     77: profile-points next-profile-point-p !
1.3       anton      78: variable last-colondef-profile \ pointer to the pp of last colon definition
                     79: variable current-profile-point
1.5     ! anton      80: variable library-calls 0 library-calls ! \ list of calls to library colon defs
1.4       anton      81: variable in-compile,? in-compile,? off
1.2       anton      82: 
                     83: \ list stuff
                     84: 
1.3       anton      85: : map-list ( ... list xt -- ... )
                     86:     { xt } begin { list }
                     87:        list while
                     88:            list xt execute
                     89:            list list-next @
                     90:     repeat ;
                     91: 
                     92: : drop-1+ drop 1+ ;
                     93: 
                     94: : list-length ( list -- u )
                     95:     0 swap ['] drop-1+ map-list ;
                     96: 
                     97: : insert-list ( listp listpp -- )
                     98:     \ insert list node listp into list pointed to by listpp in front
                     99:     tuck @ over list-next !
                    100:     swap ! ;
                    101: 
                    102: : insert-list-end ( listp listppp -- )
                    103:     \ insert list node listp into list, with listppp indicating the
                    104:     \ position to insert at, and indicating the position behind the
                    105:     \ new element afterwards.
                    106:     2dup @ insert-list
                    107:     swap list-next swap ! ;
1.2       anton     108: 
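                         \ Editorial sketch (not part of revision 1.5): how INSERT-LIST-END
                         \ appends nodes while keeping the append position up to date,
                         \ in the same way PROFILE-POINTS and NEXT-PROFILE-POINT-P are used
                         \ further down.  DEMO-LIST, DEMO-TAIL and DEMO-APPEND are
                         \ hypothetical names introduced only for this illustration.
                         variable demo-list  0 demo-list !          \ head of an empty list
                         variable demo-tail  demo-list demo-tail !  \ where the next node goes
                         : demo-append ( -- )
                             list% %alloc demo-tail insert-list-end ;
                         demo-append demo-append demo-append
                         demo-list @ list-length .  \ three nodes were appended, so this prints 3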
1.3       anton     109: \ calls
                    110: 
                    111: : new-call ( profile-point -- call )
                    112:     calls% %alloc tuck calls-call ! ;
1.2       anton     113: 
                    114: \ profile-point stuff   
                    115: 
1.1       anton     116: : new-profile-point ( -- addr )
                    117:     profile% %alloc >r
                    118:     0. r@ profile-count 2!
                    119:     current-sourcepos r@ profile-sourcepos 2!
                    120:     >in @ r@ profile-char !
                    121:     [ count-calls? ] [if]
                    122:        r@ profile-colondef? off
                    123:        0 r@ profile-calls !
                    124:        r@ profile-straight-line on
                    125:        0 r@ profile-calls-from !
                    126:     [endif]
1.3       anton     127:     r@ next-profile-point-p insert-list-end
                    128:     r@ current-profile-point !
1.1       anton     129:     r> ;
                    130: 
                    131: : print-profile ( -- )
                    132:     profile-points @ begin
                    133:        dup while
                    134:            dup >r
                    135:            r@ profile-sourcepos 2@ .sourcepos ." :"
                    136:            r@ profile-char @ 0 .r ." : "
                    137:            r@ profile-count 2@ 0 d.r cr
1.2       anton     138:            r> list-next @
1.1       anton     139:     repeat
                    140:     drop ;
                    141: 
                    142: : print-profile-coldef ( -- )
                    143:     profile-points @ begin
                    144:        dup while
                    145:            dup >r
                    146:            r@ profile-colondef? @ if
                    147:                r@ profile-sourcepos 2@ .sourcepos ." :"
                    148:                r@ profile-char @ 3 .r ." : "
                    149:                r@ profile-count 2@ 10 d.r
                    150:                r@ profile-straight-line @ space 2 .r
1.3       anton     151:                r@ profile-calls @ list-length 4 .r
1.1       anton     152:                cr
                    153:            endif
1.2       anton     154:            r> list-next @
1.1       anton     155:     repeat
                    156:     drop ;
                    157: 
1.3       anton     158: : 1= ( u -- f )
                    159:     1 = ;
                    160: 
                    161: : 2= ( u -- f )
                    162:     2 = ;
                    163: 
                    164: : 3= ( u -- f )
                    165:     3 = ;
                    166: 
                    167: : 1u> ( u -- f )
                    168:     1 u> ;
                    169: 
                    170: : call-count+ ( ud1 callp -- ud2 )
                    171:     calls-call @ profile-count 2@ d+ ;
                    172: 
1.5     ! anton     173: : count-dyncalls ( calls -- ud )
        !           174:     0. rot ['] call-count+ map-list ;
        !           175: 
        !           176: : add-calls ( statistics1 xt-test profpp -- statistics2 xt-test )
        !           177:     \ add the statistics for callee profpp to statistics1, if the
        !           178:     \ number of static calls to profpp satisfies xt-test ( u -- f );
        !           179:     \ see below for what statistics are computed.
1.3       anton     180:     { xt-test p }
1.5     ! anton     181:     p profile-colondef? @ if
1.3       anton     182:        p profile-calls @ { calls }
                    183:        calls list-length { stat }
1.5     ! anton     184:        stat xt-test execute if
        !           185:            { d: ud-dyn-callee d: ud-dyn-caller u-stat u-exec-callees u-callees }
        !           186:            ud-dyn-callee p profile-count 2@ 2dup { d: de } d+
        !           187:            ud-dyn-caller calls count-dyncalls 2dup { d: dr } d+
        !           188:            u-stat stat +
        !           189:            u-exec-callees de dr d<> -
        !           190:            u-callees 1+
1.3       anton     191:        endif
                    192:     endif
                    193:     xt-test ;
                    194: 
                    195: : print-stat-line ( xt -- )
1.5     ! anton     196:     >r 0. 0. 0 0 0 r> profile-points @ ['] add-calls map-list drop
1.3       anton     197:     ( ud-dyn-callee ud-dyn-caller u-stat u-exec-callees u-callees )
1.5     ! anton     198:     6 u.r 7 u.r 7 u.r 12 ud.r 12 ud.r space ;
        !           199: 
        !           200: : print-library-stats ( -- )
        !           201:     library-calls @ list-length 20 u.r \ static callers
        !           202:     library-calls @ count-dyncalls 12 ud.r \ dynamic callers
        !           203:     13 spaces ;
1.3       anton     204: 
                    205: : print-statistics ( -- )
1.5     ! anton     206:     ." callee exec'd static  dyn-caller  dyn-callee   condition" cr
1.3       anton     207:     ['] 0=  print-stat-line ." calls to coldefs with 0 callers" cr
                    208:     ['] 1=  print-stat-line ." calls to coldefs with 1 callers" cr
                    209:     ['] 2=  print-stat-line ." calls to coldefs with 2 callers" cr
                    210:     ['] 3=  print-stat-line ." calls to coldefs with 3 callers" cr
                    211:     ['] 1u> print-stat-line ." calls to coldefs with >1 callers" cr
1.5     ! anton     212:     print-library-stats     ." library calls" cr
1.3       anton     213:     ;
                    214: 
1.1       anton     215: : dinc ( profilep -- )
                    216:     \ increment the double-cell count of profile point profilep
                    217:     profile-count dup 2@ 1. d+ rot 2! ;
                    218: 
                    219: : profile-this ( -- )
1.4       anton     220:     in-compile,? @ in-compile,? on
                    221:     new-profile-point POSTPONE literal POSTPONE dinc
                    222:     in-compile,? ! ;
1.1       anton     223: 
                    224: \ Various words trigger PROFILE-THIS.  In order to avoid getting
                    225: \ several calls to PROFILE-THIS from a compiling word (like ?EXIT), we
                    226: \ just wait until the next word is parsed by the text interpreter (in
                    227: \ compile state) and call PROFILE-THIS only once then.  The whole
                    228: \ BEFORE-WORD hooking etc. is there for this.
                    229: 
                    230: \ We do this because we use the source position for the profiling
                    231: \ information, and there's only one source position for ?EXIT.  If
                    232: \ we used the threaded-code position instead, we would see that ?EXIT
                    233: \ compiles to several threaded-code words, and could use different
                    234: \ profile points for them.  However, dealing with the source is
                    235: \ usually more practical.
                    236: 
                    237: \ Another benefit is that we can ask for profiling anywhere in a
                    238: \ control-flow word (even before it compiles its own stuff).
                    239: 
                    240: \ Potential problem: Consider "COMPILING ] [" where COMPILING compiles
                    241: \ a whole colon definition (and triggers our profiler), but during the
                    242: \ compilation of the colon definition there is no parsing.  Afterwards
                    243: \ you get interpret state at first (no profiling, either), but after
                    244: \ the "]" you get parsing in compile state, and PROFILE-THIS gets
                    245: \ called (and compiles code that is never executed).  It would be
                    246: \ better if we had a way of knowing whether we are in a colon def or
                    247: \ not (and used that knowledge instead of STATE).
                    248: 
1.4       anton     249: \ Defer before-word-profile ( -- )
                    250: \ ' noop IS before-word-profile
1.1       anton     251: 
1.4       anton     252: \ : before-word1 ( -- )
                    253: \     before-word-profile defers before-word ;
1.1       anton     254: 
1.4       anton     255: \ ' before-word1 IS before-word
1.1       anton     256: 
1.4       anton     257: \ : profile-this-compiling ( -- )
                    258: \     state @ if
                    259: \      profile-this
                    260: \      ['] noop IS before-word-profile
                    261: \     endif ;
                    262: 
                    263: \ : cock-profiler ( -- )
                    264: \     \ as in cock the gun - pull the trigger
                    265: \     ['] profile-this-compiling IS before-word-profile
                    266: \     [ count-calls? ] [if] \ we are at a non-colondef profile point
                    267: \      last-colondef-profile @ profile-straight-line off
                    268: \     [endif]
                    269: \ ;
1.1       anton     270: 
                    271: : hook-profiling-into ( "name" -- )
                    272:     \ make (deferred word) "name" call profile-this, too
                    273:     ' >body >r :noname
1.4       anton     274:     POSTPONE profile-this
1.1       anton     275:     r@ @ compile, \ old hook behaviour
                    276:     POSTPONE ;
                    277:     r> ! ; \ change hook behaviour
                    278: 
                    279: : note-execute ( -- )
                    280:     \ end of BB due to execute
                    281: ;
                    282: 
                    283: : note-call ( addr -- )
                    284:     \ addr is the body address of a called colon def or does handler
1.5     ! anton     285:     dup ['] (does>2) >body = if \ adjust does handler address
        !           286:        4 cells here 1 cells - +!
1.1       anton     287:     endif
1.5     ! anton     288:     profile-this current-profile-point @ new-call
        !           289:     over 3 cells + @ ['] dinc >body = if ( addr call-prof-point )
        !           290:        \ non-library call
        !           291:         swap cell+ @ profile-calls insert-list
        !           292:     else ( addr call-prof-point )
        !           293:        library-calls insert-list drop
        !           294:     endif ;
1.4       anton     295: 
1.1       anton     296: : prof-compile, ( xt -- )
1.4       anton     297:     in-compile,? @ if
                    298:        DEFERS compile, EXIT
                    299:     endif
1.1       anton     300:     dup >does-code if
                    301:        dup >does-code note-call
                    302:     then
                    303:     dup >code-address CASE
                    304:        docol:   OF dup >body note-call ENDOF
                    305:        dodefer: OF note-execute ENDOF
                    306:        \ dofield: OF >body @ POSTPONE literal ['] + peephole-compile, EXIT ENDOF
                    307:        \ code words and ;code-defined words (code words could be optimized):
                    308:     ENDCASE
                    309:     DEFERS compile, ;
                    310: 
1.4       anton     311: : :-hook-profile ( -- )
                    312:     defers :-hook
                    313:     next-profile-point-p @
                    314:     profile-this
                    315:     @ dup last-colondef-profile !
                    316:     profile-colondef? on ;
                    317: 
1.1       anton     318: \ hook-profiling-into then-like
                    319: \ \ hook-profiling-into if-like    \ subsumed by other-control-flow
                    320: \ \ hook-profiling-into ahead-like \ subsumed by other-control-flow
                    321: \ hook-profiling-into other-control-flow
                    322: \ hook-profiling-into begin-like
                    323: \ hook-profiling-into again-like
                    324: \ hook-profiling-into until-like
                    325: ' :-hook-profile IS :-hook
1.4       anton     326: ' prof-compile, IS compile,
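
                         \ Editorial sketch (not part of revision 1.5): one way the profiler
                         \ might be exercised once the hooks above are installed, assuming a
                         \ Gforth of the same vintage whose internals this file relies on.
                         \ SQUARE and BENCH are hypothetical words used only for illustration.
                         : square ( n1 -- n2 )  dup * ;
                         : bench ( -- )  1000 0 ?do  i square drop  loop ;
                         bench
                         print-profile     \ source position, char position and count per profile point
                         print-statistics  \ callee statistics grouped by number of static callers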
