\ File: gforth/prof-inline.fs
\ Revision 1.9, Mon Dec 31 19:02:24 2007 UTC, by anton
\ Branches: MAIN
\ CVS tags: v0-7-0, HEAD
\ Log: updated copyright year after changing license notice

\ get some data on potential (partial) inlining

\ Copyright (C) 2004,2007 Free Software Foundation, Inc.

\ This file is part of Gforth.

\ Gforth is free software; you can redistribute it and/or
\ modify it under the terms of the GNU General Public License
\ as published by the Free Software Foundation, either version 3
\ of the License, or (at your option) any later version.

\ This program is distributed in the hope that it will be useful,
\ but WITHOUT ANY WARRANTY; without even the implied warranty of
\ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
\ GNU General Public License for more details.

\ You should have received a copy of the GNU General Public License
\ along with this program. If not, see http://www.gnu.org/licenses/.


\ relies on some Gforth internals

\ !! assumption: each file is included only once; otherwise you get
\ the counts for just one of the instances of the file.  This can be
\ fixed by making sure that every source position occurs only once as
\ a profile point.

true constant count-calls? \ do some profiling of colon definitions etc.

\ for true COUNT-CALLS?:

\ What data do I need for evaluating the effectiveness of (partial) inlining?

\ static and dynamic counts of everything:

\ original BB length (histogram and average)
\ BB length with partial inlining (histogram and average)
\   since we cannot partially inline library calls, we use a parameter
\   that represents the amount of partial inlining we can expect there.
\ number of tail calls (original and after partial inlining)
\ number of calls (original and after partial inlining)
\ reason for BB end: call, return, execute, branch

\ how many static calls are there to a word?  How many of the dynamic
\ calls call just a single word?

\ how much does inlining called-once words help?
\ how much does inlining words without control flow help?
\ how much does partial inlining help?
\ what's the overlap?
\ optimizing return-to-returns (tail calls), return-to-calls, call-to-calls

struct
    cell% field list-next
end-struct list%

list%
    cell% 2* field profile-count \ how often this profile point is performed
    cell% 2* field profile-sourcepos
    cell% field profile-char \ character position in line
    cell% field profile-bblen \ number of primitives in BB
    cell% field profile-bblenpi \ bblen after partial inlining
    cell% field profile-callee-postlude \ 0 or (for calls) callee postlude len
    cell% field profile-tailof \ 0 or (for tail bbs) pointer to coldef bb
    cell% field profile-colondef? \ is this a colon definition start
    cell% field profile-calls \ static calls to the colon def (calls%)
    cell% field profile-straight-line \ may contain calls, but no other CF
    cell% field profile-calls-from \ static calls in the colon def
    cell% field profile-exits \ number of exits in this colon def
    cell% 2* field profile-execs \ number of EXECUTEs etc. of this colon def
    cell% field profile-prelude \ first BB-len of colon def (incl. callee)
    cell% field profile-postlude \ last BB-len of colon def (incl. callee)
end-struct profile% \ profile point

list%
    cell% field calls-call \ ptr to profile point of bb containing the call
end-struct calls%

variable profile-points \ linked list of profile%
0 profile-points !
variable next-profile-point-p \ the address where the next pp will be stored
profile-points next-profile-point-p !
variable last-colondef-profile \ pointer to the pp of last colon definition
variable current-profile-point
variable library-calls 0 library-calls ! \ list of calls to library colon defs
variable in-compile,? in-compile,? off
variable all-bbs 0 all-bbs ! \ list of all basic blocks

\ list stuff

: map-list ( ... list xt -- ... )
    { xt } begin { list }
	list while
	    list xt execute
	    list list-next @
    repeat ;

: drop-1+ drop 1+ ;

: list-length ( list -- u )
    0 swap ['] drop-1+ map-list ;

: insert-list ( listp listpp -- )
    \ insert list node listp into list pointed to by listpp in front
    tuck @ over list-next !
    swap ! ;

: insert-list-end ( listp listppp -- )
    \ insert list node listp into list, with listppp indicating the
    \ position to insert at, and indicating the position behind the
    \ new element afterwards.
    2dup @ insert-list
    swap list-next swap ! ;
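
\ Usage sketch for the list words above.  This is an illustration, not
\ part of the original file; DEMO-HEAD and DEMO-LIST-LENGTH are
\ hypothetical names.  It prepends two freshly allocated nodes to an
\ empty list and returns the node count.
variable demo-head
: demo-list-length ( -- u )
    0 demo-head ! \ start from an empty list (old nodes are not freed)
    list% %alloc demo-head insert-list
    list% %alloc demo-head insert-list
    demo-head @ list-length ;
\ demo-list-length .  \ prints 2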

\ calls

: new-call ( profile-point -- call )
    calls% %alloc tuck calls-call ! ;

\ profile-point stuff

: new-profile-point ( -- addr )
    profile% %alloc >r
    0. r@ profile-count 2!
    current-sourcepos r@ profile-sourcepos 2!
    >in @ r@ profile-char !
    0 r@ profile-callee-postlude !
    0 r@ profile-tailof !
    r@ profile-colondef? off
    0 r@ profile-bblen !
    -100000000 r@ profile-bblenpi !
    current-profile-point @ profile-bblenpi @ -100000000 = if
	current-profile-point @ dup profile-bblen @ swap profile-bblenpi !
    endif
    0 r@ profile-calls !
    r@ profile-straight-line on
    0 r@ profile-calls-from !
    0 r@ profile-exits !
    0. r@ profile-execs 2!
    0 r@ profile-prelude !
    0 r@ profile-postlude !
    r@ next-profile-point-p insert-list-end
    r@ current-profile-point !
    r@ new-call all-bbs insert-list
    r> ;

: print-profile ( -- )
    profile-points @ begin
	dup while
	    dup >r
	    r@ profile-sourcepos 2@ .sourcepos ." :"
	    r@ profile-char @ 0 .r ." : "
	    r@ profile-count 2@ 0 d.r cr
	    r> list-next @
    repeat
    drop ;

: print-profile-coldef ( -- )
    profile-points @ begin
	dup while
	    dup >r
	    r@ profile-colondef? @ if
		r@ profile-sourcepos 2@ .sourcepos ." :"
		r@ profile-char @ 3 .r ." : "
		r@ profile-count 2@ 10 d.r
		r@ profile-straight-line @ space 2 .r
		r@ profile-calls @ list-length 4 .r
		cr
	    endif
	    r> list-next @
    repeat
    drop ;

: 1= ( u -- f )
    1 = ;

: 2= ( u -- f )
    2 = ;

: 3= ( u -- f )
    3 = ;

: 1u> ( u -- f )
    1 u> ;

: call-count+ ( ud1 callp -- ud2 )
    calls-call @ profile-count 2@ d+ ;

: count-dyncalls ( calls -- ud )
    0. rot ['] call-count+ map-list ;

: add-calls ( statistics1 xt-test profpp -- statistics2 xt-test )
    \ add the statistics for callee profpp to statistics1, if the
    \ number of static calls to profpp satisfies xt-test ( u -- f );
    \ see below for what statistics are computed.
    { xt-test p }
    p profile-colondef? @ if
	p profile-calls @ { calls }
	calls list-length { stat }
	stat xt-test execute if
	    { d: ud-dyn-callee d: ud-dyn-caller u-stat u-exec-callees u-callees }
	    ud-dyn-callee p profile-count 2@ 2dup { d: de } d+
	    ud-dyn-caller calls count-dyncalls 2dup { d: dr } d+
	    u-stat stat +
	    u-exec-callees de dr d<> -
	    u-callees 1+
	endif
    endif
    xt-test ;

: print-stat-line ( xt -- )
    >r 0. 0. 0 0 0 r> profile-points @ ['] add-calls map-list drop
    ( ud-dyn-callee ud-dyn-caller u-stat u-exec-callees u-callees )
    6 u.r 7 u.r 7 u.r 12 ud.r 12 ud.r space ;

: print-library-stats ( -- )
    library-calls @ list-length 20 u.r \ static callers
    library-calls @ count-dyncalls 12 ud.r \ dynamic callers
    13 spaces ;

: bblen+ ( u1 callp -- u2 )
    calls-call @ profile-bblen @ + ;

: dyn-bblen+ ( ud1 callp -- ud2 )
    calls-call @ dup profile-count 2@ rot profile-bblen @ 1 m*/ d+ ;

: print-bb-statistics ( -- )
    ." static     dynamic" cr
    all-bbs @ list-length 6 u.r all-bbs @ count-dyncalls 12 ud.r ."  basic blocks" cr
    0 all-bbs @ ['] bblen+ map-list 6 u.r
    0. all-bbs @ ['] dyn-bblen+ map-list 12 ud.r ."  primitives" cr
    ;
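
\ Sketch (not in the original file): the average static basic-block
\ length asked for in the notes at the top, computed from ALL-BBS.
\ AVG-STATIC-BBLEN is a hypothetical name; the result is rounded down,
\ and 1 MAX avoids a division by zero while ALL-BBS is still empty.
: avg-static-bblen ( -- u )
    0 all-bbs @ ['] bblen+ map-list
    all-bbs @ list-length 1 max / ;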

: print-statistics ( -- )
    ." callee exec'd static  dyn-caller  dyn-callee   condition" cr
    ['] 0=  print-stat-line ." calls to coldefs with 0 callers" cr
    ['] 1=  print-stat-line ." calls to coldefs with 1 callers" cr
    ['] 2=  print-stat-line ." calls to coldefs with 2 callers" cr
    ['] 3=  print-stat-line ." calls to coldefs with 3 callers" cr
    ['] 1u> print-stat-line ." calls to coldefs with >1 callers" cr
    print-library-stats     ." library calls" cr
    print-bb-statistics
    ;
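
\ Sketch (not in the original file): further rows can be produced by
\ passing any predicate ( u -- f ) on the static caller count to
\ PRINT-STAT-LINE.  4U> and PRINT-MANY-CALLERS-LINE are hypothetical
\ names chosen for this example.
: 4u> ( u -- f )
    4 u> ;

: print-many-callers-line ( -- )
    ['] 4u> print-stat-line ." calls to coldefs with >4 callers" cr ;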

: dinc ( profilep -- )
    \ increment the double-cell count of profile point profilep
    profile-count dup 2@ 1. d+ rot 2! ;
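
\ Usage sketch for DINC (not in the original file; DEMO-DINC is a
\ hypothetical name).  It allocates a scratch profile point outside the
\ PROFILE-POINTS list, clears its count, and bumps it twice.
: demo-dinc ( -- ud )
    profile% %alloc { pp }
    0. pp profile-count 2!
    pp dinc pp dinc
    pp profile-count 2@ ;
\ demo-dinc d.  \ prints 2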

: profile-this ( -- )
    in-compile,? @ in-compile,? on
    new-profile-point POSTPONE literal POSTPONE dinc
    in-compile,? ! ;

\ Various words trigger PROFILE-THIS.  In order to avoid getting
\ several calls to PROFILE-THIS from a compiling word (like ?EXIT), we
\ just wait until the next word is parsed by the text interpreter (in
\ compile state) and call PROFILE-THIS only once then.  The whole
\ BEFORE-WORD hooking etc. is there for this.

\ We do this because we use the source position for the profiling
\ information, and there's only one source position for ?EXIT.  If we
\ used the threaded code position instead, we would see that ?EXIT
\ compiles to several threaded-code words, and could use different
\ profile points for them.  However, usually dealing with the source
\ is more practical.

\ Another benefit is that we can ask for profiling anywhere in a
\ control-flow word (even before it compiles its own stuff).

\ Potential problem: Consider "COMPILING ] [" where COMPILING compiles
\ a whole colon definition (and triggers our profiler), but during the
\ compilation of the colon definition there is no parsing.  Afterwards
\ you get interpret state at first (no profiling, either), but after
\ the "]" you get parsing in compile state, and PROFILE-THIS gets
\ called (and compiles code that is never executed).  It would be
\ better if we had a way of knowing whether we are in a colon def or
\ not (and used that knowledge instead of STATE).

Defer before-word-profile ( -- )
' noop IS before-word-profile

: before-word1 ( -- )
    before-word-profile defers before-word ;

' before-word1 IS before-word

: profile-this-compiling ( -- )
    state @ if
	profile-this
	['] noop IS before-word-profile
    endif ;

: cock-profiler ( -- )
    \ as in cock the gun - pull the trigger
    ['] profile-this-compiling IS before-word-profile
    [ count-calls? ] [if] \ we are at a non-colondef profile point
	last-colondef-profile @ profile-straight-line off
    [endif]
;
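
\ Sketch of the arm-once pattern used by COCK-PROFILER and
\ PROFILE-THIS-COMPILING, reduced to a counter.  This is an
\ illustration with hypothetical DEMO- names, not part of the original
\ file: arming installs an action in a deferred word, and the action
\ uninstalls itself the first time it runs.
Defer demo-hook ( -- )
' noop IS demo-hook
variable demo-fired  0 demo-fired !

: demo-once ( -- )
    1 demo-fired +!
    ['] noop IS demo-hook ;

: demo-arm ( -- )
    ['] demo-once IS demo-hook ;
\ demo-arm demo-hook demo-hook demo-hook  demo-fired @ .  \ prints 1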

: hook-profiling-into ( "name" -- )
    \ make (deferred word) "name" call cock-profiler, too
    ' >body >r :noname
    POSTPONE cock-profiler
    r@ @ compile, \ old hook behaviour
    POSTPONE ;
    r> ! ; \ change hook behaviour
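
\ Usage sketch (not in the original file): hooking a hypothetical
\ deferred word DEMO-CF-HOOK.  Afterwards it runs COCK-PROFILER first
\ and then its previous behaviour (here NOOP).  The word is only
\ hooked here, never executed.
Defer demo-cf-hook ( -- )
' noop IS demo-cf-hook
hook-profiling-into demo-cf-hook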

: note-execute ( -- )
    \ end of BB due to execute, dodefer, perform
    profile-this \ should actually happen after the word, but the
                 \ error is probably small
;

: note-call ( addr -- )
    \ addr is the body address of a called colon def or does handler
    dup ['] (does>2) >body = if \ adjust does handler address
	4 cells here 1 cells - +!
    endif
    { addr }
    current-profile-point @ { lastbb }
    profile-this
    current-profile-point @ { thisbb }
    thisbb new-call { call-node }
    addr 3 cells + @ ['] dinc >body = if
	\ non-library call
	\ !! update profile-bblenpi of last and current pp
	addr cell+ @ { callee-pp }
	callee-pp profile-postlude @ thisbb profile-callee-postlude !
	call-node callee-pp profile-calls insert-list
    else \ library call
	call-node library-calls insert-list
    endif ;

: prof-compile, ( xt -- )
    in-compile,? @ if
	DEFERS compile, EXIT
    endif
    1 current-profile-point @ profile-bblen +!
    dup CASE
	['] execute of note-execute endof
	['] perform of note-execute endof
	dup >does-code if
	    dup >does-code note-call
	then
	dup >code-address CASE
	    docol:   OF dup >body note-call ENDOF
	    dodefer: OF note-execute ENDOF
	    \ dofield: OF >body @ POSTPONE literal ['] + peephole-compile, EXIT ENDOF
	    \ code words and ;code-defined words (code words could be optimized):
	ENDCASE
    ENDCASE
    DEFERS compile, ;

: :-hook-profile ( -- )
    defers :-hook
    next-profile-point-p @
    profile-this
    @ dup last-colondef-profile ! ( current-profile-point )
    1 over profile-bblenpi !
    profile-colondef? on ;

: exit-hook-profile ( -- )
    defers exit-hook
    1 last-colondef-profile @ profile-exits +! ;

: ;-hook-profile ( -- )
    \ ;-hook is called before the POSTPONE EXIT
    defers ;-hook
    last-colondef-profile @ { col }
    current-profile-point @ { bb }
    col profile-bblen @ col profile-prelude +!
    col profile-exits @ 0= if
	col bb profile-tailof !
	bb profile-bblen @ bb profile-callee-postlude @ +
	col profile-postlude !
	1 bb profile-bblenpi !
	\ not counting the EXIT
    endif ;

hook-profiling-into then-like
\ hook-profiling-into if-like    \ subsumed by other-control-flow
\ hook-profiling-into ahead-like \ subsumed by other-control-flow
hook-profiling-into other-control-flow
hook-profiling-into begin-like
hook-profiling-into again-like
hook-profiling-into until-like
' :-hook-profile IS :-hook
' prof-compile, IS compile,
' exit-hook-profile IS exit-hook
' ;-hook-profile IS ;-hook
