File: gforth/prof-inline.fs
Revision 1.5, Tue Sep 7 09:59:01 2004 UTC, by anton
Branches: MAIN
CVS tags: HEAD
more prof-inline.fs work

\ get some data on potential (partial) inlining

\ Copyright (C) 2004 Free Software Foundation, Inc.

\ This file is part of Gforth.

\ Gforth is free software; you can redistribute it and/or
\ modify it under the terms of the GNU General Public License
\ as published by the Free Software Foundation; either version 2
\ of the License, or (at your option) any later version.

\ This program is distributed in the hope that it will be useful,
\ but WITHOUT ANY WARRANTY; without even the implied warranty of
\ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
\ GNU General Public License for more details.

\ You should have received a copy of the GNU General Public License
\ along with this program; if not, write to the Free Software
\ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.


\ relies on some Gforth internals

\ !! assumption: each file is included only once; otherwise you get
\ the counts for just one of the instances of the file.  This can be
\ fixed by making sure that every source position occurs only once as
\ a profile point.

true constant count-calls? \ do some profiling of colon definitions etc.

\ for true COUNT-CALLS?:

\ What data do I need for evaluating the effectiveness of (partial) inlining?

\ static and dynamic counts of everything:

\ original basic block (BB) length (histogram and average)
\ BB length with partial inlining (histogram and average)
\   since we cannot partially inline library calls, we use a parameter
\   that represents the amount of partial inlining we can expect there.
\ number of tail calls (original and after partial inlining)
\ number of calls (original and after partial inlining)
\ reason for BB end: call, return, execute, branch

\ how many static calls are there to a word?  How many of the dynamic
\ calls call just a single word?

\ how much does inlining called-once words help?
\ how much does inlining words without control flow help?
\ how much does partial inlining help?
\ what's the overlap?
\ optimizing return-to-returns (tail calls), return-to-calls, call-to-calls

struct
    cell% field list-next
end-struct list%

list%
    cell% 2* field profile-count
    cell% 2* field profile-sourcepos
    cell%    field profile-char \ character position in line
    count-calls? [if]
        cell% field profile-colondef? \ is this a colon definition start?
        cell% field profile-calls \ static calls to the colon def (calls%)
        cell% field profile-straight-line \ may contain calls, but no other control flow
        cell% field profile-calls-from \ static calls in the colon def
    [endif]
end-struct profile% \ profile point

list%
    cell% field calls-call \ ptr to profile point of BB containing the call
end-struct calls%

variable profile-points \ linked list of profile%
0 profile-points !
variable next-profile-point-p \ the address where the next pp will be stored
profile-points next-profile-point-p !
variable last-colondef-profile \ pointer to the pp of last colon definition
variable current-profile-point
variable library-calls 0 library-calls ! \ list of calls to library colon defs
variable in-compile,? in-compile,? off

\ list stuff

: map-list ( ... list xt -- ... )
    \ execute xt ( ... node -- ... ) for each node of list
    { xt } begin { list }
        list while
            list xt execute
            list list-next @
    repeat ;

: drop-1+ ( u1 node -- u2 )
    drop 1+ ;

: list-length ( list -- u )
    0 swap ['] drop-1+ map-list ;
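
\ Usage sketch for MAP-LIST and LIST-LENGTH (not part of the original
\ file; it relies on LIST%, LIST-NEXT and MAP-LIST defined above).
\ DEMO-NODE%, DEMO-VALUE, NEW-DEMO-NODE, DEMO-VALUE+ and DEMO-SUM are
\ hypothetical names introduced only for illustration.  The xt passed
\ to MAP-LIST sees the current node on top of the stack, so an
\ accumulator can be kept below it.

list%
    cell% field demo-value
end-struct demo-node%

: new-demo-node ( n next -- node )
    \ allocate a node holding n whose successor is next (0 = end of list)
    demo-node% %alloc >r
    r@ list-next !
    r@ demo-value !
    r> ;

: demo-value+ ( n1 node -- n2 )
    demo-value @ + ;

: demo-sum ( list -- n )
    0 swap ['] demo-value+ map-list ;

\ Possible use, building the list 10 -> 20 -> 30 back to front:
\   30 0 new-demo-node  20 swap new-demo-node  10 swap new-demo-node
\   dup list-length .  demo-sum .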

: insert-list ( listp listpp -- )
    \ insert list node listp at the front of the list pointed to by listpp
    tuck @ over list-next !
    swap ! ;

: insert-list-end ( listp listppp -- )
    \ insert list node listp into list, with listppp indicating the
    \ position to insert at, and indicating the position behind the
    \ new element afterwards.
    2dup @ insert-list
    swap list-next swap ! ;
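
\ Usage sketch for INSERT-LIST-END (not part of the original file; it
\ relies on LIST%, LIST-NEXT and INSERT-LIST-END defined above).
\ DEMO-LIST, DEMO-TAIL-P and DEMO-APPEND are hypothetical names.  The
\ idea is the same as for PROFILE-POINTS and NEXT-PROFILE-POINT-P: a
\ second variable always holds the address of the link cell that the
\ next node goes into, so nodes end up in insertion order without
\ walking the list.

variable demo-list    0 demo-list !            \ list head
variable demo-tail-p  demo-list demo-tail-p !  \ where the next node goes

: demo-append ( -- node )
    \ allocate an empty list node and append it to DEMO-LIST
    list% %alloc dup demo-tail-p insert-list-end ;

\ After "demo-append demo-append demo-append" ( -- n1 n2 n3 ),
\ DEMO-LIST points to n1, n1 links to n2, n2 links to n3, and the
\ link cell of n3 contains 0.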

\ calls

: new-call ( profile-point -- call )
    calls% %alloc tuck calls-call ! ;

\ profile-point stuff

: new-profile-point ( -- addr )
    profile% %alloc >r
    0. r@ profile-count 2!
    current-sourcepos r@ profile-sourcepos 2!
    >in @ r@ profile-char !
    [ count-calls? ] [if]
        r@ profile-colondef? off
        0 r@ profile-calls !
        r@ profile-straight-line on
        0 r@ profile-calls-from !
    [endif]
    r@ next-profile-point-p insert-list-end
    r@ current-profile-point !
    r> ;

: print-profile ( -- )
    profile-points @ begin
        dup while
            >r \ keep the node only on the return stack (DUP >R leaked it)
            r@ profile-sourcepos 2@ .sourcepos ." :"
            r@ profile-char @ 0 .r ." : "
            r@ profile-count 2@ 0 d.r cr
            r> list-next @
    repeat
    drop ;

: print-profile-coldef ( -- )
    profile-points @ begin
        dup while
            >r \ keep the node only on the return stack (DUP >R leaked it)
            r@ profile-colondef? @ if
                r@ profile-sourcepos 2@ .sourcepos ." :"
                r@ profile-char @ 3 .r ." : "
                r@ profile-count 2@ 10 d.r
                r@ profile-straight-line @ space 2 .r
                r@ profile-calls @ list-length 4 .r
                cr
            endif
            r> list-next @
    repeat
    drop ;

: 1= ( u -- f )
    1 = ;

: 2= ( u -- f )
    2 = ;

: 3= ( u -- f )
    3 = ;

: 1u> ( u -- f )
    1 u> ;

: call-count+ ( ud1 callp -- ud2 )
    calls-call @ profile-count 2@ d+ ;

: count-dyncalls ( calls -- ud )
    0. rot ['] call-count+ map-list ;

: add-calls ( statistics1 xt-test profpp -- statistics2 xt-test )
    \ add statistics for callee profpp up, if the number of static
    \ calls to profpp satisfies xt-test ( u -- f ); see below for what
    \ statistics are computed.
    { xt-test p }
    p profile-colondef? @ if
        p profile-calls @ { calls }
        calls list-length { stat }
        stat xt-test execute if
            { d: ud-dyn-callee d: ud-dyn-caller u-stat u-exec-callees u-callees }
            ud-dyn-callee p profile-count 2@ 2dup { d: de } d+
            ud-dyn-caller calls count-dyncalls 2dup { d: dr } d+
            u-stat stat +
            u-exec-callees de dr d<> -
            u-callees 1+
        endif
    endif
    xt-test ;

: print-stat-line ( xt -- )
    >r 0. 0. 0 0 0 r> profile-points @ ['] add-calls map-list drop
    ( ud-dyn-callee ud-dyn-caller u-stat u-exec-callees u-callees )
    6 u.r 7 u.r 7 u.r 12 ud.r 12 ud.r space ;

: print-library-stats ( -- )
    library-calls @ list-length 20 u.r \ static callers
    library-calls @ count-dyncalls 12 ud.r \ dynamic callers
    13 spaces ;

: print-statistics ( -- )
    ." callee exec'd static  dyn-caller  dyn-callee   condition" cr
    ['] 0=  print-stat-line ." calls to coldefs with 0 callers" cr
    ['] 1=  print-stat-line ." calls to coldefs with 1 callers" cr
    ['] 2=  print-stat-line ." calls to coldefs with 2 callers" cr
    ['] 3=  print-stat-line ." calls to coldefs with 3 callers" cr
    ['] 1u> print-stat-line ." calls to coldefs with >1 callers" cr
    print-library-stats     ." library calls" cr
    ;

: dinc ( profilep -- )
    \ increment the double-cell count of profile point profilep
    profile-count dup 2@ 1. d+ rot 2! ;
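
\ Sketch of the double-cell increment used by DINC (not part of the
\ original file).  DEMO-COUNT and DEMO-DINC are hypothetical names;
\ DEMO-DINC takes the address of the double directly instead of a
\ profile point.  Starting with all bits set in the low cell shows
\ the carry into the high cell.

2variable demo-count
-1 0 demo-count 2!          \ low cell all ones, high cell 0

: demo-dinc ( d-addr -- )
    dup 2@ 1. d+ rot 2! ;

demo-count demo-dinc        \ DEMO-COUNT now holds low cell 0, high cell 1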

: profile-this ( -- )
    in-compile,? @ in-compile,? on
    new-profile-point POSTPONE literal POSTPONE dinc
    in-compile,? ! ;

\ Various words trigger PROFILE-THIS.  In order to avoid getting
\ several calls to PROFILE-THIS from a compiling word (like ?EXIT), we
\ just wait until the next word is parsed by the text interpreter (in
\ compile state) and call PROFILE-THIS only once then.  The whole
\ BEFORE-WORD hooking etc. is there for this.

\ We do this because we use the source position for the profiling
\ information, and there's only one source position for ?EXIT.  If we
\ used the threaded code position instead, we would see that ?EXIT
\ compiles to several threaded-code words, and could use different
\ profile points for them.  However, usually dealing with the source
\ is more practical.

\ Another benefit is that we can ask for profiling anywhere in a
\ control-flow word (even before it compiles its own stuff).

\ Potential problem: Consider "COMPILING ] [" where COMPILING compiles
\ a whole colon definition (and triggers our profiler), but during the
\ compilation of the colon definition there is no parsing.  Afterwards
\ you get interpret state at first (no profiling, either), but after
\ the "]" you get parsing in compile state, and PROFILE-THIS gets
\ called (and compiles code that is never executed).  It would be
\ better if we had a way of knowing whether we are in a colon def or
\ not (and used that knowledge instead of STATE).

\ The BEFORE-WORD hooking is currently commented out:

\ Defer before-word-profile ( -- )
\ ' noop IS before-word-profile

\ : before-word1 ( -- )
\     before-word-profile defers before-word ;

\ ' before-word1 IS before-word

\ : profile-this-compiling ( -- )
\     state @ if
\       profile-this
\       ['] noop IS before-word-profile
\     endif ;

\ : cock-profiler ( -- )
\     \ as in cock the gun - pull the trigger
\     ['] profile-this-compiling IS before-word-profile
\     [ count-calls? ] [if] \ we are at a non-colondef profile point
\       last-colondef-profile @ profile-straight-line off
\     [endif]
\ ;

: hook-profiling-into ( "name" -- )
    \ make (deferred word) "name" call profile-this, too
    ' >body >r :noname
    POSTPONE profile-this
    r@ @ compile, \ old hook behaviour
    POSTPONE ;
    r> ! ; \ change hook behaviour
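
\ Sketch of the same wrapping technique on a toy deferred word (not
\ part of the original file).  DEMO-HOOK, DEMO-OLD, DEMO-HITS,
\ DEMO-NOTE and DEMO-HOOK-INTO are hypothetical names.  Where
\ HOOK-PROFILING-INTO reads and patches the body of the deferred word
\ directly, this variant uses DEFER@ and DEFER!, assuming a Gforth
\ that provides them.

defer demo-hook
: demo-old ( -- ) ." old " ;
' demo-old is demo-hook

variable demo-hits  0 demo-hits !
: demo-note ( -- ) 1 demo-hits +! ;

: demo-hook-into ( "name" -- )
    \ make deferred word "name" call DEMO-NOTE before its old behaviour
    ' >r :noname
    POSTPONE demo-note
    r@ defer@ compile,      \ old hook behaviour
    POSTPONE ;
    r> defer! ;             \ install the wrapper

demo-hook-into demo-hook
\ Executing DEMO-HOOK now increments DEMO-HITS, then prints "old ".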

: note-execute ( -- )
    \ end of BB due to execute
;

: note-call ( addr -- )
    \ addr is the body address of a called colon def or does handler
    dup ['] (does>2) >body = if \ adjust does handler address
        4 cells here 1 cells - +!
    endif
    profile-this current-profile-point @ new-call
    over 3 cells + @ ['] dinc >body = if ( addr call-prof-point )
        \ non-library call
        swap cell+ @ profile-calls insert-list
    else ( addr call-prof-point )
        library-calls insert-list drop
    endif ;

: prof-compile, ( xt -- )
    in-compile,? @ if
        DEFERS compile, EXIT
    endif
    dup >does-code if
        dup >does-code note-call
    endif
    dup >code-address CASE
        docol:   OF dup >body note-call ENDOF
        dodefer: OF note-execute ENDOF
        \ dofield: OF >body @ POSTPONE literal ['] + peephole-compile, EXIT ENDOF
        \ code words and ;code-defined words (code words could be optimized):
    ENDCASE
    DEFERS compile, ;

: :-hook-profile ( -- )
    defers :-hook
    next-profile-point-p @
    profile-this
    @ dup last-colondef-profile !
    profile-colondef? on ;

\ hook-profiling-into then-like
\ \ hook-profiling-into if-like    \ subsumed by other-control-flow
\ \ hook-profiling-into ahead-like \ subsumed by other-control-flow
\ hook-profiling-into other-control-flow
\ hook-profiling-into begin-like
\ hook-profiling-into again-like
\ hook-profiling-into until-like
' :-hook-profile IS :-hook
' prof-compile, IS compile,
