File:  [gforth] / gforth / prof-inline.fs
Revision 1.9
Mon Dec 31 19:02:24 2007 UTC by anton
Branches: MAIN
CVS tags: v0-7-0, HEAD
updated copyright year after changing license notice

\ get some data on potential (partial) inlining

\ Copyright (C) 2004,2007 Free Software Foundation, Inc.

\ This file is part of Gforth.

\ Gforth is free software; you can redistribute it and/or
\ modify it under the terms of the GNU General Public License
\ as published by the Free Software Foundation, either version 3
\ of the License, or (at your option) any later version.

\ This program is distributed in the hope that it will be useful,
\ but WITHOUT ANY WARRANTY; without even the implied warranty of
\ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
\ GNU General Public License for more details.

\ You should have received a copy of the GNU General Public License
\ along with this program. If not, see http://www.gnu.org/licenses/.


\ relies on some Gforth internals

\ !! assumption: each file is included only once; otherwise you get
\ the counts for just one of the instances of the file.  This can be
\ fixed by making sure that every source position occurs only once as
\ a profile point.

true constant count-calls? \ do some profiling of colon definitions etc.

\ for true COUNT-CALLS?:

\ What data do I need for evaluating the effectiveness of (partial) inlining?

\ static and dynamic counts of everything:

\ original BB length (histogram and average)
\ BB length with partial inlining (histogram and average)
\   since we cannot partially inline library calls, we use a parameter
\   that represents the amount of partial inlining we can expect there.
\ number of tail calls (original and after partial inlining)
\ number of calls (original and after partial inlining)
\ reason for BB end: call, return, execute, branch

\ how many static calls are there to a word?  How many of the dynamic
\ calls call just a single word?

\ how much does inlining called-once words help?
\ how much does inlining words without control flow help?
\ how much does partial inlining help?
\ what's the overlap?
\ optimizing return-to-returns (tail calls), return-to-calls, call-to-calls

struct
    cell% field list-next
end-struct list%

list%
    cell% 2* field profile-count \ how often this profile point is performed
    cell% 2* field profile-sourcepos
    cell% field profile-char \ character position in line
    cell% field profile-bblen \ number of primitives in BB
    cell% field profile-bblenpi \ bblen after partial inlining
    cell% field profile-callee-postlude \ 0 or (for calls) callee postlude len
    cell% field profile-tailof \ 0 or (for tail bbs) pointer to coldef bb
    cell% field profile-colondef? \ is this a colon definition start
    cell% field profile-calls \ static calls to the colon def (calls%)
    cell% field profile-straight-line \ may contain calls, but no other CF
    cell% field profile-calls-from \ static calls in the colon def
    cell% field profile-exits \ number of exits in this colon def
    cell% 2* field profile-execs \ number of EXECUTEs etc. of this colon def
    cell% field profile-prelude \ first BB-len of colon def (incl. callee)
    cell% field profile-postlude \ last BB-len of colon def (incl. callee)
end-struct profile% \ profile point 

list%
    cell% field calls-call \ ptr to profile point of bb containing the call
end-struct calls%

variable profile-points \ linked list of profile%
0 profile-points !
variable next-profile-point-p \ the address where the next pp will be stored
profile-points next-profile-point-p !
variable last-colondef-profile \ pointer to the pp of last colon definition
variable current-profile-point
variable library-calls 0 library-calls ! \ list of calls to library colon defs
variable in-compile,? in-compile,? off
variable all-bbs 0 all-bbs ! \ list of all basic blocks

\ list stuff

: map-list ( ... list xt -- ... )
    \ execute xt ( ... listp -- ... ) for every node of list
    { xt } begin { list }
	list while
	    list xt execute
	    list list-next @
    repeat ;

: drop-1+ ( u1 listp -- u2 ) drop 1+ ;

: list-length ( list -- u )
    0 swap ['] drop-1+ map-list ;

: insert-list ( listp listpp -- )
    \ insert list node listp into list pointed to by listpp in front
    tuck @ over list-next !
    swap ! ;

: insert-list-end ( listp listppp -- )
    \ insert list node listp into list, with listppp indicating the
    \ position to insert at, and indicating the position behind the
    \ new element afterwards.
    2dup @ insert-list
    swap list-next swap ! ;
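
\ Usage sketch for the list words above (an illustration, not part of
\ the profiler).  DEMO-LIST, DEMO-TAIL, DEMO-NODE and DEMO-LIST-WORDS
\ are names made up for this example.  The block is disabled with
\ 0 [IF]; change the 0 to 1 to compile it.

0 [if]
variable demo-list  0 demo-list !          \ head pointer, initially empty
variable demo-tail  demo-list demo-tail !  \ address of the cell to append at

: demo-node ( -- listp )
    \ allocate a bare list node; its link is set on insertion
    list% %alloc ;

: demo-list-words ( -- u )
    demo-node demo-tail insert-list-end    \ append a first node
    demo-node demo-tail insert-list-end    \ append a second node
    demo-node demo-list insert-list        \ push a third node in front
    demo-list @ list-length ;              \ count the nodes
[endif]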

\ calls

: new-call ( profile-point -- call )
    calls% %alloc tuck calls-call ! ;

\ profile-point stuff   

: new-profile-point ( -- addr )
    profile% %alloc >r
    0. r@ profile-count 2!
    current-sourcepos r@ profile-sourcepos 2!
    >in @ r@ profile-char !
    0 r@ profile-callee-postlude !
    0 r@ profile-tailof !
    r@ profile-colondef? off
    0 r@ profile-bblen !
    -100000000 r@ profile-bblenpi !
    current-profile-point @ profile-bblenpi @ -100000000 = if
	current-profile-point @ dup profile-bblen @ swap profile-bblenpi !
    endif
    0 r@ profile-calls !
    r@ profile-straight-line on
    0 r@ profile-calls-from !
    0 r@ profile-exits !
    0. r@ profile-execs 2!
    0 r@ profile-prelude !
    0 r@ profile-postlude !
    r@ next-profile-point-p insert-list-end
    r@ current-profile-point !
    r@ new-call all-bbs insert-list
    r> ;

: print-profile ( -- )
    profile-points @ begin
	dup while
	    dup >r
	    r@ profile-sourcepos 2@ .sourcepos ." :"
	    r@ profile-char @ 0 .r ." : "
	    r@ profile-count 2@ 0 d.r cr
	    r> list-next @
    repeat
    drop ;

: print-profile-coldef ( -- )
    profile-points @ begin
	dup while
	    dup >r
	    r@ profile-colondef? @ if
		r@ profile-sourcepos 2@ .sourcepos ." :"
		r@ profile-char @ 3 .r ." : "
		r@ profile-count 2@ 10 d.r
		r@ profile-straight-line @ space 2 .r
		r@ profile-calls @ list-length 4 .r
		cr
	    endif
	    r> list-next @
    repeat
    drop ;

: 1= ( u -- f )
    1 = ;

: 2= ( u -- f )
    2 = ;

: 3= ( u -- f )
    3 = ;

: 1u> ( u -- f )
    1 u> ;

: call-count+ ( ud1 callp -- ud2 )
    calls-call @ profile-count 2@ d+ ;

: count-dyncalls ( calls -- ud )
    0. rot ['] call-count+ map-list ;

: add-calls ( statistics1 xt-test profpp -- statistics2 xt-test )
    \ add up statistics for callee profpp if the number of static
    \ calls to profpp satisfies xt-test ( u -- f ); see below for what
    \ statistics are computed.
    { xt-test p }
    p profile-colondef? @ if
	p profile-calls @ { calls }
	calls list-length { stat }
	stat xt-test execute if
	    { d: ud-dyn-callee d: ud-dyn-caller u-stat u-exec-callees u-callees }
	    ud-dyn-callee p profile-count 2@ 2dup { d: de } d+
	    ud-dyn-caller calls count-dyncalls 2dup { d: dr } d+
	    u-stat stat +
	    u-exec-callees de dr d<> -
	    u-callees 1+
	endif
    endif
    xt-test ;

: print-stat-line ( xt -- )
    >r 0. 0. 0 0 0 r> profile-points @ ['] add-calls map-list drop
    ( ud-dyn-callee ud-dyn-caller u-stat u-exec-callees u-callees )
    6 u.r 7 u.r 7 u.r 12 ud.r 12 ud.r space ;

: print-library-stats ( -- )
    library-calls @ list-length 20 u.r \ static callers
    library-calls @ count-dyncalls 12 ud.r \ dynamic callers
    13 spaces ;

: bblen+ ( u1 callp -- u2 )
    calls-call @ profile-bblen @ + ;

: dyn-bblen+ ( ud1 callp -- ud2 )
    calls-call @ dup profile-count 2@ rot profile-bblen @ 1 m*/ d+ ;
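
\ Worked example for DYN-BBLEN+ (a sketch; DEMO-DYN-BBLEN is a made-up
\ name and the counts are invented).  The word weights the static
\ length of a basic block by its execution count: the double-cell
\ count is multiplied by the single-cell length (and divided by 1)
\ with M*/ before being added to the running total.  Disabled with
\ 0 [IF]; change the 0 to 1 to compile it.

0 [if]
: demo-dyn-bblen ( -- ud )
    profile% %alloc { pp }
    5. pp profile-count 2!       \ pretend this BB was executed 5 times
    3 pp profile-bblen !         \ and consists of 3 primitives
    0. pp new-call dyn-bblen+ ;  \ 0 + 5*3 dynamic primitives
[endif]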
    
: print-bb-statistics ( -- )
    ." static     dynamic" cr
    all-bbs @ list-length 6 u.r all-bbs @ count-dyncalls 12 ud.r ."  basic blocks" cr
    0 all-bbs @ ['] bblen+ map-list 6 u.r
    0. all-bbs @ ['] dyn-bblen+ map-list 12 ud.r ."  primitives" cr
    ;

: print-statistics ( -- )
    ." callee exec'd static  dyn-caller  dyn-callee   condition" cr
    ['] 0=  print-stat-line ." calls to coldefs with 0 callers" cr
    ['] 1=  print-stat-line ." calls to coldefs with 1 callers" cr
    ['] 2=  print-stat-line ." calls to coldefs with 2 callers" cr
    ['] 3=  print-stat-line ." calls to coldefs with 3 callers" cr
    ['] 1u> print-stat-line ." calls to coldefs with >1 callers" cr
    print-library-stats     ." library calls" cr
    print-bb-statistics
    ;

: dinc ( profilep -- )
    \ increment the double-cell profile-count of profile point profilep
    profile-count dup 2@ 1. d+ rot 2! ;

: profile-this ( -- )
    in-compile,? @ in-compile,? on
    new-profile-point POSTPONE literal POSTPONE dinc
    in-compile,? ! ;
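
\ Sketch of what PROFILE-THIS compiles at a profile point: the address
\ of the profile point as a literal, followed by a call to DINC.  The
\ example builds such a counter by hand so that it does not touch the
\ profiler's own lists; DEMO-PP and DEMO-COUNTED are made-up names.
\ Disabled with 0 [IF]; change the 0 to 1 to try it.

0 [if]
profile% %alloc constant demo-pp   \ stand-alone profile point
0. demo-pp profile-count 2!        \ start counting at zero

: demo-counted ( -- )
    [ demo-pp ] literal dinc ;     \ same shape as the code PROFILE-THIS emits

demo-counted demo-counted          \ run the counted code twice
demo-pp profile-count 2@ d.        \ show the count
[endif]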

\ Various words trigger PROFILE-THIS.  In order to avoid getting
\ several calls to PROFILE-THIS from a compiling word (like ?EXIT), we
\ just wait until the next word is parsed by the text interpreter (in
\ compile state) and call PROFILE-THIS only once then.  The whole
\ BEFORE-WORD hooking etc. is there for this.

\ We do this because we use the source position for
\ the profiling information, and there's only one source position for
\ ?EXIT.  If we used the threaded code position instead, we would see
\ that ?EXIT compiles to several threaded-code words, and could use
\ different profile points for them.  However, usually dealing with
\ the source is more practical.

\ Another benefit is that we can ask for profiling anywhere in a
\ control-flow word (even before it compiles its own stuff).

\ Potential problem: Consider "COMPILING ] [" where COMPILING compiles
\ a whole colon definition (and triggers our profiler), but during the
\ compilation of the colon definition there is no parsing.  Afterwards
\ you get interpret state at first (no profiling, either), but after
\ the "]" you get parsing in compile state, and PROFILE-THIS gets
\ called (and compiles code that is never executed).  It would be
\ better if we had a way of knowing whether we are in a colon def or
\ not (and used that knowledge instead of STATE).

Defer before-word-profile ( -- )
' noop IS before-word-profile

: before-word1 ( -- )
    before-word-profile defers before-word ;

' before-word1 IS before-word

: profile-this-compiling ( -- )
    state @ if
	profile-this
	['] noop IS before-word-profile
    endif ;

: cock-profiler ( -- )
    \ as in cocking a gun: arm the trigger, so that the next word
    \ parsed in compile state calls PROFILE-THIS exactly once
    ['] profile-this-compiling IS before-word-profile
    [ count-calls? ] [if] \ we are at a non-colondef profile point
	last-colondef-profile @ profile-straight-line off
    [endif]
;
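
\ The arm-once/fire-once idea in isolation (a sketch with made-up
\ names DEMO-HOOK, DEMO-FIRED, DEMO-FIRE, DEMO-ARM and DEMO-ONE-SHOT;
\ unlike PROFILE-THIS-COMPILING it does not check STATE).  Arming
\ several times in a row still gives only one firing, because the
\ hook resets itself to NOOP when it runs.  Disabled with 0 [IF];
\ change the 0 to 1 to compile it.

0 [if]
defer demo-hook  ' noop is demo-hook
variable demo-fired  0 demo-fired !

: demo-fire ( -- )
    1 demo-fired +!
    ['] noop is demo-hook ;      \ disarm: at most one firing per arming

: demo-arm ( -- )
    ['] demo-fire is demo-hook ;

: demo-one-shot ( -- u )
    demo-arm demo-arm demo-arm   \ several triggers in a row
    demo-hook demo-hook          \ the second call finds NOOP
    demo-fired @ ;               \ number of firings so far
[endif]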

: hook-profiling-into ( "name" -- )
    \ make (deferred word) "name" call cock-profiler, too
    ' >body >r :noname
    POSTPONE cock-profiler
    r@ @ compile, \ old hook behaviour
    POSTPONE ;
    r> ! ; \ change hook behaviour

: note-execute ( -- )
    \ end of BB due to execute, dodefer, perform
    profile-this \ should actually happen after the word, but the
                 \ error is probably small
;

: note-call ( addr -- )
    \ addr is the body address of a called colon def or does handler
    dup ['] (does>2) >body = if \ adjust does handler address
	4 cells here 1 cells - +!
    endif
    { addr }
    current-profile-point @ { lastbb }
    profile-this
    current-profile-point @ { thisbb }
    thisbb new-call { call-node }
    addr 3 cells + @ ['] dinc >body = if
	\ non-library call
	\ !! update profile-bblenpi of last and current pp
	addr cell+ @ { callee-pp }
	callee-pp profile-postlude @ thisbb profile-callee-postlude !
	call-node callee-pp profile-calls insert-list
    else \ library call
	call-node library-calls insert-list
    endif ;

: prof-compile, ( xt -- )
    in-compile,? @ if
	DEFERS compile, EXIT
    endif
    1 current-profile-point @ profile-bblen +!
    dup CASE
	['] execute of note-execute endof
	['] perform of note-execute endof
	dup >does-code if
	    dup >does-code note-call
	then
	dup >code-address CASE
	    docol:   OF dup >body note-call ENDOF
	    dodefer: OF note-execute ENDOF
	    \ dofield: OF >body @ POSTPONE literal ['] + peephole-compile, EXIT ENDOF
	    \ code words and ;code-defined words (code words could be optimized):
	ENDCASE
    ENDCASE
    DEFERS compile, ;

: :-hook-profile ( -- )
    defers :-hook
    next-profile-point-p @
    profile-this
    @ dup last-colondef-profile ! ( current-profile-point )
    1 over profile-bblenpi !
    profile-colondef? on ;

: exit-hook-profile ( -- )
    defers exit-hook
    1 last-colondef-profile @ profile-exits +! ;

: ;-hook-profile ( -- )
    \ ;-hook is called before the POSTPONE EXIT
    defers ;-hook
    last-colondef-profile @ { col }
    current-profile-point @ { bb }
    col profile-bblen @ col profile-prelude +!
    col profile-exits @ 0= if
	col bb profile-tailof !
	bb profile-bblen @ bb profile-callee-postlude @ +
	col profile-postlude !
	1 bb profile-bblenpi !
	\ not counting the EXIT
    endif ;

hook-profiling-into then-like
\ hook-profiling-into if-like    \ subsumed by other-control-flow
\ hook-profiling-into ahead-like \ subsumed by other-control-flow
hook-profiling-into other-control-flow
hook-profiling-into begin-like
hook-profiling-into again-like
hook-profiling-into until-like
' :-hook-profile IS :-hook
' prof-compile, IS compile,
' exit-hook-profile IS exit-hook
' ;-hook-profile IS ;-hook
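
\ Usage sketch (an assumption about typical use, not taken from the
\ Gforth documentation): once the hooks above are installed, code
\ compiled afterwards gets profile points.  "myprog.fs" and MAIN are
\ placeholders for the program to be profiled and its entry point.
\ Disabled with 0 [IF]; change the 0 to 1 to try it.

0 [if]
s" myprog.fs" included   \ compile the program with profile points
main                     \ run it so that the dynamic counts are filled in
print-statistics         \ summary grouped by number of static callers
print-profile-coldef     \ one line per colon definition
[endif]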
