Annotation of gforth/prof-inline.fs, revision 1.3
1.1 anton 1: \ get some data on potential (partial) inlining
2:
3: \ Copyright (C) 2004 Free Software Foundation, Inc.
4:
5: \ This file is part of Gforth.
6:
7: \ Gforth is free software; you can redistribute it and/or
8: \ modify it under the terms of the GNU General Public License
9: \ as published by the Free Software Foundation; either version 2
10: \ of the License, or (at your option) any later version.
11:
12: \ This program is distributed in the hope that it will be useful,
13: \ but WITHOUT ANY WARRANTY; without even the implied warranty of
14: \ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15: \ GNU General Public License for more details.
16:
17: \ You should have received a copy of the GNU General Public License
18: \ along with this program; if not, write to the Free Software
19: \ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.
20:
21:
22: \ relies on some Gforth internals
23:
24: \ !! assumption: each file is included only once; otherwise you get
25: \ the counts for just one of the instances of the file. This can be
26: \ fixed by making sure that every source position occurs only once as
27: \ a profile point.
28:
29: true constant count-calls? \ do some profiling of colon definitions etc.
30:
31: \ for true COUNT-CALLS?:
32:
33: \ What data do I need for evaluating the effectiveness of (partial) inlining?
34:
35: \ static and dynamic counts of everything:
36:
37: \ original BB length (histogram and average)
38: \ BB length with partial inlining (histogram and average)
39: \ since we cannot partially inline library calls, we use a parameter
40: \ that represents the amount of partial inlining we can expect there.
41: \ number of tail calls (original and after partial inlining)
42: \ number of calls (original and after partial inlining)
43: \ reason for BB end: call, return, execute, branch
44:
45: \ how many static calls are there to a word? How many of the dynamic
46: \ calls call just a single word?
47:
1.2 anton 48: \ how much does inlining called-once words help?
49: \ how much does inlining words without control flow help?
50: \ how much does partial inlining help?
51: \ what's the overlap?
52: \ optimizing return-to-returns (tail calls), return-to-calls, call-to-calls
53:
1.1 anton 54: struct
1.3 ! anton 55: cell% field list-next
1.2 anton 56: end-struct list%
57:
58: list%
1.1 anton 59: cell% 2* field profile-count
60: cell% 2* field profile-sourcepos
61: cell% field profile-char \ character position in line
62: count-calls? [if]
63: cell% field profile-colondef? \ is this a colon definition start
1.2 anton 64: cell% field profile-calls \ static calls to the colon def (calls%)
1.1 anton 65: cell% field profile-straight-line \ may contain calls, but no other CF
66: cell% field profile-calls-from \ static calls in the colon def
67: [endif]
68: end-struct profile% \ profile point
69:
1.2 anton 70: list%
1.3 ! anton 71: cell% field calls-call \ ptr to profile point of bb containing the call
1.2 anton 72: end-struct calls%
73:
1.1 anton 74: variable profile-points \ linked list of profile%
75: 0 profile-points !
76: variable next-profile-point-p \ the address where the next pp will be stored
77: profile-points next-profile-point-p !
1.3 ! anton 78: variable last-colondef-profile \ pointer to the pp of last colon definition
! 79: variable current-profile-point
! 80: variable library-calls \ list of calls to library colon defs
1.2 anton 81:
82: \ list stuff
83:
1.3 ! anton 84: : map-list ( ... list xt -- ... )
! 85: { xt } begin { list }
! 86: list while
! 87: list xt execute
! 88: list list-next @
! 89: repeat ;
! 90:
! 91: : drop-1+ ( u list -- u+1 ) drop 1+ ;
! 92:
! 93: : list-length ( list -- u )
! 94: 0 swap ['] drop-1+ map-list ;
! 95:
! 96: : insert-list ( listp listpp -- )
! 97: \ insert list node listp into list pointed to by listpp in front
! 98: tuck @ over list-next !
! 99: swap ! ;
! 100:
! 101: : insert-list-end ( listp listppp -- )
! 102: \ insert list node listp into list, with listppp indicating the
! 103: \ position to insert at, and indicating the position behind the
! 104: \ new element afterwards.
! 105: 2dup @ insert-list
! 106: swap list-next swap ! ;
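
\ Illustration (not part of the profiler): a minimal, standalone sketch
\ of the front insertion done by INSERT-LIST. It assumes, as in LIST%,
\ that the next pointer is the first cell of a node. DEMO-HEAD,
\ DEMO-PUSH and DEMO-LENGTH are hypothetical names used only here.
variable demo-head  0 demo-head !
: demo-push ( node headp -- )
    \ same stack steps as INSERT-LIST, with the next pointer at offset 0
    tuck @ over !  swap ! ;
: demo-length ( headp -- u )
    \ walk the next pointers until the terminating 0
    0 swap @ begin dup while swap 1+ swap @ repeat drop ;
here 0 , demo-head demo-push   \ each node is one cell in the dictionary
here 0 , demo-head demo-push
demo-head demo-length .        \ prints 2
\ The most recently pushed node becomes the list head; INSERT-LIST-END
\ appends instead, presumably so that profile points stay in creation order.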
1.2 anton 107:
1.3 ! anton 108: \ calls
! 109:
! 110: : new-call ( profile-point -- call )
! 111: calls% %alloc tuck calls-call ! ;
1.2 anton 112:
113: \ profile-point stuff
114:
1.1 anton 115: : new-profile-point ( -- addr )
116: profile% %alloc >r
117: 0. r@ profile-count 2!
118: current-sourcepos r@ profile-sourcepos 2!
119: >in @ r@ profile-char !
120: [ count-calls? ] [if]
121: r@ profile-colondef? off
122: 0 r@ profile-calls !
123: r@ profile-straight-line on
124: 0 r@ profile-calls-from !
125: [endif]
1.3 ! anton 126: r@ next-profile-point-p insert-list-end
! 127: r@ current-profile-point !
1.1 anton 128: r> ;
129:
130: : print-profile ( -- )
131: profile-points @ begin
132: dup while
133: dup >r
134: r@ profile-sourcepos 2@ .sourcepos ." :"
135: r@ profile-char @ 0 .r ." : "
136: r@ profile-count 2@ 0 d.r cr
1.2 anton 137: r> list-next @
1.1 anton 138: repeat
139: drop ;
140:
141: : print-profile-coldef ( -- )
142: profile-points @ begin
143: dup while
144: dup >r
145: r@ profile-colondef? @ if
146: r@ profile-sourcepos 2@ .sourcepos ." :"
147: r@ profile-char @ 3 .r ." : "
148: r@ profile-count 2@ 10 d.r
149: r@ profile-straight-line @ space 2 .r
1.3 ! anton 150: r@ profile-calls @ list-length 4 .r
1.1 anton 151: cr
152: endif
1.2 anton 153: r> list-next @
1.1 anton 154: repeat
155: drop ;
156:
1.3 ! anton 157: : 1= ( u -- f )
! 158: 1 = ;
! 159:
! 160: : 2= ( u -- f )
! 161: 2 = ;
! 162:
! 163: : 3= ( u -- f )
! 164: 3 = ;
! 165:
! 166: : 1u> ( u -- f )
! 167: 1 u> ;
! 168:
! 169: : call-count+ ( ud1 callp -- ud2 )
! 170: calls-call @ profile-count 2@ d+ ;
! 171:
! 172: : add-calls ( ud-dyn-callee1 ud-dyn-caller1 u-stat1 xt-test profpp --
! 173: ud-dyn-callee2 ud-dyn-caller2 u-stat2 xt-test )
! 174: \ add up the static and dynamic call counts for profpp, if the
! 175: \ number of static calls to profpp satisfies xt-test ( u -- f )
! 176: { xt-test p }
! 177: p profile-colondef? @ if ( ud-dyn-callee1 ud-dyn-caller1 u-stat1 )
! 178: p profile-calls @ { calls }
! 179: calls list-length { stat }
! 180: stat xt-test execute if ( ud-dyn-callee1 ud-dyn-caller1 u-stat1 )
! 181: stat + >r
! 182: 0. calls ['] call-count+ map-list d+ 2>r
! 183: p profile-count 2@ d+
! 184: 2r> r>
! 185: endif
! 186: endif
! 187: xt-test ;
! 188:
! 189: : print-stat-line ( xt -- )
! 190: >r 0. 0. 0 r> profile-points @ ['] add-calls map-list drop
! 191: ( ud-dyn-callee ud-dyn-caller u-stat )
! 192: 7 u.r 12 ud.r 12 ud.r space ;
! 193:
! 194: : print-statistics ( -- )
! 195: ." static dyn-caller dyn-callee condition" cr
! 196: ['] 0= print-stat-line ." calls to coldefs with 0 callers" cr
! 197: ['] 1= print-stat-line ." calls to coldefs with 1 callers" cr
! 198: ['] 2= print-stat-line ." calls to coldefs with 2 callers" cr
! 199: ['] 3= print-stat-line ." calls to coldefs with 3 callers" cr
! 200: ['] 1u> print-stat-line ." calls to coldefs with >1 callers" cr
! 201: ;
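
\ Illustration (not part of the profiler): PRINT-STAT-LINE selects colon
\ definitions with a predicate xt ( u -- f ). This is a standalone sketch
\ of the same idea over a plain array of made-up caller counts; all DEMO-
\ names and the data are hypothetical, and Gforth locals are assumed.
create demo-callers  0 , 1 , 1 , 3 , 2 ,
: demo-1= ( u -- f )  1 = ;
: demo-count-if ( addr n xt -- count )
    \ count the cells in addr[0..n-1] for which xt returns true
    { addr n xt } 0  n 0 ?do
        addr i cells + @ xt execute if 1+ then
    loop ;
demo-callers 5 ' demo-1= demo-count-if .   \ prints 2
demo-callers 5 ' 0= demo-count-if .        \ prints 1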
! 202:
1.1 anton 203: : dinc ( profilep -- )
204: \ increment the double-cell profile-count field of profilep
205: profile-count dup 2@ 1. d+ rot 2! ;
206:
207: : profile-this ( -- )
208: new-profile-point POSTPONE literal POSTPONE dinc ;
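
\ Illustration (not part of the profiler): a standalone sketch of how
\ PROFILE-THIS plants a counter in the code being compiled, namely as a
\ literal address followed by a call to an increment word. The DEMO-
\ names are hypothetical. Unlike PROFILE-THIS, which is run from hooks,
\ DEMO-PROFILE-THIS is made immediate so that it can be used directly
\ inside a definition.
2variable demo-count  0. demo-count 2!
: demo-dinc ( d-addr -- )
    \ increment the double at d-addr
    dup 2@ 1. d+ rot 2! ;
: demo-profile-this ( -- )
    demo-count postpone literal postpone demo-dinc ; immediate
: demo-word ( -- )  demo-profile-this ;
demo-word demo-word demo-word
demo-count 2@ d.   \ prints 3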
209:
210: \ Various words trigger PROFILE-THIS. In order to avoid getting
211: \ several calls to PROFILE-THIS from a compiling word (like ?EXIT), we
212: \ just wait until the next word is parsed by the text interpreter (in
213: \ compile state) and call PROFILE-THIS only once then. The whole
214: \ BEFORE-WORD hooking etc. is there for this.
215:
216: \ We do this because we use the source position for
217: \ the profiling information, and there's only one source position for
218: \ ?EXIT. If we used the threaded code position instead, we would see
219: \ that ?EXIT compiles to several threaded-code words, and could use
220: \ different profile points for them. However, usually dealing with
221: \ the source is more practical.
222:
223: \ Another benefit is that we can ask for profiling anywhere in a
224: \ control-flow word (even before it compiles its own stuff).
225:
226: \ Potential problem: Consider "COMPILING ] [" where COMPILING compiles
227: \ a whole colon definition (and triggers our profiler), but during the
228: \ compilation of the colon definition there is no parsing. Afterwards
229: \ you get interpret state at first (no profiling, either), but after
230: \ the "]" you get parsing in compile state, and PROFILE-THIS gets
231: \ called (and compiles code that is never executed). It would be
232: \ better if we had a way of knowing whether we are in a colon def or
233: \ not (and used that knowledge instead of STATE).
234:
235: Defer before-word-profile ( -- )
236: ' noop IS before-word-profile
237:
238: : before-word1 ( -- )
239: before-word-profile defers before-word ;
240:
241: ' before-word1 IS before-word
242:
243: : profile-this-compiling ( -- )
244: state @ if
245: profile-this
246: ['] noop IS before-word-profile
247: endif ;
248:
249: : cock-profiler ( -- )
250: \ as in cocking a gun: the next word parsed in compile state pulls the trigger
251: ['] profile-this-compiling IS before-word-profile
252: [ count-calls? ] [if] \ we are at a non-colondef profile point
253: last-colondef-profile @ profile-straight-line off
254: [endif]
255: ;
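
\ Illustration (not part of the profiler): a standalone sketch of the
\ arm-once/fire-once pattern used by COCK-PROFILER and
\ PROFILE-THIS-COMPILING. Arming several times before the hook runs still
\ gives a single action, because the action disarms the hook. The DEMO-
\ names are hypothetical, and the STATE check of the real code is left out.
defer demo-hook  ' noop is demo-hook
variable demo-fired  0 demo-fired !
: demo-fire-once ( -- )
    1 demo-fired +!  ['] noop is demo-hook ;
: demo-cock ( -- )
    ['] demo-fire-once is demo-hook ;
demo-cock demo-cock   \ armed twice, e.g. by two control-flow hooks
demo-hook demo-hook   \ the hook runs twice, but fires only once
demo-fired @ .        \ prints 1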
256:
257: : hook-profiling-into ( "name" -- )
258: \ make (deferred word) "name" call cock-profiler, too
259: ' >body >r :noname
260: POSTPONE cock-profiler
261: r@ @ compile, \ old hook behaviour
262: POSTPONE ;
263: r> ! ; \ change hook behaviour
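
\ Illustration (not part of the profiler): a standalone sketch of the
\ wrapping done by HOOK-PROFILING-INTO. It builds a :NONAME word that
\ first runs an extra action and then the old behaviour of the deferred
\ word, and installs that as the new behaviour. It assumes a Gforth that
\ provides DEFER@ and DEFER!, which replace the >BODY @ and ! accesses
\ of the code above; the DEMO- names are hypothetical.
variable demo-seen  0 demo-seen !
: demo-note ( -- )  1 demo-seen +! ;
: demo-hook-into ( "name" -- )
    ' >r :noname
    postpone demo-note
    r@ defer@ compile,   \ old hook behaviour
    postpone ;
    r> defer! ;          \ change hook behaviour
defer demo-target  ' noop is demo-target
demo-hook-into demo-target
demo-target demo-target
demo-seen @ .   \ prints 2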
264:
265: : note-execute ( -- )
266: \ end of BB due to execute
267: ;
268:
269: : note-call ( addr -- )
270: \ addr is the body address of a called colon def or DOES> handler
1.3 ! anton 271: dup 3 cells + @ ['] dinc >body = if ( addr )
! 272: current-profile-point @ new-call over cell+ @ profile-calls insert-list
1.1 anton 273: endif
274: drop ;
275:
276: : prof-compile, ( xt -- )
277: dup >does-code if
278: dup >does-code note-call
279: then
280: dup >code-address CASE
281: docol: OF dup >body note-call ENDOF
282: dodefer: OF note-execute ENDOF
283: dofield: OF >body @ ['] lit+ peephole-compile, , EXIT ENDOF
284: \ dofield: OF >body @ POSTPONE literal ['] + peephole-compile, EXIT ENDOF
285: \ code words and ;code-defined words (code words could be optimized):
286: dup in-dictionary? IF drop POSTPONE literal ['] execute peephole-compile, EXIT THEN
287: ENDCASE
288: DEFERS compile, ;
289:
290: \ hook-profiling-into then-like
291: \ \ hook-profiling-into if-like \ subsumed by other-control-flow
292: \ \ hook-profiling-into ahead-like \ subsumed by other-control-flow
293: \ hook-profiling-into other-control-flow
294: \ hook-profiling-into begin-like
295: \ hook-profiling-into again-like
296: \ hook-profiling-into until-like
297:
298: : :-hook-profile ( -- )
299: defers :-hook
300: next-profile-point-p @
301: profile-this
302: @ dup last-colondef-profile !
303: profile-colondef? on ;
304:
305: ' :-hook-profile IS :-hook
306: ' prof-compile, IS compile,