Annotation of gforth/prof-inline.fs, revision 1.6
1.1 anton 1: \ get some data on potential (partial) inlining
2:
3: \ Copyright (C) 2004 Free Software Foundation, Inc.
4:
5: \ This file is part of Gforth.
6:
7: \ Gforth is free software; you can redistribute it and/or
8: \ modify it under the terms of the GNU General Public License
9: \ as published by the Free Software Foundation; either version 2
10: \ of the License, or (at your option) any later version.
11:
12: \ This program is distributed in the hope that it will be useful,
13: \ but WITHOUT ANY WARRANTY; without even the implied warranty of
14: \ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
15: \ GNU General Public License for more details.
16:
17: \ You should have received a copy of the GNU General Public License
18: \ along with this program; if not, write to the Free Software
19: \ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.
20:
21:
22: \ relies on some Gforth internals
23:
24: \ !! assumption: each file is included only once; otherwise you get
25: \ the counts for just one of the instances of the file. This can be
26: \ fixed by making sure that every source position occurs only once as
27: \ a profile point.
28:
29: true constant count-calls? \ do some profiling of colon definitions etc.
30:
31: \ for true COUNT-CALLS?:
32:
33: \ What data do I need for evaluating the effectiveness of (partial) inlining?
34:
35: \ static and dynamic counts of everything:
36:
37: \ original BB length (histogram and average)
38: \ BB length with partial inlining (histogram and average)
39: \ since we cannot partially inline library calls, we use a parameter
40: \ that represents the amount of partial inlining we can expect there.
41: \ number of tail calls (original and after partial inlining)
42: \ number of calls (original and after partial inlining)
43: \ reason for BB end: call, return, execute, branch
44:
45: \ how many static calls are there to a word? How many of the dynamic
46: \ calls call just a single word?
47:
1.2 anton 48: \ how much does inlining called-once words help?
49: \ how much does inlining words without control flow help?
50: \ how much does partial inlining help?
51: \ what's the overlap?
52: \ optimizing return-to-returns (tail calls), return-to-calls, call-to-calls
53:
1.1 anton 54: struct
1.3 anton 55: cell% field list-next
1.2 anton 56: end-struct list%
57:
58: list%
1.1 anton 59: cell% 2* field profile-count
60: cell% 2* field profile-sourcepos
1.6 ! anton 61: cell% field profile-char \ character position in line
! 62: cell% field profile-bblen \ number of primitives in BB
! 63: cell% field profile-colondef? \ is this a colon definition start
! 64: cell% field profile-calls \ static calls to the colon def (calls%)
! 65: cell% field profile-straight-line \ may contain calls, but no other CF
! 66: cell% field profile-calls-from \ static calls in the colon def
1.1 anton 67: end-struct profile% \ profile point
68:
1.2 anton 69: list%
1.3 anton 70: cell% field calls-call \ ptr to profile point of bb containing the call
1.2 anton 71: end-struct calls%
72:
1.1 anton 73: variable profile-points \ linked list of profile%
74: 0 profile-points !
75: variable next-profile-point-p \ the address where the next pp will be stored
76: profile-points next-profile-point-p !
1.3 anton 77: variable last-colondef-profile \ pointer to the pp of last colon definition
78: variable current-profile-point
1.5 anton 79: variable library-calls 0 library-calls ! \ list of calls to library colon defs
1.4 anton 80: variable in-compile,? in-compile,? off
1.6 ! anton 81: variable all-bbs 0 all-bbs ! \ list of all basic blocks
1.2 anton 82:
83: \ list stuff
84:
1.3 anton 85: : map-list ( ... list xt -- ... )
86: { xt } begin { list }
87: list while
88: list xt execute
89: list list-next @
90: repeat ;
91:
92: : drop-1+ ( u listp -- u+1 ) drop 1+ ;
93:
94: : list-length ( list -- u )
95: 0 swap ['] drop-1+ map-list ;
96:
97: : insert-list ( listp listpp -- )
98: \ insert list node listp at the front of the list pointed to by listpp
99: tuck @ over list-next !
100: swap ! ;
101:
102: : insert-list-end ( listp listppp -- )
103: \ insert list node listp into list, with listppp indicating the
104: \ position to insert at, and indicating the position behind the
105: \ new element afterwards.
106: 2dup @ insert-list
107: swap list-next swap ! ;
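
\ Illustration only, not part of the original file: a minimal sketch
\ of how the list words above combine.  DEMO-LIST, DEMO-TAIL and
\ LIST-DEMO are hypothetical names introduced for this example.
\ DEMO-TAIL plays the role that NEXT-PROFILE-POINT-P plays for
\ PROFILE-POINTS: it holds the address of the cell where the next
\ appended node gets linked in.
variable demo-list  0 demo-list !
variable demo-tail  demo-list demo-tail !
: list-demo ( -- u )
    list% %alloc demo-tail insert-list-end \ append: list is A
    list% %alloc demo-tail insert-list-end \ append: list is A B
    list% %alloc demo-list insert-list     \ prepend: list is C A B
    demo-list @ list-length ;
\ On its first call LIST-DEMO should leave 3, the number of nodes.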
1.2 anton 108:
1.3 anton 109: \ calls
110:
111: : new-call ( profile-point -- call )
112: calls% %alloc tuck calls-call ! ;
1.2 anton 113:
114: \ profile-point stuff
115:
1.1 anton 116: : new-profile-point ( -- addr )
117: profile% %alloc >r
118: 0. r@ profile-count 2!
119: current-sourcepos r@ profile-sourcepos 2!
120: >in @ r@ profile-char !
1.6 ! anton 121: r@ profile-colondef? off
! 122: 0 r@ profile-bblen !
! 123: 0 r@ profile-calls !
! 124: r@ profile-straight-line on
! 125: 0 r@ profile-calls-from !
1.3 anton 126: r@ next-profile-point-p insert-list-end
127: r@ current-profile-point !
1.6 ! anton 128: r@ new-call all-bbs insert-list
1.1 anton 129: r> ;
130:
131: : print-profile ( -- )
132: profile-points @ begin
133: dup while
134: dup >r
135: r@ profile-sourcepos 2@ .sourcepos ." :"
136: r@ profile-char @ 0 .r ." : "
137: r@ profile-count 2@ 0 d.r cr
1.2 anton 138: r> list-next @
1.1 anton 139: repeat
140: drop ;
141:
142: : print-profile-coldef ( -- )
143: profile-points @ begin
144: dup while
145: dup >r
146: r@ profile-colondef? @ if
147: r@ profile-sourcepos 2@ .sourcepos ." :"
148: r@ profile-char @ 3 .r ." : "
149: r@ profile-count 2@ 10 d.r
150: r@ profile-straight-line @ space 2 .r
1.3 anton 151: r@ profile-calls @ list-length 4 .r
1.1 anton 152: cr
153: endif
1.2 anton 154: r> list-next @
1.1 anton 155: repeat
156: drop ;
157:
1.3 anton 158: : 1= ( u -- f )
159: 1 = ;
160:
161: : 2= ( u -- f )
162: 2 = ;
163:
164: : 3= ( u -- f )
165: 3 = ;
166:
167: : 1u> ( u -- f )
168: 1 u> ;
169:
170: : call-count+ ( ud1 callp -- ud2 )
171: calls-call @ profile-count 2@ d+ ;
172:
1.5 anton 173: : count-dyncalls ( calls -- ud )
174: 0. rot ['] call-count+ map-list ;
175:
176: : add-calls ( statistics1 xt-test profpp -- statistics2 xt-test )
177: \ add up the statistics for callee profpp, if the number of static
178: \ calls to profpp satisfies xt-test ( u -- f ); see below for what
179: \ statistics are computed.
1.3 anton 180: { xt-test p }
1.5 anton 181: p profile-colondef? @ if
1.3 anton 182: p profile-calls @ { calls }
183: calls list-length { stat }
1.5 anton 184: stat xt-test execute if
185: { d: ud-dyn-callee d: ud-dyn-caller u-stat u-exec-callees u-callees }
186: ud-dyn-callee p profile-count 2@ 2dup { d: de } d+
187: ud-dyn-caller calls count-dyncalls 2dup { d: dr } d+
188: u-stat stat +
189: u-exec-callees de dr d<> -
190: u-callees 1+
1.3 anton 191: endif
192: endif
193: xt-test ;
194:
195: : print-stat-line ( xt -- )
1.5 anton 196: >r 0. 0. 0 0 0 r> profile-points @ ['] add-calls map-list drop
1.3 anton 197: ( ud-dyn-callee ud-dyn-caller u-stat u-exec-callees u-callees )
1.5 anton 198: 6 u.r 7 u.r 7 u.r 12 ud.r 12 ud.r space ;
199:
200: : print-library-stats ( -- )
201: library-calls @ list-length 20 u.r \ static callers
202: library-calls @ count-dyncalls 12 ud.r \ dynamic callers
203: 13 spaces ;
1.3 anton 204:
1.6 ! anton 205: : bblen+ ( u1 callp -- u2 )
! 206: calls-call @ profile-bblen @ + ;
! 207:
! 208: : dyn-bblen+ ( ud1 callp -- ud2 )
! 209: calls-call @ dup profile-count 2@ rot profile-bblen @ 1 m*/ d+ ;
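
\ Illustration only, not part of the original file: DYN-BBLEN+ weights
\ the static length of a basic block by its double-cell execution
\ count.  "n 1 M*/" multiplies a double by the single n (and divides
\ by 1).  M*/-DEMO is a hypothetical name; the numbers are made up.
: m*/-demo ( -- d )
    5. \ a BB that was executed 5 times ...
    3 1 m*/ ; \ ... and has 3 primitives contributes 15. primitives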
! 210:
! 211: : print-bb-statistics ( -- )
! 212: ." static dynamic" cr
! 213: all-bbs @ list-length 6 u.r all-bbs @ count-dyncalls 12 ud.r ." basic blocks" cr
! 214: 0 all-bbs @ ['] bblen+ map-list 6 u.r
! 215: 0. all-bbs @ ['] dyn-bblen+ map-list 12 ud.r ." primitives" cr
! 216: ;
! 217:
1.3 anton 218: : print-statistics ( -- )
1.5 anton 219: ." callee exec'd static dyn-caller dyn-callee condition" cr
1.3 anton 220: ['] 0= print-stat-line ." calls to coldefs with 0 callers" cr
221: ['] 1= print-stat-line ." calls to coldefs with 1 callers" cr
222: ['] 2= print-stat-line ." calls to coldefs with 2 callers" cr
223: ['] 3= print-stat-line ." calls to coldefs with 3 callers" cr
224: ['] 1u> print-stat-line ." calls to coldefs with >1 callers" cr
1.5 anton 225: print-library-stats ." library calls" cr
1.6 ! anton 226: print-bb-statistics
1.3 anton 227: ;
228:
1.1 anton 229: : dinc ( profilep -- )
230: \ increment the double-cell profile-count of profile point profilep
231: profile-count dup 2@ 1. d+ rot 2! ;
232:
233: : profile-this ( -- )
1.4 anton 234: in-compile,? @ in-compile,? on
235: new-profile-point POSTPONE literal POSTPONE dinc
236: in-compile,? ! ;
1.1 anton 237:
238: \ Various words trigger PROFILE-THIS. In order to avoid getting
239: \ several calls to PROFILE-THIS from a compiling word (like ?EXIT), we
240: \ just wait until the next word is parsed by the text interpreter (in
241: \ compile state) and call PROFILE-THIS only once then. The whole
242: \ BEFORE-WORD hooking etc. is there for this.
243:
244: \ We do this because we use the source position for
245: \ the profiling information, and there's only one source position for
246: \ ?EXIT. If we used the threaded code position instead, we would see
247: \ that ?EXIT compiles to several threaded-code words, and could use
248: \ different profile points for them. However, usually dealing with
249: \ the source is more practical.
250:
251: \ Another benefit is that we can ask for profiling anywhere in a
252: \ control-flow word (even before it compiles its own stuff).
253:
254: \ Potential problem: Consider "COMPILING ] [" where COMPILING compiles
255: \ a whole colon definition (and triggers our profiler), but during the
256: \ compilation of the colon definition there is no parsing. Afterwards
257: \ you get interpret state at first (no profiling, either), but after
258: \ the "]" you get parsing in compile state, and PROFILE-THIS gets
259: \ called (and compiles code that is never executed). It would be
260: \ better if we had a way of knowing whether we are in a colon def or
261: \ not (and used that knowledge instead of STATE).
262:
1.6 ! anton 263: Defer before-word-profile ( -- )
! 264: ' noop IS before-word-profile
1.1 anton 265:
1.6 ! anton 266: : before-word1 ( -- )
! 267: before-word-profile defers before-word ;
1.1 anton 268:
1.6 ! anton 269: ' before-word1 IS before-word
1.1 anton 270:
1.6 ! anton 271: : profile-this-compiling ( -- )
! 272: state @ if
! 273: profile-this
! 274: ['] noop IS before-word-profile
! 275: endif ;
! 276:
! 277: : cock-profiler ( -- )
! 278: \ as in cocking a gun: arm it, so that the next parsed word fires it
! 279: ['] profile-this-compiling IS before-word-profile
! 280: [ count-calls? ] [if] \ we are at a non-colondef profile point
! 281: last-colondef-profile @ profile-straight-line off
! 282: [endif]
! 283: ;
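
\ Illustration only, not part of the original file: a minimal sketch
\ of the arm-once scheme that COCK-PROFILER and PROFILE-THIS-COMPILING
\ implement.  DEMO-HOOK, DEMO-FIRE, DEMO-ARM and DEMO-COUNT are
\ hypothetical names; the STATE check of the real code is left out.
variable demo-count  0 demo-count !
Defer demo-hook ( -- )
' noop IS demo-hook
: demo-fire ( -- )
    1 demo-count +!           \ stands in for PROFILE-THIS
    ['] noop IS demo-hook ;   \ disarm, so we fire only once
: demo-arm ( -- )
    ['] demo-fire IS demo-hook ;
\ After "demo-arm demo-arm demo-hook demo-hook" DEMO-COUNT should have
\ grown by 1, not by 2: arming twice is harmless, and the first run of
\ the hook disarms it.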
1.1 anton 284:
285: : hook-profiling-into ( "name" -- )
286: \ make (deferred word) "name" call cock-profiler, too
287: ' >body >r :noname
1.6 ! anton 288: POSTPONE cock-profiler
1.1 anton 289: r@ @ compile, \ old hook behaviour
290: POSTPONE ;
291: r> ! ; \ change hook behaviour
292:
293: : note-execute ( -- )
294: \ end of BB due to execute
295: ;
296:
297: : note-call ( addr -- )
298: \ addr is the body address of a called colon def or does handler
1.5 anton 299: dup ['] (does>2) >body = if \ adjust does handler address
300: 4 cells here 1 cells - +!
1.1 anton 301: endif
1.5 anton 302: profile-this current-profile-point @ new-call
303: over 3 cells + @ ['] dinc >body = if ( addr call-prof-point )
304: \ non-library call
305: swap cell+ @ profile-calls insert-list
306: else ( addr call-prof-point )
307: library-calls insert-list drop
308: endif ;
1.4 anton 309:
1.1 anton 310: : prof-compile, ( xt -- )
1.4 anton 311: in-compile,? @ if
312: DEFERS compile, EXIT
313: endif
1.6 ! anton 314: 1 current-profile-point @ profile-bblen +!
1.1 anton 315: dup >does-code if
316: dup >does-code note-call
317: then
318: dup >code-address CASE
319: docol: OF dup >body note-call ENDOF
320: dodefer: OF note-execute ENDOF
321: \ dofield: OF >body @ POSTPONE literal ['] + peephole-compile, EXIT ENDOF
322: \ code words and ;code-defined words (code words could be optimized):
323: ENDCASE
324: DEFERS compile, ;
325:
1.4 anton 326: : :-hook-profile ( -- )
327: defers :-hook
328: next-profile-point-p @
329: profile-this
330: @ dup last-colondef-profile !
331: profile-colondef? on ;
332:
1.6 ! anton 333: hook-profiling-into then-like
! 334: \ hook-profiling-into if-like \ subsumed by other-control-flow
! 335: \ hook-profiling-into ahead-like \ subsumed by other-control-flow
! 336: hook-profiling-into other-control-flow
! 337: hook-profiling-into begin-like
! 338: hook-profiling-into again-like
! 339: hook-profiling-into until-like
1.1 anton 340: ' :-hook-profile IS :-hook
1.4 anton 341: ' prof-compile, IS compile,