\ gforth/prof-inline.fs

\ get some data on potential (partial) inlining

\ Copyright (C) 2004,2007 Free Software Foundation, Inc.

\ This file is part of Gforth.

\ Gforth is free software; you can redistribute it and/or
\ modify it under the terms of the GNU General Public License
\ as published by the Free Software Foundation, either version 3
\ of the License, or (at your option) any later version.

\ This program is distributed in the hope that it will be useful,
\ but WITHOUT ANY WARRANTY; without even the implied warranty of
\ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
\ GNU General Public License for more details.

\ You should have received a copy of the GNU General Public License
\ along with this program. If not, see http://www.gnu.org/licenses/.


\ relies on some Gforth internals

\ !! assumption: each file is included only once; otherwise you get
\ the counts for just one of the instances of the file. This can be
\ fixed by making sure that every source position occurs only once as
\ a profile point.

true constant count-calls? \ do some profiling of colon definitions etc.

\ for true COUNT-CALLS?:

\ What data do I need for evaluating the effectiveness of (partial) inlining?

\ static and dynamic counts of everything:

\ original BB length (histogram and average)
\ BB length with partial inlining (histogram and average)
\   since we cannot partially inline library calls, we use a parameter
\   that represents the amount of partial inlining we can expect there.
\ number of tail calls (original and after partial inlining)
\ number of calls (original and after partial inlining)
\ reason for BB end: call, return, execute, branch

\ how many static calls are there to a word? How many of the dynamic
\ calls call just a single word?

\ how much does inlining called-once words help?
\ how much does inlining words without control flow help?
\ how much does partial inlining help?
\ what's the overlap?
\ optimizing return-to-returns (tail calls), return-to-calls, call-to-calls

struct
    cell% field list-next
end-struct list%

list%
    cell% 2* field profile-count \ how often this profile point is performed
    cell% 2* field profile-sourcepos
    cell% field profile-char \ character position in line
    cell% field profile-bblen \ number of primitives in BB
    cell% field profile-bblenpi \ bblen after partial inlining
    cell% field profile-callee-postlude \ 0 or (for calls) callee postlude len
    cell% field profile-tailof \ 0 or (for tail bbs) pointer to coldef bb
    cell% field profile-colondef? \ is this a colon definition start
    cell% field profile-calls \ static calls to the colon def (calls%)
    cell% field profile-straight-line \ may contain calls, but no other CF
    cell% field profile-calls-from \ static calls in the colon def
    cell% field profile-exits \ number of exits in this colon def
    cell% 2* field profile-execs \ number of EXECUTEs etc. of this colon def
    cell% field profile-prelude \ first BB-len of colon def (incl. callee)
    cell% field profile-postlude \ last BB-len of colon def (incl. callee)
end-struct profile% \ profile point

list%
    cell% field calls-call \ ptr to profile point of bb containing the call
end-struct calls%

variable profile-points \ linked list of profile%
0 profile-points !
variable next-profile-point-p \ the address where the next pp will be stored
profile-points next-profile-point-p !
variable last-colondef-profile \ pointer to the pp of last colon definition
variable current-profile-point
variable library-calls 0 library-calls ! \ list of calls to library colon defs
variable in-compile,? in-compile,? off
variable all-bbs 0 all-bbs ! \ list of all basic blocks

\ list stuff

: map-list ( ... list xt -- ... )
    { xt } begin { list }
        list while
            list xt execute
            list list-next @
    repeat ;

: drop-1+ drop 1+ ;

: list-length ( list -- u )
    0 swap ['] drop-1+ map-list ;

: insert-list ( listp listpp -- )
    \ insert list node listp into list pointed to by listpp in front
    tuck @ over list-next !
    swap ! ;

: insert-list-end ( listp listppp -- )
    \ insert list node listp into list, with listppp indicating the
    \ position to insert at, and indicating the position behind the
    \ new element afterwards.
    2dup @ insert-list
    swap list-next swap ! ;
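
\ Usage sketch for the list words above. This is an illustration only,
\ not part of the original file; DEMO-LIST and .NODE are hypothetical
\ names. The block is guarded by 0 [IF] so that loading this file
\ does not execute it.
0 [IF]
variable demo-list  0 demo-list !
: .node ( node -- )                 \ print the address of a list node
    . ;
list% %alloc demo-list insert-list  \ push a node at the front
list% %alloc demo-list insert-list  \ push a second node in front of it
demo-list @ list-length .           \ number of nodes in the list
demo-list @ ' .node map-list        \ visit every node, front to back
[THEN]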

\ calls

: new-call ( profile-point -- call )
    calls% %alloc tuck calls-call ! ;

\ profile-point stuff

: new-profile-point ( -- addr )
    profile% %alloc >r
    0. r@ profile-count 2!
    current-sourcepos r@ profile-sourcepos 2!
    >in @ r@ profile-char !
    0 r@ profile-callee-postlude !
    0 r@ profile-tailof !
    r@ profile-colondef? off
    0 r@ profile-bblen !
    -100000000 r@ profile-bblenpi !
    current-profile-point @ profile-bblenpi @ -100000000 = if
        current-profile-point @ dup profile-bblen @ swap profile-bblenpi !
    endif
    0 r@ profile-calls !
    r@ profile-straight-line on
    0 r@ profile-calls-from !
    0 r@ profile-exits !
    0. r@ profile-execs 2!
    0 r@ profile-prelude !
    0 r@ profile-postlude !
    r@ next-profile-point-p insert-list-end
    r@ current-profile-point !
    r@ new-call all-bbs insert-list
    r> ;

: print-profile ( -- )
    profile-points @ begin
        dup while
            dup >r
            r@ profile-sourcepos 2@ .sourcepos ." :"
            r@ profile-char @ 0 .r ." : "
            r@ profile-count 2@ 0 d.r cr
            r> list-next @
    repeat
    drop ;

: print-profile-coldef ( -- )
    profile-points @ begin
        dup while
            dup >r
            r@ profile-colondef? @ if
                r@ profile-sourcepos 2@ .sourcepos ." :"
                r@ profile-char @ 3 .r ." : "
                r@ profile-count 2@ 10 d.r
                r@ profile-straight-line @ space 2 .r
                r@ profile-calls @ list-length 4 .r
                cr
            endif
            r> list-next @
    repeat
    drop ;

: 1= ( u -- f )
    1 = ;

: 2= ( u -- f )
    2 = ;

: 3= ( u -- f )
    3 = ;

: 1u> ( u -- f )
    1 u> ;

: call-count+ ( ud1 callp -- ud2 )
    calls-call @ profile-count 2@ d+ ;

: count-dyncalls ( calls -- ud )
    0. rot ['] call-count+ map-list ;

: add-calls ( statistics1 xt-test profpp -- statistics2 xt-test )
    \ add the statistics for callee profpp to the running totals if the
    \ number of static calls to profpp satisfies xt-test ( u -- f ); see
    \ below for what statistics are computed.
    { xt-test p }
    p profile-colondef? @ if
        p profile-calls @ { calls }
        calls list-length { stat }
        stat xt-test execute if
            { d: ud-dyn-callee d: ud-dyn-caller u-stat u-exec-callees u-callees }
            ud-dyn-callee p profile-count 2@ 2dup { d: de } d+
            ud-dyn-caller calls count-dyncalls 2dup { d: dr } d+
            u-stat stat +
            u-exec-callees de dr d<> -
            u-callees 1+
        endif
    endif
    xt-test ;

: print-stat-line ( xt -- )
    >r 0. 0. 0 0 0 r> profile-points @ ['] add-calls map-list drop
    ( ud-dyn-callee ud-dyn-caller u-stat u-exec-callees u-callees )
    6 u.r 7 u.r 7 u.r 12 ud.r 12 ud.r space ;

: print-library-stats ( -- )
    library-calls @ list-length 20 u.r \ static callers
    library-calls @ count-dyncalls 12 ud.r \ dynamic callers
    13 spaces ;

: bblen+ ( u1 callp -- u2 )
    calls-call @ profile-bblen @ + ;

: dyn-bblen+ ( ud1 callp -- ud2 )
    calls-call @ dup profile-count 2@ rot profile-bblen @ 1 m*/ d+ ;
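
\ DYN-BBLEN+ weights each basic block's length by its execution count:
\ M*/ ( d1 n1 +n2 -- d2 ) multiplies the double-cell count by the
\ single-cell length and divides by 1. A small illustration with
\ made-up numbers (not part of the original file), guarded by 0 [IF]
\ so that loading this file does not execute it:
0 [IF]
10. 3 1 m*/ d.   \ a BB of 3 primitives executed 10 times contributes 30
[THEN]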

: print-bb-statistics ( -- )
    ." static     dynamic" cr
    all-bbs @ list-length 6 u.r all-bbs @ count-dyncalls 12 ud.r ."  basic blocks" cr
    0 all-bbs @ ['] bblen+ map-list 6 u.r
    0. all-bbs @ ['] dyn-bblen+ map-list 12 ud.r ."  primitives" cr
    ;

: print-statistics ( -- )
    ." callee exec'd static  dyn-caller  dyn-callee condition" cr
    ['] 0= print-stat-line ." calls to coldefs with 0 callers" cr
    ['] 1= print-stat-line ." calls to coldefs with 1 callers" cr
    ['] 2= print-stat-line ." calls to coldefs with 2 callers" cr
    ['] 3= print-stat-line ." calls to coldefs with 3 callers" cr
    ['] 1u> print-stat-line ." calls to coldefs with >1 callers" cr
    print-library-stats ." library calls" cr
    print-bb-statistics
    ;
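
\ PRINT-STAT-LINE accepts any predicate ( u -- f ) on the number of
\ static callers, so further rows can be produced the same way. A
\ sketch only: 4= and PRINT-4-CALLERS are hypothetical names that are
\ not part of the original file. Guarded by 0 [IF] so that loading this
\ file does not define or run them.
0 [IF]
: 4= ( u -- f )
    4 = ;
: print-4-callers ( -- )
    ['] 4= print-stat-line ." calls to coldefs with 4 callers" cr ;
[THEN]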

: dinc ( profilep -- )
    \ increment the double-cell PROFILE-COUNT of profile point profilep
    profile-count dup 2@ 1. d+ rot 2! ;
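
\ Sketch of DINC in isolation (illustration only, not part of the
\ original file; DEMO-PP is a hypothetical name). The count is a
\ double-cell value, so it is read with 2@ and printed with D. here.
\ Guarded by 0 [IF] so that loading this file does not execute it.
0 [IF]
profile% %alloc constant demo-pp
0. demo-pp profile-count 2!        \ start counting at zero
demo-pp dinc  demo-pp dinc         \ what two executions would do
demo-pp profile-count 2@ d.        \ show the count after two increments
[THEN]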

: profile-this ( -- )
    in-compile,? @ in-compile,? on
    new-profile-point POSTPONE literal POSTPONE dinc
    in-compile,? ! ;

\ Various words trigger PROFILE-THIS. In order to avoid getting
\ several calls to PROFILE-THIS from a compiling word (like ?EXIT), we
\ just wait until the next word is parsed by the text interpreter (in
\ compile state) and call PROFILE-THIS only once then. The whole
\ BEFORE-WORD hooking etc. is there for this.

\ We do this because we use the source position for the profiling
\ information, and there's only one source position for ?EXIT. If we
\ used the threaded code position instead, we would see that ?EXIT
\ compiles to several threaded-code words, and could use different
\ profile points for them. However, usually dealing with the source
\ is more practical.

\ Another benefit is that we can ask for profiling anywhere in a
\ control-flow word (even before it compiles its own stuff).

\ Potential problem: Consider "COMPILING ] [" where COMPILING compiles
\ a whole colon definition (and triggers our profiler), but during the
\ compilation of the colon definition there is no parsing. Afterwards
\ you get interpret state at first (no profiling, either), but after
\ the "]" you get parsing in compile state, and PROFILE-THIS gets
\ called (and compiles code that is never executed). It would be
\ better if we had a way of knowing whether we are in a colon def or
\ not (and used that knowledge instead of STATE).

Defer before-word-profile ( -- )
' noop IS before-word-profile

: before-word1 ( -- )
    before-word-profile defers before-word ;

' before-word1 IS before-word

: profile-this-compiling ( -- )
    state @ if
        profile-this
        ['] noop IS before-word-profile
    endif ;

: cock-profiler ( -- )
    \ as in cocking a gun: arm the profiler, so that the next word
    \ parsed in compile state pulls the trigger (calls PROFILE-THIS)
    ['] profile-this-compiling IS before-word-profile
    [ count-calls? ] [if] \ we are at a non-colondef profile point
        last-colondef-profile @ profile-straight-line off
    [endif]
    ;
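
\ The arm / fire once / disarm idea behind COCK-PROFILER and
\ PROFILE-THIS-COMPILING can be tried in isolation. A sketch only:
\ ON-EVENT, FIRE-ONCE, ARM and EVENT are hypothetical names that are
\ not part of the original file. Guarded by 0 [IF] so that loading this
\ file does not execute it.
0 [IF]
Defer on-event  ' noop IS on-event
: fire-once ( -- )               \ do the work, then disarm
    ." fired "  ['] noop IS on-event ;
: arm ( -- )                     \ arming twice is the same as arming once
    ['] fire-once IS on-event ;
: event ( -- )                   \ stands in for BEFORE-WORD
    on-event ;
arm arm event event              \ only the first EVENT prints "fired "
[THEN]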

: hook-profiling-into ( "name" -- )
    \ make (deferred word) "name" call cock-profiler, too
    ' >body >r :noname
    POSTPONE cock-profiler
    r@ @ compile, \ old hook behaviour
    POSTPONE ;
    r> ! ; \ change hook behaviour

: note-execute ( -- )
    \ end of BB due to execute, dodefer, perform
    profile-this \ should actually happen after the word, but the
    \ error is probably small
    ;

: note-call ( addr -- )
    \ addr is the body address of a called colon def or does handler
    dup ['] (does>2) >body = if \ adjust does handler address
        4 cells here 1 cells - +!
    endif
    { addr }
    current-profile-point @ { lastbb }
    profile-this
    current-profile-point @ { thisbb }
    thisbb new-call { call-node }
    \ addr is held in a local here, so use it instead of OVER
    addr 3 cells + @ ['] dinc >body = if
        \ non-library call
        \ !! update profile-bblenpi of last and current pp
        addr cell+ @ { callee-pp }
        callee-pp profile-postlude @ thisbb profile-callee-postlude !
        call-node callee-pp profile-calls insert-list
    else \ library call
        call-node library-calls insert-list
    endif ;

: prof-compile, ( xt -- )
    in-compile,? @ if
        DEFERS compile, EXIT
    endif
    1 current-profile-point @ profile-bblen +!
    dup CASE
        ['] execute of note-execute endof
        ['] perform of note-execute endof
        dup >does-code if
            dup >does-code note-call
        then
        dup >code-address CASE
            docol: OF dup >body note-call ENDOF
            dodefer: OF note-execute ENDOF
            \ dofield: OF >body @ POSTPONE literal ['] + peephole-compile, EXIT ENDOF
            \ code words and ;code-defined words (code words could be optimized):
        ENDCASE
    ENDCASE
    DEFERS compile, ;

: :-hook-profile ( -- )
    defers :-hook
    next-profile-point-p @
    profile-this
    @ dup last-colondef-profile ! ( current-profile-point )
    1 over profile-bblenpi !
    profile-colondef? on ;

: exit-hook-profile ( -- )
    defers exit-hook
    1 last-colondef-profile @ profile-exits +! ;

: ;-hook-profile ( -- )
    \ ;-hook is called before the POSTPONE EXIT
    defers ;-hook
    last-colondef-profile @ { col }
    current-profile-point @ { bb }
    col profile-bblen @ col profile-prelude +!
    col profile-exits @ 0= if
        col bb profile-tailof !
        bb profile-bblen @ bb profile-callee-postlude @ +
        col profile-postlude !
        1 bb profile-bblenpi !
        \ not counting the EXIT
    endif ;

hook-profiling-into then-like
\ hook-profiling-into if-like \ subsumed by other-control-flow
\ hook-profiling-into ahead-like \ subsumed by other-control-flow
hook-profiling-into other-control-flow
hook-profiling-into begin-like
hook-profiling-into again-like
hook-profiling-into until-like
' :-hook-profile IS :-hook
' prof-compile, IS compile,
' exit-hook-profile IS exit-hook
' ;-hook-profile IS ;-hook
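
\ Usage sketch. The workflow shown is an assumption, since this file
\ does not state how it is meant to be driven: load this file first, so
\ that the hooks are installed while the measured program is compiled,
\ then run the program and print the collected data. MYPROG.FS and
\ MAIN are hypothetical names. Guarded by 0 [IF] so that loading this
\ file does not execute it.
0 [IF]
include myprog.fs        \ hypothetical program to measure
main                     \ hypothetical entry point of that program
print-statistics         \ call, library-call and basic-block statistics
print-profile-coldef     \ one line per colon definition
[THEN]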
