gforth: gforth/prof-inline.fs


\ get some data on potential (partial) inlining

\ Copyright (C) 2004 Free Software Foundation, Inc.

\ This file is part of Gforth.

\ Gforth is free software; you can redistribute it and/or
\ modify it under the terms of the GNU General Public License
\ as published by the Free Software Foundation; either version 2
\ of the License, or (at your option) any later version.

\ This program is distributed in the hope that it will be useful,
\ but WITHOUT ANY WARRANTY; without even the implied warranty of
\ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
\ GNU General Public License for more details.

\ You should have received a copy of the GNU General Public License
\ along with this program; if not, write to the Free Software
\ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.


\ relies on some Gforth internals

\ !! assumption: each file is included only once; otherwise you get
\ the counts for just one of the instances of the file.  This can be
\ fixed by making sure that every source position occurs only once as
\ a profile point.

true constant count-calls? \ do some profiling of colon definitions etc.

\ for true COUNT-CALLS?:

\ What data do I need for evaluating the effectiveness of (partial) inlining?

\ static and dynamic counts of everything:

\ original BB length (histogram and average)
\ BB length with partial inlining (histogram and average)
\   since we cannot partially inline library calls, we use a parameter
\   that represents the amount of partial inlining we can expect there.
\ number of tail calls (original and after partial inlining)
\ number of calls (original and after partial inlining)
\ reason for BB end: call, return, execute, branch

\ how many static calls are there to a word?  How many of the dynamic
\ calls call just a single word?

\ how much does inlining called-once words help?
\ how much does inlining words without control flow help?
\ how much does partial inlining help?
\ what's the overlap?
\ optimizing return-to-returns (tail calls), return-to-calls, call-to-calls

struct
    cell% field list-next
end-struct list%

list%
    cell% 2* field profile-count
    cell% 2* field profile-sourcepos
    cell% field profile-char \ character position in line
    count-calls? [if]
        cell% field profile-colondef? \ is this a colon definition start
        cell% field profile-calls \ static calls to the colon def (calls%)
        cell% field profile-straight-line \ may contain calls, but no other CF
        cell% field profile-calls-from \ static calls in the colon def
    [endif]
end-struct profile% \ profile point

list%
    cell% field calls-call \ ptr to profile point of bb containing the call
end-struct calls%

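\ A minimal sketch (not part of the original file) of why the list words
\ further down can walk both PROFILE% and CALLS% nodes: both structures
\ start with LIST%, so LIST-NEXT should sit at offset 0 in each of them.
\ The name DEMO-LIST-NEXT-FIRST? is hypothetical.

: demo-list-next-first? ( -- f )
    \ f is true if LIST-NEXT adds nothing to a base address of 0
    0 list-next 0= ;

\ Usage: DEMO-LIST-NEXT-FIRST? .  should print -1 (true).
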
variable profile-points \ linked list of profile%
0 profile-points !
variable next-profile-point-p \ the address where the next pp will be stored
profile-points next-profile-point-p !
variable last-colondef-profile \ pointer to the pp of last colon definition
variable current-profile-point
variable library-calls \ list of calls to library colon defs

\ list stuff

: map-list ( ... list xt -- ... )
    { xt } begin { list }
        list while
            list xt execute
            list list-next @
    repeat ;

: drop-1+ drop 1+ ;

: list-length ( list -- u )
    0 swap ['] drop-1+ map-list ;

: insert-list ( listp listpp -- )
    \ insert list node listp into list pointed to by listpp in front
    tuck @ over list-next !
    swap ! ;

: insert-list-end ( listp listppp -- )
    \ insert list node listp into list, with listppp indicating the
    \ position to insert at, and indicating the position behind the
    \ new element afterwards.
    2dup @ insert-list
    swap list-next swap ! ;

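\ A small usage sketch for the list words above (not part of the
\ original file).  It mirrors how PROFILE-POINTS and
\ NEXT-PROFILE-POINT-P are used: one cell holds the list head, another
\ holds the position where the next node is appended.  The names
\ DEMO-HEAD, DEMO-ENDP and DEMO-LIST-LENGTH are hypothetical, and the
\ nodes are never freed.

variable demo-head \ head cell of the demo list
variable demo-endp \ where the next node gets appended

: demo-list-length ( -- u )
    0 demo-head !  demo-head demo-endp !
    list% %alloc demo-endp insert-list-end \ append a first node
    list% %alloc demo-endp insert-list-end \ append a second node
    list% %alloc demo-head insert-list     \ push a third node in front
    demo-head @ list-length ;

\ Usage: DEMO-LIST-LENGTH .  should print 3.
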
\ calls

: new-call ( profile-point -- call )
    calls% %alloc tuck calls-call ! ;

\ profile-point stuff

: new-profile-point ( -- addr )
    profile% %alloc >r
    0. r@ profile-count 2!
    current-sourcepos r@ profile-sourcepos 2!
    >in @ r@ profile-char !
    [ count-calls? ] [if]
        r@ profile-colondef? off
        0 r@ profile-calls !
        r@ profile-straight-line on
        0 r@ profile-calls-from !
    [endif]
    r@ next-profile-point-p insert-list-end
    r@ current-profile-point !
    r> ;

: print-profile ( -- )
    profile-points @ begin
        dup while
            dup >r
            r@ profile-sourcepos 2@ .sourcepos ." :"
            r@ profile-char @ 0 .r ." : "
            r@ profile-count 2@ 0 d.r cr
            r> list-next @
    repeat
    drop ;

: print-profile-coldef ( -- )
    profile-points @ begin
        dup while
            dup >r
            r@ profile-colondef? @ if
                r@ profile-sourcepos 2@ .sourcepos ." :"
                r@ profile-char @ 3 .r ." : "
                r@ profile-count 2@ 10 d.r
                r@ profile-straight-line @ space 2 .r
                r@ profile-calls @ list-length 4 .r
                cr
            endif
            r> list-next @
    repeat
    drop ;

: 1= ( u -- f )
    1 = ;

: 2= ( u -- f )
    2 = ;

: 3= ( u -- f )
    3 = ;

: 1u> ( u -- f )
    1 u> ;

: call-count+ ( ud1 callp -- ud2 )
    calls-call @ profile-count 2@ d+ ;

: add-calls ( ud-dyn-callee1 ud-dyn-caller1 u-stat1 xt-test profpp --
              ud-dyn-callee2 ud-dyn-caller2 u-stat2 xt-test )
    \ add the static and dynamic call counts of profpp to the running
    \ totals, if the number of static calls to profpp satisfies
    \ xt-test ( u -- f )
    { xt-test p }
    p profile-colondef? @ if ( ud-dyn-callee1 ud-dyn-caller1 u-stat1 )
        p profile-calls @ { calls }
        calls list-length { stat }
        stat xt-test execute if ( ud-dyn-callee1 ud-dyn-caller1 u-stat1 )
            stat + >r
            0. calls ['] call-count+ map-list d+ 2>r
            p profile-count 2@ d+
            2r> r>
        endif
    endif
    xt-test ;

: print-stat-line ( xt -- )
    >r 0. 0. 0 r> profile-points @ ['] add-calls map-list drop
    ( ud-dyn-callee ud-dyn-caller u-stat )
    7 u.r 12 ud.r 12 ud.r space ;

: print-statistics ( -- )
    ." static dyn-caller dyn-callee condition" cr
    ['] 0= print-stat-line ." calls to coldefs with 0 callers" cr
    ['] 1= print-stat-line ." calls to coldefs with 1 callers" cr
    ['] 2= print-stat-line ." calls to coldefs with 2 callers" cr
    ['] 3= print-stat-line ." calls to coldefs with 3 callers" cr
    ['] 1u> print-stat-line ." calls to coldefs with >1 callers" cr
    ;

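\ PRINT-STAT-LINE takes any predicate with the stack effect ( u -- f ),
\ so further conditions can be added in the same style.  A hedged
\ sketch: the threshold 4 and the names DEMO-4U> and
\ DEMO-PRINT-HOT-LINE are hypothetical and not part of the original
\ file.

: demo-4u> ( u -- f )
    4 u> ;

: demo-print-hot-line ( -- )
    \ one extra statistics line for coldefs with more than 4 static callers
    ['] demo-4u> print-stat-line ." calls to coldefs with >4 callers" cr ;

\ Usage, after the profiled program has run: DEMO-PRINT-HOT-LINE
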
: dinc ( profilep -- )
    \ increment the double-cell count of the profile point profilep
    profile-count dup 2@ 1. d+ rot 2! ;

: profile-this ( -- )
    new-profile-point POSTPONE literal POSTPONE dinc ;

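\ A minimal sketch of what the code compiled by PROFILE-THIS does at run
\ time: each execution passes the profile point to DINC, which bumps its
\ double-cell counter.  The sketch allocates a scratch profile point by
\ hand instead of calling NEW-PROFILE-POINT, so nothing is added to
\ PROFILE-POINTS.  The name DEMO-DINC is hypothetical.

: demo-dinc ( -- ud )
    profile% %alloc >r
    0. r@ profile-count 2! \ start from zero
    r@ dinc  r@ dinc       \ "execute" the profile point twice
    r@ profile-count 2@
    r> free throw ;

\ Usage: DEMO-DINC D.  should print 2.
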
\ Various words trigger PROFILE-THIS.  In order to avoid getting
\ several calls to PROFILE-THIS from a compiling word (like ?EXIT), we
\ just wait until the next word is parsed by the text interpreter (in
\ compile state) and call PROFILE-THIS only once then.  The whole
\ BEFORE-WORD hooking etc. is there for this.

\ We do this because we use the source position for the profiling
\ information, and there's only one source position for ?EXIT.  If we
\ used the threaded-code position instead, we would see that ?EXIT
\ compiles to several threaded-code words, and could use different
\ profile points for them.  However, dealing with the source is
\ usually more practical.

\ Another benefit is that we can ask for profiling anywhere in a
\ control-flow word (even before it compiles its own stuff).

\ Potential problem: Consider "COMPILING ] [" where COMPILING compiles
\ a whole colon definition (and triggers our profiler), but during the
\ compilation of the colon definition there is no parsing.  Afterwards
\ you get interpret state at first (no profiling, either), but after
\ the "]" you get parsing in compile state, and PROFILE-THIS gets
\ called (and compiles code that is never executed).  It would be
\ better if we had a way of knowing whether we are in a colon def or
\ not (and used that knowledge instead of STATE).

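\ A minimal sketch of the one-shot idea described above, using a private
\ deferred word instead of BEFORE-WORD.  All DEMO- names are
\ hypothetical and not part of Gforth; the sketch only assumes DEFER,
\ IS and NOOP.

defer demo-hook ( -- )
' noop is demo-hook
variable demo-fired  0 demo-fired !

: demo-fire-once ( -- )
    \ count one firing, then disarm the hook again
    1 demo-fired +!
    ['] noop is demo-hook ;

: demo-arm ( -- )
    \ arming an already armed hook changes nothing
    ['] demo-fire-once is demo-hook ;

: demo-one-shot ( -- u )
    \ arm twice, run the hook three times, report the number of firings
    0 demo-fired !
    demo-arm demo-arm
    demo-hook demo-hook demo-hook
    demo-fired @ ;

\ Usage: DEMO-ONE-SHOT .  should print 1.
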
Defer before-word-profile ( -- )
' noop IS before-word-profile

: before-word1 ( -- )
    before-word-profile defers before-word ;

' before-word1 IS before-word

: profile-this-compiling ( -- )
    state @ if
        profile-this
        ['] noop IS before-word-profile
    endif ;

: cock-profiler ( -- )
    \ as in cock the gun - pull the trigger
    ['] profile-this-compiling IS before-word-profile
    [ count-calls? ] [if] \ we are at a non-colondef profile point
        last-colondef-profile @ profile-straight-line off
    [endif]
;

: hook-profiling-into ( "name" -- )
    \ make (deferred word) "name" call cock-profiler, too
    ' >body >r :noname
    POSTPONE cock-profiler
    r@ @ compile, \ old hook behaviour
    POSTPONE ;
    r> ! ; \ change hook behaviour

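\ A small sketch of the same wrapping technique applied to a private
\ deferred word, so it can be tried without touching the compiler
\ hooks.  Like HOOK-PROFILING-INTO, it assumes that the body of a
\ deferred word holds the xt of its current behaviour (a Gforth
\ internal).  All DEMO- names are hypothetical.

variable demo-hits  0 demo-hits !

: demo-count ( -- )
    1 demo-hits +! ;

defer demo-action ( -- )
' noop is demo-action

: demo-wrap ( "name" -- )
    \ make (deferred word) "name" call DEMO-COUNT before its old behaviour
    ' >body >r :noname
    POSTPONE demo-count
    r@ @ compile, \ old behaviour
    POSTPONE ;
    r> ! ; \ install the wrapper

demo-wrap demo-action

\ Usage: DEMO-ACTION DEMO-ACTION DEMO-HITS @ .  should print 2 when
\ DEMO-HITS was 0 before.
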
: note-execute ( -- )
    \ end of BB due to execute
;

: note-call ( addr -- )
    \ addr is the body address of a called colon def or does handler
    dup 3 cells + @ ['] dinc >body = if ( addr )
        current-profile-point @ new-call over cell+ @ profile-calls insert-list
    endif
    drop ;

: prof-compile, ( xt -- )
    dup >does-code if
        dup >does-code note-call
    then
    dup >code-address CASE
        docol: OF dup >body note-call ENDOF
        dodefer: OF note-execute ENDOF
        dofield: OF >body @ ['] lit+ peephole-compile, , EXIT ENDOF
        \ dofield: OF >body @ POSTPONE literal ['] + peephole-compile, EXIT ENDOF
        \ code words and ;code-defined words (code words could be optimized):
        dup in-dictionary? IF drop POSTPONE literal ['] execute peephole-compile, EXIT THEN
    ENDCASE
    DEFERS compile, ;

\ hook-profiling-into then-like
\ \ hook-profiling-into if-like \ subsumed by other-control-flow
\ \ hook-profiling-into ahead-like \ subsumed by other-control-flow
\ hook-profiling-into other-control-flow
\ hook-profiling-into begin-like
\ hook-profiling-into again-like
\ hook-profiling-into until-like

: :-hook-profile ( -- )
    defers :-hook
    next-profile-point-p @
    profile-this
    @ dup last-colondef-profile !
    profile-colondef? on ;

' :-hook-profile IS :-hook
' prof-compile, IS compile,
