Name | Last modified | Size | Description | |
---|---|---|---|---|
Parent Directory | - | |||
combine.awk | 26-Nov-1995 11:13 | 244 | ||
cross.peep | 26-Nov-1995 11:13 | 354K | ||
gs.peep | 26-Nov-1995 11:13 | 405K | ||
peep.awk | 26-Nov-1995 11:13 | 222 | ||
percent | 06-Mar-2019 09:39 | 1.4M | ||
percent.awk | 06-Mar-2019 09:39 | 178 | ||
predict-dtc | 16-Dec-1995 18:52 | 2.7K | ||
predict-dtc.awk | 10-Dec-1995 11:29 | 339 | ||
predict-itc | 16-Dec-1995 18:53 | 2.9K | ||
predict-itc.awk | 10-Dec-1995 11:26 | 328 | ||
prims2x.peep | 26-Nov-1995 11:13 | 346K | ||
sorted | 26-Nov-1995 11:13 | 883K | ||
These data are derived from dynamic traces of real applications running on Gforth, not from small benchmarks that might provide unrealistic results:
(Here, I use the term "primitive" for anything that performs a NEXT (or NEXT1); this includes routines like docol and dovar).
The files included here are:
You can get all of them in tarred and gzipped form from http://mips.complang.tuwien.ac.at/forth/peep.tar.gz.
Not all of the dynamic sequences shown in the data can be combined into one primitive by a simple compiler. E.g., if they contain a primitive that changes the control flow (e.g., docol, ;s, branch, dodefer) on any but the last position, the sequence cannot be combined (easily). Moreover, when two sequences overlap, only one of them can be combined.
The following table shows how many distinct sequences of a given length appear in the applications and what portion of the dynamic sequences is covered by the top 20 (50,100,200) most frequently occuring sequences. Note that the number of dynamic sequences of length n is #dynprimitives-n+1.
length distinct %top20 %top50 %top100 %top200 1 160 75.53 94.219 99.832 100 2 1278 31.1074 48.8859 65.3516 83.112 3 3321 16.7137 31.2978 44.3291 62.3914 4 5489 13.1306 25.8037 36.8787 53.2986 5 7423 11.8311 23.5362 34.1386 49.9976 6 9138 11.0131 22.015 32.3519 47.7475
Eliminating sequences, that cannot be combined because they contain a control flow change in the wrong place reduces the portion of the length-2 top-100 from 65% to 45%. I do not have an accurate result for the effect of overlapping sequences; however, for length-2 sequences the overlap can at most halve the number of combinable sequences.
So, the results are encouraging: Even if we just combine length-2 sequences, we can reduce the dynamic primitive count by at least 22% (probably significantly more) by adding just 100 new primitives that represent the most frequent combinations. If we also consider longer sequences, the effect should be even more pronounced. It also appears that most of the frequent sequences are in general use; they usually appear frequently in more than one trace.
Wil Baden implemented 'Pinhole optimization' in ThisForth. Rob Chapman does similar stuff in Timbre.
Peephole optimization itself goes back into the dark ages of computer science, but schuetz92 is the first reference I have that applies it to Forth intermediate code.
@InProceedings{schuetz92, author = "Udo Sch{\"u}tz", title = "{Optimierung von Fadencode}", booktitle = "{FORTH-Tagung 1992}", year = "1992", organization = "Forth Gesellscheft e.V." }
Proebsting95 peephole optimizes already on the tree intermediate representation, enabling more combinations. The combinations are selected automatically and a custom interpreter is generated automatically for the application.
@InProceedings{proebsting95, author = "Todd A. Proebsting", title = "Optimizing an ANSI C Interpreter with Superoperators", crossref = "popl95", pages = "322--332" } @Proceedings{popl95, booktitle = "Principles of Programming Languages (POPL '95)", title = "Principles of Programming Languages (POPL '95)", year = "1995", key = "POPL '95" }Anton Ertl