gforth: gforth/Attic/gforth.ds


1 : anton 1.1 \input texinfo @c -*-texinfo-*-
2 :     @comment The source is gforth.ds, from which gforth.texi is generated
3 :     @comment %**start of header (This is for running Texinfo on a region.)
4 : anton 1.4 @setfilename gforth.info
5 : anton 1.17 @settitle Gforth Manual
6 : anton 1.4 @comment @setchapternewpage odd
7 : anton 1.1 @comment %**end of header (This is for running Texinfo on a region.)
8 :    
9 :     @ifinfo
10 : anton 1.30 This file documents Gforth 0.2
11 : anton 1.1
12 : anton 1.32 Copyright @copyright{} 1995,1996 Free Software Foundation, Inc.
13 : anton 1.1
14 :     Permission is granted to make and distribute verbatim copies of
15 :     this manual provided the copyright notice and this permission notice
16 :     are preserved on all copies.
17 :    
18 : anton 1.4 @ignore
19 : anton 1.1 Permission is granted to process this file through TeX and print the
20 :     results, provided the printed document carries a copying permission
21 :     notice identical to this one except for the removal of this paragraph
22 :     (this paragraph not being relevant to the printed manual).
23 :    
24 : anton 1.4 @end ignore
25 : anton 1.1 Permission is granted to copy and distribute modified versions of this
26 :     manual under the conditions for verbatim copying, provided also that the
27 :     sections entitled "Distribution" and "General Public License" are
28 :     included exactly as in the original, and provided that the entire
29 :     resulting derived work is distributed under the terms of a permission
30 :     notice identical to this one.
31 :    
32 :     Permission is granted to copy and distribute translations of this manual
33 :     into another language, under the above conditions for modified versions,
34 :     except that the sections entitled "Distribution" and "General Public
35 :     License" may be included in a translation approved by the author instead
36 :     of in the original English.
37 :     @end ifinfo
38 :    
39 : anton 1.24 @finalout
40 : anton 1.1 @titlepage
41 :     @sp 10
42 : anton 1.17 @center @titlefont{Gforth Manual}
43 : anton 1.1 @sp 2
44 : anton 1.30 @center for version 0.2
45 : anton 1.1 @sp 2
46 :     @center Anton Ertl
47 : anton 1.25 @center Bernd Paysan
48 : anton 1.17 @sp 3
49 :     @center This manual is under construction
50 : anton 1.1
51 :     @comment The following two commands start the copyright page.
52 :     @page
53 :     @vskip 0pt plus 1filll
54 : anton 1.32 Copyright @copyright{} 1995,1996 Free Software Foundation, Inc.
55 : anton 1.1
56 :     @comment !! Published by ... or You can get a copy of this manual ...
57 :    
58 :     Permission is granted to make and distribute verbatim copies of
59 :     this manual provided the copyright notice and this permission notice
60 :     are preserved on all copies.
61 :    
62 :     Permission is granted to copy and distribute modified versions of this
63 :     manual under the conditions for verbatim copying, provided also that the
64 :     sections entitled "Distribution" and "General Public License" are
65 :     included exactly as in the original, and provided that the entire
66 :     resulting derived work is distributed under the terms of a permission
67 :     notice identical to this one.
68 :    
69 :     Permission is granted to copy and distribute translations of this manual
70 :     into another language, under the above conditions for modified versions,
71 :     except that the sections entitled "Distribution" and "General Public
72 :     License" may be included in a translation approved by the author instead
73 :     of in the original English.
74 :     @end titlepage
75 :    
76 :    
77 :     @node Top, License, (dir), (dir)
78 :     @ifinfo
79 : anton 1.17 Gforth is a free implementation of ANS Forth available on many
80 : anton 1.30 personal machines. This manual corresponds to version 0.2.
81 : anton 1.1 @end ifinfo
82 :    
83 :     @menu
84 : anton 1.4 * License::
85 : anton 1.17 * Goals:: About the Gforth Project
86 : anton 1.4 * Other Books:: Things you might want to read
87 : anton 1.17 * Invocation:: Starting Gforth
88 :     * Words:: Forth words available in Gforth
89 : anton 1.40 * Tools:: Programming tools
90 : anton 1.4 * ANS conformance:: Implementation-defined options etc.
91 : anton 1.17 * Model:: The abstract machine of Gforth
92 : anton 1.34 * Integrating Gforth:: Forth as scripting language for applications.
93 : anton 1.17 * Emacs and Gforth:: The Gforth Mode
94 : anton 1.4 * Internals:: Implementation details
95 :     * Bugs:: How to report them
96 : anton 1.29 * Origin:: Authors and ancestors of Gforth
97 : anton 1.4 * Word Index:: An item for each Forth word
98 :     * Node Index:: An item for each node
99 : anton 1.1 @end menu
100 :    
101 :     @node License, Goals, Top, Top
102 : pazsan 1.20 @unnumbered GNU GENERAL PUBLIC LICENSE
103 :     @center Version 2, June 1991
104 :    
105 :     @display
106 :     Copyright @copyright{} 1989, 1991 Free Software Foundation, Inc.
107 :     675 Mass Ave, Cambridge, MA 02139, USA
108 :    
109 :     Everyone is permitted to copy and distribute verbatim copies
110 :     of this license document, but changing it is not allowed.
111 :     @end display
112 :    
113 :     @unnumberedsec Preamble
114 :    
115 :     The licenses for most software are designed to take away your
116 :     freedom to share and change it. By contrast, the GNU General Public
117 :     License is intended to guarantee your freedom to share and change free
118 :     software---to make sure the software is free for all its users. This
119 :     General Public License applies to most of the Free Software
120 :     Foundation's software and to any other program whose authors commit to
121 :     using it. (Some other Free Software Foundation software is covered by
122 :     the GNU Library General Public License instead.) You can apply it to
123 :     your programs, too.
124 :    
125 :     When we speak of free software, we are referring to freedom, not
126 :     price. Our General Public Licenses are designed to make sure that you
127 :     have the freedom to distribute copies of free software (and charge for
128 :     this service if you wish), that you receive source code or can get it
129 :     if you want it, that you can change the software or use pieces of it
130 :     in new free programs; and that you know you can do these things.
131 :    
132 :     To protect your rights, we need to make restrictions that forbid
133 :     anyone to deny you these rights or to ask you to surrender the rights.
134 :     These restrictions translate to certain responsibilities for you if you
135 :     distribute copies of the software, or if you modify it.
136 :    
137 :     For example, if you distribute copies of such a program, whether
138 :     gratis or for a fee, you must give the recipients all the rights that
139 :     you have. You must make sure that they, too, receive or can get the
140 :     source code. And you must show them these terms so they know their
141 :     rights.
142 :    
143 :     We protect your rights with two steps: (1) copyright the software, and
144 :     (2) offer you this license which gives you legal permission to copy,
145 :     distribute and/or modify the software.
146 :    
147 :     Also, for each author's protection and ours, we want to make certain
148 :     that everyone understands that there is no warranty for this free
149 :     software. If the software is modified by someone else and passed on, we
150 :     want its recipients to know that what they have is not the original, so
151 :     that any problems introduced by others will not reflect on the original
152 :     authors' reputations.
153 :    
154 :     Finally, any free program is threatened constantly by software
155 :     patents. We wish to avoid the danger that redistributors of a free
156 :     program will individually obtain patent licenses, in effect making the
157 :     program proprietary. To prevent this, we have made it clear that any
158 :     patent must be licensed for everyone's free use or not licensed at all.
159 :    
160 :     The precise terms and conditions for copying, distribution and
161 :     modification follow.
162 :    
163 :     @iftex
164 :     @unnumberedsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
165 :     @end iftex
166 :     @ifinfo
167 :     @center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
168 :     @end ifinfo
169 :    
170 :     @enumerate 0
171 :     @item
172 :     This License applies to any program or other work which contains
173 :     a notice placed by the copyright holder saying it may be distributed
174 :     under the terms of this General Public License. The ``Program'', below,
175 :     refers to any such program or work, and a ``work based on the Program''
176 :     means either the Program or any derivative work under copyright law:
177 :     that is to say, a work containing the Program or a portion of it,
178 :     either verbatim or with modifications and/or translated into another
179 :     language. (Hereinafter, translation is included without limitation in
180 :     the term ``modification''.) Each licensee is addressed as ``you''.
181 :    
182 :     Activities other than copying, distribution and modification are not
183 :     covered by this License; they are outside its scope. The act of
184 :     running the Program is not restricted, and the output from the Program
185 :     is covered only if its contents constitute a work based on the
186 :     Program (independent of having been made by running the Program).
187 :     Whether that is true depends on what the Program does.
188 :    
189 :     @item
190 :     You may copy and distribute verbatim copies of the Program's
191 :     source code as you receive it, in any medium, provided that you
192 :     conspicuously and appropriately publish on each copy an appropriate
193 :     copyright notice and disclaimer of warranty; keep intact all the
194 :     notices that refer to this License and to the absence of any warranty;
195 :     and give any other recipients of the Program a copy of this License
196 :     along with the Program.
197 :    
198 :     You may charge a fee for the physical act of transferring a copy, and
199 :     you may at your option offer warranty protection in exchange for a fee.
200 :    
201 :     @item
202 :     You may modify your copy or copies of the Program or any portion
203 :     of it, thus forming a work based on the Program, and copy and
204 :     distribute such modifications or work under the terms of Section 1
205 :     above, provided that you also meet all of these conditions:
206 :    
207 :     @enumerate a
208 :     @item
209 :     You must cause the modified files to carry prominent notices
210 :     stating that you changed the files and the date of any change.
211 :    
212 :     @item
213 :     You must cause any work that you distribute or publish, that in
214 :     whole or in part contains or is derived from the Program or any
215 :     part thereof, to be licensed as a whole at no charge to all third
216 :     parties under the terms of this License.
217 :    
218 :     @item
219 :     If the modified program normally reads commands interactively
220 :     when run, you must cause it, when started running for such
221 :     interactive use in the most ordinary way, to print or display an
222 :     announcement including an appropriate copyright notice and a
223 :     notice that there is no warranty (or else, saying that you provide
224 :     a warranty) and that users may redistribute the program under
225 :     these conditions, and telling the user how to view a copy of this
226 :     License. (Exception: if the Program itself is interactive but
227 :     does not normally print such an announcement, your work based on
228 :     the Program is not required to print an announcement.)
229 :     @end enumerate
230 :    
231 :     These requirements apply to the modified work as a whole. If
232 :     identifiable sections of that work are not derived from the Program,
233 :     and can be reasonably considered independent and separate works in
234 :     themselves, then this License, and its terms, do not apply to those
235 :     sections when you distribute them as separate works. But when you
236 :     distribute the same sections as part of a whole which is a work based
237 :     on the Program, the distribution of the whole must be on the terms of
238 :     this License, whose permissions for other licensees extend to the
239 :     entire whole, and thus to each and every part regardless of who wrote it.
240 :    
241 :     Thus, it is not the intent of this section to claim rights or contest
242 :     your rights to work written entirely by you; rather, the intent is to
243 :     exercise the right to control the distribution of derivative or
244 :     collective works based on the Program.
245 :    
246 :     In addition, mere aggregation of another work not based on the Program
247 :     with the Program (or with a work based on the Program) on a volume of
248 :     a storage or distribution medium does not bring the other work under
249 :     the scope of this License.
250 :    
251 :     @item
252 :     You may copy and distribute the Program (or a work based on it,
253 :     under Section 2) in object code or executable form under the terms of
254 :     Sections 1 and 2 above provided that you also do one of the following:
255 :    
256 :     @enumerate a
257 :     @item
258 :     Accompany it with the complete corresponding machine-readable
259 :     source code, which must be distributed under the terms of Sections
260 :     1 and 2 above on a medium customarily used for software interchange; or,
261 :    
262 :     @item
263 :     Accompany it with a written offer, valid for at least three
264 :     years, to give any third party, for a charge no more than your
265 :     cost of physically performing source distribution, a complete
266 :     machine-readable copy of the corresponding source code, to be
267 :     distributed under the terms of Sections 1 and 2 above on a medium
268 :     customarily used for software interchange; or,
269 :    
270 :     @item
271 :     Accompany it with the information you received as to the offer
272 :     to distribute corresponding source code. (This alternative is
273 :     allowed only for noncommercial distribution and only if you
274 :     received the program in object code or executable form with such
275 :     an offer, in accord with Subsection b above.)
276 :     @end enumerate
277 :    
278 :     The source code for a work means the preferred form of the work for
279 :     making modifications to it. For an executable work, complete source
280 :     code means all the source code for all modules it contains, plus any
281 :     associated interface definition files, plus the scripts used to
282 :     control compilation and installation of the executable. However, as a
283 :     special exception, the source code distributed need not include
284 :     anything that is normally distributed (in either source or binary
285 :     form) with the major components (compiler, kernel, and so on) of the
286 :     operating system on which the executable runs, unless that component
287 :     itself accompanies the executable.
288 :    
289 :     If distribution of executable or object code is made by offering
290 :     access to copy from a designated place, then offering equivalent
291 :     access to copy the source code from the same place counts as
292 :     distribution of the source code, even though third parties are not
293 :     compelled to copy the source along with the object code.
294 :    
295 :     @item
296 :     You may not copy, modify, sublicense, or distribute the Program
297 :     except as expressly provided under this License. Any attempt
298 :     otherwise to copy, modify, sublicense or distribute the Program is
299 :     void, and will automatically terminate your rights under this License.
300 :     However, parties who have received copies, or rights, from you under
301 :     this License will not have their licenses terminated so long as such
302 :     parties remain in full compliance.
303 :    
304 :     @item
305 :     You are not required to accept this License, since you have not
306 :     signed it. However, nothing else grants you permission to modify or
307 :     distribute the Program or its derivative works. These actions are
308 :     prohibited by law if you do not accept this License. Therefore, by
309 :     modifying or distributing the Program (or any work based on the
310 :     Program), you indicate your acceptance of this License to do so, and
311 :     all its terms and conditions for copying, distributing or modifying
312 :     the Program or works based on it.
313 :    
314 :     @item
315 :     Each time you redistribute the Program (or any work based on the
316 :     Program), the recipient automatically receives a license from the
317 :     original licensor to copy, distribute or modify the Program subject to
318 :     these terms and conditions. You may not impose any further
319 :     restrictions on the recipients' exercise of the rights granted herein.
320 :     You are not responsible for enforcing compliance by third parties to
321 :     this License.
322 :    
323 :     @item
324 :     If, as a consequence of a court judgment or allegation of patent
325 :     infringement or for any other reason (not limited to patent issues),
326 :     conditions are imposed on you (whether by court order, agreement or
327 :     otherwise) that contradict the conditions of this License, they do not
328 :     excuse you from the conditions of this License. If you cannot
329 :     distribute so as to satisfy simultaneously your obligations under this
330 :     License and any other pertinent obligations, then as a consequence you
331 :     may not distribute the Program at all. For example, if a patent
332 :     license would not permit royalty-free redistribution of the Program by
333 :     all those who receive copies directly or indirectly through you, then
334 :     the only way you could satisfy both it and this License would be to
335 :     refrain entirely from distribution of the Program.
336 :    
337 :     If any portion of this section is held invalid or unenforceable under
338 :     any particular circumstance, the balance of the section is intended to
339 :     apply and the section as a whole is intended to apply in other
340 :     circumstances.
341 :    
342 :     It is not the purpose of this section to induce you to infringe any
343 :     patents or other property right claims or to contest validity of any
344 :     such claims; this section has the sole purpose of protecting the
345 :     integrity of the free software distribution system, which is
346 :     implemented by public license practices. Many people have made
347 :     generous contributions to the wide range of software distributed
348 :     through that system in reliance on consistent application of that
349 :     system; it is up to the author/donor to decide if he or she is willing
350 :     to distribute software through any other system and a licensee cannot
351 :     impose that choice.
352 :    
353 :     This section is intended to make thoroughly clear what is believed to
354 :     be a consequence of the rest of this License.
355 :    
356 :     @item
357 :     If the distribution and/or use of the Program is restricted in
358 :     certain countries either by patents or by copyrighted interfaces, the
359 :     original copyright holder who places the Program under this License
360 :     may add an explicit geographical distribution limitation excluding
361 :     those countries, so that distribution is permitted only in or among
362 :     countries not thus excluded. In such case, this License incorporates
363 :     the limitation as if written in the body of this License.
364 :    
365 :     @item
366 :     The Free Software Foundation may publish revised and/or new versions
367 :     of the General Public License from time to time. Such new versions will
368 :     be similar in spirit to the present version, but may differ in detail to
369 :     address new problems or concerns.
370 :    
371 :     Each version is given a distinguishing version number. If the Program
372 :     specifies a version number of this License which applies to it and ``any
373 :     later version'', you have the option of following the terms and conditions
374 :     either of that version or of any later version published by the Free
375 :     Software Foundation. If the Program does not specify a version number of
376 :     this License, you may choose any version ever published by the Free Software
377 :     Foundation.
378 :    
379 :     @item
380 :     If you wish to incorporate parts of the Program into other free
381 :     programs whose distribution conditions are different, write to the author
382 :     to ask for permission. For software which is copyrighted by the Free
383 :     Software Foundation, write to the Free Software Foundation; we sometimes
384 :     make exceptions for this. Our decision will be guided by the two goals
385 :     of preserving the free status of all derivatives of our free software and
386 :     of promoting the sharing and reuse of software generally.
387 :    
388 :     @iftex
389 :     @heading NO WARRANTY
390 :     @end iftex
391 :     @ifinfo
392 :     @center NO WARRANTY
393 :     @end ifinfo
394 :    
395 :     @item
396 :     BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
397 :     FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
398 :     OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
399 :     PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
400 :     OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
401 :     MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
402 :     TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
403 :     PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
404 :     REPAIR OR CORRECTION.
405 :    
406 :     @item
407 :     IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
408 :     WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
409 :     REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
410 :     INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
411 :     OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
412 :     TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
413 :     YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
414 :     PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
415 :     POSSIBILITY OF SUCH DAMAGES.
416 :     @end enumerate
417 :    
418 :     @iftex
419 :     @heading END OF TERMS AND CONDITIONS
420 :     @end iftex
421 :     @ifinfo
422 :     @center END OF TERMS AND CONDITIONS
423 :     @end ifinfo
424 :    
425 :     @page
426 :     @unnumberedsec How to Apply These Terms to Your New Programs
427 :    
428 :     If you develop a new program, and you want it to be of the greatest
429 :     possible use to the public, the best way to achieve this is to make it
430 :     free software which everyone can redistribute and change under these terms.
431 :    
432 :     To do so, attach the following notices to the program. It is safest
433 :     to attach them to the start of each source file to most effectively
434 :     convey the exclusion of warranty; and each file should have at least
435 :     the ``copyright'' line and a pointer to where the full notice is found.
436 :    
437 :     @smallexample
438 :     @var{one line to give the program's name and a brief idea of what it does.}
439 :     Copyright (C) 19@var{yy} @var{name of author}
440 :    
441 :     This program is free software; you can redistribute it and/or modify
442 :     it under the terms of the GNU General Public License as published by
443 :     the Free Software Foundation; either version 2 of the License, or
444 :     (at your option) any later version.
445 :    
446 :     This program is distributed in the hope that it will be useful,
447 :     but WITHOUT ANY WARRANTY; without even the implied warranty of
448 :     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
449 :     GNU General Public License for more details.
450 :    
451 :     You should have received a copy of the GNU General Public License
452 :     along with this program; if not, write to the Free Software
453 :     Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
454 :     @end smallexample
455 :    
456 :     Also add information on how to contact you by electronic and paper mail.
457 :    
458 :     If the program is interactive, make it output a short notice like this
459 :     when it starts in an interactive mode:
460 :    
461 :     @smallexample
462 :     Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
463 :     Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
464 :     type `show w'.
465 :     This is free software, and you are welcome to redistribute it
466 :     under certain conditions; type `show c' for details.
467 :     @end smallexample
468 :    
469 :     The hypothetical commands @samp{show w} and @samp{show c} should show
470 :     the appropriate parts of the General Public License. Of course, the
471 :     commands you use may be called something other than @samp{show w} and
472 :     @samp{show c}; they could even be mouse-clicks or menu items---whatever
473 :     suits your program.
474 :    
475 :     You should also get your employer (if you work as a programmer) or your
476 :     school, if any, to sign a ``copyright disclaimer'' for the program, if
477 :     necessary. Here is a sample; alter the names:
478 :    
479 :     @smallexample
480 :     Yoyodyne, Inc., hereby disclaims all copyright interest in the program
481 :     `Gnomovision' (which makes passes at compilers) written by James Hacker.
482 :    
483 :     @var{signature of Ty Coon}, 1 April 1989
484 :     Ty Coon, President of Vice
485 :     @end smallexample
486 :    
487 :     This General Public License does not permit incorporating your program into
488 :     proprietary programs. If your program is a subroutine library, you may
489 :     consider it more useful to permit linking proprietary applications with the
490 :     library. If this is what you want to do, use the GNU Library General
491 :     Public License instead of this License.
492 : anton 1.1
493 :     @iftex
494 : pazsan 1.23 @node Preface
495 :     @comment node-name, next, previous, up
496 : anton 1.1 @unnumbered Preface
497 : pazsan 1.23 @cindex Preface
498 : anton 1.17 This manual documents Gforth. The reader is expected to know
499 : anton 1.1 Forth. This manual is primarily a reference manual. @xref{Other Books},
500 :     for introductory material.
501 :     @end iftex
502 :    
503 :     @node Goals, Other Books, License, Top
504 :     @comment node-name, next, previous, up
505 : anton 1.17 @chapter Goals of Gforth
506 : anton 1.1 @cindex Goals
507 : anton 1.17 The goal of the Gforth Project is to develop a standard model for
508 : anton 1.1 ANSI Forth. This can be split into several subgoals:
509 :    
510 :     @itemize @bullet
511 :     @item
512 : anton 1.17 Gforth should conform to the ANSI Forth standard.
513 : anton 1.1 @item
514 :     It should be a model, i.e. it should define all the
515 :     implementation-dependent things.
516 :     @item
517 :     It should become standard, i.e. widely accepted and used. This goal
518 :     is the most difficult one.
519 :     @end itemize
520 :    
521 : anton 1.17 To achieve these goals Gforth should be
522 : anton 1.1 @itemize @bullet
523 :     @item
524 :     Similar to previous models (fig-Forth, F83)
525 :     @item
526 :     Powerful. It should provide for all the things that are considered
527 :     necessary today and even some that are not yet considered necessary.
528 :     @item
529 :     Efficient. It should not get the reputation of being exceptionally
530 :     slow.
531 :     @item
532 :     Free.
533 :     @item
534 :     Available on many machines/easy to port.
535 :     @end itemize
536 :    
537 : anton 1.17 Have we achieved these goals? Gforth conforms to the ANS Forth
538 :     standard. It may be considered a model, but we have not yet documented
539 : anton 1.1 which parts of the model are stable and which parts we are likely to
540 : anton 1.17 change. It certainly has not yet become a de facto standard. It has some
541 :     similarities to and some differences from previous models. It has some
542 :     powerful features, but not yet everything that we envisioned. We
543 :     certainly have achieved our execution speed goals (@pxref{Performance}).
544 :     It is free and available on many machines.
545 : anton 1.1
546 :     @node Other Books, Invocation, Goals, Top
547 :     @chapter Other books on ANS Forth
548 :    
549 :     As the standard is relatively new, there are not many books out yet. We
550 : anton 1.17 do not recommend learning Forth with Gforth and a book that is
551 : anton 1.1 not written for ANS Forth, as you will not be able to tell your own
552 :     mistakes from the book's deviations from the standard.
553 :    
554 :     There is, of course, the standard, the definitive reference if you want to
555 : anton 1.19 write ANS Forth programs. It is available in printed form from the
556 :     American National Standards Institute Sales Department (Tel.: USA (212) 642-4900;
557 :     Fax.: USA (212) 302-1286) as document @cite{X3.215-1994} for about $200. You
558 :     can also get it from Global Engineering Documents (Tel.: USA (800)
559 :     854-7179; Fax.: (303) 843-9880) for about $300.
560 :    
561 :     @cite{dpANS6}, the last draft of the standard, which was then submitted to ANSI
562 :     for publication, is available electronically and for free in an MS Word
563 :     format, and it has been converted to HTML. Pointers to these
564 :     versions can be found through
565 : anton 1.24 @*@file{http://www.complang.tuwien.ac.at/projects/forth.html}.
566 : anton 1.1
567 : anton 1.21 @cite{Forth: The New Model} by Jack Woehr (Prentice-Hall, 1993) is an
568 : anton 1.1 introductory book based on a draft version of the standard. It does not
569 :     cover the whole standard. It also contains interesting background
570 :     information (Jack Woehr was on the ANS Forth Technical Committee). It is
571 :     not appropriate for complete newbies, but programmers experienced in
572 :     other languages should find it OK.
573 :    
574 :     @node Invocation, Words, Other Books, Top
575 :     @chapter Invocation
576 :    
577 :     You will usually just say @code{gforth}. In many other cases the default
578 : anton 1.17 Gforth image will be invoked like this:
579 : anton 1.1
580 :     @example
581 :     gforth [files] [-e forth-code]
582 :     @end example
583 :    
584 :     executing the contents of the files and the Forth code in the order they
585 :     are given.
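
For instance, assuming two hypothetical source files @file{defs.fs} and
@file{main.fs} in the current directory, the following invocation would
load @file{defs.fs}, then @file{main.fs}, and then evaluate the string
given to @code{-e}:

@example
gforth defs.fs main.fs -e "1 2 + . bye"
@end example

Here @code{bye} leaves Gforth once the preceding code has run; without
it, Gforth would normally continue with an interactive session.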
586 :    
587 :     In general, the command line looks like this:
588 :    
589 :     @example
590 :     gforth [initialization options] [image-specific options]
591 :     @end example
592 :    
593 :     The initialization options must come before the rest of the command
594 :     line. They are:
595 :    
596 :     @table @code
597 :     @item --image-file @var{file}
598 : pazsan 1.20 @item -i @var{file}
599 : anton 1.1 Loads the Forth image @var{file} instead of the default
600 :     @file{gforth.fi}.
601 :    
602 :     @item --path @var{path}
603 : pazsan 1.20 @item -p @var{path}
604 : anton 1.39 Uses @var{path} for searching the image file and Forth source code files
605 :     instead of the default in the environment variable @code{GFORTHPATH} or
606 :     the path specified at installation time (e.g.,
607 :     @file{/usr/local/share/gforth/0.2.0:.}). A path is given as a list of
608 :     directories, separated by @samp{:} (on Unix) or @samp{;} (on other OSs).
609 : anton 1.1
610 :     @item --dictionary-size @var{size}
611 :     @item -m @var{size}
612 :     Allocate @var{size} space for the Forth dictionary space instead of
613 :     using the default specified in the image (typically 256K). The
614 :     @var{size} specification consists of an integer and a unit (e.g.,
615 :     @code{4M}). The unit can be one of @code{b} (bytes), @code{e} (element
616 :     size, in this case Cells), @code{k} (kilobytes), and @code{M}
617 :     (Megabytes). If no unit is specified, @code{e} is used.
618 :    
619 :     @item --data-stack-size @var{size}
620 :     @item -d @var{size}
621 :     Allocate @var{size} space for the data stack instead of using the
622 :     default specified in the image (typically 16K).
623 :    
624 :     @item --return-stack-size @var{size}
625 :     @item -r @var{size}
626 :     Allocate @var{size} space for the return stack instead of using the
627 :     default specified in the image (typically 16K).
628 :    
629 :     @item --fp-stack-size @var{size}
630 :     @item -f @var{size}
631 :     Allocate @var{size} space for the floating point stack instead of
632 :     using the default specified in the image (typically 16K). In this case
633 :     the unit specifier @code{e} refers to floating point numbers.
634 :    
635 :     @item --locals-stack-size @var{size}
636 :     @item -l @var{size}
637 :     Allocate @var{size} space for the locals stack instead of using the
638 :     default specified in the image (typically 16K).
639 :    
640 :     @end table
641 :    
642 :     As explained above, the image-specific command-line arguments for the
643 :     default image @file{gforth.fi} consist of a sequence of filenames and
644 :     @code{-e @var{forth-code}} options that are interpreted in the sequence
645 :     in which they are given. The @code{-e @var{forth-code}} or
646 :     @code{--evaluate @var{forth-code}} option evaluates the Forth
647 :     code. This option takes only one argument; if you want to evaluate more
648 :     Forth words, you have to quote them or use several @code{-e}s. To exit
649 :     after processing the command line (instead of entering interactive mode)
650 :     append @code{-e bye} to the command line.
651 :    
652 : anton 1.22 If you have several versions of Gforth installed, @code{gforth} will
653 :     invoke the version that was installed last. @code{gforth-@var{version}}
654 :     invokes a specific version. You may want to use the option
655 :     @code{--path}, if your environment contains the variable
656 :     @code{GFORTHPATH}.
657 :    
658 : anton 1.1 Not yet implemented:
659 :     On startup the system first executes the system initialization file
660 :     (unless the option @code{--no-init-file} is given; note that the system
661 :     resulting from using this option may not be ANS Forth conformant). Then
662 :     the user initialization file @file{.gforth.fs} is executed, unless the
663 :     option @code{--no-rc} is given; this file is first searched in @file{.},
664 :     then in @file{~}, then in the normal path (see above).
665 :    
666 : anton 1.40 @node Words, Tools, Invocation, Top
667 : anton 1.1 @chapter Forth Words
668 :    
669 :     @menu
670 : anton 1.4 * Notation::
671 :     * Arithmetic::
672 :     * Stack Manipulation::
673 :     * Memory access::
674 :     * Control Structures::
675 :     * Locals::
676 :     * Defining Words::
677 : anton 1.37 * Tokens for Words::
678 : anton 1.4 * Wordlists::
679 :     * Files::
680 :     * Blocks::
681 :     * Other I/O::
682 :     * Programming Tools::
683 : anton 1.18 * Assembler and Code words::
684 : anton 1.4 * Threading Words::
685 : anton 1.1 @end menu
686 :    
687 :     @node Notation, Arithmetic, Words, Words
688 :     @section Notation
689 :    
690 :     The Forth words are described in this section in the glossary notation
691 :     that has become a de-facto standard for Forth texts, i.e.
692 :    
693 : anton 1.4 @format
694 : anton 1.1 @var{word} @var{Stack effect} @var{wordset} @var{pronunciation}
695 : anton 1.4 @end format
696 : anton 1.1 @var{Description}
697 :    
698 :     @table @var
699 :     @item word
700 : anton 1.17 The name of the word. BTW, Gforth is case insensitive, so you can
701 : anton 1.14 type the words in lower case (however, @pxref{core-idef}).
702 : anton 1.1
703 :     @item Stack effect
704 :     The stack effect is written in the notation @code{@var{before} --
705 :     @var{after}}, where @var{before} and @var{after} describe the top of
706 :     stack entries before and after the execution of the word. The rest of
707 :     the stack is not touched by the word. The top of stack is rightmost,
708 : anton 1.17 i.e., a stack sequence is written as it is typed in. Note that Gforth
709 : anton 1.1 uses a separate floating point stack, but a unified stack
710 :     notation. Also, return stack effects are not shown in @var{stack
711 :     effect}, but in @var{Description}. The name of a stack item describes
712 :     the type and/or the function of the item. See below for a discussion of
713 :     the types.
714 :    
715 : anton 1.19 All words have two stack effects: A compile-time stack effect and a
716 :     run-time stack effect. The compile-time stack-effect of most words is
717 :     @var{ -- }. If the compile-time stack-effect of a word deviates from
718 :     this standard behaviour, or the word does other unusual things at
719 :     compile time, both stack effects are shown; otherwise only the run-time
720 :     stack effect is shown.
721 :    
722 : anton 1.1 @item pronunciation
723 :     How the word is pronounced
724 :    
725 :     @item wordset
726 :     The ANS Forth standard is divided into several wordsets. A standard
727 :     system need not support all of them. So, the fewer wordsets your program
728 :     uses the more portable it will be in theory. However, we suspect that
729 :     most ANS Forth systems on personal machines will feature all
730 :     wordsets. Words that are not defined in the ANS standard have
731 : anton 1.19 @code{gforth} or @code{gforth-internal} as wordset. @code{gforth}
732 :     describes words that will work in future releases of Gforth;
733 :     @code{gforth-internal} words are more volatile. Environmental query
734 :     strings are also displayed like words; you can recognize them by the
735 :     @code{environment} in the wordset field.
736 : anton 1.1
737 :     @item Description
738 :     A description of the behaviour of the word.
739 :     @end table
740 :    
741 : anton 1.4 The type of a stack item is specified by the character(s) the name
742 :     starts with:
743 : anton 1.1
744 :     @table @code
745 :     @item f
746 :     Bool, i.e. @code{false} or @code{true}.
747 :     @item c
748 :     Char
749 :     @item w
750 :     Cell, can contain an integer or an address
751 :     @item n
752 :     signed integer
753 :     @item u
754 :     unsigned integer
755 :     @item d
756 :     double sized signed integer
757 :     @item ud
758 :     double sized unsigned integer
759 :     @item r
760 : anton 1.36 Float (on the FP stack)
761 : anton 1.1 @item a_
762 :     Cell-aligned address
763 :     @item c_
764 : anton 1.36 Char-aligned address (note that a Char may have two bytes in Windows NT)
765 : anton 1.1 @item f_
766 :     Float-aligned address
767 :     @item df_
768 :     Address aligned for IEEE double precision float
769 :     @item sf_
770 :     Address aligned for IEEE single precision float
771 :     @item xt
772 :     Execution token, same size as Cell
773 :     @item wid
774 :     Wordlist ID, same size as Cell
775 :     @item f83name
776 :     Pointer to a name structure
777 : anton 1.36 @item "
778 :     string in the input stream (not the stack). The terminating character is
779 :     a blank by default. If it is not a blank, it is shown in @code{<>}
780 :     quotes.
781 :    
782 : anton 1.1 @end table
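
As an illustration of this notation (a sketch only; the word
@code{squared} is hypothetical and not part of Gforth), a colon
definition usually repeats its stack effect in a comment:

@example
: squared ( n1 -- n2 )   \ n1 and n2 are signed integers
  dup * ;

3 squared .              \ prints 9
@end example

The @code{n} prefix marks both items as signed integers, and the
rightmost item on each side of @code{--} is the top of stack.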
783 :    
784 : anton 1.4 @node Arithmetic, Stack Manipulation, Notation, Words
785 : anton 1.1 @section Arithmetic
786 :     Forth arithmetic is not checked, i.e., you will not hear about integer
787 :     overflow on addition or multiplication; you may hear about division by
788 :     zero if you are lucky. The operator is written after the operands, but
789 :     the operands are still in the original order. I.e., the infix @code{2-1}
790 :     corresponds to @code{2 1 -}. Forth offers a variety of division
791 :     operators. If you perform division with potentially negative operands,
792 :     you do not want to use @code{/} or @code{/mod} with its undefined
793 :     behaviour, but rather @code{fm/mod} or @code{sm/rem} (probably the
794 : anton 1.4 former, @pxref{Mixed precision}).
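
The following sketch shows the postfix notation and the difference
between floored and symmetric division for a negative dividend. It
assumes decimal @code{BASE}; @code{s>d} converts a single-cell number
into the double-cell dividend that both division words expect:

@example
2 1 - .               \ the infix 2-1; prints 1
-7 s>d 2 fm/mod . .   \ floored: prints -4 1 (quotient, then remainder)
-7 s>d 2 sm/rem . .   \ symmetric: prints -3 -1
@end example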
795 :    
796 :     @menu
797 :     * Single precision::
798 :     * Bitwise operations::
799 :     * Mixed precision:: operations with single and double-cell integers
800 :     * Double precision:: Double-cell integer arithmetic
801 :     * Floating Point::
802 :     @end menu
803 : anton 1.1
804 : anton 1.4 @node Single precision, Bitwise operations, Arithmetic, Arithmetic
805 : anton 1.1 @subsection Single precision
806 :     doc-+
807 :     doc--
808 :     doc-*
809 :     doc-/
810 :     doc-mod
811 :     doc-/mod
812 :     doc-negate
813 :     doc-abs
814 :     doc-min
815 :     doc-max
816 :    
817 : anton 1.4 @node Bitwise operations, Mixed precision, Single precision, Arithmetic
818 : anton 1.1 @subsection Bitwise operations
819 :     doc-and
820 :     doc-or
821 :     doc-xor
822 :     doc-invert
823 :     doc-2*
824 :     doc-2/
825 :    
826 : anton 1.4 @node Mixed precision, Double precision, Bitwise operations, Arithmetic
827 : anton 1.1 @subsection Mixed precision
828 :     doc-m+
829 :     doc-*/
830 :     doc-*/mod
831 :     doc-m*
832 :     doc-um*
833 :     doc-m*/
834 :     doc-um/mod
835 :     doc-fm/mod
836 :     doc-sm/rem
837 :    
838 : anton 1.4 @node Double precision, Floating Point, Mixed precision, Arithmetic
839 : anton 1.1 @subsection Double precision
840 : anton 1.16
841 :     The outer (aka text) interpreter converts numbers containing a dot into
842 :     a double precision number. Note that only numbers with the dot as last
843 :     character are standard-conforming.
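
For example (a small sketch, assuming decimal @code{BASE}), a trailing
dot makes the text interpreter push a double-cell number, which can then
be processed with the words below and printed with @code{d.}:

@example
1234. d.       \ a double-cell 1234; prints 1234
1. 2. d+ d.    \ double-cell addition; prints 3
@end example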
844 :    
845 : anton 1.1 doc-d+
846 :     doc-d-
847 :     doc-dnegate
848 :     doc-dabs
849 :     doc-dmin
850 :     doc-dmax
851 :    
852 : anton 1.4 @node Floating Point, , Double precision, Arithmetic
853 :     @subsection Floating Point
854 : anton 1.16
855 :     The format of floating point numbers recognized by the outer (aka text)
856 :     interpreter is: a signed decimal number, possibly containing a decimal
857 :     point (@code{.}), followed by @code{E} or @code{e}, optionally followed
858 :     by a signed integer (the exponent). E.g., @code{1e} is the same as
859 : anton 1.35 @code{+1.0e+0}. Note that a number without @code{e}
860 : anton 1.16 is not interpreted as floating-point number, but as double (if the
861 :     number contains a @code{.}) or single precision integer. Also,
862 :     conversions between string and floating point numbers always use base
863 :     10, irrespective of the value of @code{BASE}. If @code{BASE} contains a
864 :     value greater than 14, the @code{E} may be interpreted as a digit and the
865 :     number will be interpreted as integer, unless it has a signed exponent
866 :     (both @code{+} and @code{-} are allowed as signs).
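
The following lines sketch how the text interpreter classifies numbers
under these rules, assuming that @code{BASE} is decimal:

@example
1e 2e f+ f.    \ two floats; prints their sum, 3
1.5e2 f.       \ the float 150
15 .           \ no e and no dot: single-cell integer; prints 15
15. d.         \ no e, but a dot: double-cell integer; prints 15
@end example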
867 : anton 1.4
868 :     Angles in floating point operations are given in radians (a full circle
869 : anton 1.17 has 2 pi radians). Note that Gforth has a separate floating point
870 : anton 1.4 stack, but we use the unified notation.
871 :    
872 :     Floating point numbers have a number of unpleasant surprises for the
873 :     unwary (e.g., floating point addition is not associative) and even a few
874 :     for the wary. You should not use them unless you know what you are doing
875 :     or you don't care that the results you get are totally bogus. If you
876 :     want to learn about the problems of floating point numbers (and how to
877 : anton 1.11 avoid them), you might start with @cite{David Goldberg, What Every
878 : anton 1.6 Computer Scientist Should Know About Floating-Point Arithmetic, ACM
879 :     Computing Surveys 23(1):5@minus{}48, March 1991}.
880 : anton 1.4
881 :     doc-f+
882 :     doc-f-
883 :     doc-f*
884 :     doc-f/
885 :     doc-fnegate
886 :     doc-fabs
887 :     doc-fmax
888 :     doc-fmin
889 :     doc-floor
890 :     doc-fround
891 :     doc-f**
892 :     doc-fsqrt
893 :     doc-fexp
894 :     doc-fexpm1
895 :     doc-fln
896 :     doc-flnp1
897 :     doc-flog
898 : anton 1.6 doc-falog
899 : anton 1.4 doc-fsin
900 :     doc-fcos
901 :     doc-fsincos
902 :     doc-ftan
903 :     doc-fasin
904 :     doc-facos
905 :     doc-fatan
906 :     doc-fatan2
907 :     doc-fsinh
908 :     doc-fcosh
909 :     doc-ftanh
910 :     doc-fasinh
911 :     doc-facosh
912 :     doc-fatanh
913 :    
914 :     @node Stack Manipulation, Memory access, Arithmetic, Words
915 : anton 1.1 @section Stack Manipulation
916 :    
917 : anton 1.17 Gforth has a data stack (aka parameter stack) for characters, cells,
918 : anton 1.1 addresses, and double cells, a floating point stack for floating point
919 :     numbers, a return stack for storing the return addresses of colon
920 :     definitions and other data, and a locals stack for storing local
921 :     variables. Note that while every sane Forth has a separate floating
922 :     point stack, this is not strictly required; an ANS Forth system could
923 :     theoretically keep floating point numbers on the data stack. As an
924 :     additional difficulty, you don't know how many cells a floating point
925 :     number takes. It is reportedly possible to write words in a way that
926 :     they work also for a unified stack model, but we do not recommend trying
927 : anton 1.4 it. Instead, just say that your program has an environmental dependency
928 :     on a separate FP stack.
929 :    
930 :     Also, a Forth system is allowed to keep the local variables on the
931 : anton 1.1 return stack. This is reasonable, as local variables usually eliminate
932 :     the need to use the return stack explicitly. So, if you want to produce
933 :     a standard complying program and if you are using local variables in a
934 :     word, forget about return stack manipulations in that word (see the
935 :     standard document for the exact rules).
936 :    
937 : anton 1.4 @menu
938 :     * Data stack::
939 :     * Floating point stack::
940 :     * Return stack::
941 :     * Locals stack::
942 :     * Stack pointer manipulation::
943 :     @end menu
944 :    
945 :     @node Data stack, Floating point stack, Stack Manipulation, Stack Manipulation
946 : anton 1.1 @subsection Data stack
947 :     doc-drop
948 :     doc-nip
949 :     doc-dup
950 :     doc-over
951 :     doc-tuck
952 :     doc-swap
953 :     doc-rot
954 :     doc--rot
955 :     doc-?dup
956 :     doc-pick
957 :     doc-roll
958 :     doc-2drop
959 :     doc-2nip
960 :     doc-2dup
961 :     doc-2over
962 :     doc-2tuck
963 :     doc-2swap
964 :     doc-2rot
965 :    
966 : anton 1.4 @node Floating point stack, Return stack, Data stack, Stack Manipulation
967 : anton 1.1 @subsection Floating point stack
968 :     doc-fdrop
969 :     doc-fnip
970 :     doc-fdup
971 :     doc-fover
972 :     doc-ftuck
973 :     doc-fswap
974 :     doc-frot
975 :    
976 : anton 1.4 @node Return stack, Locals stack, Floating point stack, Stack Manipulation
977 : anton 1.1 @subsection Return stack
978 :     doc->r
979 :     doc-r>
980 :     doc-r@
981 :     doc-rdrop
982 :     doc-2>r
983 :     doc-2r>
984 :     doc-2r@
985 :     doc-2rdrop
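
A typical use of the return stack is to park a value temporarily. The
following sketch (the word @code{third} is a hypothetical name) copies
the third stack item to the top; everything pushed with @code{>r} is
removed again with @code{r>} before the definition ends:

@example
: third ( w1 w2 w3 -- w1 w2 w3 w1 )
  >r over r> swap ;

1 2 3 third . . . .   \ prints 1 3 2 1
@end example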
986 :    
987 : anton 1.4 @node Locals stack, Stack pointer manipulation, Return stack, Stack Manipulation
988 : anton 1.1 @subsection Locals stack
989 :    
990 : anton 1.4 @node Stack pointer manipulation, , Locals stack, Stack Manipulation
991 : anton 1.1 @subsection Stack pointer manipulation
992 :     doc-sp@
993 :     doc-sp!
994 :     doc-fp@
995 :     doc-fp!
996 :     doc-rp@
997 :     doc-rp!
998 :     doc-lp@
999 :     doc-lp!
1000 :    
1001 : anton 1.4 @node Memory access, Control Structures, Stack Manipulation, Words
1002 : anton 1.1 @section Memory access
1003 :    
1004 : anton 1.4 @menu
1005 :     * Stack-Memory transfers::
1006 :     * Address arithmetic::
1007 :     * Memory block access::
1008 :     @end menu
1009 :    
1010 :     @node Stack-Memory transfers, Address arithmetic, Memory access, Memory access
1011 : anton 1.1 @subsection Stack-Memory transfers
1012 :    
1013 :     doc-@
1014 :     doc-!
1015 :     doc-+!
1016 :     doc-c@
1017 :     doc-c!
1018 :     doc-2@
1019 :     doc-2!
1020 :     doc-f@
1021 :     doc-f!
1022 :     doc-sf@
1023 :     doc-sf!
1024 :     doc-df@
1025 :     doc-df!
1026 :    
1027 : anton 1.4 @node Address arithmetic, Memory block access, Stack-Memory transfers, Memory access
1028 : anton 1.1 @subsection Address arithmetic
1029 :    
1030 :     ANS Forth does not specify the sizes of the data types. Instead, it
1031 :     offers a number of words for computing sizes and doing address
1032 :     arithmetic. Basically, address arithmetic is performed in terms of
1033 :     address units (aus); on most systems the address unit is one byte. Note
1034 :     that a character may have more than one au, so @code{chars} is no noop
1035 :     (on systems where it is a noop, it compiles to nothing).
1036 :    
1037 :     ANS Forth also defines words for aligning addresses for specific
1038 :     types. Many computers require that accesses to specific data types
1039 :     must only occur at specific addresses; e.g., that cells may only be
1040 :     accessed at addresses divisible by 4. Even if a machine allows unaligned
1041 :     accesses, it can usually perform aligned accesses faster.
1042 :    
1043 : anton 1.17 For the performance-conscious: alignment operations are usually only
1044 : anton 1.1 necessary during the definition of a data structure, not during the
1045 :     (more frequent) accesses to it.
1046 :    
1047 :     ANS Forth defines no words for character-aligning addresses. This is not
1048 :     an oversight, but reflects the fact that addresses that are not
1049 :     char-aligned have no use in the standard and therefore will not be
1050 :     created.
1051 :    
1052 :     The standard guarantees that addresses returned by @code{CREATE}d words
1053 : anton 1.17 are cell-aligned; in addition, Gforth guarantees that these addresses
1054 : anton 1.1 are aligned for all purposes.
1055 :    
1056 : anton 1.9 Note that the standard defines a word @code{char}, which has nothing to
1057 :     do with address arithmetic.
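
As a sketch of portable address arithmetic (the name @code{pair} is
hypothetical), the following code reserves two cells and accesses the
second one with @code{cell+} instead of assuming a cell size:

@example
create pair  2 cells allot   \ room for two cells, cell-aligned
10 pair !                    \ store into the first cell
20 pair cell+ !              \ store into the second cell
pair @@ pair cell+ @@ + .     \ prints 30
@end example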
1058 :    
1059 : anton 1.1 doc-chars
1060 :     doc-char+
1061 :     doc-cells
1062 :     doc-cell+
1063 :     doc-align
1064 :     doc-aligned
1065 :     doc-floats
1066 :     doc-float+
1067 :     doc-falign
1068 :     doc-faligned
1069 :     doc-sfloats
1070 :     doc-sfloat+
1071 :     doc-sfalign
1072 :     doc-sfaligned
1073 :     doc-dfloats
1074 :     doc-dfloat+
1075 :     doc-dfalign
1076 :     doc-dfaligned
1077 : anton 1.10 doc-maxalign
1078 :     doc-maxaligned
1079 :     doc-cfalign
1080 :     doc-cfaligned
1081 : anton 1.1 doc-address-unit-bits
1082 :    
1083 : anton 1.4 @node Memory block access, , Address arithmetic, Memory access
1084 : anton 1.1 @subsection Memory block access
1085 :    
1086 :     doc-move
1087 :     doc-erase
1088 :    
1089 :     While the previous words work on address units, the rest works on
1090 :     characters.
1091 :    
1092 :     doc-cmove
1093 :     doc-cmove>
1094 :     doc-fill
1095 :     doc-blank
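
A minimal sketch of the character-oriented words, using a hypothetical
buffer named @code{buf}:

@example
create buf  8 chars allot
buf 8 blank          \ fill the buffer with spaces
buf 3 char * fill    \ the first three characters become *
buf 8 type           \ prints *** followed by five spaces
@end example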
1096 :    
1097 : anton 1.4 @node Control Structures, Locals, Memory access, Words
1098 : anton 1.1 @section Control Structures
1099 :    
1100 :     Control structures in Forth cannot be used in interpret state, only in
1101 :     compile state, i.e., in a colon definition. We do not like this
1102 :     limitation, but have not seen a satisfying way around it yet, although
1103 :     many schemes have been proposed.
1104 :    
1105 : anton 1.4 @menu
1106 :     * Selection::
1107 :     * Simple Loops::
1108 :     * Counted Loops::
1109 :     * Arbitrary control structures::
1110 :     * Calls and returns::
1111 :     * Exception Handling::
1112 :     @end menu
1113 :    
1114 :     @node Selection, Simple Loops, Control Structures, Control Structures
1115 : anton 1.1 @subsection Selection
1116 :    
1117 :     @example
1118 :     @var{flag}
1119 :     IF
1120 :     @var{code}
1121 :     ENDIF
1122 :     @end example
1123 :     or
1124 :     @example
1125 :     @var{flag}
1126 :     IF
1127 :     @var{code1}
1128 :     ELSE
1129 :     @var{code2}
1130 :     ENDIF
1131 :     @end example
1132 :    
1133 : anton 1.4 You can use @code{THEN} instead of @code{ENDIF}. Indeed, @code{THEN} is
1134 : anton 1.1 standard, and @code{ENDIF} is not, although it is quite popular. We
1135 :     recommend using @code{ENDIF}, because it is less confusing for people
1136 :     who also know other languages (and is not prone to reinforcing negative
1137 :     prejudices against Forth in these people). Adding @code{ENDIF} to a
1138 :     system that only supplies @code{THEN} is simple:
1139 :     @example
1140 :     : endif POSTPONE then ; immediate
1141 :     @end example
1142 :    
1143 :     [According to @cite{Webster's New Encyclopedic Dictionary}, @dfn{then
1144 :     (adv.)} has the following meanings:
1145 :     @quotation
1146 :     ... 2b: following next after in order ... 3d: as a necessary consequence
1147 :     (if you were there, then you saw them).
1148 :     @end quotation
1149 :     Forth's @code{THEN} has the meaning 2b, whereas @code{THEN} in Pascal
1150 :     and many other programming languages has the meaning 3d.]
1151 :    
1152 : anton 1.31 Gforth also provides the words @code{?dup-if} and @code{?dup-0=-if}, so
1153 :     you can avoid using @code{?dup}. Using these alternatives is also more
1154 :     efficient than using @code{?dup}. Definitions in plain standard Forth
1155 :     for @code{ENDIF}, @code{?DUP-IF} and @code{?DUP-0=-IF} are provided in
1156 :     @file{compat/control.fs}.
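
For example, the following two definitions (hypothetical names) should
behave identically; the second one relies on the Gforth-specific
@code{?dup-if}:

@example
: .nonzero1 ( n -- )  ?dup if . endif ;
: .nonzero2 ( n -- )  ?dup-if . endif ;

5 .nonzero2   \ prints 5
0 .nonzero2   \ prints nothing
@end example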
1157 : anton 1.1
1158 :     @example
1159 :     @var{n}
1160 :     CASE
1161 :     @var{n1} OF @var{code1} ENDOF
1162 :     @var{n2} OF @var{code2} ENDOF
1163 : anton 1.4 @dots{}
1164 : anton 1.1 ENDCASE
1165 :     @end example
1166 :    
1167 :     Executes the first @var{codei}, where the @var{ni} is equal to
1168 :     @var{n}. A default case can be added by simply writing the code after
1169 :     the last @code{ENDOF}. It may use @var{n}, which is on top of the stack,
1170 :     but must not consume it.
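
A small sketch with a default case (the word @code{.small} is
hypothetical). The default code uses a copy of @var{n} and leaves the
original for @code{ENDCASE} to drop:

@example
: .small ( n -- )
  case
    0 of ." zero" endof
    1 of ." one"  endof
    ." other: " dup .    \ default: uses n, but does not consume it
  endcase ;

1 .small   \ prints one
7 .small   \ prints other: 7
@end example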
1171 :    
1172 : anton 1.4 @node Simple Loops, Counted Loops, Selection, Control Structures
1173 : anton 1.1 @subsection Simple Loops
1174 :    
1175 :     @example
1176 :     BEGIN
1177 :     @var{code1}
1178 :     @var{flag}
1179 :     WHILE
1180 :     @var{code2}
1181 :     REPEAT
1182 :     @end example
1183 :    
1184 :     @var{code1} is executed and @var{flag} is computed. If it is true,
1185 :     @var{code2} is executed and the loop is restarted; if @var{flag} is false, execution continues after the @code{REPEAT}.
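
For example, in the following sketch (the word @code{countdown} is
hypothetical) the test @code{dup 0>} plays the role of @var{code1} and
@var{flag}, and @code{dup . 1-} is @var{code2}:

@example
: countdown ( n -- )
  begin
    dup 0>
  while
    dup . 1-
  repeat
  drop ;

3 countdown   \ prints 3 2 1
0 countdown   \ prints nothing
@end example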
1186 :    
1187 :     @example
1188 :     BEGIN
1189 :     @var{code}
1190 :     @var{flag}
1191 :     UNTIL
1192 :     @end example
1193 :    
1194 :     @var{code} is executed. The loop is restarted if @var{flag} is false.
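
Unlike the @code{WHILE} loop, the body of this loop is always executed
at least once. A sketch (the word @code{countdown1} is hypothetical and
assumes that @var{n} is at least 1):

@example
: countdown1 ( n -- )
  begin
    dup . 1-
    dup 0=
  until
  drop ;

3 countdown1   \ prints 3 2 1
@end example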
1195 :    
1196 :     @example
1197 :     BEGIN
1198 :     @var{code}
1199 :     AGAIN
1200 :     @end example
1201 :    
1202 :     This is an endless loop.
1203 :    
1204 : anton 1.4 @node Counted Loops, Arbitrary control structures, Simple Loops, Control Structures
1205 : anton 1.1 @subsection Counted Loops
1206 :    
1207 :     The basic counted loop is:
1208 :     @example
1209 :     @var{limit} @var{start}
1210 :     ?DO
1211 :     @var{body}
1212 :     LOOP
1213 :     @end example
1214 :    
1215 :     This performs one iteration for every integer, starting from @var{start}
1216 :     and up to, but excluding @var{limit}. The counter, aka index, can be
1217 :     accessed with @code{i}. E.g., the loop
1218 :     @example
1219 :     10 0 ?DO
1220 :     i .
1221 :     LOOP
1222 :     @end example
1223 :     prints
1224 :     @example
1225 :     0 1 2 3 4 5 6 7 8 9
1226 :     @end example
1227 :     The index of the innermost loop can be accessed with @code{i}, the index
1228 :     of the next loop with @code{j}, and the index of the third loop with
1229 :     @code{k}.
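
A sketch of two nested loops (the word @code{grid} is hypothetical);
inside the inner loop, @code{i} is the inner index and @code{j} the
outer one:

@example
: grid ( -- )
  3 0 ?do              \ outer loop; its index is j in the inner loop
    3 0 ?do            \ inner loop; its index is i
      j 10 * i + .
    loop
    cr
  loop ;

grid
@end example

This prints @code{0 1 2}, @code{10 11 12} and @code{20 21 22} on three
lines.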
1230 :    
1231 :     The loop control data are kept on the return stack, so there are some
1232 :     restrictions on mixing return stack accesses and counted loop
1233 :     words. E.g., if you put values on the return stack outside the loop, you
1234 :     cannot read them inside the loop. If you put values on the return stack
1235 :     within a loop, you have to remove them before the end of the loop and
1236 :     before accessing the index of the loop.
1237 :    
1238 :     There are several variations on the counted loop:
1239 :    
1240 :     @code{LEAVE} leaves the innermost counted loop immediately.
1241 :    
1242 : anton 1.18 If @var{start} is greater than @var{limit}, a @code{?DO} loop is entered
1243 :     (and @code{LOOP} iterates until they become equal by wrap-around
1244 :     arithmetic). This behaviour is usually not what you want. Therefore,
1245 :     Gforth offers @code{+DO} and @code{U+DO} (as replacements for
1246 :     @code{?DO}), which do not enter the loop if @var{start} is greater than
1247 :     @var{limit}; @code{+DO} is for signed loop parameters, @code{U+DO} for
1248 : anton 1.30 unsigned loop parameters.
1249 : anton 1.18
1250 : anton 1.1 @code{LOOP} can be replaced with @code{@var{n} +LOOP}; this updates the
1251 :     index by @var{n} instead of by 1. The loop is terminated when the border
1252 :     between @var{limit-1} and @var{limit} is crossed. E.g.:
1253 :    
1254 : anton 1.18 @code{4 0 +DO i . 2 +LOOP} prints @code{0 2}
1255 : anton 1.1
1256 : anton 1.18 @code{4 1 +DO i . 2 +LOOP} prints @code{1 3}
1257 : anton 1.1
1258 :     The behaviour of @code{@var{n} +LOOP} is peculiar when @var{n} is negative:
1259 :    
1260 : anton 1.2 @code{-1 0 ?DO i . -1 +LOOP} prints @code{0 -1}
1261 : anton 1.1
1262 : anton 1.2 @code{ 0 0 ?DO i . -1 +LOOP} prints nothing
1263 : anton 1.1
1264 : anton 1.18 Therefore we recommend avoiding @code{@var{n} +LOOP} with negative
1265 :     @var{n}. One alternative is @code{@var{u} -LOOP}, which reduces the
1266 :     index by @var{u} each iteration. The loop is terminated when the border
1267 :     between @var{limit+1} and @var{limit} is crossed. Gforth also provides
1268 :     @code{-DO} and @code{U-DO} for down-counting loops. E.g.:
1269 : anton 1.1
1270 : anton 1.18 @code{-2 0 -DO i . 1 -LOOP} prints @code{0 -1}
1271 : anton 1.1
1272 : anton 1.18 @code{-1 0 -DO i . 1 -LOOP} prints @code{0}
1273 : anton 1.1
1274 : anton 1.18 @code{ 0 0 -DO i . 1 -LOOP} prints nothing
1275 : anton 1.1
1276 : anton 1.30 Unfortunately, @code{+DO}, @code{U+DO}, @code{-DO}, @code{U-DO} and
1277 :     @code{-LOOP} are not in the ANS Forth standard. However, an
1278 :     implementation for these words that uses only standard words is provided
1279 :     in @file{compat/loops.fs}.
1280 : anton 1.18
1281 :     @code{?DO} can also be replaced by @code{DO}. @code{DO} always enters
1282 :     the loop, independent of the loop parameters. Do not use @code{DO}, even
1283 :     if you know that the loop is entered in any case. Such knowledge tends
1284 :     to become invalid during maintenance of a program, and then the
1285 :     @code{DO} will make trouble.
1286 : anton 1.1
1287 :     @code{UNLOOP} is used to prepare for an abnormal loop exit, e.g., via
1288 :     @code{EXIT}. @code{UNLOOP} removes the loop control parameters from the
1289 :     return stack so @code{EXIT} can get to its return address.
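
The following sketch leaves a loop early with @code{UNLOOP} and
@code{EXIT}. The word @code{has-divisor?} is hypothetical and assumes
that @var{n} is at least 2; for smaller values the @code{?DO} loop would
run into the wrap-around case described above:

@example
: has-divisor? ( n -- f )
  dup 2 ?do
    dup i mod 0= if
      drop true unloop exit   \ discard loop parameters, then return
    endif
  loop
  drop false ;

9 has-divisor? .   \ prints -1 (true)
7 has-divisor? .   \ prints 0 (false)
@end example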
1290 :    
1291 :     Another counted loop is
1292 :     @example
1293 :     @var{n}
1294 :     FOR
1295 :     @var{body}
1296 :     NEXT
1297 :     @end example
1298 :     This is the preferred loop of native code compiler writers who are too
1299 : anton 1.17 lazy to optimize @code{?DO} loops properly. In Gforth, this loop
1300 : anton 1.1 iterates @var{n+1} times; @code{i} produces values starting with @var{n}
1301 :     and ending with 0. Other Forth systems may behave differently, even if
1302 : anton 1.30 they support @code{FOR} loops. To avoid problems, don't use @code{FOR}
1303 :     loops.
1304 : anton 1.1
1305 : anton 1.4 @node Arbitrary control structures, Calls and returns, Counted Loops, Control Structures
1306 : anton 1.2 @subsection Arbitrary control structures
1307 :    
1308 :     ANS Forth permits and supports using control structures in a non-nested
1309 :     way. Information about incomplete control structures is stored on the
1310 :     control-flow stack. This stack may be implemented on the Forth data
1311 : anton 1.17 stack, and this is what we have done in Gforth.
1312 : anton 1.2
1313 :     An @i{orig} entry represents an unresolved forward branch, a @i{dest}
1314 :     entry represents a backward branch target. A few words are the basis for
1315 :     building any control structure possible (except control structures that
1316 :     need storage, like calls, coroutines, and backtracking).
1317 :    
1318 : anton 1.3 doc-if
1319 :     doc-ahead
1320 :     doc-then
1321 :     doc-begin
1322 :     doc-until
1323 :     doc-again
1324 :     doc-cs-pick
1325 :     doc-cs-roll
1326 : anton 1.2
1327 : anton 1.17 On many systems control-flow stack items take one word; in Gforth they
1328 : anton 1.2 currently take three (this may change in the future). Therefore it is a
1329 :     really good idea to manipulate the control flow stack with
1330 :     @code{cs-pick} and @code{cs-roll}, not with data stack manipulation
1331 :     words.
1332 :    
1333 :     Some standard control structure words are built from these words:
1334 :    
1335 : anton 1.3 doc-else
1336 :     doc-while
1337 :     doc-repeat
1338 : anton 1.2
1339 : anton 1.31 Gforth adds some more control-structure words:
1340 :    
1341 :     doc-endif
1342 :     doc-?dup-if
1343 :     doc-?dup-0=-if
1344 :    
1345 : anton 1.2 Counted loop words constitute a separate group of words:
1346 :    
1347 : anton 1.3 doc-?do
1348 : anton 1.18 doc-+do
1349 :     doc-u+do
1350 :     doc--do
1351 :     doc-u-do
1352 : anton 1.3 doc-do
1353 :     doc-for
1354 :     doc-loop
1355 :     doc-+loop
1356 : anton 1.18 doc--loop
1357 : anton 1.3 doc-next
1358 :     doc-leave
1359 :     doc-?leave
1360 :     doc-unloop
1361 : anton 1.10 doc-done
1362 : anton 1.2
1363 :     The standard does not allow using @code{cs-pick} and @code{cs-roll} on
1364 :     @i{do-sys}. Our system allows it, but it's your job to ensure that for
1365 :     every @code{?DO} etc. there is exactly one @code{UNLOOP} on any path
1366 : anton 1.3 through the definition (@code{LOOP} etc. compile an @code{UNLOOP} on the
1367 :     fall-through path). Also, you have to ensure that all @code{LEAVE}s are
1368 : pazsan 1.7 resolved (by using one of the loop-ending words or @code{DONE}).
1369 : anton 1.2
1370 :     Another group of control structure words are
1371 :    
1372 : anton 1.3 doc-case
1373 :     doc-endcase
1374 :     doc-of
1375 :     doc-endof
1376 : anton 1.2
1377 :     @i{case-sys} and @i{of-sys} cannot be processed using @code{cs-pick} and
1378 :     @code{cs-roll}.
1379 :    
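As an illustration of this group of words, here is a minimal sketch of a
@code{CASE} structure. The word @code{.small} is a hypothetical name
chosen for this example; it is not part of Gforth.

@example
: .small ( n -- )
  \ the selector n is still on the stack in the default branch
  \ and is dropped by endcase
  case
    0 of ." zero" endof
    1 of ." one"  endof
    ." something else"
  endcase ;
@end example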
1380 : anton 1.3 @subsubsection Programming Style
1381 :    
1382 :     In order to ensure readability we recommend that you do not create
1383 :     arbitrary control structures directly, but define new control structure
1384 :     words for the control structure you want and use these words in your
1385 :     program.
1386 :    
1387 :     E.g., instead of writing
1388 :    
1389 :     @example
1390 :     begin
1391 :     ...
1392 :     if [ 1 cs-roll ]
1393 :     ...
1394 :     again then
1395 :     @end example
1396 :    
1397 :     we recommend defining control structure words, e.g.,
1398 :    
1399 :     @example
1400 :     : while ( dest -- orig dest )
1401 :     POSTPONE if
1402 :     1 cs-roll ; immediate
1403 :    
1404 :     : repeat ( orig dest -- )
1405 :     POSTPONE again
1406 :     POSTPONE then ; immediate
1407 :     @end example
1408 :    
1409 :     and then using these to create the control structure:
1410 :    
1411 :     @example
1412 :     begin
1413 :     ...
1414 :     while
1415 :     ...
1416 :     repeat
1417 :     @end example
1418 :    
1419 : anton 1.30 That's much easier to read, isn't it? Of course, @code{REPEAT} and
1420 : anton 1.3 @code{WHILE} are predefined, so in this example it would not be
1421 :     necessary to define them.
1422 :    
1423 : anton 1.4 @node Calls and returns, Exception Handling, Arbitrary control structures, Control Structures
1424 : anton 1.3 @subsection Calls and returns
1425 :    
1426 :     A definition can be called simply by writing the name of the
1427 : anton 1.17 definition. When the end of the definition is reached, it returns. An
1428 :     earlier return can be forced using
1429 : anton 1.3
1430 :     doc-exit
1431 :    
1432 :     Don't forget to clean up the return stack and @code{UNLOOP} any
1433 :     outstanding @code{?DO}...@code{LOOP}s before @code{EXIT}ing. The
1434 :     primitive compiled by @code{EXIT} is
1435 :    
1436 :     doc-;s
1437 :    
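The clean-up rule for @code{EXIT} inside a counted loop can be sketched
as follows. The word @code{find-zero} is a hypothetical example, not a
Gforth word; it assumes an array of @i{u} cells starting at @i{addr}.

@example
: find-zero ( addr u -- index | -1 )
  \ index of the first zero cell in the array, or -1 if there is none
  0 ?do
    dup i cells + @@ 0= if
      drop i unloop exit   \ discard the loop parameters before exit
    then
  loop
  drop -1 ;
@end example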
1438 : anton 1.4 @node Exception Handling, , Calls and returns, Control Structures
1439 : anton 1.3 @subsection Exception Handling
1440 :    
1441 :     doc-catch
1442 :     doc-throw
1443 :    
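A minimal sketch of how these two words are typically combined; the
words @code{checked/} and @code{safe/} are hypothetical names made up
for this example, and -10 is the throw code the standard assigns to
division by zero.

@example
: checked/ ( n1 n2 -- n3 )
  dup 0= if -10 throw then / ;

: safe/ ( n1 n2 -- n3 ior )
  \ ior is 0 on success; otherwise n3 is 0 and ior is the throw code
  ['] checked/ catch
  dup if nip nip 0 swap then ;
@end example

After a caught @code{THROW} the data stack has the depth it had before
@code{CATCH} executed the word, but the contents of those cells are
undefined, which is why @code{safe/} discards them.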
1444 : anton 1.4 @node Locals, Defining Words, Control Structures, Words
1445 : anton 1.1 @section Locals
1446 :    
1447 : anton 1.2 Local variables can make Forth programming more enjoyable and Forth
1448 :     programs easier to read. Unfortunately, the locals of ANS Forth are
1449 :     laden with restrictions. Therefore, we provide not only the ANS Forth
1450 :     locals wordset, but also our own, more powerful locals wordset (we
1451 :     implemented the ANS Forth locals wordset through our locals wordset).
1452 :    
1453 : anton 1.24 The ideas in this section have also been published in the paper
1454 :     @cite{Automatic Scoping of Local Variables} by M. Anton Ertl, presented
1455 :     at EuroForth '94; it is available at
1456 :     @*@file{http://www.complang.tuwien.ac.at/papers/ertl94l.ps.gz}.
1457 :    
1458 : anton 1.2 @menu
1459 : anton 1.17 * Gforth locals::
1460 : anton 1.4 * ANS Forth locals::
1461 : anton 1.2 @end menu
1462 :    
1463 : anton 1.17 @node Gforth locals, ANS Forth locals, Locals, Locals
1464 :     @subsection Gforth locals
1465 : anton 1.2
1466 :     Locals can be defined with
1467 :    
1468 :     @example
1469 :     @{ local1 local2 ... -- comment @}
1470 :     @end example
1471 :     or
1472 :     @example
1473 :     @{ local1 local2 ... @}
1474 :     @end example
1475 :    
1476 :     E.g.,
1477 :     @example
1478 :     : max @{ n1 n2 -- n3 @}
1479 :     n1 n2 > if
1480 :     n1
1481 :     else
1482 :     n2
1483 :     endif ;
1484 :     @end example
1485 :    
1486 :     The similarity of locals definitions with stack comments is intended. A
1487 :     locals definition often replaces the stack comment of a word. The order
1488 :     of the locals corresponds to the order in a stack comment and everything
1489 :     after the @code{--} is really a comment.
1490 :    
1491 :     This similarity has one disadvantage: It is too easy to confuse locals
1492 :     declarations with stack comments, causing bugs and making them hard to
1493 :     find. However, this problem can be avoided by appropriate coding
1494 :     conventions: Do not use both notations in the same program. If you do,
1495 :     they should be distinguished using additional means, e.g. by position.
1496 :    
1497 :     The name of the local may be preceded by a type specifier, e.g.,
1498 :     @code{F:} for a floating point value:
1499 :    
1500 :     @example
1501 :     : CX* @{ F: Ar F: Ai F: Br F: Bi -- Cr Ci @}
1502 :     \ complex multiplication
1503 :     Ar Br f* Ai Bi f* f-
1504 :     Ar Bi f* Ai Br f* f+ ;
1505 :     @end example
1506 :    
1507 : anton 1.17 Gforth currently supports cells (@code{W:}, @code{W^}), doubles
1508 : anton 1.2 (@code{D:}, @code{D^}), floats (@code{F:}, @code{F^}) and characters
1509 :     (@code{C:}, @code{C^}) in two flavours: a value-flavoured local (defined
1510 :     with @code{W:}, @code{D:} etc.) produces its value and can be changed
1511 :     with @code{TO}. A variable-flavoured local (defined with @code{W^} etc.)
1512 :     produces its address (which becomes invalid when the variable's scope is
1513 :     left). E.g., the standard word @code{emit} can be defined in terms of
1514 :     @code{type} like this:
1515 :    
1516 :     @example
1517 :     : emit @{ C^ char* -- @}
1518 :     char* 1 type ;
1519 :     @end example
1520 :    
1521 :     A local without type specifier is a @code{W:} local. Both flavours of
1522 :     locals are initialized with values from the data or FP stack.
1523 :    
1524 :     Currently there is no way to define locals with user-defined data
1525 :     structures, but we are working on it.
1526 :    
1527 : anton 1.17 Gforth allows defining locals everywhere in a colon definition. This
1528 : pazsan 1.7 poses the following questions:
1529 : anton 1.2
1530 : anton 1.4 @menu
1531 :     * Where are locals visible by name?::
1532 : anton 1.14 * How long do locals live?::
1533 : anton 1.4 * Programming Style::
1534 :     * Implementation::
1535 :     @end menu
1536 :    
1537 : anton 1.17 @node Where are locals visible by name?, How long do locals live?, Gforth locals, Gforth locals
1538 : anton 1.2 @subsubsection Where are locals visible by name?
1539 :    
1540 :     Basically, the answer is that locals are visible where you would expect
1541 :     them in block-structured languages, and sometimes a little longer. If you
1542 :     want to restrict the scope of a local, enclose its definition in
1543 :     @code{SCOPE}...@code{ENDSCOPE}.
1544 :    
1545 :     doc-scope
1546 :     doc-endscope
1547 :    
1548 :     These words behave like control structure words, so you can use them
1549 :     with @code{CS-PICK} and @code{CS-ROLL} to restrict the scope in
1550 :     arbitrary ways.
1551 :    
1552 :     If you want a more exact answer to the visibility question, here's the
1553 :     basic principle: A local is visible in all places that can only be
1554 :     reached through the definition of the local@footnote{In compiler
1555 :     construction terminology, all places dominated by the definition of the
1556 :     local.}. In other words, it is not visible in places that can be reached
1557 :     without going through the definition of the local. E.g., locals defined
1558 :     in @code{IF}...@code{ENDIF} are visible until the @code{ENDIF}, locals
1559 :     defined in @code{BEGIN}...@code{UNTIL} are visible after the
1560 :     @code{UNTIL} (until, e.g., a subsequent @code{ENDSCOPE}).
1561 :    
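The following sketch illustrates the rule for the simple cases;
@code{.positive} and @code{sum-squared} are hypothetical words made up
for this example.

@example
: .positive ( n -- )
  dup 0> if
    dup @{ x @}
    x .          \ x is visible up to the endif
  endif
  \ x is not visible here
  drop ;

: sum-squared ( n1 n2 -- n3 )
  scope
    @{ a b @}
    a b +
  endscope
  \ a and b are not visible here
  dup * ;
@end example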
1562 :     The reasoning behind this solution is: We want to have the locals
1563 :     visible as long as it is meaningful. The user can always make the
1564 :     visibility shorter by using explicit scoping. In a place that can
1565 :     only be reached through the definition of a local, the meaning of a
1566 :     local name is clear. In other places it is not: How is the local
1567 :     initialized at the control flow path that does not contain the
1568 :     definition? Which local is meant, if the same name is defined twice in
1569 :     two independent control flow paths?
1570 :    
1571 :     This should be enough detail for nearly all users, so you can skip the
1572 :     rest of this section. If you really must know all the gory details and
1573 :     options, read on.
1574 :    
1575 :     In order to implement this rule, the compiler has to know which places
1576 :     are unreachable. It knows this automatically after @code{AHEAD},
1577 :     @code{AGAIN}, @code{EXIT} and @code{LEAVE}; in other cases (e.g., after
1578 :     most @code{THROW}s), you can use the word @code{UNREACHABLE} to tell the
1579 :     compiler that the control flow never reaches that place. If
1580 :     @code{UNREACHABLE} is not used where it could, the only consequence is
1581 :     that the visibility of some locals is more limited than the rule above
1582 :     says. If @code{UNREACHABLE} is used where it should not (i.e., if you
1583 :     lie to the compiler), buggy code will be produced.
1584 :    
1585 :     Another problem with this rule is that at @code{BEGIN}, the compiler
1586 : anton 1.3 does not know which locals will be visible on the incoming
1587 :     back-edge. All problems discussed in the following are due to this
1588 :     ignorance of the compiler (we discuss the problems using @code{BEGIN}
1589 :     loops as examples; the discussion also applies to @code{?DO} and other
1590 : anton 1.2 loops). Perhaps the most insidious example is:
1591 :     @example
1592 :     AHEAD
1593 :     BEGIN
1594 :     x
1595 :     [ 1 CS-ROLL ] THEN
1596 : anton 1.4 @{ x @}
1597 : anton 1.2 ...
1598 :     UNTIL
1599 :     @end example
1600 :    
1601 :     This should be legal according to the visibility rule. The use of
1602 :     @code{x} can only be reached through the definition; but that appears
1603 :     textually below the use.
1604 :    
1605 :     From this example it is clear that the visibility rules cannot be fully
1606 :     implemented without major headaches. Our implementation treats common
1607 :     cases as advertised and the exceptions are treated in a safe way: The
1608 :     compiler makes a reasonable guess about the locals visible after a
1609 :     @code{BEGIN}; if it is too pessimistic, the
1610 :     user will get a spurious error about the local not being defined; if the
1611 :     compiler is too optimistic, it will notice this later and issue a
1612 :     warning. In the case above the compiler would complain about @code{x}
1613 :     being undefined at its use. You can see from the obscure examples in
1614 :     this section that it takes quite unusual control structures to get the
1615 :     compiler into trouble, and even then it will often do fine.
1616 :    
1617 :     If the @code{BEGIN} is reachable from above, the most optimistic guess
1618 :     is that all locals visible before the @code{BEGIN} will also be
1619 :     visible after the @code{BEGIN}. This guess is valid for all loops that
1620 :     are entered only through the @code{BEGIN}, in particular, for normal
1621 :     @code{BEGIN}...@code{WHILE}...@code{REPEAT} and
1622 :     @code{BEGIN}...@code{UNTIL} loops and it is implemented in our
1623 :     compiler. When the branch to the @code{BEGIN} is finally generated by
1624 :     @code{AGAIN} or @code{UNTIL}, the compiler checks the guess and
1625 :     warns the user if it was too optimistic:
1626 :     @example
1627 :     IF
1628 : anton 1.4 @{ x @}
1629 : anton 1.2 BEGIN
1630 :     \ x ?
1631 :     [ 1 cs-roll ] THEN
1632 :     ...
1633 :     UNTIL
1634 :     @end example
1635 :    
1636 :     Here, @code{x} lives only until the @code{BEGIN}, but the compiler
1637 :     optimistically assumes that it lives until the @code{THEN}. It notices
1638 :     this difference when it compiles the @code{UNTIL} and issues a
1639 :     warning. The user can avoid the warning, and make sure that @code{x}
1640 :     is not used in the wrong area by using explicit scoping:
1641 :     @example
1642 :     IF
1643 :     SCOPE
1644 : anton 1.4 @{ x @}
1645 : anton 1.2 ENDSCOPE
1646 :     BEGIN
1647 :     [ 1 cs-roll ] THEN
1648 :     ...
1649 :     UNTIL
1650 :     @end example
1651 :    
1652 :     Since the guess is optimistic, there will be no spurious error messages
1653 :     about undefined locals.
1654 :    
1655 :     If the @code{BEGIN} is not reachable from above (e.g., after
1656 :     @code{AHEAD} or @code{EXIT}), the compiler cannot even make an
1657 :     optimistic guess, as the locals visible after the @code{BEGIN} may be
1658 :     defined later. Therefore, the compiler assumes that no locals are
1659 : anton 1.17 visible after the @code{BEGIN}. However, the user can use
1660 : anton 1.2 @code{ASSUME-LIVE} to make the compiler assume that the same locals are
1661 : anton 1.17 visible at the BEGIN as at the point where the top control-flow stack
1662 :     item was created.
1663 : anton 1.2
1664 :     doc-assume-live
1665 :    
1666 :     E.g.,
1667 :     @example
1668 : anton 1.4 @{ x @}
1669 : anton 1.2 AHEAD
1670 :     ASSUME-LIVE
1671 :     BEGIN
1672 :     x
1673 :     [ 1 CS-ROLL ] THEN
1674 :     ...
1675 :     UNTIL
1676 :     @end example
1677 :    
1678 :     Other cases where the locals are defined before the @code{BEGIN} can be
1679 :     handled by inserting an appropriate @code{CS-ROLL} before the
1680 :     @code{ASSUME-LIVE} (and changing the control-flow stack manipulation
1681 :     behind the @code{ASSUME-LIVE}).
1682 :    
1683 :     Cases where locals are defined after the @code{BEGIN} (but should be
1684 :     visible immediately after the @code{BEGIN}) can only be handled by
1685 :     rearranging the loop. E.g., the ``most insidious'' example above can be
1686 :     arranged into:
1687 :     @example
1688 :     BEGIN
1689 : anton 1.4 @{ x @}
1690 : anton 1.2 ... 0=
1691 :     WHILE
1692 :     x
1693 :     REPEAT
1694 :     @end example
1695 :    
1696 : anton 1.17 @node How long do locals live?, Programming Style, Where are locals visible by name?, Gforth locals
1697 : anton 1.2 @subsubsection How long do locals live?
1698 :    
1699 :     The right answer for the lifetime question would be: A local lives at
1700 :     least as long as it can be accessed. For a value-flavoured local this
1701 :     means: until the end of its visibility. However, a variable-flavoured
1702 :     local could be accessed through its address far beyond its visibility
1703 :     scope. Ultimately, this would mean that such locals would have to be
1704 :     garbage collected. Since this entails un-Forth-like implementation
1705 :     complexities, I adopted the same cowardly solution as some other
1706 :     languages (e.g., C): The local lives only as long as it is visible;
1707 :     afterwards its address is invalid (and programs that access it
1708 :     afterwards are erroneous).
1709 :    
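As a sketch of the kind of program this rule makes erroneous (the word
@code{dangling} is a hypothetical example):

@example
: dangling ( n -- addr )
  @{ W^ x @}
  x ;   \ x no longer lives after dangling returns,
        \ so the caller must not access addr
@end example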
1710 : anton 1.17 @node Programming Style, Implementation, How long do locals live?, Gforth locals
1711 : anton 1.2 @subsubsection Programming Style
1712 :    
1713 :     The freedom to define locals anywhere has the potential to change
1714 :     programming styles dramatically. In particular, the need to use the
1715 :     return stack for intermediate storage vanishes. Moreover, all stack
1716 :     manipulations (except @code{PICK}s and @code{ROLL}s with run-time
1717 :     determined arguments) can be eliminated: If the stack items are in the
1718 :     wrong order, just write a locals definition for all of them; then
1719 :     write the items in the order you want.
1720 :    
1721 :     This seems a little far-fetched and eliminating stack manipulations is
1722 : anton 1.4 unlikely to become a conscious programming objective. Still, the number
1723 :     of stack manipulations will be reduced dramatically if local variables
1724 : anton 1.17 are used liberally (e.g., compare @code{max} in @ref{Gforth locals} with
1725 : anton 1.4 a traditional implementation of @code{max}).
1726 : anton 1.2
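For comparison, one possible traditional definition of @code{max} using
stack manipulation words (a sketch; other formulations exist):

@example
: max ( n1 n2 -- n3 )
  2dup < if
    swap
  then
  drop ;
@end example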
1727 :     This shows one potential benefit of locals: making Forth programs more
1728 :     readable. Of course, this benefit will only be realized if the
1729 :     programmers continue to honour the principle of factoring instead of
1730 :     using the added latitude to make the words longer.
1731 :    
1732 :     Using @code{TO} can and should be avoided. Without @code{TO},
1733 :     every value-flavoured local has only a single assignment and many
1734 :     advantages of functional languages apply to Forth. I.e., programs are
1735 :     easier to analyse, to optimize and to read: It is clear from the
1736 :     definition what the local stands for, it does not turn into something
1737 :     different later.
1738 :    
1739 :     E.g., a definition using @code{TO} might look like this:
1740 :     @example
1741 :     : strcmp @{ addr1 u1 addr2 u2 -- n @}
1742 :     u1 u2 min 0
1743 :     ?do
1744 : anton 1.36 addr1 c@@ addr2 c@@ -
1745 : anton 1.31 ?dup-if
1746 : anton 1.2 unloop exit
1747 :     then
1748 :     addr1 char+ TO addr1
1749 :     addr2 char+ TO addr2
1750 :     loop
1751 :     u1 u2 - ;
1752 :     @end example
1753 :     Here, @code{TO} is used to update @code{addr1} and @code{addr2} at
1754 :     every loop iteration. @code{strcmp} is a typical example of the
1755 :     readability problems of using @code{TO}. When you start reading
1756 :     @code{strcmp}, you think that @code{addr1} refers to the start of the
1757 :     string. Only near the end of the loop do you realize that it is something
1758 :     else.
1759 :    
1760 :     This can be avoided by defining two locals at the start of the loop that
1761 :     are initialized with the right value for the current iteration.
1762 :     @example
1763 :     : strcmp @{ addr1 u1 addr2 u2 -- n @}
1764 :     addr1 addr2
1765 :     u1 u2 min 0
1766 :     ?do @{ s1 s2 @}
1767 : anton 1.36 s1 c@@ s2 c@@ -
1768 : anton 1.31 ?dup-if
1769 : anton 1.2 unloop exit
1770 :     then
1771 :     s1 char+ s2 char+
1772 :     loop
1773 :     2drop
1774 :     u1 u2 - ;
1775 :     @end example
1776 :     Here it is clear from the start that @code{s1} has a different value
1777 :     in every loop iteration.
1778 :    
1779 : anton 1.17 @node Implementation, , Programming Style, Gforth locals
1780 : anton 1.2 @subsubsection Implementation
1781 :    
1782 : anton 1.17 Gforth uses an extra locals stack. The most compelling reason for
1783 : anton 1.2 this is that the return stack is not float-aligned; using an extra stack
1784 :     also eliminates the problems and restrictions of using the return stack
1785 :     as locals stack. Like the other stacks, the locals stack grows toward
1786 :     lower addresses. A few primitives allow an efficient implementation:
1787 :    
1788 :     doc-@local#
1789 :     doc-f@local#
1790 :     doc-laddr#
1791 :     doc-lp+!#
1792 :     doc-lp!
1793 :     doc->l
1794 :     doc-f>l
1795 :    
1796 :     In addition to these primitives, some specializations of these
1797 :     primitives for commonly occurring inline arguments are provided for
1798 :     efficiency reasons, e.g., @code{@@local0} as specialization of
1799 :     @code{@@local#} for the inline argument 0. The following compiling words
1800 :     compile the right specialized version, or the general version, as
1801 :     appropriate:
1802 :    
1803 : anton 1.12 doc-compile-@local
1804 :     doc-compile-f@local
1805 : anton 1.2 doc-compile-lp+!
1806 :    
1807 :     Combinations of conditional branches and @code{lp+!#} like
1808 :     @code{?branch-lp+!#} (the locals pointer is only changed if the branch
1809 :     is taken) are provided for efficiency and correctness in loops.
1810 :    
1811 :     A special area in the dictionary space is reserved for keeping the
1812 :     local variable names. @code{@{} switches the dictionary pointer to this
1813 :     area and @code{@}} switches it back and generates the locals
1814 :     initializing code. @code{W:} etc.@ are normal defining words. This
1815 :     special area is cleared at the start of every colon definition.
1816 :    
1817 : anton 1.17 A special feature of Gforth's dictionary is used to implement the
1818 : anton 1.2 definition of locals without type specifiers: every wordlist (aka
1819 :     vocabulary) has its own methods for searching
1820 : anton 1.4 etc. (@pxref{Wordlists}). For the present purpose we defined a wordlist
1821 : anton 1.2 with a special search method: When it is searched for a word, it
1822 :     actually creates that word using @code{W:}. @code{@{} changes the search
1823 :     order to first search the wordlist containing @code{@}}, @code{W:} etc.,
1824 :     and then the wordlist for defining locals without type specifiers.
1825 :    
1826 :     The lifetime rules support a stack discipline within a colon
1827 :     definition: The lifetime of a local is either nested with other locals'
1828 :     lifetimes or it does not overlap them.
1829 :    
1830 :     At @code{BEGIN}, @code{IF}, and @code{AHEAD} no code for locals stack
1831 :     pointer manipulation is generated. Between control structure words
1832 :     locals definitions can push locals onto the locals stack. @code{AGAIN}
1833 :     is the simplest of the other three control flow words. It has to
1834 :     restore the locals stack depth of the corresponding @code{BEGIN}
1835 :     before branching. The code looks like this:
1836 :     @format
1837 :     @code{lp+!#} current-locals-size @minus{} dest-locals-size
1838 :     @code{branch} <begin>
1839 :     @end format
1840 :    
1841 :     @code{UNTIL} is a little more complicated: If it branches back, it
1842 :     must adjust the stack just like @code{AGAIN}. But if it falls through,
1843 :     the locals stack must not be changed. The compiler generates the
1844 :     following code:
1845 :     @format
1846 :     @code{?branch-lp+!#} <begin> current-locals-size @minus{} dest-locals-size
1847 :     @end format
1848 :     The locals stack pointer is only adjusted if the branch is taken.
1849 :    
1850 :     @code{THEN} can produce somewhat inefficient code:
1851 :     @format
1852 :     @code{lp+!#} current-locals-size @minus{} orig-locals-size
1853 :     <orig target>:
1854 :     @code{lp+!#} orig-locals-size @minus{} new-locals-size
1855 :     @end format
1856 :     The second @code{lp+!#} adjusts the locals stack pointer from the
1857 : anton 1.4 level at the @var{orig} point to the level after the @code{THEN}. The
1858 : anton 1.2 first @code{lp+!#} adjusts the locals stack pointer from the current
1859 :     level to the level at the orig point, so the complete effect is an
1860 :     adjustment from the current level to the right level after the
1861 :     @code{THEN}.
1862 :    
1863 :     In a conventional Forth implementation a dest control-flow stack entry
1864 :     is just the target address and an orig entry is just the address to be
1865 :     patched. Our locals implementation adds a wordlist to every orig or dest
1866 :     item. It is the list of locals visible (or assumed visible) at the point
1867 :     described by the entry. Our implementation also adds a tag to identify
1868 :     the kind of entry, in particular to differentiate between live and dead
1869 :     (reachable and unreachable) orig entries.
1870 :    
1871 :     A few unusual operations have to be performed on locals wordlists:
1872 :    
1873 :     doc-common-list
1874 :     doc-sub-list?
1875 :     doc-list-size
1876 :    
1877 :     Several features of our locals wordlist implementation make these
1878 :     operations easy to implement: The locals wordlists are organised as
1879 :     linked lists; the tails of these lists are shared, if the lists
1880 :     contain some of the same locals; and the address of a name is greater
1881 :     than the address of the names behind it in the list.
1882 :    
1883 :     Another important implementation detail is the variable
1884 :     @code{dead-code}. It is used by @code{BEGIN} and @code{THEN} to
1885 :     determine if they can be reached directly or only through the branch
1886 :     that they resolve. @code{dead-code} is set by @code{UNREACHABLE},
1887 :     @code{AHEAD}, @code{EXIT} etc., and cleared at the start of a colon
1888 :     definition, by @code{BEGIN} and usually by @code{THEN}.
1889 :    
1890 :     Counted loops are similar to other loops in most respects, but
1891 :     @code{LEAVE} requires special attention: It performs basically the same
1892 :     service as @code{AHEAD}, but it does not create a control-flow stack
1893 :     entry. Therefore the information has to be stored elsewhere;
1894 :     traditionally, the information was stored in the target fields of the
1895 :     branches created by the @code{LEAVE}s, by organizing these fields into a
1896 :     linked list. Unfortunately, this clever trick does not provide enough
1897 :     space for storing our extended control flow information. Therefore, we
1898 :     introduce another stack, the leave stack. It contains the control-flow
1899 :     stack entries for all unresolved @code{LEAVE}s.
1900 :    
1901 :     Local names are kept until the end of the colon definition, even if
1902 :     they are no longer visible in any control-flow path. In a few cases
1903 :     this may lead to increased space needs for the locals name area, but
1904 :     usually less than reclaiming this space would cost in code size.
1905 :    
1906 :    
1907 : anton 1.17 @node ANS Forth locals, , Gforth locals, Locals
1908 : anton 1.2 @subsection ANS Forth locals
1909 :    
1910 :     The ANS Forth locals wordset does not define a syntax for locals, but
1911 :     words that make it possible to define various syntaxes. One of the
1912 : anton 1.17 possible syntaxes is a subset of the syntax we used in the Gforth locals
1913 : anton 1.2 wordset, i.e.:
1914 :    
1915 :     @example
1916 :     @{ local1 local2 ... -- comment @}
1917 :     @end example
1918 :     or
1919 :     @example
1920 :     @{ local1 local2 ... @}
1921 :     @end example
1922 :    
1923 :     The order of the locals corresponds to the order in a stack comment. The
1924 :     restrictions are:
1925 : anton 1.1
1926 : anton 1.2 @itemize @bullet
1927 :     @item
1928 : anton 1.17 Locals can only be cell-sized values (no type specifiers are allowed).
1929 : anton 1.2 @item
1930 :     Locals can be defined only outside control structures.
1931 :     @item
1932 :     Locals can interfere with explicit usage of the return stack. For the
1933 :     exact (and long) rules, see the standard. If you don't use return stack
1934 : anton 1.17 accessing words in a definition using locals, you will be all right. The
1935 : anton 1.2 purpose of this rule is to make locals implementation on the return
1936 :     stack easier.
1937 :     @item
1938 :     The whole definition must be in one line.
1939 :     @end itemize
1940 :    
1941 : anton 1.35 Locals defined in this way behave like @code{VALUE}s (@pxref{Simple
1942 :     Defining Words}). I.e., they are initialized from the stack. Using their
1943 : anton 1.2 name produces their value. Their value can be changed using @code{TO}.
1944 :    
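A minimal sketch of this @code{VALUE}-like behaviour; the word
@code{countdown} is a hypothetical example. The local is defined outside
the control structure, as the restrictions above require.

@example
: countdown @{ n -- @}
  begin
    n 0>
  while
    n .          \ using the name produces the value
    n 1- TO n    \ TO changes the value
  repeat ;
@end example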
1945 : anton 1.17 Since this syntax is supported by Gforth directly, you need not do
1946 : anton 1.2 anything to use it. If you want to port a program using this syntax to
1947 : anton 1.30 another ANS Forth system, use @file{compat/anslocal.fs} to implement the
1948 :     syntax on the other system.
1949 : anton 1.2
1950 :     Note that a syntax shown in the standard, section A.13 looks
1951 :     similar, but is quite different in having the order of locals
1952 :     reversed. Beware!
1953 :    
1954 :     The ANS Forth locals wordset itself consists of the following word:
1955 :    
1956 :     doc-(local)
1957 :    
1958 :     The ANS Forth locals extension wordset defines a syntax, but it is so
1959 :     awful that we strongly recommend not to use it. We have implemented this
1960 : anton 1.17 syntax to make porting to Gforth easy, but do not document it here. The
1961 : anton 1.2 problem with this syntax is that the locals are defined in an order
1962 :     reversed with respect to the standard stack comment notation, making
1963 :     programs harder to read, and easier to misread and miswrite. The only
1964 :     merit of this syntax is that it is easy to implement using the ANS Forth
1965 :     locals wordset.
1966 : anton 1.3
1967 : anton 1.37 @node Defining Words, Tokens for Words, Locals, Words
1968 : anton 1.4 @section Defining Words
1969 :    
1970 : anton 1.14 @menu
1971 : anton 1.35 * Simple Defining Words::
1972 :     * Colon Definitions::
1973 :     * User-defined Defining Words::
1974 :     * Supplying names::
1975 :     * Interpretation and Compilation Semantics::
1976 : anton 1.14 @end menu
1977 :    
1978 : anton 1.35 @node Simple Defining Words, Colon Definitions, Defining Words, Defining Words
1979 :     @subsection Simple Defining Words
1980 :    
1981 :     doc-constant
1982 :     doc-2constant
1983 :     doc-fconstant
1984 :     doc-variable
1985 :     doc-2variable
1986 :     doc-fvariable
1987 :     doc-create
1988 :     doc-user
1989 :     doc-value
1990 :     doc-to
1991 :     doc-defer
1992 :     doc-is
1993 :    
1994 :     @node Colon Definitions, User-defined Defining Words, Simple Defining Words, Defining Words
1995 :     @subsection Colon Definitions
1996 :    
1997 :     @example
1998 :     : name ( ... -- ... )
1999 :     word1 word2 word3 ;
2000 :     @end example
2001 :    
2002 :     creates a word called @code{name}, that, upon execution, executes
2003 :     @code{word1 word2 word3}. @code{name} is a @dfn{(colon) definition}.
2004 :    
2005 :     The explanation above is somewhat superficial. @xref{Interpretation and
2006 :     Compilation Semantics}, for an in-depth discussion of some of the issues
2007 :     involved.
2008 :    
2009 :     doc-:
2010 :     doc-;
2011 :    
2012 :     @node User-defined Defining Words, Supplying names, Colon Definitions, Defining Words
2013 :     @subsection User-defined Defining Words
2014 :    
2015 :     You can create new defining words simply by wrapping defining-time code
2016 :     around existing defining words and putting the sequence in a colon
2017 :     definition.
2018 :    
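E.g., a defining word for cell arrays could be sketched like this
(@code{array} and @code{scratch} are hypothetical names chosen for this
example):

@example
: array ( n "name" -- )
  \ define name with room for n cells in its body
  create cells allot ;

10 array scratch
42 scratch 3 cells + !   \ store 42 in element 3
@end example

Words defined with @code{array} behave like any other word defined with
@code{Create}: they push the address of their body.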
2019 : anton 1.36 If you want the words defined with your defining words to behave
2020 :     differently from words defined with standard defining words, you can
2021 : anton 1.35 write your defining word like this:
2022 :    
2023 :     @example
2024 :     : def-word ( "name" -- )
2025 :     Create @var{code1}
2026 :     DOES> ( ... -- ... )
2027 :     @var{code2} ;
2028 :    
2029 :     def-word name
2030 :     @end example
2031 :    
2032 :     Technically, this fragment defines a defining word @code{def-word}, and
2033 :     a word @code{name}; when you execute @code{name}, the address of the
2034 :     body of @code{name} is put on the data stack and @var{code2} is executed
2035 :     (the address of the body of @code{name} is the address @code{HERE}
2036 : anton 1.36 returns immediately after the @code{CREATE}).
2037 :    
2038 :     In other words, if you make the following definitions:
2039 :    
2040 :     @example
2041 :     : def-word1 ( "name" -- )
2042 :     Create @var{code1} ;
2043 :    
2044 :     : action1 ( ... -- ... )
2045 :     @var{code2} ;
2046 :    
2047 :     def-word1 name1
2048 :     @end example
2049 :    
2050 :     Using @code{name1 action1} is equivalent to using @code{name}.
2051 :    
2052 :     E.g., you can implement @code{Constant} in this way:
2053 : anton 1.35
2054 :     @example
2055 :     : constant ( w "name" -- )
2056 :     create ,
2057 :     DOES> ( -- w )
2058 : anton 1.36 @@ ;
2059 : anton 1.35 @end example
2060 :    
2061 :     When you create a constant with @code{5 constant five}, first a new word
2062 :     @code{five} is created, then the value 5 is laid down in the body of
2063 :     @code{five} with @code{,}. When @code{five} is invoked, the address of
2064 :     the body is put on the stack, and @code{@@} retrieves the value 5.
2065 :    
2066 :     In the example above the stack comment after the @code{DOES>} specifies
2067 :     the stack effect of the defined words, not the stack effect of the
2068 :     following code (the following code expects the address of the body on
2069 :     the top of stack, which is not reflected in the stack comment). This is
2070 :     the convention that I use and recommend (it clashes a bit with using
2071 :     locals declarations for stack effect specification, though).
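
As a further illustration of this convention, here is a minimal sketch of
a defining word for cell arrays. The names @code{array} and
@code{squares} are made up for this example and are not Gforth words:

@example
: array ( n "name" -- )
    create cells allot   \ reserve n cells in the body of name
DOES> ( i -- addr )
    swap cells + ;       \ body address + i cells

10 array squares
42 3 squares !           \ store 42 in element 3
3 squares @@ .           \ prints 42
@end example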
2072 :    
2073 :     @subsubsection Applications of @code{CREATE..DOES>}
2074 :    
2075 : anton 1.36 You may wonder how to use this feature. Here are some usage patterns:
2076 : anton 1.35
2077 :     When you see a sequence of code occurring several times, and you can
2078 :     identify a meaning, you will factor it out as a colon definition. When
2079 :     you see similar colon definitions, you can factor them using
2080 :     @code{CREATE..DOES>}. E.g., an assembler usually defines several words
2081 :     that look very similar:
2082 :     @example
2083 :     : ori, ( reg-target reg-source n -- )
2084 :     0 asm-reg-reg-imm ;
2085 :     : andi, ( reg-target reg-source n -- )
2086 :     1 asm-reg-reg-imm ;
2087 :     @end example
2088 :    
2089 :     This could be factored with:
2090 :     @example
2091 :     : reg-reg-imm ( op-code -- )
2092 :     create ,
2093 :     DOES> ( reg-target reg-source n -- )
2094 : anton 1.36 @@ asm-reg-reg-imm ;
2095 : anton 1.35
2096 :     0 reg-reg-imm ori,
2097 :     1 reg-reg-imm andi,
2098 :     @end example
2099 :    
2100 :     Another view of @code{CREATE..DOES>} is to consider it as a crude way to
2101 :     supply a part of the parameters for a word (known as @dfn{currying} in
2102 :     the functional language community). E.g., @code{+} needs two
2103 :     parameters. Creating versions of @code{+} with one parameter fixed can
2104 :     be done like this:
2105 :     @example
2106 :     : curry+ ( n1 -- )
2107 :     create ,
2108 :     DOES> ( n2 -- n1+n2 )
2109 : anton 1.36 @@ + ;
2110 : anton 1.35
2111 :     3 curry+ 3+
2112 :     -2 curry+ 2-
2113 :     @end example
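
Assuming the definitions above have been loaded, the curried words can
then be used like ordinary words:

@example
5 3+ .    \ prints 8
5 2- .    \ prints 3
@end example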
2114 :    
2115 :     @subsubsection The gory details of @code{CREATE..DOES>}
2116 :    
2117 :     doc-does>
2118 :    
2119 :     This means that you need not use @code{CREATE} and @code{DOES>} in the
2120 :     same definition; e.g., you can put the @code{DOES>}-part in a separate
2121 :     definition. This allows us to, e.g., select among different @code{DOES>}-parts:
2122 :     @example
2123 :     : does1
2124 :     DOES> ( ... -- ... )
2125 :     ... ;
2126 :    
2127 :     : does2
2128 :     DOES> ( ... -- ... )
2129 :     ... ;
2130 :    
2131 :     : def-word ( ... -- ... )
2132 :     create ...
2133 :     IF
2134 :     does1
2135 :     ELSE
2136 :     does2
2137 :     ENDIF ;
2138 :     @end example
2139 :    
2140 :     In a standard program you can apply a @code{DOES>}-part only if the last
2141 :     word was defined with @code{CREATE}. In Gforth, the @code{DOES>}-part
2142 :     will override the behaviour of the last word defined in any case. In a
2143 :     standard program, you can use @code{DOES>} only in a colon
2144 :     definition. In Gforth, you can also use it in interpretation state, in a
2145 :     kind of one-shot mode:
2146 :     @example
2147 :     CREATE name ( ... -- ... )
2148 :     @var{initialization}
2149 :     DOES>
2150 :     @var{code} ;
2151 :     @end example
2152 :     This is equivalent to the standard
2153 :     @example
2154 :     :noname
2155 :     DOES>
2156 :     @var{code} ;
2157 :     CREATE name EXECUTE ( ... -- ... )
2158 :     @var{initialization}
2159 :     @end example
2160 :    
2161 :     You can get the address of the body of a word with
2162 :    
2163 :     doc->body
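
For example, with a hypothetical word @code{foo} whose body holds one
cell:

@example
create foo 5 ,
' foo >body @@ .   \ prints 5
foo @@ .           \ prints 5; foo pushes the same body address
@end example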
2164 :    
2165 :     @node Supplying names, Interpretation and Compilation Semantics, User-defined Defining Words, Defining Words
2166 :     @subsection Supplying names for the defined words
2167 :    
2168 :     By default, defining words take the names for the defined words from the
2169 :     input stream. Sometimes you want to supply the name from a string. You
2170 :     can do this with
2171 :    
2172 :     doc-nextname
2173 :    
2174 :     E.g.,
2175 :    
2176 :     @example
2177 :     s" foo" nextname create
2178 :     @end example
2179 :     is equivalent to
2180 :     @example
2181 :     create foo
2182 :     @end example
2183 :    
2184 :     Sometimes you want to define a word without a name. You can do this with
2185 :    
2186 :     doc-noname
2187 :    
2188 :     To make any use of the newly defined word, you need its execution
2189 :     token. You can get it with
2190 :    
2191 :     doc-lastxt
2192 :    
2193 :     E.g., you can initialize a deferred word with an anonymous colon
2194 :     definition:
2195 :     @example
2196 :     Defer deferred
2197 :     noname : ( ... -- ... )
2198 :     ... ;
2199 :     lastxt IS deferred
2200 :     @end example
2201 :    
2202 :     @code{lastxt} also works when the last word was not defined as
2203 :     @code{noname}.
2204 :    
2205 :     The standard has also recognized the need for anonymous words and
2206 :     provides
2207 :    
2208 :     doc-:noname
2209 :    
2210 :     This leaves the execution token for the word on the stack after the
2211 :     closing @code{;}. You can rewrite the last example with @code{:noname}:
2212 :     @example
2213 :     Defer deferred
2214 :     :noname ( ... -- ... )
2215 :     ... ;
2216 :     IS deferred
2217 :     @end example
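
A concrete sketch of this pattern, using the made-up name @code{greet}:

@example
Defer greet
:noname ( -- )
    ." hello" ;
IS greet
greet    \ prints hello
@end example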
2218 :    
2219 :     @node Interpretation and Compilation Semantics, , Supplying names, Defining Words
2220 :     @subsection Interpretation and Compilation Semantics
2221 :    
2222 : anton 1.36 The @dfn{interpretation semantics} of a word are what the text
2223 :     interpreter does when it encounters the word in interpret state. They also
2224 :     appear in some other contexts, e.g., the execution token returned by
2225 :     @code{' @var{word}} identifies the interpretation semantics of
2226 :     @var{word} (in other words, @code{' @var{word} execute} is equivalent to
2227 :     interpret-state text interpretation of @code{@var{word}}).
2228 :    
2229 :     The @dfn{compilation semantics} of a word are what the text interpreter
2230 :     does when it encounters the word in compile state. They also appear in
2231 :     other contexts, e.g., @code{POSTPONE @var{word}} compiles@footnote{In
2232 :     standard terminology, ``appends to the current definition''.} the
2233 :     compilation semantics of @var{word}.
2234 :    
2235 :     The standard also talks about @dfn{execution semantics}. They are used
2236 :     only for defining the interpretation and compilation semantics of many
2237 :     words. By default, the interpretation semantics of a word are to
2238 :     @code{execute} its execution semantics, and the compilation semantics of
2239 :     a word are to @code{compile,} its execution semantics.@footnote{In
2240 :     standard terminology: The default interpretation semantics are its
2241 :     execution semantics; the default compilation semantics are to append its
2242 :     execution semantics to the execution semantics of the current
2243 :     definition.}
2244 :    
2245 :     You can change the compilation semantics into @code{execute}ing the
2246 :     execution semantics with
2247 :    
2248 : anton 1.35 doc-immediate
2249 : anton 1.36
2250 :     You can remove the interpretation semantics of a word with
2251 :    
2252 :     doc-compile-only
2253 :     doc-restrict
2254 :    
2255 :     Note that ticking (@code{'}) compile-only words gives an error
2256 :     (``Interpreting a compile-only word'').
2257 :    
2258 :     Gforth also allows you to define words with arbitrary combinations of
2259 :     interpretation and compilation semantics.
2260 :    
2261 : anton 1.35 doc-interpret/compile:
2262 :    
2263 : anton 1.36 This feature was introduced for implementing @code{TO} and @code{S"}. I
2264 :     recommend that you do not define such words, as cute as they may be:
2265 :     they make it hard to get at both parts of the word in some contexts.
2266 :     E.g., assume you want to get an execution token for the compilation
2267 :     part. Instead, define two words, one that embodies the interpretation
2268 :     part, and one that embodies the compilation part.
2269 :    
2270 :     There is, however, a potentially useful application of this feature:
2271 :     Providing differing implementations for the default semantics. While
2272 :     this introduces redundancy and is therefore usually a bad idea, a
2273 :     performance improvement may be worth the trouble. E.g., consider the
2274 :     word @code{foobar}:
2275 :    
2276 :     @example
2277 :     : foobar
2278 :     foo bar ;
2279 :     @end example
2280 :    
2281 :     Let us assume that @code{foobar} is called so frequently that the
2282 :     calling overhead would take a significant amount of the run-time. We can
2283 :     optimize it with @code{interpret/compile:}:
2284 : anton 1.35
2285 : anton 1.36 @example
2286 :     :noname
2287 :     foo bar ;
2288 :     :noname
2289 :     POSTPONE foo POSTPONE bar ;
2290 :     interpret/compile: foobar
2291 :     @end example
2292 :    
2293 :     This definition has the same interpretation semantics and essentially
2294 :     the same compilation semantics as the simple definition of
2295 :     @code{foobar}, but the implementation of the compilation semantics is
2296 :     more efficient with respect to run-time.
2297 :    
2298 :     Some people try to use state-smart words to emulate the feature provided
2299 :     by @code{interpret/compile:} (words are state-smart if they check
2300 :     @code{STATE} during execution). E.g., they would try to code
2301 :     @code{foobar} like this:
2302 :    
2303 :     @example
2304 :     : foobar
2305 :     STATE @@
2306 :     IF ( compilation state )
2307 :     POSTPONE foo POSTPONE bar
2308 :     ELSE
2309 :     foo bar
2310 :     ENDIF ; immediate
2311 :     @end example
2312 :    
2313 :     While this works if @code{foobar} is processed only by the text
2314 :     interpreter, it does not work in other contexts (like @code{'} or
2315 :     @code{POSTPONE}). E.g., @code{' foobar} will produce an execution token
2316 :     for a state-smart word, not for the interpretation semantics of the
2317 :     original @code{foobar}; when you execute this execution token (directly
2318 :     with @code{EXECUTE} or indirectly through @code{COMPILE,}) in compile
2319 :     state, the result will not be what you expected (i.e., it will not
2320 :     perform @code{foo bar}). State-smart words are a bad idea. Simply don't
2321 :     write them!
2322 :    
2323 :     It is also possible to write defining words that define words with
2324 :     arbitrary combinations of interpretation and compilation semantics (or,
2325 :     preferably, arbitrary combinations of implementations of the default
2326 :     semantics). In general, this looks like:
2327 :    
2328 :     @example
2329 :     : def-word
2330 :     create-interpret/compile
2331 :     @var{code1}
2332 :     interpretation>
2333 :     @var{code2}
2334 :     <interpretation
2335 :     compilation>
2336 :     @var{code3}
2337 :     <compilation ;
2338 :     @end example
2339 :    
2340 :     For a @var{word} defined with @code{def-word}, the interpretation
2341 :     semantics are to push the address of the body of @var{word} and perform
2342 :     @var{code2}, and the compilation semantics are to push the address of
2343 :     the body of @var{word} and perform @var{code3}. E.g., @code{constant}
2344 :     can also be defined like this:
2345 :    
2346 :     @example
2347 :     : constant ( n "name" -- )
2348 :     create-interpret/compile
2349 :     ,
2350 :     interpretation> ( -- n )
2351 :     @@
2352 :     <interpretation
2353 :     compilation> ( compilation. -- ; run-time. -- n )
2354 :     @@ postpone literal
2355 :     <compilation ;
2356 :     @end example
2357 :    
2358 :     doc-create-interpret/compile
2359 :     doc-interpretation>
2360 :     doc-<interpretation
2361 :     doc-compilation>
2362 :     doc-<compilation
2363 :    
2364 :     Note that words defined with @code{interpret/compile:} and
2365 :     @code{create-interpret/compile} have an extended header structure that
2366 :     differs from other words; however, unless you try to access them with
2367 :     plain address arithmetic, you should not notice this. Words for
2368 :     accessing the header structure usually know how to deal with this; e.g.,
2369 :     @code{' word >body} also gives you the body of a word created with
2370 :     @code{create-interpret/compile}.
2371 : anton 1.4
2372 : anton 1.37 @node Tokens for Words, Wordlists, Defining Words, Words
2373 :     @section Tokens for Words
2374 :    
2375 :     This section describes the creation and use of tokens that represent
2376 :     words on the stack (and in data space).
2377 :    
2378 :     Named words have interpretation and compilation semantics. Unnamed words
2379 :     just have execution semantics.
2380 :    
2381 :     An @dfn{execution token} represents the execution semantics of an
2382 :     unnamed word. An execution token occupies one cell. As explained in
2383 :     section @ref{Supplying names}, the execution token of the last word
2384 :     defined can be produced with
2385 :    
2386 :     short-lastxt
2387 :    
2388 :     You can perform the semantics represented by an execution token with
2389 :     doc-execute
2390 :     You can compile the word with
2391 :     doc-compile,
2392 :    
2393 :     In Gforth, the abstract data type @emph{execution token} is implemented
2394 :     as CFA (code field address).
2395 :    
2396 :     The interpretation semantics of a named word are also represented by an
2397 :     execution token. You can get it with
2398 :    
2399 :     doc-[']
2400 :     doc-'
2401 :    
2402 :     For literals, you use @code{'} in interpreted code and @code{[']} in
2403 :     compiled code. Gforth's @code{'} and @code{[']} behave somewhat unusually
2404 :     by complaining about compile-only words. To get an execution token for a
2405 :     compiling word @var{X}, use @code{COMP' @var{X} drop} or @code{[COMP']
2406 :     @var{X} drop}.
2407 :    
2408 :     The compilation semantics are represented by a @dfn{compilation token}
2409 :     consisting of two cells: @var{w xt}. The top cell @var{xt} is an
2410 :     execution token. The compilation semantics represented by the
2411 :     compilation token can be performed with @code{execute}, which consumes
2412 :     the whole compilation token, with an additional stack effect determined
2413 :     by the represented compilation semantics.
2414 :    
2415 :     doc-[comp']
2416 :     doc-comp'
2417 :    
2418 : anton 1.38 You can compile the compilation semantics with @code{postpone,}. I.e.,
2419 :     @code{COMP' @var{word} POSTPONE,} is equivalent to @code{POSTPONE
2420 :     @var{word}}.
2421 :    
2422 :     doc-postpone,
2423 :    
2424 : anton 1.37 At present, the @var{w} part of a compilation token is an execution
2425 :     token, and the @var{xt} part represents either @code{execute} or
2426 :     @code{compile,}. However, don't rely on that knowledge, unless necessary;
2427 :     we may introduce unusual compilation tokens in the future (e.g.,
2428 :     compilation tokens representing the compilation semantics of literals).
2429 :    
2430 :     Named words are also represented by the @dfn{name token}. The abstract
2431 :     data type @emph{name token} is implemented as NFA (name field address).
2432 :    
2433 :     doc-find-name
2434 :     doc-name>int
2435 :     doc-name?int
2436 :     doc-name>comp
2437 :     doc-name>string
2438 :    
2439 :     @node Wordlists, Files, Tokens for Words, Words
2440 : anton 1.4 @section Wordlists
2441 :    
2442 :     @node Files, Blocks, Wordlists, Words
2443 :     @section Files
2444 :    
2445 :     @node Blocks, Other I/O, Files, Words
2446 :     @section Blocks
2447 :    
2448 :     @node Other I/O, Programming Tools, Blocks, Words
2449 :     @section Other I/O
2450 :    
2451 : anton 1.18 @node Programming Tools, Assembler and Code words, Other I/O, Words
2452 : anton 1.4 @section Programming Tools
2453 :    
2454 : anton 1.5 @menu
2455 :     * Debugging:: Simple and quick.
2456 :     * Assertions:: Making your programs self-checking.
2457 :     @end menu
2458 :    
2459 :     @node Debugging, Assertions, Programming Tools, Programming Tools
2460 : anton 1.4 @subsection Debugging
2461 :    
2462 :     The simple debugging aids provided in @file{debugging.fs}
2463 :     are meant to support a different style of debugging than the
2464 :     tracing/stepping debuggers used in languages with long turn-around
2465 :     times.
2466 :    
2467 :     A much better (faster) way in fast-compiling languages is to add
2468 :     printing code at well-selected places, let the program run, look at
2469 :     the output, see where things went wrong, add more printing code, etc.,
2470 :     until the bug is found.
2471 :    
2472 :     The word @code{~~} is easy to insert. It just prints debugging
2473 :     information (by default the source location and the stack contents). It
2474 :     is also easy to remove (@kbd{C-x ~} in the Emacs Forth mode to
2475 :     query-replace them with nothing). The deferred words
2476 :     @code{printdebugdata} and @code{printdebugline} control the output of
2477 :     @code{~~}. The default source location output format works well with
2478 :     Emacs' compilation mode, so you can step through the program at the
2479 : anton 1.5 source level using @kbd{C-x `} (the advantage over a stepping debugger
2480 :     is that you can step in any direction and you know where the crash has
2481 :     happened or where the strange data has occurred).
2482 : anton 1.4
2483 :     Note that the default actions clobber the contents of the pictured
2484 :     numeric output string, so you should not use @code{~~}, e.g., between
2485 :     @code{<#} and @code{#>}.
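
A sketch of typical use, assuming @file{debugging.fs} has been loaded
(the word @code{square} is only an illustration; the exact output format
depends on @code{printdebugdata} and @code{printdebugline}):

@example
: square ( n -- n*n )
    ~~ dup * ~~ ;   \ report location and stack before and after

5 square .
@end example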
2486 :    
2487 :     doc-~~
2488 :     doc-printdebugdata
2489 :     doc-printdebugline
2490 :    
2491 : anton 1.5 @node Assertions, , Debugging, Programming Tools
2492 : anton 1.4 @subsection Assertions
2493 :    
2494 : anton 1.5 It is a good idea to make your programs self-checking, in particular, if
2495 :     you use an assumption (e.g., that a certain field of a data structure is
2496 : anton 1.17 never zero) that may become wrong during maintenance. Gforth supports
2497 : anton 1.5 assertions for this purpose. They are used like this:
2498 :    
2499 :     @example
2500 :     assert( @var{flag} )
2501 :     @end example
2502 :    
2503 :     The code between @code{assert(} and @code{)} should compute a flag that
2504 :     should be true if everything is all right and false otherwise. It should
2505 :     not change anything else on the stack. The overall stack effect of the
2506 :     assertion is @code{( -- )}. E.g.
2507 :    
2508 :     @example
2509 :     assert( 1 1 + 2 = ) \ what we learn in school
2510 :     assert( dup 0<> ) \ assert that the top of stack is not zero
2511 :     assert( false ) \ this code should not be reached
2512 :     @end example
2513 :    
2514 :     The need for assertions is different at different times. During
2515 :     debugging, we want more checking, in production we sometimes care more
2516 :     for speed. Therefore, assertions can be turned off, i.e., the assertion
2517 :     becomes a comment. Depending on the importance of an assertion and the
2518 :     time it takes to check it, you may want to turn off some assertions and
2519 : anton 1.17 keep others turned on. Gforth provides several levels of assertions for
2520 : anton 1.5 this purpose:
2521 :    
2522 :     doc-assert0(
2523 :     doc-assert1(
2524 :     doc-assert2(
2525 :     doc-assert3(
2526 :     doc-assert(
2527 :     doc-)
2528 :    
2529 :     @code{Assert(} is the same as @code{assert1(}. The variable
2530 :     @code{assert-level} specifies the highest assertions that are turned
2531 :     on. I.e., at the default @code{assert-level} of one, @code{assert0(} and
2532 :     @code{assert1(} assertions perform checking, while @code{assert2(} and
2533 :     @code{assert3(} assertions are treated as comments.
2534 :    
2535 :     Note that the @code{assert-level} is evaluated at compile-time, not at
2536 :     run-time. I.e., you cannot turn assertions on or off at run-time; you
2537 :     have to set the @code{assert-level} appropriately before compiling a
2538 :     piece of code. You can compile several pieces of code at several
2539 :     @code{assert-level}s (e.g., a trusted library at level 1 and newly
2540 :     written code at level 3).
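
For instance, the following sketch compiles one word with all assertion
levels enabled and then restores the default level. The word
@code{checked/} is hypothetical:

@example
3 assert-level !          \ enable assert0( ... assert3(
: checked/ ( n1 n2 -- n )
    assert3( dup 0<> )    \ checked, because the level was 3 at compile time
    / ;
1 assert-level !          \ back to the default for later code
@end example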
2541 :    
2542 :     doc-assert-level
2543 :    
2544 :     If an assertion fails, a message compatible with Emacs' compilation mode
2545 :     is produced and the execution is aborted (currently with @code{ABORT"}.
2546 :     If there is interest, we will introduce a special throw code. But if you
2547 :     intend to @code{catch} a specific condition, using @code{throw} is
2548 :     probably more appropriate than an assertion).
2549 :    
2550 : anton 1.18 @node Assembler and Code words, Threading Words, Programming Tools, Words
2551 :     @section Assembler and Code words
2552 :    
2553 :     Gforth provides some words for defining primitives (words written in
2554 :     machine code), and for defining the machine-code equivalent of
2555 :     @code{DOES>}-based defining words. However, the machine-independent
2556 : anton 1.40 nature of Gforth poses a few problems: First of all, Gforth runs on
2557 : anton 1.18 several architectures, so it can provide no standard assembler. What's
2558 :     worse is that the register allocation not only depends on the processor,
2559 : anton 1.25 but also on the @code{gcc} version and options used.
2560 : anton 1.18
2561 : anton 1.25 The words that Gforth offers encapsulate some system dependences (e.g., the
2562 : anton 1.18 header structure), so a system-independent assembler may be used in
2563 :     Gforth. If you do not have an assembler, you can compile machine code
2564 :     directly with @code{,} and @code{c,}.
2565 :    
2566 :     doc-assembler
2567 :     doc-code
2568 :     doc-end-code
2569 :     doc-;code
2570 :     doc-flush-icache
2571 :    
2572 :     If @code{flush-icache} does not work correctly, @code{code} words
2573 :     etc. will not work (reliably), either.
2574 :    
2575 :     These words are rarely used. Therefore they reside in @code{code.fs},
2576 :     which is usually not loaded (except @code{flush-icache}, which is always
2577 : anton 1.19 present). You can load them with @code{require code.fs}.
2578 : anton 1.18
2579 : anton 1.25 In the assembly code you will want to refer to the inner interpreter's
2580 :     registers (e.g., the data stack pointer) and you may want to use other
2581 :     registers for temporary storage. Unfortunately, the register allocation
2582 :     is installation-dependent.
2583 :    
2584 :     The easiest solution is to use explicit register declarations
2585 :     (@pxref{Explicit Reg Vars, , Variables in Specified Registers, gcc.info,
2586 :     GNU C Manual}) for all of the inner interpreter's registers: You have to
2587 :     compile Gforth with @code{-DFORCE_REG} (configure option
2588 :     @code{--enable-force-reg}) and the appropriate declarations must be
2589 :     present in the @code{machine.h} file (see @code{mips.h} for an example;
2590 :     you can find a full list of all declarable register symbols with
2591 :     @code{grep register engine.c}). If you give explicit registers to all
2592 :     variables that are declared at the beginning of @code{engine()}, you
2593 :     should be able to use the other caller-saved registers for temporary
2594 :     storage. Alternatively, you can use the @code{gcc} option
2595 :     @code{-ffixed-REG} (@pxref{Code Gen Options, , Options for Code
2596 :     Generation Conventions, gcc.info, GNU C Manual}) to reserve a register
2597 :     (however, this restriction on register allocation may slow Gforth
2598 :     significantly).
2599 :    
2600 :     If this solution is not viable (e.g., because @code{gcc} does not allow
2601 :     you to explicitly declare all the registers you need), you have to find
2602 :     out by looking at the code where the inner interpreter's registers
2603 :     reside and which registers can be used for temporary storage. You can
2604 :     get an assembly listing of the engine's code with @code{make engine.s}.
2605 :    
2606 :     In any case, it is good practice to abstract your assembly code from the
2607 :     actual register allocation. E.g., if the data stack pointer resides in
2608 :     register @code{$17}, create an alias for this register called @code{sp},
2609 :     and use that in your assembly code.
2610 :    
2611 : anton 1.18 Another option for implementing normal and defining words efficiently
2612 :     is to add the wanted functionality to the source of Gforth. For normal
2613 : anton 1.35 words you just have to edit @file{primitives} (@pxref{Automatic
2614 :     Generation}), defining words (equivalent to @code{;CODE} words, for fast
2615 :     defined words) may require changes in @file{engine.c}, @file{kernal.fs},
2616 :     @file{prims2x.fs}, and possibly @file{cross.fs}.
2617 : anton 1.18
2618 :    
2619 :     @node Threading Words, , Assembler and Code words, Words
2620 : anton 1.4 @section Threading Words
2621 :    
2622 :     These words provide access to code addresses and other threading stuff
2623 : anton 1.17 in Gforth (and, possibly, other interpretive Forths). They more or less
2624 : anton 1.4 abstract away the differences between direct and indirect threading
2625 :     (and, for direct threading, the machine dependences). However, at
2626 :     present this wordset is still incomplete. It is also pretty low-level;
2627 :     some day it will hopefully be made unnecessary by an internals words set
2628 :     that abstracts implementation details away completely.
2629 :    
2630 :     doc->code-address
2631 :     doc->does-code
2632 :     doc-code-address!
2633 :     doc-does-code!
2634 :     doc-does-handler!
2635 :     doc-/does-handler
2636 :    
2637 : anton 1.18 The code addresses produced by various defining words are produced by
2638 :     the following words:
2639 : anton 1.14
2640 : anton 1.18 doc-docol:
2641 :     doc-docon:
2642 :     doc-dovar:
2643 :     doc-douser:
2644 :     doc-dodefer:
2645 :     doc-dofield:
2646 :    
2647 : anton 1.35 You can recognize words defined by a @code{CREATE}...@code{DOES>} word
2648 :     with @code{>DOES-CODE}. If the word was defined in that way, the value
2649 :     returned is different from 0 and identifies the @code{DOES>} used by the
2650 :     defining word.
2651 : anton 1.14
2652 : anton 1.40 @node Tools, ANS conformance, Words, Top
2653 :     @chapter Tools
2654 :    
2655 :     @menu
2656 :     * ANS Report:: Report the words used, sorted by wordset
2657 :     @end menu
2658 :    
2659 :     See also @ref{Emacs and Gforth}.
2660 :    
2661 :     @node ANS Report, , Tools, Tools
2662 :     @section @file{ans-report.fs}: Report the words used, sorted by wordset
2663 :    
2664 :     If you want to label a Forth program as an ANS Forth Program, you must
2665 :     document which wordsets the program uses; for extension wordsets, it is
2666 :     helpful to list the words the program requires from these wordsets
2667 :     (because Forth systems are allowed to provide only some words of them).
2668 :    
2669 :     The @file{ans-report.fs} tool makes it easy for you to determine which
2670 :     words from which wordset and which non-ANS words your application
2671 :     uses. You simply have to include @file{ans-report.fs} before loading the
2672 :     program you want to check. After loading your program, you can get the
2673 :     report with @code{print-ans-report}. A typical use is to run this as
2674 :     a batch job like this:
2675 :     @example
2676 :     gforth ans-report.fs myprog.fs -e "print-ans-report bye"
2677 :     @end example
2678 :    
2679 :     The output looks like this (for @file{compat/control.fs}):
2680 :     @example
2681 :     The program uses the following words
2682 :     from CORE :
2683 :     : POSTPONE THEN ; immediate ?dup IF 0=
2684 :     from BLOCK-EXT :
2685 :     \
2686 :     from FILE :
2687 :     (
2688 :     @end example
2689 :    
2690 :     @subsection Caveats
2691 :    
2692 :     Note that @file{ans-report.fs} just checks which words are used, not whether
2693 :     they are used in an ANS Forth conforming way!
2694 :    
2695 :     Some words are defined in several wordsets in the
2696 :     standard. @file{ans-report.fs} reports them for only one of the
2697 :     wordsets, and not necessarily the one you expect. It depends on usage
2698 :     which wordset is the right one to specify. E.g., if you only use the
2699 :     compilation semantics of @code{S"}, it is a Core word; if you also use
2700 :     its interpretation semantics, it is a File word.
2701 :    
2702 :    
2703 :     @node ANS conformance, Model, Tools, Top
2704 : anton 1.4 @chapter ANS conformance
2705 :    
2706 : anton 1.17 To the best of our knowledge, Gforth is an
2707 : anton 1.14
2708 : anton 1.15 ANS Forth System
2709 : anton 1.34 @itemize @bullet
2710 : anton 1.15 @item providing the Core Extensions word set
2711 :     @item providing the Block word set
2712 :     @item providing the Block Extensions word set
2713 :     @item providing the Double-Number word set
2714 :     @item providing the Double-Number Extensions word set
2715 :     @item providing the Exception word set
2716 :     @item providing the Exception Extensions word set
2717 :     @item providing the Facility word set
2718 :     @item providing @code{MS} and @code{TIME&DATE} from the Facility Extensions word set
2719 :     @item providing the File Access word set
2720 :     @item providing the File Access Extensions word set
2721 :     @item providing the Floating-Point word set
2722 :     @item providing the Floating-Point Extensions word set
2723 :     @item providing the Locals word set
2724 :     @item providing the Locals Extensions word set
2725 :     @item providing the Memory-Allocation word set
2726 :     @item providing the Memory-Allocation Extensions word set (that one's easy)
2727 :     @item providing the Programming-Tools word set
2728 : anton 1.34 @item providing @code{;CODE}, @code{AHEAD}, @code{ASSEMBLER}, @code{BYE}, @code{CODE}, @code{CS-PICK}, @code{CS-ROLL}, @code{STATE}, @code{[ELSE]}, @code{[IF]}, @code{[THEN]} from the Programming-Tools Extensions word set
2729 : anton 1.15 @item providing the Search-Order word set
2730 :     @item providing the Search-Order Extensions word set
2731 :     @item providing the String word set
2732 :     @item providing the String Extensions word set (another easy one)
2733 :     @end itemize
2734 :    
2735 :     In addition, ANS Forth systems are required to document certain
2736 :     implementation choices. This chapter tries to meet these
2737 :     requirements. In many cases it gives a way to ask the system for the
2738 :     information instead of providing the information directly, in
2739 :     particular, if the information depends on the processor, the operating
2740 :     system or the installation options chosen, or if they are likely to
2741 : anton 1.17 change during the maintenance of Gforth.
2742 : anton 1.15
2743 : anton 1.14 @comment The framework for the rest has been taken from pfe.
2744 :    
2745 :     @menu
2746 :     * The Core Words::
2747 :     * The optional Block word set::
2748 :     * The optional Double Number word set::
2749 :     * The optional Exception word set::
2750 :     * The optional Facility word set::
2751 :     * The optional File-Access word set::
2752 :     * The optional Floating-Point word set::
2753 :     * The optional Locals word set::
2754 :     * The optional Memory-Allocation word set::
2755 :     * The optional Programming-Tools word set::
2756 :     * The optional Search-Order word set::
2757 :     @end menu
2758 :    
2759 :    
2760 :     @c =====================================================================
2761 :     @node The Core Words, The optional Block word set, ANS conformance, ANS conformance
2762 :     @comment node-name, next, previous, up
2763 :     @section The Core Words
2764 :     @c =====================================================================
2765 :    
2766 :     @menu
2767 : anton 1.15 * core-idef:: Implementation Defined Options
2768 :     * core-ambcond:: Ambiguous Conditions
2769 :     * core-other:: Other System Documentation
2770 : anton 1.14 @end menu
2771 :    
2772 :     @c ---------------------------------------------------------------------
2773 :     @node core-idef, core-ambcond, The Core Words, The Core Words
2774 :     @subsection Implementation Defined Options
2775 :     @c ---------------------------------------------------------------------
2776 :    
2777 :     @table @i
2778 :    
2779 :     @item (Cell) aligned addresses:
2780 : anton 1.17 processor-dependent. Gforth's alignment words perform natural alignment
2781 : anton 1.14 (e.g., an address aligned for a datum of size 8 is divisible by
2782 :     8). Unaligned accesses usually result in a @code{-23 THROW}.
2783 :    
2784 :     @item @code{EMIT} and non-graphic characters:
2785 :     The character is output using the C library function (actually, macro)
2786 : anton 1.36 @code{putc}.
2787 : anton 1.14
2788 :     @item character editing of @code{ACCEPT} and @code{EXPECT}:
2789 :     This is modeled on the GNU readline library (@pxref{Readline
2790 :     Interaction, , Command Line Editing, readline, The GNU Readline
2791 :     Library}) with Emacs-like key bindings. @kbd{Tab} deviates a little by
2792 :     producing a full word completion every time you type it (instead of
2793 :     producing the common prefix of all completions).
2794 :    
2795 :     @item character set:
2796 :     The character set of your computer and display device. Gforth is
2797 :     8-bit-clean (but some other component in your system may make trouble).
2798 :    
2799 :     @item Character-aligned address requirements:
2800 :     installation-dependent. Currently a character is represented by a C
2801 :     @code{unsigned char}; in the future we might switch to @code{wchar_t}
2802 :     (Comments on that requested).
2803 :    
2804 :     @item character-set extensions and matching of names:
2805 : anton 1.17 Any character except the ASCII NUL character can be used in a
2806 : anton 1.36 name. Matching is case-insensitive (except in @code{TABLE}s). The
2807 :     matching is performed using the C function @code{strncasecmp}, whose
2808 :     function is probably influenced by the locale. E.g., the @code{C} locale
2809 :     does not know about accents and umlauts, so they are matched
2810 :     case-sensitively in that locale. For portability reasons it is best to
2811 :     write programs such that they work in the @code{C} locale. Then one can
2812 :     use libraries written by a Polish programmer (who might use words
2813 :     containing ISO Latin-2 encoded characters) and by a French programmer
2814 :     (ISO Latin-1) in the same program (of course, @code{WORDS} will produce
2815 :     funny results for some of the words (which ones, depends on the font you
2816 :     are using)). Also, the locale you prefer may not be available in other
2817 :     operating systems. Hopefully, Unicode will solve these problems one day.
2818 : anton 1.14
2819 :     @item conditions under which control characters match a space delimiter:
2820 :     If @code{WORD} is called with the space character as a delimiter, all
2821 :     white-space characters (as identified by the C macro @code{isspace()})
2822 :     are delimiters. @code{PARSE}, on the other hand, treats space like other
2823 :     delimiters. @code{PARSE-WORD} treats space like @code{WORD}, but behaves
2824 :     like @code{PARSE} otherwise. @code{(NAME)}, which is used by the outer
2825 :     interpreter (aka text interpreter) by default, treats all white-space
2826 :     characters as delimiters.
2827 :    
2828 :     @item format of the control flow stack:
2829 :     The data stack is used as control flow stack. The size of a control flow
2830 :     stack item in cells is given by the constant @code{cs-item-size}. At the
2831 :     time of this writing, an item consists of a (pointer to a) locals list
2832 :     (third), an address in the code (second), and a tag for identifying the
2833 :     item (TOS). The following tags are used: @code{defstart},
2834 :     @code{live-orig}, @code{dead-orig}, @code{dest}, @code{do-dest},
2835 :     @code{scopestart}.
2836 :    
2837 :     @item conversion of digits > 35
2838 :     The characters @code{[\]^_'} are the digits with the decimal value
2839 :     36@minus{}41. There is no way to input many of the larger digits.
2840 :    
2841 :     @item display after input terminates in @code{ACCEPT} and @code{EXPECT}:
2842 :     The cursor is moved to the end of the entered string. If the input is
2843 :     terminated using the @kbd{Return} key, a space is typed.
2844 :    
2845 :     @item exception abort sequence of @code{ABORT"}:
2846 :     The error string is stored into the variable @code{"error} and a
2847 :     @code{-2 throw} is performed.
2848 :    
2849 :     @item input line terminator:
2850 : anton 1.36 For interactive input, @kbd{C-m} (CR) and @kbd{C-j} (LF) terminate
2851 :     lines. One of these characters is typically produced when you type the
2852 :     @kbd{Enter} or @kbd{Return} key.
2853 : anton 1.14
2854 :     @item maximum size of a counted string:
2855 :     @code{s" /counted-string" environment? drop .}. Currently 255 characters
2856 :     on all ports, but this may change.
2857 :    
2858 :     @item maximum size of a parsed string:
2859 :     Given by the constant @code{/line}. Currently 255 characters.
2860 :    
2861 :     @item maximum size of a definition name, in characters:
2862 :     31
2863 :    
2864 :     @item maximum string length for @code{ENVIRONMENT?}, in characters:
2865 :     31
2866 :    
2867 :     @item method of selecting the user input device:
2868 : anton 1.17 The user input device is the standard input. There is currently no way to
2869 :     change it from within Gforth. However, the input can typically be
2870 :     redirected in the command line that starts Gforth.
2871 : anton 1.14
2872 :     @item method of selecting the user output device:
2873 : anton 1.36 @code{EMIT} and @code{TYPE} output to the file-id stored in the value
2874 :     @code{outfile-id} (@code{stdout} by default). Gforth uses buffered
2875 :     output, so output on a terminal does not become visible before the next
2876 :     newline or buffer overflow. Output on non-terminals is invisible until
2877 :     the buffer overflows.
2878 : anton 1.14
2879 :     @item methods of dictionary compilation:
2880 : anton 1.17 What are we expected to document here?
2881 : anton 1.14
2882 :     @item number of bits in one address unit:
2883 :     @code{s" address-units-bits" environment? drop .}. 8 in all current
2884 :     ports.
2885 :    
2886 :     @item number representation and arithmetic:
2887 :     Processor-dependent. Binary two's complement on all current ports.
2888 :    
2889 :     @item ranges for integer types:
2890 :     Installation-dependent. Make environmental queries for @code{MAX-N},
2891 :     @code{MAX-U}, @code{MAX-D} and @code{MAX-UD}. The lower bound for
2892 :     unsigned (and positive) types is 0. The lower bound for signed types on
2893 :     two's complement and one's complement machines can be computed
2894 :     by adding 1 to the upper bound.
2895 :    
2896 :     @item read-only data space regions:
2897 :     The whole Forth data space is writable.
2898 :    
2899 :     @item size of buffer at @code{WORD}:
2900 :     @code{PAD HERE - .}. 104 characters on 32-bit machines. The buffer is
2901 :     shared with the pictured numeric output string. If overwriting
2902 :     @code{PAD} is acceptable, it is as large as the remaining dictionary
2903 :     space, although only as much can be sensibly used as fits in a counted
2904 :     string.
2905 :    
2906 :     @item size of one cell in address units:
2907 :     @code{1 cells .}.
2908 :    
2909 :     @item size of one character in address units:
2910 :     @code{1 chars .}. 1 on all current ports.
2911 :    
2912 :     @item size of the keyboard terminal buffer:
2913 : anton 1.36 Varies. You can determine the size at a specific time using @code{lp@@
2914 : anton 1.14 tib - .}. It is shared with the locals stack and TIBs of files that
2915 :     include the current file. You can change the amount of space for TIBs
2916 : anton 1.17 and locals stack at Gforth startup with the command line option
2917 : anton 1.14 @code{-l}.
2918 :    
2919 :     @item size of the pictured numeric output buffer:
2920 :     @code{PAD HERE - .}. 104 characters on 32-bit machines. The buffer is
2921 :     shared with @code{WORD}.
2922 :    
2923 :     @item size of the scratch area returned by @code{PAD}:
2924 :     The remainder of dictionary space. You can even use the unused part of
2925 : anton 1.36 the data stack space. The current size can be computed with @code{sp@@
2926 : anton 1.14 pad - .}.
2927 :    
2928 :     @item system case-sensitivity characteristics:
2929 : anton 1.36 Dictionary searches are case insensitive (except in
2930 :     @code{TABLE}s). However, as explained above under @i{character-set
2931 :     extensions}, the matching for non-ASCII characters is determined by the
2932 :     locale you are using. In the default @code{C} locale all non-ASCII
2933 :     characters are matched case-sensitively.
2934 : anton 1.14
2935 :     @item system prompt:
2936 :     @code{ ok} in interpret state, @code{ compiled} in compile state.
2937 :    
2938 :     @item division rounding:
2939 :     installation dependent. @code{s" floored" environment? drop .}. We leave
2940 : anton 1.25 the choice to @code{gcc} (what to use for @code{/}) and to you (whether to use
2941 : anton 1.14 @code{fm/mod}, @code{sm/rem} or simply @code{/}).
2942 :    
2943 :     @item values of @code{STATE} when true:
2944 :     -1.
2945 :    
2946 :     @item values returned after arithmetic overflow:
2947 :     On two's complement machines, arithmetic is performed modulo
2948 :     2**bits-per-cell for single arithmetic and 4**bits-per-cell for double
2949 :     arithmetic (with appropriate mapping for signed types). Division by zero
2950 : anton 1.36 typically results in a @code{-55 throw} (Floating-point unidentified
2951 : anton 1.14 fault), although a @code{-10 throw} (divide by zero) would be more
2952 :     appropriate.
2953 :    
2954 :     @item whether the current definition can be found after @t{DOES>}:
2955 :     No.
2956 :    
2957 :     @end table
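The wraparound behaviour described under @i{values returned after arithmetic overflow} and @i{ranges for integer types} can be illustrated with a minimal C sketch. It assumes a 32-bit cell on a two's complement machine; the type names and the helpers `cell_add` and `as_signed` are hypothetical and are not part of Gforth.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit cell types; the real cell size depends on the port. */
typedef uint32_t ucell;
typedef int32_t  cell;

/* Single-cell addition modulo 2**32: unsigned arithmetic in C wraps. */
ucell cell_add(ucell a, ucell b)
{
    return (ucell)(a + b);
}

/* Map an unsigned cell to the signed value with the same bit pattern
   (two's complement), without relying on implementation-defined casts. */
cell as_signed(ucell u)
{
    if (u <= (ucell)INT32_MAX)
        return (cell)u;
    return (cell)(u - (ucell)INT32_MAX - 1u) + INT32_MIN;
}
```

Under these assumptions, adding 1 to the largest signed value wraps around to the smallest one, which is the rule the text gives for computing the lower bound.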
2958 :    
2959 :     @c ---------------------------------------------------------------------
2960 :     @node core-ambcond, core-other, core-idef, The Core Words
2961 :     @subsection Ambiguous conditions
2962 :     @c ---------------------------------------------------------------------
2963 :    
2964 :     @table @i
2965 :    
2966 :     @item a name is neither a word nor a number:
2967 : anton 1.36 @code{-13 throw} (Undefined word). Actually, @code{-13 bounce}, which
2968 :     preserves the data and FP stack, so you don't lose more work than
2969 :     necessary.
2970 : anton 1.14
2971 :     @item a definition name exceeds the maximum length allowed:
2972 :     @code{-19 throw} (Word name too long)
2973 :    
2974 :     @item addressing a region not inside the various data spaces of the forth system:
2975 :     The stacks, code space and name space are accessible. Machine code space is
2976 :     typically readable. Accessing other addresses gives results dependent on
2977 :     the operating system. On decent systems: @code{-9 throw} (Invalid memory
2978 :     address).
2979 :    
2980 :     @item argument type incompatible with parameter:
2981 :     This is usually not caught. Some words perform checks, e.g., the control
2982 :     flow words, and issue an @code{ABORT"} or @code{-12 THROW} (Argument type
2983 :     mismatch).
2984 :    
2985 :     @item attempting to obtain the execution token of a word with undefined execution semantics:
2986 : anton 1.36 @code{-14 throw} (Interpreting a compile-only word). In some cases, you
2987 :     get an execution token for @code{compile-only-error} (which performs a
2988 :     @code{-14 throw} when executed).
2989 : anton 1.14
2990 :     @item dividing by zero:
2991 :     typically results in a @code{-55 throw} (floating-point unidentified
2992 :     fault), although a @code{-10 throw} (divide by zero) would be more
2993 :     appropriate.
2994 :    
2995 :     @item insufficient data stack or return stack space:
2996 :     Not checked. This typically results in mysterious illegal memory
2997 :     accesses, producing @code{-9 throw} (Invalid memory address) or
2998 :     @code{-23 throw} (Address alignment exception).
2999 :    
3000 :     @item insufficient space for loop control parameters:
3001 :     like other return stack overflows.
3002 :    
3003 :     @item insufficient space in the dictionary:
3004 :     Not checked. Similar results as stack overflows. However, typically the
3005 :     error appears at a different place when one inserts or removes code.
3006 :    
3007 :     @item interpreting a word with undefined interpretation semantics:
3008 :     For some words, we defined interpretation semantics. For the others:
3009 : anton 1.36 @code{-14 throw} (Interpreting a compile-only word).
3010 : anton 1.14
3011 :     @item modifying the contents of the input buffer or a string literal:
3012 :     These are located in writable memory and can be modified.
3013 :    
3014 :     @item overflow of the pictured numeric output string:
3015 :     Not checked.
3016 :    
3017 :     @item parsed string overflow:
3018 :     @code{PARSE} cannot overflow. @code{WORD} does not check for overflow.
3019 :    
3020 :     @item producing a result out of range:
3021 :     On two's complement machines, arithmetic is performed modulo
3022 :     2**bits-per-cell for single arithmetic and 4**bits-per-cell for double
3023 :     arithmetic (with appropriate mapping for signed types). Division by zero
3024 :     typically results in a @code{-55 throw} (floating-point unidentified
3025 :     fault), although a @code{-10 throw} (divide by zero) would be more
3026 :     appropriate. @code{convert} and @code{>number} currently overflow
3027 :     silently.
3028 :    
3029 :     @item reading from an empty data or return stack:
3030 :     The data stack is checked by the outer (aka text) interpreter after
3031 :     every word executed. If it has underflowed, a @code{-4 throw} (Stack
3032 :     underflow) is performed. Apart from that, the stacks are not checked and
3033 :     underflows can result in similar behaviour as overflows (of adjacent
3034 :     stacks).
3035 :    
3036 : anton 1.36 @item unexpected end of the input buffer, resulting in an attempt to use a zero-length string as a name:
3037 : anton 1.14 @code{Create} and its descendants perform a @code{-16 throw} (Attempt to
3038 :     use zero-length string as a name). Words like @code{'} probably will not
3039 :     find what they search for. Note that it is possible to create zero-length
3040 :     names with @code{nextname} (should it not?).
3041 :    
3042 :     @item @code{>IN} greater than input buffer:
3043 :     The next invocation of a parsing word returns a string with length 0.
3044 :    
3045 :     @item @code{RECURSE} appears after @code{DOES>}:
3046 : anton 1.36 Compiles a recursive call to the defining word, not to the defined word.
3047 : anton 1.14
3048 :     @item argument input source different than current input source for @code{RESTORE-INPUT}:
3049 : anton 1.27 @code{-12 THROW}. Note that, once an input file is closed (e.g., because
3050 :     the end of the file was reached), its source-id may be
3051 :     reused. Therefore, restoring an input source specification referencing a
3052 :     closed file may lead to unpredictable results instead of a @code{-12
3053 :     THROW}.
3054 :    
3055 : anton 1.36 In the future, Gforth may be able to restore input source specifications
3056 : anton 1.27 from other than the current input source.
3057 : anton 1.14
3058 :     @item data space containing definitions gets de-allocated:
3059 :     Deallocation with @code{allot} is not checked. This typically results in
3060 :     memory access faults or execution of illegal instructions.
3061 :    
3062 :     @item data space read/write with incorrect alignment:
3063 :     Processor-dependent. Typically results in a @code{-23 throw} (Address
3064 :     alignment exception). Under Linux on a 486 or later processor with
3065 :     alignment turned on, incorrect alignment results in a @code{-9 throw}
3066 :     (Invalid memory address). There are reportedly some processors with
3067 :     alignment restrictions that do not report them.
3068 :    
3069 :     @item data space pointer not properly aligned, @code{,}, @code{C,}:
3070 :     Like other alignment errors.
3071 :    
3072 :     @item less than u+2 stack items (@code{PICK} and @code{ROLL}):
3073 :     Not checked. May cause an illegal memory access.
3074 :    
3075 :     @item loop control parameters not available:
3076 :     Not checked. The counted loop words simply assume that the top of return
3077 :     stack items are loop control parameters and behave accordingly.
3078 :    
3079 :     @item most recent definition does not have a name (@code{IMMEDIATE}):
3080 :     @code{abort" last word was headerless"}.
3081 :    
3082 :     @item name not defined by @code{VALUE} used by @code{TO}:
3083 : anton 1.36 @code{-32 throw} (Invalid name argument) (unless name was defined by
3084 :     @code{CONSTANT}; then it just changes the constant).
3085 : anton 1.14
3086 : anton 1.15 @item name not found (@code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]}):
3087 : anton 1.14 @code{-13 throw} (Undefined word)
3088 :    
3089 :     @item parameters are not of the same type (@code{DO}, @code{?DO}, @code{WITHIN}):
3090 :     Gforth behaves as if they were of the same type. I.e., you can predict
3091 :     the behaviour by interpreting all parameters as, e.g., signed.
3092 :    
3093 :     @item @code{POSTPONE} or @code{[COMPILE]} applied to @code{TO}:
3094 : anton 1.36 Assume @code{: X POSTPONE TO ; IMMEDIATE}. @code{X} performs the
3095 :     compilation semantics of @code{TO}.
3096 : anton 1.14
3097 :     @item String longer than a counted string returned by @code{WORD}:
3098 :     Not checked. The string will be ok, but the count will, of course,
3099 :     contain only the least significant bits of the length.
3100 :    
3101 : anton 1.15 @item u greater than or equal to the number of bits in a cell (@code{LSHIFT}, @code{RSHIFT}):
3102 : anton 1.14 Processor-dependent. Typical behaviours are returning 0 and using only
3103 :     the low bits of the shift count.
3104 :    
3105 :     @item word not defined via @code{CREATE}:
3106 :     @code{>BODY} produces the PFA of the word no matter how it was defined.
3107 :    
3108 :     @code{DOES>} changes the execution semantics of the last defined word no
3109 :     matter how it was defined. E.g., @code{CONSTANT DOES>} is equivalent to
3110 :     @code{CREATE , DOES>}.
3111 :    
3112 :     @item words improperly used outside @code{<#} and @code{#>}:
3113 :     Not checked. As usual, you can expect memory faults.
3114 :    
3115 :     @end table
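The entry on strings longer than a counted string says the count holds only the least significant bits of the length. A minimal C sketch of that truncation follows, assuming the count is stored in a single 8-bit character (consistent with the 255-character limit stated earlier); `stored_count` is a hypothetical helper, not a Gforth function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the value that ends up in the count byte of a
   counted string whose real length is len, assuming an 8-bit count. */
unsigned char stored_count(size_t len)
{
    return (unsigned char)(len & 0xFFu);
}
```

With this model, a 300-character string would carry a count of 44, so code that trusts the count sees only the first 44 characters.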
3116 :    
3117 :    
3118 :     @c ---------------------------------------------------------------------
3119 :     @node core-other, , core-ambcond, The Core Words
3120 :     @subsection Other system documentation
3121 :     @c ---------------------------------------------------------------------
3122 :    
3123 :     @table @i
3124 :    
3125 :     @item nonstandard words using @code{PAD}:
3126 :     None.
3127 :    
3128 :     @item operator's terminal facilities available:
3129 : anton 1.26 After processing the command line, Gforth goes into interactive mode,
3130 :     and you can give commands to Gforth interactively. The actual facilities
3131 :     available depend on how you invoke Gforth.
3132 : anton 1.14
3133 :     @item program data space available:
3134 : anton 1.36 @code{sp@@ here - .} gives the space remaining for dictionary and data
3135 : anton 1.14 stack together.
3136 :    
3137 :     @item return stack space available:
3138 : anton 1.26 By default 16 KBytes. The default can be overridden with the @code{-r}
3139 :     switch (@pxref{Invocation}) when Gforth starts up.
3140 : anton 1.14
3141 :     @item stack space available:
3142 : anton 1.36 @code{sp@@ here - .} gives the space remaining for dictionary and data
3143 : anton 1.14 stack together.
3144 :    
3145 :     @item system dictionary space required, in address units:
3146 :     Type @code{here forthstart - .} after startup. At the time of this
3147 :     writing, this gives 70108 (bytes) on a 32-bit system.
3148 :     @end table
3149 :    
3150 :    
3151 :     @c =====================================================================
3152 :     @node The optional Block word set, The optional Double Number word set, The Core Words, ANS conformance
3153 :     @section The optional Block word set
3154 :     @c =====================================================================
3155 :    
3156 :     @menu
3157 : anton 1.15 * block-idef:: Implementation Defined Options
3158 :     * block-ambcond:: Ambiguous Conditions
3159 :     * block-other:: Other System Documentation
3160 : anton 1.14 @end menu
3161 :    
3162 :    
3163 :     @c ---------------------------------------------------------------------
3164 :     @node block-idef, block-ambcond, The optional Block word set, The optional Block word set
3165 :     @subsection Implementation Defined Options
3166 :     @c ---------------------------------------------------------------------
3167 :    
3168 :     @table @i
3169 :    
3170 :     @item the format for display by @code{LIST}:
3171 :     First the screen number is displayed, then 16 lines of 64 characters,
3172 :     each line preceded by the line number.
3173 :    
3174 :     @item the length of a line affected by @code{\}:
3175 :     64 characters.
3176 :     @end table
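The layout used by @code{LIST} (16 lines of 64 characters, each preceded by its line number) can be sketched in C as follows. The exact width and spacing of the line-number prefix are assumptions made for illustration, and `format_block_line` is a hypothetical helper rather than anything in the Gforth sources.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { BLOCK_SIZE = 1024, LINE_LEN = 64, LINES_PER_BLOCK = 16 };

/* Format line `line` (0..15) of a 1024-byte block into out as
   "<line number> <64 characters>". The "%2d " prefix is an assumed
   layout, not necessarily what LIST prints.
   Returns the number of characters written, or -1 on bad arguments. */
int format_block_line(const char *block, int line, char *out, size_t outsize)
{
    if (line < 0 || line >= LINES_PER_BLOCK || outsize < (size_t)(LINE_LEN + 4))
        return -1;
    /* The precision limits %s to LINE_LEN characters, so the block
       does not need to be NUL-terminated. */
    return snprintf(out, outsize, "%2d %.*s", line, LINE_LEN,
                    block + line * LINE_LEN);
}
```

Calling the helper for lines 0 through 15 and printing each result with a trailing newline gives one screen. The sketch assumes the block is blank-filled; an embedded NUL would cut a line short.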
3177 :    
3178 :    
3179 :     @c ---------------------------------------------------------------------
3180 :     @node block-ambcond, block-other, block-idef, The optional Block word set
3181 :     @subsection Ambiguous conditions
3182 :     @c ---------------------------------------------------------------------
3183 :    
3184 :     @table @i
3185 :    
3186 :     @item correct block read was not possible:
3187 :     Typically results in a @code{throw} of some OS-derived value (between
3188 :     -512 and -2048). If the blocks file was just not long enough, blanks are
3189 :     supplied for the missing portion.
3190 :    
3191 :     @item I/O exception in block transfer:
3192 :     Typically results in a @code{throw} of some OS-derived value (between
3193 :     -512 and -2048).
3194 :    
3195 :     @item invalid block number:
3196 :     @code{-35 throw} (Invalid block number)
3197 :    
3198 :     @item a program directly alters the contents of @code{BLK}:
3199 :     The input stream is switched to that other block, at the same
3200 :     position. If the storing to @code{BLK} happens when interpreting
3201 :     non-block input, the system will get quite confused when the block ends.
3202 :    
3203 :     @item no current block buffer for @code{UPDATE}:
3204 :     @code{UPDATE} has no effect.
3205 :    
3206 :     @end table
3207 :    
3208 :    
3209 :     @c ---------------------------------------------------------------------
3210 :     @node block-other, , block-ambcond, The optional Block word set
3211 :     @subsection Other system documentation
3212 :     @c ---------------------------------------------------------------------
3213 :    
3214 :     @table @i
3215 :    
3216 :     @item any restrictions a multiprogramming system places on the use of buffer addresses:
3217 :     No restrictions (yet).
3218 :    
3219 :     @item the number of blocks available for source and data:
3220 :     depends on your disk space.
3221 :    
3222 :     @end table
3223 :    
3224 :    
3225 :     @c =====================================================================
3226 :     @node The optional Double Number word set, The optional Exception word set, The optional Block word set, ANS conformance
3227 :     @section The optional Double Number word set
3228 :     @c =====================================================================
3229 :    
3230 :     @menu
3231 : anton 1.15 * double-ambcond:: Ambiguous Conditions
3232 : anton 1.14 @end menu
3233 :    
3234 :    
3235 :     @c ---------------------------------------------------------------------
3236 : anton 1.15 @node double-ambcond, , The optional Double Number word set, The optional Double Number word set
3237 : anton 1.14 @subsection Ambiguous conditions
3238 :     @c ---------------------------------------------------------------------
3239 :    
3240 :     @table @i
3241 :    
3242 : anton 1.15 @item @var{d} outside of range of @var{n} in @code{D>S}:
3243 : anton 1.14 The least significant cell of @var{d} is produced.
3244 :    
3245 :     @end table
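The behaviour of @code{D>S} for out-of-range values, keeping only the least significant cell, corresponds to the following C sketch. It assumes a 32-bit cell and a 64-bit double; `d_to_s_low_cell` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a double is 64 bits wide, a cell 32 bits.
   The result is simply the low cell; the high cell is discarded. */
uint32_t d_to_s_low_cell(uint64_t d)
{
    return (uint32_t)(d & 0xFFFFFFFFu);
}
```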
3246 :    
3247 :    
3248 :     @c =====================================================================
3249 :     @node The optional Exception word set, The optional Facility word set, The optional Double Number word set, ANS conformance
3250 :     @section The optional Exception word set
3251 :     @c =====================================================================
3252 :    
3253 :     @menu
3254 : anton 1.15 * exception-idef:: Implementation Defined Options
3255 : anton 1.14 @end menu
3256 :    
3257 :    
3258 :     @c ---------------------------------------------------------------------
3259 : anton 1.15 @node exception-idef, , The optional Exception word set, The optional Exception word set
3260 : anton 1.14 @subsection Implementation Defined Options
3261 :     @c ---------------------------------------------------------------------
3262 :    
3263 :     @table @i
3264 :     @item @code{THROW}-codes used in the system:
3265 :     The codes -256@minus{}-511 are used for reporting signals (see
3266 :     @file{errore.fs}). The codes -512@minus{}-2047 are used for OS errors
3267 :     (for file and memory allocation operations). The mapping from OS error
3268 : anton 1.37 numbers to throw code is -512@minus{}@code{errno}. One side effect of
3269 : anton 1.14 this mapping is that undefined OS errors produce a message with a
3270 :     strange number; e.g., @code{-1000 THROW} results in @code{Unknown error
3271 :     488} on my system.
3272 :     @end table
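The mapping between OS error numbers and throw codes stated above (throw code = -512 - errno) is its own inverse, which explains the "Unknown error 488" message for @code{-1000 THROW}. A minimal C sketch, with hypothetical function names:

```c
#include <assert.h>

/* Map an OS error number to a throw code: -512 - errno. */
int errno_to_throw(int err)
{
    return -512 - err;
}

/* Recover the OS error number from a throw code (same formula). */
int throw_to_errno(int code)
{
    return -512 - code;
}

/* True if code lies in the range the text reserves for OS errors. */
int is_os_error_code(int code)
{
    return code <= -512 && code >= -2047;
}
```

For example, `throw_to_errno(-1000)` yields 488, an error number the OS does not define.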
3273 :    
3274 :     @c =====================================================================
3275 :     @node The optional Facility word set, The optional File-Access word set, The optional Exception word set, ANS conformance
3276 :     @section The optional Facility word set
3277 :     @c =====================================================================
3278 :    
3279 :     @menu
3280 : anton 1.15 * facility-idef:: Implementation Defined Options
3281 :     * facility-ambcond:: Ambiguous Conditions
3282 : anton 1.14 @end menu
3283 :    
3284 :    
3285 :     @c ---------------------------------------------------------------------
3286 :     @node facility-idef, facility-ambcond, The optional Facility word set, The optional Facility word set
3287 :     @subsection Implementation Defined Options
3288 :     @c ---------------------------------------------------------------------
3289 :    
3290 :     @table @i
3291 :    
3292 :     @item encoding of keyboard events (@code{EKEY}):
3293 :     Not yet implemented.
3294 :    
3295 :     @item duration of a system clock tick
3296 :     System dependent. With respect to @code{MS}, the time is specified in
3297 :     microseconds. How well the OS and the hardware implement this is
3298 :     another question.
3299 :    
3300 :     @item repeatability to be expected from the execution of @code{MS}:
3301 :     System dependent. On Unix, a lot depends on load. If the system is
3302 : anton 1.17 lightly loaded, and the delay is short enough that Gforth does not get
3303 : anton 1.14 swapped out, the performance should be acceptable. Under MS-DOS and
3304 :     other single-tasking systems, it should be good.
3305 :    
3306 :     @end table
3307 :    
3308 :    
3309 :     @c ---------------------------------------------------------------------
3310 : anton 1.15 @node facility-ambcond, , facility-idef, The optional Facility word set
3311 : anton 1.14 @subsection Ambiguous conditions
3312 :     @c ---------------------------------------------------------------------
3313 :    
3314 :     @table @i
3315 :    
3316 :     @item @code{AT-XY} can't be performed on user output device:
3317 :     Largely terminal dependent. No range checks are done on the arguments.
3318 :     No errors are reported. You may see some garbage appearing, or you may see
3319 :     simply nothing happen.
3320 :    
3321 :     @end table
3322 :    
3323 :    
3324 :     @c =====================================================================
3325 :     @node The optional File-Access word set, The optional Floating-Point word set, The optional Facility word set, ANS conformance
3326 :     @section The optional File-Access word set
3327 :     @c =====================================================================
3328 :    
3329 :     @menu
3330 : anton 1.15 * file-idef:: Implementation Defined Options
3331 :     * file-ambcond:: Ambiguous Conditions
3332 : anton 1.14 @end menu
3333 :    
3334 :    
3335 :     @c ---------------------------------------------------------------------
3336 :     @node file-idef, file-ambcond, The optional File-Access word set, The optional File-Access word set
3337 :     @subsection Implementation Defined Options
3338 :     @c ---------------------------------------------------------------------
3339 :    
3340 :     @table @i
3341 :    
3342 :     @item File access methods used:
3343 :     @code{R/O}, @code{R/W} and @code{BIN} work as you would
3344 :     expect. @code{W/O} translates into the C file opening mode @code{w} (or
3345 :     @code{wb}): The file is cleared, if it exists, and created, if it does
3346 : anton 1.15 not (both with @code{open-file} and @code{create-file}). Under Unix
3347 : anton 1.14 @code{create-file} creates a file with 666 permissions modified by your
3348 :     umask.
3349 :    
3350 :     @item file exceptions:
3351 :     The file words do not raise exceptions (except, perhaps, memory access
3352 :     faults when you pass illegal addresses or file-ids).
3353 :    
3354 :     @item file line terminator:
3355 :     System-dependent. Gforth uses C's newline character as line
3356 :     terminator. The actual character code(s) behind it are
3357 :     system-dependent.
3358 :    
3359 :     @item file name format:
3360 :     System dependent. Gforth just uses the file name format of your OS.
3361 :    
3362 :     @item information returned by @code{FILE-STATUS}:
3363 :     @code{FILE-STATUS} returns the most powerful file access mode allowed
3364 :     for the file: Either @code{R/O}, @code{W/O} or @code{R/W}. If the file
3365 :     cannot be accessed, @code{R/O BIN} is returned. @code{BIN} is applicable
3366 :     along with the returned mode.
3367 :    
3368 :     @item input file state after an exception when including source:
3369 :     All files that are left via the exception are closed.
3370 :    
3371 :     @item @var{ior} values and meaning:
3372 : anton 1.15 The @var{ior}s returned by the file and memory allocation words are
3373 :     intended as throw codes. They are typically in the range of OS errors,
3374 :     from -512 to -2047. The mapping from OS error numbers to
3375 :     @var{ior}s is -512@minus{}@var{errno}.
3376 : anton 1.14
3377 :     @item maximum depth of file input nesting:
3378 :     Limited by the amount of return stack, locals/TIB stack, and the number
3379 :     of open files available. This should not cause you any trouble.
3380 :    
3381 :     @item maximum size of input line:
3382 :     @code{/line}. Currently 255.
3383 :    
3384 :     @item methods of mapping block ranges to files:
3385 : anton 1.37 By default, blocks are accessed in the file @file{blocks.fb} in the
3386 :     current working directory. The file can be switched with @code{USE}.
3387 : anton 1.14
3388 :     @item number of string buffers provided by @code{S"}:
3389 :     1
3390 :    
3391 :     @item size of string buffer used by @code{S"}:
3392 :     @code{/line}. Currently 255.
3393 :    
3394 :     @end table
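
The @var{ior} mapping given in the table can be sketched in C as
follows. This is only an illustration of the documented formula
-512@minus{}@var{errno}; the function names @code{errno_to_ior} and
@code{ior_to_errno} are hypothetical and are not taken from the Gforth
sources. An @var{ior} of zero is assumed to mean success, as in ANS
Forth.

```c
/* Hypothetical helpers illustrating the documented mapping
   ior = -512 - errno; they are not part of Gforth. */

/* Map an OS error number to a throw code; 0 stays 0 (no error). */
int errno_to_ior(int err)
{
    return err == 0 ? 0 : -512 - err;
}

/* Inverse mapping: recover the OS error number from an ior. */
int ior_to_errno(int ior)
{
    return ior == 0 ? 0 : -512 - ior;
}
```

For example, an OS error number of 2 would correspond to the
@var{ior} -514 under this mapping.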
3395 :    
3396 :     @c ---------------------------------------------------------------------
3397 : anton 1.15 @node file-ambcond, , file-idef, The optional File-Access word set
3398 : anton 1.14 @subsection Ambiguous conditions
3399 :     @c ---------------------------------------------------------------------
3400 :    
3401 :     @table @i
3402 :    
3403 :     @item attempting to position a file outside its boundaries:
3404 :     @code{REPOSITION-FILE} is performed as usual: Afterwards,
3405 :     @code{FILE-POSITION} returns the value given to @code{REPOSITION-FILE}.
3406 :    
3407 :     @item attempting to read from file positions not yet written:
3408 :     End-of-file, i.e., zero characters are read and no error is reported.
3409 :    
3410 :     @item @var{file-id} is invalid (@code{INCLUDE-FILE}):
3411 :     An appropriate exception may be thrown, but a memory fault or other
3412 :     problem is more probable.
3413 :    
3414 :     @item I/O exception reading or closing @var{file-id} (@code{include-file}, @code{included}):
3415 :     The @var{ior} produced by the operation that discovered the problem is
3416 :     thrown.
3417 :    
3418 :     @item named file cannot be opened (@code{included}):
3419 :     The @var{ior} produced by @code{open-file} is thrown.
3420 :    
3421 :     @item requesting an unmapped block number:
3422 :     There are no unmapped legal block numbers. On some operating systems,
3423 :     writing a block with a large number may overflow the file system and
3424 :     result in an error message.
3425 :    
3426 :     @item using @code{source-id} when @code{blk} is non-zero:
3427 :     @code{source-id} performs its function. Typically it will give the id of
3428 :     the source which loaded the block. (Better ideas?)
3429 :    
3430 :     @end table
3431 :    
3432 :    
3433 :     @c =====================================================================
3434 :     @node The optional Floating-Point word set, The optional Locals word set, The optional File-Access word set, ANS conformance
3435 : anton 1.15 @section The optional Floating-Point word set
3436 : anton 1.14 @c =====================================================================
3437 :    
3438 :     @menu
3439 : anton 1.15 * floating-idef:: Implementation Defined Options
3440 :     * floating-ambcond:: Ambiguous Conditions
3441 : anton 1.14 @end menu
3442 :    
3443 :    
3444 :     @c ---------------------------------------------------------------------
3445 :     @node floating-idef, floating-ambcond, The optional Floating-Point word set, The optional Floating-Point word set
3446 :     @subsection Implementation Defined Options
3447 :     @c ---------------------------------------------------------------------
3448 :    
3449 :     @table @i
3450 :    
3451 : anton 1.15 @item format and range of floating point numbers:
3452 :     System-dependent; the @code{double} type of C.
3453 : anton 1.14
3454 : anton 1.15 @item results of @code{REPRESENT} when @var{float} is out of range:
3455 :     System dependent; @code{REPRESENT} is implemented using the C library
3456 :     function @code{ecvt()} and inherits its behaviour in this respect.
3457 : anton 1.14
3458 : anton 1.15 @item rounding or truncation of floating-point numbers:
3459 : anton 1.26 System dependent; the rounding behaviour is inherited from the hosting C
3460 :     compiler. IEEE-FP-based (i.e., most) systems by default round to
3461 :     nearest, and break ties by rounding to even (i.e., such that the last
3462 :     bit of the mantissa is 0).
3463 : anton 1.14
3464 : anton 1.15 @item size of floating-point stack:
3465 :     @code{s" FLOATING-STACK" environment? drop .}. Can be changed at startup
3466 :     with the command-line option @code{-f}.
3467 : anton 1.14
3468 : anton 1.15 @item width of floating-point stack:
3469 :     @code{1 floats}.
3470 : anton 1.14
3471 :     @end table
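
The round-to-nearest-even behaviour mentioned in the table can be
observed from C with a small sketch like the following. It assumes an
IEEE double with a 53-bit mantissa, the default rounding mode, and a
@code{long long} of at least 64 bits; the function name
@code{to_double} is made up for this illustration.

```c
/* Convert a 64-bit integer to double.  The volatile store makes sure
   the value is really rounded to double precision before it is
   returned (relevant on machines with wider FP registers). */
double to_double(long long d)
{
    volatile double r = (double)d;
    return r;
}
```

Under these assumptions, 2^53+1 lies exactly halfway between the
representable neighbours 2^53 and 2^53+2 and is rounded to 2^53, whose
last mantissa bit is 0; 2^53+3 is rounded up to 2^53+4 for the same
reason.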
3472 :    
3473 :    
3474 :     @c ---------------------------------------------------------------------
3475 : anton 1.15 @node floating-ambcond, , floating-idef, The optional Floating-Point word set
3476 :     @subsection Ambiguous conditions
3477 : anton 1.14 @c ---------------------------------------------------------------------
3478 :    
3479 :     @table @i
3480 :    
3481 : anton 1.15 @item @code{df@@} or @code{df!} used with an address that is not double-float aligned:
3482 : anton 1.37 System-dependent. Typically results in a @code{-23 THROW} like other
3483 : anton 1.15 alignment violations.
3484 : anton 1.14
3485 : anton 1.15 @item @code{f@@} or @code{f!} used with an address that is not float aligned:
3486 : anton 1.37 System-dependent. Typically results in a @code{-23 THROW} like other
3487 : anton 1.15 alignment violations.
3488 : anton 1.14
3489 : anton 1.15 @item Floating-point result out of range:
3490 :     System-dependent. Can result in a @code{-55 THROW} (Floating-point
3491 :     unidentified fault), or can produce a special value representing, e.g.,
3492 :     Infinity.
3493 : anton 1.14
3494 : anton 1.15 @item @code{sf@@} or @code{sf!} used with an address that is not single-float aligned:
3495 :     System-dependent. Typically results in an alignment fault like other
3496 :     alignment violations.
3497 : anton 1.14
3498 : anton 1.15 @item BASE is not decimal (@code{REPRESENT}, @code{F.}, @code{FE.}, @code{FS.}):
3499 :     The floating-point number is converted into decimal nonetheless.
3500 : anton 1.14
3501 : anton 1.15 @item Both arguments are equal to zero (@code{FATAN2}):
3502 :     System-dependent. @code{FATAN2} is implemented using the C library
3503 :     function @code{atan2()}.
3504 : anton 1.14
3505 : anton 1.15 @item Using ftan on an argument @var{r1} where cos(@var{r1}) is zero:
3506 :     System-dependent. In practice, the cos of @var{r1} will typically not be
3507 :     zero because of small errors, and the tan will be a very large positive
3508 :     or negative, but finite, number.
3509 : anton 1.14
3510 : anton 1.15 @item @var{d} cannot be represented precisely as a float in @code{D>F}:
3511 :     The result is rounded to the nearest float.
3512 : anton 1.14
3513 : anton 1.15 @item dividing by zero:
3514 :     @code{-55 throw} (Floating-point unidentified fault)
3515 : anton 1.14
3516 : anton 1.15 @item exponent too big for conversion (@code{DF!}, @code{DF@@}, @code{SF!}, @code{SF@@}):
3517 :     System dependent. On IEEE-FP based systems the number is converted into
3518 :     an infinity.
3519 : anton 1.14
3520 : anton 1.15 @item @var{float}<1 (@code{facosh}):
3521 :     @code{-55 throw} (Floating-point unidentified fault)
3522 : anton 1.14
3523 : anton 1.15 @item @var{float}=<-1 (@code{flnp1}):
3524 :     @code{-55 throw} (Floating-point unidentified fault). On IEEE-FP systems
3525 :     negative infinity is typically produced for @var{float}=-1.
3526 : anton 1.14
3527 : anton 1.15 @item @var{float}=<0 (@code{fln}, @code{flog}):
3528 :     @code{-55 throw} (Floating-point unidentified fault). On IEEE-FP systems
3529 :     negative infinity is typically produced for @var{float}=0.
3530 : anton 1.14
3531 : anton 1.15 @item @var{float}<0 (@code{fasinh}, @code{fsqrt}):
3532 :     @code{-55 throw} (Floating-point unidentified fault). @code{fasinh}
3533 :     produces values for these inputs on my Linux box (Bug in the C library?)
3534 : anton 1.14
3535 : anton 1.15 @item |@var{float}|>1 (@code{facos}, @code{fasin}, @code{fatanh}):
3536 :     @code{-55 throw} (Floating-point unidentified fault).
3537 : anton 1.14
3538 : anton 1.15 @item integer part of float cannot be represented by @var{d} in @code{f>d}:
3539 :     @code{-55 throw} (Floating-point unidentified fault).
3540 : anton 1.14
3541 : anton 1.15 @item string larger than pictured numeric output area (@code{f.}, @code{fe.}, @code{fs.}):
3542 :     This does not happen.
3543 :     @end table
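
The remark on @code{ftan} above can be checked with the C library
directly. The following sketch assumes IEEE doubles; the macro
@code{HALF_PI} and the function name are invented for this example. The
constant is the double nearest to pi/2, which is not exactly pi/2, so
its cosine is tiny but not zero.

```c
#include <math.h>

/* The double nearest to pi/2; the exact value is not representable. */
#define HALF_PI 1.5707963267948966

/* Tangent at the closest representable approximation of pi/2. */
double tan_at_half_pi(void)
{
    return tan(HALF_PI);
}
```

On such systems the result is a very large but finite number rather
than an infinity or an error.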
3544 : anton 1.14
3545 :    
3546 :    
3547 :     @c =====================================================================
3548 : anton 1.15 @node The optional Locals word set, The optional Memory-Allocation word set, The optional Floating-Point word set, ANS conformance
3549 :     @section The optional Locals word set
3550 : anton 1.14 @c =====================================================================
3551 :    
3552 :     @menu
3553 : anton 1.15 * locals-idef:: Implementation Defined Options
3554 :     * locals-ambcond:: Ambiguous Conditions
3555 : anton 1.14 @end menu
3556 :    
3557 :    
3558 :     @c ---------------------------------------------------------------------
3559 : anton 1.15 @node locals-idef, locals-ambcond, The optional Locals word set, The optional Locals word set
3560 : anton 1.14 @subsection Implementation Defined Options
3561 :     @c ---------------------------------------------------------------------
3562 :    
3563 :     @table @i
3564 :    
3565 : anton 1.15 @item maximum number of locals in a definition:
3566 :     @code{s" #locals" environment? drop .}. Currently 15. This is a lower
3567 :     bound; e.g., on a 32-bit machine there can be 41 locals of up to 8
3568 :     characters. The number of locals in a definition is bounded by the size
3569 :     of locals-buffer, which contains the names of the locals.
3570 : anton 1.14
3571 :     @end table
3572 :    
3573 :    
3574 :     @c ---------------------------------------------------------------------
3575 : anton 1.15 @node locals-ambcond, , locals-idef, The optional Locals word set
3576 : anton 1.14 @subsection Ambiguous conditions
3577 :     @c ---------------------------------------------------------------------
3578 :    
3579 :     @table @i
3580 :    
3581 : anton 1.15 @item executing a named local in interpretation state:
3582 :     @code{-14 throw} (Interpreting a compile-only word).
3583 : anton 1.14
3584 : anton 1.15 @item @var{name} not defined by @code{VALUE} or @code{(LOCAL)} (@code{TO}):
3585 :     @code{-32 throw} (Invalid name argument)
3586 : anton 1.14
3587 :     @end table
3588 :    
3589 :    
3590 :     @c =====================================================================
3591 : anton 1.15 @node The optional Memory-Allocation word set, The optional Programming-Tools word set, The optional Locals word set, ANS conformance
3592 :     @section The optional Memory-Allocation word set
3593 : anton 1.14 @c =====================================================================
3594 :    
3595 :     @menu
3596 : anton 1.15 * memory-idef:: Implementation Defined Options
3597 : anton 1.14 @end menu
3598 :    
3599 :    
3600 :     @c ---------------------------------------------------------------------
3601 : anton 1.15 @node memory-idef, , The optional Memory-Allocation word set, The optional Memory-Allocation word set
3602 : anton 1.14 @subsection Implementation Defined Options
3603 :     @c ---------------------------------------------------------------------
3604 :    
3605 :     @table @i
3606 :    
3607 : anton 1.15 @item values and meaning of @var{ior}:
3608 :     The @var{ior}s returned by the file and memory allocation words are
3609 :     intended as throw codes. They are typically in the range of OS errors,
3610 :     from -512 to -2047. The mapping from OS error numbers to
3611 :     @var{ior}s is -512@minus{}@var{errno}.
3612 : anton 1.14
3613 :     @end table
3614 :    
3615 :     @c =====================================================================
3616 : anton 1.15 @node The optional Programming-Tools word set, The optional Search-Order word set, The optional Memory-Allocation word set, ANS conformance
3617 :     @section The optional Programming-Tools word set
3618 : anton 1.14 @c =====================================================================
3619 :    
3620 :     @menu
3621 : anton 1.15 * programming-idef:: Implementation Defined Options
3622 :     * programming-ambcond:: Ambiguous Conditions
3623 : anton 1.14 @end menu
3624 :    
3625 :    
3626 :     @c ---------------------------------------------------------------------
3627 : anton 1.15 @node programming-idef, programming-ambcond, The optional Programming-Tools word set, The optional Programming-Tools word set
3628 : anton 1.14 @subsection Implementation Defined Options
3629 :     @c ---------------------------------------------------------------------
3630 :    
3631 :     @table @i
3632 :    
3633 : anton 1.15 @item ending sequence for input following @code{;code} and @code{code}:
3634 : anton 1.37 @code{end-code}
3635 : anton 1.14
3636 : anton 1.15 @item manner of processing input following @code{;code} and @code{code}:
3637 : anton 1.37 The @code{assembler} vocabulary is pushed on the search order stack, and
3638 :     the input is processed by the text interpreter, (starting) in interpret
3639 :     state.
3640 : anton 1.15
3641 :     @item search order capability for @code{EDITOR} and @code{ASSEMBLER}:
3642 : anton 1.37 The ANS Forth search order word set.
3643 : anton 1.15
3644 :     @item source and format of display by @code{SEE}:
3645 :     The source for @code{see} is the intermediate code used by the inner
3646 :     interpreter. The current @code{see} tries to output Forth source code
3647 :     as well as possible.
3648 :    
3649 : anton 1.14 @end table
3650 :    
3651 :     @c ---------------------------------------------------------------------
3652 : anton 1.15 @node programming-ambcond, , programming-idef, The optional Programming-Tools word set
3653 : anton 1.14 @subsection Ambiguous conditions
3654 :     @c ---------------------------------------------------------------------
3655 :    
3656 :     @table @i
3657 :    
3658 : anton 1.15 @item deleting the compilation wordlist (@code{FORGET}):
3659 :     Not implemented (yet).
3660 : anton 1.14
3661 : anton 1.15 @item fewer than @var{u}+1 items on the control flow stack (@code{CS-PICK}, @code{CS-ROLL}):
3662 :     This typically results in an @code{abort"} with a descriptive error
3663 :     message (may change into a @code{-22 throw} (Control structure mismatch)
3664 :     in the future). You may also get a memory access error. If you are
3665 :     unlucky, this ambiguous condition is not caught.
3666 :    
3667 :     @item @var{name} can't be found (@code{forget}):
3668 :     Not implemented (yet).
3669 : anton 1.14
3670 : anton 1.15 @item @var{name} not defined via @code{CREATE}:
3671 :     After defining @code{: X POSTPONE [IF] ; IMMEDIATE}, @code{X} is
3672 :     the execution semantics of the last defined word no matter how it was
3673 :     defined.
3674 : anton 1.14
3675 : anton 1.15 @item @code{POSTPONE} applied to @code{[IF]}:
3676 :     After defining @code{: X POSTPONE [IF] ; IMMEDIATE}. @code{X} is
3677 :     equivalent to @code{[IF]}.
3678 : anton 1.14
3679 : anton 1.15 @item reaching the end of the input source before matching @code{[ELSE]} or @code{[THEN]}:
3680 :     Continue in the same state of conditional compilation in the next outer
3681 :     input source. Currently there is no warning to the user about this.
3682 : anton 1.14
3683 : anton 1.15 @item removing a needed definition (@code{FORGET}):
3684 :     Not implemented (yet).
3685 : anton 1.14
3686 :     @end table
3687 :    
3688 :    
3689 :     @c =====================================================================
3690 : anton 1.15 @node The optional Search-Order word set, , The optional Programming-Tools word set, ANS conformance
3691 :     @section The optional Search-Order word set
3692 : anton 1.14 @c =====================================================================
3693 :    
3694 :     @menu
3695 : anton 1.15 * search-idef:: Implementation Defined Options
3696 :     * search-ambcond:: Ambiguous Conditions
3697 : anton 1.14 @end menu
3698 :    
3699 :    
3700 :     @c ---------------------------------------------------------------------
3701 : anton 1.15 @node search-idef, search-ambcond, The optional Search-Order word set, The optional Search-Order word set
3702 : anton 1.14 @subsection Implementation Defined Options
3703 :     @c ---------------------------------------------------------------------
3704 :    
3705 :     @table @i
3706 :    
3707 : anton 1.15 @item maximum number of word lists in search order:
3708 :     @code{s" wordlists" environment? drop .}. Currently 16.
3709 :    
3710 :     @item minimum search order:
3711 :     @code{root root}.
3712 : anton 1.14
3713 :     @end table
3714 :    
3715 :     @c ---------------------------------------------------------------------
3716 : anton 1.15 @node search-ambcond, , search-idef, The optional Search-Order word set
3717 : anton 1.14 @subsection Ambiguous conditions
3718 :     @c ---------------------------------------------------------------------
3719 :    
3720 :     @table @i
3721 :    
3722 : anton 1.15 @item changing the compilation wordlist (during compilation):
3723 : anton 1.33 The word is entered into the wordlist that was the compilation wordlist
3724 :     at the start of the definition. Any changes to the name field (e.g.,
3725 :     @code{immediate}) or the code field (e.g., when executing @code{DOES>})
3726 :     are applied to the latest defined word (as reported by @code{last} or
3727 :     @code{lastxt}), if possible, irrespective of the compilation wordlist.
3728 : anton 1.14
3729 : anton 1.15 @item search order empty (@code{previous}):
3730 :     @code{abort" Vocstack empty"}.
3731 : anton 1.14
3732 : anton 1.15 @item too many word lists in search order (@code{also}):
3733 :     @code{abort" Vocstack full"}.
3734 : anton 1.14
3735 :     @end table
3736 : anton 1.13
3737 : anton 1.34 @node Model, Integrating Gforth, ANS conformance, Top
3738 :     @chapter Model
3739 :    
3740 :     This chapter has yet to be written. It will contain information on
3741 :     which internal structures you can rely on.
3742 :    
3743 :     @node Integrating Gforth, Emacs and Gforth, Model, Top
3744 :     @chapter Integrating Gforth into C programs
3745 :    
3746 :     This is not yet implemented.
3747 :    
3748 :     Several people like to use Forth as a scripting language for applications
3749 :     that are otherwise written in C, C++, or some other language.
3750 :    
3751 :     The Forth system ATLAST provides facilities for embedding it into
3752 :     applications; unfortunately it has several disadvantages: most
3753 : anton 1.36 importantly, it is not based on ANS Forth, and it is apparently dead
3754 : anton 1.34 (i.e., not developed further and not supported). The facilities
3755 :     provided by Gforth in this area are inspired by ATLAST's facilities, so
3756 :     making the switch should not be hard.
3757 :    
3758 :     We also tried to design the interface such that it can easily be
3759 :     implemented by other Forth systems, so that we may one day arrive at a
3760 :     standardized interface. Such a standard interface would allow you to
3761 :     replace the Forth system without having to rewrite C code.
3762 :    
3763 :     You embed the Gforth interpreter by linking with the library
3764 :     @code{libgforth.a} (give the compiler the option @code{-lgforth}). All
3765 :     global symbols in this library that belong to the interface have the
3766 :     prefix @code{forth_}. (Global symbols that are used internally have the
3767 :     prefix @code{gforth_}.)
3768 :    
3769 :     You can include the declarations of Forth types and the functions and
3770 : anton 1.36 variables of the interface with @code{#include <forth.h>}.
3771 : anton 1.34
3772 :     Types.
3773 : anton 1.13
3774 : anton 1.34 Variables.
3775 :    
3776 :     Data and FP Stack pointer. Area sizes.
3777 :    
3778 :     functions.
3779 :    
3780 :     forth_init(imagefile)
3781 :     forth_evaluate(string) exceptions?
3782 :     forth_goto(address) (or forth_execute(xt)?)
3783 :     forth_continue() (a coroutining mechanism)
3784 :    
3785 :     Adding primitives.
3786 :    
3787 :     No checking.
3788 :    
3789 :     Signals?
3790 :    
3791 :     Accessing the Stacks
3792 : anton 1.4
3793 : anton 1.34 @node Emacs and Gforth, Internals, Integrating Gforth, Top
3794 : anton 1.17 @chapter Emacs and Gforth
3795 : anton 1.4
3796 : anton 1.17 Gforth comes with @file{gforth.el}, an improved version of
3797 : anton 1.33 @file{forth.el} by Goran Rydqvist (included in the TILE package). The
3798 : anton 1.4 improvements include better (but still not perfect) handling of
3799 :     indentation. I have also added comment paragraph filling (@kbd{M-q}),
3800 : anton 1.8 commenting (@kbd{C-x \}) and uncommenting (@kbd{C-u C-x \}) regions and
3801 :     removing debugging tracers (@kbd{C-x ~}, @pxref{Debugging}). I left the
3802 :     stuff I do not use alone, even though some of it only makes sense for
3803 :     TILE. To get a description of these features, enter Forth mode and type
3804 :     @kbd{C-h m}.
3805 : anton 1.4
3806 : anton 1.17 In addition, Gforth supports Emacs quite well: The source code locations
3807 : anton 1.4 given in error messages, debugging output (from @code{~~}) and failed
3808 :     assertion messages are in the right format for Emacs' compilation mode
3809 :     (@pxref{Compilation, , Running Compilations under Emacs, emacs, Emacs
3810 :     Manual}) so the source location corresponding to an error or other
3811 :     message is only a few keystrokes away (@kbd{C-x `} for the next error,
3812 :     @kbd{C-c C-c} for the error under the cursor).
3813 :    
3814 :     Also, if you @code{include} @file{etags.fs}, a new @file{TAGS} file
3815 :     (@pxref{Tags, , Tags Tables, emacs, Emacs Manual}) will be produced that
3816 :     contains the definitions of all words defined afterwards. You can then
3817 :     find the source for a word using @kbd{M-.}. Note that Emacs can use
3818 : anton 1.17 several tags files at the same time (e.g., one for the Gforth sources
3819 : anton 1.28 and one for your program, @pxref{Select Tags Table,,Selecting a Tags
3820 :     Table,emacs, Emacs Manual}). The TAGS file for the preloaded words is
3821 :     @file{$(datadir)/gforth/$(VERSION)/TAGS} (e.g.,
3822 : anton 1.33 @file{/usr/local/share/gforth/0.2.0/TAGS}).
3823 : anton 1.4
3824 :     To get all these benefits, add the following lines to your @file{.emacs}
3825 :     file:
3826 :    
3827 :     @example
3828 :     (autoload 'forth-mode "gforth.el")
3829 :     (setq auto-mode-alist (cons '("\\.fs\\'" . forth-mode) auto-mode-alist))
3830 :     @end example
3831 :    
3832 : anton 1.17 @node Internals, Bugs, Emacs and Gforth, Top
3833 : anton 1.3 @chapter Internals
3834 :    
3835 : anton 1.17 Reading this section is not necessary for programming with Gforth. It
3836 :     should be helpful for finding your way in the Gforth sources.
3837 : anton 1.3
3838 : anton 1.24 The ideas in this section have also been published in the papers
3839 :     @cite{ANS fig/GNU/??? Forth} (in German) by Bernd Paysan, presented at
3840 :     the Forth-Tagung '93 and @cite{A Portable Forth Engine} by M. Anton
3841 :     Ertl, presented at EuroForth '93; the latter is available at
3842 :     @*@file{http://www.complang.tuwien.ac.at/papers/ertl93.ps.Z}.
3843 :    
3844 : anton 1.4 @menu
3845 :     * Portability::
3846 :     * Threading::
3847 :     * Primitives::
3848 :     * System Architecture::
3849 : anton 1.17 * Performance::
3850 : anton 1.4 @end menu
3851 :    
3852 :     @node Portability, Threading, Internals, Internals
3853 : anton 1.3 @section Portability
3854 :    
3855 :     One of the main goals of the effort is availability across a wide range
3856 :     of personal machines. fig-Forth, and, to a lesser extent, F83, achieved
3857 :     this goal by manually coding the engine in assembly language for several
3858 :     then-popular processors. This approach is very labor-intensive and the
3859 :     results are short-lived due to progress in computer architecture.
3860 :    
3861 :     Others have avoided this problem by coding in C, e.g., Mitch Bradley
3862 :     (cforth), Mikael Patel (TILE) and Dirk Zoller (pfe). This approach is
3863 :     particularly popular for UNIX-based Forths due to the large variety of
3864 :     architectures of UNIX machines. Unfortunately an implementation in C
3865 :     does not mix well with the goals of efficiency and with using
3866 :     traditional techniques: Indirect or direct threading cannot be expressed
3867 :     in C, and switch threading, the fastest technique available in C, is
3868 :     significantly slower. Another problem with C is that it's very
3869 :     cumbersome to express double integer arithmetic.
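
For comparison, switch threading in standard C can be sketched as
follows. The opcodes and the function @code{run_switch} belong to a
hypothetical toy virtual machine invented for this illustration; they
are not part of any of the systems named above. Every instruction goes
through the single central @code{switch}, which is where the extra
dispatch cost comes from.

```c
/* Opcodes of a hypothetical toy virtual machine. */
enum { OP_LIT, OP_ADD, OP_HALT };

/* Run a program with switch threading; return the top of stack. */
long run_switch(const long *ip)
{
    long stack[16];
    long *sp = stack + 16;          /* the stack grows downwards */

    for (;;) {
        switch (*ip++) {            /* central dispatch point */
        case OP_LIT:
            *--sp = *ip++;          /* push the inline literal */
            break;
        case OP_ADD:
            sp[1] += sp[0];         /* add the two topmost items */
            sp++;
            break;
        case OP_HALT:
            return sp[0];
        }
    }
}
```

A program for this toy machine is simply an array of opcodes and
inline literals, e.g., @code{OP_LIT, 2, OP_LIT, 3, OP_ADD, OP_HALT}.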
3870 :    
3871 :     Fortunately, there is a portable language that does not have these
3872 :     limitations: GNU C, the version of C processed by the GNU C compiler
3873 :     (@pxref{C Extensions, , Extensions to the C Language Family, gcc.info,
3874 :     GNU C Manual}). Its labels as values feature (@pxref{Labels as Values, ,
3875 :     Labels as Values, gcc.info, GNU C Manual}) makes direct and indirect
3876 :     threading possible; its @code{long long} type (@pxref{Long Long, ,
3877 : anton 1.33 Double-Word Integers, gcc.info, GNU C Manual}) corresponds to Forth's
3878 : anton 1.32 double numbers@footnote{Unfortunately, long longs are not implemented
3879 :     properly on all machines (e.g., on alpha-osf1, long longs are only 64
3880 :     bits, the same size as longs (and pointers), but they should be twice as
3881 :     long according to @ref{Long Long, , Double-Word Integers, gcc.info, GNU
3882 :     C Manual}). So, we had to implement doubles in C after all. Still, on
3883 :     most machines we can use long longs and achieve better performance than
3884 :     with the emulation package.}. GNU C is available for free on all
3885 :     important (and many unimportant) UNIX machines, VMS, 80386s running
3886 :     MS-DOS, the Amiga, and the Atari ST, so a Forth written in GNU C can run
3887 :     on all these machines.
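
As an illustration of how a double-word integer type helps, here is a
sketch of a @code{UM*}-like operation. It assumes 32-bit cells and a
64-bit @code{unsigned long long}; the function name @code{um_star} and
its calling convention are made up for this example and do not reflect
how Gforth passes stack items.

```c
#include <stdint.h>

/* Multiply two unsigned 32-bit cells and deliver the 64-bit
   (double-cell) product as separate low and high cells. */
void um_star(uint32_t u1, uint32_t u2, uint32_t *lo, uint32_t *hi)
{
    unsigned long long ud = (unsigned long long)u1 * u2;

    *lo = (uint32_t)ud;             /* low cell of the product */
    *hi = (uint32_t)(ud >> 32);     /* high cell of the product */
}
```

Without such a type, the product would have to be assembled by hand
from partial products of half cells.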
3888 : anton 1.3
3889 :     Writing in a portable language has the reputation of producing code that
3890 :     is slower than assembly. For our Forth engine we repeatedly looked at
3891 :     the code produced by the compiler and eliminated most compiler-induced
3892 :     inefficiencies by appropriate changes in the source-code.
3893 :    
3894 :     However, register allocation cannot be portably influenced by the
3895 :     programmer, leading to some inefficiencies on register-starved
3896 :     machines. We use explicit register declarations (@pxref{Explicit Reg
3897 :     Vars, , Variables in Specified Registers, gcc.info, GNU C Manual}) to
3898 :     improve the speed on some machines. They are turned on by using the
3899 :     @code{gcc} switch @code{-DFORCE_REG}. Unfortunately, this feature not
3900 :     only depends on the machine, but also on the compiler version: On some
3901 :     machines some compiler versions produce incorrect code when certain
3902 :     explicit register declarations are used. So by default
3903 :     @code{-DFORCE_REG} is not used.
3904 :    
3905 : anton 1.4 @node Threading, Primitives, Portability, Internals
3906 : anton 1.3 @section Threading
3907 :    
3908 :     GNU C's labels as values extension (available since @code{gcc-2.0},
3909 :     @pxref{Labels as Values, , Labels as Values, gcc.info, GNU C Manual})
3910 :     makes it possible to take the address of @var{label} by writing
3911 :     @code{&&@var{label}}. This address can then be used in a statement like
3912 :     @code{goto *@var{address}}. I.e., @code{goto *&&x} is the same as
3913 :     @code{goto x}.
3914 :    
3915 :     With this feature an indirect threaded NEXT looks like:
3916 :     @example
3917 :     cfa = *ip++;
3918 :     ca = *cfa;
3919 :     goto *ca;
3920 :     @end example
3921 :     For those unfamiliar with the names: @code{ip} is the Forth instruction
3922 :     pointer; the @code{cfa} (code-field address) corresponds to ANS Forth's
3923 :     execution token and points to the code field of the next word to be
3924 :     executed; the @code{ca} (code address) fetched from there points to some
3925 :     executable code, e.g., a primitive or the colon definition handler
3926 :     @code{docol}.
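
To show how these pieces fit together, here is a minimal
indirect-threaded toy engine in GNU C. It is a sketch under several
assumptions, not the Gforth engine: the primitives, the code fields
@code{cf_lit}, @code{cf_add} and @code{cf_halt}, and the function
@code{run_indirect} are invented for this example, and it needs a
compiler that supports labels as values.

```c
#include <stdint.h>

/* A toy indirect-threaded engine; all names are illustrative. */
long run_indirect(void)
{
    /* Code fields: each holds the code address of a primitive. */
    void *cf_lit = &&do_lit, *cf_add = &&do_add, *cf_halt = &&do_halt;

    /* Threaded code: a sequence of cfas, with an inline literal
       following each reference to the lit code field. */
    void *code[] = { &cf_lit, (void *)(intptr_t)2,
                     &cf_lit, (void *)(intptr_t)3,
                     &cf_add, &cf_halt };
    void **ip = code;               /* Forth instruction pointer */
    void **cfa;                     /* code-field address */
    void *ca;                       /* code address */
    long stack[16];
    long *sp = stack + 16;          /* the stack grows downwards */

#define NEXT do { cfa = *ip++; ca = *cfa; goto *ca; } while (0)

    NEXT;

do_lit:
    *--sp = (long)(intptr_t)*ip++;  /* push the inline literal */
    NEXT;
do_add:
    sp[1] += sp[0];                 /* add the two topmost items */
    sp++;
    NEXT;
do_halt:
    return sp[0];
#undef NEXT
}
```

The threaded code here pushes 2 and 3, adds them, and stops, so the
function returns the sum.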
3927 :    
3928 :     Direct threading is even simpler:
3929 :     @example
3930 :     ca = *ip++;
3931 :     goto *ca;
3932 :     @end example
3933 :    
3934 :     Of course we have packaged the whole thing neatly in macros called
3935 :     @code{NEXT} and @code{NEXT1} (the part of NEXT after fetching the cfa).
3936 :    
3937 : anton 1.4 @menu
3938 :     * Scheduling::
3939 :     * Direct or Indirect Threaded?::
3940 :     * DOES>::
3941 :     @end menu
3942 :    
3943 :     @node Scheduling, Direct or Indirect Threaded?, Threading, Threading
3944 : anton 1.3 @subsection Scheduling
3945 :    
3946 :     There is a little complication: Pipelined and superscalar processors,
3947 :     i.e., RISC and some modern CISC machines, can process independent
3948 :     instructions while waiting for the results of an instruction. The
3949 :     compiler usually reorders (schedules) the instructions in a way that
3950 :     achieves good usage of these delay slots. However, on our first tries
3951 :     the compiler did not do well on scheduling primitives. E.g., for
3952 :     @code{+} implemented as
3953 :     @example
3954 :     n=sp[0]+sp[1];
3955 :     sp++;
3956 :     sp[0]=n;
3957 :     NEXT;
3958 :     @end example
3959 :     the NEXT comes strictly after the other code, i.e., there is nearly no
3960 :     scheduling. After a little thought the problem becomes clear: The
3961 :     compiler cannot know that sp and ip point to different addresses (and
3962 : anton 1.4 the version of @code{gcc} we used would not know it even if it was
3963 :     possible), so it could not move the load of the cfa above the store to
3964 :     the TOS. Indeed the pointers could be the same, if code on or very near
3965 :     the top of stack were executed. In the interest of speed we chose to
3966 :     forbid this probably unused ``feature'' and helped the compiler in
3967 :     scheduling: NEXT is divided into the loading part (@code{NEXT_P1}) and
3968 :     the goto part (@code{NEXT_P2}). @code{+} now looks like:
3969 : anton 1.3 @example
3970 :     n=sp[0]+sp[1];
3971 :     sp++;
3972 :     NEXT_P1;
3973 :     sp[0]=n;
3974 :     NEXT_P2;
3975 :     @end example
3976 : anton 1.4 This can be scheduled optimally by the compiler.
3977 : anton 1.3
3978 :     This division can be turned off with the switch @code{-DCISC_NEXT}. This
3979 :     switch is on by default on machines that do not profit from scheduling
3980 :     (e.g., the 80386), in order to preserve registers.
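
The split of NEXT into a loading part and a goto part can be tried out in
isolation. The following is a minimal sketch, not the Gforth engine: the
function run_demo, the two-cell program and the label I_halt are
assumptions made for illustration, and the code relies on the GNU C
labels-as-values extension.

```c
typedef long Cell;

/* Hypothetical mini-engine: the threaded program is an array of
   label addresses (GNU C "labels as values"). */
#define NEXT_P1 (ca = *ip++)   /* loading part: fetch the next code address */
#define NEXT_P2 goto *ca       /* goto part: dispatch to it */

Cell run_demo(Cell a, Cell b)
{
    /* Threaded code for "+ halt"; I_halt is an illustrative extra primitive. */
    void *program[] = { &&I_plus, &&I_halt };
    Cell stack[8];
    Cell *sp = stack + 6;      /* stack grows downwards; sp[0] is the TOS */
    void **ip = program;
    void *ca;
    Cell n;

    sp[1] = a;
    sp[0] = b;
    NEXT_P1;
    NEXT_P2;

I_plus:
    n = sp[0] + sp[1];
    sp++;
    NEXT_P1;                   /* the load is issued before the store to the TOS */
    sp[0] = n;
    NEXT_P2;

I_halt:
    return sp[0];
}
```

Because the load in NEXT_P1 precedes the store to sp[0] in the source, the
compiler does not have to prove that ip and sp point to different
addresses before it can overlap the two.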
3981 :    
3982 : anton 1.4 @node Direct or Indirect Threaded?, DOES>, Scheduling, Threading
3983 : anton 1.3 @subsection Direct or Indirect Threaded?
3984 :    
3985 :     Both! After packaging the nasty details in macro definitions we
3986 :     realized that we could switch between direct and indirect threading by
3987 :     simply setting a compilation flag (@code{-DDIRECT_THREADED}) and
3988 :     defining a few machine-specific macros for the direct-threading case.
3989 :     On the Forth level we also offer access words that hide the
3990 :     differences between the threading methods (@pxref{Threading Words}).
3991 :    
3992 :     Indirect threading is implemented completely
3993 :     machine-independently. Direct threading needs routines for creating
3994 :     jumps to the executable code (e.g. to docol or dodoes). These routines
3995 :     are inherently machine-dependent, but they do not amount to many source
3996 :     lines. I.e., even porting direct threading to a new machine is a small
3997 :     effort.
3998 :    
3999 : anton 1.4 @node DOES>, , Direct or Indirect Threaded?, Threading
4000 : anton 1.3 @subsection DOES>
4001 :     One of the most complex parts of a Forth engine is @code{dodoes}, i.e.,
4002 :     the chunk of code executed by every word defined by a
4003 :     @code{CREATE}...@code{DOES>} pair. The main problem here is: How to find
4004 :     the Forth code to be executed, i.e. the code after the @code{DOES>} (the
4005 :     DOES-code)? There are two solutions:
4006 :    
4007 :     In fig-Forth the code field points directly to the dodoes and the
4008 :     DOES-code address is stored in the cell after the code address
4009 :     (i.e. at cfa cell+). It may seem that this solution is illegal in the
4010 :     Forth-79 and all later standards, because in fig-Forth this address
4011 :     lies in the body (which is illegal in these standards). However, by
4012 :     making the code field larger for all words this solution becomes legal
4013 :     again. We use this approach for the indirect threaded version. Leaving
4014 :     a cell unused in most words is a bit wasteful, but on the machines we
4015 :     are targeting this is hardly a problem. The other reason for having a
4016 :     code field size of two cells is to avoid having different image files
4017 : anton 1.4 for direct and indirect threaded systems (@pxref{System Architecture}).
4018 : anton 1.3
4019 :     The other approach is that the code field points or jumps to the cell
4020 :     after @code{DOES>}. In this variant there is a jump to @code{dodoes} at
4021 :     this address. @code{dodoes} can then get the DOES-code address by
4022 :     computing the code address, i.e., the address of the jump to dodoes,
4023 :     and adding the length of that jump field. A variant of this is to have a
4024 :     call to @code{dodoes} after the @code{DOES>}; then the return address
4025 :     (which can be found in the return register on RISCs) is the DOES-code
4026 :     address. Since the two cells available in the code field are usually
4027 :     used up by the jump to the code address in direct threading, we use
4028 :     this approach for direct threading. We did not want to add another
4029 :     cell to the code field.
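
The first solution (a two-cell code field, as used for indirect threading)
can be sketched as follows. This is an illustration under stated
assumptions, not the code of the Gforth engine: struct word, its field
names, the stack sizes and the function-pointer calling convention are all
hypothetical.

```c
#include <stdint.h>

typedef intptr_t Cell;

struct word;
typedef void (*code_fn)(struct word *cfa);

/* Hypothetical word layout: a two-cell code field followed by the body. */
struct word {
    code_fn code;      /* first code-field cell: code address (here dodoes) */
    Cell *does_code;   /* second code-field cell: DOES-code address */
    Cell body[2];      /* body of the word */
};

static Cell data_stack[16], *sp = data_stack + 16;
static Cell *return_stack[16], **rp = return_stack + 16;
static Cell *ip;       /* instruction pointer into the threaded code */

/* Run-time action shared by all words defined with CREATE...DOES>. */
void dodoes(struct word *cfa)
{
    *--rp = ip;                /* save the caller's instruction pointer */
    *--sp = (Cell)cfa->body;   /* push the address of the body */
    ip = cfa->does_code;       /* continue with the code after DOES> */
}
```

In this layout dodoes finds the DOES-code address with a single load from
the cell after the code address; most words leave that second cell unused.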
4030 :    
4031 : anton 1.4 @node Primitives, System Architecture, Threading, Internals
4032 : anton 1.3 @section Primitives
4033 :    
4034 : anton 1.4 @menu
4035 :     * Automatic Generation::
4036 :     * TOS Optimization::
4037 :     * Produced code::
4038 :     @end menu
4039 :    
4040 :     @node Automatic Generation, TOS Optimization, Primitives, Primitives
4041 : anton 1.3 @subsection Automatic Generation
4042 :    
4043 :     Since the primitives are implemented in a portable language, there is no
4044 :     longer any need to minimize the number of primitives. On the contrary,
4045 :     having many primitives is an advantage: speed. In order to reduce the
4046 :     number of errors in primitives and to make programming them easier, we
4047 :     provide a tool, the primitive generator (@file{prims2x.fs}), that
4048 :     automatically generates most (and sometimes all) of the C code for a
4049 :     primitive from the stack effect notation. The source for a primitive
4050 :     has the following form:
4051 :    
4052 :     @format
4053 :     @var{Forth-name} @var{stack-effect} @var{category} [@var{pronounc.}]
4054 :     [@code{""}@var{glossary entry}@code{""}]
4055 :     @var{C code}
4056 :     [@code{:}
4057 :     @var{Forth code}]
4058 :     @end format
4059 :    
4060 :     The items in brackets are optional. The category and glossary fields
4061 :     are there for generating the documentation, the Forth code is there
4062 :     for manual implementations on machines without GNU C. E.g., the source
4063 :     for the primitive @code{+} is:
4064 :     @example
4065 :     + n1 n2 -- n core plus
4066 :     n = n1+n2;
4067 :     @end example
4068 :    
4069 :     This looks like a specification, but in fact @code{n = n1+n2} is C
4070 :     code. Our primitive generation tool extracts a lot of information from
4071 :     the stack effect notations@footnote{We use a one-stack notation, even
4072 :     though we have separate data and floating-point stacks; the separate
4073 :     notation can be generated easily from the unified notation.}: The number
4074 :     of items popped from and pushed on the stack, their type, and by what
4075 :     name they are referred to in the C code. It then generates a C code
4076 :     prelude and postlude for each primitive. The final C code for @code{+}
4077 :     looks like this:
4078 :    
4079 :     @example
4080 :     I_plus: /* + ( n1 n2 -- n ) */ /* label, stack effect */
4081 :     /* */ /* documentation */
4082 : anton 1.4 @{
4083 : anton 1.3 DEF_CA /* definition of variable ca (indirect threading) */
4084 :     Cell n1; /* definitions of variables */
4085 :     Cell n2;
4086 :     Cell n;
4087 :     n1 = (Cell) sp[1]; /* input */
4088 :     n2 = (Cell) TOS;
4089 :     sp += 1; /* stack adjustment */
4090 :     NAME("+") /* debugging output (with -DDEBUG) */
4091 : anton 1.4 @{
4092 : anton 1.3 n = n1+n2; /* C code taken from the source */
4093 : anton 1.4 @}
4094 : anton 1.3 NEXT_P1; /* NEXT part 1 */
4095 :     TOS = (Cell)n; /* output */
4096 :     NEXT_P2; /* NEXT part 2 */
4097 : anton 1.4 @}
4098 : anton 1.3 @end example
4099 :    
4100 :     This looks long and inefficient, but the GNU C compiler optimizes quite
4101 :     well and produces optimal code for @code{+} on, e.g., the R3000 and the
4102 :     HP RISC machines: Defining the @code{n}s does not produce any code, and
4103 :     using them as intermediate storage also adds no cost.
4104 :    
4105 :     There are also other optimizations that are not illustrated by this
4106 :     example: Assignments between simple variables are usually for free (copy
4107 :     propagation). If one of the stack items is not used by the primitive
4108 :     (e.g. in @code{drop}), the compiler eliminates the load from the stack
4109 :     (dead code elimination). On the other hand, there are some things that
4110 :     the compiler does not do, therefore they are performed by
4111 :     @file{prims2x.fs}: The compiler does not optimize code away that stores
4112 :     a stack item to the place where it just came from (e.g., @code{over}).
4113 :    
4114 :     While programming a primitive is usually easy, there are a few cases
4115 :     where the programmer has to take the actions of the generator into
4116 :     account, most notably @code{?dup}, but also words that do not (always)
4117 :     fall through to NEXT.
4118 :    
4119 : anton 1.4 @node TOS Optimization, Produced code, Automatic Generation, Primitives
4120 : anton 1.3 @subsection TOS Optimization
4121 :    
4122 :     An important optimization for stack machine emulators, e.g., Forth
4123 :     engines, is keeping one or more of the top stack items in
4124 : anton 1.4 registers. If a word has the stack effect @var{in1}...@var{inx} @code{--}
4125 :     @var{out1}...@var{outy}, keeping the top @var{n} items in registers
4126 : anton 1.34 @itemize @bullet
4127 : anton 1.3 @item
4128 :     is better than keeping @var{n-1} items, if @var{x>=n} and @var{y>=n},
4129 :     due to fewer loads from and stores to the stack.
4130 :     @item is slower than keeping @var{n-1} items, if @var{x<>y} and @var{x<n} and
4131 :     @var{y<n}, due to additional moves between registers.
4132 :     @end itemize
4133 :    
4134 :     In particular, keeping one item in a register is never a disadvantage,
4135 :     if there are enough registers. Keeping two items in registers is a
4136 :     disadvantage for frequent words like @code{?branch}, constants,
4137 :     variables, literals and @code{i}. Therefore our generator only produces
4138 :     code that keeps zero or one items in registers. The generated C code
4139 :     covers both cases; the selection between these alternatives is made at
4140 :     C-compile time using the switch @code{-DUSE_TOS}. @code{TOS} in the C
4141 :     code for @code{+} is just a simple variable name in the one-item case,
4142 :     otherwise it is a macro that expands into @code{sp[0]}. Note that the
4143 :     GNU C compiler tries to keep simple variables like @code{TOS} in
4144 :     registers, and it usually succeeds, if there are enough registers.
4145 :    
4146 :     The primitive generator performs the TOS optimization for the
4147 :     floating-point stack, too (@code{-DUSE_FTOS}). For floating-point
4148 :     operations the benefit of this optimization is even larger:
4149 :     floating-point operations take quite long on most processors, but can be
4150 :     performed in parallel with other operations as long as their results are
4151 :     not used. If the FP-TOS is kept in a register, this works. If
4152 :     it is kept on the stack, i.e., in memory, the store into memory has to
4153 :     wait for the result of the floating-point operation, lengthening the
4154 :     execution time of the primitive considerably.
4155 :    
4156 :     The TOS optimization makes the automatic generation of primitives a
4157 :     bit more complicated. Just replacing all occurrences of @code{sp[0]} by
4158 :     @code{TOS} is not sufficient. There are some special cases to
4159 :     consider:
4160 : anton 1.34 @itemize @bullet
4161 : anton 1.3 @item In the case of @code{dup ( w -- w w )} the generator must not
4162 :     eliminate the store to the original location of the item on the stack,
4163 :     if the TOS optimization is turned on.
4164 : anton 1.4 @item Primitives with stack effects of the form @code{--}
4165 :     @var{out1}...@var{outy} must store the TOS to the stack at the start.
4166 :     Likewise, primitives with the stack effect @var{in1}...@var{inx} @code{--}
4167 : anton 1.3 must load the TOS from the stack at the end. But for the null stack
4168 :     effect @code{--} no stores or loads should be generated.
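
The effect of the switch on the code for a primitive can be sketched in a
few lines. This is an illustration only: the function dup_plus and the
local variable tos are assumptions, and the code is written by hand rather
than produced by the primitive generator.

```c
typedef long Cell;

#ifndef USE_TOS
#define USE_TOS 1              /* assumption: normally set with -DUSE_TOS */
#endif

#if USE_TOS
#define TOS tos                /* top of stack cached in a simple variable */
#else
#define TOS sp[0]              /* top of stack kept in memory */
#endif

/* Executes "dup +" with x on top of the stack and returns the new top. */
Cell dup_plus(Cell x)
{
    Cell stack[8];
    Cell *sp = stack + 7;      /* sp[0] is the (possibly stale) TOS slot */
#if USE_TOS
    Cell tos;
#endif
    Cell w, n1, n2, n;

    TOS = x;

    /* dup ( w -- w w ) */
    w = TOS;
    sp -= 1;
    sp[1] = w;                 /* needed with a cached TOS: the slot was never written */
    TOS = w;

    /* + ( n1 n2 -- n ) */
    n1 = sp[1];
    n2 = TOS;
    sp += 1;
    n = n1 + n2;
    TOS = n;

    return TOS;
}
```

Without the cached TOS the store to sp[1] in dup writes a value back to
the place it just came from, which is why it can be eliminated in that
configuration only.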
4169 :     @end itemize
4170 :    
4171 : anton 1.4 @node Produced code, , TOS Optimization, Primitives
4172 : anton 1.3 @subsection Produced code
4173 :    
4174 :     To see what assembly code is produced for the primitives on your machine
4175 :     with your compiler and your flag settings, type @code{make engine.s} and
4176 : anton 1.4 look at the resulting file @file{engine.s}.
4177 : anton 1.3
4178 : anton 1.17 @node System Architecture, Performance, Primitives, Internals
4179 : anton 1.3 @section System Architecture
4180 :    
4181 :     Our Forth system consists not only of primitives, but also of
4182 :     definitions written in Forth. Since the Forth compiler itself belongs
4183 :     to those definitions, it is not possible to start the system with the
4184 :     primitives and the Forth source alone. Therefore we provide the Forth
4185 :     code as an image file in nearly executable form. At the start of the
4186 :     system a C routine loads the image file into memory, sets up the
4187 :     memory (stacks etc.) according to information in the image file, and
4188 :     starts executing Forth code.
4189 :    
4190 :     The image file format is a compromise between the goals of making it
4191 :     easy to generate image files and making them portable. The easiest way
4192 :     to generate an image file is to just generate a memory dump. However,
4193 :     this kind of image file cannot be used on a different machine, or on
4194 :     the next version of the engine on the same machine; it might not even
4195 :     work with the same engine compiled by a different version of the C
4196 :     compiler. We would like to have as few versions of the image file as
4197 :     possible, because we do not want to distribute many versions of the
4198 :     same image file, and to make it easy for the users to use their image
4199 :     files on many machines. We currently need to create a different image
4200 :     file for machines with different cell sizes and different byte order
4201 : anton 1.17 (little- or big-endian)@footnote{We are considering adding information to the
4202 : anton 1.3 image file that enables the loader to change the byte order.}.
4203 :    
4204 :     Forth code that is going to end up in a portable image file has to
4205 : anton 1.4 comply with some restrictions: addresses have to be stored in memory with
4206 :     special words (@code{A!}, @code{A,}, etc.) in order to make the code
4207 :     relocatable. Cells, floats, etc., have to be stored at the natural
4208 :     alignment boundaries@footnote{E.g., store floats (8 bytes) at an address
4209 :     divisible by~8. This happens automatically in our system when you use
4210 :     the ANS Forth alignment words.}, in order to avoid alignment faults on
4211 :     machines with stricter alignment. The image file is produced by a
4212 :     metacompiler (@file{cross.fs}).
4213 : anton 1.3
4214 :     So, unlike the image file of Mitch Bradley's @code{cforth}, our image
4215 :     file is not directly executable, but has to undergo some manipulations
4216 :     during loading. Address relocation is performed at image load-time, not
4217 :     at run-time. The loader also has to replace tokens standing for
4218 :     primitive calls with the appropriate code-field addresses (or code
4219 :     addresses in the case of direct threading).
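
Load-time relocation of this kind can be sketched as a single pass over
the image. The tag array, the tag names and the function relocate below
are hypothetical and chosen for illustration; they do not describe the
actual Gforth image file format.

```c
#include <stddef.h>
#include <stdint.h>

typedef intptr_t Cell;

/* Hypothetical per-cell tags; not the real Gforth image format. */
enum tag { TAG_DATA, TAG_ADDR, TAG_PRIM };

/* Rewrites the image in place, once, at load time:
   - TAG_ADDR cells hold a cell index into the image and become addresses;
   - TAG_PRIM cells hold a primitive number and become code addresses. */
void relocate(Cell *image, const unsigned char *tags, size_t ncells,
              void *const *prim_table)
{
    size_t i;

    for (i = 0; i < ncells; i++) {
        switch (tags[i]) {
        case TAG_ADDR:
            image[i] = (Cell)(image + image[i]);
            break;
        case TAG_PRIM:
            image[i] = (Cell)prim_table[image[i]];
            break;
        default:               /* TAG_DATA: plain numbers stay untouched */
            break;
        }
    }
}
```

After this pass the threaded code contains real addresses, so no address
computation is left for NEXT to do at run time.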
4220 : anton 1.4
4221 : anton 1.17 @node Performance, , System Architecture, Internals
4222 :     @section Performance
4223 :    
4224 :     On RISCs the Gforth engine is very close to optimal; i.e., it is usually
4225 :     impossible to write a significantly faster engine.
4226 :    
4227 :     On register-starved machines like the 386 architecture processors
4228 :     improvements are possible, because @code{gcc} does not utilize the
4229 :     registers as well as a human, even with explicit register declarations;
4230 :     e.g., Bernd Beuster wrote a Forth system fragment in assembly language
4231 :     and hand-tuned it for the 486; this system is 1.19 times faster on the
4232 :     Sieve benchmark on a 486DX2/66 than Gforth compiled with
4233 :     @code{gcc-2.6.3} with @code{-DFORCE_REG}.
4234 :    
4235 :     However, this potential advantage of assembly language implementations
4236 :     is not necessarily realized in complete Forth systems: We compared
4237 : anton 1.26 Gforth (direct threaded, compiled with @code{gcc-2.6.3} and
4238 :     @code{-DFORCE_REG}) with Win32Forth 1.2093, LMI's NT Forth (Beta, May
4239 :     1994) and Eforth (with and without peephole (aka pinhole) optimization
4240 :     of the threaded code); all these systems were written in assembly
4241 : anton 1.30 language. We also compared Gforth with three systems written in C:
4242 : anton 1.32 PFE-0.9.14 (compiled with @code{gcc-2.6.3} with the default
4243 :     configuration for Linux: @code{-O2 -fomit-frame-pointer -DUSE_REGS
4244 :     -DUNROLL_NEXT}), ThisForth Beta (compiled with gcc-2.6.3 -O3
4245 :     -fomit-frame-pointer; ThisForth employs peephole optimization of the
4246 :     threaded code) and TILE (compiled with @code{make opt}). We benchmarked
4247 :     Gforth, PFE, ThisForth and TILE on a 486DX2/66 under Linux. Kenneth
4248 :     O'Heskin kindly provided the results for Win32Forth and NT Forth on a
4249 :     486DX2/66 with similar memory performance under Windows NT. Marcel
4250 :     Hendrix ported Eforth to Linux, then extended it to run the benchmarks,
4251 :     added the peephole optimizer, ran the benchmarks and reported the
4252 :     results.
4253 : anton 1.17
4254 :     We used four small benchmarks: the ubiquitous Sieve; bubble-sorting and
4255 :     matrix multiplication come from the Stanford integer benchmarks and have
4256 :     been translated into Forth by Martin Fraeman; we used the versions
4257 : anton 1.30 included in the TILE Forth package, but with bigger data set sizes; and
4258 :     a recursive Fibonacci number computation for benchmarking calling
4259 :     performance. The following table shows the time taken for the benchmarks
4260 :     scaled by the time taken by Gforth (in other words, it shows the speedup
4261 :     factor that Gforth achieved over the other systems).
4262 : anton 1.17
4263 :     @example
4264 : anton 1.30 relative      Win32-    NT       eforth       This-
4265 :              time  Gforth Forth Forth eforth  +opt   PFE Forth  TILE
4266 : anton 1.32 sieve     1.00  1.39  1.14   1.39  0.85  1.58  3.18  8.58
4267 :            bubble    1.00  1.31  1.41   1.48  0.88  1.50        3.88
4268 : anton 1.38 matmul    1.00  1.47  1.35   1.46  0.74  1.58        4.09
4269 :            fib       1.00  1.52  1.34   1.22  0.86  1.74  2.99  4.30
4270 : anton 1.17 @end example
4271 :    
4272 :     You may find the good performance of Gforth compared with the systems
4273 :     written in assembly language quite surprising. One important reason for
4274 :     the disappointing performance of these systems is probably that they are
4275 :     not written optimally for the 486 (e.g., they use the @code{lods}
4276 :     instruction). In addition, Win32Forth uses a comfortable, but costly
4277 :     method for relocating the Forth image: like @code{cforth}, it computes
4278 :     the actual addresses at run time, resulting in two address computations
4279 :     per NEXT (@pxref{System Architecture}).
4280 :    
4281 : anton 1.26 Only Eforth with the peephole optimizer performs comparably to
4282 :     Gforth. The speedups achieved with peephole optimization of threaded
4283 :     code are quite remarkable. Adding a peephole optimizer to Gforth should
4284 :     cause similar speedups.
4285 :    
4286 : anton 1.30 The speedup of Gforth over PFE, ThisForth and TILE can be easily
4287 :     explained with the self-imposed restriction to standard C, which makes
4288 :     efficient threading impossible (however, the measured implementation of
4289 :     PFE uses a GNU C extension: @ref{Global Reg Vars, , Defining Global
4290 :     Register Variables, gcc.info, GNU C Manual}). Moreover, current C
4291 :     compilers have a hard time optimizing other aspects of the ThisForth
4292 :     and the TILE source.
4293 : anton 1.17
4294 :     Note that the performance of Gforth on 386 architecture processors
4295 :     varies widely with the version of @code{gcc} used. E.g., @code{gcc-2.5.8}
4296 :     failed to allocate any of the virtual machine registers into real
4297 :     machine registers by itself and would not work correctly with explicit
4298 :     register declarations, giving a 1.3 times slower engine (on a 486DX2/66
4299 :     running the Sieve) than the one measured above.
4300 :    
4301 : anton 1.26 In @cite{Translating Forth to Efficient C} by M. Anton Ertl and Martin
4302 :     Maierhofer (presented at EuroForth '95), an indirect threaded version of
4303 :     Gforth is compared with Win32Forth, NT Forth, PFE, and ThisForth; that
4304 :     version of Gforth is 2\%@minus{}8\% slower on a 486 than the version
4305 :     used here. The paper is available at
4306 : anton 1.24 @*@file{http://www.complang.tuwien.ac.at/papers/ertl&maierhofer95.ps.gz};
4307 :     it also contains numbers for some native code systems. You can find
4308 :     numbers for Gforth on various machines in @file{Benchres}.
4309 :    
4310 : anton 1.29 @node Bugs, Origin, Internals, Top
4311 : anton 1.4 @chapter Bugs
4312 :    
4313 : anton 1.17 Known bugs are described in the file BUGS in the Gforth distribution.
4314 :    
4315 : anton 1.24 If you find a bug, please send a bug report to
4316 : anton 1.32 @code{bug-gforth@@gnu.ai.mit.edu}. A bug report should
4317 : anton 1.17 describe the Gforth version used (it is announced at the start of an
4318 :     interactive Gforth session), the machine and operating system (on Unix
4319 :     systems you can use @code{uname -a} to produce this information), the
4320 : anton 1.24 installation options (send the @code{config.status} file), and a
4321 :     complete list of changes you (or your installer) have made to the Gforth
4322 :     sources (if any); it should contain a program (or a sequence of keyboard
4323 :     commands) that reproduces the bug and a description of what you think
4324 :     constitutes the buggy behaviour.
4325 : anton 1.17
4326 :     For a thorough guide on reporting bugs read @ref{Bug Reporting, , How
4327 :     to Report Bugs, gcc.info, GNU C Manual}.
4328 :    
4329 :    
4330 : anton 1.29 @node Origin, Word Index, Bugs, Top
4331 :     @chapter Authors and Ancestors of Gforth
4332 :    
4333 :     @section Authors and Contributors
4334 :    
4335 :     The Gforth project was started in mid-1992 by Bernd Paysan and Anton
4336 : anton 1.30 Ertl. The third major author was Jens Wilke. Lennart Benschop (who was
4337 :     one of Gforth's first users, in mid-1993) and Stuart Ramsden inspired us
4338 :     with their continuous feedback. Lennart Benschop contributed
4339 : anton 1.29 @file{glosgen.fs}, while Stuart Ramsden has been working on automatic
4340 :     support for calling C libraries. Helpful comments also came from Paul
4341 : anton 1.37 Kleinrubatscher, Christian Pirker, Dirk Zoller, Marcel Hendrix, John
4342 : anton 1.39 Wavrik, Barrie Stott and Marc de Groot.
4343 : anton 1.29
4344 : anton 1.30 Gforth also owes a lot to the authors of the tools we used (GCC, CVS,
4345 :     and autoconf, among others), and to the creators of the Internet: Gforth
4346 :     was developed across the Internet, and its authors have not met
4347 :     physically yet.
4348 :    
4349 : anton 1.29 @section Pedigree
4350 : anton 1.4
4351 : anton 1.17 Gforth descends from BigForth (1993) and fig-Forth. Gforth and PFE (by
4352 : anton 1.24 Dirk Zoller) will cross-fertilize each other. Of course, a significant
4353 :     part of the design of Gforth was prescribed by ANS Forth.
4354 : anton 1.17
4355 : pazsan 1.23 Bernd Paysan wrote BigForth, a descendant of TurboForth, an unreleased
4356 :     32 bit native code version of VolksForth for the Atari ST, written
4357 :     mostly by Dietrich Weineck.
4358 :    
4359 :     VolksForth descends from F83. It was written by Klaus Schleisiek, Bernd
4360 :     Pennemann, Georg Rehfeld and Dietrich Weineck for the C64 (called
4361 : anton 1.24 UltraForth there) in the mid-80s and ported to the Atari ST in 1986.
4362 : anton 1.17
4363 : anton 1.34 Henry Laxen and Mike Perry wrote F83 as a model implementation of the
4364 : anton 1.17 Forth-83 standard. !! Pedigree? When?
4365 :    
4366 :     A team led by Bill Ragsdale implemented fig-Forth on many processors in
4367 : anton 1.24 1979. Robert Selzer and Bill Ragsdale developed the original
4368 :     implementation of fig-Forth for the 6502 based on microForth.
4369 :    
4370 :     The principal architect of microForth was Dean Sanderson. microForth was
4371 :     FORTH, Inc.'s first off-the-shelf product. It was developed in 1976 for
4372 :     the 1802, and subsequently implemented on the 8080, the 6800 and the
4373 :     Z80.
4374 : anton 1.17
4375 : anton 1.24 All earlier Forth systems were custom-made, usually by Charles Moore,
4376 : anton 1.30 who discovered (as he puts it) Forth during the late 60s. The first full
4377 :     Forth existed in 1971.
4378 : anton 1.17
4379 :     A part of the information in this section comes from @cite{The Evolution
4380 :     of Forth} by Elizabeth D. Rather, Donald R. Colburn and Charles
4381 :     H. Moore, presented at the HOPL-II conference and preprinted in SIGPLAN
4382 :     Notices 28(3), 1993. You can find more historical and genealogical
4383 :     information about Forth there.
4384 :    
4385 : anton 1.29 @node Word Index, Node Index, Origin, Top
4386 : anton 1.4 @chapter Word Index
4387 :    
4388 : anton 1.18 This index is as incomplete as the manual. Each word is listed with
4389 :     stack effect and wordset.
4390 : anton 1.17
4391 :     @printindex fn
4392 :    
4393 : anton 1.4 @node Node Index, , Word Index, Top
4394 :     @chapter Node Index
4395 : anton 1.17
4396 :     This index is even less complete than the manual.
4397 : anton 1.1
4398 :     @contents
4399 :     @bye
4400 :    