Annotation of gforth/doc/gforth.ds, revision 1.12
1.1 anton 1: \input texinfo @c -*-texinfo-*-
2: @comment The source is gforth.ds, from which gforth.texi is generated
3: @comment %**start of header (This is for running Texinfo on a region.)
4: @setfilename gforth.info
5: @settitle Gforth Manual
6: @dircategory GNU programming tools
7: @direntry
8: * Gforth: (gforth). A fast interpreter for the Forth language.
9: @end direntry
10: @comment @setchapternewpage odd
1.12 ! anton 11: @macro progstyle {}
! 12: Programming style note:
1.3 anton 13: @end macro
1.1 anton 14: @comment %**end of header (This is for running Texinfo on a region.)
15:
1.10 anton 16: @include version.texi
17:
1.1 anton 18: @ifinfo
1.11 anton 19: This file documents Gforth @value{VERSION}
1.1 anton 20:
21: Copyright @copyright{} 1995-1997 Free Software Foundation, Inc.
22:
23: Permission is granted to make and distribute verbatim copies of
24: this manual provided the copyright notice and this permission notice
25: are preserved on all copies.
26:
27: @ignore
28: Permission is granted to process this file through TeX and print the
29: results, provided the printed document carries a copying permission
30: notice identical to this one except for the removal of this paragraph
31: (this paragraph not being relevant to the printed manual).
32:
33: @end ignore
34: Permission is granted to copy and distribute modified versions of this
35: manual under the conditions for verbatim copying, provided also that the
36: sections entitled "Distribution" and "General Public License" are
37: included exactly as in the original, and provided that the entire
38: resulting derived work is distributed under the terms of a permission
39: notice identical to this one.
40:
41: Permission is granted to copy and distribute translations of this manual
42: into another language, under the above conditions for modified versions,
43: except that the sections entitled "Distribution" and "General Public
44: License" may be included in a translation approved by the author instead
45: of in the original English.
46: @end ifinfo
47:
48: @finalout
49: @titlepage
50: @sp 10
51: @center @titlefont{Gforth Manual}
52: @sp 2
1.11 anton 53: @center for version @value{VERSION}
1.1 anton 54: @sp 2
55: @center Anton Ertl
1.6 pazsan 56: @center Bernd Paysan
1.5 anton 57: @center Jens Wilke
1.1 anton 58: @sp 3
1.6 pazsan 59: @center This manual is permanently under construction
1.1 anton 60:
61: @comment The following two commands start the copyright page.
62: @page
63: @vskip 0pt plus 1filll
64: Copyright @copyright{} 1995--1997 Free Software Foundation, Inc.
65:
66: @comment !! Published by ... or You can get a copy of this manual ...
67:
68: Permission is granted to make and distribute verbatim copies of
69: this manual provided the copyright notice and this permission notice
70: are preserved on all copies.
71:
72: Permission is granted to copy and distribute modified versions of this
73: manual under the conditions for verbatim copying, provided also that the
74: sections entitled "Distribution" and "General Public License" are
75: included exactly as in the original, and provided that the entire
76: resulting derived work is distributed under the terms of a permission
77: notice identical to this one.
78:
79: Permission is granted to copy and distribute translations of this manual
80: into another language, under the above conditions for modified versions,
81: except that the sections entitled "Distribution" and "General Public
82: License" may be included in a translation approved by the author instead
83: of in the original English.
84: @end titlepage
85:
86:
87: @node Top, License, (dir), (dir)
88: @ifinfo
89: Gforth is a free implementation of ANS Forth available on many
1.11 anton 90: personal machines. This manual corresponds to version @value{VERSION}.
1.1 anton 91: @end ifinfo
92:
93: @menu
94: * License::
95: * Goals:: About the Gforth Project
96: * Other Books:: Things you might want to read
97: * Invoking Gforth:: Starting Gforth
98: * Words:: Forth words available in Gforth
99: * Tools:: Programming tools
100: * ANS conformance:: Implementation-defined options etc.
101: * Model:: The abstract machine of Gforth
102: * Integrating Gforth:: Forth as scripting language for applications
103: * Emacs and Gforth:: The Gforth Mode
104: * Image Files:: @code{.fi} files contain compiled code
105: * Engine:: The inner interpreter and the primitives
106: * Bugs:: How to report them
107: * Origin:: Authors and ancestors of Gforth
108: * Word Index:: An item for each Forth word
109: * Concept Index:: A menu covering many topics
1.12 ! anton 110:
! 111: --- The Detailed Node Listing ---
! 112:
! 113: Forth Words
! 114:
! 115: * Notation::
! 116: * Arithmetic::
! 117: * Stack Manipulation::
! 118: * Memory::
! 119: * Control Structures::
! 120: * Locals::
! 121: * Defining Words::
! 122: * Structures::
! 123: * Object-oriented Forth::
! 124: * Tokens for Words::
! 125: * Wordlists::
! 126: * Files::
! 127: * Including Files::
! 128: * Blocks::
! 129: * Other I/O::
! 130: * Programming Tools::
! 131: * Assembler and Code Words::
! 132: * Threading Words::
! 133:
! 134: Arithmetic
! 135:
! 136: * Single precision::
! 137: * Bitwise operations::
! 138: * Mixed precision:: operations with single and double-cell integers
! 139: * Double precision:: Double-cell integer arithmetic
! 140: * Floating Point::
! 141:
! 142: Stack Manipulation
! 143:
! 144: * Data stack::
! 145: * Floating point stack::
! 146: * Return stack::
! 147: * Locals stack::
! 148: * Stack pointer manipulation::
! 149:
! 150: Memory
! 151:
! 152: * Memory Access::
! 153: * Address arithmetic::
! 154: * Memory Blocks::
! 155:
! 156: Control Structures
! 157:
! 158: * Selection::
! 159: * Simple Loops::
! 160: * Counted Loops::
! 161: * Arbitrary control structures::
! 162: * Calls and returns::
! 163: * Exception Handling::
! 164:
! 165: Locals
! 166:
! 167: * Gforth locals::
! 168: * ANS Forth locals::
! 169:
! 170: Gforth locals
! 171:
! 172: * Where are locals visible by name?::
! 173: * How long do locals live?::
! 174: * Programming Style::
! 175: * Implementation::
! 176:
! 177: Defining Words
! 178:
! 179: * Simple Defining Words::
! 180: * Colon Definitions::
! 181: * User-defined Defining Words::
! 182: * Supplying names::
! 183: * Interpretation and Compilation Semantics::
! 184:
! 185: Structures
! 186:
! 187: * Why explicit structure support?::
! 188: * Structure Usage::
! 189: * Structure Naming Convention::
! 190: * Structure Implementation::
! 191: * Structure Glossary::
! 192:
! 193: Object-oriented Forth
! 194:
! 195: * Objects::
! 196: * OOF::
! 197: * Mini-OOF::
! 198:
! 199: Objects
! 200:
! 201: * Properties of the Objects model::
! 202: * Why object-oriented programming?::
! 203: * Object-Oriented Terminology::
! 204: * Basic Objects Usage::
! 205: * The class Object::
! 206: * Creating objects::
! 207: * Object-Oriented Programming Style::
! 208: * Class Binding::
! 209: * Method conveniences::
! 210: * Classes and Scoping::
! 211: * Object Interfaces::
! 212: * Objects Implementation::
! 213: * Comparison with other object models::
! 214: * Objects Glossary::
! 215:
! 216: OOF
! 217:
! 218: * Properties of the OOF model::
! 219: * Basic OOF Usage::
! 220: * The base class object::
! 221: * Class Declaration::
! 222: * Class Implementation::
! 223:
! 224: Including Files
! 225:
! 226: * Words for Including::
! 227: * Search Path::
! 228: * Changing the Search Path::
! 229: * General Search Paths::
! 230:
! 231: Programming Tools
! 232:
! 233: * Debugging:: Simple and quick.
! 234: * Assertions:: Making your programs self-checking.
! 235: * Singlestep Debugger:: Executing your program word by word.
! 236:
! 237: Tools
! 238:
! 239: * ANS Report:: Report the words used, sorted by wordset.
! 240:
! 241: ANS conformance
! 242:
! 243: * The Core Words::
! 244: * The optional Block word set::
! 245: * The optional Double Number word set::
! 246: * The optional Exception word set::
! 247: * The optional Facility word set::
! 248: * The optional File-Access word set::
! 249: * The optional Floating-Point word set::
! 250: * The optional Locals word set::
! 251: * The optional Memory-Allocation word set::
! 252: * The optional Programming-Tools word set::
! 253: * The optional Search-Order word set::
! 254:
! 255: The Core Words
! 256:
! 257: * core-idef:: Implementation Defined Options
! 258: * core-ambcond:: Ambiguous Conditions
! 259: * core-other:: Other System Documentation
! 260:
! 261: The optional Block word set
! 262:
! 263: * block-idef:: Implementation Defined Options
! 264: * block-ambcond:: Ambiguous Conditions
! 265: * block-other:: Other System Documentation
! 266:
! 267: The optional Double Number word set
! 268:
! 269: * double-ambcond:: Ambiguous Conditions
! 270:
! 271: The optional Exception word set
! 272:
! 273: * exception-idef:: Implementation Defined Options
! 274:
! 275: The optional Facility word set
! 276:
! 277: * facility-idef:: Implementation Defined Options
! 278: * facility-ambcond:: Ambiguous Conditions
! 279:
! 280: The optional File-Access word set
! 281:
! 282: * file-idef:: Implementation Defined Options
! 283: * file-ambcond:: Ambiguous Conditions
! 284:
! 285: The optional Floating-Point word set
! 286:
! 287: * floating-idef:: Implementation Defined Options
! 288: * floating-ambcond:: Ambiguous Conditions
! 289:
! 290: The optional Locals word set
! 291:
! 292: * locals-idef:: Implementation Defined Options
! 293: * locals-ambcond:: Ambiguous Conditions
! 294:
! 295: The optional Memory-Allocation word set
! 296:
! 297: * memory-idef:: Implementation Defined Options
! 298:
! 299: The optional Programming-Tools word set
! 300:
! 301: * programming-idef:: Implementation Defined Options
! 302: * programming-ambcond:: Ambiguous Conditions
! 303:
! 304: The optional Search-Order word set
! 305:
! 306: * search-idef:: Implementation Defined Options
! 307: * search-ambcond:: Ambiguous Conditions
! 308:
! 309: Image Files
! 310:
! 311: * Image File Background:: Why have image files?
! 312: * Non-Relocatable Image Files:: don't always work.
! 313: * Data-Relocatable Image Files:: are better.
! 314: * Fully Relocatable Image Files:: better yet.
! 315: * Stack and Dictionary Sizes:: Setting the default sizes for an image.
! 316: * Running Image Files:: @code{gforth -i @var{file}} or @var{file}.
! 317: * Modifying the Startup Sequence:: and turnkey applications.
! 318:
! 319: Fully Relocatable Image Files
! 320:
! 321: * gforthmi:: The normal way
! 322: * cross.fs:: The hard way
! 323:
! 324: Engine
! 325:
! 326: * Portability::
! 327: * Threading::
! 328: * Primitives::
! 329: * Performance::
! 330:
! 331: Threading
! 332:
! 333: * Scheduling::
! 334: * Direct or Indirect Threaded?::
! 335: * DOES>::
! 336:
! 337: Primitives
! 338:
! 339: * Automatic Generation::
! 340: * TOS Optimization::
! 341: * Produced code::
1.1 anton 342: @end menu
343:
344: @node License, Goals, Top, Top
345: @unnumbered GNU GENERAL PUBLIC LICENSE
346: @center Version 2, June 1991
347:
348: @display
349: Copyright @copyright{} 1989, 1991 Free Software Foundation, Inc.
350: 675 Mass Ave, Cambridge, MA 02139, USA
351:
352: Everyone is permitted to copy and distribute verbatim copies
353: of this license document, but changing it is not allowed.
354: @end display
355:
356: @unnumberedsec Preamble
357:
358: The licenses for most software are designed to take away your
359: freedom to share and change it. By contrast, the GNU General Public
360: License is intended to guarantee your freedom to share and change free
361: software---to make sure the software is free for all its users. This
362: General Public License applies to most of the Free Software
363: Foundation's software and to any other program whose authors commit to
364: using it. (Some other Free Software Foundation software is covered by
365: the GNU Library General Public License instead.) You can apply it to
366: your programs, too.
367:
368: When we speak of free software, we are referring to freedom, not
369: price. Our General Public Licenses are designed to make sure that you
370: have the freedom to distribute copies of free software (and charge for
371: this service if you wish), that you receive source code or can get it
372: if you want it, that you can change the software or use pieces of it
373: in new free programs; and that you know you can do these things.
374:
375: To protect your rights, we need to make restrictions that forbid
376: anyone to deny you these rights or to ask you to surrender the rights.
377: These restrictions translate to certain responsibilities for you if you
378: distribute copies of the software, or if you modify it.
379:
380: For example, if you distribute copies of such a program, whether
381: gratis or for a fee, you must give the recipients all the rights that
382: you have. You must make sure that they, too, receive or can get the
383: source code. And you must show them these terms so they know their
384: rights.
385:
386: We protect your rights with two steps: (1) copyright the software, and
387: (2) offer you this license which gives you legal permission to copy,
388: distribute and/or modify the software.
389:
390: Also, for each author's protection and ours, we want to make certain
391: that everyone understands that there is no warranty for this free
392: software. If the software is modified by someone else and passed on, we
393: want its recipients to know that what they have is not the original, so
394: that any problems introduced by others will not reflect on the original
395: authors' reputations.
396:
397: Finally, any free program is threatened constantly by software
398: patents. We wish to avoid the danger that redistributors of a free
399: program will individually obtain patent licenses, in effect making the
400: program proprietary. To prevent this, we have made it clear that any
401: patent must be licensed for everyone's free use or not licensed at all.
402:
403: The precise terms and conditions for copying, distribution and
404: modification follow.
405:
406: @iftex
407: @unnumberedsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
408: @end iftex
409: @ifinfo
410: @center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
411: @end ifinfo
412:
413: @enumerate 0
414: @item
415: This License applies to any program or other work which contains
416: a notice placed by the copyright holder saying it may be distributed
417: under the terms of this General Public License. The ``Program'', below,
418: refers to any such program or work, and a ``work based on the Program''
419: means either the Program or any derivative work under copyright law:
420: that is to say, a work containing the Program or a portion of it,
421: either verbatim or with modifications and/or translated into another
422: language. (Hereinafter, translation is included without limitation in
423: the term ``modification''.) Each licensee is addressed as ``you''.
424:
425: Activities other than copying, distribution and modification are not
426: covered by this License; they are outside its scope. The act of
427: running the Program is not restricted, and the output from the Program
428: is covered only if its contents constitute a work based on the
429: Program (independent of having been made by running the Program).
430: Whether that is true depends on what the Program does.
431:
432: @item
433: You may copy and distribute verbatim copies of the Program's
434: source code as you receive it, in any medium, provided that you
435: conspicuously and appropriately publish on each copy an appropriate
436: copyright notice and disclaimer of warranty; keep intact all the
437: notices that refer to this License and to the absence of any warranty;
438: and give any other recipients of the Program a copy of this License
439: along with the Program.
440:
441: You may charge a fee for the physical act of transferring a copy, and
442: you may at your option offer warranty protection in exchange for a fee.
443:
444: @item
445: You may modify your copy or copies of the Program or any portion
446: of it, thus forming a work based on the Program, and copy and
447: distribute such modifications or work under the terms of Section 1
448: above, provided that you also meet all of these conditions:
449:
450: @enumerate a
451: @item
452: You must cause the modified files to carry prominent notices
453: stating that you changed the files and the date of any change.
454:
455: @item
456: You must cause any work that you distribute or publish, that in
457: whole or in part contains or is derived from the Program or any
458: part thereof, to be licensed as a whole at no charge to all third
459: parties under the terms of this License.
460:
461: @item
462: If the modified program normally reads commands interactively
463: when run, you must cause it, when started running for such
464: interactive use in the most ordinary way, to print or display an
465: announcement including an appropriate copyright notice and a
466: notice that there is no warranty (or else, saying that you provide
467: a warranty) and that users may redistribute the program under
468: these conditions, and telling the user how to view a copy of this
469: License. (Exception: if the Program itself is interactive but
470: does not normally print such an announcement, your work based on
471: the Program is not required to print an announcement.)
472: @end enumerate
473:
474: These requirements apply to the modified work as a whole. If
475: identifiable sections of that work are not derived from the Program,
476: and can be reasonably considered independent and separate works in
477: themselves, then this License, and its terms, do not apply to those
478: sections when you distribute them as separate works. But when you
479: distribute the same sections as part of a whole which is a work based
480: on the Program, the distribution of the whole must be on the terms of
481: this License, whose permissions for other licensees extend to the
482: entire whole, and thus to each and every part regardless of who wrote it.
483:
484: Thus, it is not the intent of this section to claim rights or contest
485: your rights to work written entirely by you; rather, the intent is to
486: exercise the right to control the distribution of derivative or
487: collective works based on the Program.
488:
489: In addition, mere aggregation of another work not based on the Program
490: with the Program (or with a work based on the Program) on a volume of
491: a storage or distribution medium does not bring the other work under
492: the scope of this License.
493:
494: @item
495: You may copy and distribute the Program (or a work based on it,
496: under Section 2) in object code or executable form under the terms of
497: Sections 1 and 2 above provided that you also do one of the following:
498:
499: @enumerate a
500: @item
501: Accompany it with the complete corresponding machine-readable
502: source code, which must be distributed under the terms of Sections
503: 1 and 2 above on a medium customarily used for software interchange; or,
504:
505: @item
506: Accompany it with a written offer, valid for at least three
507: years, to give any third party, for a charge no more than your
508: cost of physically performing source distribution, a complete
509: machine-readable copy of the corresponding source code, to be
510: distributed under the terms of Sections 1 and 2 above on a medium
511: customarily used for software interchange; or,
512:
513: @item
514: Accompany it with the information you received as to the offer
515: to distribute corresponding source code. (This alternative is
516: allowed only for noncommercial distribution and only if you
517: received the program in object code or executable form with such
518: an offer, in accord with Subsection b above.)
519: @end enumerate
520:
521: The source code for a work means the preferred form of the work for
522: making modifications to it. For an executable work, complete source
523: code means all the source code for all modules it contains, plus any
524: associated interface definition files, plus the scripts used to
525: control compilation and installation of the executable. However, as a
526: special exception, the source code distributed need not include
527: anything that is normally distributed (in either source or binary
528: form) with the major components (compiler, kernel, and so on) of the
529: operating system on which the executable runs, unless that component
530: itself accompanies the executable.
531:
532: If distribution of executable or object code is made by offering
533: access to copy from a designated place, then offering equivalent
534: access to copy the source code from the same place counts as
535: distribution of the source code, even though third parties are not
536: compelled to copy the source along with the object code.
537:
538: @item
539: You may not copy, modify, sublicense, or distribute the Program
540: except as expressly provided under this License. Any attempt
541: otherwise to copy, modify, sublicense or distribute the Program is
542: void, and will automatically terminate your rights under this License.
543: However, parties who have received copies, or rights, from you under
544: this License will not have their licenses terminated so long as such
545: parties remain in full compliance.
546:
547: @item
548: You are not required to accept this License, since you have not
549: signed it. However, nothing else grants you permission to modify or
550: distribute the Program or its derivative works. These actions are
551: prohibited by law if you do not accept this License. Therefore, by
552: modifying or distributing the Program (or any work based on the
553: Program), you indicate your acceptance of this License to do so, and
554: all its terms and conditions for copying, distributing or modifying
555: the Program or works based on it.
556:
557: @item
558: Each time you redistribute the Program (or any work based on the
559: Program), the recipient automatically receives a license from the
560: original licensor to copy, distribute or modify the Program subject to
561: these terms and conditions. You may not impose any further
562: restrictions on the recipients' exercise of the rights granted herein.
563: You are not responsible for enforcing compliance by third parties to
564: this License.
565:
566: @item
567: If, as a consequence of a court judgment or allegation of patent
568: infringement or for any other reason (not limited to patent issues),
569: conditions are imposed on you (whether by court order, agreement or
570: otherwise) that contradict the conditions of this License, they do not
571: excuse you from the conditions of this License. If you cannot
572: distribute so as to satisfy simultaneously your obligations under this
573: License and any other pertinent obligations, then as a consequence you
574: may not distribute the Program at all. For example, if a patent
575: license would not permit royalty-free redistribution of the Program by
576: all those who receive copies directly or indirectly through you, then
577: the only way you could satisfy both it and this License would be to
578: refrain entirely from distribution of the Program.
579:
580: If any portion of this section is held invalid or unenforceable under
581: any particular circumstance, the balance of the section is intended to
582: apply and the section as a whole is intended to apply in other
583: circumstances.
584:
585: It is not the purpose of this section to induce you to infringe any
586: patents or other property right claims or to contest validity of any
587: such claims; this section has the sole purpose of protecting the
588: integrity of the free software distribution system, which is
589: implemented by public license practices. Many people have made
590: generous contributions to the wide range of software distributed
591: through that system in reliance on consistent application of that
592: system; it is up to the author/donor to decide if he or she is willing
593: to distribute software through any other system and a licensee cannot
594: impose that choice.
595:
596: This section is intended to make thoroughly clear what is believed to
597: be a consequence of the rest of this License.
598:
599: @item
600: If the distribution and/or use of the Program is restricted in
601: certain countries either by patents or by copyrighted interfaces, the
602: original copyright holder who places the Program under this License
603: may add an explicit geographical distribution limitation excluding
604: those countries, so that distribution is permitted only in or among
605: countries not thus excluded. In such case, this License incorporates
606: the limitation as if written in the body of this License.
607:
608: @item
609: The Free Software Foundation may publish revised and/or new versions
610: of the General Public License from time to time. Such new versions will
611: be similar in spirit to the present version, but may differ in detail to
612: address new problems or concerns.
613:
614: Each version is given a distinguishing version number. If the Program
615: specifies a version number of this License which applies to it and ``any
616: later version'', you have the option of following the terms and conditions
617: either of that version or of any later version published by the Free
618: Software Foundation. If the Program does not specify a version number of
619: this License, you may choose any version ever published by the Free Software
620: Foundation.
621:
622: @item
623: If you wish to incorporate parts of the Program into other free
624: programs whose distribution conditions are different, write to the author
625: to ask for permission. For software which is copyrighted by the Free
626: Software Foundation, write to the Free Software Foundation; we sometimes
627: make exceptions for this. Our decision will be guided by the two goals
628: of preserving the free status of all derivatives of our free software and
629: of promoting the sharing and reuse of software generally.
630:
631: @iftex
632: @heading NO WARRANTY
633: @end iftex
634: @ifinfo
635: @center NO WARRANTY
636: @end ifinfo
637:
638: @item
639: BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
640: FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
641: OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
642: PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
643: OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
644: MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
645: TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
646: PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
647: REPAIR OR CORRECTION.
648:
649: @item
650: IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
651: WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
652: REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
653: INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
654: OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
655: TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
656: YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
657: PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
658: POSSIBILITY OF SUCH DAMAGES.
659: @end enumerate
660:
661: @iftex
662: @heading END OF TERMS AND CONDITIONS
663: @end iftex
664: @ifinfo
665: @center END OF TERMS AND CONDITIONS
666: @end ifinfo
667:
668: @page
669: @unnumberedsec How to Apply These Terms to Your New Programs
670:
671: If you develop a new program, and you want it to be of the greatest
672: possible use to the public, the best way to achieve this is to make it
673: free software which everyone can redistribute and change under these terms.
674:
675: To do so, attach the following notices to the program. It is safest
676: to attach them to the start of each source file to most effectively
677: convey the exclusion of warranty; and each file should have at least
678: the ``copyright'' line and a pointer to where the full notice is found.
679:
680: @smallexample
681: @var{one line to give the program's name and a brief idea of what it does.}
682: Copyright (C) 19@var{yy} @var{name of author}
683:
684: This program is free software; you can redistribute it and/or modify
685: it under the terms of the GNU General Public License as published by
686: the Free Software Foundation; either version 2 of the License, or
687: (at your option) any later version.
688:
689: This program is distributed in the hope that it will be useful,
690: but WITHOUT ANY WARRANTY; without even the implied warranty of
691: MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
692: GNU General Public License for more details.
693:
694: You should have received a copy of the GNU General Public License
695: along with this program; if not, write to the Free Software
696: Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
697: @end smallexample
698:
699: Also add information on how to contact you by electronic and paper mail.
700:
701: If the program is interactive, make it output a short notice like this
702: when it starts in an interactive mode:
703:
704: @smallexample
705: Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
706: Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
707: type `show w'.
708: This is free software, and you are welcome to redistribute it
709: under certain conditions; type `show c' for details.
710: @end smallexample
711:
712: The hypothetical commands @samp{show w} and @samp{show c} should show
713: the appropriate parts of the General Public License. Of course, the
714: commands you use may be called something other than @samp{show w} and
715: @samp{show c}; they could even be mouse-clicks or menu items---whatever
716: suits your program.
717:
718: You should also get your employer (if you work as a programmer) or your
719: school, if any, to sign a ``copyright disclaimer'' for the program, if
720: necessary. Here is a sample; alter the names:
721:
722: @smallexample
723: Yoyodyne, Inc., hereby disclaims all copyright interest in the program
724: `Gnomovision' (which makes passes at compilers) written by James Hacker.
725:
726: @var{signature of Ty Coon}, 1 April 1989
727: Ty Coon, President of Vice
728: @end smallexample
729:
730: This General Public License does not permit incorporating your program into
731: proprietary programs. If your program is a subroutine library, you may
732: consider it more useful to permit linking proprietary applications with the
733: library. If this is what you want to do, use the GNU Library General
734: Public License instead of this License.
735:
736: @iftex
737: @unnumbered Preface
738: @cindex Preface
739: This manual documents Gforth. The reader is expected to know
740: Forth. This manual is primarily a reference manual. @xref{Other Books},
741: for introductory material.
742: @end iftex
743:
744: @node Goals, Other Books, License, Top
745: @comment node-name, next, previous, up
746: @chapter Goals of Gforth
747: @cindex Goals
748: The goal of the Gforth Project is to develop a standard model for
749: ANS Forth. This can be split into several subgoals:
750:
751: @itemize @bullet
752: @item
753: Gforth should conform to the Forth standard (ANS Forth).
754: @item
755: It should be a model, i.e. it should define all the
756: implementation-dependent things.
757: @item
758: It should become standard, i.e. widely accepted and used. This goal
759: is the most difficult one.
760: @end itemize
761:
762: To achieve these goals Gforth should be
763: @itemize @bullet
764: @item
765: Similar to previous models (fig-Forth, F83)
766: @item
767: Powerful. It should provide for all the things that are considered
768: necessary today and even some that are not yet considered necessary.
769: @item
770: Efficient. It should not get the reputation of being exceptionally
771: slow.
772: @item
773: Free.
774: @item
775: Available on many machines/easy to port.
776: @end itemize
777:
778: Have we achieved these goals? Gforth conforms to the ANS Forth
779: standard. It may be considered a model, but we have not yet documented
780: which parts of the model are stable and which parts we are likely to
1.12 ! anton 781: change. It certainly has not yet become a de facto standard, but it
! 782: appears to be quite popular. It has some similarities to and some
! 783: differences from previous models. It has some powerful features, but not
! 784: yet everything that we envisioned. We certainly have achieved our
! 785: execution speed goals (@pxref{Performance}). It is free and available
! 786: on many machines.
1.1 anton 787:
788: @node Other Books, Invoking Gforth, Goals, Top
789: @chapter Other books on ANS Forth
790: @cindex books on Forth
791:
792: As the standard is relatively new, there are not many books out yet. It
793: is not recommended to learn Forth by using Gforth and a book that is
794: not written for ANS Forth, as you will not know your mistakes from the
795: deviations of the book.
796:
797: @cindex standard document for ANS Forth
798: @cindex ANS Forth document
799: There is, of course, the standard, the definitive reference if you want to
800: write ANS Forth programs. It is available in printed form from the
801: American National Standards Institute Sales Department (Tel.: USA (212) 642-4900;
802: Fax.: USA (212) 302-1286) as document @cite{X3.215-1994} for about $200. You
803: can also get it from Global Engineering Documents (Tel.: USA (800)
804: 854-7179; Fax.: (303) 843-9880) for about $300.
805:
1.12 ! anton 806: @cite{dpANS6}, the last draft of the standard, which was then submitted
! 807: to ANSI for publication, is available electronically and for free in some
! 808: MS Word format, and it has been converted to HTML (this is my favourite
! 809: format !!url). Some pointers to these versions can be found through
1.1 anton 810: @*@url{http://www.complang.tuwien.ac.at/projects/forth.html}.
811:
812: @cindex introductory book
813: @cindex book, introductory
814: @cindex Woehr, Jack: @cite{Forth: The New Model}
815: @cindex @cite{Forth: The new model} (book)
816: @cite{Forth: The New Model} by Jack Woehr (Prentice-Hall, 1993) is an
817: introductory book based on a draft version of the standard. It does not
818: cover the whole standard. It also contains interesting background
819: information (Jack Woehr was in the ANS Forth Technical Committee). It is
820: not appropriate for complete newbies, but programmers experienced in
821: other languages should find it ok.
822:
1.12 ! anton 823: !!Conklin, Forth programmer's handbook
! 824:
1.1 anton 825: @node Invoking Gforth, Words, Other Books, Top
826: @chapter Invoking Gforth
827: @cindex invoking Gforth
828: @cindex running Gforth
829: @cindex command-line options
830: @cindex options on the command line
831: @cindex flags on the command line
832:
833: You will usually just say @code{gforth}. In many other cases the default
834: Gforth image will be invoked like this:
835: @example
836: gforth [files] [-e forth-code]
837: @end example
1.12 ! anton 838: This interprets the contents of the files and the Forth code in the order they
1.1 anton 839: are given.
840:
841: In general, the command line looks like this:
842:
843: @example
844: gforth [initialization options] [image-specific options]
845: @end example
846:
847: The initialization options must come before the rest of the command
848: line. They are:
849:
850: @table @code
851: @cindex -i, command-line option
852: @cindex --image-file, command-line option
853: @item --image-file @var{file}
854: @itemx -i @var{file}
855: Loads the Forth image @var{file} instead of the default
856: @file{gforth.fi} (@pxref{Image Files}).
857:
858: @cindex --path, command-line option
859: @cindex -p, command-line option
860: @item --path @var{path}
861: @itemx -p @var{path}
862: Uses @var{path} for searching the image file and Forth source code files
863: instead of the default in the environment variable @code{GFORTHPATH} or
864: the path specified at installation time (e.g.,
865: @file{/usr/local/share/gforth/0.2.0:.}). A path is given as a list of
866: directories, separated by @samp{:} (on Unix) or @samp{;} (on other OSs).
867:
868: @cindex --dictionary-size, command-line option
869: @cindex -m, command-line option
870: @cindex @var{size} parameters for command-line options
871: @cindex size of the dictionary and the stacks
872: @item --dictionary-size @var{size}
873: @itemx -m @var{size}
874: Allocate @var{size} space for the Forth dictionary instead of
875: using the default specified in the image (typically 256K). The
876: @var{size} specification consists of an integer and a unit (e.g.,
877: @code{4M}). The unit can be one of @code{b} (bytes), @code{e} (element
1.12 ! anton 878: size, in this case Cells), @code{k} (kilobytes), @code{M} (Megabytes),
! 879: @code{G} (Gigabytes), and @code{T} (Terabytes). If no unit is specified,
! 880: @code{e} is used.
1.1 anton 881:
882: @cindex --data-stack-size, command-line option
883: @cindex -d, command-line option
884: @item --data-stack-size @var{size}
885: @itemx -d @var{size}
886: Allocate @var{size} space for the data stack instead of using the
887: default specified in the image (typically 16K).
888:
889: @cindex --return-stack-size, command-line option
890: @cindex -r, command-line option
891: @item --return-stack-size @var{size}
892: @itemx -r @var{size}
893: Allocate @var{size} space for the return stack instead of using the
894: default specified in the image (typically 15K).
895:
896: @cindex --fp-stack-size, command-line option
897: @cindex -f, command-line option
898: @item --fp-stack-size @var{size}
899: @itemx -f @var{size}
900: Allocate @var{size} space for the floating point stack instead of
901: using the default specified in the image (typically 15.5K). In this case
902: the unit specifier @code{e} refers to floating point numbers.
903:
904: @cindex --locals-stack-size, command-line option
905: @cindex -l, command-line option
906: @item --locals-stack-size @var{size}
907: @itemx -l @var{size}
908: Allocate @var{size} space for the locals stack instead of using the
909: default specified in the image (typically 14.5K).
910:
911: @cindex -h, command-line option
912: @cindex --help, command-line option
913: @item --help
914: @itemx -h
915: Print a message about the command-line options.
916:
917: @cindex -v, command-line option
918: @cindex --version, command-line option
919: @item --version
920: @itemx -v
921: Print version and exit.
922:
923: @cindex --debug, command-line option
924: @item --debug
925: Print some information useful for debugging on startup.
926:
927: @cindex --offset-image, command-line option
928: @item --offset-image
929: Start the dictionary at a slightly different position than would be used
930: otherwise (useful for creating data-relocatable images,
931: @pxref{Data-Relocatable Image Files}).
932:
1.5 anton 933: @cindex --no-offset-im, command-line option
934: @item --no-offset-im
935: Start the dictionary at the normal position.
936:
1.1 anton 937: @cindex --clear-dictionary, command-line option
938: @item --clear-dictionary
939: Initialize all bytes in the dictionary to 0 before loading the image
940: (@pxref{Data-Relocatable Image Files}).
1.5 anton 941:
942: @cindex --die-on-signal, command-line option
943: @item --die-on-signal
944: Normally Gforth handles most signals (e.g., the user interrupt SIGINT,
945: or the segmentation violation SIGSEGV) by translating it into a Forth
946: @code{THROW}. With this option, Gforth exits if it receives such a
947: signal. This option is useful when the engine and/or the image might be
948: severely broken (such that it causes another signal before recovering
949: from the first); this option avoids endless loops in such cases.
1.1 anton 950: @end table
951:
952: @cindex loading files at startup
953: @cindex executing code on startup
954: @cindex batch processing with Gforth
955: As explained above, the image-specific command-line arguments for the
956: default image @file{gforth.fi} consist of a sequence of filenames and
957: @code{-e @var{forth-code}} options that are interpreted in the sequence
958: in which they are given. The @code{-e @var{forth-code}} or
959: @code{--evaluate @var{forth-code}} option evaluates the Forth
960: code. This option takes only one argument; if you want to evaluate more
961: Forth words, you have to quote them or use several @code{-e}s. To exit
962: after processing the command line (instead of entering interactive mode)
963: append @code{-e bye} to the command line.
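
As a hedged illustration of the above, the following command lines
assume a Gforth binary on your search path and a hypothetical source
file @file{hello.fs}; the quoting shown is that of a typical Unix
shell. The first evaluates a short piece of Forth code and exits; the
second requests a 4M dictionary (an initialization option, so it comes
first), loads the file and exits:

@example
gforth -e "2 3 + . cr" -e bye
gforth -m 4M hello.fs -e bye
@end example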
964:
965: @cindex versions, invoking other versions of Gforth
966: If you have several versions of Gforth installed, @code{gforth} will
967: invoke the version that was installed last. @code{gforth-@var{version}}
968: invokes a specific version. You may want to use the option
969: @code{--path}, if your environment contains the variable
970: @code{GFORTHPATH}.
971:
972: Not yet implemented:
973: On startup the system first executes the system initialization file
974: (unless the option @code{--no-init-file} is given; note that the system
975: resulting from using this option may not be ANS Forth conformant). Then
976: the user initialization file @file{.gforth.fs} is executed, unless the
977: option @code{--no-rc} is given; this file is first searched in @file{.},
978: then in @file{~}, then in the normal path (see above).
979:
980: @node Words, Tools, Invoking Gforth, Top
981: @chapter Forth Words
982: @cindex Words
983:
984: @menu
985: * Notation::
986: * Arithmetic::
987: * Stack Manipulation::
1.5 anton 988: * Memory::
1.1 anton 989: * Control Structures::
990: * Locals::
991: * Defining Words::
1.5 anton 992: * Structures::
1.12 ! anton 993: * Object-oriented Forth::
! 994: * Tokens for Words::
! 995: * Wordlists::
! 996: * Files::
! 997: * Including Files::
! 998: * Blocks::
! 999: * Other I/O::
! 1000: * Programming Tools::
! 1001: * Assembler and Code Words::
! 1002: * Threading Words::
1.1 anton 1003: @end menu
1004:
1005: @node Notation, Arithmetic, Words, Words
1006: @section Notation
1007: @cindex notation of glossary entries
1008: @cindex format of glossary entries
1009: @cindex glossary notation format
1010: @cindex word glossary entry format
1011:
1012: The Forth words are described in this section in the glossary notation
1013: that has become a de-facto standard for Forth texts, i.e.,
1014:
1015: @format
1016: @var{word} @var{Stack effect} @var{wordset} @var{pronunciation}
1017: @end format
1018: @var{Description}
1019:
1020: @table @var
1021: @item word
1022: @cindex case insensitivity
1023: The name of the word. BTW, Gforth is case insensitive, so you can
1024: type words in lower case (however, @pxref{core-idef}).
1025:
1026: @item Stack effect
1027: @cindex stack effect
1028: The stack effect is written in the notation @code{@var{before} --
1029: @var{after}}, where @var{before} and @var{after} describe the top of
1030: stack entries before and after the execution of the word. The rest of
1031: the stack is not touched by the word. The top of stack is rightmost,
1032: i.e., a stack sequence is written as it is typed in. Note that Gforth
1033: uses a separate floating point stack, but a unified stack
1034: notation. Also, return stack effects are not shown in @var{stack
1035: effect}, but in @var{Description}. The name of a stack item describes
1036: the type and/or the function of the item. See below for a discussion of
1037: the types.
1038:
1039: All words have two stack effects: A compile-time stack effect and a
1040: run-time stack effect. The compile-time stack-effect of most words is
1041: @var{ -- }. If the compile-time stack-effect of a word deviates from
1042: this standard behaviour, or the word does other unusual things at
1043: compile time, both stack effects are shown; otherwise only the run-time
1044: stack effect is shown.
1045:
1046: @cindex pronunciation of words
1047: @item pronunciation
1048: How the word is pronounced.
1049:
1050: @cindex wordset
1051: @item wordset
1052: The ANS Forth standard is divided into several wordsets. A standard
1053: system need not support all of them. So, the fewer wordsets your program
1054: uses the more portable it will be in theory. However, we suspect that
1055: most ANS Forth systems on personal machines will feature all
1056: wordsets. Words that are not defined in the ANS standard have
1057: @code{gforth} or @code{gforth-internal} as wordset. @code{gforth}
1058: describes words that will work in future releases of Gforth;
1059: @code{gforth-internal} words are more volatile. Environmental query
1060: strings are also displayed like words; you can recognize them by the
1061: @code{environment} in the wordset field.
1062:
1063: @item Description
1064: A description of the behaviour of the word.
1065: @end table
1066:
1067: @cindex types of stack items
1068: @cindex stack item types
1069: The type of a stack item is specified by the character(s) the name
1070: starts with:
1071:
1072: @table @code
1073: @item f
1074: @cindex @code{f}, stack item type
1075: Boolean flags, i.e. @code{false} or @code{true}.
1076: @item c
1077: @cindex @code{c}, stack item type
1078: Char
1079: @item w
1080: @cindex @code{w}, stack item type
1081: Cell, can contain an integer or an address
1082: @item n
1083: @cindex @code{n}, stack item type
1084: signed integer
1085: @item u
1086: @cindex @code{u}, stack item type
1087: unsigned integer
1088: @item d
1089: @cindex @code{d}, stack item type
1090: double sized signed integer
1091: @item ud
1092: @cindex @code{ud}, stack item type
1093: double sized unsigned integer
1094: @item r
1095: @cindex @code{r}, stack item type
1096: Float (on the FP stack)
1097: @item a_
1098: @cindex @code{a_}, stack item type
1099: Cell-aligned address
1100: @item c_
1101: @cindex @code{c_}, stack item type
1102: Char-aligned address (note that a Char may have two bytes in Windows NT)
1103: @item f_
1104: @cindex @code{f_}, stack item type
1105: Float-aligned address
1106: @item df_
1107: @cindex @code{df_}, stack item type
1108: Address aligned for IEEE double precision float
1109: @item sf_
1110: @cindex @code{sf_}, stack item type
1111: Address aligned for IEEE single precision float
1112: @item xt
1113: @cindex @code{xt}, stack item type
1114: Execution token, same size as Cell
1115: @item wid
1116: @cindex @code{wid}, stack item type
1117: Wordlist ID, same size as Cell
1118: @item f83name
1119: @cindex @code{f83name}, stack item type
1120: Pointer to a name structure
1121: @item "
1122: @cindex @code{"}, stack item type
1.12 ! anton 1123: string in the input stream (not on the stack). The terminating character
! 1124: is a blank by default. If it is not a blank, it is shown in @code{<>}
1.1 anton 1125: quotes.
1126: @end table
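
To illustrate the notation, here is a small sketch of stack comments
using these type prefixes. The word names @code{square},
@code{fsquare} and @code{first-char} are made up for this example and
are not part of Gforth:

@example
: square     ( n1 -- n2 )       dup * ;
: fsquare    ( r1 -- r2 )       fdup f* ;
: first-char ( c_addr u -- c )  drop c@@ ;
@end example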
1127:
1128: @node Arithmetic, Stack Manipulation, Notation, Words
1129: @section Arithmetic
1130: @cindex arithmetic words
1131:
1132: @cindex division with potentially negative operands
1133: Forth arithmetic is not checked, i.e., you will not hear about integer
1134: overflow on addition or multiplication; you may hear about division by
1135: zero if you are lucky. The operator is written after the operands, but
1136: the operands are still in the original order. I.e., the infix @code{2-1}
1137: corresponds to @code{2 1 -}. Forth offers a variety of division
1138: operators. If you perform division with potentially negative operands,
1139: you do not want to use @code{/} or @code{/mod} with its undefined
1140: behaviour, but rather @code{fm/mod} or @code{sm/rem} (probably the
1141: former, @pxref{Mixed precision}).
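
The following sketch shows the postfix order and the difference
between floored and symmetric division, assuming a decimal
@code{BASE}. Both division words take a double-cell dividend, hence
the @code{s>d}, and leave the quotient on top of the remainder:

@example
2 1 - .              \ prints 1
-7 s>d 2 fm/mod . .  \ floored: prints -4 1 (quotient, then remainder)
-7 s>d 2 sm/rem . .  \ symmetric: prints -3 -1
@end example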
1142:
1143: @menu
1144: * Single precision::
1145: * Bitwise operations::
1146: * Mixed precision:: operations with single and double-cell integers
1147: * Double precision:: Double-cell integer arithmetic
1148: * Floating Point::
1149: @end menu
1150:
1151: @node Single precision, Bitwise operations, Arithmetic, Arithmetic
1152: @subsection Single precision
1153: @cindex single precision arithmetic words
1154:
1155: doc-+
1156: doc--
1157: doc-*
1158: doc-/
1159: doc-mod
1160: doc-/mod
1161: doc-negate
1162: doc-abs
1163: doc-min
1164: doc-max
1165:
1166: @node Bitwise operations, Mixed precision, Single precision, Arithmetic
1167: @subsection Bitwise operations
1168: @cindex bitwise operation words
1169:
1170: doc-and
1171: doc-or
1172: doc-xor
1173: doc-invert
1174: doc-2*
1175: doc-2/
1176:
1177: @node Mixed precision, Double precision, Bitwise operations, Arithmetic
1178: @subsection Mixed precision
1179: @cindex mixed precision arithmetic words
1180:
1181: doc-m+
1182: doc-*/
1183: doc-*/mod
1184: doc-m*
1185: doc-um*
1186: doc-m*/
1187: doc-um/mod
1188: doc-fm/mod
1189: doc-sm/rem
1190:
1191: @node Double precision, Floating Point, Mixed precision, Arithmetic
1192: @subsection Double precision
1193: @cindex double precision arithmetic words
1194:
1195: @cindex double-cell numbers, input format
1196: @cindex input format for double-cell numbers
1197: The outer (aka text) interpreter converts numbers containing a dot into
1198: a double precision number. Note that only numbers with the dot as last
1199: character are standard-conforming.
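
For instance, assuming a decimal @code{BASE}, the following lines
enter double-cell numbers with a trailing dot:

@example
12345. d.     \ prints 12345
1. 2. d+ d.   \ prints 3
@end example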
1200:
1201: doc-d+
1202: doc-d-
1203: doc-dnegate
1204: doc-dabs
1205: doc-dmin
1206: doc-dmax
1207:
1208: @node Floating Point, , Double precision, Arithmetic
1209: @subsection Floating Point
1210: @cindex floating point arithmetic words
1211:
1212: @cindex floating-point numbers, input format
1213: @cindex input format for floating-point numbers
1214: The format of floating point numbers recognized by the outer (aka text)
1215: interpreter is: a signed decimal number, possibly containing a decimal
1216: point (@code{.}), followed by @code{E} or @code{e}, optionally followed
1217: by a signed integer (the exponent). E.g., @code{1e} is the same as
1.12 ! anton 1218: @code{+1.0e+0}. Note that a number without @code{e} is not interpreted
! 1219: as floating-point number, but as double (if the number contains a
! 1220: @code{.}) or single precision integer. Also, conversions between string
! 1221: and floating point numbers always use base 10, irrespective of the value
! 1222: of @code{BASE} (in Gforth; for the standard this is an ambiguous
! 1223: condition). If @code{BASE} contains a value greater than 14, the
! 1224: @code{E} may be interpreted as digit and the number will be interpreted
! 1225: as integer, unless it has a signed exponent (both @code{+} and @code{-}
! 1226: are allowed as signs).
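
As a sketch of these input rules, assuming a decimal @code{BASE}, the
text interpreter treats the following numbers differently:

@example
1e 2e f+ f.   \ two floats; prints their sum
1.5e2 f.      \ a float with an explicit exponent, i.e. 150
15 .          \ no e: a single-cell integer
15. d.        \ a dot but no e: a double-cell integer
@end example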
1.1 anton 1227:
1228: @cindex angles in trigonometric operations
1229: @cindex trigonometric operations
1230: Angles in floating point operations are given in radians (a full circle
1231: has 2 pi radians). Note that Gforth has a separate floating point
1232: stack, but we use the unified notation.
1233:
1234: @cindex floating-point arithmetic, pitfalls
1235: Floating point numbers have a number of unpleasant surprises for the
1236: unwary (e.g., floating point addition is not associative) and even a few
1237: for the wary. You should not use them unless you know what you are doing
1238: or you don't care that the results you get are totally bogus. If you
1239: want to learn about the problems of floating point numbers (and how to
1240: avoid them), you might start with @cite{David Goldberg, What Every
1241: Computer Scientist Should Know About Floating-Point Arithmetic, ACM
1242: Computing Surveys 23(1):5@minus{}48, March 1991}.
1243:
1244: doc-f+
1245: doc-f-
1246: doc-f*
1247: doc-f/
1248: doc-fnegate
1249: doc-fabs
1250: doc-fmax
1251: doc-fmin
1252: doc-floor
1253: doc-fround
1254: doc-f**
1255: doc-fsqrt
1256: doc-fexp
1257: doc-fexpm1
1258: doc-fln
1259: doc-flnp1
1260: doc-flog
1261: doc-falog
1262: doc-fsin
1263: doc-fcos
1264: doc-fsincos
1265: doc-ftan
1266: doc-fasin
1267: doc-facos
1268: doc-fatan
1269: doc-fatan2
1270: doc-fsinh
1271: doc-fcosh
1272: doc-ftanh
1273: doc-fasinh
1274: doc-facosh
1275: doc-fatanh
1276:
1277: @node Stack Manipulation, Memory, Arithmetic, Words
1278: @section Stack Manipulation
1279: @cindex stack manipulation words
1280:
1281: @cindex floating-point stack in the standard
1282: Gforth has a data stack (aka parameter stack) for characters, cells,
1283: addresses, and double cells, a floating point stack for floating point
1284: numbers, a return stack for storing the return addresses of colon
1285: definitions and other data, and a locals stack for storing local
1286: variables. Note that while every sane Forth has a separate floating
1287: point stack, this is not strictly required; an ANS Forth system could
1288: theoretically keep floating point numbers on the data stack. As an
1289: additional difficulty, you don't know how many cells a floating point
1290: number takes. It is reportedly possible to write words in a way that
1291: they work also for a unified stack model, but we do not recommend trying
1292: it. Instead, just say that your program has an environmental dependency
1293: on a separate FP stack.
1294:
1295: @cindex return stack and locals
1296: @cindex locals and return stack
1297: Also, a Forth system is allowed to keep the local variables on the
1298: return stack. This is reasonable, as local variables usually eliminate
1299: the need to use the return stack explicitly. So, if you want to produce
1300: a standard-compliant program and if you are using local variables in a
1301: word, forget about return stack manipulations in that word (see the
1302: standard document for the exact rules).
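
By way of contrast, a word that uses no locals can use the return
stack as temporary storage, as long as it removes what it put there
before it returns. A minimal sketch, with the made-up word name
@code{third}:

@example
: third ( w1 w2 w3 -- w1 w2 w3 w1 )  >r over r> swap ;
1 2 3 third . . . .   \ prints 1 3 2 1
@end example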
1303:
1304: @menu
1305: * Data stack::
1306: * Floating point stack::
1307: * Return stack::
1308: * Locals stack::
1309: * Stack pointer manipulation::
1310: @end menu
1311:
1312: @node Data stack, Floating point stack, Stack Manipulation, Stack Manipulation
1313: @subsection Data stack
1314: @cindex data stack manipulation words
1315: @cindex stack manipulations words, data stack
1316:
1317: doc-drop
1318: doc-nip
1319: doc-dup
1320: doc-over
1321: doc-tuck
1322: doc-swap
1323: doc-rot
1324: doc--rot
1325: doc-?dup
1326: doc-pick
1327: doc-roll
1328: doc-2drop
1329: doc-2nip
1330: doc-2dup
1331: doc-2over
1332: doc-2tuck
1333: doc-2swap
1334: doc-2rot
1335:
1336: @node Floating point stack, Return stack, Data stack, Stack Manipulation
1337: @subsection Floating point stack
1338: @cindex floating-point stack manipulation words
1339: @cindex stack manipulation words, floating-point stack
1340:
1341: doc-fdrop
1342: doc-fnip
1343: doc-fdup
1344: doc-fover
1345: doc-ftuck
1346: doc-fswap
1347: doc-frot
1348:
1349: @node Return stack, Locals stack, Floating point stack, Stack Manipulation
1350: @subsection Return stack
1351: @cindex return stack manipulation words
1352: @cindex stack manipulation words, return stack
1353:
1354: doc->r
1355: doc-r>
1356: doc-r@
1357: doc-rdrop
1358: doc-2>r
1359: doc-2r>
1360: doc-2r@
1361: doc-2rdrop
1362:
1363: @node Locals stack, Stack pointer manipulation, Return stack, Stack Manipulation
1364: @subsection Locals stack
1365:
1366: @node Stack pointer manipulation, , Locals stack, Stack Manipulation
1367: @subsection Stack pointer manipulation
1368: @cindex stack pointer manipulation words
1369:
1370: doc-sp@
1371: doc-sp!
1372: doc-fp@
1373: doc-fp!
1374: doc-rp@
1375: doc-rp!
1376: doc-lp@
1377: doc-lp!
1378:
1379: @node Memory, Control Structures, Stack Manipulation, Words
1380: @section Memory
1381: @cindex Memory words
1382:
1383: @menu
1384: * Memory Access::
1385: * Address arithmetic::
1386: * Memory Blocks::
1387: @end menu
1388:
1389: @node Memory Access, Address arithmetic, Memory, Memory
1390: @subsection Memory Access
1391: @cindex memory access words
1392:
1393: doc-@
1394: doc-!
1395: doc-+!
1396: doc-c@
1397: doc-c!
1398: doc-2@
1399: doc-2!
1400: doc-f@
1401: doc-f!
1402: doc-sf@
1403: doc-sf!
1404: doc-df@
1405: doc-df!
1406:
1407: @node Address arithmetic, Memory Blocks, Memory Access, Memory
1408: @subsection Address arithmetic
1409: @cindex address arithmetic words
1410:
1411: ANS Forth does not specify the sizes of the data types. Instead, it
1412: offers a number of words for computing sizes and doing address
1413: arithmetic. Basically, address arithmetic is performed in terms of
1414: address units (aus); on most systems the address unit is one byte. Note
1415: that a character may have more than one au, so @code{chars} is not always a noop
1416: (on systems where it is a noop, it compiles to nothing).
1417:
1418: @cindex alignment of addresses for types
1419: ANS Forth also defines words for aligning addresses for specific
1420: types. Many computers require that accesses to specific data types
1421: must only occur at specific addresses; e.g., that cells may only be
1422: accessed at addresses divisible by 4. Even if a machine allows unaligned
1423: accesses, it can usually perform aligned accesses faster.
1424:
1425: For the performance-conscious: alignment operations are usually only
1426: necessary during the definition of a data structure, not during the
1427: (more frequent) accesses to it.
1428:
1429: ANS Forth defines no words for character-aligning addresses. This is not
1430: an oversight, but reflects the fact that addresses that are not
1431: char-aligned have no use in the standard and therefore will not be
1432: created.
1433:
1434: @cindex @code{CREATE} and alignment
1435: The standard guarantees that addresses returned by @code{CREATE}d words
1436: are cell-aligned; in addition, Gforth guarantees that these addresses
1437: are aligned for all purposes.
1438:
1439: Note that the standard defines a word @code{char}, which has nothing to
1440: do with address arithmetic.
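
A minimal sketch of portable address arithmetic follows; the names
@code{my-array} and @code{my-array-addr} are made up for this
example. The index is scaled with @code{cells} instead of being
multiplied by an assumed cell size:

@example
create my-array  10 cells allot
: my-array-addr ( u -- a_addr )  cells my-array + ;
42 3 my-array-addr !
3 my-array-addr @@ .   \ prints 42
@end example
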
1441:
1442: doc-chars
1443: doc-char+
1444: doc-cells
1445: doc-cell+
1446: doc-cell
1447: doc-align
1448: doc-aligned
1449: doc-floats
1450: doc-float+
1451: doc-float
1452: doc-falign
1453: doc-faligned
1454: doc-sfloats
1455: doc-sfloat+
1456: doc-sfalign
1457: doc-sfaligned
1458: doc-dfloats
1459: doc-dfloat+
1460: doc-dfalign
1461: doc-dfaligned
1462: doc-maxalign
1463: doc-maxaligned
1464: doc-cfalign
1465: doc-cfaligned
1466: doc-address-unit-bits
1467:
1468: @node Memory Blocks, , Address arithmetic, Memory
1469: @subsection Memory Blocks
1470: @cindex memory block words
1471:
1472: doc-move
1473: doc-erase
1474:
1475: While the previous words work on address units, the rest works on
1476: characters.
1477:
1478: doc-cmove
1479: doc-cmove>
1480: doc-fill
1481: doc-blank
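
The following sketch uses some of these words on two made-up buffers,
@code{src} and @code{dst}. Note that the count passed to @code{move}
is given in address units and therefore scaled with @code{chars}:

@example
create src  8 chars allot
create dst  8 chars allot
src 8 char x fill       \ store eight x characters
src dst 8 chars move    \ copy them; move counts address units
dst c@@ emit             \ prints x
dst 8 blank             \ overwrite the copy with spaces
@end example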
1482:
1483: @node Control Structures, Locals, Memory, Words
1484: @section Control Structures
1485: @cindex control structures
1486:
1487: Control structures in Forth cannot be used in interpret state, only in
1488: compile state@footnote{More precisely, they have no interpretation
1489: semantics (@pxref{Interpretation and Compilation Semantics})}, i.e., in
1490: a colon definition. We do not like this limitation, but have not seen a
1491: satisfying way around it yet, although many schemes have been proposed.
1492:
1493: @menu
1494: * Selection::
1495: * Simple Loops::
1496: * Counted Loops::
1497: * Arbitrary control structures::
1498: * Calls and returns::
1499: * Exception Handling::
1500: @end menu
1501:
1502: @node Selection, Simple Loops, Control Structures, Control Structures
1503: @subsection Selection
1504: @cindex selection control structures
1505: @cindex control structures for selection
1506:
1507: @cindex @code{IF} control structure
1508: @example
1509: @var{flag}
1510: IF
1511: @var{code}
1512: ENDIF
1513: @end example
1514: or
1515: @example
1516: @var{flag}
1517: IF
1518: @var{code1}
1519: ELSE
1520: @var{code2}
1521: ENDIF
1522: @end example
1523:
1524: You can use @code{THEN} instead of @code{ENDIF}. Indeed, @code{THEN} is
1525: standard, and @code{ENDIF} is not, although it is quite popular. We
1526: recommend using @code{ENDIF}, because it is less confusing for people
1527: who also know other languages (and is not prone to reinforcing negative
1528: prejudices against Forth in these people). Adding @code{ENDIF} to a
1529: system that only supplies @code{THEN} is simple:
1530: @example
1531: : endif POSTPONE then ; immediate
1532: @end example
1533:
1534: [According to @cite{Webster's New Encyclopedic Dictionary}, @dfn{then
1535: (adv.)} has the following meanings:
1536: @quotation
1537: ... 2b: following next after in order ... 3d: as a necessary consequence
1538: (if you were there, then you saw them).
1539: @end quotation
1540: Forth's @code{THEN} has the meaning 2b, whereas @code{THEN} in Pascal
1541: and many other programming languages has the meaning 3d.]
1542:
1543: Gforth also provides the words @code{?dup-if} and @code{?dup-0=-if}, so
1544: you can avoid using @code{?dup}. Using these alternatives is also more
1545: efficient than using @code{?dup}. Definitions in plain standard Forth
1546: for @code{ENDIF}, @code{?DUP-IF} and @code{?DUP-0=-IF} are provided in
1547: @file{compat/control.fs}.
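
As a concrete sketch of selection, the following definitions use the
made-up word names @code{sign} and @code{.nonzero}; on a system
without @code{ENDIF} and @code{?DUP-IF} they additionally assume the
definitions from @file{compat/control.fs}:

@example
: sign ( n -- -1|0|1 )
  dup 0< IF
    drop -1
  ELSE
    0> IF 1 ELSE 0 ENDIF
  ENDIF ;
: .nonzero ( n -- )  ?dup-if . endif ;
-5 sign .     \ prints -1
0 .nonzero    \ prints nothing
@end example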
1548:
1549: @cindex @code{CASE} control structure
1550: @example
1551: @var{n}
1552: CASE
1553: @var{n1} OF @var{code1} ENDOF
1554: @var{n2} OF @var{code2} ENDOF
1555: @dots{}
1556: ENDCASE
1557: @end example
1558:
1559: Executes the first @var{codei}, where the @var{ni} is equal to
1560: @var{n}. A default case can be added by simply writing the code after
1561: the last @code{ENDOF}. It may use @var{n}, which is on top of the stack,
1562: but must not consume it.
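
For example, the following sketch (with the made-up word name
@code{describe}) has two cases and a default case; the default case
uses @code{dup} so that @var{n} is still on the stack when
@code{ENDCASE} drops it:

@example
: describe ( n -- )
  CASE
    0 OF ." zero" ENDOF
    1 OF ." one" ENDOF
    ." other: " dup .
  ENDCASE ;
1 describe    \ prints one
7 describe    \ prints other: 7
@end example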
1563:
1564: @node Simple Loops, Counted Loops, Selection, Control Structures
1565: @subsection Simple Loops
1566: @cindex simple loops
1567: @cindex loops without count
1568:
1569: @cindex @code{WHILE} loop
1570: @example
1571: BEGIN
1572: @var{code1}
1573: @var{flag}
1574: WHILE
1575: @var{code2}
1576: REPEAT
1577: @end example
1578:
1579: @var{code1} is executed and @var{flag} is computed. If it is true,
1580: @var{code2} is executed and the loop is restarted; if @var{flag} is
1581: false, execution continues after the @code{REPEAT}.
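
A small sketch of such a loop, with the made-up word name
@code{countdown}; the test @code{dup 0>} plays the role of
@var{code1} and @var{flag}:

@example
: countdown ( n -- )
  BEGIN dup 0> WHILE dup . 1- REPEAT drop ;
3 countdown   \ prints 3 2 1
@end example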
1582:
1583: @cindex @code{UNTIL} loop
1584: @example
1585: BEGIN
1586: @var{code}
1587: @var{flag}
1588: UNTIL
1589: @end example
1590:
1591: @var{code} is executed. The loop is restarted if @var{flag} is false.
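
For instance, the following sketch (with the made-up word name
@code{count-up}) always executes its body at least once and stops as
soon as the counter reaches 3:

@example
: count-up ( -- )
  0 BEGIN dup . 1+ dup 3 = UNTIL drop ;
count-up   \ prints 0 1 2
@end example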
1592:
1593: @cindex endless loop
1594: @cindex loops, endless
1595: @example
1596: BEGIN
1597: @var{code}
1598: AGAIN
1599: @end example
1600:
1601: This is an endless loop.
1602:
1603: @node Counted Loops, Arbitrary control structures, Simple Loops, Control Structures
1604: @subsection Counted Loops
1605: @cindex counted loops
1606: @cindex loops, counted
1607: @cindex @code{DO} loops
1608:
1609: The basic counted loop is:
1610: @example
1611: @var{limit} @var{start}
1612: ?DO
1613: @var{body}
1614: LOOP
1615: @end example
1616:
1617: This performs one iteration for every integer, starting from @var{start}
1618: and up to, but excluding @var{limit}. The counter, aka index, can be
1619: accessed with @code{i}. E.g., the loop
1620: @example
1621: 10 0 ?DO
1622: i .
1623: LOOP
1624: @end example
1625: prints
1626: @example
1627: 0 1 2 3 4 5 6 7 8 9
1628: @end example
1629: The index of the innermost loop can be accessed with @code{i}, the index
1630: of the next loop with @code{j}, and the index of the third loop with
1631: @code{k}.
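
As a sketch of nested loops, the made-up word @code{grid} below
combines the outer index @code{j} and the inner index @code{i}; it
prints three lines containing 0 1 2, 10 11 12 and 20 21 22:

@example
: grid ( -- )
  3 0 ?DO
    3 0 ?DO j 10 * i + . LOOP cr
  LOOP ;
grid
@end example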
1632:
1633: doc-i
1634: doc-j
1635: doc-k
1636:
1637: The loop control data are kept on the return stack, so there are some
1638: restrictions on mixing return stack accesses and counted loop
1639: words. E.g., if you put values on the return stack outside the loop, you
1640: cannot read them inside the loop. If you put values on the return stack
1641: within a loop, you have to remove them before the end of the loop and
1642: before accessing the index of the loop.
1643:
1644: There are several variations on the counted loop:
1645:
1646: @code{LEAVE} leaves the innermost counted loop immediately.
1647:
1648: If @var{start} is greater than @var{limit}, a @code{?DO} loop is entered
1649: (and @code{LOOP} iterates until they become equal by wrap-around
1650: arithmetic). This behaviour is usually not what you want. Therefore,
1651: Gforth offers @code{+DO} and @code{U+DO} (as replacements for
1652: @code{?DO}), which do not enter the loop if @var{start} is greater than
1653: @var{limit}; @code{+DO} is for signed loop parameters, @code{U+DO} for
1654: unsigned loop parameters.
1655:
1656: @code{LOOP} can be replaced with @code{@var{n} +LOOP}; this updates the
1657: index by @var{n} instead of by 1. The loop is terminated when the border
1658: between @var{limit-1} and @var{limit} is crossed. E.g.:
1659:
1660: @code{4 0 +DO i . 2 +LOOP} prints @code{0 2}
1661:
1662: @code{4 1 +DO i . 2 +LOOP} prints @code{1 3}
1663:
1664: @cindex negative increment for counted loops
1665: @cindex counted loops with negative increment
1666: The behaviour of @code{@var{n} +LOOP} is peculiar when @var{n} is negative:
1667:
1668: @code{-1 0 ?DO i . -1 +LOOP} prints @code{0 -1}
1669:
1670: @code{ 0 0 ?DO i . -1 +LOOP} prints nothing
1671:
1672: Therefore we recommend avoiding @code{@var{n} +LOOP} with negative
1673: @var{n}. One alternative is @code{@var{u} -LOOP}, which reduces the
1674: index by @var{u} each iteration. The loop is terminated when the border
1675: between @var{limit+1} and @var{limit} is crossed. Gforth also provides
1676: @code{-DO} and @code{U-DO} for down-counting loops. E.g.:
1677:
1678: @code{-2 0 -DO i . 1 -LOOP} prints @code{0 -1}
1679:
1680: @code{-1 0 -DO i . 1 -LOOP} prints @code{0}
1681:
1682: @code{ 0 0 -DO i . 1 -LOOP} prints nothing
1683:
1684: Unfortunately, @code{+DO}, @code{U+DO}, @code{-DO}, @code{U-DO} and
1685: @code{-LOOP} are not in the ANS Forth standard. However, an
1686: implementation for these words that uses only standard words is provided
1687: in @file{compat/loops.fs}.
1688:
1689: @code{?DO} can also be replaced by @code{DO}. @code{DO} always enters
1690: the loop, independent of the loop parameters. Do not use @code{DO}, even
1691: if you know that the loop is entered in any case. Such knowledge tends
1692: to become invalid during maintenance of a program, and then the
1693: @code{DO} will make trouble.
1694:
1695: @code{UNLOOP} is used to prepare for an abnormal loop exit, e.g., via
1696: @code{EXIT}. @code{UNLOOP} removes the loop control parameters from the
1697: return stack so @code{EXIT} can get to its return address.
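
A minimal sketch of such an early exit follows; the word
@code{first-square-above} is hypothetical. It searches the indices below
@var{limit} for the first one whose square exceeds @var{n} and returns
-1 if there is none.

@example
: first-square-above ( n limit -- i | -1 )
  0 ?DO
    i i * over > IF   \ is i squared greater than n?
      drop i          \ replace n by the index found
      UNLOOP EXIT     \ discard the loop parameters, then return
    THEN
  LOOP
  drop -1 ;           \ no index qualified
@end example

Without the @code{UNLOOP}, @code{EXIT} would find the loop control
parameters on the return stack instead of the return address.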
1698:
1699: @cindex @code{FOR} loops
1700: Another counted loop is
1701: @example
1702: @var{n}
1703: FOR
1704: @var{body}
1705: NEXT
1706: @end example
1707: This is the preferred loop of native code compiler writers who are too
1708: lazy to optimize @code{?DO} loops properly. In Gforth, this loop
1709: iterates @var{n+1} times; @code{i} produces values starting with @var{n}
1710: and ending with 0. Other Forth systems may behave differently, even if
1711: they support @code{FOR} loops. To avoid problems, don't use @code{FOR}
1712: loops.
1713:
1714: @node Arbitrary control structures, Calls and returns, Counted Loops, Control Structures
1715: @subsection Arbitrary control structures
1716: @cindex control structures, user-defined
1717:
1718: @cindex control-flow stack
1719: ANS Forth permits and supports using control structures in a non-nested
1720: way. Information about incomplete control structures is stored on the
1721: control-flow stack. This stack may be implemented on the Forth data
1722: stack, and this is what we have done in Gforth.
1723:
1724: @cindex @code{orig}, control-flow stack item
1725: @cindex @code{dest}, control-flow stack item
1726: An @i{orig} entry represents an unresolved forward branch, a @i{dest}
1727: entry represents a backward branch target. A few words are the basis for
1728: building any control structure possible (except control structures that
1729: need storage, like calls, coroutines, and backtracking).
1730:
1731: doc-if
1732: doc-ahead
1733: doc-then
1734: doc-begin
1735: doc-until
1736: doc-again
1737: doc-cs-pick
1738: doc-cs-roll
1739:
1740: On many systems control-flow stack items take one word; in Gforth they
1741: currently take three (this may change in the future). Therefore it is a
1742: really good idea to manipulate the control flow stack with
1743: @code{cs-pick} and @code{cs-roll}, not with data stack manipulation
1744: words.
1745:
1746: Some standard control structure words are built from these words:
1747:
1748: doc-else
1749: doc-while
1750: doc-repeat
1751:
1752: Gforth adds some more control-structure words:
1753:
1754: doc-endif
1755: doc-?dup-if
1756: doc-?dup-0=-if
1757:
1758: Counted loop words constitute a separate group of words:
1759:
1760: doc-?do
1761: doc-+do
1762: doc-u+do
1763: doc--do
1764: doc-u-do
1765: doc-do
1766: doc-for
1767: doc-loop
1768: doc-+loop
1769: doc--loop
1770: doc-next
1771: doc-leave
1772: doc-?leave
1773: doc-unloop
1774: doc-done
1775:
1776: The standard does not allow using @code{cs-pick} and @code{cs-roll} on
1777: @i{do-sys}. Our system allows it, but it's your job to ensure that for
1778: every @code{?DO} etc. there is exactly one @code{UNLOOP} on any path
1779: through the definition (@code{LOOP} etc. compile an @code{UNLOOP} on the
1780: fall-through path). Also, you have to ensure that all @code{LEAVE}s are
1781: resolved (by using one of the loop-ending words or @code{DONE}).
1782:
1783: Another group of control structure words is:
1784:
1785: doc-case
1786: doc-endcase
1787: doc-of
1788: doc-endof
1789:
1790: @i{case-sys} and @i{of-sys} cannot be processed using @code{cs-pick} and
1791: @code{cs-roll}.
1792:
1793: @subsubsection Programming Style
1794:
1795: In order to ensure readability we recommend that you do not create
1796: arbitrary control structures directly, but define new control structure
1797: words for the control structure you want and use these words in your
1798: program.
1799:
1800: E.g., instead of writing
1801:
1802: @example
1803: begin
1804: ...
1805: if [ 1 cs-roll ]
1806: ...
1807: again then
1808: @end example
1809:
1810: we recommend defining control structure words, e.g.,
1811:
1812: @example
1813: : while ( dest -- orig dest )
1814: POSTPONE if
1815: 1 cs-roll ; immediate
1816:
1817: : repeat ( orig dest -- )
1818: POSTPONE again
1819: POSTPONE then ; immediate
1820: @end example
1821:
1822: and then using these to create the control structure:
1823:
1824: @example
1825: begin
1826: ...
1827: while
1828: ...
1829: repeat
1830: @end example
1831:
1832: That's much easier to read, isn't it? Of course, @code{REPEAT} and
1833: @code{WHILE} are predefined, so in this example it would not be
1834: necessary to define them.
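
The same approach works for control structure words that are not
predefined. As a sketch, the hypothetical word @code{unless} below
behaves like @code{IF} with the flag inverted; @code{pick-default} is an
equally hypothetical word that uses it.

@example
: unless ( compilation: -- orig ; run-time: f -- )
  POSTPONE 0=
  POSTPONE if ; immediate

: pick-default ( n -- n' )
  \ replace a zero argument by 42
  dup unless
    drop 42
  then ;
@end example

Since @code{unless} leaves an ordinary @i{orig} on the control-flow
stack, it can be resolved with @code{THEN} or continued with
@code{ELSE}.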
1835:
1836: @node Calls and returns, Exception Handling, Arbitrary control structures, Control Structures
1837: @subsection Calls and returns
1838: @cindex calling a definition
1839: @cindex returning from a definition
1840:
1.3 anton 1841: @cindex recursive definitions
1842: A definition can be called simply by writing the name of the definition
1843: to be called. Note that normally a definition is invisible during its
1844: definition. If you want to write a directly recursive definition, you
1845: can use @code{recursive} to make the current definition visible.
1846:
1847: doc-recursive
1848:
1849: Another way to perform a recursive call is
1850:
1851: doc-recurse
1852:
1.12 ! anton 1853: @quotation
! 1854: @progstyle
! 1855: I prefer using @code{recursive} to @code{recurse}, because calling the
! 1856: definition by name is more descriptive (if the name is well-chosen) than
! 1857: the somewhat cryptic @code{recurse}. E.g., in a quicksort
! 1858: implementation, it is much better to read (and think) ``now sort the
! 1859: partitions'' than to read ``now do a recursive call''.
! 1860: @end quotation
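
To compare the two forms, here is a factorial written both ways. This is
only a sketch; the names @code{fact} and @code{fact2} are hypothetical.

@example
: fact ( n -- n! ) recursive
  dup 2 < IF
    drop 1
  ELSE
    dup 1- fact *      \ call the definition by name
  THEN ;

: fact2 ( n -- n! )
  dup 2 < IF
    drop 1
  ELSE
    dup 1- recurse *   \ anonymous recursive call
  THEN ;
@end example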
1.3 anton 1861:
1862: For mutual recursion, use @code{defer}red words, like this:
1863:
1864: @example
1865: defer foo
1866:
1867: : bar ( ... -- ... )
1868: ... foo ... ;
1869:
1870: :noname ( ... -- ... )
1871: ... bar ... ;
1872: IS foo
1873: @end example
1874:
1875: When the end of the definition is reached, it returns. An earlier return
1876: can be forced using
1.1 anton 1877:
1878: doc-exit
1879:
1880: Don't forget to clean up the return stack and @code{UNLOOP} any
1881: outstanding @code{?DO}...@code{LOOP}s before @code{EXIT}ing. The
1882: primitive compiled by @code{EXIT} is
1883:
1884: doc-;s
1885:
1886: @node Exception Handling, , Calls and returns, Control Structures
1887: @subsection Exception Handling
1888: @cindex Exceptions
1889:
1890: doc-catch
1891: doc-throw
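
A small sketch of how the two words cooperate follows. The words
@code{check-arg} and @code{safe-arg} are hypothetical, and the throw
code 1 is an arbitrary value chosen for this example. @code{check-arg}
throws on a negative argument; @code{safe-arg} catches the exception and
substitutes 0.

@example
: check-arg ( n -- n )
  dup 0< IF 1 throw THEN ;   \ non-zero throw code signals the error

: safe-arg ( n -- n' )
  ['] check-arg catch        \ leaves 0 on success, the throw code otherwise
  IF
    drop 0                   \ the stack depth was restored; replace the item
  THEN ;
@end example

After a caught exception the data stack has the depth it had before
@code{catch} was called, but the contents of the restored items should
not be relied upon, hence the @code{drop}.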
1892:
1893: @node Locals, Defining Words, Control Structures, Words
1894: @section Locals
1895: @cindex locals
1896:
1897: Local variables can make Forth programming more enjoyable and Forth
1898: programs easier to read. Unfortunately, the locals of ANS Forth are
1899: laden with restrictions. Therefore, we provide not only the ANS Forth
1900: locals wordset, but also our own, more powerful locals wordset (we
1901: implemented the ANS Forth locals wordset through our locals wordset).
1902:
1903: The ideas in this section have also been published in the paper
1904: @cite{Automatic Scoping of Local Variables} by M. Anton Ertl, presented
1905: at EuroForth '94; it is available at
1906: @*@url{http://www.complang.tuwien.ac.at/papers/ertl94l.ps.gz}.
1907:
1908: @menu
1909: * Gforth locals::
1910: * ANS Forth locals::
1911: @end menu
1912:
1913: @node Gforth locals, ANS Forth locals, Locals, Locals
1914: @subsection Gforth locals
1915: @cindex Gforth locals
1916: @cindex locals, Gforth style
1917:
1918: Locals can be defined with
1919:
1920: @example
1921: @{ local1 local2 ... -- comment @}
1922: @end example
1923: or
1924: @example
1925: @{ local1 local2 ... @}
1926: @end example
1927:
1928: E.g.,
1929: @example
1930: : max @{ n1 n2 -- n3 @}
1931: n1 n2 > if
1932: n1
1933: else
1934: n2
1935: endif ;
1936: @end example
1937:
1938: The similarity of locals definitions with stack comments is intended. A
1939: locals definition often replaces the stack comment of a word. The order
1940: of the locals corresponds to the order in a stack comment and everything
1941: after the @code{--} is really a comment.
1942:
1943: This similarity has one disadvantage: It is too easy to confuse locals
1944: declarations with stack comments, causing bugs and making them hard to
1945: find. However, this problem can be avoided by appropriate coding
1946: conventions: Do not use both notations in the same program. If you do,
1947: they should be distinguished using additional means, e.g. by position.
1948:
1949: @cindex types of locals
1950: @cindex locals types
1951: The name of the local may be preceded by a type specifier, e.g.,
1952: @code{F:} for a floating point value:
1953:
1954: @example
1955: : CX* @{ F: Ar F: Ai F: Br F: Bi -- Cr Ci @}
1956: \ complex multiplication
1957: Ar Br f* Ai Bi f* f-
1958: Ar Bi f* Ai Br f* f+ ;
1959: @end example
1960:
1961: @cindex flavours of locals
1962: @cindex locals flavours
1963: @cindex value-flavoured locals
1964: @cindex variable-flavoured locals
1965: Gforth currently supports cells (@code{W:}, @code{W^}), doubles
1966: (@code{D:}, @code{D^}), floats (@code{F:}, @code{F^}) and characters
1967: (@code{C:}, @code{C^}) in two flavours: a value-flavoured local (defined
1968: with @code{W:}, @code{D:} etc.) produces its value and can be changed
1969: with @code{TO}. A variable-flavoured local (defined with @code{W^} etc.)
1970: produces its address (which becomes invalid when the variable's scope is
1971: left). E.g., the standard word @code{emit} can be defined in terms of
1972: @code{type} like this:
1973:
1974: @example
1975: : emit @{ C^ char* -- @}
1976: char* 1 type ;
1977: @end example
1978:
1979: @cindex default type of locals
1980: @cindex locals, default type
1981: A local without type specifier is a @code{W:} local. Both flavours of
1982: locals are initialized with values from the data or FP stack.
1983:
1984: Currently there is no way to define locals with user-defined data
1985: structures, but we are working on it.
1986:
1987: Gforth allows defining locals everywhere in a colon definition. This
1988: poses the following questions:
1989:
1990: @menu
1991: * Where are locals visible by name?::
1992: * How long do locals live?::
1993: * Programming Style::
1994: * Implementation::
1995: @end menu
1996:
1997: @node Where are locals visible by name?, How long do locals live?, Gforth locals, Gforth locals
1998: @subsubsection Where are locals visible by name?
1999: @cindex locals visibility
2000: @cindex visibility of locals
2001: @cindex scope of locals
2002:
2003: Basically, the answer is that locals are visible where you would expect
2004: it in block-structured languages, and sometimes a little longer. If you
2005: want to restrict the scope of a local, enclose its definition in
2006: @code{SCOPE}...@code{ENDSCOPE}.
2007:
2008: doc-scope
2009: doc-endscope
2010:
2011: These words behave like control structure words, so you can use them
2012: with @code{CS-PICK} and @code{CS-ROLL} to restrict the scope in
2013: arbitrary ways.
2014:
2015: If you want a more exact answer to the visibility question, here's the
2016: basic principle: A local is visible in all places that can only be
2017: reached through the definition of the local@footnote{In compiler
2018: construction terminology, all places dominated by the definition of the
2019: local.}. In other words, it is not visible in places that can be reached
2020: without going through the definition of the local. E.g., locals defined
2021: in @code{IF}...@code{ENDIF} are visible until the @code{ENDIF}, locals
2022: defined in @code{BEGIN}...@code{UNTIL} are visible after the
2023: @code{UNTIL} (until, e.g., a subsequent @code{ENDSCOPE}).
2024:
2025: The reasoning behind this solution is: We want to have the locals
2026: visible as long as it is meaningful. The user can always make the
2027: visibility shorter by using explicit scoping. In a place that can
2028: only be reached through the definition of a local, the meaning of a
2029: local name is clear. In other places it is not: How is the local
2030: initialized at the control flow path that does not contain the
2031: definition? Which local is meant, if the same name is defined twice in
2032: two independent control flow paths?
2033:
2034: This should be enough detail for nearly all users, so you can skip the
2035: rest of this section. If you really must know all the gory details and
2036: options, read on.
2037:
2038: In order to implement this rule, the compiler has to know which places
2039: are unreachable. It knows this automatically after @code{AHEAD},
2040: @code{AGAIN}, @code{EXIT} and @code{LEAVE}; in other cases (e.g., after
2041: most @code{THROW}s), you can use the word @code{UNREACHABLE} to tell the
2042: compiler that the control flow never reaches that place. If
2043: @code{UNREACHABLE} is not used where it could, the only consequence is
2044: that the visibility of some locals is more limited than the rule above
2045: says. If @code{UNREACHABLE} is used where it should not (i.e., if you
2046: lie to the compiler), buggy code will be produced.
2047:
2048: doc-unreachable
2049:
2050: Another problem with this rule is that at @code{BEGIN}, the compiler
2051: does not know which locals will be visible on the incoming
2052: back-edge. All problems discussed in the following are due to this
2053: ignorance of the compiler (we discuss the problems using @code{BEGIN}
2054: loops as examples; the discussion also applies to @code{?DO} and other
2055: loops). Perhaps the most insidious example is:
2056: @example
2057: AHEAD
2058: BEGIN
2059: x
2060: [ 1 CS-ROLL ] THEN
2061: @{ x @}
2062: ...
2063: UNTIL
2064: @end example
2065:
2066: This should be legal according to the visibility rule. The use of
2067: @code{x} can only be reached through the definition; but that appears
2068: textually below the use.
2069:
2070: From this example it is clear that the visibility rules cannot be fully
2071: implemented without major headaches. Our implementation treats common
2072: cases as advertised and the exceptions are treated in a safe way: The
2073: compiler makes a reasonable guess about the locals visible after a
2074: @code{BEGIN}; if it is too pessimistic, the
2075: user will get a spurious error about the local not being defined; if the
2076: compiler is too optimistic, it will notice this later and issue a
2077: warning. In the case above the compiler would complain about @code{x}
2078: being undefined at its use. You can see from the obscure examples in
2079: this section that it takes quite unusual control structures to get the
2080: compiler into trouble, and even then it will often do fine.
2081:
2082: If the @code{BEGIN} is reachable from above, the most optimistic guess
2083: is that all locals visible before the @code{BEGIN} will also be
2084: visible after the @code{BEGIN}. This guess is valid for all loops that
2085: are entered only through the @code{BEGIN}, in particular, for normal
2086: @code{BEGIN}...@code{WHILE}...@code{REPEAT} and
2087: @code{BEGIN}...@code{UNTIL} loops and it is implemented in our
2088: compiler. When the branch to the @code{BEGIN} is finally generated by
2089: @code{AGAIN} or @code{UNTIL}, the compiler checks the guess and
2090: warns the user if it was too optimistic:
2091: @example
2092: IF
2093: @{ x @}
2094: BEGIN
2095: \ x ?
2096: [ 1 cs-roll ] THEN
2097: ...
2098: UNTIL
2099: @end example
2100:
2101: Here, @code{x} lives only until the @code{BEGIN}, but the compiler
2102: optimistically assumes that it lives until the @code{THEN}. It notices
2103: this difference when it compiles the @code{UNTIL} and issues a
2104: warning. The user can avoid the warning, and make sure that @code{x}
2105: is not used in the wrong area by using explicit scoping:
2106: @example
2107: IF
2108: SCOPE
2109: @{ x @}
2110: ENDSCOPE
2111: BEGIN
2112: [ 1 cs-roll ] THEN
2113: ...
2114: UNTIL
2115: @end example
2116:
2117: Since the guess is optimistic, there will be no spurious error messages
2118: about undefined locals.
2119:
2120: If the @code{BEGIN} is not reachable from above (e.g., after
2121: @code{AHEAD} or @code{EXIT}), the compiler cannot even make an
2122: optimistic guess, as the locals visible after the @code{BEGIN} may be
2123: defined later. Therefore, the compiler assumes that no locals are
2124: visible after the @code{BEGIN}. However, the user can use
2125: @code{ASSUME-LIVE} to make the compiler assume that the same locals are
2126: visible at the @code{BEGIN} as at the point where the top control-flow stack
2127: item was created.
2128:
2129: doc-assume-live
2130:
2131: E.g.,
2132: @example
2133: @{ x @}
2134: AHEAD
2135: ASSUME-LIVE
2136: BEGIN
2137: x
2138: [ 1 CS-ROLL ] THEN
2139: ...
2140: UNTIL
2141: @end example
2142:
2143: Other cases where the locals are defined before the @code{BEGIN} can be
2144: handled by inserting an appropriate @code{CS-ROLL} before the
2145: @code{ASSUME-LIVE} (and changing the control-flow stack manipulation
2146: behind the @code{ASSUME-LIVE}).
2147:
2148: Cases where locals are defined after the @code{BEGIN} (but should be
2149: visible immediately after the @code{BEGIN}) can only be handled by
2150: rearranging the loop. E.g., the ``most insidious'' example above can be
2151: arranged into:
2152: @example
2153: BEGIN
2154: @{ x @}
2155: ... 0=
2156: WHILE
2157: x
2158: REPEAT
2159: @end example
2160:
2161: @node How long do locals live?, Programming Style, Where are locals visible by name?, Gforth locals
2162: @subsubsection How long do locals live?
2163: @cindex locals lifetime
2164: @cindex lifetime of locals
2165:
2166: The right answer for the lifetime question would be: A local lives at
2167: least as long as it can be accessed. For a value-flavoured local this
2168: means: until the end of its visibility. However, a variable-flavoured
2169: local could be accessed through its address far beyond its visibility
2170: scope. Ultimately, this would mean that such locals would have to be
2171: garbage collected. Since this entails un-Forth-like implementation
2172: complexities, I adopted the same cowardly solution as some other
2173: languages (e.g., C): The local lives only as long as it is visible;
2174: afterwards its address is invalid (and programs that access it
2175: afterwards are erroneous).
2176:
2177: @node Programming Style, Implementation, How long do locals live?, Gforth locals
2178: @subsubsection Programming Style
2179: @cindex locals programming style
2180: @cindex programming style, locals
2181:
2182: The freedom to define locals anywhere has the potential to change
2183: programming styles dramatically. In particular, the need to use the
2184: return stack for intermediate storage vanishes. Moreover, all stack
2185: manipulations (except @code{PICK}s and @code{ROLL}s with run-time
2186: determined arguments) can be eliminated: If the stack items are in the
2187: wrong order, just write a locals definition for all of them; then
2188: write the items in the order you want.
2189:
2190: This seems a little far-fetched and eliminating stack manipulations is
2191: unlikely to become a conscious programming objective. Still, the number
2192: of stack manipulations will be reduced dramatically if local variables
2193: are used liberally (e.g., compare @code{max} in @ref{Gforth locals} with
2194: a traditional implementation of @code{max}).
2195:
2196: This shows one potential benefit of locals: making Forth programs more
2197: readable. Of course, this benefit will only be realized if the
2198: programmers continue to honour the principle of factoring instead of
2199: using the added latitude to make the words longer.
2200:
2201: @cindex single-assignment style for locals
2202: Using @code{TO} can and should be avoided. Without @code{TO},
2203: every value-flavoured local has only a single assignment and many
2204: advantages of functional languages apply to Forth. I.e., programs are
2205: easier to analyse, to optimize and to read: It is clear from the
2206: definition what the local stands for, it does not turn into something
2207: different later.
2208:
2209: E.g., a definition using @code{TO} might look like this:
2210: @example
2211: : strcmp @{ addr1 u1 addr2 u2 -- n @}
2212: u1 u2 min 0
2213: ?do
2214: addr1 c@@ addr2 c@@ -
2215: ?dup-if
2216: unloop exit
2217: then
2218: addr1 char+ TO addr1
2219: addr2 char+ TO addr2
2220: loop
2221: u1 u2 - ;
2222: @end example
2223: Here, @code{TO} is used to update @code{addr1} and @code{addr2} at
2224: every loop iteration. @code{strcmp} is a typical example of the
2225: readability problems of using @code{TO}. When you start reading
2226: @code{strcmp}, you think that @code{addr1} refers to the start of the
2227: string. Only near the end of the loop do you realize that it is something
2228: else.
2229:
2230: This can be avoided by defining two locals at the start of the loop that
2231: are initialized with the right value for the current iteration.
2232: @example
2233: : strcmp @{ addr1 u1 addr2 u2 -- n @}
2234: addr1 addr2
2235: u1 u2 min 0
2236: ?do @{ s1 s2 @}
2237: s1 c@@ s2 c@@ -
2238: ?dup-if
2239: unloop exit
2240: then
2241: s1 char+ s2 char+
2242: loop
2243: 2drop
2244: u1 u2 - ;
2245: @end example
2246: Here it is clear from the start that @code{s1} has a different value
2247: in every loop iteration.
2248:
2249: @node Implementation, , Programming Style, Gforth locals
2250: @subsubsection Implementation
2251: @cindex locals implementation
2252: @cindex implementation of locals
2253:
2254: @cindex locals stack
2255: Gforth uses an extra locals stack. The most compelling reason for
2256: this is that the return stack is not float-aligned; using an extra stack
2257: also eliminates the problems and restrictions of using the return stack
2258: as locals stack. Like the other stacks, the locals stack grows toward
2259: lower addresses. A few primitives allow an efficient implementation:
2260:
2261: doc-@local#
2262: doc-f@local#
2263: doc-laddr#
2264: doc-lp+!#
2265: doc-lp!
2266: doc->l
2267: doc-f>l
2268:
2269: In addition to these primitives, some specializations of these
2270: primitives for commonly occurring inline arguments are provided for
2271: efficiency reasons, e.g., @code{@@local0} as specialization of
2272: @code{@@local#} for the inline argument 0. The following compiling words
2273: compile the right specialized version, or the general version, as
2274: appropriate:
2275:
2276: doc-compile-@local
2277: doc-compile-f@local
2278: doc-compile-lp+!
2279:
2280: Combinations of conditional branches and @code{lp+!#} like
2281: @code{?branch-lp+!#} (the locals pointer is only changed if the branch
2282: is taken) are provided for efficiency and correctness in loops.
2283:
2284: A special area in the dictionary space is reserved for keeping the
2285: local variable names. @code{@{} switches the dictionary pointer to this
2286: area and @code{@}} switches it back and generates the locals
2287: initializing code. @code{W:} etc.@ are normal defining words. This
2288: special area is cleared at the start of every colon definition.
2289:
2290: @cindex wordlist for defining locals
2291: A special feature of Gforth's dictionary is used to implement the
2292: definition of locals without type specifiers: every wordlist (aka
2293: vocabulary) has its own methods for searching
2294: etc. (@pxref{Wordlists}). For the present purpose we defined a wordlist
2295: with a special search method: When it is searched for a word, it
2296: actually creates that word using @code{W:}. @code{@{} changes the search
2297: order to first search the wordlist containing @code{@}}, @code{W:} etc.,
2298: and then the wordlist for defining locals without type specifiers.
2299:
2300: The lifetime rules support a stack discipline within a colon
2301: definition: The lifetime of a local is either nested with other locals'
2302: lifetimes or it does not overlap them.
2303:
2304: At @code{BEGIN}, @code{IF}, and @code{AHEAD} no code for locals stack
2305: pointer manipulation is generated. Between control structure words
2306: locals definitions can push locals onto the locals stack. @code{AGAIN}
2307: is the simplest of the other three control flow words. It has to
2308: restore the locals stack depth of the corresponding @code{BEGIN}
2309: before branching. The code looks like this:
2310: @format
2311: @code{lp+!#} current-locals-size @minus{} dest-locals-size
2312: @code{branch} <begin>
2313: @end format
2314:
2315: @code{UNTIL} is a little more complicated: If it branches back, it
2316: must adjust the stack just like @code{AGAIN}. But if it falls through,
2317: the locals stack must not be changed. The compiler generates the
2318: following code:
2319: @format
2320: @code{?branch-lp+!#} <begin> current-locals-size @minus{} dest-locals-size
2321: @end format
2322: The locals stack pointer is only adjusted if the branch is taken.
2323:
2324: @code{THEN} can produce somewhat inefficient code:
2325: @format
2326: @code{lp+!#} current-locals-size @minus{} orig-locals-size
2327: <orig target>:
2328: @code{lp+!#} orig-locals-size @minus{} new-locals-size
2329: @end format
2330: The second @code{lp+!#} adjusts the locals stack pointer from the
2331: level at the @var{orig} point to the level after the @code{THEN}. The
2332: first @code{lp+!#} adjusts the locals stack pointer from the current
2333: level to the level at the orig point, so the complete effect is an
2334: adjustment from the current level to the right level after the
2335: @code{THEN}.
2336:
2337: @cindex locals information on the control-flow stack
2338: @cindex control-flow stack items, locals information
2339: In a conventional Forth implementation a dest control-flow stack entry
2340: is just the target address and an orig entry is just the address to be
2341: patched. Our locals implementation adds a wordlist to every orig or dest
2342: item. It is the list of locals visible (or assumed visible) at the point
2343: described by the entry. Our implementation also adds a tag to identify
2344: the kind of entry, in particular to differentiate between live and dead
2345: (reachable and unreachable) orig entries.
2346:
2347: A few unusual operations have to be performed on locals wordlists:
2348:
2349: doc-common-list
2350: doc-sub-list?
2351: doc-list-size
2352:
2353: Several features of our locals wordlist implementation make these
2354: operations easy to implement: The locals wordlists are organised as
2355: linked lists; the tails of these lists are shared, if the lists
2356: contain some of the same locals; and the address of a name is greater
2357: than the address of the names behind it in the list.
2358:
2359: Another important implementation detail is the variable
2360: @code{dead-code}. It is used by @code{BEGIN} and @code{THEN} to
2361: determine if they can be reached directly or only through the branch
2362: that they resolve. @code{dead-code} is set by @code{UNREACHABLE},
2363: @code{AHEAD}, @code{EXIT} etc., and cleared at the start of a colon
2364: definition, by @code{BEGIN} and usually by @code{THEN}.
2365:
2366: Counted loops are similar to other loops in most respects, but
2367: @code{LEAVE} requires special attention: It performs basically the same
2368: service as @code{AHEAD}, but it does not create a control-flow stack
2369: entry. Therefore the information has to be stored elsewhere;
2370: traditionally, the information was stored in the target fields of the
2371: branches created by the @code{LEAVE}s, by organizing these fields into a
2372: linked list. Unfortunately, this clever trick does not provide enough
2373: space for storing our extended control flow information. Therefore, we
2374: introduce another stack, the leave stack. It contains the control-flow
2375: stack entries for all unresolved @code{LEAVE}s.
2376:
2377: Local names are kept until the end of the colon definition, even if
2378: they are no longer visible in any control-flow path. In a few cases
2379: this may lead to increased space needs for the locals name area, but
2380: usually less than reclaiming this space would cost in code size.
2381:
2382:
2383: @node ANS Forth locals, , Gforth locals, Locals
2384: @subsection ANS Forth locals
2385: @cindex locals, ANS Forth style
2386:
2387: The ANS Forth locals wordset does not define a syntax for locals, but
2388: words that make it possible to define various syntaxes. One of the
2389: possible syntaxes is a subset of the syntax we used in the Gforth locals
2390: wordset, i.e.:
2391:
2392: @example
2393: @{ local1 local2 ... -- comment @}
2394: @end example
2395: or
2396: @example
2397: @{ local1 local2 ... @}
2398: @end example
2399:
2400: The order of the locals corresponds to the order in a stack comment. The
2401: restrictions are:
2402:
2403: @itemize @bullet
2404: @item
2405: Locals can only be cell-sized values (no type specifiers are allowed).
2406: @item
2407: Locals can be defined only outside control structures.
2408: @item
2409: Locals can interfere with explicit usage of the return stack. For the
2410: exact (and long) rules, see the standard. If you don't use return stack
2411: accessing words in a definition using locals, you will be all right. The
2412: purpose of this rule is to make locals implementation on the return
2413: stack easier.
2414: @item
2415: The whole definition must be in one line.
2416: @end itemize
2417:
2418: Locals defined in this way behave like @code{VALUE}s (@pxref{Simple
2419: Defining Words}). I.e., they are initialized from the stack. Using their
2420: name produces their value. Their value can be changed using @code{TO}.
2421:
2422: Since this syntax is supported by Gforth directly, you need not do
2423: anything to use it. If you want to port a program using this syntax to
2424: another ANS Forth system, use @file{compat/anslocal.fs} to implement the
2425: syntax on the other system.
2426:
2427: Note that a syntax shown in the standard, section A.13 looks
2428: similar, but is quite different in having the order of locals
2429: reversed. Beware!
2430:
2431: The ANS Forth locals wordset itself consists of the following word
2432:
2433: doc-(local)
2434:
2435: The ANS Forth locals extension wordset defines a syntax, but it is so
2436: awful that we strongly recommend against using it. We have implemented this
2437: syntax to make porting to Gforth easy, but do not document it here. The
2438: problem with this syntax is that the locals are defined in an order
2439: reversed with respect to the standard stack comment notation, making
2440: programs harder to read, and easier to misread and miswrite. The only
2441: merit of this syntax is that it is easy to implement using the ANS Forth
2442: locals wordset.
2443:
1.5 anton 2444: @node Defining Words, Structures, Locals, Words
1.1 anton 2445: @section Defining Words
2446: @cindex defining words
2447:
2448: @menu
2449: * Simple Defining Words::
2450: * Colon Definitions::
2451: * User-defined Defining Words::
2452: * Supplying names::
2453: * Interpretation and Compilation Semantics::
2454: @end menu
2455:
2456: @node Simple Defining Words, Colon Definitions, Defining Words, Defining Words
2457: @subsection Simple Defining Words
2458: @cindex simple defining words
2459: @cindex defining words, simple
2460:
2461: doc-constant
2462: doc-2constant
2463: doc-fconstant
2464: doc-variable
2465: doc-2variable
2466: doc-fvariable
2467: doc-create
2468: doc-user
2469: doc-value
2470: doc-to
2471: doc-defer
2472: doc-is
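
For example, a few of these words might be used as follows (all names
defined here are invented for illustration):

@example
5 Constant five             \ five ( -- 5 )
Variable hits               \ hits ( -- addr )
0 hits !   1 hits +!        \ the cell at hits now contains 1
10 Value max-items          \ max-items ( -- 10 )
20 TO max-items             \ max-items ( -- 20 )
Defer greet                 \ the behaviour is supplied later
: hello ( -- ) ." hello" ;
' hello IS greet            \ greet now performs hello
@end example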
2473:
2474: @node Colon Definitions, User-defined Defining Words, Simple Defining Words, Defining Words
2475: @subsection Colon Definitions
2476: @cindex colon definitions
2477:
2478: @example
2479: : name ( ... -- ... )
2480: word1 word2 word3 ;
2481: @end example
2482:
2483: creates a word called @code{name}, that, upon execution, executes
2484: @code{word1 word2 word3}. @code{name} is a @dfn{(colon) definition}.
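
As a concrete instance (@code{squared} is an arbitrary example name):

@example
: squared ( n -- n*n )
  dup * ;

5 squared .   \ prints 25
@end example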
2485:
2486: The explanation above is somewhat superficial. @xref{Interpretation and
2487: Compilation Semantics}, for an in-depth discussion of some of the issues
2488: involved.
2489:
2490: doc-:
2491: doc-;
2492:
2493: @node User-defined Defining Words, Supplying names, Colon Definitions, Defining Words
2494: @subsection User-defined Defining Words
2495: @cindex user-defined defining words
2496: @cindex defining words, user-defined
2497:
2498: You can create new defining words simply by wrapping defining-time code
2499: around existing defining words and putting the sequence in a colon
2500: definition.
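
A minimal sketch of this idea (the names @code{cells-constant} and
@code{4cells} are made up for this example): the defining-time code
@code{cells} runs before the existing defining word @code{Constant}.

@example
: cells-constant ( n "name" -- )
  cells Constant ;

4 cells-constant 4cells   \ 4cells ( -- u ), the size of four cells
@end example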
2501:
2502: @cindex @code{CREATE} ... @code{DOES>}
2503: If you want the words defined with your defining words to behave
2504: differently from words defined with standard defining words, you can
2505: write your defining word like this:
2506:
2507: @example
2508: : def-word ( "name" -- )
2509: Create @var{code1}
2510: DOES> ( ... -- ... )
2511: @var{code2} ;
2512:
2513: def-word name
2514: @end example
2515:
2516: Technically, this fragment defines a defining word @code{def-word}, and
2517: a word @code{name}; when you execute @code{name}, the address of the
2518: body of @code{name} is put on the data stack and @var{code2} is executed
2519: (the address of the body of @code{name} is the address @code{HERE}
2520: returns immediately after the @code{CREATE}).
2521:
2522: In other words, if you make the following definitions:
2523:
2524: @example
2525: : def-word1 ( "name" -- )
2526: Create @var{code1} ;
2527:
2528: : action1 ( ... -- ... )
2529: @var{code2} ;
2530:
2531: def-word1 name1
2532: @end example
2533:
2534: then using @code{name1 action1} is equivalent to using @code{name}.
2535:
2536: E.g., you can implement @code{Constant} in this way:
2537:
2538: @example
2539: : constant ( w "name" -- )
2540: create ,
2541: DOES> ( -- w )
2542: @@ ;
2543: @end example
2544:
2545: When you create a constant with @code{5 constant five}, first a new word
2546: @code{five} is created, then the value 5 is laid down in the body of
2547: @code{five} with @code{,}. When @code{five} is invoked, the address of
2548: the body is put on the stack, and @code{@@} retrieves the value 5.
2549:
2550: @cindex stack effect of @code{DOES>}-parts
2551: @cindex @code{DOES>}-parts, stack effect
2552: In the example above the stack comment after the @code{DOES>} specifies
2553: the stack effect of the defined words, not the stack effect of the
2554: following code (the following code expects the address of the body on
2555: the top of stack, which is not reflected in the stack comment). This is
2556: the convention that I use and recommend (it clashes a bit with using
2557: locals declarations for stack effect specification, though).
2558:
2559: @subsubsection Applications of @code{CREATE..DOES>}
2560: @cindex @code{CREATE} ... @code{DOES>}, applications
2561:
2562: You may wonder how to use this feature. Here are some usage patterns:
2563:
2564: @cindex factoring similar colon definitions
2565: When you see a sequence of code occurring several times, and you can
2566: identify a meaning, you will factor it out as a colon definition. When
2567: you see similar colon definitions, you can factor them using
2568: @code{CREATE..DOES>}. E.g., an assembler usually defines several words
2569: that look very similar:
2570: @example
2571: : ori, ( reg-target reg-source n -- )
2572: 0 asm-reg-reg-imm ;
2573: : andi, ( reg-target reg-source n -- )
2574: 1 asm-reg-reg-imm ;
2575: @end example
2576:
2577: This could be factored with:
2578: @example
2579: : reg-reg-imm ( op-code -- )
2580: create ,
2581: DOES> ( reg-target reg-source n -- )
2582: @@ asm-reg-reg-imm ;
2583:
2584: 0 reg-reg-imm ori,
2585: 1 reg-reg-imm andi,
2586: @end example
2587:
2588: @cindex currying
2589: Another view of @code{CREATE..DOES>} is to consider it as a crude way to
2590: supply a part of the parameters for a word (known as @dfn{currying} in
2591: the functional language community). E.g., @code{+} needs two
2592: parameters. Creating versions of @code{+} with one parameter fixed can
2593: be done like this:
2594: @example
2595: : curry+ ( n1 -- )
2596: create ,
2597: DOES> ( n2 -- n1+n2 )
2598: @@ + ;
2599:
2600: 3 curry+ 3+
2601: -2 curry+ 2-
2602: @end example
2603:
2604: @subsubsection The gory details of @code{CREATE..DOES>}
2605: @cindex @code{CREATE} ... @code{DOES>}, details
2606:
2607: doc-does>
2608:
2609: @cindex @code{DOES>} in a separate definition
2610: This means that you need not use @code{CREATE} and @code{DOES>} in the
2611: same definition; e.g., you can put the @code{DOES>}-part in a separate
2612: definition. This allows us to, e.g., select among different @code{DOES>}-parts:
2613: @example
2614: : does1
2615: DOES> ( ... -- ... )
2616: ... ;
2617:
2618: : does2
2619: DOES> ( ... -- ... )
2620: ... ;
2621:
2622: : def-word ( ... -- ... )
2623: create ...
2624: IF
2625: does1
2626: ELSE
2627: does2
2628: ENDIF ;
2629: @end example
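
To make the pattern above concrete, here is a hedged sketch; all names
(@code{does-value}, @code{does-address}, @code{cell-word}) are
hypothetical. The defining word stores a cell and uses a flag to decide
whether the defined word should produce the stored value or the address
of its body:

@example
: does-value ( -- )
  DOES> ( -- x )
  @@ ;

: does-address ( -- )
  DOES> ( -- addr )
  ;

: cell-word ( x value? "name" -- )
  create swap ,        \ lay down x, keep the flag
  IF
    does-value
  ELSE
    does-address
  ENDIF ;

5 true  cell-word five    five .       \ prints 5
7 false cell-word seven   seven @@ .   \ prints 7
@end example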
2630:
2631: @cindex @code{DOES>} in interpretation state
2632: In a standard program you can apply a @code{DOES>}-part only if the last
2633: word was defined with @code{CREATE}. In Gforth, the @code{DOES>}-part
2634: will override the behaviour of the last word defined in any case. In a
2635: standard program, you can use @code{DOES>} only in a colon
2636: definition. In Gforth, you can also use it in interpretation state, in a
2637: kind of one-shot mode:
2638: @example
2639: CREATE name ( ... -- ... )
2640: @var{initialization}
2641: DOES>
2642: @var{code} ;
2643: @end example
2644: This is equivalent to the standard
2645: @example
2646: :noname
2647: DOES>
2648: @var{code} ;
2649: CREATE name EXECUTE ( ... -- ... )
2650: @var{initialization}
2651: @end example
2652:
2653: You can get the address of the body of a word with
2654:
2655: doc->body
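
For a word defined with @code{CREATE}, the address obtained this way is
the same one that the word itself pushes. A small sketch (@code{pair}
is an invented name):

@example
Create pair  1 , 2 ,
pair @@ .                 \ prints 1
' pair >body @@ .         \ also prints 1
' pair >body cell+ @@ .   \ prints 2
@end example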
2656:
2657: @node Supplying names, Interpretation and Compilation Semantics, User-defined Defining Words, Defining Words
2658: @subsection Supplying names for the defined words
2659: @cindex names for defined words
2660: @cindex defining words, name parameter
2661:
2662: @cindex defining words, name given in a string
2663: By default, defining words take the names for the defined words from the
2664: input stream. Sometimes you want to supply the name from a string. You
2665: can do this with
2666:
2667: doc-nextname
2668:
2669: E.g.,
2670:
2671: @example
2672: s" foo" nextname create
2673: @end example
2674: is equivalent to
2675: @example
2676: create foo
2677: @end example
2678:
2679: @cindex defining words without name
2680: Sometimes you want to define a word without a name. You can do this with
2681:
2682: doc-noname
2683:
2684: @cindex execution token of last defined word
2685: To make any use of the newly defined word, you need its execution
2686: token. You can get it with
2687:
2688: doc-lastxt
2689:
2690: E.g., you can initialize a deferred word with an anonymous colon
2691: definition:
2692: @example
2693: Defer deferred
2694: noname : ( ... -- ... )
2695: ... ;
2696: lastxt IS deferred
2697: @end example
2698:
2699: @code{lastxt} also works when the last word was not defined as
2700: @code{noname}.
2701:
2702: The standard has also recognized the need for anonymous words and
2703: provides
2704:
2705: doc-:noname
2706:
2707: This leaves the execution token for the word on the stack after the
2708: closing @code{;}. You can rewrite the last example with @code{:noname}:
2709: @example
2710: Defer deferred
2711: :noname ( ... -- ... )
2712: ... ;
2713: IS deferred
2714: @end example
2715:
2716: @node Interpretation and Compilation Semantics, , Supplying names, Defining Words
2717: @subsection Interpretation and Compilation Semantics
2718: @cindex semantics, interpretation and compilation
2719:
2720: @cindex interpretation semantics
2721: The @dfn{interpretation semantics} of a word are what the text
2722: interpreter does when it encounters the word in interpret state. It also
2723: appears in some other contexts, e.g., the execution token returned by
2724: @code{' @var{word}} identifies the interpretation semantics of
2725: @var{word} (in other words, @code{' @var{word} execute} is equivalent to
2726: interpret-state text interpretation of @code{@var{word}}).
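
For an ordinary word such as @code{+}, the following two lines
therefore behave the same:

@example
3 4 + .             \ prints 7
3 4 ' + execute .   \ also prints 7
@end example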
2727:
2728: @cindex compilation semantics
2729: The @dfn{compilation semantics} of a word are what the text interpreter
2730: does when it encounters the word in compile state. It also appears in
2731: other contexts, e.g., @code{POSTPONE @var{word}} compiles@footnote{In
2732: standard terminology, ``appends to the current definition''.} the
2733: compilation semantics of @var{word}.
2734:
2735: @cindex execution semantics
2736: The standard also talks about @dfn{execution semantics}. They are used
2737: only for defining the interpretation and compilation semantics of many
2738: words. By default, the interpretation semantics of a word are to
2739: @code{execute} its execution semantics, and the compilation semantics of
2740: a word are to @code{compile,} its execution semantics.@footnote{In
2741: standard terminology: The default interpretation semantics are its
2742: execution semantics; the default compilation semantics are to append its
2743: execution semantics to the execution semantics of the current
2744: definition.}
2745:
2746: @cindex immediate words
2747: You can change the compilation semantics into @code{execute}ing the
2748: execution semantics with
2749:
2750: doc-immediate
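
As an illustration (the names @code{[five]} and @code{ten} are made
up), an immediate word is executed while another definition is being
compiled; here it compiles a literal:

@example
: [five] ( compilation: -- ; run-time: -- 5 )
  5 POSTPONE literal ; immediate

: ten ( -- 10 )
  [five] [five] + ;   \ [five] runs while ten is being compiled

ten .   \ prints 10
@end example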
2751:
2752: @cindex compile-only words
2753: You can remove the interpretation semantics of a word with
2754:
2755: doc-compile-only
2756: doc-restrict
2757:
2758: Note that ticking (@code{'}) compile-only words gives an error
2759: (``Interpreting a compile-only word'').
2760:
2761: Gforth also allows you to define words with arbitrary combinations of
2762: interpretation and compilation semantics.
2763:
2764: doc-interpret/compile:
2765:
2766: This feature was introduced for implementing @code{TO} and @code{S"}. I
2767: recommend that you do not define such words, as cute as they may be:
2768: they make it hard to get at both parts of the word in some contexts.
2769: E.g., assume you want to get an execution token for the compilation
2770: part. Instead, define two words, one that embodies the interpretation
2771: part, and one that embodies the compilation part.
2772:
2773: There is, however, a potentially useful application of this feature:
2774: Providing differing implementations for the default semantics. While
2775: this introduces redundancy and is therefore usually a bad idea, a
2776: performance improvement may be worth the trouble. E.g., consider the
2777: word @code{foobar}:
2778:
2779: @example
2780: : foobar
2781: foo bar ;
2782: @end example
2783:
2784: Let us assume that @code{foobar} is called so frequently that the
2785: calling overhead would take a significant amount of the run-time. We can
2786: optimize it with @code{interpret/compile:}:
2787:
2788: @example
2789: :noname
2790: foo bar ;
2791: :noname
2792: POSTPONE foo POSTPONE bar ;
2793: interpret/compile: foobar
2794: @end example
2795:
2796: This definition has the same interpretation semantics and essentially
2797: the same compilation semantics as the simple definition of
2798: @code{foobar}, but the implementation of the compilation semantics is
2799: more efficient with respect to run-time.
2800:
2801: @cindex state-smart words are a bad idea
2802: Some people try to use state-smart words to emulate the feature provided
2803: by @code{interpret/compile:} (words are state-smart if they check
2804: @code{STATE} during execution). E.g., they would try to code
2805: @code{foobar} like this:
2806:
2807: @example
2808: : foobar
2809: STATE @@
2810: IF ( compilation state )
2811: POSTPONE foo POSTPONE bar
2812: ELSE
2813: foo bar
2814: ENDIF ; immediate
2815: @end example
2816:
2817: While this works if @code{foobar} is processed only by the text
2818: interpreter, it does not work in other contexts (like @code{'} or
2819: @code{POSTPONE}). E.g., @code{' foobar} will produce an execution token
2820: for a state-smart word, not for the interpretation semantics of the
2821: original @code{foobar}; when you execute this execution token (directly
2822: with @code{EXECUTE} or indirectly through @code{COMPILE,}) in compile
2823: state, the result will not be what you expected (i.e., it will not
2824: perform @code{foo bar}). State-smart words are a bad idea. Simply don't
2825: write them!
2826:
2827: @cindex defining words with arbitrary semantics combinations
2828: It is also possible to write defining words that define words with
2829: arbitrary combinations of interpretation and compilation semantics (or,
2830: preferably, arbitrary combinations of implementations of the default
2831: semantics). In general, this looks like:
2832:
2833: @example
2834: : def-word
2835: create-interpret/compile
2836: @var{code1}
2837: interpretation>
2838: @var{code2}
2839: <interpretation
2840: compilation>
2841: @var{code3}
2842: <compilation ;
2843: @end example
2844:
2845: For a @var{word} defined with @code{def-word}, the interpretation
2846: semantics are to push the address of the body of @var{word} and perform
2847: @var{code2}, and the compilation semantics are to push the address of
2848: the body of @var{word} and perform @var{code3}. E.g., @code{constant}
2849: can also be defined like this:
2850:
2851: @example
2852: : constant ( n "name" -- )
2853: create-interpret/compile
2854: ,
2855: interpretation> ( -- n )
2856: @@
2857: <interpretation
2858: compilation> ( compilation. -- ; run-time. -- n )
2859: @@ postpone literal
2860: <compilation ;
2861: @end example
2862:
2863: doc-create-interpret/compile
2864: doc-interpretation>
2865: doc-<interpretation
2866: doc-compilation>
2867: doc-<compilation
2868:
2869: Note that words defined with @code{interpret/compile:} and
2870: @code{create-interpret/compile} have an extended header structure that
2871: differs from other words; however, unless you try to access them with
2872: plain address arithmetic, you should not notice this. Words for
2873: accessing the header structure usually know how to deal with this; e.g.,
2874: @code{' word >body} also gives you the body of a word created with
2875: @code{create-interpret/compile}.
2876:
1.5 anton 2877: @c ----------------------------------------------------------
1.12 ! anton 2878: @node Structures, Object-oriented Forth, Defining Words, Words
1.5 anton 2879: @section Structures
2880: @cindex structures
2881: @cindex records
2882:
2883: This section presents the structure package that comes with Gforth. A
2884: version of the package implemented in plain ANS Forth is available in
2885: @file{compat/struct.fs}. This package was inspired by a posting on
2886: comp.lang.forth in 1989 (unfortunately I don't remember by whom;
2887: possibly John Hayes). A version of this section has been published in
2888: ???. Marcel Hendrix provided helpful comments.
2889:
2890: @menu
2891: * Why explicit structure support?::
2892: * Structure Usage::
2893: * Structure Naming Convention::
2894: * Structure Implementation::
2895: * Structure Glossary::
2896: @end menu
2897:
2898: @node Why explicit structure support?, Structure Usage, Structures, Structures
2899: @subsection Why explicit structure support?
2900:
2901: @cindex address arithmetic for structures
2902: @cindex structures using address arithmetic
2903: If we want to use a structure containing several fields, we could simply
2904: reserve memory for it, and access the fields using address arithmetic
2905: (@pxref{Address arithmetic}). As an example, consider a structure with
2906: the following fields
2907:
2908: @table @code
2909: @item a
2910: is a float
2911: @item b
2912: is a cell
2913: @item c
2914: is a float
2915: @end table
2916:
2917: Given the (float-aligned) base address of the structure we get the
2918: address of the field
2919:
2920: @table @code
2921: @item a
2922: without doing anything further.
2923: @item b
2924: with @code{float+}
2925: @item c
2926: with @code{float+ cell+ faligned}
2927: @end table
2928:
2929: It is easy to see that this can become quite tiring.
2930:
2931: Moreover, it is not very readable, because seeing a
2932: @code{cell+} tells us neither which kind of structure is
2933: accessed nor what field is accessed; we have to somehow infer the kind
2934: of structure, and then look up in the documentation which field of
2935: that structure corresponds to that offset.
2936:
2937: Finally, this kind of address arithmetic also causes maintenance
2938: troubles: If you add or delete a field somewhere in the middle of the
2939: structure, you have to find and change all computations for the fields
2940: afterwards.
2941:
2942: So, instead of using @code{cell+} and friends directly, how
2943: about storing the offsets in constants:
2944:
2945: @example
2946: 0 constant a-offset
2947: 0 float+ constant b-offset
2948: 0 float+ cell+ faligned constant c-offset
2949: @end example
2950:
2951: Now we can get the address of field @code{x} with @code{x-offset
2952: +}. This is much better in all respects. Of course, you still
2953: have to change all later offset definitions if you add a field. You can
2954: fix this by declaring the offsets in the following way:
2955:
2956: @example
2957: 0 constant a-offset
2958: a-offset float+ constant b-offset
2959: b-offset cell+ faligned constant c-offset
2960: @end example
2961:
2962: Since we always use the offsets with @code{+}, using a defining
2963: word @code{cfield} that includes the @code{+} in the
2964: action of the defined word offers itself:
2965:
2966: @example
2967: : cfield ( n "name" -- )
2968: create ,
2969: does> ( name execution: addr1 -- addr2 )
2970: @@ + ;
2971:
2972: 0 cfield a
2973: 0 a float+ cfield b
2974: 0 b cell+ faligned cfield c
2975: @end example
2976:
2977: Instead of @code{x-offset +}, we now simply write @code{x}.
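
A sketch of how these field words might be used; the definitions of
@code{cfield}, @code{a}, @code{b} and @code{c} are repeated from above
so that the example is self-contained, and @code{s1} is an invented
name. The base address is float-aligned explicitly, as the structure
requires:

@example
: cfield ( n "name" -- )
  create ,
  does> ( addr1 -- addr2 )
  @@ + ;

0 cfield a
0 a float+ cfield b
0 b cell+ faligned cfield c

falign here           \ float-aligned base address
0 c float+ allot      \ reserve room up to the end of field c
constant s1           \ s1 ( -- addr )

1e s1 a f!   42 s1 b !   2e s1 c f!
s1 b @@ .             \ prints 42
s1 c f@@ f.           \ prints the float stored in field c
@end example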
2978:
2979: The structure field words now can be used quite nicely. However,
2980: their definition is still a bit cumbersome: We have to repeat the
2981: name, the information about size and alignment is distributed before
2982: and after the field definitions etc. The structure package presented
2983: here addresses these problems.
2984:
2985: @node Structure Usage, Structure Naming Convention, Why explicit structure support?, Structures
2986: @subsection Structure Usage
2987: @cindex structure usage
2988:
2989: @cindex @code{field} usage
2990: @cindex @code{struct} usage
2991: @cindex @code{end-struct} usage
2992: You can define a structure for a (data-less) linked list with
2993: @example
2994: struct
2995: cell% field list-next
2996: end-struct list%
2997: @end example
2998:
2999: With the address of the list node on the stack, you can compute the
3000: address of the field that contains the address of the next node with
3001: @code{list-next}. E.g., you can determine the length of a list
3002: with:
3003:
3004: @example
3005: : list-length ( list -- n )
3006: \ "list" is a pointer to the first element of a linked list
3007: \ "n" is the length of the list
3008: 0 begin ( list1 n1 )
3009: over
3010: while ( list1 n1 )
3011: 1+ swap list-next @@ swap
3012: repeat
3013: nip ;
3014: @end example
3015:
3016: You can reserve memory for a list node in the dictionary with
3017: @code{list% %allot}, which leaves the address of the list node on the
3018: stack. For the equivalent allocation on the heap you can use @code{list%
3019: %alloc} (or, for an @code{allocate}-like stack effect (i.e., with ior),
3020: use @code{list% %allocate}). You can also get the size of a list
3021: node with @code{list% %size} and its alignment with @code{list%
3022: %alignment}.
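
The following sketch exercises these words on a small structure; the
names @code{pair%}, @code{pair-first}, @code{pair-second}, @code{p1}
and @code{p2} are hypothetical:

@example
struct
  cell% field pair-first
  cell% field pair-second
end-struct pair%

pair% %allot constant p1     \ a pair in the dictionary
1 p1 pair-first !   2 p1 pair-second !

pair% %alloc constant p2     \ a pair on the heap
p1 pair-first @@ p2 pair-first !

pair% %size .                \ prints the size of a pair in address units
p2 free throw                \ release the heap memory again
@end example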
3023:
3024: Note that in ANS Forth the body of a @code{create}d word is
3025: @code{aligned} but not necessarily @code{faligned};
3026: therefore, if you do a
3027: @example
3028: create @emph{name} foo% %allot
3029: @end example
3030:
3031: then the memory allotted for @code{foo%} is
3032: guaranteed to start at the body of @code{@emph{name}} only if
3033: @code{foo%} contains only character, cell and double fields.
3034:
3035: @cindex structures containing structures
3036: You can also include a structure @code{foo%} as a field of
3037: another structure, with:
3038: @example
3039: struct
3040: ...
3041: foo% field ...
3042: ...
3043: end-struct ...
3044: @end example
3045:
3046: @cindex structure extension
3047: @cindex extended records
3048: Instead of starting with an empty structure, you can also extend an
3049: existing structure. E.g., a plain linked list without data, as defined
3050: above, is hardly useful; you can extend it to a linked list of integers,
3051: like this:@footnote{This feature is also known as @emph{extended
3052: records}. It is the main innovation in the Oberon language; in other
3053: words, adding this feature to Modula-2 led Wirth to create a new
3054: language, write a new compiler etc. Adding this feature to Forth just
3055: requires a few lines of code.}
3056:
3057: @example
3058: list%
3059: cell% field intlist-int
3060: end-struct intlist%
3061: @end example
3062:
3063: @code{intlist%} is a structure with two fields:
3064: @code{list-next} and @code{intlist-int}.
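
A short usage sketch (the structure definitions are repeated from above
so that the example is self-contained; @code{n1} is an invented name).
The field word of the base structure works unchanged on the extended
structure:

@example
struct
  cell% field list-next
end-struct list%

list%
  cell% field intlist-int
end-struct intlist%

intlist% %allot constant n1
0 n1 list-next !        \ inherited field: end of the list
7 n1 intlist-int !      \ field added by the extension
n1 intlist-int @@ .     \ prints 7
@end example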
3065:
3066: @cindex structures containing arrays
3067: You can specify an array type containing @emph{n} elements of
3068: type @code{foo%} like this:
3069:
3070: @example
3071: foo% @emph{n} *
3072: @end example
3073:
3074: You can use this array type in any place where you can use a normal
3075: type, e.g., when defining a @code{field}, or with
3076: @code{%allot}.
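
For instance, a structure with an array field could look like this; the
names @code{vec%}, @code{vec-len}, @code{vec-data}, @code{vec-elem} and
@code{v1} are made up, and the array length 8 is arbitrary:

@example
struct
  cell%     field vec-len
  cell% 8 * field vec-data    \ an array of 8 cells
end-struct vec%

: vec-elem ( i vec -- addr )
  \ address of element i of the array field
  vec-data swap cells + ;

vec% %allot constant v1
8 v1 vec-len !
42 3 v1 vec-elem !
3 v1 vec-elem @@ .   \ prints 42
@end example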
3077:
3078: @cindex first field optimization
3079: The first field is at the base address of a structure and the word
3080: for this field (e.g., @code{list-next}) actually does not change
3081: the address on the stack. You may be tempted to leave it out in the
3082: interest of run-time and space efficiency. This is not necessary,
3083: because the structure package optimizes this case and compiling such
3084: words does not generate any code. So, in the interest of readability
3085: and maintainability you should include the word for the field when
3086: accessing the field.
3087:
3088: @node Structure Naming Convention, Structure Implementation, Structure Usage, Structures
3089: @subsection Structure Naming Convention
3090: @cindex structure naming conventions
3091:
3092: The field names that come to (my) mind are often quite generic, and,
3093: if used, would cause frequent name clashes. E.g., many structures
3094: probably contain a @code{counter} field. The structure names
3095: that come to (my) mind are often also the logical choice for the names
3096: of words that create such a structure.
3097:
3098: Therefore, I have adopted the following naming conventions:
3099:
3100: @itemize @bullet
3101: @cindex field naming convention
3102: @item
3103: The names of fields are of the form
3104: @code{@emph{struct}-@emph{field}}, where
3105: @code{@emph{struct}} is the basic name of the structure, and
3106: @code{@emph{field}} is the basic name of the field. You can
3107: think of field words as converting the (address of the)
3108: structure into the (address of the) field.
3109:
3110: @cindex structure naming convention
3111: @item
3112: The names of structures are of the form
3113: @code{@emph{struct}%}, where
3114: @code{@emph{struct}} is the basic name of the structure.
3115: @end itemize
3116:
3117: This naming convention does not work that well for fields of extended
3118: structures; e.g., the integer list structure has a field
3119: @code{intlist-int}, but has @code{list-next}, not
3120: @code{intlist-next}.
3121:
3122: @node Structure Implementation, Structure Glossary, Structure Naming Convention, Structures
3123: @subsection Structure Implementation
3124: @cindex structure implementation
3125: @cindex implementation of structures
3126:
3127: The central idea in the implementation is to pass the data about the
3128: structure being built on the stack, not in some global
3129: variable. Everything else falls into place naturally once this design
3130: decision is made.
3131:
3132: The type description on the stack is of the form @emph{align
3133: size}. Keeping the size on the top-of-stack makes dealing with arrays
3134: very simple.
3135:
3136: @code{field} is a defining word that uses @code{create}
3137: and @code{does>}. The body of the field contains the offset
3138: of the field, and the normal @code{does>} action is
3139:
3140: @example
3141: @@ +
3142: @end example
3143:
3144: i.e., add the offset to the address, giving the stack effect
3145: @code{addr1 -- addr2} for a field.
3146:
3147: @cindex first field optimization, implementation
3148: This simple structure is slightly complicated by the optimization
3149: for fields with offset 0, which requires a different
3150: @code{does>}-part (because we cannot rely on there being
3151: something on the stack if such a field is invoked during
3152: compilation). Therefore, we put the different @code{does>}-parts
3153: in separate words, and decide which one to invoke based on the
3154: offset. For a zero offset, the field is basically a noop; it is
3155: immediate, and therefore no code is generated when it is compiled.
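
The following is a simplified sketch of this design, not the actual
source of the package: the @code{my-} names and the @code{rec} example
are hypothetical, and the optimization for fields with offset 0 is left
out. It shows how the @emph{align size} pair is passed on the stack
from one field definition to the next:

@example
: my-naligned ( addr1 n -- addr2 )
  \ round addr1 up to a multiple of n (n must be a power of 2)
  1- tuck + swap invert and ;

: my-field ( align1 offset1 align size "name" -- align2 offset2 )
  create
  >r >r              ( align1 offset1 ) ( R: size align )
  r@ my-naligned     \ align the offset for this field
  dup ,              \ the body of the field word holds the offset
  swap r> max swap   \ structure alignment: the strictest field wins
  r> +               \ the next field starts after this one
  DOES> ( addr1 -- addr2 )
  @@ + ;

: my-end-struct ( align size "name" -- )
  over my-naligned   \ pad the size to a multiple of the alignment
  2constant ;        \ name ( -- align size )

1 0                  \ empty structure: alignment 1, size 0
  char% my-field rec-flag
  cell% my-field rec-value
my-end-struct rec%
@end example

Because the description of the structure lives on the stack between
the field definitions, no global state is needed while a structure is
being built.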
3156:
3157: @node Structure Glossary, , Structure Implementation, Structures
3158: @subsection Structure Glossary
3159: @cindex structure glossary
3160:
3161: doc-%align
3162: doc-%alignment
3163: doc-%alloc
3164: doc-%allocate
3165: doc-%allot
3166: doc-cell%
3167: doc-char%
3168: doc-dfloat%
3169: doc-double%
3170: doc-end-struct
3171: doc-field
3172: doc-float%
3173: doc-nalign
3174: doc-sfloat%
3175: doc-%size
3176: doc-struct
3177:
3178: @c -------------------------------------------------------------
1.12 ! anton 3179: @node Object-oriented Forth, Tokens for Words, Structures, Words
! 3180: @section Object-oriented Forth
! 3181:
! 3182: Gforth comes with three packages for object-oriented programming,
! 3183: @file{objects.fs}, @file{oof.fs}, and @file{mini-oof.fs}; none of them
! 3184: is preloaded, so you have to @code{include} them before use. The most
! 3185: important differences between these packages (and others) are discussed
! 3186: in @ref{Comparison with other object models}. All packages are written
! 3187: in ANS Forth and can be used with any other ANS Forth system.
! 3188:
! 3189: @menu
! 3190: * Objects::
! 3191: * OOF::
! 3192: * Mini-OOF::
! 3193: @end menu
! 3194:
! 3195: @node Objects, OOF, Object-oriented Forth, Object-oriented Forth
! 3196: @subsection Objects
1.5 anton 3197: @cindex objects
3198: @cindex object-oriented programming
3199:
3200: @cindex @file{objects.fs}
3201: @cindex @file{oof.fs}
1.12 ! anton 3202:
! 3203: This section describes the @file{objects.fs} package. This material has also been published as @cite{Yet Another Forth Objects Package} by Anton Ertl in Forth Dimensions 19(2), pages 37--43 (@url{http://www.complang.tuwien.ac.at/forth/objects/objects.html}).
1.5 anton 3204: @c McKewan's and Zsoter's packages
3205:
3206: This section assumes (in some places) that you have read @ref{Structures}.
3207:
3208: @menu
3209: * Properties of the Objects model::
3210: * Why object-oriented programming?::
3211: * Object-Oriented Terminology::
3212: * Basic Objects Usage::
3213: * The class Object::
3214: * Creating objects::
3215: * Object-Oriented Programming Style::
3216: * Class Binding::
3217: * Method conveniences::
3218: * Classes and Scoping::
3219: * Object Interfaces::
3220: * Objects Implementation::
3221: * Comparison with other object models::
3222: * Objects Glossary::
3223: @end menu
3224:
3225: Marcel Hendrix provided helpful comments on this section. Andras Zsoter
3226: and Bernd Paysan helped me with the related works section.
3227:
3228: @node Properties of the Objects model, Why object-oriented programming?, Objects, Objects
1.12 ! anton 3229: @subsubsection Properties of the @file{objects.fs} model
1.5 anton 3230: @cindex @file{objects.fs} properties
3231:
3232: @itemize @bullet
3233: @item
3234: It is straightforward to pass objects on the stack. Passing
3235: selectors on the stack is a little less convenient, but possible.
3236:
3237: @item
3238: Objects are just data structures in memory, and are referenced by
3239: their address. You can create words for objects with normal defining
3240: words like @code{constant}. Likewise, there is no difference
3241: between instance variables that contain objects and those
3242: that contain other data.
3243:
3244: @item
3245: Late binding is efficient and easy to use.
3246:
3247: @item
3248: It avoids parsing, and thus avoids problems with state-smartness
3249: and reduced extensibility; for convenience there are a few parsing
3250: words, but they have non-parsing counterparts. There are also a few
3251: defining words that parse. This is hard to avoid, because all standard
3252: defining words parse (except @code{:noname}); however, such
3253: words are not as bad as many other parsing words, because they are not
3254: state-smart.
3255:
3256: @item
3257: It does not try to incorporate everything. It does a few things
3258: and does them well (IMO). In particular, I did not intend to support
3259: information hiding with this model (although it has features that may
3260: help); you can use a separate package for achieving this.
3261:
3262: @item
3263: It is layered; you don't have to learn and use all features to use this
3264: model. Only a few features are necessary (@pxref{Basic Objects Usage},
3265: @pxref{The class Object}, @pxref{Creating objects}); the others
3266: are optional and independent of each other.
3267:
3268: @item
3269: An implementation in ANS Forth is available.
3270:
3271: @end itemize
3272:
3273: I have used the technique on which this model is based for
3274: implementing the parser generator Gray; we have also used this technique
3275: in Gforth for implementing the various flavours of wordlists (hashed or
3276: not, case-sensitive or not, special-purpose wordlists for locals etc.).
3277:
3278: @node Why object-oriented programming?, Object-Oriented Terminology, Properties of the Objects model, Objects
1.12 ! anton 3279: @subsubsection Why object-oriented programming?
1.5 anton 3280: @cindex object-oriented programming motivation
3281: @cindex motivation for object-oriented programming
3282:
3283: Often we have to deal with several data structures (@emph{objects}),
3284: that have to be treated similarly in some respects, but differ in
3285: others. Graphical objects are the textbook example: circles,
3286: triangles, dinosaurs, icons, and others, and we may want to add more
3287: during program development. We want to apply some operations to any
3288: graphical object, e.g., @code{draw} for displaying it on the
3289: screen. However, @code{draw} has to do something different for
3290: every kind of object.
3291:
3292: We could implement @code{draw} as a big @code{CASE}
3293: control structure that executes the appropriate code depending on the
3294: kind of object to be drawn. This would not be very elegant, and,
3295: moreover, we would have to change @code{draw} every time we add
3296: a new kind of graphical object (say, a spaceship).
3297:
3298: What we would rather do is: When defining spaceships, we would tell
3299: the system: "Here's how you @code{draw} a spaceship; you figure
3300: out the rest."
3301:
3302: This is the problem solved by all systems that (rightfully) call
3303: themselves object-oriented, and the object-oriented package I present
3304: here also solves this problem (and not much else).
3305:
3306: @node Object-Oriented Terminology, Basic Objects Usage, Why object-oriented programming?, Objects
1.12 ! anton 3307: @subsubsection Object-Oriented Terminology
1.5 anton 3308: @cindex object-oriented terminology
3309: @cindex terminology for object-oriented programming
3310:
3311: This section is mainly for reference, so you don't have to understand
3312: all of it right away. The terminology is mainly Smalltalk-inspired. In
3313: short:
3314:
3315: @table @emph
3316: @cindex class
3317: @item class
3318: a data structure definition with some extras.
3319:
3320: @cindex object
3321: @item object
3322: an instance of the data structure described by the class definition.
3323:
3324: @cindex instance variables
3325: @item instance variables
3326: fields of the data structure.
3327:
3328: @cindex selector
3329: @cindex method selector
3330: @cindex virtual function
3331: @item selector
3332: (or @emph{method selector}) a word (e.g.,
3333: @code{draw}) for performing an operation on a variety of data
3334: structures (classes). A selector describes @emph{what} operation to
3335: perform. In C++ terminology: a (pure) virtual function.
3336:
3337: @cindex method
3338: @item method
3339: the concrete definition that performs the operation
3340: described by the selector for a specific class. A method specifies
3341: @emph{how} the operation is performed for a specific class.
3342:
3343: @cindex selector invocation
3344: @cindex message send
3345: @cindex invoking a selector
3346: @item selector invocation
3347: a call of a selector. One argument of the call (the TOS (top-of-stack))
3348: is used for determining which method is used. In Smalltalk terminology:
3349: a message (consisting of the selector and the other arguments) is sent
3350: to the object.
3351:
3352: @cindex receiving object
3353: @item receiving object
3354: the object used for determining the method executed by a selector
3355: invocation. In our model it is the object that is on the TOS when the
3356: selector is invoked. (@emph{Receiving} comes from Smalltalk's
3357: @emph{message} terminology.)
3358:
3359: @cindex child class
3360: @cindex parent class
3361: @cindex inheritance
3362: @item child class
3363: a class that has (@emph{inherits}) all properties (instance variables,
3364: selectors, methods) from a @emph{parent class}. In Smalltalk
3365: terminology: The subclass inherits from the superclass. In C++
3366: terminology: The derived class inherits from the base class.
3367:
3368: @end table
3369:
3370: @c If you wonder about the message sending terminology, it comes from
3371: @c a time when each object had its own task and objects communicated via
3372: @c message passing; eventually the Smalltalk developers realized that
3373: @c they can do most things through simple (indirect) calls. They kept the
3374: @c terminology.
3375:
3376: @node Basic Objects Usage, The class Object, Object-Oriented Terminology, Objects
1.12 ! anton 3377: @subsubsection Basic Objects Usage
1.5 anton 3378: @cindex basic objects usage
3379: @cindex objects, basic usage
3380:
3381: You can define a class for graphical objects like this:
3382:
3383: @cindex @code{class} usage
3384: @cindex @code{end-class} usage
3385: @cindex @code{selector} usage
3386: @example
3387: object class \ "object" is the parent class
3388: selector draw ( x y graphical -- )
3389: end-class graphical
3390: @end example
3391:
3392: This code defines a class @code{graphical} with an
3393: operation @code{draw}. We can perform the operation
3394: @code{draw} on any @code{graphical} object, e.g.:
3395:
3396: @example
3397: 100 100 t-rex draw
3398: @end example
3399:
3400: where @code{t-rex} is a word (say, a constant) that produces a
3401: graphical object.
3402:
3403: @cindex abstract class
3404: How do we create a graphical object? With the present definitions,
3405: we cannot create a useful graphical object. The class
3406: @code{graphical} describes graphical objects in general, but not
3407: any concrete graphical object type (C++ users would call it an
3408: @emph{abstract class}); e.g., there is no method for the selector
3409: @code{draw} in the class @code{graphical}.
3410:
3411: For concrete graphical objects, we define child classes of the
3412: class @code{graphical}, e.g.:
3413:
3414: @cindex @code{overrides} usage
3415: @cindex @code{field} usage in class definition
3416: @example
3417: graphical class \ "graphical" is the parent class
3418: cell% field circle-radius
3419:
3420: :noname ( x y circle -- )
3421: circle-radius @@ draw-circle ;
3422: overrides draw
3423:
3424: :noname ( n-radius circle -- )
3425: circle-radius ! ;
3426: overrides construct
3427:
3428: end-class circle
3429: @end example
3430:
3431: Here we define a class @code{circle} as a child of @code{graphical},
3432: with a field @code{circle-radius} (which behaves just like a structure
3433: field, @pxref{Structures}); it defines new methods for the selectors
3434: @code{draw} and @code{construct} (@code{construct} is defined in
3435: @code{object}, the parent class of @code{graphical}).
3436:
3437: Now we can create a circle on the heap (i.e.,
3438: @code{allocate}d memory) with
3439:
3440: @cindex @code{heap-new} usage
3441: @example
3442: 50 circle heap-new constant my-circle
3443: @end example
3444:
3445: @code{heap-new} invokes @code{construct}, thus
3446: initializing the field @code{circle-radius} with 50. We can draw
3447: this new circle at (100,100) with
3448:
3449: @example
3450: 100 100 my-circle draw
3451: @end example
3452:
3453: @cindex selector invocation, restrictions
3454: @cindex class definition, restrictions
3455: Note: You can invoke a selector only if the object on the TOS
3456: (the receiving object) belongs to the class where the selector was
3457: defined or one of its descendents; e.g., you can invoke
3458: @code{draw} only for objects belonging to @code{graphical}
3459: or its descendents (e.g., @code{circle}). Immediately before
3460: @code{end-class}, the search order has to be the same as
3461: immediately after @code{class}.
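
As a self-contained sketch of the pattern described above, the following
code replaces drawing with printing so that it runs without a graphics
library. The names @code{shape}, @code{describe}, @code{round-shape} and
@code{square-shape} are invented for this illustration and are not part
of @file{objects.fs}; the code also assumes that @file{objects.fs} can be
loaded with @code{require}.

@example
require objects.fs

object class
  selector describe ( shape -- )
end-class shape

shape class
  :noname ( shape -- ) drop ." a circle" ;
  overrides describe
end-class round-shape

shape class
  :noname ( shape -- ) drop ." a square" ;
  overrides describe
end-class square-shape

round-shape heap-new constant s1
square-shape heap-new constant s2

s1 describe cr  \ late binding selects the method of round-shape
s2 describe cr  \ the same selector runs the method of square-shape
@end example

Adding a further kind of shape only requires another child class with its
own @code{describe} method; the code that invokes @code{describe} stays
unchanged.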
3462:
3463: @node The class Object, Creating objects, Basic Objects Usage, Objects
1.12 ! anton 3464: @subsubsection The class @code{object}
1.5 anton 3465: @cindex @code{object} class
3466:
3467: When you define a class, you have to specify a parent class. So how do
3468: you start defining classes? There is one class available from the start:
3469: @code{object}. You can use it as ancestor for all classes. It is the
3470: only class that has no parent. It has two selectors: @code{construct}
3471: and @code{print}.
3472:
3473: @node Creating objects, Object-Oriented Programming Style, The class Object, Objects
1.12 ! anton 3474: @subsubsection Creating objects
1.5 anton 3475: @cindex creating objects
3476: @cindex object creation
3477: @cindex object allocation options
3478:
3479: @cindex @code{heap-new} discussion
3480: @cindex @code{dict-new} discussion
3481: @cindex @code{construct} discussion
3482: You can create and initialize an object of a class on the heap with
3483: @code{heap-new} ( ... class -- object ) and in the dictionary
3484: (allocation with @code{allot}) with @code{dict-new} (
3485: ... class -- object ). Both words invoke @code{construct}, which
3486: consumes the stack items indicated by "..." above.
3487:
3488: @cindex @code{init-object} discussion
3489: @cindex @code{class-inst-size} discussion
3490: If you want to allocate memory for an object yourself, you can get its
3491: alignment and size with @code{class-inst-size 2@@} ( class --
3492: align size ). Once you have memory for an object, you can initialize
3493: it with @code{init-object} ( ... class object -- );
3494: @code{construct} does only a part of the necessary work.
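
The three ways of creating an object can be compared in a small sketch.
The class @code{point} and its fields are hypothetical names chosen for
this example. The manual allocation ignores the alignment on the
assumption that @code{allocate}d memory is suitably aligned for the
class; that assumption should be checked for classes with unusual
alignment requirements.

@example
require objects.fs

object class
  cell% field point-x
  cell% field point-y

  :noname ( x y point -- )
    tuck point-y ! point-x ! ;
  overrides construct
end-class point

1 2 point heap-new constant p1   \ memory obtained with ALLOCATE
3 4 point dict-new constant p2   \ memory obtained with ALLOT

\ Allocating the memory yourself, then initializing it:
point class-inst-size 2@@ ( align size ) nip allocate throw constant p3
5 6 point p3 init-object         \ prepares p3, including construct
@end example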
3495:
3496: @node Object-Oriented Programming Style, Class Binding, Creating objects, Objects
1.12 ! anton 3497: @subsubsection Object-Oriented Programming Style
1.5 anton 3498: @cindex object-oriented programming style
3499:
3500: This section is not exhaustive.
3501:
3502: @cindex stack effects of selectors
3503: @cindex selectors and stack effects
3504: In general, it is a good idea to ensure that all methods for the
3505: same selector have the same stack effect: when you invoke a selector,
3506: you often have no idea which method will be invoked, so, unless all
3507: methods have the same stack effect, you will not know the stack effect
3508: of the selector invocation.
3509:
3510: One exception to this rule is methods for the selector
3511: @code{construct}. We know which method is invoked, because we
3512: specify the class to be constructed at the same place. Actually, I
3513: defined @code{construct} as a selector only to give the users a
3514: convenient way to specify initialization. The way it is used, a
3515: mechanism different from selector invocation would be more natural
3516: (but probably would take more code and more space to explain).
3517:
3518: @node Class Binding, Method conveniences, Object-Oriented Programming Style, Objects
1.12 ! anton 3519: @subsubsection Class Binding
1.5 anton 3520: @cindex class binding
3521: @cindex early binding
3522:
3523: @cindex late binding
3524: Normal selector invocations determine the method at run-time depending
3525: on the class of the receiving object (late binding).
3526:
3527: Sometimes we want to invoke a different method. E.g., assume that
3528: you want to use the simple method for @code{print}ing
3529: @code{object}s instead of the possibly long-winded
3530: @code{print} method of the receiver class. You can achieve this
3531: by replacing the invocation of @code{print} with
3532:
3533: @cindex @code{[bind]} usage
3534: @example
3535: [bind] object print
3536: @end example
3537:
3538: in compiled code or
3539:
3540: @cindex @code{bind} usage
3541: @example
3542: bind object print
3543: @end example
3544:
3545: @cindex class binding, alternative to
3546: in interpreted code. Alternatively, you can define the method with a
3547: name (e.g., @code{print-object}), and then invoke it through the
3548: name. Class binding is just an (often more convenient) way to achieve
3549: the same effect; it avoids name clutter and allows you to invoke
3550: methods directly without naming them first.
3551:
3552: @cindex superclass binding
3553: @cindex parent class binding
3554: A frequent use of class binding is this: When we define a method
3555: for a selector, we often want the method to do what the selector does
3556: in the parent class, and a little more. There is a special word for
3557: this purpose: @code{[parent]}; @code{[parent]
3558: @emph{selector}} is equivalent to @code{[bind] @emph{parent
3559: selector}}, where @code{@emph{parent}} is the parent
3560: class of the current class. E.g., a method definition might look like:
3561:
3562: @cindex @code{[parent]} usage
3563: @example
3564: :noname
3565: dup [parent] foo \ do parent's foo on the receiving object
3566: ... \ do some more
3567: ; overrides foo
3568: @end example
3569:
3570: @cindex class binding as optimization
3571: In @cite{Object-oriented programming in ANS Forth} (Forth Dimensions,
3572: March 1997), Andrew McKewan presents class binding as an optimization
3573: technique. I recommend not using it for this purpose unless you are in
3574: an emergency. Late binding is pretty fast with this model anyway, so the
3575: benefit of using class binding is small; the cost of using class binding
3576: where it is not appropriate is reduced maintainability.
3577:
3578: While we are at programming style questions: You should bind
3579: selectors only to ancestor classes of the receiving object. E.g., say,
3580: you know that the receiving object is of class @code{foo} or its
3581: descendents; then you should bind only to @code{foo} and its
3582: ancestors.
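
The difference between late binding and class binding can be seen in the
following sketch. The classes @code{greeter} and @code{polite-greeter}
and the selector @code{greet} are made up for this example; in line with
the advice above, the binding is only to an ancestor of the receiving
object's class.

@example
require objects.fs

object class
  selector greet ( object -- )
  :noname ( object -- ) drop ." hello" ;
  overrides greet
end-class greeter

greeter class
  :noname ( object -- )
    [parent] greet   \ early-bound call of the method defined in greeter
    ."  again" ;
  overrides greet
end-class polite-greeter

polite-greeter heap-new constant g

g greet cr               \ late binding: the method of polite-greeter
g bind greeter greet cr  \ class binding: the method of greeter
@end example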
3583:
3584: @node Method conveniences, Classes and Scoping, Class Binding, Objects
1.12 ! anton 3585: @subsubsection Method conveniences
1.5 anton 3586: @cindex method conveniences
3587:
3588: In a method you usually access the receiving object pretty often. If
3589: you define the method as a plain colon definition (e.g., with
3590: @code{:noname}), you may have to do a lot of stack
3591: gymnastics. To avoid this, you can define the method with @code{m:
3592: ... ;m}. E.g., you could define the method for
3593: @code{draw}ing a @code{circle} with
3594:
3595: @cindex @code{this} usage
3596: @cindex @code{m:} usage
3597: @cindex @code{;m} usage
3598: @example
3599: m: ( x y circle -- )
3600: ( x y ) this circle-radius @@ draw-circle ;m
3601: @end example
3602:
3603: @cindex @code{exit} in @code{m: ... ;m}
3604: @cindex @code{exitm} discussion
3605: @cindex @code{catch} in @code{m: ... ;m}
3606: When this method is executed, the receiver object is removed from the
3607: stack; you can access it with @code{this} (admittedly, in this
3608: example the use of @code{m: ... ;m} offers no advantage). Note
3609: that I specify the stack effect for the whole method (i.e. including
3610: the receiver object), not just for the code between @code{m:}
3611: and @code{;m}. You cannot use @code{exit} in
3612: @code{m:...;m}; instead, use
3613: @code{exitm}.@footnote{Moreover, for any word that calls
3614: @code{catch} and was defined before loading
3615: @code{objects.fs}, you have to redefine it like I redefined
3616: @code{catch}: @code{: catch this >r catch r> to-this ;}}
3617:
3618: @cindex @code{inst-var} usage
3619: You will frequently use sequences of the form @code{this
3620: @emph{field}} (in the example above: @code{this
3621: circle-radius}). If you use the field only in this way, you can
3622: define it with @code{inst-var} and eliminate the
3623: @code{this} before the field name. E.g., the @code{circle}
3624: class above could also be defined with:
3625:
3626: @example
3627: graphical class
3628: cell% inst-var radius
3629:
3630: m: ( x y circle -- )
3631: radius @@ draw-circle ;m
3632: overrides draw
3633:
3634: m: ( n-radius circle -- )
3635: radius ! ;m
3636: overrides construct
3637:
3638: end-class circle
3639: @end example
3640:
3641: @code{radius} can only be used in @code{circle} and its
3642: descendent classes and inside @code{m:...;m}.
3643:
3644: @cindex @code{inst-value} usage
3645: You can also define fields with @code{inst-value}, which is
3646: to @code{inst-var} what @code{value} is to
3647: @code{variable}. You can change the value of such a field with
3648: @code{[to-inst]}. E.g., we could also define the class
3649: @code{circle} like this:
3650:
3651: @example
3652: graphical class
3653: inst-value radius
3654:
3655: m: ( x y circle -- )
3656: radius draw-circle ;m
3657: overrides draw
3658:
3659: m: ( n-radius circle -- )
3660: [to-inst] radius ;m
3661: overrides construct
3662:
3663: end-class circle
3664: @end example
3665:
3666:
3667: @node Classes and Scoping, Object Interfaces, Method conveniences, Objects
1.12 ! anton 3668: @subsubsection Classes and Scoping
1.5 anton 3669: @cindex classes and scoping
3670: @cindex scoping and classes
3671:
3672: Inheritance is frequent, unlike structure extension. This exacerbates
3673: the problem with the field name convention (@pxref{Structure Naming
3674: Convention}): One always has to remember in which class the field was
3675: originally defined; changing a part of the class structure would require
3676: changes for renaming in otherwise unaffected code.
3677:
3678: @cindex @code{inst-var} visibility
3679: @cindex @code{inst-value} visibility
3680: To solve this problem, I added a scoping mechanism (which was not in my
3681: original charter): A field defined with @code{inst-var} (or
3682: @code{inst-value}) is visible only in the class where it is defined and in
3683: the descendent classes of this class. Using such fields only makes
3684: sense in @code{m:}-defined methods in these classes anyway.
3685:
3686: This scoping mechanism allows us to use the unadorned field name,
3687: because name clashes with unrelated words become much less likely.
3688:
3689: @cindex @code{protected} discussion
3690: @cindex @code{private} discussion
3691: Once we have this mechanism, we can also use it for controlling the
3692: visibility of other words: All words defined after
3693: @code{protected} are visible only in the current class and its
3694: descendents. @code{public} restores the compilation
3695: (i.e. @code{current}) wordlist that was in effect before. If you
3696: have several @code{protected}s without an intervening
3697: @code{public} or @code{set-current}, @code{public}
3698: will restore the compilation wordlist in effect before the first of
3699: these @code{protected}s.
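
A minimal sketch of @code{protected} and @code{public} follows. The
class @code{counter}, the helper @code{bump} and the selector
@code{tick} are hypothetical names used only for illustration.

@example
require objects.fs

object class
  cell% field counter-ticks

  :noname ( counter -- ) 0 swap counter-ticks ! ;
  overrides construct

  protected
  \ helper, visible only in counter and its descendents
  : bump ( counter -- ) 1 swap counter-ticks +! ;
  public

  selector tick ( counter -- )
  :noname ( counter -- ) bump ;
  overrides tick
end-class counter

counter heap-new constant c1
c1 tick  c1 tick
c1 counter-ticks @@ .   \ prints 2
\ After end-class, bump is no longer in the search order.
@end example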
3700:
3701: @node Object Interfaces, Objects Implementation, Classes and Scoping, Objects
1.12 ! anton 3702: @subsubsection Object Interfaces
1.5 anton 3703: @cindex object interfaces
3704: @cindex interfaces for objects
3705:
3706: In this model you can only call selectors defined in the class of the
3707: receiving objects or in one of its ancestors. If you call a selector
3708: with a receiving object that is not in one of these classes, the
3709: result is undefined; if you are lucky, the program crashes
3710: immediately.
3711:
3712: @cindex selectors common to hardly-related classes
3713: Now consider the case when you want to have a selector (or several)
3714: available in two classes: You would have to add the selector to a
3715: common ancestor class, in the worst case to @code{object}. You
3716: may not want to do this, e.g., because someone else is responsible for
3717: this ancestor class.
3718:
3719: The solution for this problem is interfaces. An interface is a
3720: collection of selectors. If a class implements an interface, the
3721: selectors become available to the class and its descendents. A class
3722: can implement an unlimited number of interfaces. For the problem
3723: discussed above, we would define an interface for the selector(s), and
3724: both classes would implement the interface.
3725:
3726: As an example, consider an interface @code{storage} for
3727: writing objects to disk and getting them back, and a class
3728: @code{foo} that implements it. The code for this would look
3729: like this:
3730:
3731: @cindex @code{interface} usage
3732: @cindex @code{end-interface} usage
3733: @cindex @code{implementation} usage
3734: @example
3735: interface
3736: selector write ( file object -- )
3737: selector read1 ( file object -- )
3738: end-interface storage
3739:
3740: bar class
3741: storage implementation
3742:
3743: ... overrides write
3744: ... overrides read1
3745: ...
3746: end-class foo
3747: @end example
3748:
3749: (I would add a word @code{read} ( file -- object ) that uses
3750: @code{read1} internally, but that's beyond the point illustrated
3751: here.)
3752:
3753: Note that you cannot use @code{protected} in an interface; and
3754: of course you cannot define fields.
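
For a version that can be loaded as it stands, consider the following
sketch, in which two otherwise unrelated classes implement the same
interface. The names @code{describable}, @code{describe}, @code{apple}
and @code{bolt} are assumptions made for this example.

@example
require objects.fs

interface
  selector describe ( object -- )
end-interface describable

object class
  describable implementation
  :noname ( object -- ) drop ." an apple" ;
  overrides describe
end-class apple

object class
  describable implementation
  :noname ( object -- ) drop ." a bolt" ;
  overrides describe
end-class bolt

apple heap-new constant a1
bolt heap-new constant b1

a1 describe cr  \ runs the method given in apple
b1 describe cr  \ runs the method given in bolt
@end example

Neither class had to add @code{describe} to @code{object} or to any
other common ancestor.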
3755:
3756: In the Neon model, all selectors are available for all classes;
3757: therefore it does not need interfaces. The price you pay in this model
3758: is slower late binding, and therefore, added complexity to avoid late
3759: binding.
3760:
3761: @node Objects Implementation, Comparison with other object models, Object Interfaces, Objects
1.12 ! anton 3762: @subsubsection @file{objects.fs} Implementation
1.5 anton 3763: @cindex @file{objects.fs} implementation
3764:
3765: @cindex @code{object-map} discussion
3766: An object is a piece of memory, like one of the data structures
3767: described with @code{struct...end-struct}. It has a field
3768: @code{object-map} that points to the method map for the object's
3769: class.
3770:
3771: @cindex method map
3772: @cindex virtual function table
3773: The @emph{method map}@footnote{This is Self terminology; in C++
3774: terminology: virtual function table.} is an array that contains the
3775: execution tokens (XTs) of the methods for the object's class. Each
3776: selector contains an offset into the method maps.
3777:
3778: @cindex @code{selector} implementation, class
3779: @code{selector} is a defining word that uses
3780: @code{create} and @code{does>}. The body of the
3781: selector contains the offset; the @code{does>} action for a
3782: class selector is, basically:
3783:
3784: @example
3785: ( object addr ) @@ over object-map @@ + @@ execute
3786: @end example
3787:
3788: Since @code{object-map} is the first field of the object, it
3789: does not generate any code. As you can see, calling a selector has a
3790: small, constant cost.
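
The dispatch mechanism can be tried out in isolation with a toy model
written in plain Forth. This is only a sketch of the idea, not the
source of @file{objects.fs}: the words @code{toy-selector},
@code{toy-describe}, the maps and the objects are all invented here, and
each toy object consists of nothing but the cell that points to its
method map.

@example
: toy-selector ( offset "name" -- )
  create ,
  does> ( ... object addr -- ... )
    @@ over @@ + @@ execute ;  \ offset, then map, then XT

0 cells toy-selector toy-describe  \ first slot of every map

: describe-a ( object -- ) drop ." kind A" ;
: describe-b ( object -- ) drop ." kind B" ;

create map-a ' describe-a ,   \ method map with a single entry
create map-b ' describe-b ,

create obj-a map-a ,          \ the first cell holds the map address
create obj-b map-b ,

obj-a toy-describe cr         \ executes describe-a
obj-b toy-describe cr         \ executes describe-b
@end example

The work done per invocation is a few fetches, an addition and an
@code{execute}, regardless of how many classes or selectors exist.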
3791:
3792: @cindex @code{current-interface} discussion
3793: @cindex class implementation and representation
3794: A class is basically a @code{struct} combined with a method
3795: map. During the class definition the alignment and size of the class
3796: are passed on the stack, just as with @code{struct}s, so
3797: @code{field} can also be used for defining class
3798: fields. However, passing more items on the stack would be
3799: inconvenient, so @code{class} builds a data structure in memory,
3800: which is accessed through the variable
3801: @code{current-interface}. After its definition is complete, the
3802: class is represented on the stack by a pointer (e.g., as parameter for
3803: a child class definition).
3804:
3805: At the start, a new class has the alignment and size of its parent,
3806: and a copy of the parent's method map. Defining new fields extends the
3807: size and alignment; likewise, defining new selectors extends the
3808: method map. @code{overrides} just stores a new XT in the method
3809: map at the offset given by the selector.
3810:
3811: @cindex class binding, implementation
3812: Class binding just gets the XT at the offset given by the selector
3813: from the class's method map and @code{compile,}s (in the case of
3814: @code{[bind]}) it.
3815:
3816: @cindex @code{this} implementation
3817: @cindex @code{catch} and @code{this}
3818: @cindex @code{this} and @code{catch}
3819: I implemented @code{this} as a @code{value}. At the
3820: start of an @code{m:...;m} method the old @code{this} is
3821: stored to the return stack and restored at the end; and the object on
3822: the TOS is stored @code{TO this}. This technique has one
3823: disadvantage: If the user does not leave the method via
3824: @code{;m}, but via @code{throw} or @code{exit},
3825: @code{this} is not restored (and @code{exit} may
3826: crash). To deal with the @code{throw} problem, I have redefined
3827: @code{catch} to save and restore @code{this}; the same
3828: should be done with any word that can catch an exception. As for
3829: @code{exit}, I simply forbid it (as a replacement, there is
3830: @code{exitm}).
3831:
3832: @cindex @code{inst-var} implementation
3833: @code{inst-var} is just the same as @code{field}, with
3834: a different @code{does>} action:
3835: @example
3836: @@ this +
3837: @end example
3838: @code{inst-value} is implemented similarly.
3839:
3840: @cindex class scoping implementation
3841: Each class also has a wordlist that contains the words defined with
3842: @code{inst-var} and @code{inst-value}, and its protected
3843: words. It also has a pointer to its parent. @code{class} pushes
3844: the wordlists of the class and all its ancestors on the search order,
3845: and @code{end-class} drops them.
3846:
3847: @cindex interface implementation
3848: An interface is like a class without fields, parent and protected
3849: words; i.e., it just has a method map. If a class implements an
3850: interface, its method map contains a pointer to the method map of the
3851: interface. The positive offsets in the map are reserved for class
3852: methods, therefore interface map pointers have negative
3853: offsets. Interfaces have offsets that are unique throughout the
3854: system, unlike class selectors, whose offsets are only unique for the
3855: classes where the selector is available (invokable).
3856:
3857: This structure means that interface selectors have to perform one
3858: indirection more than class selectors to find their method. Their body
3859: contains the interface map pointer offset in the class method map, and
3860: the method offset in the interface method map. The
3861: @code{does>} action for an interface selector is, basically:
3862:
3863: @example
3864: ( object selector-body )
3865: 2dup selector-interface @@ ( object selector-body object interface-offset )
3866: swap object-map @@ + @@ ( object selector-body map )
3867: swap selector-offset @@ + @@ execute
3868: @end example
3869:
3870: where @code{object-map} and @code{selector-offset} are
3871: first fields and generate no code.
3872:
3873: As a concrete example, consider the following code:
3874:
3875: @example
3876: interface
3877: selector if1sel1
3878: selector if1sel2
3879: end-interface if1
3880:
3881: object class
3882: if1 implementation
3883: selector cl1sel1
3884: cell% inst-var cl1iv1
3885:
3886: ' m1 overrides construct
3887: ' m2 overrides if1sel1
3888: ' m3 overrides if1sel2
3889: ' m4 overrides cl1sel1
3890: end-class cl1
3891:
3892: create obj1 object dict-new drop
3893: create obj2 cl1 dict-new drop
3894: @end example
3895:
3896: The data structure created by this code (including the data structure
3897: for @code{object}) is shown in the figure
3898: (@file{objects-implementation.eps}), assuming a cell size of 4.
3899:
3900: @node Comparison with other object models, Objects Glossary, Objects Implementation, Objects
1.12 ! anton 3901: @subsubsection Comparison with other object models
1.5 anton 3902: @cindex comparison of object models
3903: @cindex object models, comparison
3904:
3905: Many object-oriented Forth extensions have been proposed (@cite{A survey
3906: of object-oriented Forths} (SIGPLAN Notices, April 1996) by Bradford
3907: J. Rodriguez and W. F. S. Poehlman lists 17). Here I'll discuss the
3908: relation of @file{objects.fs} to two well-known and two closely-related
3909: (by the use of method maps) models.
3910:
3911: @cindex Neon model
3912: The most popular model currently seems to be the Neon model (see
3913: @cite{Object-oriented programming in ANS Forth} (Forth Dimensions, March
3914: 1997) by Andrew McKewan). The Neon model uses a @code{@emph{selector
3915: object}} syntax, which makes it unnatural to pass objects on the
3916: stack. It also requires that the selector parses the input stream (at
3917: compile time); this leads to reduced extensibility and to bugs that are
3918: hard to find. Finally, it allows using every selector on every object;
3919: this eliminates the need for classes, but makes it harder to create
3920: efficient implementations. A longer version of this critique can be
3921: found in @cite{On Standardizing Object-Oriented Forth Extensions} (Forth
3922: Dimensions, May 1997) by Anton Ertl.
3923:
3924: @cindex Pountain's object-oriented model
3925: Another well-known publication is @cite{Object-Oriented Forth} (Academic
3926: Press, London, 1987) by Dick Pountain. However, it is not really about
3927: object-oriented programming, because it hardly deals with late
3928: binding. Instead, it focuses on features like information hiding and
3929: overloading that are characteristic of modular languages like Ada (83).
3930:
3931: @cindex Zsoter's object-oriented model
3932: In @cite{Does late binding have to be slow?} (Forth Dimensions ??? 1996)
3933: Andras Zsoter describes a model that makes heavy use of an active object
3934: (like @code{this} in @file{objects.fs}): The active object is not only
3935: used for accessing all fields, but also specifies the receiving object
3936: of every selector invocation; you have to change the active object
3937: explicitly with @code{@{ ... @}}, whereas in @file{objects.fs} it
3938: changes more or less implicitly at @code{m: ... ;m}. Such a change at
3939: the method entry point is unnecessary with Zsoter's model, because
3940: the receiving object is the active object already; OTOH, the explicit
3941: change is absolutely necessary in that model, because otherwise no one
3942: could ever change the active object. An ANS Forth implementation of this
3943: model is available at @url{http://www.forth.org/fig/oopf.html}.
3944:
1.12 ! anton 3945: @cindex @file{oof.fs}, differences to other models
1.5 anton 3946: The @file{oof.fs} model combines information hiding and overloading
3947: resolution (by keeping names in various wordlists) with object-oriented
3948: programming. It sets the active object implicitly on method entry, but
3949: also allows explicit changing (with @code{>o...o>} or with
3950: @code{with...endwith}). It uses parsing and state-smart objects and
3951: classes for resolving overloading and for early binding: the object or
3952: class parses the selector and determines the method from this. If the
3953: selector is not parsed by an object or class, it performs a call to the
3954: selector for the active object (late binding), like Zsoter's model.
3955: Fields are always accessed through the active object. The big
3956: disadvantage of this model is the parsing and the state-smartness, which
3957: reduces extensibility and increases the opportunities for subtle bugs;
3958: essentially, you are only safe if you never tick or @code{postpone} an
1.12 ! anton 3959: object or class (Bernd disagrees, but I (Anton) am not convinced).
! 3960:
! 3961: @cindex @file{mini-oof.fs}, differences to other models
! 3962: The Mini-OOF model is quite similar to a very stripped-down version of
! 3963: the Objects model, but syntactically it is a mixture of the Objects and
! 3964: the OOF model.
! 3965:
1.5 anton 3966:
3967: @node Objects Glossary, , Comparison with other object models, Objects
1.12 ! anton 3968: @subsubsection @file{objects.fs} Glossary
1.5 anton 3969: @cindex @file{objects.fs} Glossary
3970:
3971: doc-bind
3972: doc-<bind>
3973: doc-bind'
3974: doc-[bind]
3975: doc-class
3976: doc-class->map
3977: doc-class-inst-size
3978: doc-class-override!
3979: doc-construct
3980: doc-current'
3981: doc-[current]
3982: doc-current-interface
3983: doc-dict-new
3984: doc-drop-order
3985: doc-end-class
3986: doc-end-class-noname
3987: doc-end-interface
3988: doc-end-interface-noname
3989: doc-exitm
3990: doc-heap-new
3991: doc-implementation
3992: doc-init-object
3993: doc-inst-value
3994: doc-inst-var
3995: doc-interface
3996: doc-;m
3997: doc-m:
3998: doc-method
3999: doc-object
4000: doc-overrides
4001: doc-[parent]
4002: doc-print
4003: doc-protected
4004: doc-public
4005: doc-push-order
4006: doc-selector
4007: doc-this
4008: doc-<to-inst>
4009: doc-[to-inst]
4010: doc-to-this
4011: doc-xt-new
4012:
4013: @c -------------------------------------------------------------
1.12 ! anton 4014: @node OOF, Mini-OOF, Objects, Object-oriented Forth
! 4015: @subsection OOF
1.6 pazsan 4016: @cindex oof
4017: @cindex object-oriented programming
4018:
4019: @cindex @file{objects.fs}
4020: @cindex @file{oof.fs}
1.12 ! anton 4021:
! 4022: This section describes the @file{oof.fs} package. It relies on the same
! 4023: rationale for using object-oriented programming, and on the same
1.6 pazsan 4024: terminology, as the section on @file{objects.fs}.
4025:
4026: The package described in this section has been used in bigFORTH since
4027: 1991, and has been used for two large applications: a chromatographic
4028: system used to create new medicines, and a graphical user interface library (MINOS).
4029:
1.12 ! anton 4030: You can find a description (in German) of @file{oof.fs} in @cite{Object
! 4031: oriented bigFORTH} by Bernd Paysan, published in @cite{Vierte Dimension}
! 4032: 10(2), 1994.
! 4033:
1.6 pazsan 4034: @menu
4035: * Properties of the OOF model::
4036: * Basic OOF Usage::
4037: * The base class object::
1.7 pazsan 4038: * Class Declaration::
4039: * Class Implementation::
1.6 pazsan 4040: @end menu
4041:
1.12 ! anton 4042: @node Properties of the OOF model, Basic OOF Usage, OOF, OOF
! 4043: @subsubsection Properties of the OOF model
1.6 pazsan 4044: @cindex @file{oof.fs} properties
4045:
4046: @itemize @bullet
4047: @item
4048: This model combines object oriented programming with information
4049: hiding. It helps you write large applications, where scoping is
4050: necessary, because it provides class-oriented scoping.
4051:
4052: @item
4053: Named objects, object pointers, and object arrays can be created;
4054: selector invocation uses the ``object selector'' syntax. Selector invocation
4055: to objects and/or selectors on the stack is a bit less convenient, but
4056: possible.
4057:
4058: @item
4059: Selector invocation and instance variable usage of the active object is
4060: straightforward, since both make use of the active object.
4061:
4062: @item
4063: Late binding is efficient and easy to use.
4064:
4065: @item
4066: State-smart objects parse selectors. However, extensibility is provided
4067: using a (parsing) selector @code{postpone} and a selector @code{'}.
4068:
4069: @item
4070: An implementation in ANS Forth is available.
4071:
4072: @end itemize
4073:
4074:
1.12 ! anton 4075: @node Basic OOF Usage, The base class object, Properties of the OOF model, OOF
! 4076: @subsubsection Basic OOF Usage
1.6 pazsan 4077: @cindex @file{oof.fs} usage
4078:
4079: Here, I use the same example as for @code{objects} (@pxref{Basic Objects Usage}).
4080:
4081: You can define a class for graphical objects like this:
4082:
4083: @cindex @code{class} usage
4084: @cindex @code{class;} usage
4085: @cindex @code{method} usage
4086: @example
4087: object class graphical \ "object" is the parent class
4088: method draw ( x y graphical -- )
4089: class;
4090: @end example
4091:
4092: This code defines a class @code{graphical} with an
4093: operation @code{draw}. We can perform the operation
4094: @code{draw} on any @code{graphical} object, e.g.:
4095:
4096: @example
4097: 100 100 t-rex draw
4098: @end example
4099:
4100: where @code{t-rex} is an object or object pointer, created with e.g.
4101: @code{graphical : t-rex}.
4102:
4103: @cindex abstract class
4104: How do we create a graphical object? With the present definitions,
4105: we cannot create a useful graphical object. The class
4106: @code{graphical} describes graphical objects in general, but not
4107: any concrete graphical object type (C++ users would call it an
4108: @emph{abstract class}); e.g., there is no method for the selector
4109: @code{draw} in the class @code{graphical}.
4110:
4111: For concrete graphical objects, we define child classes of the
4112: class @code{graphical}, e.g.:
4113:
4114: @example
4115: graphical class circle \ "graphical" is the parent class
4116: cell var circle-radius
4117: how:
4118: : draw ( x y -- )
4119: circle-radius @@ draw-circle ;
4120:
4121: : init ( n-radius -- )
4122: circle-radius ! ;
4123: class;
4124: @end example
4125:
4126: Here we define a class @code{circle} as a child of @code{graphical},
4127: with a field @code{circle-radius}; it defines new methods for the
4128: selectors @code{draw} and @code{init} (@code{init} is defined in
4129: @code{object}, the parent class of @code{graphical}).
4130:
4131: Now we can create a circle in the dictionary with
4132:
4133: @example
4134: 50 circle : my-circle
4135: @end example
4136:
4137: @code{:} invokes @code{init}, thus initializing the field
4138: @code{circle-radius} with 50. We can draw this new circle at (100,100)
4139: with
4140:
4141: @example
4142: 100 100 my-circle draw
4143: @end example
4144:
4145: @cindex selector invocation, restrictions
4146: @cindex class definition, restrictions
4147: Note: You can invoke a selector only if the receiving object belongs to
4148: the class where the selector was defined or one of its descendants;
4149: e.g., you can invoke @code{draw} only for objects belonging to
4150: @code{graphical} or its descendants (e.g., @code{circle}). The scoping
1.7 pazsan 4151: mechanism will check if you try to invoke a selector that is not
1.6 pazsan 4152: defined in this class hierarchy, so you'll get an error at compilation
4153: time.
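
As a further sketch of the same pattern, a second child class could look
like this. The class @code{square}, its field @code{square-side}, and
the drawing word @code{draw-square} are hypothetical names introduced
only for illustration; like @code{draw-circle} above, @code{draw-square}
is assumed to be defined elsewhere.

@example
graphical class square \ "graphical" is the parent class
cell var square-side
how:
: draw ( x y -- )
square-side @@ draw-square ;

: init ( n-side -- )
square-side ! ;
class;

30 square : my-square \ ":" invokes init, which stores 30
200 100 my-square draw
@end example

Because @code{draw} was declared in @code{graphical}, the same selector
works for both @code{my-circle} and @code{my-square}.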
4154:
4155:
1.12 ! anton 4156: @node The base class object, Class Declaration, Basic OOF Usage, OOF
! 4157: @subsubsection The base class @file{object}
1.6 pazsan 4158: @cindex @file{oof.fs} base class
4159:
4160: When you define a class, you have to specify a parent class. So how do
4161: you start defining classes? There is one class available from the start:
4162: @code{object}. You have to use it as the ancestor of all classes. It is the
4163: only class that has no parent. Classes are also objects, except that
4164: they don't have instance variables; class manipulation such as
4165: inheritance or changing definitions of a class is handled through
4166: selectors of the class @code{object}.
4167:
4168: @code{object} provides a number of selectors:
4169:
4170: @itemize @bullet
4171: @item
4172: @code{class} for subclassing, @code{definitions} to add definitions
4173: later on, and @code{class?} to get type information (is the class a
4174: subclass of the class passed on the stack?).
1.7 pazsan 4175: doc---object-class
4176: doc---object-definitions
4177: doc---object-class?
1.6 pazsan 4178:
4179: @item
4180: @code{init} and @code{dispose} as constructor and destructor of the
4181: object. @code{init} is invoked after the object's memory is allocated,
4182: while @code{dispose} also handles deallocation. Thus if you redefine
4183: @code{dispose}, you have to call the parent's dispose with @code{super
4184: dispose}, too.
1.7 pazsan 4185: doc---object-init
4186: doc---object-dispose
1.6 pazsan 4187:
4188: @item
1.7 pazsan 4189: @code{new}, @code{new[]}, @code{:}, @code{ptr}, @code{asptr}, and
4190: @code{[]} to create named and unnamed objects and object arrays or
4191: object pointers.
4192: doc---object-new
4193: doc---object-new[]
4194: doc---object-:
4195: doc---object-ptr
4196: doc---object-asptr
4197: doc---object-[]
1.6 pazsan 4198:
4199: @item
4200: @code{::} and @code{super} for explicit scoping. You should use explicit
4201: scoping only for super classes or classes with the same set of instance
4202: variables. Explicitly scoped selectors use early binding.
1.7 pazsan 4203: doc---object-::
4204: doc---object-super
1.6 pazsan 4205:
4206: @item
4207: @code{self} to get the address of the object
1.7 pazsan 4208: doc---object-self
1.6 pazsan 4209:
4210: @item
4211: @code{bind}, @code{bound}, @code{link}, and @code{is} to assign object
4212: pointers and instance defers.
1.7 pazsan 4213: doc---object-bind
4214: doc---object-bound
4215: doc---object-link
4216: doc---object-is
1.6 pazsan 4217:
4218: @item
4219: @code{'} to obtain selector tokens, @code{send} to invoke selectors
4220: from the stack, and @code{postpone} to generate selector invocation code.
1.7 pazsan 4221: doc---object-'
4222: doc---object-postpone
1.6 pazsan 4223:
4224: @item
4225: @code{with} and @code{endwith} to select the active object from the
4226: stack and enable its scope. Using @code{with} and @code{endwith}
4227: also allows you to create code using selector @code{postpone} without being
4228: trapped by the state-smart objects.
1.7 pazsan 4229: doc---object-with
4230: doc---object-endwith
1.6 pazsan 4231:
4232: @end itemize
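
As an illustration of the rule for @code{dispose}, a class that owns
heap memory might redefine both selectors along the following lines.
This is only a sketch: the class @code{scratch} and its field
@code{scratch-addr} are hypothetical, and error handling is reduced to
@code{throw}.

@example
object class scratch
cell var scratch-addr
how:
: init ( u -- ) \ runs after the object's memory is allocated
allocate throw scratch-addr ! ;

: dispose ( -- ) \ release our own resources first ...
scratch-addr @@ free throw
super dispose ; \ ... then let the parent deallocate the object
class;
@end example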
4233:
1.12 ! anton 4234: @node Class Declaration, Class Implementation, The base class object, OOF
! 4235: @subsubsection Class Declaration
1.7 pazsan 4236: @cindex class declaration
4237:
4238: @itemize @bullet
4239: @item
4240: Instance variables
4241: doc---oof-var
4242:
4243: @item
4244: Object pointers
4245: doc---oof-ptr
4246: doc---oof-asptr
4247:
4248: @item
4249: Instance defers
4250: doc---oof-defer
4251:
4252: @item
4253: Method selectors
4254: doc---oof-early
4255: doc---oof-method
4256:
4257: @item
4258: Class wide variables
4259: doc---oof-static
4260:
4261: @item
4262: End declaration
4263: doc---oof-how:
4264: doc---oof-class;
4265:
4266: @end itemize
4267:
1.12 ! anton 4268: @node Class Implementation, , Class Declaration, OOF
! 4269: @subsubsection Class Implementation
1.7 pazsan 4270: @cindex class implementation
4271:
1.12 ! anton 4272: @node Mini-OOF, , OOF, Object-oriented Forth
! 4273: @subsection Mini-OOF
1.8 pazsan 4274: @cindex mini-oof
4275:
4276: Gforth's third object oriented Forth package is a 12-liner. It uses a
4277: bit of a mixture of the @file{objects.fs} and the @file{oof.fs} syntax,
4278: and reduces to the bare minimum of features.
4279:
4280: @example
4281: : method ( m v -- m' v ) Create over , swap cell+ swap
4282: DOES> ( ... o -- ... ) @@ over @@ + @@ execute ;
4283: : var ( m v size -- m v' ) Create over , +
4284: DOES> ( o -- addr ) @@ + ;
4285: : class ( class -- class methods vars ) dup 2@@ ;
4286: : end-class ( class methods vars -- )
4287: Create here >r , dup , 2 cells ?DO ['] noop , cell +LOOP
4288: cell+ dup cell+ swap @@ 2 cells - r> 2 cells + swap move ;
4289: : defines ( xt class -- ) ' >body @@ + ! ;
4290: : new ( class -- o ) here over @@ allot swap over ! ;
4291: : :: ( class "name" -- ) ' >body @@ + @@ compile, ;
4292: Create object 1 cells , 2 cells ,
4293: @end example
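
A minimal usage sketch, assuming the definitions above have been loaded.
The class @code{counter}, its field @code{hits}, and the methods
@code{reset} and @code{bump} are hypothetical names chosen for this
example. Methods are attached with @code{defines} after the class has
been closed, and every method receives the object on top of the stack.

@example
object class
  cell var hits
  method reset ( o -- )
  method bump ( o -- )
end-class counter

:noname ( o -- ) 0 swap hits ! ;  counter defines reset
:noname ( o -- ) 1 swap hits +! ; counter defines bump

counter new constant c1 \ allot an object in the dictionary
c1 reset  c1 bump  c1 bump
c1 hits @@ . \ prints 2
@end example

Each object starts with a pointer to its class, which is why the first
field of a class derived from @code{object} lives at offset one cell.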
4294:
1.6 pazsan 4295: @c -------------------------------------------------------------
1.12 ! anton 4296: @node Tokens for Words, Wordlists, Object-oriented Forth, Words
1.1 anton 4297: @section Tokens for Words
4298: @cindex tokens for words
4299:
4300: This section describes the creation and use of tokens that represent
4301: words on the stack (and in data space).
4302:
4303: Named words have interpretation and compilation semantics. Unnamed words
4304: just have execution semantics.
4305:
4306: @cindex execution token
4307: An @dfn{execution token} represents the execution semantics of an
4308: unnamed word. An execution token occupies one cell. As explained in
4309: section @ref{Supplying names}, the execution token of the last word
4310: defined can be produced with
4311:
4312: short-lastxt
4313:
4314: You can perform the semantics represented by an execution token with
4315: doc-execute
4316: You can compile the word with
4317: doc-compile,
4318:
4319: @cindex code field address
4320: @cindex CFA
4321: In Gforth, the abstract data type @emph{execution token} is implemented
4322: as CFA (code field address).
4323:
4324: The interpretation semantics of a named word are also represented by an
4325: execution token. You can get it with
4326:
4327: doc-[']
4328: doc-'
4329:
4330: For literals, you use @code{'} in interpreted code and @code{[']} in
4331: compiled code. Gforth's @code{'} and @code{[']} behave somewhat unusually
4332: by complaining about compile-only words. To get an execution token for a
4333: compiling word @var{X}, use @code{COMP' @var{X} drop} or @code{[COMP']
4334: @var{X} drop}.
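
The following sketch shows execution tokens in use; the words
@code{square}, @code{apply-twice}, and @code{fourth} are hypothetical
names defined only for this example.

@example
: square ( n -- n^2 ) dup * ;

' square           \ interpreted code: leaves the xt of square
7 swap execute .   \ prints 49

: apply-twice ( n xt -- n' ) dup >r execute r> execute ;
3 ' square apply-twice .   \ prints 81

: fourth ( n -- n^4 ) ['] square apply-twice ; \ compiled code: [']
2 fourth .         \ prints 16
@end example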
4335:
4336: @cindex compilation token
4337: The compilation semantics are represented by a @dfn{compilation token}
4338: consisting of two cells: @var{w xt}. The top cell @var{xt} is an
4339: execution token. The compilation semantics represented by the
4340: compilation token can be performed with @code{execute}, which consumes
4341: the whole compilation token, with an additional stack effect determined
4342: by the represented compilation semantics.
4343:
4344: doc-[comp']
4345: doc-comp'
4346:
4347: You can compile the compilation semantics with @code{postpone,}. I.e.,
4348: @code{COMP' @var{word} POSTPONE,} is equivalent to @code{POSTPONE
4349: @var{word}}.
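
The equivalence can be seen in the following sketch, where the two
immediate words are meant to behave identically. All names defined here
are hypothetical; the stack effects of @code{[COMP']} and
@code{postpone,} are taken from the description above.

@example
: my-dup1 ( compilation: -- ) POSTPONE dup ; immediate
: my-dup2 ( compilation: -- ) [COMP'] dup postpone, ; immediate

: double1 ( n -- 2n ) my-dup1 + ;
: double2 ( n -- 2n ) my-dup2 + ;

21 double1 .   \ prints 42
21 double2 .   \ prints 42
@end example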
4350:
4351: doc-postpone,
4352:
4353: At present, the @var{w} part of a compilation token is an execution
4354: token, and the @var{xt} part represents either @code{execute} or
4355: @code{compile,}. However, don't rely on that knowledge, unless necessary;
4356: we may introduce unusual compilation tokens in the future (e.g.,
4357: compilation tokens representing the compilation semantics of literals).
4358:
4359: @cindex name token
4360: @cindex name field address
4361: @cindex NFA
4362: Named words are also represented by the @dfn{name token}. The abstract
4363: data type @emph{name token} is implemented as NFA (name field address).
4364:
4365: doc-find-name
4366: doc-name>int
4367: doc-name?int
4368: doc-name>comp
4369: doc-name>string
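
A small sketch of how these words can be combined, assuming that
@code{find-name} returns 0 when the name is not found. The word
@code{show-and-run} is a hypothetical helper: it looks up a name given
as a string, prints the name stored in the header, and then performs the
word's interpretation semantics.

@example
: show-and-run ( ... c-addr u -- ... )
  find-name dup 0= abort" word not found" \ nt, or abort
  dup name>string type space              \ print the stored name
  name>int execute ;                      \ nt -> xt, then run it

5 s" dup" show-and-run + .   \ prints the name, then 10
@end example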
4370:
4371: @node Wordlists, Files, Tokens for Words, Words
4372: @section Wordlists
4373:
1.12 ! anton 4374: @node Files, Including Files, Wordlists, Words
1.1 anton 4375: @section Files
4376:
1.12 ! anton 4377: @node Including Files, Blocks, Files, Words
! 4378: @section Including Files
! 4379: @cindex including files
! 4380:
! 4381: @menu
! 4382: * Words for Including::
! 4383: * Search Path::
! 4384: * Changing the Search Path::
! 4385: * General Search Paths::
! 4386: @end menu
! 4387:
! 4388: @node Words for Including, Search Path, Including Files, Including Files
! 4389: @subsection Words for Including
! 4390:
! 4391: doc-include-file
! 4392: doc-included
! 4393: doc-include
! 4394:
! 4395: Usually you want to include a file only if it is not included already
! 4396: (by, say, another source file):
! 4397:
! 4398: doc-required
! 4399: doc-require
! 4400: doc-needs
! 4401:
! 4402: @cindex stack effect of included files
! 4403: @cindex including files, stack effect
! 4404: I recommend that you write your source files such that interpreting them
! 4405: does not change the stack. This allows using these files with
! 4406: @code{required} and friends without complications. E.g.,
! 4407:
! 4408: @example
! 4409: 1 require foo.fs drop
! 4410: @end example
! 4411:
! 4412: @node Search Path, Changing the Search Path, Words for Including, Including Files
! 4413: @subsection Search Path
! 4414: @cindex path for @code{included}
! 4415: @cindex file search path
! 4416: @cindex include search path
! 4417: @cindex search path for files
! 4418:
! 4419: If you specify an absolute filename (i.e., a filename starting with
! 4420: @file{/} or @file{~}, or with @file{:} in the second position (as in
! 4421: @samp{C:...})) for @code{included} and friends, that file is included
! 4422: just as you would expect.
! 4423:
! 4424: For relative filenames, Gforth uses a search path similar to Forth's
! 4425: search order (@pxref{Wordlists}). It tries to find the given filename in
! 4426: the directories present in the path, and includes the first one it
! 4427: finds.
! 4428:
! 4429: If the search path contains the directory @file{.} (as it should), this
! 4430: refers to the directory that the present file was @code{included}
! 4431: from. This allows files to include other files relative to their own
! 4432: position (irrespective of the current working directory or the absolute
! 4433: position). This feature is essential for libraries consisting of
! 4434: several files, where a file may include other files from the library.
! 4435: It corresponds to @code{#include "..."} in C. If the current input
! 4436: source is not a file, @file{.} refers to the directory of the innermost
! 4437: file being included, or, if there is no file being included, to the
! 4438: current working directory.
! 4439:
! 4440: Use @file{~+} to refer to the current working directory (as in
! 4441: @code{bash}).
! 4442:
! 4443: If the filename starts with @file{./}, the search path is not searched
! 4444: (just as with absolute filenames), and the @file{.} has the same meaning
! 4445: as described above.
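
The cases described above can be summarized in a sketch. All file and
directory names here are hypothetical; the lines are assumed to appear
in a file @file{/home/me/lib/main.fs}.

@example
require util.fs       \ relative: searched in the path; "." in the
                      \ path means /home/me/lib/
require ./util.fs     \ path not searched: /home/me/lib/util.fs
include ~+/local.fs   \ relative to the current working directory
include /opt/forth/x.fs  \ absolute: included as given
@end example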
! 4446:
! 4447: @node Changing the Search Path, General Search Paths, Search Path, Including Files
! 4448: @subsection Changing the Search Path
! 4449: @cindex search path, changes
! 4450:
! 4451: The search path is initialized when you start Gforth (@pxref{Invoking
! 4452: Gforth}). You can display it with
! 4453:
! 4454: doc-.fpath
! 4455:
! 4456: You can change it later with the following words:
! 4457:
! 4458: doc-fpath+
! 4459: doc-fpath=
! 4460:
! 4461: Using fpath and require would look like:
! 4462:
! 4463: @example
! 4464: fpath= /usr/lib/forth/|./
! 4465:
! 4466: require timer.fs
! 4467: @end example
! 4468:
! 4469: If your application needs to look for a file in the Forth search path,
! 4470: it can use the following Gforth word:
! 4471:
! 4472: doc-open-fpath-file
! 4473:
! 4474:
! 4475: @node General Search Paths, , Changing the Search Path, Including Files
! 4476: @subsection General Search Paths
! 4477: @cindex search paths for user applications
! 4478:
! 4479: Your application may need to search for files in several directories, like
! 4480: @code{included} does. For this purpose you can define and use your own
! 4481: search paths. Create a search path like this:
! 4482:
! 4483: Make a buffer for the path:
! 4484:
! 4485: @example
! 4486: create mypath 100 chars , \ maximum length (is checked)
! 4487: 0 , \ real len
! 4488: 100 chars allot \ space for path
! 4489: @end example
! 4490:
! 4491: The words for the Forth search path are also available in generic versions
! 4492: that work on any path:
! 4493:
! 4494: doc-path+
! 4495: doc-path=
! 4496: doc-.path
! 4497: doc-open-path-file
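
A usage sketch for a private path follows. It rests on an assumption
that should be checked against the glossary entries above: that the
generic words take the address of the path buffer on the stack and
otherwise parse their arguments in the same way as @code{fpath=} and
@code{fpath+}. The directory names are hypothetical.

@example
create mypath 100 chars , \ maximum length
0 ,                       \ real len
100 chars allot           \ space for path

mypath path= /usr/lib/myapp/|./       \ assumed: sets the whole path
mypath path+ /usr/local/lib/myapp/    \ assumed: appends one directory
mypath .path                          \ display the result
@end example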
! 4498:
! 4499:
! 4500: @node Blocks, Other I/O, Including Files, Words
1.1 anton 4501: @section Blocks
4502:
4503: @node Other I/O, Programming Tools, Blocks, Words
4504: @section Other I/O
4505:
1.7 pazsan 4506: @node Programming Tools, Assembler and Code Words, Other I/O, Words
1.1 anton 4507: @section Programming Tools
4508: @cindex programming tools
4509:
4510: @menu
4511: * Debugging:: Simple and quick.
4512: * Assertions:: Making your programs self-checking.
1.6 pazsan 4513: * Singlestep Debugger:: Executing your program word by word.
1.1 anton 4514: @end menu
4515:
4516: @node Debugging, Assertions, Programming Tools, Programming Tools
4517: @subsection Debugging
4518: @cindex debugging
4519:
1.2 jwilke 4520: The simple debugging aids provided in @file{debugs.fs}
1.1 anton 4521: are meant to support a different style of debugging than the
4522: tracing/stepping debuggers used in languages with long turn-around
4523: times.
4524:
4525: A much better (faster) way in fast-compiling languages is to add
4526: printing code at well-selected places, let the program run, look at
4527: the output, see where things went wrong, add more printing code, etc.,
4528: until the bug is found.
4529:
4530: The word @code{~~} is easy to insert. It just prints debugging
4531: information (by default the source location and the stack contents). It
4532: is also easy to remove (@kbd{C-x ~} in the Emacs Forth mode to
4533: query-replace them with nothing). The deferred words
4534: @code{printdebugdata} and @code{printdebugline} control the output of
4535: @code{~~}. The default source location output format works well with
4536: Emacs' compilation mode, so you can step through the program at the
4537: source level using @kbd{C-x `} (the advantage over a stepping debugger
4538: is that you can step in any direction and you know where the crash has
4539: happened or where the strange data has occurred).
4540:
4541: Note that the default actions clobber the contents of the pictured
4542: numeric output string, so you should not use @code{~~}, e.g., between
4543: @code{<#} and @code{#>}.
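
For example, to watch the stack at two points inside a word (the word
@code{average} is a hypothetical example):

@example
: average ( n1 n2 -- n )
  ~~ + ~~ 2/ ;

4 8 average .
@end example

Each @code{~~} prints the source location and the stack contents when it
is reached, so the two lines of debugging output show the stack before
and after the addition; the final @code{.} then prints 6.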
4544:
4545: doc-~~
4546: doc-printdebugdata
4547: doc-printdebugline
4548:
1.2 jwilke 4549: @node Assertions, Singlestep Debugger, Debugging, Programming Tools
1.1 anton 4550: @subsection Assertions
4551: @cindex assertions
4552:
4553: It is a good idea to make your programs self-checking, in particular, if
4554: you use an assumption (e.g., that a certain field of a data structure is
4555: never zero) that may become wrong during maintenance. Gforth supports
4556: assertions for this purpose. They are used like this:
4557:
4558: @example
4559: assert( @var{flag} )
4560: @end example
4561:
4562: The code between @code{assert(} and @code{)} should compute a flag that
4563: is true if everything is all right and false otherwise. It should
4564: not change anything else on the stack. The overall stack effect of the
4565: assertion is @code{( -- )}. E.g.
4566:
4567: @example
4568: assert( 1 1 + 2 = ) \ what we learn in school
4569: assert( dup 0<> ) \ assert that the top of stack is not zero
4570: assert( false ) \ this code should not be reached
4571: @end example
4572:
4573: The need for assertions is different at different times. During
4574: debugging, we want more checking, in production we sometimes care more
4575: for speed. Therefore, assertions can be turned off, i.e., the assertion
4576: becomes a comment. Depending on the importance of an assertion and the
4577: time it takes to check it, you may want to turn off some assertions and
4578: keep others turned on. Gforth provides several levels of assertions for
4579: this purpose:
4580:
4581: doc-assert0(
4582: doc-assert1(
4583: doc-assert2(
4584: doc-assert3(
4585: doc-assert(
4586: doc-)
4587:
4588: @code{assert(} is the same as @code{assert1(}. The variable
4589: @code{assert-level} specifies the highest level of assertions that are turned
4590: on. I.e., at the default @code{assert-level} of one, @code{assert0(} and
4591: @code{assert1(} assertions perform checking, while @code{assert2(} and
4592: @code{assert3(} assertions are treated as comments.
4593:
4594: Note that the @code{assert-level} is evaluated at compile-time, not at
4595: run-time. I.e., you cannot turn assertions on or off at run-time, you
4596: have to set the @code{assert-level} appropriately before compiling a
4597: piece of code. You can compile several pieces of code at several
4598: @code{assert-level}s (e.g., a trusted library at level 1 and newly
4599: written code at level 3).
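
The compile-time nature of @code{assert-level} can be seen in the
following sketch, which compiles the same assertion at two levels. The
words @code{half} and @code{half-checked} are hypothetical.

@example
1 assert-level !   \ only assert0( and assert1( are compiled
: half ( n -- n/2 )
  assert2( dup 1 and 0= ) 2/ ;   \ treated as a comment

3 assert-level !   \ all assertions are compiled
: half-checked ( n -- n/2 )
  assert2( dup 1 and 0= ) 2/ ;   \ checks that n is even

7 half .          \ no check was compiled; prints 3
7 half-checked    \ the assertion fails and execution is aborted
@end example

Changing @code{assert-level} afterwards has no effect on either word,
since the decision was made when each definition was compiled.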
4600:
4601: doc-assert-level
4602:
4603: If an assertion fails, a message compatible with Emacs' compilation mode
4604: is produced and the execution is aborted (currently with @code{ABORT"}.
4605: If there is interest, we will introduce a special throw code. But if you
4606: intend to @code{catch} a specific condition, using @code{throw} is
4607: probably more appropriate than an assertion).
4608:
1.2 jwilke 4609: @node Singlestep Debugger, , Assertions, Programming Tools
4610: @subsection Singlestep Debugger
4611: @cindex singlestep Debugger
4612: @cindex debugging Singlestep
4613: @cindex @code{dbg}
4614: @cindex @code{BREAK:}
4615: @cindex @code{BREAK"}
4616:
4617: When a new word is created there's often the need to check whether it behaves
1.5 anton 4618: correctly or not. You can do this by typing @code{dbg badword}. This might
1.2 jwilke 4619: look like:
4620: @example
4621: : badword 0 DO i . LOOP ; ok
4622: 2 dbg badword
4623: : badword
4624: Scanning code...
4625:
4626: Nesting debugger ready!
4627:
4628: 400D4738 8049BC4 0 -> [ 2 ] 00002 00000
4629: 400D4740 8049F68 DO -> [ 0 ]
4630: 400D4744 804A0C8 i -> [ 1 ] 00000
4631: 400D4748 400C5E60 . -> 0 [ 0 ]
4632: 400D474C 8049D0C LOOP -> [ 0 ]
4633: 400D4744 804A0C8 i -> [ 1 ] 00001
4634: 400D4748 400C5E60 . -> 1 [ 0 ]
4635: 400D474C 8049D0C LOOP -> [ 0 ]
4636: 400D4758 804B384 ; -> ok
4637: @end example
4638:
1.5 anton 4639: Each line displayed is one step. You always have to hit return to
4640: execute the next word that is displayed. If you don't want to execute
4641: the next word as a whole, you have to type @kbd{n} for @code{nest}. Here is
4642: an overview of which keys are available:
1.2 jwilke 4643:
4644: @table @i
4645:
1.4 anton 4646: @item <return>
1.5 anton 4647: Next; Execute the next word.
1.2 jwilke 4648:
4649: @item n
1.5 anton 4650: Nest; Single step through next word.
1.2 jwilke 4651:
4652: @item u
1.5 anton 4653: Unnest; Stop debugging and execute rest of word. If we got to this word
4654: with nest, continue debugging with the calling word.
1.2 jwilke 4655:
4656: @item d
1.5 anton 4657: Done; Stop debugging and execute rest.
1.2 jwilke 4658:
4659: @item s
1.5 anton 4660: Stop; Abort immediately.
1.2 jwilke 4661:
4662: @end table
4663:
4664: Debugging large applications with this mechanism is very difficult, because
4665: you have to nest very deeply into the program before the interesting part
4666: begins. This takes a lot of time.
4667:
4668: To do it more directly put a @code{BREAK:} command into your source code.
4669: When program execution reaches @code{BREAK:} the single step debugger is
4670: invoked and you have all the features described above.
4671:
4672: If you have more than one part to debug it is useful to know where the
4673: program has stopped at the moment. You can do this by the
4674: @code{BREAK" string"} command. This behaves like @code{BREAK:} except that
4675: string is typed out when the ``breakpoint'' is reached.
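
A sketch of a breakpoint placed in the middle of a definition; the words
@code{step2} and @code{pipeline} are hypothetical.

@example
: step2 ( n -- n' ) 2* 1+ ;

: pipeline ( n -- n' )
  1+
  BREAK" before step2"   \ the debugger starts here
  step2 ;

5 pipeline .
@end example

When @code{pipeline} runs, the string is typed and the single step
debugger takes over just before @code{step2}, so you can use @kbd{n} to
step into @code{step2} or @kbd{d} to let the rest run normally.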
4676:
1.7 pazsan 4677: @node Assembler and Code Words, Threading Words, Programming Tools, Words
4678: @section Assembler and Code Words
1.1 anton 4679: @cindex assembler
4680: @cindex code words
4681:
4682: Gforth provides some words for defining primitives (words written in
4683: machine code), and for defining the machine-code equivalent of
4684: @code{DOES>}-based defining words. However, the machine-independent
4685: nature of Gforth poses a few problems: First of all, Gforth runs on
4686: several architectures, so it can provide no standard assembler. What's
4687: worse is that the register allocation not only depends on the processor,
4688: but also on the @code{gcc} version and options used.
4689:
4690: The words that Gforth offers encapsulate some system dependences (e.g., the
4691: header structure), so a system-independent assembler may be used in
4692: Gforth. If you do not have an assembler, you can compile machine code
4693: directly with @code{,} and @code{c,}.
4694:
4695: doc-assembler
4696: doc-code
4697: doc-end-code
4698: doc-;code
4699: doc-flush-icache
4700:
4701: If @code{flush-icache} does not work correctly, @code{code} words
4702: etc. will not work (reliably), either.
4703:
4704: These words are rarely used. Therefore they reside in @code{code.fs},
4705: which is usually not loaded (except @code{flush-icache}, which is always
4706: present). You can load them with @code{require code.fs}.
4707:
4708: @cindex registers of the inner interpreter
4709: In the assembly code you will want to refer to the inner interpreter's
4710: registers (e.g., the data stack pointer) and you may want to use other
4711: registers for temporary storage. Unfortunately, the register allocation
4712: is installation-dependent.
4713:
4714: The easiest solution is to use explicit register declarations
4715: (@pxref{Explicit Reg Vars, , Variables in Specified Registers, gcc.info,
4716: GNU C Manual}) for all of the inner interpreter's registers: You have to
4717: compile Gforth with @code{-DFORCE_REG} (configure option
4718: @code{--enable-force-reg}) and the appropriate declarations must be
4719: present in the @code{machine.h} file (see @code{mips.h} for an example;
4720: you can find a full list of all declarable register symbols with
4721: @code{grep register engine.c}). If you give explicit registers to all
4722: variables that are declared at the beginning of @code{engine()}, you
4723: should be able to use the other caller-saved registers for temporary
4724: storage. Alternatively, you can use the @code{gcc} option
4725: @code{-ffixed-REG} (@pxref{Code Gen Options, , Options for Code
4726: Generation Conventions, gcc.info, GNU C Manual}) to reserve a register
4727: (however, this restriction on register allocation may slow Gforth
4728: significantly).
4729:
4730: If this solution is not viable (e.g., because @code{gcc} does not allow
4731: you to explicitly declare all the registers you need), you have to find
4732: out by looking at the code where the inner interpreter's registers
4733: reside and which registers can be used for temporary storage. You can
4734: get an assembly listing of the engine's code with @code{make engine.s}.
4735:
4736: In any case, it is good practice to abstract your assembly code from the
4737: actual register allocation. E.g., if the data stack pointer resides in
4738: register @code{$17}, create an alias for this register called @code{sp},
4739: and use that in your assembly code.
4740:
4741: @cindex code words, portable
4742: Another option for implementing normal and defining words efficiently
4743: is: adding the wanted functionality to the source of Gforth. For normal
4744: words you just have to edit @file{primitives} (@pxref{Automatic
4745: Generation}), defining words (equivalent to @code{;CODE} words, for fast
4746: defined words) may require changes in @file{engine.c}, @file{kernal.fs},
4747: @file{prims2x.fs}, and possibly @file{cross.fs}.
4748:
4749:
1.12 ! anton 4750: @node Threading Words, , Assembler and Code Words, Words
1.1 anton 4751: @section Threading Words
4752: @cindex threading words
4753:
4754: @cindex code address
4755: These words provide access to code addresses and other threading stuff
4756: in Gforth (and, possibly, other interpretive Forths). They more or less
4757: abstract away the differences between direct and indirect threading
4758: (and, for direct threading, the machine dependences). However, at
4759: present this wordset is still incomplete. It is also pretty low-level;
4760: some day it will hopefully be made unnecessary by an internals wordset
4761: that abstracts implementation details away completely.
4762:
4763: doc->code-address
4764: doc->does-code
4765: doc-code-address!
4766: doc-does-code!
4767: doc-does-handler!
4768: doc-/does-handler
4769:
4770: The code addresses produced by various defining words are produced by
4771: the following words:
4772:
4773: doc-docol:
4774: doc-docon:
4775: doc-dovar:
4776: doc-douser:
4777: doc-dodefer:
4778: doc-dofield:
4779:
4780: You can recognize words defined by a @code{CREATE}...@code{DOES>} word
4781: with @code{>DOES-CODE}. If the word was defined in that way, the value
4782: returned is different from 0 and identifies the @code{DOES>} used by the
4783: defining word.
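
A sketch of this test, assuming (as the description suggests) that
@code{>DOES-CODE} returns 0 for words that were not defined by a
@code{CREATE}...@code{DOES>} word. The words @code{my-constant} and
@code{answer} are hypothetical.

@example
: my-constant ( n "name" -- )
  CREATE , DOES> ( -- n ) @@ ;
42 my-constant answer

' answer >does-code 0<> .    \ true flag: child of a DOES> word
' my-constant >does-code .   \ expected 0: an ordinary colon definition
@end example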
1.2 jwilke 4784:
1.5 anton 4785: @c ******************************************************************
1.1 anton 4786: @node Tools, ANS conformance, Words, Top
4787: @chapter Tools
4788:
4789: @menu
4790: * ANS Report:: Report the words used, sorted by wordset.
4791: @end menu
4792:
4793: See also @ref{Emacs and Gforth}.
4794:
4795: @node ANS Report, , Tools, Tools
4796: @section @file{ans-report.fs}: Report the words used, sorted by wordset
4797: @cindex @file{ans-report.fs}
4798: @cindex report the words used in your program
4799: @cindex words used in your program
4800:
4801: If you want to label a Forth program as an ANS Forth Program, you must
4802: document which wordsets the program uses; for extension wordsets, it is
4803: helpful to list the words the program requires from these wordsets
4804: (because Forth systems are allowed to provide only some words of them).
4805:
4806: The @file{ans-report.fs} tool makes it easy for you to determine which
4807: words from which wordset and which non-ANS words your application
4808: uses. You simply have to include @file{ans-report.fs} before loading the
4809: program you want to check. After loading your program, you can get the
4810: report with @code{print-ans-report}. A typical use is to run this as a
4811: batch job like this:
4812: @example
4813: gforth ans-report.fs myprog.fs -e "print-ans-report bye"
4814: @end example
4815:
4816: The output looks like this (for @file{compat/control.fs}):
4817: @example
4818: The program uses the following words
4819: from CORE :
4820: : POSTPONE THEN ; immediate ?dup IF 0=
4821: from BLOCK-EXT :
4822: \
4823: from FILE :
4824: (
4825: @end example
4826:
4827: @subsection Caveats
4828:
4829: Note that @file{ans-report.fs} just checks which words are used, not whether
4830: they are used in an ANS Forth conforming way!
4831:
4832: Some words are defined in several wordsets in the
4833: standard. @file{ans-report.fs} reports them for only one of the
4834: wordsets, and not necessarily the one you expect. It depends on usage
4835: which wordset is the right one to specify. E.g., if you only use the
4836: compilation semantics of @code{S"}, it is a Core word; if you also use
4837: its interpretation semantics, it is a File word.
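
The distinction can be illustrated with a short sketch (the word @code{greet} is a hypothetical name used only here). Both uses are reported under the same wordset, so you have to decide which one to state:

```forth
\ Only the compilation semantics of S" are used here: a Core word.
: greet ( -- )  s" hello" type ;
greet

\ Here S" is interpreted, which relies on its File-wordset semantics.
s" hello" type
```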
4838:
4839: @c ******************************************************************
4840: @node ANS conformance, Model, Tools, Top
4841: @chapter ANS conformance
4842: @cindex ANS conformance of Gforth
4843:
4844: To the best of our knowledge, Gforth is an
4845:
4846: ANS Forth System
4847: @itemize @bullet
4848: @item providing the Core Extensions word set
4849: @item providing the Block word set
4850: @item providing the Block Extensions word set
4851: @item providing the Double-Number word set
4852: @item providing the Double-Number Extensions word set
4853: @item providing the Exception word set
4854: @item providing the Exception Extensions word set
4855: @item providing the Facility word set
4856: @item providing @code{MS} and @code{TIME&DATE} from the Facility Extensions word set
4857: @item providing the File Access word set
4858: @item providing the File Access Extensions word set
4859: @item providing the Floating-Point word set
4860: @item providing the Floating-Point Extensions word set
4861: @item providing the Locals word set
4862: @item providing the Locals Extensions word set
4863: @item providing the Memory-Allocation word set
4864: @item providing the Memory-Allocation Extensions word set (that one's easy)
4865: @item providing the Programming-Tools word set
4866: @item providing @code{;CODE}, @code{AHEAD}, @code{ASSEMBLER}, @code{BYE}, @code{CODE}, @code{CS-PICK}, @code{CS-ROLL}, @code{STATE}, @code{[ELSE]}, @code{[IF]}, @code{[THEN]} from the Programming-Tools Extensions word set
4867: @item providing the Search-Order word set
4868: @item providing the Search-Order Extensions word set
4869: @item providing the String word set
4870: @item providing the String Extensions word set (another easy one)
4871: @end itemize
4872:
4873: @cindex system documentation
4874: In addition, ANS Forth systems are required to document certain
4875: implementation choices. This chapter tries to meet these
4876: requirements. In many cases it gives a way to ask the system for the
4877: information instead of providing the information directly, in
4878: particular, if the information depends on the processor, the operating
4879: system or the installation options chosen, or if it is likely to
4880: change during the maintenance of Gforth.
4881:
4882: @comment The framework for the rest has been taken from pfe.
4883:
4884: @menu
4885: * The Core Words::
4886: * The optional Block word set::
4887: * The optional Double Number word set::
4888: * The optional Exception word set::
4889: * The optional Facility word set::
4890: * The optional File-Access word set::
4891: * The optional Floating-Point word set::
4892: * The optional Locals word set::
4893: * The optional Memory-Allocation word set::
4894: * The optional Programming-Tools word set::
4895: * The optional Search-Order word set::
4896: @end menu
4897:
4898:
4899: @c =====================================================================
4900: @node The Core Words, The optional Block word set, ANS conformance, ANS conformance
4901: @comment node-name, next, previous, up
4902: @section The Core Words
4903: @c =====================================================================
4904: @cindex core words, system documentation
4905: @cindex system documentation, core words
4906:
4907: @menu
4908: * core-idef:: Implementation Defined Options
4909: * core-ambcond:: Ambiguous Conditions
4910: * core-other:: Other System Documentation
4911: @end menu
4912:
4913: @c ---------------------------------------------------------------------
4914: @node core-idef, core-ambcond, The Core Words, The Core Words
4915: @subsection Implementation Defined Options
4916: @c ---------------------------------------------------------------------
4917: @cindex core words, implementation-defined options
4918: @cindex implementation-defined options, core words
4919:
4920:
4921: @table @i
4922: @item (Cell) aligned addresses:
4923: @cindex cell-aligned addresses
4924: @cindex aligned addresses
4925: processor-dependent. Gforth's alignment words perform natural alignment
4926: (e.g., an address aligned for a datum of size 8 is divisible by
4927: 8). Unaligned accesses usually result in a @code{-23 THROW}.
4928:
4929: @item @code{EMIT} and non-graphic characters:
4930: @cindex @code{EMIT} and non-graphic characters
4931: @cindex non-graphic characters and @code{EMIT}
4932: The character is output using the C library function (actually, macro)
4933: @code{putc}.
4934:
4935: @item character editing of @code{ACCEPT} and @code{EXPECT}:
4936: @cindex character editing of @code{ACCEPT} and @code{EXPECT}
4937: @cindex editing in @code{ACCEPT} and @code{EXPECT}
4938: @cindex @code{ACCEPT}, editing
4939: @cindex @code{EXPECT}, editing
4940: This is modeled on the GNU readline library (@pxref{Readline
4941: Interaction, , Command Line Editing, readline, The GNU Readline
4942: Library}) with Emacs-like key bindings. @kbd{Tab} deviates a little by
4943: producing a full word completion every time you type it (instead of
4944: producing the common prefix of all completions).
4945:
4946: @item character set:
4947: @cindex character set
4948: The character set of your computer and display device. Gforth is
4949: 8-bit-clean (but some other component in your system may make trouble).
4950:
4951: @item Character-aligned address requirements:
4952: @cindex character-aligned address requirements
4953: installation-dependent. Currently a character is represented by a C
4954: @code{unsigned char}; in the future we might switch to @code{wchar_t}
4955: (Comments on that requested).
4956:
4957: @item character-set extensions and matching of names:
4958: @cindex character-set extensions and matching of names
4959: @cindex case sensitivity for name lookup
4960: @cindex name lookup, case sensitivity
4961: @cindex locale and case sensitivity
4962: Any character except the ASCII NUL character can be used in a
4963: name. Matching is case-insensitive (except in @code{TABLE}s). The
4964: matching is performed using the C function @code{strncasecmp}, whose
4965: behaviour is probably influenced by the locale. E.g., the @code{C} locale
4966: does not know about accents and umlauts, so they are matched
4967: case-sensitively in that locale. For portability reasons it is best to
4968: write programs such that they work in the @code{C} locale. Then one can
4969: use libraries written by a Polish programmer (who might use words
4970: containing ISO Latin-2 encoded characters) and by a French programmer
4971: (ISO Latin-1) in the same program (of course, @code{WORDS} will produce
4972: funny results for some of the words (which ones, depends on the font you
4973: are using)). Also, the locale you prefer may not be available in other
4974: operating systems. Hopefully, Unicode will solve these problems one day.
4975:
4976: @item conditions under which control characters match a space delimiter:
4977: @cindex space delimiters
4978: @cindex control characters as delimiters
4979: If @code{WORD} is called with the space character as a delimiter, all
4980: white-space characters (as identified by the C macro @code{isspace()})
4981: are delimiters. @code{PARSE}, on the other hand, treats space like other
4982: delimiters. @code{PARSE-WORD} treats space like @code{WORD}, but behaves
4983: like @code{PARSE} otherwise. @code{(NAME)}, which is used by the outer
4984: interpreter (aka text interpreter) by default, treats all white-space
4985: characters as delimiters.
4986:
4987: @item format of the control flow stack:
4988: @cindex control flow stack, format
4989: The data stack is used as control flow stack. The size of a control flow
4990: stack item in cells is given by the constant @code{cs-item-size}. At the
4991: time of this writing, an item consists of a (pointer to a) locals list
4992: (third), an address in the code (second), and a tag for identifying the
4993: item (TOS). The following tags are used: @code{defstart},
4994: @code{live-orig}, @code{dead-orig}, @code{dest}, @code{do-dest},
4995: @code{scopestart}.
4996:
4997: @item conversion of digits > 35:
4998: @cindex digits > 35
4999: The characters @code{[\]^_'} are the digits with the decimal values
5000: 36@minus{}41. There is no way to input many of the larger digits.
5001:
5002: @item display after input terminates in @code{ACCEPT} and @code{EXPECT}:
5003: @cindex @code{EXPECT}, display after end of input
5004: @cindex @code{ACCEPT}, display after end of input
5005: The cursor is moved to the end of the entered string. If the input is
5006: terminated using the @kbd{Return} key, a space is typed.
5007:
5008: @item exception abort sequence of @code{ABORT"}:
5009: @cindex exception abort sequence of @code{ABORT"}
5010: @cindex @code{ABORT"}, exception abort sequence
5011: The error string is stored into the variable @code{"error} and a
5012: @code{-2 throw} is performed.
5013:
5014: @item input line terminator:
5015: @cindex input line terminator
5016: @cindex line terminator on input
5017: @cindex newline character on input
5018: For interactive input, @kbd{C-m} (CR) and @kbd{C-j} (LF) terminate
5019: lines. One of these characters is typically produced when you type the
5020: @kbd{Enter} or @kbd{Return} key.
5021:
5022: @item maximum size of a counted string:
5023: @cindex maximum size of a counted string
5024: @cindex counted string, maximum size
5025: @code{s" /counted-string" environment? drop .}. Currently 255 characters
5026: on all ports, but this may change.
5027:
5028: @item maximum size of a parsed string:
5029: @cindex maximum size of a parsed string
5030: @cindex parsed string, maximum size
5031: Given by the constant @code{/line}. Currently 255 characters.
5032:
5033: @item maximum size of a definition name, in characters:
5034: @cindex maximum size of a definition name, in characters
5035: @cindex name, maximum length
5036: 31
5037:
5038: @item maximum string length for @code{ENVIRONMENT?}, in characters:
5039: @cindex maximum string length for @code{ENVIRONMENT?}, in characters
5040: @cindex @code{ENVIRONMENT?} string length, maximum
5041: 31
5042:
5043: @item method of selecting the user input device:
5044: @cindex user input device, method of selecting
5045: The user input device is the standard input. There is currently no way to
5046: change it from within Gforth. However, the input can typically be
5047: redirected in the command line that starts Gforth.
5048:
5049: @item method of selecting the user output device:
5050: @cindex user output device, method of selecting
5051: @code{EMIT} and @code{TYPE} output to the file-id stored in the value
1.10 anton 5052: @code{outfile-id} (@code{stdout} by default). Gforth uses unbuffered
5053: output when the user output device is a terminal, otherwise the output
5054: is buffered.
1.1 anton 5055:
5056: @item methods of dictionary compilation:
5057: What are we expected to document here?
5058:
5059: @item number of bits in one address unit:
5060: @cindex number of bits in one address unit
5061: @cindex address unit, size in bits
5062: @code{s" address-unit-bits" environment? drop .}. 8 in all current
5063: ports.
5064:
5065: @item number representation and arithmetic:
5066: @cindex number representation and arithmetic
5067: Processor-dependent. Binary two's complement on all current ports.
5068:
5069: @item ranges for integer types:
5070: @cindex ranges for integer types
5071: @cindex integer types, ranges
5072: Installation-dependent. Make environmental queries for @code{MAX-N},
5073: @code{MAX-U}, @code{MAX-D} and @code{MAX-UD}. The lower bound for
5074: unsigned (and positive) types is 0. The lower bound for signed types on
5075: two's complement and one's complement machines can be computed
5076: by adding 1 to the upper bound.
5077:
5078: @item read-only data space regions:
5079: @cindex read-only data space regions
5080: @cindex data-space, read-only regions
5081: The whole Forth data space is writable.
5082:
5083: @item size of buffer at @code{WORD}:
5084: @cindex size of buffer at @code{WORD}
5085: @cindex @code{WORD} buffer size
5086: @code{PAD HERE - .}. 104 characters on 32-bit machines. The buffer is
5087: shared with the pictured numeric output string. If overwriting
5088: @code{PAD} is acceptable, it is as large as the remaining dictionary
5089: space, although only as much can be sensibly used as fits in a counted
5090: string.
5091:
5092: @item size of one cell in address units:
5093: @cindex cell size
5094: @code{1 cells .}.
5095:
5096: @item size of one character in address units:
5097: @cindex char size
5098: @code{1 chars .}. 1 on all current ports.
5099:
5100: @item size of the keyboard terminal buffer:
5101: @cindex size of the keyboard terminal buffer
5102: @cindex terminal buffer, size
5103: Varies. You can determine the size at a specific time using @code{lp@@
5104: tib - .}. It is shared with the locals stack and TIBs of files that
5105: include the current file. You can change the amount of space for TIBs
5106: and locals stack at Gforth startup with the command line option
5107: @code{-l}.
5108:
5109: @item size of the pictured numeric output buffer:
5110: @cindex size of the pictured numeric output buffer
5111: @cindex pictured numeric output buffer, size
5112: @code{PAD HERE - .}. 104 characters on 32-bit machines. The buffer is
5113: shared with @code{WORD}.
5114:
5115: @item size of the scratch area returned by @code{PAD}:
5116: @cindex size of the scratch area returned by @code{PAD}
5117: @cindex @code{PAD} size
5118: The remainder of dictionary space. @code{unused pad here - - .}.
5119:
5120: @item system case-sensitivity characteristics:
5121: @cindex case-sensitivity characteristics
5122: Dictionary searches are case insensitive (except in
5123: @code{TABLE}s). However, as explained above under @i{character-set
5124: extensions}, the matching for non-ASCII characters is determined by the
5125: locale you are using. In the default @code{C} locale all non-ASCII
5126: characters are matched case-sensitively.
5127:
5128: @item system prompt:
5129: @cindex system prompt
5130: @cindex prompt
5131: @code{ ok} in interpret state, @code{ compiled} in compile state.
5132:
5133: @item division rounding:
5134: @cindex division rounding
5135: installation dependent. @code{s" floored" environment? drop .}. We leave
5136: the choice to @code{gcc} (what to use for @code{/}) and to you (whether
5137: to use @code{fm/mod}, @code{sm/rem} or simply @code{/}).
5138:
5139: @item values of @code{STATE} when true:
5140: @cindex @code{STATE} values
5141: -1.
5142:
5143: @item values returned after arithmetic overflow:
5144: On two's complement machines, arithmetic is performed modulo
5145: 2**bits-per-cell for single arithmetic and 4**bits-per-cell for double
5146: arithmetic (with appropriate mapping for signed types). Division by zero
5147: typically results in a @code{-55 throw} (Floating-point unidentified
5148: fault), although a @code{-10 throw} (divide by zero) would be more
5149: appropriate.
5150:
5151: @item whether the current definition can be found after @t{DOES>}:
5152: @cindex @t{DOES>}, visibility of current definition
5153: No.
5154:
5155: @end table
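
Several of the entries above ask the system for a value instead of stating it. The following sketch collects a few of these queries in one place; the helper @code{.env} is a hypothetical name introduced only for this example, and it handles single-cell results only (so it is not suitable for @code{MAX-D} or @code{MAX-UD}). The last two lines show how to choose the division rounding explicitly instead of relying on @code{/}.

```forth
\ .env is a hypothetical helper: print the name of a single-cell
\ environmental query and its value, or a note if the query is unknown.
: .env ( c-addr u -- )
  2dup type space
  environment? if . else ." (unknown)" then cr ;

s" /COUNTED-STRING" .env
s" ADDRESS-UNIT-BITS" .env
s" MAX-N" .env
s" FLOORED" .env

1 cells .  1 chars .  cr

\ Explicit choice of division rounding, independent of FLOORED:
-7 s>d 2 fm/mod . . cr   \ floored:   quotient -4, remainder 1
-7 s>d 2 sm/rem . . cr   \ symmetric: quotient -3, remainder -1
```

Testing the flag returned by @code{ENVIRONMENT?}, as the helper does, is more robust than @code{drop}ping it, because an unknown query returns only a false flag.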
5156:
5157: @c ---------------------------------------------------------------------
5158: @node core-ambcond, core-other, core-idef, The Core Words
5159: @subsection Ambiguous conditions
5160: @c ---------------------------------------------------------------------
5161: @cindex core words, ambiguous conditions
5162: @cindex ambiguous conditions, core words
5163:
5164: @table @i
5165:
5166: @item a name is neither a word nor a number:
5167: @cindex name not found
5168: @cindex Undefined word
5169: @code{-13 throw} (Undefined word). Actually, @code{-13 bounce}, which
5170: preserves the data and FP stack, so you don't lose more work than
5171: necessary.
5172:
5173: @item a definition name exceeds the maximum length allowed:
5174: @cindex Word name too long
5175: @code{-19 throw} (Word name too long)
5176:
5177: @item addressing a region not inside the various data spaces of the Forth system:
5178: @cindex Invalid memory address
5179: The stacks, code space and name space are accessible. Machine code space is
5180: typically readable. Accessing other addresses gives results dependent on
5181: the operating system. On decent systems: @code{-9 throw} (Invalid memory
5182: address).
5183:
5184: @item argument type incompatible with parameter:
5185: @cindex Argument type mismatch
5186: This is usually not caught. Some words perform checks, e.g., the control
5187: flow words, and issue an @code{ABORT"} or @code{-12 THROW} (Argument type
5188: mismatch).
5189:
5190: @item attempting to obtain the execution token of a word with undefined execution semantics:
5191: @cindex Interpreting a compile-only word, for @code{'} etc.
5192: @cindex execution token of words with undefined execution semantics
5193: @code{-14 throw} (Interpreting a compile-only word). In some cases, you
5194: get an execution token for @code{compile-only-error} (which performs a
5195: @code{-14 throw} when executed).
5196:
5197: @item dividing by zero:
5198: @cindex dividing by zero
5199: @cindex floating point unidentified fault, integer division
5200: @cindex divide by zero
5201: typically results in a @code{-55 throw} (floating point unidentified
5202: fault), although a @code{-10 throw} (divide by zero) would be more
5203: appropriate.
5204:
5205: @item insufficient data stack or return stack space:
5206: @cindex insufficient data stack or return stack space
5207: @cindex stack overflow
5208: @cindex Address alignment exception, stack overflow
5209: @cindex Invalid memory address, stack overflow
5210: Depending on the operating system, the installation, and the invocation
5211: of Gforth, this is either checked by the memory management hardware, or
5212: it is not checked. If it is checked, you typically get a @code{-9 throw}
5213: (Invalid memory address) as soon as the overflow happens. If it is not
5214: checked, overflows typically result in mysterious illegal memory accesses,
5215: producing @code{-9 throw} (Invalid memory address) or @code{-23 throw}
5216: (Address alignment exception); they might also destroy the internal data
5217: structure of @code{ALLOCATE} and friends, resulting in various errors in
5218: these words.
5219:
5220: @item insufficient space for loop control parameters:
5221: @cindex insufficient space for loop control parameters
5222: like other return stack overflows.
5223:
5224: @item insufficient space in the dictionary:
5225: @cindex insufficient space in the dictionary
5226: @cindex dictionary overflow
1.12 ! anton 5227: If you try to allot (either directly with @code{allot}, or indirectly
! 5228: with @code{,}, @code{create} etc.) more memory than available in the
! 5229: dictionary, you get a @code{-8 throw} (Dictionary overflow). If you try
! 5230: to access memory beyond the end of the dictionary, the results are
! 5231: similar to stack overflows.
1.1 anton 5232:
5233: @item interpreting a word with undefined interpretation semantics:
5234: @cindex interpreting a word with undefined interpretation semantics
5235: @cindex Interpreting a compile-only word
5236: For some words, we have defined interpretation semantics. For the
5237: others: @code{-14 throw} (Interpreting a compile-only word).
5238:
5239: @item modifying the contents of the input buffer or a string literal:
5240: @cindex modifying the contents of the input buffer or a string literal
5241: These are located in writable memory and can be modified.
5242:
5243: @item overflow of the pictured numeric output string:
5244: @cindex overflow of the pictured numeric output string
5245: @cindex pictured numeric output string, overflow
5246: Not checked. Runs into the dictionary and destroys it (at least,
5247: partially).
5248:
5249: @item parsed string overflow:
5250: @cindex parsed string overflow
5251: @code{PARSE} cannot overflow. @code{WORD} does not check for overflow.
5252:
5253: @item producing a result out of range:
5254: @cindex result out of range
5255: On two's complement machines, arithmetic is performed modulo
5256: 2**bits-per-cell for single arithmetic and 4**bits-per-cell for double
5257: arithmetic (with appropriate mapping for signed types). Division by zero
5258: typically results in a @code{-55 throw} (floating-point unidentified
5259: fault), although a @code{-10 throw} (divide by zero) would be more
5260: appropriate. @code{convert} and @code{>number} currently overflow
5261: silently.
5262:
5263: @item reading from an empty data or return stack:
5264: @cindex stack empty
5265: @cindex stack underflow
5266: The data stack is checked by the outer (aka text) interpreter after
5267: every word executed. If it has underflowed, a @code{-4 throw} (Stack
5268: underflow) is performed. Apart from that, stacks may be checked or not,
5269: depending on operating system, installation, and invocation. The
5270: consequences of stack underflows are similar to the consequences of
5271: stack overflows. Note that even if the system uses checking (through the
5272: MMU), your program may have to underflow by a significant number of
5273: stack items to trigger the reaction (the reason for this is that the
5274: MMU, and therefore the checking, works with a page-size granularity).
5275:
5276: @item unexpected end of the input buffer, resulting in an attempt to use a zero-length string as a name:
5277: @cindex unexpected end of the input buffer
5278: @cindex zero-length string as a name
5279: @cindex Attempt to use zero-length string as a name
5280: @code{Create} and its descendants perform a @code{-16 throw} (Attempt to
5281: use zero-length string as a name). Words like @code{'} probably will not
5282: find what they search for. Note that it is possible to create zero-length
5283: names with @code{nextname} (should it not?).
5284:
5285: @item @code{>IN} greater than input buffer:
5286: @cindex @code{>IN} greater than input buffer
5287: The next invocation of a parsing word returns a string with length 0.
5288:
5289: @item @code{RECURSE} appears after @code{DOES>}:
5290: @cindex @code{RECURSE} appears after @code{DOES>}
5291: Compiles a recursive call to the defining word, not to the defined word.
5292:
5293: @item argument input source different than current input source for @code{RESTORE-INPUT}:
5294: @cindex argument input source different than current input source for @code{RESTORE-INPUT}
5295: @cindex Argument type mismatch, @code{RESTORE-INPUT}
5296: @cindex @code{RESTORE-INPUT}, Argument type mismatch
5297: @code{-12 THROW}. Note that, once an input file is closed (e.g., because
5298: the end of the file was reached), its source-id may be
5299: reused. Therefore, restoring an input source specification referencing a
5300: closed file may lead to unpredictable results instead of a @code{-12
5301: THROW}.
5302:
5303: In the future, Gforth may be able to restore input source specifications
5304: from other than the current input source.
5305:
5306: @item data space containing definitions gets de-allocated:
5307: @cindex data space containing definitions gets de-allocated
5308: Deallocation with @code{allot} is not checked. This typically results in
5309: memory access faults or execution of illegal instructions.
5310:
5311: @item data space read/write with incorrect alignment:
5312: @cindex data space read/write with incorrect alignment
5313: @cindex alignment faults
5314: @cindex Address alignment exception
5315: Processor-dependent. Typically results in a @code{-23 throw} (Address
1.12 ! anton 5316: alignment exception). Under Linux-Intel on a 486 or later processor with
1.1 anton 5317: alignment turned on, incorrect alignment results in a @code{-9 throw}
5318: (Invalid memory address). There are reportedly some processors with
1.12 ! anton 5319: alignment restrictions that do not report violations.
1.1 anton 5320:
5321: @item data space pointer not properly aligned, @code{,}, @code{C,}:
5322: @cindex data space pointer not properly aligned, @code{,}, @code{C,}
5323: Like other alignment errors.
5324:
5325: @item less than u+2 stack items (@code{PICK} and @code{ROLL}):
5326: Like other stack underflows.
5327:
5328: @item loop control parameters not available:
5329: @cindex loop control parameters not available
5330: Not checked. The counted loop words simply assume that the top return
5331: stack items are loop control parameters and behave accordingly.
5332:
5333: @item most recent definition does not have a name (@code{IMMEDIATE}):
5334: @cindex most recent definition does not have a name (@code{IMMEDIATE})
5335: @cindex last word was headerless
5336: @code{abort" last word was headerless"}.
5337:
5338: @item name not defined by @code{VALUE} used by @code{TO}:
5339: @cindex name not defined by @code{VALUE} used by @code{TO}
5340: @cindex @code{TO} on non-@code{VALUE}s
5341: @cindex Invalid name argument, @code{TO}
5342: @code{-32 throw} (Invalid name argument) (unless name is a local or was
5343: defined by @code{CONSTANT}; in the latter case it just changes the constant).
5344:
5345: @item name not found (@code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]}):
5346: @cindex name not found (@code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]})
5347: @cindex Undefined word, @code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]}
5348: @code{-13 throw} (Undefined word)
5349:
5350: @item parameters are not of the same type (@code{DO}, @code{?DO}, @code{WITHIN}):
5351: @cindex parameters are not of the same type (@code{DO}, @code{?DO}, @code{WITHIN})
5352: Gforth behaves as if they were of the same type. I.e., you can predict
5353: the behaviour by interpreting all parameters as, e.g., signed.
5354:
5355: @item @code{POSTPONE} or @code{[COMPILE]} applied to @code{TO}:
5356: @cindex @code{POSTPONE} or @code{[COMPILE]} applied to @code{TO}
5357: Assume @code{: X POSTPONE TO ; IMMEDIATE}. @code{X} performs the
5358: compilation semantics of @code{TO}.
5359:
5360: @item String longer than a counted string returned by @code{WORD}:
5361: @cindex String longer than a counted string returned by @code{WORD}
5362: @cindex @code{WORD}, string overflow
5363: Not checked. The string will be ok, but the count will, of course,
5364: contain only the least significant bits of the length.
5365:
5366: @item u greater than or equal to the number of bits in a cell (@code{LSHIFT}, @code{RSHIFT}):
5367: @cindex @code{LSHIFT}, large shift counts
5368: @cindex @code{RSHIFT}, large shift counts
5369: Processor-dependent. Typical behaviours are returning 0 and using only
5370: the low bits of the shift count.
5371:
5372: @item word not defined via @code{CREATE}:
5373: @cindex @code{>BODY} of non-@code{CREATE}d words
5374: @code{>BODY} produces the PFA of the word no matter how it was defined.
5375:
5376: @cindex @code{DOES>} of non-@code{CREATE}d words
5377: @code{DOES>} changes the execution semantics of the last defined word no
5378: matter how it was defined. E.g., @code{CONSTANT DOES>} is equivalent to
5379: @code{CREATE , DOES>}.
5380:
5381: @item words improperly used outside @code{<#} and @code{#>}:
5382: Not checked. As usual, you can expect memory faults.
5383:
5384: @end table
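
The wrap-around behaviour described under @i{producing a result out of range} can be observed with a small sketch. It assumes a two's complement machine and that the @code{MAX-N} environmental query is available:

```forth
s" MAX-N" environment? drop  ( max-n )
1+                           ( n )  \ no overflow check is performed
0< .                         \ true flag: the sum wrapped to a negative value
```

No exception is raised here; the program has to detect such overflows itself if they matter.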
5385:
5386:
5387: @c ---------------------------------------------------------------------
5388: @node core-other, , core-ambcond, The Core Words
5389: @subsection Other system documentation
5390: @c ---------------------------------------------------------------------
5391: @cindex other system documentation, core words
5392: @cindex core words, other system documentation
5393:
5394: @table @i
5395: @item nonstandard words using @code{PAD}:
5396: @cindex @code{PAD} use by nonstandard words
5397: None.
5398:
5399: @item operator's terminal facilities available:
5400: @cindex operator's terminal facilities available
5401: After processing the command line, Gforth goes into interactive mode,
5402: and you can give commands to Gforth interactively. The actual facilities
5403: available depend on how you invoke Gforth.
5404:
5405: @item program data space available:
5406: @cindex program data space available
5407: @cindex data space available
5408: @code{UNUSED .} gives the remaining dictionary space. The total
5409: dictionary space can be specified with the @code{-m} switch
5410: (@pxref{Invoking Gforth}) when Gforth starts up.
5411:
5412: @item return stack space available:
5413: @cindex return stack space available
5414: You can compute the total return stack space in cells with
5415: @code{s" RETURN-STACK-CELLS" environment? drop .}. You can specify it at
5416: startup time with the @code{-r} switch (@pxref{Invoking Gforth}).
5417:
5418: @item stack space available:
5419: @cindex stack space available
5420: You can compute the total data stack space in cells with
5421: @code{s" STACK-CELLS" environment? drop .}. You can specify it at
5422: startup time with the @code{-d} switch (@pxref{Invoking Gforth}).
5423:
5424: @item system dictionary space required, in address units:
5425: @cindex system dictionary space required, in address units
5426: Type @code{here forthstart - .} after startup. At the time of this
5427: writing, this gives 80080 (bytes) on a 32-bit system.
5428: @end table
5429:
5430:
5431: @c =====================================================================
5432: @node The optional Block word set, The optional Double Number word set, The Core Words, ANS conformance
5433: @section The optional Block word set
5434: @c =====================================================================
5435: @cindex system documentation, block words
5436: @cindex block words, system documentation
5437:
5438: @menu
5439: * block-idef:: Implementation Defined Options
5440: * block-ambcond:: Ambiguous Conditions
5441: * block-other:: Other System Documentation
5442: @end menu
5443:
5444:
5445: @c ---------------------------------------------------------------------
5446: @node block-idef, block-ambcond, The optional Block word set, The optional Block word set
5447: @subsection Implementation Defined Options
5448: @c ---------------------------------------------------------------------
5449: @cindex implementation-defined options, block words
5450: @cindex block words, implementation-defined options
5451:
5452: @table @i
5453: @item the format for display by @code{LIST}:
5454: @cindex @code{LIST} display format
5455: First the screen number is displayed, then 16 lines of 64 characters,
5456: each line preceded by the line number.
5457:
5458: @item the length of a line affected by @code{\}:
5459: @cindex length of a line affected by @code{\}
5460: @cindex @code{\}, line length in blocks
5461: 64 characters.
5462: @end table
5463:
5464:
5465: @c ---------------------------------------------------------------------
5466: @node block-ambcond, block-other, block-idef, The optional Block word set
5467: @subsection Ambiguous conditions
5468: @c ---------------------------------------------------------------------
5469: @cindex block words, ambiguous conditions
5470: @cindex ambiguous conditions, block words
5471:
5472: @table @i
5473: @item correct block read was not possible:
5474: @cindex block read not possible
5475: Typically results in a @code{throw} of some OS-derived value (between
5476: -512 and -2048). If the blocks file is simply too short, blanks are
5477: supplied for the missing portion.
5478:
5479: @item I/O exception in block transfer:
5480: @cindex I/O exception in block transfer
5481: @cindex block transfer, I/O exception
5482: Typically results in a @code{throw} of some OS-derived value (between
5483: -512 and -2048).
5484:
5485: @item invalid block number:
5486: @cindex invalid block number
5487: @cindex block number invalid
5488: @code{-35 throw} (Invalid block number)
5489:
5490: @item a program directly alters the contents of @code{BLK}:
5491: @cindex @code{BLK}, altering @code{BLK}
5492: The input stream is switched to that other block, at the same
5493: position. If the store to @code{BLK} happens while interpreting
5494: non-block input, the system will get quite confused when the block ends.
5495:
5496: @item no current block buffer for @code{UPDATE}:
5497: @cindex @code{UPDATE}, no current block buffer
5498: @code{UPDATE} has no effect.
5499:
5500: @end table
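
Since failed block transfers are reported with @code{throw}, a program
can intercept them with @code{catch}. The following sketch assumes
nothing beyond the standard words; @code{try-block} and
@code{show-line0} are hypothetical names.

@example
\ try-block and show-line0 are hypothetical names.
: try-block ( u -- a-addr 0 | x ior )
  ['] block catch ;

: show-line0 ( u -- )
  try-block ?dup if
    nip ." cannot read block, throw code " . cr
  else
    64 type cr                     \ first 64-character line of the block
  then ;
@end example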
5501:
5502: @c ---------------------------------------------------------------------
5503: @node block-other, , block-ambcond, The optional Block word set
5504: @subsection Other system documentation
5505: @c ---------------------------------------------------------------------
5506: @cindex other system documentation, block words
5507: @cindex block words, other system documentation
5508:
5509: @table @i
5510: @item any restrictions a multiprogramming system places on the use of buffer addresses:
5511: No restrictions (yet).
5512:
5513: @item the number of blocks available for source and data:
5514: Depends on your disk space.
5515:
5516: @end table
5517:
5518:
5519: @c =====================================================================
5520: @node The optional Double Number word set, The optional Exception word set, The optional Block word set, ANS conformance
5521: @section The optional Double Number word set
5522: @c =====================================================================
5523: @cindex system documentation, double words
5524: @cindex double words, system documentation
5525:
5526: @menu
5527: * double-ambcond:: Ambiguous Conditions
5528: @end menu
5529:
5530:
5531: @c ---------------------------------------------------------------------
5532: @node double-ambcond, , The optional Double Number word set, The optional Double Number word set
5533: @subsection Ambiguous conditions
5534: @c ---------------------------------------------------------------------
5535: @cindex double words, ambiguous conditions
5536: @cindex ambiguous conditions, double words
5537:
5538: @table @i
5539: @item @var{d} outside of range of @var{n} in @code{D>S}:
5540: @cindex @code{D>S}, @var{d} out of range of @var{n}
5541: The least significant cell of @var{d} is produced.
5542:
5543: @end table
5544:
5545:
5546: @c =====================================================================
5547: @node The optional Exception word set, The optional Facility word set, The optional Double Number word set, ANS conformance
5548: @section The optional Exception word set
5549: @c =====================================================================
5550: @cindex system documentation, exception words
5551: @cindex exception words, system documentation
5552:
5553: @menu
5554: * exception-idef:: Implementation Defined Options
5555: @end menu
5556:
5557:
5558: @c ---------------------------------------------------------------------
5559: @node exception-idef, , The optional Exception word set, The optional Exception word set
5560: @subsection Implementation Defined Options
5561: @c ---------------------------------------------------------------------
5562: @cindex implementation-defined options, exception words
5563: @cindex exception words, implementation-defined options
5564:
5565: @table @i
5566: @item @code{THROW}-codes used in the system:
5567: @cindex @code{THROW}-codes used in the system
5568: The codes -256@minus{}-511 are used for reporting signals. The mapping
5569: from OS signal numbers to throw codes is -256@minus{}@var{signal}. The
5570: codes -512@minus{}-2047 are used for OS errors (for file and memory
5571: allocation operations). The mapping from OS error numbers to throw codes
5572: is -512@minus{}@code{errno}. One side effect of this mapping is that
5573: undefined OS errors produce a message with a strange number; e.g.,
5574: @code{-1000 THROW} results in @code{Unknown error 488} on my system.
5575: @end table
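
The two mappings described above are plain subtractions and can be
written down directly. The word names in this sketch are hypothetical;
Gforth does not necessarily provide words with these names.

@example
\ Hypothetical helper words illustrating the mappings.
: signal>throw ( signal -- n )      -256 swap - ;   \ -256 - signal
: errno>throw  ( errno -- n )       -512 swap - ;   \ -512 - errno
: throw>errno  ( n -- errno )       -512 swap - ;   \ inverse mapping
@end example

With these definitions, @code{-1000 throw>errno .} prints 488, the
number that appears in the @code{Unknown error} message mentioned above,
and @code{2 errno>throw .} prints -514.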
5576:
5577: @c =====================================================================
5578: @node The optional Facility word set, The optional File-Access word set, The optional Exception word set, ANS conformance
5579: @section The optional Facility word set
5580: @c =====================================================================
5581: @cindex system documentation, facility words
5582: @cindex facility words, system documentation
5583:
5584: @menu
5585: * facility-idef:: Implementation Defined Options
5586: * facility-ambcond:: Ambiguous Conditions
5587: @end menu
5588:
5589:
5590: @c ---------------------------------------------------------------------
5591: @node facility-idef, facility-ambcond, The optional Facility word set, The optional Facility word set
5592: @subsection Implementation Defined Options
5593: @c ---------------------------------------------------------------------
5594: @cindex implementation-defined options, facility words
5595: @cindex facility words, implementation-defined options
5596:
5597: @table @i
5598: @item encoding of keyboard events (@code{EKEY}):
5599: @cindex keyboard events, encoding in @code{EKEY}
5600: @cindex @code{EKEY}, encoding of keyboard events
5601: Not yet implemented.
5602:
5603: @item duration of a system clock tick:
5604: @cindex duration of a system clock tick
5605: @cindex clock tick duration
5606: System dependent. With respect to @code{MS}, the time is specified in
5607: microseconds. How well the OS and the hardware implement this is
5608: another question.
5609:
5610: @item repeatability to be expected from the execution of @code{MS}:
5611: @cindex repeatability to be expected from the execution of @code{MS}
5612: @cindex @code{MS}, repeatability to be expected
5613: System dependent. On Unix, a lot depends on load. If the system is
5614: lightly loaded, and the delay is short enough that Gforth does not get
5615: swapped out, the performance should be acceptable. Under MS-DOS and
5616: other single-tasking systems, it should be good.
5617:
5618: @end table
5619:
5620:
5621: @c ---------------------------------------------------------------------
5622: @node facility-ambcond, , facility-idef, The optional Facility word set
5623: @subsection Ambiguous conditions
5624: @c ---------------------------------------------------------------------
5625: @cindex facility words, ambiguous conditions
5626: @cindex ambiguous conditions, facility words
5627:
5628: @table @i
5629: @item @code{AT-XY} can't be performed on user output device:
5630: @cindex @code{AT-XY} can't be performed on user output device
5631: Largely terminal dependent. No range checks are done on the arguments.
5632: No errors are reported. You may see some garbage appear, or you may
5633: see nothing happen at all.
5634:
5635: @end table
5636:
5637:
5638: @c =====================================================================
5639: @node The optional File-Access word set, The optional Floating-Point word set, The optional Facility word set, ANS conformance
5640: @section The optional File-Access word set
5641: @c =====================================================================
5642: @cindex system documentation, file words
5643: @cindex file words, system documentation
5644:
5645: @menu
5646: * file-idef:: Implementation Defined Options
5647: * file-ambcond:: Ambiguous Conditions
5648: @end menu
5649:
5650: @c ---------------------------------------------------------------------
5651: @node file-idef, file-ambcond, The optional File-Access word set, The optional File-Access word set
5652: @subsection Implementation Defined Options
5653: @c ---------------------------------------------------------------------
5654: @cindex implementation-defined options, file words
5655: @cindex file words, implementation-defined options
5656:
5657: @table @i
5658: @item file access methods used:
5659: @cindex file access methods used
5660: @code{R/O}, @code{R/W} and @code{BIN} work as you would
5661: expect. @code{W/O} translates into the C file opening mode @code{w} (or
5662: @code{wb}): The file is cleared, if it exists, and created, if it does
5663: not (with both @code{open-file} and @code{create-file}). Under Unix
5664: @code{create-file} creates a file with 666 permissions modified by your
5665: umask.
5666:
5667: @item file exceptions:
5668: @cindex file exceptions
5669: The file words do not raise exceptions (except, perhaps, memory access
5670: faults when you pass illegal addresses or file-ids).
5671:
5672: @item file line terminator:
5673: @cindex file line terminator
5674: System-dependent. Gforth uses C's newline character as line
5675: terminator. The actual character code(s) of that newline are
5676: system-dependent.
5677:
5678: @item file name format:
5679: @cindex file name format
5680: System dependent. Gforth just uses the file name format of your OS.
5681:
5682: @item information returned by @code{FILE-STATUS}:
5683: @cindex @code{FILE-STATUS}, returned information
5684: @code{FILE-STATUS} returns the most powerful file access mode allowed
5685: for the file: Either @code{R/O}, @code{W/O} or @code{R/W}. If the file
5686: cannot be accessed, @code{R/O BIN} is returned. @code{BIN} is applicable
5687: along with the returned mode.
5688:
5689: @item input file state after an exception when including source:
5690: @cindex exception when including source
5691: All files that are left via the exception are closed.
5692:
5693: @item @var{ior} values and meaning:
5694: @cindex @var{ior} values and meaning
5695: The @var{ior}s returned by the file and memory allocation words are
5696: intended as throw codes. They typically are in the range
5697: -512@minus{}-2047 of OS errors. The mapping from OS error numbers to
5698: @var{ior}s is -512@minus{}@var{errno}.
5699:
5700: @item maximum depth of file input nesting:
5701: @cindex maximum depth of file input nesting
5702: @cindex file input nesting, maximum depth
5703: Limited by the amount of return stack, locals/TIB stack, and the number
5704: of open files available. This should not cause you any trouble.
5705:
5706: @item maximum size of input line:
5707: @cindex maximum size of input line
5708: @cindex input line size, maximum
5709: @code{/line}. Currently 255.
5710:
5711: @item methods of mapping block ranges to files:
5712: @cindex mapping block ranges to files
5713: @cindex files containing blocks
5714: @cindex blocks in files
5715: By default, blocks are accessed in the file @file{blocks.fb} in the
5716: current working directory. The file can be switched with @code{USE}.
5717:
5718: @item number of string buffers provided by @code{S"}:
5719: @cindex @code{S"}, number of string buffers
5720: 1
5721:
5722: @item size of string buffer used by @code{S"}:
5723: @cindex @code{S"}, size of string buffer
5724: @code{/line}. Currently 255.
5725:
5726: @end table
5727:
5728: @c ---------------------------------------------------------------------
5729: @node file-ambcond, , file-idef, The optional File-Access word set
5730: @subsection Ambiguous conditions
5731: @c ---------------------------------------------------------------------
5732: @cindex file words, ambiguous conditions
5733: @cindex ambiguous conditions, file words
5734:
5735: @table @i
5736: @item attempting to position a file outside its boundaries:
5737: @cindex @code{REPOSITION-FILE}, outside the file's boundaries
5738: @code{REPOSITION-FILE} is performed as usual: Afterwards,
5739: @code{FILE-POSITION} returns the value given to @code{REPOSITION-FILE}.
5740:
5741: @item attempting to read from file positions not yet written:
5742: @cindex reading from file positions not yet written
5743: End-of-file, i.e., zero characters are read and no error is reported.
5744:
5745: @item @var{file-id} is invalid (@code{INCLUDE-FILE}):
5746: @cindex @code{INCLUDE-FILE}, @var{file-id} is invalid
5747: An appropriate exception may be thrown, but a memory fault or other
5748: problem is more probable.
5749:
5750: @item I/O exception reading or closing @var{file-id} (@code{INCLUDE-FILE}, @code{INCLUDED}):
5751: @cindex @code{INCLUDE-FILE}, I/O exception reading or closing @var{file-id}
5752: @cindex @code{INCLUDED}, I/O exception reading or closing @var{file-id}
5753: The @var{ior} produced by the operation that discovered the problem is
5754: thrown.
5755:
5756: @item named file cannot be opened (@code{INCLUDED}):
5757: @cindex @code{INCLUDED}, named file cannot be opened
5758: The @var{ior} produced by @code{open-file} is thrown.
5759:
5760: @item requesting an unmapped block number:
5761: @cindex unmapped block numbers
5762: There are no unmapped legal block numbers. On some operating systems,
5763: writing a block with a large number may overflow the file system and
5764: result in an error message.
5765:
5766: @item using @code{source-id} when @code{blk} is non-zero:
5767: @cindex @code{SOURCE-ID}, behaviour when @code{BLK} is non-zero
5768: @code{source-id} performs its function. Typically it will give the id of
5769: the source which loaded the block. (Better ideas?)
5770:
5771: @end table
5772:
5773:
5774: @c =====================================================================
5775: @node The optional Floating-Point word set, The optional Locals word set, The optional File-Access word set, ANS conformance
5776: @section The optional Floating-Point word set
5777: @c =====================================================================
5778: @cindex system documentation, floating-point words
5779: @cindex floating-point words, system documentation
5780:
5781: @menu
5782: * floating-idef:: Implementation Defined Options
5783: * floating-ambcond:: Ambiguous Conditions
5784: @end menu
5785:
5786:
5787: @c ---------------------------------------------------------------------
5788: @node floating-idef, floating-ambcond, The optional Floating-Point word set, The optional Floating-Point word set
5789: @subsection Implementation Defined Options
5790: @c ---------------------------------------------------------------------
5791: @cindex implementation-defined options, floating-point words
5792: @cindex floating-point words, implementation-defined options
5793:
5794: @table @i
5795: @item format and range of floating point numbers:
5796: @cindex format and range of floating point numbers
5797: @cindex floating point numbers, format and range
5798: System-dependent; the @code{double} type of C.
5799:
5800: @item results of @code{REPRESENT} when @var{float} is out of range:
5801: @cindex @code{REPRESENT}, results when @var{float} is out of range
5802: System dependent; @code{REPRESENT} is implemented using the C library
5803: function @code{ecvt()} and inherits its behaviour in this respect.
5804:
5805: @item rounding or truncation of floating-point numbers:
5806: @cindex rounding of floating-point numbers
5807: @cindex truncation of floating-point numbers
5808: @cindex floating-point numbers, rounding or truncation
5809: System dependent; the rounding behaviour is inherited from the hosting C
5810: compiler. IEEE-FP-based (i.e., most) systems by default round to
5811: nearest, and break ties by rounding to even (i.e., such that the last
5812: bit of the mantissa is 0).
5813:
5814: @item size of floating-point stack:
5815: @cindex floating-point stack size
5816: @code{s" FLOATING-STACK" environment? drop .} gives the total size of
5817: the floating-point stack (in floats). You can specify this on startup
5818: with the command-line option @code{-f} (@pxref{Invoking Gforth}).
5819:
5820: @item width of floating-point stack:
5821: @cindex floating-point stack width
5822: @code{1 floats}.
5823:
5824: @end table
5825:
5826:
5827: @c ---------------------------------------------------------------------
5828: @node floating-ambcond, , floating-idef, The optional Floating-Point word set
5829: @subsection Ambiguous conditions
5830: @c ---------------------------------------------------------------------
5831: @cindex floating-point words, ambiguous conditions
5832: @cindex ambiguous conditions, floating-point words
5833:
5834: @table @i
5835: @item @code{df@@} or @code{df!} used with an address that is not double-float aligned:
5836: @cindex @code{df@@} or @code{df!} used with an address that is not double-float aligned
5837: System-dependent. Typically results in a @code{-23 THROW} like other
5838: alignment violations.
5839:
5840: @item @code{f@@} or @code{f!} used with an address that is not float aligned:
5841: @cindex @code{f@@} used with an address that is not float aligned
5842: @cindex @code{f!} used with an address that is not float aligned
5843: System-dependent. Typically results in a @code{-23 THROW} like other
5844: alignment violations.
5845:
5846: @item floating-point result out of range:
5847: @cindex floating-point result out of range
5848: System-dependent. Can result in a @code{-55 THROW} (Floating-point
5849: unidentified fault), or can produce a special value representing, e.g.,
5850: Infinity.
5851:
5852: @item @code{sf@@} or @code{sf!} used with an address that is not single-float aligned:
5853: @cindex @code{sf@@} or @code{sf!} used with an address that is not single-float aligned
5854: System-dependent. Typically results in an alignment fault like other
5855: alignment violations.
5856:
5857: @item @code{BASE} is not decimal (@code{REPRESENT}, @code{F.}, @code{FE.}, @code{FS.}):
5858: @cindex @code{BASE} is not decimal (@code{REPRESENT}, @code{F.}, @code{FE.}, @code{FS.})
5859: The floating-point number is converted into decimal nonetheless.
5860:
5861: @item Both arguments are equal to zero (@code{FATAN2}):
5862: @cindex @code{FATAN2}, both arguments are equal to zero
5863: System-dependent. @code{FATAN2} is implemented using the C library
5864: function @code{atan2()}.
5865:
5866: @item Using @code{FTAN} on an argument @var{r1} where cos(@var{r1}) is zero:
5867: @cindex @code{FTAN} on an argument @var{r1} where cos(@var{r1}) is zero
5868: System-dependent. Typically, the cosine of @var{r1} will not be exactly
5869: zero because of small rounding errors, and the tangent will be a very
5870: large (positive or negative) but finite number.
5871:
5872: @item @var{d} cannot be presented precisely as a float in @code{D>F}:
5873: @cindex @code{D>F}, @var{d} cannot be presented precisely as a float
5874: The result is rounded to the nearest float.
5875:
5876: @item dividing by zero:
5877: @cindex dividing by zero, floating-point
5878: @cindex floating-point dividing by zero
5879: @cindex floating-point unidentified fault, FP divide-by-zero
5880: @code{-55 throw} (Floating-point unidentified fault)
5881:
5882: @item exponent too big for conversion (@code{DF!}, @code{DF@@}, @code{SF!}, @code{SF@@}):
5883: @cindex exponent too big for conversion (@code{DF!}, @code{DF@@}, @code{SF!}, @code{SF@@})
5884: System dependent. On IEEE-FP based systems the number is converted into
5885: an infinity.
5886:
5887: @item @var{float}<1 (@code{FACOSH}):
5888: @cindex @code{FACOSH}, @var{float}<1
5889: @cindex floating-point unidentified fault, @code{FACOSH}
5890: @code{-55 throw} (Floating-point unidentified fault)
5891:
5892: @item @var{float}=<-1 (@code{FLNP1}):
5893: @cindex @code{FLNP1}, @var{float}=<-1
5894: @cindex floating-point unidentified fault, @code{FLNP1}
5895: @code{-55 throw} (Floating-point unidentified fault). On IEEE-FP systems
5896: negative infinity is typically produced for @var{float}=-1.
5897:
5898: @item @var{float}=<0 (@code{FLN}, @code{FLOG}):
5899: @cindex @code{FLN}, @var{float}=<0
5900: @cindex @code{FLOG}, @var{float}=<0
5901: @cindex floating-point unidentified fault, @code{FLN} or @code{FLOG}
5902: @code{-55 throw} (Floating-point unidentified fault). On IEEE-FP systems
5903: negative infinity is typically produced for @var{float}=0.
5904:
5905: @item @var{float}<0 (@code{FASINH}, @code{FSQRT}):
5906: @cindex @code{FASINH}, @var{float}<0
5907: @cindex @code{FSQRT}, @var{float}<0
5908: @cindex floating-point unidentified fault, @code{FASINH} or @code{FSQRT}
5909: @code{-55 throw} (Floating-point unidentified fault). @code{fasinh}
5910: produces values for these inputs on my Linux box (Bug in the C library?)
5911:
5912: @item |@var{float}|>1 (@code{FACOS}, @code{FASIN}, @code{FATANH}):
5913: @cindex @code{FACOS}, |@var{float}|>1
5914: @cindex @code{FASIN}, |@var{float}|>1
5915: @cindex @code{FATANH}, |@var{float}|>1
5916: @cindex floating-point unidentified fault, @code{FACOS}, @code{FASIN} or @code{FATANH}
5917: @code{-55 throw} (Floating-point unidentified fault).
5918:
5919: @item integer part of float cannot be represented by @var{d} in @code{F>D}:
5920: @cindex @code{F>D}, integer part of float cannot be represented by @var{d}
5921: @cindex floating-point unidentified fault, @code{F>D}
5922: @code{-55 throw} (Floating-point unidentified fault).
5923:
5924: @item string larger than pictured numeric output area (@code{f.}, @code{fe.}, @code{fs.}):
5925: @cindex string larger than pictured numeric output area (@code{f.}, @code{fe.}, @code{fs.})
5926: This does not happen.
5927: @end table
5928:
5929: @c =====================================================================
5930: @node The optional Locals word set, The optional Memory-Allocation word set, The optional Floating-Point word set, ANS conformance
5931: @section The optional Locals word set
5932: @c =====================================================================
5933: @cindex system documentation, locals words
5934: @cindex locals words, system documentation
5935:
5936: @menu
5937: * locals-idef:: Implementation Defined Options
5938: * locals-ambcond:: Ambiguous Conditions
5939: @end menu
5940:
5941:
5942: @c ---------------------------------------------------------------------
5943: @node locals-idef, locals-ambcond, The optional Locals word set, The optional Locals word set
5944: @subsection Implementation Defined Options
5945: @c ---------------------------------------------------------------------
5946: @cindex implementation-defined options, locals words
5947: @cindex locals words, implementation-defined options
5948:
5949: @table @i
5950: @item maximum number of locals in a definition:
5951: @cindex maximum number of locals in a definition
5952: @cindex locals, maximum number in a definition
5953: @code{s" #locals" environment? drop .}. Currently 15. This is a lower
5954: bound; e.g., on a 32-bit machine there can be 41 locals of up to 8
5955: characters. The number of locals in a definition is bounded by the size
5956: of locals-buffer, which contains the names of the locals.
5957:
5958: @end table
5959:
5960:
5961: @c ---------------------------------------------------------------------
5962: @node locals-ambcond, , locals-idef, The optional Locals word set
5963: @subsection Ambiguous conditions
5964: @c ---------------------------------------------------------------------
5965: @cindex locals words, ambiguous conditions
5966: @cindex ambiguous conditions, locals words
5967:
5968: @table @i
5969: @item executing a named local in interpretation state:
5970: @cindex local in interpretation state
5971: @cindex Interpreting a compile-only word, for a local
5972: Locals have no interpretation semantics. If you try to perform the
5973: interpretation semantics, you will get a @code{-14 throw} somewhere
5974: (Interpreting a compile-only word). If you perform the compilation
5975: semantics, the locals access will be compiled (irrespective of state).
5976:
5977: @item @var{name} not defined by @code{VALUE} or @code{(LOCAL)} (@code{TO}):
5978: @cindex name not defined by @code{VALUE} or @code{(LOCAL)} used by @code{TO}
5979: @cindex @code{TO} on non-@code{VALUE}s and non-locals
5980: @cindex Invalid name argument, @code{TO}
5981: @code{-32 throw} (Invalid name argument)
5982:
5983: @end table
5984:
5985:
5986: @c =====================================================================
5987: @node The optional Memory-Allocation word set, The optional Programming-Tools word set, The optional Locals word set, ANS conformance
5988: @section The optional Memory-Allocation word set
5989: @c =====================================================================
5990: @cindex system documentation, memory-allocation words
5991: @cindex memory-allocation words, system documentation
5992:
5993: @menu
5994: * memory-idef:: Implementation Defined Options
5995: @end menu
5996:
5997:
5998: @c ---------------------------------------------------------------------
5999: @node memory-idef, , The optional Memory-Allocation word set, The optional Memory-Allocation word set
6000: @subsection Implementation Defined Options
6001: @c ---------------------------------------------------------------------
6002: @cindex implementation-defined options, memory-allocation words
6003: @cindex memory-allocation words, implementation-defined options
6004:
6005: @table @i
6006: @item values and meaning of @var{ior}:
6007: @cindex @var{ior} values and meaning
6008: The @var{ior}s returned by the file and memory allocation words are
6009: intended as throw codes. They typically are in the range
6010: -512@minus{}-2047 of OS errors. The mapping from OS error numbers to
6011: @var{ior}s is -512@minus{}@var{errno}.
6012:
6013: @end table
6014:
6015: @c =====================================================================
6016: @node The optional Programming-Tools word set, The optional Search-Order word set, The optional Memory-Allocation word set, ANS conformance
6017: @section The optional Programming-Tools word set
6018: @c =====================================================================
6019: @cindex system documentation, programming-tools words
6020: @cindex programming-tools words, system documentation
6021:
6022: @menu
6023: * programming-idef:: Implementation Defined Options
6024: * programming-ambcond:: Ambiguous Conditions
6025: @end menu
6026:
6027:
6028: @c ---------------------------------------------------------------------
6029: @node programming-idef, programming-ambcond, The optional Programming-Tools word set, The optional Programming-Tools word set
6030: @subsection Implementation Defined Options
6031: @c ---------------------------------------------------------------------
6032: @cindex implementation-defined options, programming-tools words
6033: @cindex programming-tools words, implementation-defined options
6034:
6035: @table @i
6036: @item ending sequence for input following @code{;CODE} and @code{CODE}:
6037: @cindex @code{;CODE} ending sequence
6038: @cindex @code{CODE} ending sequence
6039: @code{END-CODE}
6040:
6041: @item manner of processing input following @code{;CODE} and @code{CODE}:
6042: @cindex @code{;CODE}, processing input
6043: @cindex @code{CODE}, processing input
6044: The @code{ASSEMBLER} vocabulary is pushed on the search order stack, and
6045: the input is processed by the text interpreter, (starting) in interpret
6046: state.
6047:
6048: @item search order capability for @code{EDITOR} and @code{ASSEMBLER}:
6049: @cindex @code{ASSEMBLER}, search order capability
6050: The ANS Forth search order word set.
6051:
6052: @item source and format of display by @code{SEE}:
6053: @cindex @code{SEE}, source and format of output
6054: The source for @code{see} is the intermediate code used by the inner
6055: interpreter. The current @code{see} tries to output Forth source code
6056: as well as possible.
6057:
6058: @end table
6059:
6060: @c ---------------------------------------------------------------------
6061: @node programming-ambcond, , programming-idef, The optional Programming-Tools word set
6062: @subsection Ambiguous conditions
6063: @c ---------------------------------------------------------------------
6064: @cindex programming-tools words, ambiguous conditions
6065: @cindex ambiguous conditions, programming-tools words
6066:
6067: @table @i
6068:
6069: @item deleting the compilation wordlist (@code{FORGET}):
6070: @cindex @code{FORGET}, deleting the compilation wordlist
6071: Not implemented (yet).
6072:
6073: @item fewer than @var{u}+1 items on the control flow stack (@code{CS-PICK}, @code{CS-ROLL}):
6074: @cindex @code{CS-PICK}, fewer than @var{u}+1 items on the control flow stack
6075: @cindex @code{CS-ROLL}, fewer than @var{u}+1 items on the control flow stack
6076: @cindex control-flow stack underflow
6077: This typically results in an @code{abort"} with a descriptive error
6078: message (may change into a @code{-22 throw} (Control structure mismatch)
6079: in the future). You may also get a memory access error. If you are
6080: unlucky, this ambiguous condition is not caught.
6081:
6082: @item @var{name} can't be found (@code{FORGET}):
6083: @cindex @code{FORGET}, @var{name} can't be found
6084: Not implemented (yet).
6085:
6086: @item @var{name} not defined via @code{CREATE}:
6087: @cindex @code{;CODE}, @var{name} not defined via @code{CREATE}
6088: @code{;CODE} behaves like @code{DOES>} in this respect, i.e., it changes
6089: the execution semantics of the last defined word no matter how it was
6090: defined.
6091:
6092: @item @code{POSTPONE} applied to @code{[IF]}:
6093: @cindex @code{POSTPONE} applied to @code{[IF]}
6094: @cindex @code{[IF]} and @code{POSTPONE}
6095: After defining @code{: X POSTPONE [IF] ; IMMEDIATE}, @code{X} is
6096: equivalent to @code{[IF]}.
6097:
6098: @item reaching the end of the input source before matching @code{[ELSE]} or @code{[THEN]}:
6099: @cindex @code{[IF]}, end of the input source before matching @code{[ELSE]} or @code{[THEN]}
6100: Continue in the same state of conditional compilation in the next outer
6101: input source. Currently there is no warning to the user about this.
6102:
6103: @item removing a needed definition (@code{FORGET}):
6104: @cindex @code{FORGET}, removing a needed definition
6105: Not implemented (yet).
6106:
6107: @end table
6108:
6109:
6110: @c =====================================================================
6111: @node The optional Search-Order word set, , The optional Programming-Tools word set, ANS conformance
6112: @section The optional Search-Order word set
6113: @c =====================================================================
6114: @cindex system documentation, search-order words
6115: @cindex search-order words, system documentation
6116:
6117: @menu
6118: * search-idef:: Implementation Defined Options
6119: * search-ambcond:: Ambiguous Conditions
6120: @end menu
6121:
6122:
6123: @c ---------------------------------------------------------------------
6124: @node search-idef, search-ambcond, The optional Search-Order word set, The optional Search-Order word set
6125: @subsection Implementation Defined Options
6126: @c ---------------------------------------------------------------------
6127: @cindex implementation-defined options, search-order words
6128: @cindex search-order words, implementation-defined options
6129:
6130: @table @i
6131: @item maximum number of word lists in search order:
6132: @cindex maximum number of word lists in search order
6133: @cindex search order, maximum depth
6134: @code{s" wordlists" environment? drop .}. Currently 16.
6135:
6136: @item minimum search order:
6137: @cindex minimum search order
6138: @cindex search order, minimum
6139: @code{root root}.
6140:
6141: @end table
6142:
6143: @c ---------------------------------------------------------------------
6144: @node search-ambcond, , search-idef, The optional Search-Order word set
6145: @subsection Ambiguous conditions
6146: @c ---------------------------------------------------------------------
6147: @cindex search-order words, ambiguous conditions
6148: @cindex ambiguous conditions, search-order words
6149:
6150: @table @i
6151: @item changing the compilation wordlist (during compilation):
6152: @cindex changing the compilation wordlist (during compilation)
6153: @cindex compilation wordlist, change before definition ends
6154: The word is entered into the wordlist that was the compilation wordlist
6155: at the start of the definition. Any changes to the name field (e.g.,
6156: @code{immediate}) or the code field (e.g., when executing @code{DOES>})
6157: are applied to the latest defined word (as reported by @code{last} or
6158: @code{lastxt}), if possible, irrespective of the compilation wordlist.
6159:
6160: @item search order empty (@code{previous}):
6161: @cindex @code{previous}, search order empty
6162: @cindex Vocstack empty, @code{previous}
6163: @code{abort" Vocstack empty"}.
6164:
6165: @item too many word lists in search order (@code{also}):
6166: @cindex @code{also}, too many word lists in search order
6167: @cindex Vocstack full, @code{also}
6168: @code{abort" Vocstack full"}.
6169:
6170: @end table
6171:
6172: @c ***************************************************************
6173: @node Model, Integrating Gforth, ANS conformance, Top
6174: @chapter Model
6175:
6176: This chapter has yet to be written. It will describe the internal
6177: structures you can rely on.
6178:
6179: @c ***************************************************************
6180: @node Integrating Gforth, Emacs and Gforth, Model, Top
6181: @chapter Integrating Gforth into C programs
6182:
6183: This is not yet implemented.
6184:
6185: Several people like to use Forth as a scripting language for applications
6186: that are otherwise written in C, C++, or some other language.
6187:
6188: The Forth system ATLAST provides facilities for embedding it into
6189: applications; unfortunately it has several disadvantages: most
6190: importantly, it is not based on ANS Forth, and it is apparently dead
6191: (i.e., not developed further and not supported). The facilities
6192: provided by Gforth in this area are inspired by ATLAST's facilities, so
6193: making the switch should not be hard.
6194:
6195: We also tried to design the interface such that it can easily be
6196: implemented by other Forth systems, so that we may one day arrive at a
6197: standardized interface. Such a standard interface would allow you to
6198: replace the Forth system without having to rewrite C code.
6199:
6200: You embed the Gforth interpreter by linking with the library
6201: @code{libgforth.a} (give the compiler the option @code{-lgforth}). All
6202: global symbols in this library that belong to the interface have the
6203: prefix @code{forth_}. (Global symbols that are used internally have the
6204: prefix @code{gforth_}.)
6205:
6206: You can include the declarations of Forth types and the functions and
6207: variables of the interface with @code{#include <forth.h>}.
6208:
6209: Types.
6210:
6211: Variables.
6212:
6213: Data and FP Stack pointer. Area sizes.
6214:
6215: Functions.
6216:
6217: forth_init(imagefile)
6218: forth_evaluate(string) exceptions?
6219: forth_goto(address) (or forth_execute(xt)?)
6220: forth_continue() (a coroutining mechanism)
6221:
6222: Adding primitives.
6223:
6224: No checking.
6225:
6226: Signals?
6227:
6228: Accessing the Stacks
6229:
6230: @node Emacs and Gforth, Image Files, Integrating Gforth, Top
6231: @chapter Emacs and Gforth
6232: @cindex Emacs and Gforth
6233:
6234: @cindex @file{gforth.el}
6235: @cindex @file{forth.el}
6236: @cindex Rydqvist, Goran
6237: @cindex comment editing commands
6238: @cindex @code{\}, editing with Emacs
6239: @cindex debug tracer editing commands
6240: @cindex @code{~~}, removal with Emacs
6241: @cindex Forth mode in Emacs
6242: Gforth comes with @file{gforth.el}, an improved version of
6243: @file{forth.el} by Goran Rydqvist (included in the TILE package). The
6244: improvements are a better (but still not perfect) handling of
6245: indentation. I have also added comment paragraph filling (@kbd{M-q}),
6246: commenting (@kbd{C-x \}) and uncommenting (@kbd{C-u C-x \}) regions and
6247: removing debugging tracers (@kbd{C-x ~}, @pxref{Debugging}). I left the
6248: stuff I do not use alone, even though some of it only makes sense for
6249: TILE. To get a description of these features, enter Forth mode and type
6250: @kbd{C-h m}.
6251:
6252: @cindex source location of error or debugging output in Emacs
6253: @cindex error output, finding the source location in Emacs
6254: @cindex debugging output, finding the source location in Emacs
6255: In addition, Gforth supports Emacs quite well: The source code locations
6256: given in error messages, debugging output (from @code{~~}) and failed
6257: assertion messages are in the right format for Emacs' compilation mode
6258: (@pxref{Compilation, , Running Compilations under Emacs, emacs, Emacs
6259: Manual}) so the source location corresponding to an error or other
6260: message is only a few keystrokes away (@kbd{C-x `} for the next error,
6261: @kbd{C-c C-c} for the error under the cursor).
6262:
6263: @cindex @file{TAGS} file
6264: @cindex @file{etags.fs}
6265: @cindex viewing the source of a word in Emacs
6266: Also, if you @code{include} @file{etags.fs}, a new @file{TAGS} file
6267: (@pxref{Tags, , Tags Tables, emacs, Emacs Manual}) will be produced that
6268: contains the definitions of all words defined afterwards. You can then
6269: find the source for a word using @kbd{M-.}. Note that Emacs can use
6270: several tags files at the same time (e.g., one for the Gforth sources
6271: and one for your program, @pxref{Select Tags Table,,Selecting a Tags
6272: Table,emacs, Emacs Manual}). The TAGS file for the preloaded words is
6273: @file{$(datadir)/gforth/$(VERSION)/TAGS} (e.g.,
6274: @file{/usr/local/share/gforth/0.2.0/TAGS}).
6275:
6276: @cindex @file{.emacs}
6277: To get all these benefits, add the following lines to your @file{.emacs}
6278: file:
6279:
6280: @example
6281: (autoload 'forth-mode "gforth.el")
6282: (setq auto-mode-alist (cons '("\\.fs\\'" . forth-mode) auto-mode-alist))
6283: @end example
6284:
6285: @node Image Files, Engine, Emacs and Gforth, Top
6286: @chapter Image Files
6287: @cindex image files
6288: @cindex @code{.fi} files
6289: @cindex precompiled Forth code
6290: @cindex dictionary in persistent form
6291: @cindex persistent form of dictionary
6292:
6293: An image file is a file containing an image of the Forth dictionary,
6294: i.e., compiled Forth code and data residing in the dictionary. By
6295: convention, we use the extension @code{.fi} for image files.
6296:
6297: @menu
6298: * Image File Background:: Why have image files?
6299: * Non-Relocatable Image Files:: don't always work.
6300: * Data-Relocatable Image Files:: are better.
6301: * Fully Relocatable Image Files:: better yet.
6302: * Stack and Dictionary Sizes:: Setting the default sizes for an image.
6303: * Running Image Files:: @code{gforth -i @var{file}} or @var{file}.
6304: * Modifying the Startup Sequence:: and turnkey applications.
6305: @end menu
6306:
6307: @node Image File Background, Non-Relocatable Image Files, Image Files, Image Files
6308: @section Image File Background
6309: @cindex image file background
6310:
6311: Our Forth system consists not only of primitives, but also of
6312: definitions written in Forth. Since the Forth compiler itself belongs to
6313: those definitions, it is not possible to start the system with the
6314: primitives and the Forth source alone. Therefore we provide the Forth
6315: code as an image file in nearly executable form. At the start of the
6316: system a C routine loads the image file into memory, optionally
6317: relocates the addresses, then sets up the memory (stacks etc.) according
6318: to information in the image file, and starts executing Forth code.
6319:
6320: The image file variants represent different compromises between the
6321: goals of making it easy to generate image files and making them
6322: portable.
6323:
6324: @cindex relocation at run-time
6325: Win32Forth 3.4 and Mitch Bradley's @code{cforth} use relocation at
6326: run-time. This avoids many of the complications discussed below (image
6327: files are data relocatable without further ado), but costs performance
6328: (one addition per memory access).
6329:
6330: @cindex relocation at load-time
6331: By contrast, our loader performs relocation at image load time. The
6332: loader also has to replace tokens standing for primitive calls with the
6333: appropriate code-field addresses (or code addresses in the case of
6334: direct threading).
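
The load-time relocation described above can be sketched in C as follows. This is only an illustration under stated assumptions: the relocation bitmap, the convention that negative marked cells are primitive tokens, and the names @code{relocate}, @code{reloc_bits} and @code{prims} are hypothetical and do not describe Gforth's actual image format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t Cell;

/* Relocate an image of n cells in place (hypothetical format).
   reloc_bits: bit i is set if cell i needs relocation.
   A marked cell >= 0 is a data address, stored as a byte offset
   from the start of the image.
   A marked cell < 0 is a primitive token: -1 stands for prims[0],
   -2 for prims[1], and so on. */
void relocate(Cell *image, size_t n, const unsigned char *reloc_bits,
              const Cell *prims, size_t nprims)
{
    Cell base = (Cell)image;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!(reloc_bits[i / 8] & (1u << (i % 8))))
            continue;                      /* plain data: leave unchanged */
        if (image[i] >= 0) {
            image[i] += base;              /* data address: add the load offset */
        } else {
            size_t k = (size_t)(-(image[i] + 1));
            assert(k < nprims);
            image[i] = prims[k];           /* token: replace by the code address */
        }
    }
}
```

All marked data cells receive the same offset, which is why results of more complex address computations cannot be represented in such an image.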
6335:
6336: There are three kinds of image files, with different degrees of
6337: relocatability: non-relocatable, data-relocatable, and fully relocatable
6338: image files.
6339:
6340: @cindex image file loader
6341: @cindex relocating loader
6342: @cindex loader for image files
6343: These image file variants have several restrictions in common; they are
6344: caused by the design of the image file loader:
6345:
6346: @itemize @bullet
6347: @item
6348: There is only one segment; in particular, this means that an image file
6349: cannot represent @code{ALLOCATE}d memory chunks (and pointers to
6350: them). And the contents of the stacks are not represented, either.
6351:
6352: @item
6353: The only kinds of relocation supported are: adding the same offset to
6354: all cells that represent data addresses; and replacing special tokens
6355: with code addresses or with pieces of machine code.
6356:
6357: If any complex computations involving addresses are performed, the
6358: results cannot be represented in the image file. Several applications that
6359: use such computations come to mind:
6360: @itemize @minus
6361: @item
6362: Hashing addresses (or data structures which contain addresses) for table
6363: lookup. If you use Gforth's @code{table}s or @code{wordlist}s for this
6364: purpose, you will have no problem, because the hash tables are
6365: recomputed automatically when the system is started. If you use your own
6366: hash tables, you will have to do something similar.
6367:
6368: @item
6369: There's a cute implementation of doubly-linked lists that uses
6370: @code{XOR}ed addresses. You could represent such lists as singly-linked
6371: in the image file, and restore the doubly-linked representation on
6372: startup.@footnote{In my opinion, though, you should think thrice before
6373: using a doubly-linked list (whatever implementation).}
6374:
6375: @item
6376: The code addresses of run-time routines like @code{docol:} cannot be
6377: represented in the image file (because their tokens would be replaced by
6378: machine code in direct threaded implementations). As a workaround,
6379: compute these addresses at run-time with @code{>code-address} from the
6380: execution tokens of appropriate words (see the definitions of
6381: @code{docol:} and friends in @file{kernel.fs}).
6382:
6383: @item
6384: On many architectures addresses are represented in machine code in some
6385: shifted or mangled form. You cannot put @code{CODE} words that contain
6386: absolute addresses in this form in a relocatable image file. Workarounds
6387: are representing the address in some relative form (e.g., relative to
6388: the CFA, which is present in some register), or loading the address from
6389: a place where it is stored in a non-mangled form.
6390: @end itemize
6391: @end itemize
6392:
6393: @node Non-Relocatable Image Files, Data-Relocatable Image Files, Image File Background, Image Files
6394: @section Non-Relocatable Image Files
6395: @cindex non-relocatable image files
6396: @cindex image files, non-relocatable
6397:
6398: These files are simple memory dumps of the dictionary. They are specific
6399: to the executable (i.e., @file{gforth} file) they were created
6400: with. What's worse, they are specific to the place on which the
6401: dictionary resided when the image was created. Now, there is no
6402: guarantee that the dictionary will reside at the same place the next
6403: time you start Gforth, so there's no guarantee that a non-relocatable
6404: image will work the next time (Gforth will complain instead of crashing,
6405: though).
6406:
6407: You can create a non-relocatable image file with
6408:
6409: doc-savesystem
6410:
6411: @node Data-Relocatable Image Files, Fully Relocatable Image Files, Non-Relocatable Image Files, Image Files
6412: @section Data-Relocatable Image Files
6413: @cindex data-relocatable image files
6414: @cindex image files, data-relocatable
6415:
6416: These files contain relocatable data addresses, but fixed code addresses
6417: (instead of tokens). They are specific to the executable (i.e.,
6418: @file{gforth} file) they were created with. For direct threading on some
6419: architectures (e.g., the i386), data-relocatable images do not work. You
6420: get a data-relocatable image, if you use @file{gforthmi} with a
6421: Gforth binary that is not doubly indirect threaded (@pxref{Fully
6422: Relocatable Image Files}).
6423:
6424: @node Fully Relocatable Image Files, Stack and Dictionary Sizes, Data-Relocatable Image Files, Image Files
6425: @section Fully Relocatable Image Files
6426: @cindex fully relocatable image files
6427: @cindex image files, fully relocatable
6428:
6429: @cindex @file{kernl*.fi}, relocatability
6430: @cindex @file{gforth.fi}, relocatability
6431: These image files have relocatable data addresses, and tokens for code
6432: addresses. They can be used with different binaries (e.g., with and
6433: without debugging) on the same machine, and even across machines with
6434: the same data formats (byte order, cell size, floating point
6435: format). However, they are usually specific to the version of Gforth
6436: they were created with. The files @file{gforth.fi} and @file{kernl*.fi}
6437: are fully relocatable.
6438:
6439: There are two ways to create a fully relocatable image file:
6440:
6441: @menu
6442: * gforthmi:: The normal way
6443: * cross.fs:: The hard way
6444: @end menu
6445:
6446: @node gforthmi, cross.fs, Fully Relocatable Image Files, Fully Relocatable Image Files
6447: @subsection @file{gforthmi}
6448: @cindex @file{comp-i.fs}
6449: @cindex @file{gforthmi}
6450:
6451: You will usually use @file{gforthmi}. If you want to create an
6452: image @var{file} that contains everything you would load by invoking
6453: Gforth with @code{gforth @var{options}}, you simply say
6454: @example
6455: gforthmi @var{file} @var{options}
6456: @end example
6457:
6458: E.g., if you want to create an image @file{asm.fi} that has the file
6459: @file{asm.fs} loaded in addition to the usual stuff, you could do it
6460: like this:
6461:
6462: @example
6463: gforthmi asm.fi asm.fs
6464: @end example
6465:
6466: @file{gforthmi} works like this: It produces two non-relocatable
6467: images for different addresses and then compares them. Its output
6468: reflects this: first you see the output (if any) of the two Gforth
6469: invocations that produce the non-relocatable image files, then you see
6470: the output of the comparing program: It displays the offset used for
6471: data addresses and the offset used for code addresses;
6472: moreover, for each cell that cannot be represented correctly in the
6473: image files, it displays a line like the following one:
6474:
6475: @example
6476: 78DC BFFFFA50 BFFFFA40
6477: @end example
6478:
6479: This means that at offset $78dc from @code{forthstart}, one input image
6480: contains $bffffa50, and the other contains $bffffa40. Since these cells
6481: cannot be represented correctly in the output image, you should examine
6482: these places in the dictionary and verify that these cells are dead
6483: (i.e., not read before they are written).
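
The comparison step can be sketched in C as follows. This is a simplified illustration, not the code of @file{comp-i.fs}: it handles only plain data and data addresses (code-address tokens are left out), and the function name, its parameters and the bitmap output are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef intptr_t Cell;

/* Compare two dumps a and b (n cells each) of the same dictionary,
   created at base addresses that differ by data_off.
   out receives the relocatable image, reloc_bits (zeroed by the
   caller) marks the cells holding data addresses.
   Returns the number of cells that cannot be represented. */
size_t compare_images(const Cell *a, const Cell *b, size_t n,
                      Cell base_a, Cell data_off,
                      Cell *out, unsigned char *reloc_bits)
{
    size_t i, bad = 0;

    assert(data_off != 0);  /* otherwise addresses look like plain data */
    for (i = 0; i < n; i++) {
        if (a[i] == b[i]) {
            out[i] = a[i];                   /* same in both dumps: plain data */
        } else if (b[i] - a[i] == data_off) {
            out[i] = a[i] - base_a;          /* data address: store it relative */
            reloc_bits[i / 8] |= (unsigned char)(1u << (i % 8));
        } else {
            out[i] = a[i];                   /* unexplained difference: report it */
            printf("%lX %lX %lX\n", (unsigned long)(i * sizeof(Cell)),
                   (unsigned long)a[i], (unsigned long)b[i]);
            bad++;
        }
    }
    return bad;
}
```

A cell is reported when its two values differ by something other than the expected offset, which corresponds to the lines of three hexadecimal numbers described above.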
6484:
6485: @cindex @code{savesystem} during @file{gforthmi}
6486: @cindex @code{bye} during @file{gforthmi}
6487: @cindex doubly indirect threaded code
6488: @cindex environment variable @code{GFORTHD}
6489: @cindex @code{GFORTHD} environment variable
6490: @cindex @code{gforth-ditc}
6491: There are a few wrinkles: After processing the passed @var{options}, the
6492: words @code{savesystem} and @code{bye} must be visible. A special doubly
6493: indirect threaded version of the @file{gforth} executable is used for
6494: creating the non-relocatable images; you can pass the exact filename of
6495: this executable through the environment variable @code{GFORTHD}
6496: (default: @file{gforth-ditc}); if you pass a version that is not doubly
6497: indirect threaded, you will not get a fully relocatable image, but a
6498: data-relocatable image (because there is no code address offset).
6499:
6500: @node cross.fs, , gforthmi, Fully Relocatable Image Files
6501: @subsection @file{cross.fs}
6502: @cindex @file{cross.fs}
6503: @cindex cross-compiler
6504: @cindex metacompiler
6505:
6506: You can also use @code{cross}, a batch compiler that accepts a Forth-like
6507: programming language. This @code{cross} language has yet to be
6508: documented.
6509:
6510: @cindex target compiler
6511: @code{cross} also allows you to create image files for machines with
6512: different data sizes and data formats than the one used for generating
6513: the image file. You can also use it to create an application image that
6514: does not contain a Forth compiler. These features are bought with
6515: restrictions and inconveniences in programming. E.g., addresses have to
6516: be stored in memory with special words (@code{A!}, @code{A,}, etc.) in
6517: order to make the code relocatable.
6518:
6519:
6520: @node Stack and Dictionary Sizes, Running Image Files, Fully Relocatable Image Files, Image Files
6521: @section Stack and Dictionary Sizes
6522: @cindex image file, stack and dictionary sizes
6523: @cindex dictionary size default
6524: @cindex stack size default
6525:
6526: If you invoke Gforth with a command line flag for the size
6527: (@pxref{Invoking Gforth}), the size you specify is stored in the
6528: dictionary. If you save the dictionary with @code{savesystem} or create
6529: an image with @file{gforthmi}, this size will become the default
6530: for the resulting image file. E.g., the following will create a
6531: fully relocatable version of gforth.fi with a 1MB dictionary:
6532:
6533: @example
6534: gforthmi gforth.fi -m 1M
6535: @end example
6536:
6537: In other words, if you want to set the default size for the dictionary
6538: and the stacks of an image, just invoke @file{gforthmi} with the
6539: appropriate options when creating the image.
6540:
6541: @cindex stack size, cache-friendly
6542: Note: For cache-friendly behaviour (i.e., good performance), you should
6543: make the sizes of the stacks modulo, say, 2k, somewhat different. E.g.,
6544: the default stack sizes are: data: 16k (mod 2k=0); fp: 15.5k (mod
6545: 2k=1.5k); return: 15k (mod 2k=1k); locals: 14.5k (mod 2k=0.5k).
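
If you choose your own sizes, the arithmetic above can be checked with a small helper like the following C sketch. The function @code{residues_distinct} is a hypothetical illustration and not part of Gforth.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if all n sizes differ modulo `modulus`, otherwise 0. */
int residues_distinct(const size_t *sizes, size_t n, size_t modulus)
{
    size_t i, j;

    assert(modulus != 0);
    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (sizes[i] % modulus == sizes[j] % modulus)
                return 0;   /* two stacks would have the same size mod modulus */
    return 1;
}
```

For the default sizes in bytes (16384, 15872, 15360 and 14848) and a modulus of 2048, the residues are 0, 1536, 1024 and 512, so the helper returns 1.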
6546:
6547: @node Running Image Files, Modifying the Startup Sequence, Stack and Dictionary Sizes, Image Files
6548: @section Running Image Files
6549: @cindex running image files
6550: @cindex invoking image files
6551: @cindex image file invocation
6552:
6553: @cindex -i, invoke image file
6554: @cindex --image-file, invoke image file
6555: You can invoke Gforth with an image file @var{image} instead of the
6556: default @file{gforth.fi} with the @code{-i} flag (@pxref{Invoking Gforth}):
6557: @example
6558: gforth -i @var{image}
6559: @end example
6560:
6561: @cindex executable image file
6562: @cindex image files, executable
6563: If your operating system supports starting scripts with a line of the
6564: form @code{#! ...}, you just have to type the image file name to start
6565: Gforth with this image file (note that the file extension @code{.fi} is
6566: just a convention). I.e., to run Gforth with the image file @var{image},
6567: you can just type @var{image} instead of @code{gforth -i @var{image}}.
6568:
6569: doc-#!
6570:
6571: @node Modifying the Startup Sequence, , Running Image Files, Image Files
6572: @section Modifying the Startup Sequence
6573: @cindex startup sequence for image file
6574: @cindex image file initialization sequence
6575: @cindex initialization sequence of image file
6576:
6577: You can add your own initialization to the startup sequence through the
6578: deferred word
6579:
6580: doc-'cold
6581:
6582: @code{'cold} is invoked just before the image-specific command line
6583: processing (by default, loading files and evaluating (@code{-e}) strings)
6584: starts.
6585:
6586: A sequence for adding your initialization usually looks like this:
6587:
6588: @example
6589: :noname
6590: Defers 'cold \ do other initialization stuff (e.g., rehashing wordlists)
6591: ... \ your stuff
6592: ; IS 'cold
6593: @end example
6594:
6595: @cindex turnkey image files
6596: @cindex image files, turnkey applications
6597: You can make a turnkey image by letting @code{'cold} execute a word
6598: (your turnkey application) that never returns; instead, it exits Gforth
6599: via @code{bye} or @code{throw}.
6600:
6601: @cindex command-line arguments, access
6602: @cindex arguments on the command line, access
6603: You can access the (image-specific) command-line arguments through the
6604: variables @code{argc} and @code{argv}. @code{arg} provides convenient
6605: access to @code{argv}.
6606:
6607: doc-argc
6608: doc-argv
6609: doc-arg
6610:
6611: If @code{'cold} exits normally, Gforth processes the command-line
6612: arguments as files to be loaded and strings to be evaluated. Therefore,
6613: @code{'cold} should remove the arguments it has used in this case.
6614:
6615: @c ******************************************************************
6616: @node Engine, Bugs, Image Files, Top
6617: @chapter Engine
6618: @cindex engine
6619: @cindex virtual machine
6620:
6621: Reading this section is not necessary for programming with Gforth. It
6622: may be helpful for finding your way in the Gforth sources.
6623:
6624: The ideas in this section have also been published in the papers
6625: @cite{ANS fig/GNU/??? Forth} (in German) by Bernd Paysan, presented at
6626: the Forth-Tagung '93 and @cite{A Portable Forth Engine} by M. Anton
6627: Ertl, presented at EuroForth '93; the latter is available at
6628: @*@url{http://www.complang.tuwien.ac.at/papers/ertl93.ps.Z}.
6629:
6630: @menu
6631: * Portability::
6632: * Threading::
6633: * Primitives::
6634: * Performance::
6635: @end menu
6636:
6637: @node Portability, Threading, Engine, Engine
6638: @section Portability
6639: @cindex engine portability
6640:
6641: One of the main goals of the effort is availability across a wide range
6642: of personal machines. fig-Forth, and, to a lesser extent, F83, achieved
6643: this goal by manually coding the engine in assembly language for several
6644: then-popular processors. This approach is very labor-intensive and the
6645: results are short-lived due to progress in computer architecture.
6646:
6647: @cindex C, using C for the engine
6648: Others have avoided this problem by coding in C, e.g., Mitch Bradley
6649: (cforth), Mikael Patel (TILE) and Dirk Zoller (pfe). This approach is
6650: particularly popular for UNIX-based Forths due to the large variety of
6651: architectures of UNIX machines. Unfortunately an implementation in C
6652: does not mix well with the goals of efficiency and with using
6653: traditional techniques: Indirect or direct threading cannot be expressed
6654: in C, and switch threading, the fastest technique available in C, is
6655: significantly slower. Another problem with C is that it is very
6656: cumbersome to express double integer arithmetic.
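
For comparison, switch threading in portable C can be sketched as follows. The opcodes and the function @code{run_switch} belong to a hypothetical toy machine and are not taken from any of the systems mentioned; the sketch does no stack checking beyond one assertion.

```c
#include <assert.h>

/* Hypothetical opcodes of a toy virtual machine. */
enum { OP_LIT, OP_ADD, OP_MUL, OP_HALT };

/* Run a program with switch threading and return the top of stack. */
long run_switch(const long *ip)
{
    long stack[16];
    long *sp = stack + 16;      /* stack grows downwards; sp[0] is the TOS */

    for (;;) {
        switch (*ip++) {        /* central dispatch: every primitive returns here */
        case OP_LIT:
            assert(sp > stack);
            *--sp = *ip++;      /* push the inline number */
            break;
        case OP_ADD:
            sp[1] = sp[1] + sp[0];
            sp++;
            break;
        case OP_MUL:
            sp[1] = sp[1] * sp[0];
            sp++;
            break;
        case OP_HALT:
            return sp[0];
        default:
            return -1;          /* unknown opcode */
        }
    }
}
```

Every primitive ends by jumping back to the single @code{switch}, whereas a threaded-code primitive jumps directly to its successor.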
6657:
6658: @cindex GNU C for the engine
6659: @cindex long long
6660: Fortunately, there is a portable language that does not have these
6661: limitations: GNU C, the version of C processed by the GNU C compiler
6662: (@pxref{C Extensions, , Extensions to the C Language Family, gcc.info,
6663: GNU C Manual}). Its labels as values feature (@pxref{Labels as Values, ,
6664: Labels as Values, gcc.info, GNU C Manual}) makes direct and indirect
6665: threading possible, its @code{long long} type (@pxref{Long Long, ,
6666: Double-Word Integers, gcc.info, GNU C Manual}) corresponds to Forth's
6667: double numbers@footnote{Unfortunately, long longs are not implemented
6668: properly on all machines (e.g., on alpha-osf1, long longs are only 64
6669: bits, the same size as longs (and pointers), but they should be twice as
1.4 anton 6670: long according to @pxref{Long Long, , Double-Word Integers, gcc.info, GNU
1.1 anton 6671: C Manual}). So, we had to implement doubles in C after all. Still, on
6672: most machines we can use long longs and achieve better performance than
6673: with the emulation package.}. GNU C is available for free on all
6674: important (and many unimportant) UNIX machines, VMS, 80386s running
6675: MS-DOS, the Amiga, and the Atari ST, so a Forth written in GNU C can run
6676: on all these machines.
6677:
6678: Writing in a portable language has the reputation of producing code that
6679: is slower than assembly. For our Forth engine we repeatedly looked at
6680: the code produced by the compiler and eliminated most compiler-induced
6681: inefficiencies by appropriate changes in the source code.
6682:
6683: @cindex explicit register declarations
6684: @cindex --enable-force-reg, configuration flag
6685: @cindex -DFORCE_REG
6686: However, register allocation cannot be portably influenced by the
6687: programmer, leading to some inefficiencies on register-starved
6688: machines. We use explicit register declarations (@pxref{Explicit Reg
6689: Vars, , Variables in Specified Registers, gcc.info, GNU C Manual}) to
6690: improve the speed on some machines. They are turned on by using the
6691: configuration flag @code{--enable-force-reg} (@code{gcc} switch
6692: @code{-DFORCE_REG}). Unfortunately, this feature not only depends on the
6693: machine, but also on the compiler version: On some machines some
6694: compiler versions produce incorrect code when certain explicit register
6695: declarations are used. So by default @code{-DFORCE_REG} is not used.
6696:
6697: @node Threading, Primitives, Portability, Engine
6698: @section Threading
6699: @cindex inner interpreter implementation
6700: @cindex threaded code implementation
6701:
6702: @cindex labels as values
6703: GNU C's labels as values extension (available since @code{gcc-2.0},
6704: @pxref{Labels as Values, , Labels as Values, gcc.info, GNU C Manual})
6705: makes it possible to take the address of @var{label} by writing
6706: @code{&&@var{label}}. This address can then be used in a statement like
6707: @code{goto *@var{address}}. I.e., @code{goto *&&x} is the same as
6708: @code{goto x}.
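
A minimal sketch of the extension, assuming a compiler that supports labels as values (e.g., @code{gcc}); the function @code{dispatch} is a hypothetical example:

```c
/* Select a label at run time and jump to it. */
int dispatch(int which)
{
    void *target = which ? &&one : &&zero;  /* take the addresses of two labels */

    goto *target;                           /* computed goto */
zero:
    return 0;
one:
    return 1;
}
```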
6709:
6710: @cindex NEXT, indirect threaded
6711: @cindex indirect threaded inner interpreter
6712: @cindex inner interpreter, indirect threaded
6713: With this feature an indirect threaded NEXT looks like:
6714: @example
6715: cfa = *ip++;
6716: ca = *cfa;
6717: goto *ca;
6718: @end example
6719: @cindex instruction pointer
6720: For those unfamiliar with the names: @code{ip} is the Forth instruction
6721: pointer; the @code{cfa} (code-field address) corresponds to ANS Forth's
6722: execution token and points to the code field of the next word to be
6723: executed; the @code{ca} (code address) fetched from there points to some
6724: executable code, e.g., a primitive or the colon definition handler
6725: @code{docol}.
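
To see the indirect threaded NEXT in context, here is a self-contained toy engine in GNU C. It is a sketch, not Gforth's engine: the words, the @code{code_field} array and the function @code{run_indirect} are hypothetical, the code fields are one cell wide, and there is no stack checking. The function first builds threaded code from a list of word indices, because label addresses are only available inside the function that contains the labels.

```c
#include <assert.h>
#include <stdint.h>

typedef void *Label;        /* code address (ca): address of a C label */
typedef intptr_t Cell;      /* a cell holds either a cfa or a number */

/* Hypothetical toy words; the order must match code_field[] below. */
enum { W_LIT, W_ADD, W_MUL, W_HALT, W_COUNT };

#define MAX_CODE 64

/* Compile src (word indices, with a number after each W_LIT) into
   indirect threaded code, run it, and return the top of stack. */
Cell run_indirect(const Cell *src, int n)
{
    /* Code fields: each one holds the code address of its primitive. */
    Label code_field[W_COUNT] = { &&lit, &&add, &&mul, &&halt };
    Cell code[MAX_CODE];
    Cell stack[16];
    Cell *sp = stack + 16;  /* stack grows downwards; sp[0] is the TOS */
    Cell *ip = code;
    Label *cfa;
    Label ca;
    int i;

    assert(n <= MAX_CODE);
    for (i = 0; i < n; i++) {
        assert(src[i] >= 0 && src[i] < W_COUNT);
        code[i] = (Cell)&code_field[src[i]];  /* store the cfa of the word */
        if (src[i] == W_LIT && i + 1 < n) {
            i++;
            code[i] = src[i];                 /* inline number, copied as is */
        }
    }

#define NEXT do { cfa = (Label *)*ip++; ca = *cfa; goto *ca; } while (0)

    NEXT;                                     /* start the inner interpreter */

lit:  *--sp = *ip++;         NEXT;
add:  sp[1] += sp[0]; sp++;  NEXT;
mul:  sp[1] *= sp[0]; sp++;  NEXT;
halt: return sp[0];
#undef NEXT
}
```

Calling @code{run_indirect} with the word list @code{W_LIT 2 W_LIT 3 W_ADD W_LIT 4 W_MUL W_HALT} computes (2+3)*4.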
6726:
6727: @cindex NEXT, direct threaded
6728: @cindex direct threaded inner interpreter
6729: @cindex inner interpreter, direct threaded
6730: Direct threading is even simpler:
6731: @example
6732: ca = *ip++;
6733: goto *ca;
6734: @end example
6735:
6736: Of course we have packaged the whole thing neatly in macros called
6737: @code{NEXT} and @code{NEXT1} (the part of NEXT after fetching the cfa).
6738:
6739: @menu
6740: * Scheduling::
6741: * Direct or Indirect Threaded?::
6742: * DOES>::
6743: @end menu
6744:
6745: @node Scheduling, Direct or Indirect Threaded?, Threading, Threading
6746: @subsection Scheduling
6747: @cindex inner interpreter optimization
6748:
6749: There is a little complication: Pipelined and superscalar processors,
6750: i.e., RISC and some modern CISC machines, can process independent
6751: instructions while waiting for the results of an instruction. The
6752: compiler usually reorders (schedules) the instructions in a way that
6753: achieves good usage of these delay slots. However, on our first tries
6754: the compiler did not do well on scheduling primitives. E.g., for
6755: @code{+} implemented as
6756: @example
6757: n=sp[0]+sp[1];
6758: sp++;
6759: sp[0]=n;
6760: NEXT;
6761: @end example
6762: the NEXT comes strictly after the other code, i.e., there is nearly no
6763: scheduling. After a little thought the problem becomes clear: The
6764: compiler cannot know that sp and ip point to different addresses (and
6765: the version of @code{gcc} we used would not know it even if it was
6766: possible), so it could not move the load of the cfa above the store to
6767: the TOS. Indeed the pointers could be the same, if code on or very near
6768: the top of stack were executed. In the interest of speed we chose to
6769: forbid this probably unused ``feature'' and helped the compiler in
6770: scheduling: NEXT is divided into the loading part (@code{NEXT_P1}) and
6771: the goto part (@code{NEXT_P2}). @code{+} now looks like:
6772: @example
6773: n=sp[0]+sp[1];
6774: sp++;
6775: NEXT_P1;
6776: sp[0]=n;
6777: NEXT_P2;
6778: @end example
6779: This can be scheduled optimally by the compiler.
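
The division of NEXT can be sketched as a pair of macros in a toy engine. The macro names follow the text, but their bodies and everything else here (the words, @code{run_scheduled}) are assumptions for illustration. The sketch uses direct threading, so @code{NEXT_P1} fetches the code address rather than the cfa.

```c
#include <assert.h>
#include <stdint.h>

typedef void *Label;
typedef intptr_t Cell;

/* Hypothetical toy words; the order must match prim[] below. */
enum { W_LIT, W_ADD, W_HALT, W_COUNT };

#define MAX_CODE 64

#define NEXT_P1 (ca = (Label)*ip++)   /* loading part */
#define NEXT_P2 goto *ca              /* goto part */
#define NEXT    do { NEXT_P1; NEXT_P2; } while (0)

/* Compile src into direct threaded code, run it, and return the TOS. */
Cell run_scheduled(const Cell *src, int n)
{
    Label prim[W_COUNT] = { &&lit, &&add, &&halt };
    Cell code[MAX_CODE];
    Cell stack[16];
    Cell *sp = stack + 16;            /* stack grows downwards */
    Cell *ip = code;
    Cell x;
    Label ca;
    int i;

    assert(n <= MAX_CODE);
    for (i = 0; i < n; i++) {
        assert(src[i] >= 0 && src[i] < W_COUNT);
        code[i] = (Cell)prim[src[i]]; /* direct threading: store the code address */
        if (src[i] == W_LIT && i + 1 < n) {
            i++;
            code[i] = src[i];         /* inline number */
        }
    }

    NEXT;

lit:
    x = *ip++;                        /* fetch the inline number first */
    NEXT_P1;
    *--sp = x;
    NEXT_P2;
add:
    x = sp[0] + sp[1];
    sp++;
    NEXT_P1;                          /* load before the store to the TOS */
    sp[0] = x;
    NEXT_P2;
halt:
    return sp[0];
}
```

Placing @code{NEXT_P1} before the store makes the order of the load and the store explicit in the source, so the compiler does not have to prove that @code{sp} and @code{ip} point to different addresses.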
6780:
6781: This division can be turned off with the switch @code{-DCISC_NEXT}. This
6782: switch is on by default on machines that do not profit from scheduling
6783: (e.g., the 80386), in order to preserve registers.
6784:
6785: @node Direct or Indirect Threaded?, DOES>, Scheduling, Threading
6786: @subsection Direct or Indirect Threaded?
6787: @cindex threading, direct or indirect?
6788:
6789: @cindex -DDIRECT_THREADED
6790: Both! After packaging the nasty details in macro definitions we
6791: realized that we could switch between direct and indirect threading by
6792: simply setting a compilation flag (@code{-DDIRECT_THREADED}) and
6793: defining a few machine-specific macros for the direct-threading case.
6794: On the Forth level we also offer access words that hide the
6795: differences between the threading methods (@pxref{Threading Words}).
6796:
6797: Indirect threading is implemented completely machine-independently.
6798: Direct threading needs routines for creating jumps to the executable
6799: code (e.g. to docol or dodoes). These routines are inherently
6800: machine-dependent, but they do not amount to many source lines. I.e.,
6801: even porting direct threading to a new machine is a small effort.
6802:
6803: @cindex --enable-indirect-threaded, configuration flag
6804: @cindex --enable-direct-threaded, configuration flag
6805: The default threading method is machine-dependent. You can enforce a
6806: specific threading method when building Gforth with the configuration
6807: flag @code{--enable-direct-threaded} or
6808: @code{--enable-indirect-threaded}. Note that direct threading is not
6809: supported on all machines.
6810:
6811: @node DOES>, , Direct or Indirect Threaded?, Threading
6812: @subsection DOES>
6813: @cindex @code{DOES>} implementation
6814:
6815: @cindex dodoes routine
6816: @cindex DOES-code
6817: One of the most complex parts of a Forth engine is @code{dodoes}, i.e.,
6818: the chunk of code executed by every word defined by a
6819: @code{CREATE}...@code{DOES>} pair. The main problem here is: How to find
6820: the Forth code to be executed, i.e. the code after the
6821: @code{DOES>} (the DOES-code)? There are two solutions:
6822:
6823: In fig-Forth the code field points directly to the dodoes and the
6824: DOES-code address is stored in the cell after the code address (i.e. at
6825: @code{@var{cfa} cell+}). It may seem that this solution is illegal in
6826: the Forth-79 and all later standards, because in fig-Forth this address
6827: lies in the body (which is illegal in these standards). However, by
6828: making the code field larger for all words this solution becomes legal
6829: again. We use this approach for the indirect threaded version and for
6830: direct threading on some machines. Leaving a cell unused in most words
6831: is a bit wasteful, but on the machines we are targeting this is hardly a
6832: problem. The other reason for having a code field size of two cells is
6833: to avoid having different image files for direct and indirect threaded
6834: systems (direct threaded systems require two-cell code fields on many
6835: machines).
6836:
6837: @cindex DOES-handler
6838: The other approach is that the code field points or jumps to the cell
6839: after @code{DOES>}. In this variant there is a jump to @code{dodoes} at
6840: this address (the DOES-handler). @code{dodoes} can then get the
6841: DOES-code address by computing the code address, i.e., the address of
6842: the jump to dodoes, and adding the length of that jump field. A variant of
6843: this is to have a call to @code{dodoes} after the @code{DOES>}; then the
6844: return address (which can be found in the return register on RISCs) is
6845: the DOES-code address. Since the two cells available in the code field
6846: are used up by the jump to the code address in direct threading on many
6847: architectures, we use this approach for direct threading on these
6848: architectures. We did not want to add another cell to the code field.
6849:
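The first solution can be pictured with a small C sketch. The type and
function names (@code{CodeField}, @code{find_does_code}) are
hypothetical and do not come from the Gforth sources; the sketch only
shows the two-cell code field and the lookup that corresponds to
@code{@var{cfa} cell+ @@}.

@example
typedef void *Cell;    /* assumption: one machine word */

/* Two-cell code field, as used for indirect threading. */
typedef struct @{
    Cell code_address; /* points to dodoes for DOES>-defined words */
    Cell does_code;    /* DOES-code address; unused in most words */
@} CodeField;

/* First step of dodoes in this variant: fetch the DOES-code
   address from the cell after the code address. */
Cell *find_does_code(const CodeField *cfa)
@{
    return (Cell *)cfa->does_code;
@}
@end example

In the second solution this lookup disappears: the DOES-code address is
derived from the address of the DOES-handler (or from the return
address), so the second cell is free for the jump needed by direct
threading.
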
6850: @node Primitives, Performance, Threading, Engine
6851: @section Primitives
6852: @cindex primitives, implementation
6853: @cindex virtual machine instructions, implementation
6854:
6855: @menu
6856: * Automatic Generation::
6857: * TOS Optimization::
6858: * Produced code::
6859: @end menu
6860:
6861: @node Automatic Generation, TOS Optimization, Primitives, Primitives
6862: @subsection Automatic Generation
6863: @cindex primitives, automatic generation
6864:
6865: @cindex @file{prims2x.fs}
6866: Since the primitives are implemented in a portable language, there is no
6867: longer any need to minimize the number of primitives. On the contrary,
6868: having many primitives has an advantage: speed. In order to reduce the
6869: number of errors in primitives and to make programming them easier, we
6870: provide a tool, the primitive generator (@file{prims2x.fs}), that
6871: automatically generates most (and sometimes all) of the C code for a
6872: primitive from the stack effect notation. The source for a primitive
6873: has the following form:
6874:
6875: @cindex primitive source format
6876: @format
6877: @var{Forth-name} @var{stack-effect} @var{category} [@var{pronounc.}]
6878: [@code{""}@var{glossary entry}@code{""}]
6879: @var{C code}
6880: [@code{:}
6881: @var{Forth code}]
6882: @end format
6883:
6884: The items in brackets are optional. The category and glossary fields
6885: are there for generating the documentation, the Forth code is there
6886: for manual implementations on machines without GNU C. E.g., the source
6887: for the primitive @code{+} is:
6888: @example
6889: + n1 n2 -- n core plus
6890: n = n1+n2;
6891: @end example
6892:
6893: This looks like a specification, but in fact @code{n = n1+n2} is C
6894: code. Our primitive generation tool extracts a lot of information from
6895: the stack effect notations@footnote{We use a one-stack notation, even
6896: though we have separate data and floating-point stacks; the separate
6897: notation can be generated easily from the unified notation.}: The number
6898: of items popped from and pushed on the stack, their type, and by what
6899: name they are referred to in the C code. It then generates a C code
6900: prelude and postlude for each primitive. The final C code for @code{+}
6901: looks like this:
6902:
6903: @example
6904: I_plus: /* + ( n1 n2 -- n ) */ /* label, stack effect */
6905: /* */ /* documentation */
6906: @{
6907: DEF_CA /* definition of variable ca (indirect threading) */
6908: Cell n1; /* definitions of variables */
6909: Cell n2;
6910: Cell n;
6911: n1 = (Cell) sp[1]; /* input */
6912: n2 = (Cell) TOS;
6913: sp += 1; /* stack adjustment */
6914: NAME("+") /* debugging output (with -DDEBUG) */
6915: @{
6916: n = n1+n2; /* C code taken from the source */
6917: @}
6918: NEXT_P1; /* NEXT part 1 */
6919: TOS = (Cell)n; /* output */
6920: NEXT_P2; /* NEXT part 2 */
6921: @}
6922: @end example
6923:
6924: This looks long and inefficient, but the GNU C compiler optimizes quite
6925: well and produces optimal code for @code{+} on, e.g., the R3000 and the
6926: HP RISC machines: Defining the @code{n}s does not produce any code, and
6927: using them as intermediate storage also adds no cost.
6928:
6929: There are also other optimizations that are not illustrated by this
6930: example: Assignments between simple variables are usually for free (copy
6931: propagation). If one of the stack items is not used by the primitive
6932: (e.g. in @code{drop}), the compiler eliminates the load from the stack
6933: (dead code elimination). On the other hand, there are some things that
6934: the compiler does not do, so they are performed by
6935: @file{prims2x.fs}: the compiler does not optimize away code that stores
6936: a stack item to the place where it just came from (e.g., @code{over}).
6937:
6938: While programming a primitive is usually easy, there are a few cases
6939: where the programmer has to take the actions of the generator into
6940: account, most notably @code{?dup}, but also words that do not (always)
6941: fall through to NEXT.
6942:
6943: @node TOS Optimization, Produced code, Automatic Generation, Primitives
6944: @subsection TOS Optimization
6945: @cindex TOS optimization for primitives
6946: @cindex primitives, keeping the TOS in a register
6947:
6948: An important optimization for stack machine emulators, e.g., Forth
6949: engines, is keeping one or more of the top stack items in
6950: registers. If a word has the stack effect @var{in1}...@var{inx} @code{--}
6951: @var{out1}...@var{outy}, keeping the top @var{n} items in registers
6952: @itemize @bullet
6953: @item
6954: is better than keeping @var{n-1} items, if @var{x>=n} and @var{y>=n},
6955: due to fewer loads from and stores to the stack.
6956: @item is slower than keeping @var{n-1} items, if @var{x<>y} and @var{x<n} and
6957: @var{y<n}, due to additional moves between registers.
6958: @end itemize
6959:
6960: @cindex -DUSE_TOS
6961: @cindex -DUSE_NO_TOS
6962: In particular, keeping one item in a register is never a disadvantage,
6963: if there are enough registers. Keeping two items in registers is a
6964: disadvantage for frequent words like @code{?branch}, constants,
6965: variables, literals and @code{i}. Therefore our generator only produces
6966: code that keeps zero or one items in registers. The generated C code
6967: covers both cases; the selection between these alternatives is made at
6968: C-compile time using the switch @code{-DUSE_TOS}. @code{TOS} in the C
6969: code for @code{+} is just a simple variable name in the one-item case,
6970: otherwise it is a macro that expands into @code{sp[0]}. Note that the
6971: GNU C compiler tries to keep simple variables like @code{TOS} in
6972: registers, and it usually succeeds, if there are enough registers.
6973:
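The selection at C-compile time can be sketched as follows. This is an
illustration under assumptions, not the code emitted by
@file{prims2x.fs}: the function @code{plus_demo} and its two-cell stack
are hypothetical, and only the use of @code{TOS} and of the switch
@code{-DUSE_TOS} follows the description above.

@example
#include <stdint.h>

typedef intptr_t Cell;

/* Runs the generated body of + on the stack ( a b ). */
Cell plus_demo(Cell a, Cell b)
@{
    Cell stack[2];
    Cell *sp = stack;   /* sp[0] is the top stack slot in memory */
#ifdef USE_TOS
    Cell TOS;           /* one-item case: a simple variable */
#else
#define TOS sp[0]       /* zero-item case: the memory slot itself */
#endif
    Cell n1, n2, n;

    sp[1] = a;          /* set up the stack */
    TOS = b;

    n1 = sp[1];         /* input */
    n2 = TOS;
    sp += 1;            /* stack adjustment */
    n = n1 + n2;        /* C code taken from the source */
    TOS = n;            /* output */
    return TOS;
@}
@end example

The body is identical in both configurations; only the meaning of
@code{TOS} changes. With @code{-DUSE_TOS} the top slot in memory is
never touched, which is why the generator has to take care of the
special cases listed below.
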
6974: @cindex -DUSE_FTOS
6975: @cindex -DUSE_NO_FTOS
6976: The primitive generator performs the TOS optimization for the
6977: floating-point stack, too (@code{-DUSE_FTOS}). For floating-point
6978: operations the benefit of this optimization is even larger:
6979: floating-point operations take quite long on most processors, but can be
6980: performed in parallel with other operations as long as their results are
6981: not used. If the FP-TOS is kept in a register, this works. If
6982: it is kept on the stack, i.e., in memory, the store into memory has to
6983: wait for the result of the floating-point operation, lengthening the
6984: execution time of the primitive considerably.
6985:
6986: The TOS optimization makes the automatic generation of primitives a
6987: bit more complicated. Just replacing all occurrences of @code{sp[0]} by
6988: @code{TOS} is not sufficient. There are some special cases to
6989: consider:
6990: @itemize @bullet
6991: @item In the case of @code{dup ( w -- w w )} the generator must not
6992: eliminate the store to the original location of the item on the stack,
6993: if the TOS optimization is turned on.
6994: @item Primitives with stack effects of the form @code{--}
6995: @var{out1}...@var{outy} must store the TOS to the stack at the start.
6996: Likewise, primitives with the stack effect @var{in1}...@var{inx} @code{--}
6997: must load the TOS from the stack at the end. But for the null stack
6998: effect @code{--} no stores or loads should be generated.
6999: @end itemize
7000:
7001: @node Produced code, , TOS Optimization, Primitives
7002: @subsection Produced code
7003: @cindex primitives, assembly code listing
7004:
7005: @cindex @file{engine.s}
7006: To see what assembly code is produced for the primitives on your machine
7007: with your compiler and your flag settings, type @code{make engine.s} and
7008: look at the resulting file @file{engine.s}.
7009:
7010: @node Performance, , Primitives, Engine
7011: @section Performance
7012: @cindex performance of some Forth interpreters
7013: @cindex engine performance
7014: @cindex benchmarking Forth systems
7015: @cindex Gforth performance
7016:
7017: On RISCs the Gforth engine is very close to optimal; i.e., it is usually
7018: impossible to write a significantly faster engine.
7019:
7020: On register-starved machines like the 386 architecture processors
7021: improvements are possible, because @code{gcc} does not utilize the
7022: registers as well as a human, even with explicit register declarations;
7023: e.g., Bernd Beuster wrote a Forth system fragment in assembly language
7024: and hand-tuned it for the 486; this system is 1.19 times faster on the
7025: Sieve benchmark on a 486DX2/66 than Gforth compiled with
7026: @code{gcc-2.6.3} with @code{-DFORCE_REG}.
7027:
7028: @cindex Win32Forth performance
7029: @cindex NT Forth performance
7030: @cindex eforth performance
7031: @cindex ThisForth performance
7032: @cindex PFE performance
7033: @cindex TILE performance
7034: However, this potential advantage of assembly language implementations
7035: is not necessarily realized in complete Forth systems: We compared
7036: Gforth (direct threaded, compiled with @code{gcc-2.6.3} and
7037: @code{-DFORCE_REG}) with Win32Forth 1.2093, LMI's NT Forth (Beta, May
7038: 1994) and Eforth (with and without peephole (aka pinhole) optimization
7039: of the threaded code); all these systems were written in assembly
7040: language. We also compared Gforth with three systems written in C:
7041: PFE-0.9.14 (compiled with @code{gcc-2.6.3} with the default
7042: configuration for Linux: @code{-O2 -fomit-frame-pointer -DUSE_REGS
7043: -DUNROLL_NEXT}), ThisForth Beta (compiled with @code{gcc-2.6.3 -O3
7044: -fomit-frame-pointer}; ThisForth employs peephole optimization of the
7045: threaded code) and TILE (compiled with @code{make opt}). We benchmarked
7046: Gforth, PFE, ThisForth and TILE on a 486DX2/66 under Linux. Kenneth
7047: O'Heskin kindly provided the results for Win32Forth and NT Forth on a
7048: 486DX2/66 with similar memory performance under Windows NT. Marcel
7049: Hendrix ported Eforth to Linux, then extended it to run the benchmarks,
7050: added the peephole optimizer, ran the benchmarks and reported the
7051: results.
7052:
7053: We used four small benchmarks: the ubiquitous Sieve; bubble-sorting and
7054: matrix multiplication come from the Stanford integer benchmarks and have
7055: been translated into Forth by Martin Fraeman; we used the versions
7056: included in the TILE Forth package, but with bigger data set sizes; and
7057: a recursive Fibonacci number computation for benchmarking calling
7058: performance. The following table shows the time taken for the benchmarks
7059: scaled by the time taken by Gforth (in other words, it shows the speedup
7060: factor that Gforth achieved over the other systems).
7061:
7062: @example
7063: relative      Win32-    NT       eforth       This-
7064:   time  Gforth Forth Forth eforth  +opt   PFE Forth  TILE
7065: sieve     1.00  1.39  1.14   1.39  0.85  1.58  3.18  8.58
7066: bubble    1.00  1.31  1.41   1.48  0.88  1.50        3.88
7067: matmul    1.00  1.47  1.35   1.46  0.74  1.58        4.09
7068: fib       1.00  1.52  1.34   1.22  0.86  1.74  2.99  4.30
7069: @end example
7070:
7071: You may find the good performance of Gforth compared with the systems
7072: written in assembly language quite surprising. One important reason for
7073: the disappointing performance of these systems is probably that they are
7074: not written optimally for the 486 (e.g., they use the @code{lods}
7075: instruction). In addition, Win32Forth uses a comfortable, but costly
7076: method for relocating the Forth image: like @code{cforth}, it computes
7077: the actual addresses at run time, resulting in two address computations
7078: per NEXT (@pxref{Image File Background}).
7079:
7080: Only Eforth with the peephole optimizer performs comparably to
7081: Gforth. The speedups achieved with peephole optimization of threaded
7082: code are quite remarkable. Adding a peephole optimizer to Gforth should
7083: cause similar speedups.
7084:
7085: The speedup of Gforth over PFE, ThisForth and TILE can be easily
7086: explained with the self-imposed restriction of the latter systems to
7087: standard C, which makes efficient threading impossible (however, the
1.4 anton 7088: measured implementation of PFE uses a GNU C extension: @pxref{Global Reg
1.1 anton 7089: Vars, , Defining Global Register Variables, gcc.info, GNU C Manual}).
7090: Moreover, current C compilers have a hard time optimizing other aspects
7091: of the ThisForth and the TILE source.
7092:
7093: Note that the performance of Gforth on 386 architecture processors
7094: varies widely with the version of @code{gcc} used. E.g., @code{gcc-2.5.8}
7095: failed to allocate any of the virtual machine registers into real
7096: machine registers by itself and would not work correctly with explicit
7097: register declarations, giving a 1.3 times slower engine (on a 486DX2/66
7098: running the Sieve) than the one measured above.
7099:
7100: Note also that there have been several releases of Win32Forth since the
7101: release presented here, so the results presented here may have little
7102: predictive value for the performance of Win32Forth today.
7103:
7104: @cindex @file{Benchres}
7105: In @cite{Translating Forth to Efficient C} by M. Anton Ertl and Martin
7106: Maierhofer (presented at EuroForth '95), an indirect threaded version of
7107: Gforth is compared with Win32Forth, NT Forth, PFE, and ThisForth; that
7108: version of Gforth is 2%@minus{}8% slower on a 486 than the direct
7109: threaded version used here. The paper is available at
7110: @*@url{http://www.complang.tuwien.ac.at/papers/ertl&maierhofer95.ps.gz};
7111: it also contains numbers for some native code systems. You can find a
7112: newer version of these measurements at
7113: @url{http://www.complang.tuwien.ac.at/forth/performance.html}. You can
7114: find numbers for Gforth on various machines in @file{Benchres}.
7115:
7116: @node Bugs, Origin, Engine, Top
7117: @chapter Bugs
7118: @cindex bug reporting
7119:
7120: Known bugs are described in the file @file{BUGS} in the Gforth distribution.
7121:
7122: If you find a bug, please send a bug report to
7123: @email{bug-gforth@@gnu.ai.mit.edu}. A bug report should
7124: describe the Gforth version used (it is announced at the start of an
7125: interactive Gforth session), the machine and operating system (on Unix
7126: systems you can use @code{uname -a} to produce this information), the
7127: installation options (send the @file{config.status} file), and a
7128: complete list of changes you (or your installer) have made to the Gforth
7129: sources (if any); it should contain a program (or a sequence of keyboard
7130: commands) that reproduces the bug and a description of what you think
7131: constitutes the buggy behaviour.
7132:
7133: For a thorough guide on reporting bugs read @ref{Bug Reporting, , How
7134: to Report Bugs, gcc.info, GNU C Manual}.
7135:
7136:
7137: @node Origin, Word Index, Bugs, Top
7138: @chapter Authors and Ancestors of Gforth
7139:
7140: @section Authors and Contributors
7141: @cindex authors of Gforth
7142: @cindex contributors to Gforth
7143:
7144: The Gforth project was started in mid-1992 by Bernd Paysan and Anton
7145: Ertl. The third major author was Jens Wilke. Lennart Benschop (who was
7146: one of Gforth's first users, in mid-1993) and Stuart Ramsden inspired us
7147: with their continuous feedback. Lennart Benschop contributed
7148: @file{glosgen.fs}, while Stuart Ramsden has been working on automatic
7149: support for calling C libraries. Helpful comments also came from Paul
7150: Kleinrubatscher, Christian Pirker, Dirk Zoller, Marcel Hendrix, John
1.12 ! anton 7151: Wavrik, Barrie Stott, Marc de Groot, and Jorge Acerada. Since the
! 7152: release of Gforth-0.2.1 there were also helpful comments from many
! 7153: others; thank you all, sorry for not listing you here (but digging
! 7154: through my mailbox to extract your names is on my to-do list).
1.1 anton 7155:
7156: Gforth also owes a lot to the authors of the tools we used (GCC, CVS,
7157: and autoconf, among others), and to the creators of the Internet: Gforth
7158: was developed across the Internet, and its authors have not met
7159: physically yet.
7160:
7161: @section Pedigree
7162: @cindex Pedigree of Gforth
7163:
7164: Gforth descends from BigForth (1993) and fig-Forth. Gforth and PFE (by
7165: Dirk Zoller) will cross-fertilize each other. Of course, a significant
7166: part of the design of Gforth was prescribed by ANS Forth.
7167:
7168: Bernd Paysan wrote BigForth, a descendant of TurboForth, an unreleased
7169: 32-bit native code version of VolksForth for the Atari ST, written
7170: mostly by Dietrich Weineck.
7171:
7172: VolksForth descends from F83. It was written by Klaus Schleisiek, Bernd
7173: Pennemann, Georg Rehfeld and Dietrich Weineck for the C64 (called
7174: UltraForth there) in the mid-80s and ported to the Atari ST in 1986.
7175:
7176: Henry Laxen and Mike Perry wrote F83 as a model implementation of the
7177: Forth-83 standard. !! Pedigree? When?
7178:
7179: A team led by Bill Ragsdale implemented fig-Forth on many processors in
7180: 1979. Robert Selzer and Bill Ragsdale developed the original
7181: implementation of fig-Forth for the 6502 based on microForth.
7182:
7183: The principal architect of microForth was Dean Sanderson. microForth was
7184: FORTH, Inc.'s first off-the-shelf product. It was developed in 1976 for
7185: the 1802, and subsequently implemented on the 8080, the 6800 and the
7186: Z80.
7187:
7188: All earlier Forth systems were custom-made, usually by Charles Moore,
7189: who discovered (as he puts it) Forth during the late 60s. The first full
7190: Forth existed in 1971.
7191:
7192: A part of the information in this section comes from @cite{The Evolution
7193: of Forth} by Elizabeth D. Rather, Donald R. Colburn and Charles
7194: H. Moore, presented at the HOPL-II conference and preprinted in SIGPLAN
7195: Notices 28(3), 1993. You can find more historical and genealogical
7196: information about Forth there.
7197:
7198: @node Word Index, Concept Index, Origin, Top
7199: @unnumbered Word Index
7200:
7201: This index is as incomplete as the manual. Each word is listed with
7202: stack effect and wordset.
7203:
7204: @printindex fn
7205:
7206: @node Concept Index, , Word Index, Top
7207: @unnumbered Concept and Word Index
7208:
7209: This index is as incomplete as the manual. Not all entries listed are
7210: present verbatim in the text. Only the names are listed for the words
7211: here.
7212:
7213: @printindex cp
7214:
7215: @contents
7216: @bye
7217: