
File: [gforth] / gforth / Attic / gforth.ds
Revision 1.18
Sat Oct 7 17:38:14 1995 UTC by anton
Branches: MAIN
CVS tags: HEAD

added code.fs (code, ;code, end-code, assembler)
renamed dostruc to dofield
made index and doc-entries nicer
Only words containing 'e' or 'E' are converted to FP numbers.
added many wordset comments
added flush-icache primitive and FLUSH_ICACHE macro
added +DO, U+DO, -DO, U-DO and -LOOP
added code address labels (`docol:' etc.)
fixed sparc cache_flush

\input texinfo   @c -*-texinfo-*-
@comment The source is gforth.ds, from which gforth.texi is generated
@comment %**start of header (This is for running Texinfo on a region.)
@setfilename gforth.info
@settitle Gforth Manual
@comment @setchapternewpage odd
@comment %**end of header (This is for running Texinfo on a region.)

@ifinfo
This file documents Gforth 0.1

Copyright @copyright{} 1994 Gforth Development Group

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

@ignore
Permission is granted to process this file through TeX and print the
results, provided the printed document carries a copying permission
notice identical to this one except for the removal of this paragraph
(this paragraph not being relevant to the printed manual).

@end ignore
Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that the
sections entitled "Distribution" and "General Public License" are
included exactly as in the original, and provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that the sections entitled "Distribution" and "General Public
License" may be included in a translation approved by the author instead
of in the original English.
@end ifinfo

@titlepage
@sp 10
@center @titlefont{Gforth Manual}
@sp 2
@center for version 0.1
@sp 2
@center Anton Ertl
@sp 3
@center This manual is under construction

@comment The following two commands start the copyright page.
@page
@vskip 0pt plus 1filll
Copyright @copyright{} 1994 Gforth Development Group

@comment !! Published by ... or You can get a copy of this manual ...

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that the
sections entitled "Distribution" and "General Public License" are
included exactly as in the original, and provided that the entire
resulting derived work is distributed under the terms of a permission
notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that the sections entitled "Distribution" and "General Public
License" may be included in a translation approved by the author instead
of in the original English.
@end titlepage


@node Top, License, (dir), (dir)
@ifinfo
Gforth is a free implementation of ANS Forth available on many
personal machines. This manual corresponds to version 0.1.
@end ifinfo

@menu
* License::
* Goals::                       About the Gforth Project
* Other Books::                 Things you might want to read
* Invocation::                  Starting Gforth
* Words::                       Forth words available in Gforth
* ANS conformance::             Implementation-defined options etc.
* Model::                       The abstract machine of Gforth
* Emacs and Gforth::            The Gforth Mode
* Internals::                   Implementation details
* Bugs::                        How to report them
* Pedigree::                    Ancestors of Gforth
* Word Index::                  An item for each Forth word
* Node Index::                  An item for each node
@end menu

@node License, Goals, Top, Top
@unnumbered License
!!
Insert GPL here

@iftex
@unnumbered Preface
This manual documents Gforth. The reader is expected to know
Forth. This manual is primarily a reference manual. @xref{Other Books}
for introductory material.
@end iftex

@node Goals, Other Books, License, Top
@comment node-name, next, previous, up
@chapter Goals of Gforth
@cindex Goals
The goal of the Gforth Project is to develop a standard model for
ANSI Forth. This can be split into several subgoals:

@itemize @bullet
@item
Gforth should conform to the ANSI Forth standard.
@item
It should be a model, i.e. it should define all the
implementation-dependent things.
@item
It should become standard, i.e. widely accepted and used. This goal
is the most difficult one.
@end itemize

To achieve these goals Gforth should be
@itemize @bullet
@item
Similar to previous models (fig-Forth, F83)
@item
Powerful. It should provide for all the things that are considered
necessary today and even some that are not yet considered necessary.
@item
Efficient. It should not get the reputation of being exceptionally
slow.
@item
Free.
@item
Available on many machines/easy to port.
@end itemize

Have we achieved these goals? Gforth conforms to the ANS Forth
standard. It may be considered a model, but we have not yet documented
which parts of the model are stable and which parts we are likely to
change. It certainly has not yet become a de facto standard. It has some
similarities and some differences to previous models. It has some
powerful features, but not yet everything that we envisioned. We
certainly have achieved our execution speed goals (@pxref{Performance}).
It is free and available on many machines.

@node Other Books, Invocation, Goals, Top
@chapter Other books on ANS Forth

As the standard is relatively new, there are not many books out yet. It
is not recommended to learn Forth by using Gforth and a book that is
not written for ANS Forth, as you will not know your mistakes from the
deviations of the book.

There is, of course, the standard, the definitive reference if you want to
write ANS Forth programs. It will be available in printed form from
Global Engineering Documents !! sometime in spring or summer 1994. If you
are lucky, you can still get dpANS6 (the draft that was approved as
standard) by anonymous ftp from ftp.uu.net:/vendor/minerva/x3j14.

@cite{Forth: The new model} by Jack Woehr (!! Publisher) is an
introductory book based on a draft version of the standard. It does not
cover the whole standard. It also contains interesting background
information (Jack Woehr was in the ANS Forth Technical Committee). It is
not appropriate for complete newbies, but programmers experienced in
other languages should find it ok.

@node Invocation, Words, Other Books, Top
@chapter Invocation

You will usually just say @code{gforth}. In many other cases the default
Gforth image will be invoked like this:

@example
gforth [files] [-e forth-code]
@end example

executing the contents of the files and the Forth code in the order they
are given.

In general, the command line looks like this:

@example
gforth [initialization options] [image-specific options]
@end example

The initialization options must come before the rest of the command
line. They are:

@table @code
@item --image-file @var{file}
Loads the Forth image @var{file} instead of the default
@file{gforth.fi}.

@item --path @var{path}
Uses @var{path} for searching the image file and Forth source code
files instead of the default in the environment variable
@code{GFORTHPATH} or the path specified at installation time (typically
@file{/usr/local/lib/gforth:.}). A path is given as a @code{:}-separated
list.

@item --dictionary-size @var{size}
@item -m @var{size}
Allocate @var{size} space for the Forth dictionary instead of
using the default specified in the image (typically 256K). The
@var{size} specification consists of an integer and a unit (e.g.,
@code{4M}). The unit can be one of @code{b} (bytes), @code{e} (element
size, in this case Cells), @code{k} (kilobytes), and @code{M}
(Megabytes). If no unit is specified, @code{e} is used.

@item --data-stack-size @var{size}
@item -d @var{size}
Allocate @var{size} space for the data stack instead of using the
default specified in the image (typically 16K).

@item --return-stack-size @var{size}
@item -r @var{size}
Allocate @var{size} space for the return stack instead of using the
default specified in the image (typically 16K).

@item --fp-stack-size @var{size}
@item -f @var{size}
Allocate @var{size} space for the floating point stack instead of
using the default specified in the image (typically 16K). In this case
the unit specifier @code{e} refers to floating point numbers.

@item --locals-stack-size @var{size}
@item -l @var{size}
Allocate @var{size} space for the locals stack instead of using the
default specified in the image (typically 16K).

@end table

As explained above, the image-specific command-line arguments for the
default image @file{gforth.fi} consist of a sequence of filenames and
@code{-e @var{forth-code}} options that are interpreted in the sequence
in which they are given.
The @code{-e @var{forth-code}} or
@code{--evaluate @var{forth-code}} option evaluates the Forth
code. This option takes only one argument; if you want to evaluate more
Forth words, you have to quote them or use several @code{-e}s. To exit
after processing the command line (instead of entering interactive mode)
append @code{-e bye} to the command line.

Not yet implemented:
On startup the system first executes the system initialization file
(unless the option @code{--no-init-file} is given; note that the system
resulting from using this option may not be ANS Forth conformant). Then
the user initialization file @file{.gforth.fs} is executed, unless the
option @code{--no-rc} is given; this file is first searched in @file{.},
then in @file{~}, then in the normal path (see above).

@node Words, ANS conformance, Invocation, Top
@chapter Forth Words

@menu
* Notation::
* Arithmetic::
* Stack Manipulation::
* Memory access::
* Control Structures::
* Locals::
* Defining Words::
* Wordlists::
* Files::
* Blocks::
* Other I/O::
* Programming Tools::
* Assembler and Code words::
* Threading Words::
@end menu

@node Notation, Arithmetic, Words, Words
@section Notation

The Forth words are described in this section in the glossary notation
that has become a de-facto standard for Forth texts, i.e.

@format
@var{word}     @var{Stack effect}   @var{wordset}   @var{pronunciation}
@end format
@var{Description}

@table @var
@item word
The name of the word. BTW, Gforth is case insensitive, so you can
type the words in lower case (However, @pxref{core-idef}).

@item Stack effect
The stack effect is written in the notation @code{@var{before} --
@var{after}}, where @var{before} and @var{after} describe the top of
stack entries before and after the execution of the word. The rest of
the stack is not touched by the word. The top of stack is rightmost,
i.e., a stack sequence is written as it is typed in. Note that Gforth
uses a separate floating point stack, but a unified stack
notation. Also, return stack effects are not shown in @var{stack
effect}, but in @var{Description}. The name of a stack item describes
the type and/or the function of the item. See below for a discussion of
the types.

@item pronunciation
How the word is pronounced

@item wordset
The ANS Forth standard is divided into several wordsets. A standard
system need not support all of them. So, the fewer wordsets your program
uses the more portable it will be in theory. However, we suspect that
most ANS Forth systems on personal machines will feature all
wordsets. Words that are not defined in the ANS standard have
@code{gforth} as wordset.

@item Description
A description of the behaviour of the word.
@end table

The type of a stack item is specified by the character(s) the name
starts with:

@table @code
@item f
Bool, i.e. @code{false} or @code{true}.
@item c
Char
@item w
Cell, can contain an integer or an address
@item n
signed integer
@item u
unsigned integer
@item d
double sized signed integer
@item ud
double sized unsigned integer
@item r
Float
@item a_
Cell-aligned address
@item c_
Char-aligned address (note that a Char is two bytes in Windows NT)
@item f_
Float-aligned address
@item df_
Address aligned for IEEE double precision float
@item sf_
Address aligned for IEEE single precision float
@item xt
Execution token, same size as Cell
@item wid
Wordlist ID, same size as Cell
@item f83name
Pointer to a name structure
@end table

@node Arithmetic, Stack Manipulation, Notation, Words
@section Arithmetic
Forth arithmetic is not checked, i.e., you will not hear about integer
overflow on addition or multiplication; you may hear about division by
zero if you are lucky. The operator is written after the operands, but
the operands are still in the original order. I.e., the infix @code{2-1}
corresponds to @code{2 1 -}. Forth offers a variety of division
operators. If you perform division with potentially negative operands,
you do not want to use @code{/} or @code{/mod} with its undefined
behaviour, but rather @code{fm/mod} or @code{sm/rem} (probably the
former, @pxref{Mixed precision}).
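
As a minimal sketch of the difference between @code{fm/mod} (floored
division) and @code{sm/rem} (symmetric division), consider dividing -7 by
2. The operand values are chosen here for illustration only. Both words
expect a double-cell dividend, hence the @code{s>d}; each leaves the
remainder below the quotient, so the first @code{.} prints the quotient.

@example
-7 s>d 2 fm/mod . .   \ floored:   prints -4 1   (quotient, then remainder)
-7 s>d 2 sm/rem . .   \ symmetric: prints -3 -1  (quotient, then remainder)
@end example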

@menu
* Single precision::
* Bitwise operations::
* Mixed precision::             operations with single and double-cell integers
* Double precision::            Double-cell integer arithmetic
* Floating Point::
@end menu

@node Single precision, Bitwise operations, Arithmetic, Arithmetic
@subsection Single precision
doc-+
doc--
doc-*
doc-/
doc-mod
doc-/mod
doc-negate
doc-abs
doc-min
doc-max

@node Bitwise operations, Mixed precision, Single precision, Arithmetic
@subsection Bitwise operations
doc-and
doc-or
doc-xor
doc-invert
doc-2*
doc-2/

@node Mixed precision, Double precision, Bitwise operations, Arithmetic
@subsection Mixed precision
doc-m+
doc-*/
doc-*/mod
doc-m*
doc-um*
doc-m*/
doc-um/mod
doc-fm/mod
doc-sm/rem

@node Double precision, Floating Point, Mixed precision, Arithmetic
@subsection Double precision

The outer (aka text) interpreter converts numbers containing a dot into
a double precision number. Note that only numbers with the dot as last
character are standard-conforming.

doc-d+
doc-d-
doc-dnegate
doc-dabs
doc-dmin
doc-dmax

@node Floating Point,  , Double precision, Arithmetic
@subsection Floating Point

The format of floating point numbers recognized by the outer (aka text)
interpreter is: a signed decimal number, possibly containing a decimal
point (@code{.}), followed by @code{E} or @code{e}, optionally followed
by a signed integer (the exponent). E.g., @code{1e} is the same as
@code{+1.0e+0}. Note that a number without @code{e}
is not interpreted as floating-point number, but as double (if the
number contains a @code{.}) or single precision integer.
Also,
conversions between string and floating point numbers always use base
10, irrespective of the value of @code{BASE}. If @code{BASE} contains a
value greater than 14, the @code{E} may be interpreted as digit and the
number will be interpreted as integer, unless it has a signed exponent
(both @code{+} and @code{-} are allowed as signs).

Angles in floating point operations are given in radians (a full circle
has 2 pi radians). Note that Gforth has a separate floating point
stack, but we use the unified notation.

Floating point numbers have a number of unpleasant surprises for the
unwary (e.g., floating point addition is not associative) and even a few
for the wary. You should not use them unless you know what you are doing
or you don't care that the results you get are totally bogus. If you
want to learn about the problems of floating point numbers (and how to
avoid them), you might start with @cite{David Goldberg, What Every
Computer Scientist Should Know About Floating-Point Arithmetic, ACM
Computing Surveys 23(1):5@minus{}48, March 1991}.
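
A small sketch of the non-associativity of floating point addition,
assuming that floats are IEEE double precision numbers (the values are
chosen for illustration only):

@example
1e20 -1e20 f+ 1e f+ f.   \ (1e20 + -1e20) + 1 gives 1
1e20 -1e20 1e f+ f+ f.   \ 1e20 + (-1e20 + 1) gives 0; the 1 is lost to rounding
@end example

The exact output format of @code{f.} may differ between systems.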

doc-f+
doc-f-
doc-f*
doc-f/
doc-fnegate
doc-fabs
doc-fmax
doc-fmin
doc-floor
doc-fround
doc-f**
doc-fsqrt
doc-fexp
doc-fexpm1
doc-fln
doc-flnp1
doc-flog
doc-falog
doc-fsin
doc-fcos
doc-fsincos
doc-ftan
doc-fasin
doc-facos
doc-fatan
doc-fatan2
doc-fsinh
doc-fcosh
doc-ftanh
doc-fasinh
doc-facosh
doc-fatanh

@node Stack Manipulation, Memory access, Arithmetic, Words
@section Stack Manipulation

Gforth has a data stack (aka parameter stack) for characters, cells,
addresses, and double cells, a floating point stack for floating point
numbers, a return stack for storing the return addresses of colon
definitions and other data, and a locals stack for storing local
variables. Note that while every sane Forth has a separate floating
point stack, this is not strictly required; an ANS Forth system could
theoretically keep floating point numbers on the data stack. As an
additional difficulty, you don't know how many cells a floating point
number takes. It is reportedly possible to write words in a way that
they work also for a unified stack model, but we do not recommend trying
it. Instead, just say that your program has an environmental dependency
on a separate FP stack.

Also, a Forth system is allowed to keep the local variables on the
return stack. This is reasonable, as local variables usually eliminate
the need to use the return stack explicitly. So, if you want to produce
a standard complying program and if you are using local variables in a
word, forget about return stack manipulations in that word (see the
standard document for the exact rules).

@menu
* Data stack::
* Floating point stack::
* Return stack::
* Locals stack::
* Stack pointer manipulation::
@end menu

@node Data stack, Floating point stack, Stack Manipulation, Stack Manipulation
@subsection Data stack
doc-drop
doc-nip
doc-dup
doc-over
doc-tuck
doc-swap
doc-rot
doc--rot
doc-?dup
doc-pick
doc-roll
doc-2drop
doc-2nip
doc-2dup
doc-2over
doc-2tuck
doc-2swap
doc-2rot

@node Floating point stack, Return stack, Data stack, Stack Manipulation
@subsection Floating point stack
doc-fdrop
doc-fnip
doc-fdup
doc-fover
doc-ftuck
doc-fswap
doc-frot

@node Return stack, Locals stack, Floating point stack, Stack Manipulation
@subsection Return stack
doc->r
doc-r>
doc-r@
doc-rdrop
doc-2>r
doc-2r>
doc-2r@
doc-2rdrop

@node Locals stack, Stack pointer manipulation, Return stack, Stack Manipulation
@subsection Locals stack

@node Stack pointer manipulation,  , Locals stack, Stack Manipulation
@subsection Stack pointer manipulation
doc-sp@
doc-sp!
doc-fp@
doc-fp!
doc-rp@
doc-rp!
doc-lp@
doc-lp!

@node Memory access, Control Structures, Stack Manipulation, Words
@section Memory access

@menu
* Stack-Memory transfers::
* Address arithmetic::
* Memory block access::
@end menu

@node Stack-Memory transfers, Address arithmetic, Memory access, Memory access
@subsection Stack-Memory transfers

doc-@
doc-!
doc-+!
doc-c@
doc-c!
doc-2@
doc-2!
doc-f@
doc-f!
doc-sf@
doc-sf!
doc-df@
doc-df!

@node Address arithmetic, Memory block access, Stack-Memory transfers, Memory access
@subsection Address arithmetic

ANS Forth does not specify the sizes of the data types. Instead, it
offers a number of words for computing sizes and doing address
arithmetic. Basically, address arithmetic is performed in terms of
address units (aus); on most systems the address unit is one byte. Note
that a character may have more than one au, so @code{chars} is not a noop
(on systems where it is a noop, it compiles to nothing).

ANS Forth also defines words for aligning addresses for specific
types. Many computers require that accesses to specific data types
must only occur at specific addresses; e.g., that cells may only be
accessed at addresses divisible by 4. Even if a machine allows unaligned
accesses, it can usually perform aligned accesses faster.

For the performance-conscious: alignment operations are usually only
necessary during the definition of a data structure, not during the
(more frequent) accesses to it.

ANS Forth defines no words for character-aligning addresses. This is not
an oversight, but reflects the fact that addresses that are not
char-aligned have no use in the standard and therefore will not be
created.

The standard guarantees that addresses returned by @code{CREATE}d words
are cell-aligned; in addition, Gforth guarantees that these addresses
are aligned for all purposes.

Note that the standard defines a word @code{char}, which has nothing to
do with address arithmetic.
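
The following sketch shows address arithmetic that does not depend on
the cell size. The names @code{squares} and @code{square-slot} are made
up for this example and are not Gforth words.

@example
create squares  4 cells allot    \ room for four cells; CREATE yields a cell-aligned address
: square-slot ( u -- a_addr )    \ address of the u-th cell of squares
  cells squares + ;
9 3 square-slot !                \ store 9 in cell number 3
3 square-slot @@ .               \ prints 9
@end example

Because the offset is computed with @code{cells} rather than with a
hard-coded multiplier, the same source should work on systems with
different cell sizes.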

doc-chars
doc-char+
doc-cells
doc-cell+
doc-align
doc-aligned
doc-floats
doc-float+
doc-falign
doc-faligned
doc-sfloats
doc-sfloat+
doc-sfalign
doc-sfaligned
doc-dfloats
doc-dfloat+
doc-dfalign
doc-dfaligned
doc-maxalign
doc-maxaligned
doc-cfalign
doc-cfaligned
doc-address-unit-bits

@node Memory block access,  , Address arithmetic, Memory access
@subsection Memory block access

doc-move
doc-erase

While the previous words work on address units, the rest works on
characters.

doc-cmove
doc-cmove>
doc-fill
doc-blank

@node Control Structures, Locals, Memory access, Words
@section Control Structures

Control structures in Forth cannot be used in interpret state, only in
compile state, i.e., in a colon definition. We do not like this
limitation, but have not seen a satisfying way around it yet, although
many schemes have been proposed.

@menu
* Selection::
* Simple Loops::
* Counted Loops::
* Arbitrary control structures::
* Calls and returns::
* Exception Handling::
@end menu

@node Selection, Simple Loops, Control Structures, Control Structures
@subsection Selection

@example
@var{flag}
IF
  @var{code}
ENDIF
@end example
or
@example
@var{flag}
IF
  @var{code1}
ELSE
  @var{code2}
ENDIF
@end example

You can use @code{THEN} instead of @code{ENDIF}. Indeed, @code{THEN} is
standard, and @code{ENDIF} is not, although it is quite popular. We
recommend using @code{ENDIF}, because it is less confusing for people
who also know other languages (and is not prone to reinforcing negative
prejudices against Forth in these people).
Adding @code{ENDIF} to a
system that only supplies @code{THEN} is simple:
@example
: endif   POSTPONE then ; immediate
@end example

[According to @cite{Webster's New Encyclopedic Dictionary}, @dfn{then
(adv.)}  has the following meanings:
@quotation
... 2b: following next after in order ... 3d: as a necessary consequence
(if you were there, then you saw them).
@end quotation
Forth's @code{THEN} has the meaning 2b, whereas @code{THEN} in Pascal
and many other programming languages has the meaning 3d.]

We also provide the words @code{?dup-if} and @code{?dup-0=-if}, so you
can avoid using @code{?dup}.

@example
@var{n}
CASE
  @var{n1} OF @var{code1} ENDOF
  @var{n2} OF @var{code2} ENDOF
  @dots{}
ENDCASE
@end example

Executes the first @var{codei}, where the @var{ni} is equal to
@var{n}. A default case can be added by simply writing the code after
the last @code{ENDOF}. It may use @var{n}, which is on top of the stack,
but must not consume it.

@node Simple Loops, Counted Loops, Selection, Control Structures
@subsection Simple Loops

@example
BEGIN
  @var{code1}
  @var{flag}
WHILE
  @var{code2}
REPEAT
@end example

@var{code1} is executed and @var{flag} is computed. If it is true,
@var{code2} is executed and the loop is restarted; if @var{flag} is
false, execution continues after the @code{REPEAT}.

@example
BEGIN
  @var{code}
  @var{flag}
UNTIL
@end example

@var{code} is executed. The loop is restarted if @var{flag} is false.

@example
BEGIN
  @var{code}
AGAIN
@end example

This is an endless loop.
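
As a concrete sketch of the @code{BEGIN} ... @code{WHILE} ...
@code{REPEAT} form, the following word counts down from @var{n} to 1.
The name @code{countdown} is made up for this example.

@example
: countdown ( n -- )
  BEGIN
    dup 0 >        \ flag: is the counter still positive?
  WHILE
    dup . 1-       \ print the counter, then decrement it
  REPEAT
  drop ;           \ discard the final counter value

3 countdown        \ prints 3 2 1
@end example

If the flag is already false on entry (e.g., @code{0 countdown}), the
part between @code{WHILE} and @code{REPEAT} is never executed.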

@node Counted Loops, Arbitrary control structures, Simple Loops, Control Structures
@subsection Counted Loops

The basic counted loop is:
@example
@var{limit} @var{start}
?DO
  @var{body}
LOOP
@end example

This performs one iteration for every integer, starting from @var{start}
and up to, but excluding @var{limit}. The counter, aka index, can be
accessed with @code{i}. E.g., the loop
@example
10 0 ?DO
  i .
LOOP
@end example
prints
@example
0 1 2 3 4 5 6 7 8 9
@end example
The index of the innermost loop can be accessed with @code{i}, the index
of the next loop with @code{j}, and the index of the third loop with
@code{k}.

The loop control data are kept on the return stack, so there are some
restrictions on mixing return stack accesses and counted loop
words. E.g., if you put values on the return stack outside the loop, you
cannot read them inside the loop. If you put values on the return stack
within a loop, you have to remove them before the end of the loop and
before accessing the index of the loop.

There are several variations on the counted loop:

@code{LEAVE} leaves the innermost counted loop immediately.

If @var{start} is greater than @var{limit}, a @code{?DO} loop is entered
(and @code{LOOP} iterates until they become equal by wrap-around
arithmetic). This behaviour is usually not what you want. Therefore,
Gforth offers @code{+DO} and @code{U+DO} (as replacements for
@code{?DO}), which do not enter the loop if @var{start} is greater than
@var{limit}; @code{+DO} is for signed loop parameters, @code{U+DO} for
unsigned loop parameters.
These words can be implemented easily on
standard systems, so using them does not make your programs hard to
port; e.g.:
@example
: +DO ( compile-time: -- do-sys; run-time: n1 n2 -- )
    POSTPONE over POSTPONE min POSTPONE ?DO ; immediate
@end example

@code{LOOP} can be replaced with @code{@var{n} +LOOP}; this updates the
index by @var{n} instead of by 1. The loop is terminated when the border
between @var{limit-1} and @var{limit} is crossed. E.g.:

@code{4 0 +DO  i .  2 +LOOP}   prints @code{0 2}

@code{4 1 +DO  i .  2 +LOOP}   prints @code{1 3}

The behaviour of @code{@var{n} +LOOP} is peculiar when @var{n} is negative:

@code{-1 0 ?DO  i .  -1 +LOOP}  prints @code{0 -1}

@code{ 0 0 ?DO  i .  -1 +LOOP}  prints nothing

Therefore we recommend avoiding @code{@var{n} +LOOP} with negative
@var{n}. One alternative is @code{@var{u} -LOOP}, which reduces the
index by @var{u} each iteration. The loop is terminated when the border
between @var{limit+1} and @var{limit} is crossed. Gforth also provides
@code{-DO} and @code{U-DO} for down-counting loops. E.g.:

@code{-2 0 -DO  i .  1 -LOOP}  prints @code{0 -1}

@code{-1 0 -DO  i .  1 -LOOP}  prints @code{0}

@code{ 0 0 -DO  i .  1 -LOOP}  prints nothing

Another alternative is @code{@var{n} S+LOOP}, where the negative
case behaves symmetrically to the positive case:

@code{-2 0 -DO  i .  -1 S+LOOP}  prints @code{0 -1}

The loop is terminated when the border between @var{limit@minus{}sgn(n)}
and @var{limit} is crossed. Unfortunately, neither @code{-LOOP} nor
@code{S+LOOP} is part of the ANS Forth standard, and they are not easy
to implement using standard words. If you want to write standard
programs, just avoid counting down.

@code{?DO} can also be replaced by @code{DO}. @code{DO} always enters
the loop, independent of the loop parameters.
Do not use @code{DO}, even
if you know that the loop is entered in any case. Such knowledge tends
to become invalid during maintenance of a program, and then the
@code{DO} will make trouble.

@code{UNLOOP} is used to prepare for an abnormal loop exit, e.g., via
@code{EXIT}. @code{UNLOOP} removes the loop control parameters from the
return stack so @code{EXIT} can get to its return address.

Another counted loop is
@example
@var{n}
FOR
  @var{body}
NEXT
@end example
This is the preferred loop of native code compiler writers who are too
lazy to optimize @code{?DO} loops properly. In Gforth, this loop
iterates @var{n+1} times; @code{i} produces values starting with @var{n}
and ending with 0. Other Forth systems may behave differently, even if
they support @code{FOR} loops.

@node Arbitrary control structures, Calls and returns, Counted Loops, Control Structures
@subsection Arbitrary control structures

ANS Forth permits and supports using control structures in a non-nested
way. Information about incomplete control structures is stored on the
control-flow stack. This stack may be implemented on the Forth data
stack, and this is what we have done in Gforth.

An @i{orig} entry represents an unresolved forward branch, a @i{dest}
entry represents a backward branch target. A few words are the basis for
building any control structure possible (except control structures that
need storage, like calls, coroutines, and backtracking).

doc-if
doc-ahead
doc-then
doc-begin
doc-until
doc-again
doc-cs-pick
doc-cs-roll

On many systems control-flow stack items take one word, in Gforth they
currently take three (this may change in the future).
Therefore it is a
really good idea to manipulate the control-flow stack with
@code{cs-pick} and @code{cs-roll}, not with data stack manipulation
words.

Some standard control structure words are built from these words:

doc-else
doc-while
doc-repeat

Counted loop words constitute a separate group of words:

doc-?do
doc-+do
doc-u+do
doc--do
doc-u-do
doc-do
doc-for
doc-loop
doc-s+loop
doc-+loop
doc--loop
doc-next
doc-leave
doc-?leave
doc-unloop
doc-done

The standard does not allow using @code{cs-pick} and @code{cs-roll} on
@i{do-sys}. Our system allows it, but it's your job to ensure that for
every @code{?DO} etc. there is exactly one @code{UNLOOP} on any path
through the definition (@code{LOOP} etc. compile an @code{UNLOOP} on the
fall-through path). Also, you have to ensure that all @code{LEAVE}s are
resolved (by using one of the loop-ending words or @code{DONE}).

Another group of control structure words is

doc-case
doc-endcase
doc-of
doc-endof

@i{case-sys} and @i{of-sys} cannot be processed using @code{cs-pick} and
@code{cs-roll}.

@subsubsection Programming Style

In order to ensure readability we recommend that you do not create
arbitrary control structures directly, but define new control structure
words for the control structure you want and use these words in your
program.

E.g., instead of writing

@example
begin
  ...
if [ 1 cs-roll ]
  ...
again then
@end example

we recommend defining control structure words, e.g.,

@example
: while ( dest -- orig dest )
  POSTPONE if
  1 cs-roll ; immediate

: repeat ( orig dest -- )
  POSTPONE again
  POSTPONE then ; immediate
@end example

and then using these to create the control structure:

@example
begin
  ...
while
  ...
repeat
@end example

That's much easier to read, isn't it? Of course, @code{WHILE} and
@code{REPEAT} are predefined, so in this example it would not be
necessary to define them.

@node Calls and returns, Exception Handling, Arbitrary control structures, Control Structures
@subsection Calls and returns

A definition can be called simply by writing the name of the
definition. When the end of the definition is reached, it returns. An
earlier return can be forced using

doc-exit

Don't forget to clean up the return stack and @code{UNLOOP} any
outstanding @code{?DO}...@code{LOOP}s before @code{EXIT}ing. The
primitive compiled by @code{EXIT} is

doc-;s

@node Exception Handling, , Calls and returns, Control Structures
@subsection Exception Handling

doc-catch
doc-throw

@node Locals, Defining Words, Control Structures, Words
@section Locals

Local variables can make Forth programming more enjoyable and Forth
programs easier to read. Unfortunately, the locals of ANS Forth are
laden with restrictions. Therefore, we provide not only the ANS Forth
locals wordset, but also our own, more powerful locals wordset (we
implemented the ANS Forth locals wordset through our locals wordset).

@menu
* Gforth locals::
* ANS Forth locals::
@end menu

@node Gforth locals, ANS Forth locals, Locals, Locals
@subsection Gforth locals

Locals can be defined with

@example
@{ local1 local2 ... -- comment @}
@end example
or
@example
@{ local1 local2 ... @}
@end example

E.g.,
@example
: max @{ n1 n2 -- n3 @}
 n1 n2 > if
   n1
 else
   n2
 endif ;
@end example

The similarity of locals definitions with stack comments is intended. A
locals definition often replaces the stack comment of a word. The order
of the locals corresponds to the order in a stack comment and everything
after the @code{--} is really a comment.

This similarity has one disadvantage: It is too easy to confuse locals
declarations with stack comments, causing bugs and making them hard to
find. However, this problem can be avoided by appropriate coding
conventions: Do not use both notations in the same program. If you do,
they should be distinguished using additional means, e.g. by position.

The name of the local may be preceded by a type specifier, e.g.,
@code{F:} for a floating point value:

@example
: CX* @{ F: Ar F: Ai F: Br F: Bi -- Cr Ci @}
\ complex multiplication
 Ar Br f* Ai Bi f* f-
 Ar Bi f* Ai Br f* f+ ;
@end example

Gforth currently supports cells (@code{W:}, @code{W^}), doubles
(@code{D:}, @code{D^}), floats (@code{F:}, @code{F^}) and characters
(@code{C:}, @code{C^}) in two flavours: a value-flavoured local (defined
with @code{W:}, @code{D:} etc.) produces its value and can be changed
with @code{TO}. A variable-flavoured local (defined with @code{W^} etc.)
produces its address (which becomes invalid when the variable's scope is
left).
E.g., the standard word @code{emit} can be defined in terms of
@code{type} like this:

@example
: emit @{ C^ char* -- @}
    char* 1 type ;
@end example

A local without type specifier is a @code{W:} local. Both flavours of
locals are initialized with values from the data or FP stack.

Currently there is no way to define locals with user-defined data
structures, but we are working on it.

Gforth allows defining locals everywhere in a colon definition. This
poses the following questions:

@menu
* Where are locals visible by name?::
* How long do locals live?::
* Programming Style::
* Implementation::
@end menu

@node Where are locals visible by name?, How long do locals live?, Gforth locals, Gforth locals
@subsubsection Where are locals visible by name?

Basically, the answer is that locals are visible where you would expect
them in block-structured languages, and sometimes a little longer. If you
want to restrict the scope of a local, enclose its definition in
@code{SCOPE}...@code{ENDSCOPE}.

doc-scope
doc-endscope

These words behave like control structure words, so you can use them
with @code{CS-PICK} and @code{CS-ROLL} to restrict the scope in
arbitrary ways.

If you want a more exact answer to the visibility question, here's the
basic principle: A local is visible in all places that can only be
reached through the definition of the local@footnote{In compiler
construction terminology, all places dominated by the definition of the
local.}. In other words, it is not visible in places that can be reached
without going through the definition of the local.
E.g., locals defined
in @code{IF}...@code{ENDIF} are visible until the @code{ENDIF}, locals
defined in @code{BEGIN}...@code{UNTIL} are visible after the
@code{UNTIL} (until, e.g., a subsequent @code{ENDSCOPE}).

The reasoning behind this solution is: We want to have the locals
visible as long as it is meaningful. The user can always make the
visibility shorter by using explicit scoping. In a place that can
only be reached through the definition of a local, the meaning of a
local name is clear. In other places it is not: How is the local
initialized at the control flow path that does not contain the
definition? Which local is meant, if the same name is defined twice in
two independent control flow paths?

This should be enough detail for nearly all users, so you can skip the
rest of this section. If you really must know all the gory details and
options, read on.

In order to implement this rule, the compiler has to know which places
are unreachable. It knows this automatically after @code{AHEAD},
@code{AGAIN}, @code{EXIT} and @code{LEAVE}; in other cases (e.g., after
most @code{THROW}s), you can use the word @code{UNREACHABLE} to tell the
compiler that the control flow never reaches that place. If
@code{UNREACHABLE} is not used where it could, the only consequence is
that the visibility of some locals is more limited than the rule above
says. If @code{UNREACHABLE} is used where it should not (i.e., if you
lie to the compiler), buggy code will be produced.

Another problem with this rule is that at @code{BEGIN}, the compiler
does not know which locals will be visible on the incoming
back-edge.
All problems discussed in the following are due to this
ignorance of the compiler (we discuss the problems using @code{BEGIN}
loops as examples; the discussion also applies to @code{?DO} and other
loops). Perhaps the most insidious example is:
@example
AHEAD
BEGIN
  x
[ 1 CS-ROLL ] THEN
  @{ x @}
  ...
UNTIL
@end example

This should be legal according to the visibility rule. The use of
@code{x} can only be reached through the definition; but that appears
textually below the use.

From this example it is clear that the visibility rules cannot be fully
implemented without major headaches. Our implementation treats common
cases as advertised and the exceptions are treated in a safe way: The
compiler makes a reasonable guess about the locals visible after a
@code{BEGIN}; if it is too pessimistic, the
user will get a spurious error about the local not being defined; if the
compiler is too optimistic, it will notice this later and issue a
warning. In the case above the compiler would complain about @code{x}
being undefined at its use. You can see from the obscure examples in
this section that it takes quite unusual control structures to get the
compiler into trouble, and even then it will often do fine.

If the @code{BEGIN} is reachable from above, the most optimistic guess
is that all locals visible before the @code{BEGIN} will also be
visible after the @code{BEGIN}. This guess is valid for all loops that
are entered only through the @code{BEGIN}, in particular, for normal
@code{BEGIN}...@code{WHILE}...@code{REPEAT} and
@code{BEGIN}...@code{UNTIL} loops and it is implemented in our
compiler.
When the branch to the @code{BEGIN} is finally generated by
@code{AGAIN} or @code{UNTIL}, the compiler checks the guess and
warns the user if it was too optimistic:
@example
IF
  @{ x @}
BEGIN
  \ x ?
[ 1 cs-roll ] THEN
  ...
UNTIL
@end example

Here, @code{x} lives only until the @code{BEGIN}, but the compiler
optimistically assumes that it lives until the @code{THEN}. It notices
this difference when it compiles the @code{UNTIL} and issues a
warning. The user can avoid the warning, and make sure that @code{x}
is not used in the wrong area by using explicit scoping:
@example
IF
  SCOPE
  @{ x @}
  ENDSCOPE
BEGIN
[ 1 cs-roll ] THEN
  ...
UNTIL
@end example

Since the guess is optimistic, there will be no spurious error messages
about undefined locals.

If the @code{BEGIN} is not reachable from above (e.g., after
@code{AHEAD} or @code{EXIT}), the compiler cannot even make an
optimistic guess, as the locals visible after the @code{BEGIN} may be
defined later. Therefore, the compiler assumes that no locals are
visible after the @code{BEGIN}. However, the user can use
@code{ASSUME-LIVE} to make the compiler assume that the same locals are
visible at the @code{BEGIN} as at the point where the top control-flow stack
item was created.

doc-assume-live

E.g.,
@example
@{ x @}
AHEAD
ASSUME-LIVE
BEGIN
  x
[ 1 CS-ROLL ] THEN
  ...
UNTIL
@end example

Other cases where the locals are defined before the @code{BEGIN} can be
handled by inserting an appropriate @code{CS-ROLL} before the
@code{ASSUME-LIVE} (and changing the control-flow stack manipulation
behind the @code{ASSUME-LIVE}).

Cases where locals are defined after the @code{BEGIN} (but should be
visible immediately after the @code{BEGIN}) can only be handled by
rearranging the loop. E.g., the ``most insidious'' example above can be
rearranged into:
@example
BEGIN
  @{ x @}
  ... 0=
WHILE
  x
REPEAT
@end example

@node How long do locals live?, Programming Style, Where are locals visible by name?, Gforth locals
@subsubsection How long do locals live?

The right answer for the lifetime question would be: A local lives at
least as long as it can be accessed. For a value-flavoured local this
means: until the end of its visibility. However, a variable-flavoured
local could be accessed through its address far beyond its visibility
scope. Ultimately, this would mean that such locals would have to be
garbage collected. Since this entails un-Forth-like implementation
complexities, I adopted the same cowardly solution as some other
languages (e.g., C): The local lives only as long as it is visible;
afterwards its address is invalid (and programs that access it
afterwards are erroneous).

@node Programming Style, Implementation, How long do locals live?, Gforth locals
@subsubsection Programming Style

The freedom to define locals anywhere has the potential to change
programming styles dramatically. In particular, the need to use the
return stack for intermediate storage vanishes. Moreover, all stack
manipulations (except @code{PICK}s and @code{ROLL}s with run-time
determined arguments) can be eliminated: If the stack items are in the
wrong order, just write a locals definition for all of them; then
write the items in the order you want.

This seems a little far-fetched and eliminating stack manipulations is
unlikely to become a conscious programming objective.
Still, the number
of stack manipulations will be reduced dramatically if local variables
are used liberally (e.g., compare @code{max} in @ref{Gforth locals} with
a traditional implementation of @code{max}).

This shows one potential benefit of locals: making Forth programs more
readable. Of course, this benefit will only be realized if the
programmers continue to honour the principle of factoring instead of
using the added latitude to make the words longer.

Using @code{TO} can and should be avoided. Without @code{TO},
every value-flavoured local has only a single assignment and many
advantages of functional languages apply to Forth. I.e., programs are
easier to analyse, to optimize and to read: It is clear from the
definition what the local stands for, it does not turn into something
different later.

E.g., a definition using @code{TO} might look like this:
@example
: strcmp @{ addr1 u1 addr2 u2 -- n @}
 u1 u2 min 0
 ?do
   addr1 c@@ addr2 c@@ - ?dup
   if
     unloop exit
   then
   addr1 char+ TO addr1
   addr2 char+ TO addr2
 loop
 u1 u2 - ;
@end example
Here, @code{TO} is used to update @code{addr1} and @code{addr2} at
every loop iteration. @code{strcmp} is a typical example of the
readability problems of using @code{TO}. When you start reading
@code{strcmp}, you think that @code{addr1} refers to the start of the
string. Only near the end of the loop do you realize that it is something
else.

This can be avoided by defining two locals at the start of the loop that
are initialized with the right value for the current iteration.
@example
: strcmp @{ addr1 u1 addr2 u2 -- n @}
 addr1 addr2
 u1 u2 min 0
 ?do @{ s1 s2 @}
   s1 c@@ s2 c@@ - ?dup
   if
     unloop exit
   then
   s1 char+ s2 char+
 loop
 2drop
 u1 u2 - ;
@end example
Here it is clear from the start that @code{s1} has a different value
in every loop iteration.

@node Implementation, , Programming Style, Gforth locals
@subsubsection Implementation

Gforth uses an extra locals stack. The most compelling reason for
this is that the return stack is not float-aligned; using an extra stack
also eliminates the problems and restrictions of using the return stack
as locals stack. Like the other stacks, the locals stack grows toward
lower addresses. A few primitives allow an efficient implementation:

doc-@local#
doc-f@local#
doc-laddr#
doc-lp+!#
doc-lp!
doc->l
doc-f>l

In addition to these primitives, some specializations of these
primitives for commonly occurring inline arguments are provided for
efficiency reasons, e.g., @code{@@local0} as specialization of
@code{@@local#} for the inline argument 0. The following compiling words
compile the right specialized version, or the general version, as
appropriate:

doc-compile-@local
doc-compile-f@local
doc-compile-lp+!

Combinations of conditional branches and @code{lp+!#} like
@code{?branch-lp+!#} (the locals pointer is only changed if the branch
is taken) are provided for efficiency and correctness in loops.

A special area in the dictionary space is reserved for keeping the
local variable names. @code{@{} switches the dictionary pointer to this
area and @code{@}} switches it back and generates the locals
initializing code. @code{W:} etc.@ are normal defining words.
This
special area is cleared at the start of every colon definition.

A special feature of Gforth's dictionary is used to implement the
definition of locals without type specifiers: every wordlist (aka
vocabulary) has its own methods for searching
etc. (@pxref{Wordlists}). For the present purpose we defined a wordlist
with a special search method: When it is searched for a word, it
actually creates that word using @code{W:}. @code{@{} changes the search
order to first search the wordlist containing @code{@}}, @code{W:} etc.,
and then the wordlist for defining locals without type specifiers.

The lifetime rules support a stack discipline within a colon
definition: The lifetime of a local is either nested with other locals'
lifetimes or it does not overlap them.

At @code{BEGIN}, @code{IF}, and @code{AHEAD} no code for locals stack
pointer manipulation is generated. Between control structure words
locals definitions can push locals onto the locals stack. @code{AGAIN}
is the simplest of the other three control flow words. It has to
restore the locals stack depth of the corresponding @code{BEGIN}
before branching. The code looks like this:
@format
@code{lp+!#} current-locals-size @minus{} dest-locals-size
@code{branch} <begin>
@end format

@code{UNTIL} is a little more complicated: If it branches back, it
must adjust the stack just like @code{AGAIN}. But if it falls through,
the locals stack must not be changed. The compiler generates the
following code:
@format
@code{?branch-lp+!#} <begin> current-locals-size @minus{} dest-locals-size
@end format
The locals stack pointer is only adjusted if the branch is taken.
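
As a hedged illustration of the @code{UNTIL} case, consider the following
sketch (the word name @code{countdown} is made up for this example and is
not part of Gforth). The local @code{x} is defined inside the loop, so
every pass through the body pushes one cell-sized local onto the locals
stack. When @code{UNTIL} branches back, that cell has to be dropped again
to restore the depth at @code{BEGIN}, whereas on the fall-through path
@code{x} stays alive:
@example
: countdown ( n -- ) \ prints n, n-1, ..., 1; assumes n > 0
  BEGIN
    @{ x @}         \ pops the top of the data stack into a new local x
    x .
    x 1- dup 0=   \ leave the next value and the termination flag
  UNTIL           \ back-branch: locals stack is readjusted by the size of x
  drop ;          \ fall-through: x is still alive here, until the ;
@end example
Under these assumptions, @code{3 countdown} should print @code{3 2 1}.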

@code{THEN} can produce somewhat inefficient code:
@format
@code{lp+!#} current-locals-size @minus{} orig-locals-size
<orig target>:
@code{lp+!#} orig-locals-size @minus{} new-locals-size
@end format
The second @code{lp+!#} adjusts the locals stack pointer from the
level at the @var{orig} point to the level after the @code{THEN}. The
first @code{lp+!#} adjusts the locals stack pointer from the current
level to the level at the orig point, so the complete effect is an
adjustment from the current level to the right level after the
@code{THEN}.

In a conventional Forth implementation a dest control-flow stack entry
is just the target address and an orig entry is just the address to be
patched. Our locals implementation adds a wordlist to every orig or dest
item. It is the list of locals visible (or assumed visible) at the point
described by the entry. Our implementation also adds a tag to identify
the kind of entry, in particular to differentiate between live and dead
(reachable and unreachable) orig entries.

A few unusual operations have to be performed on locals wordlists:

doc-common-list
doc-sub-list?
doc-list-size

Several features of our locals wordlist implementation make these
operations easy to implement: The locals wordlists are organised as
linked lists; the tails of these lists are shared, if the lists
contain some of the same locals; and the address of a name is greater
than the address of the names behind it in the list.

Another important implementation detail is the variable
@code{dead-code}. It is used by @code{BEGIN} and @code{THEN} to
determine if they can be reached directly or only through the branch
that they resolve.
@code{dead-code} is set by @code{UNREACHABLE},
@code{AHEAD}, @code{EXIT} etc., and cleared at the start of a colon
definition, by @code{BEGIN} and usually by @code{THEN}.

Counted loops are similar to other loops in most respects, but
@code{LEAVE} requires special attention: It performs basically the same
service as @code{AHEAD}, but it does not create a control-flow stack
entry. Therefore the information has to be stored elsewhere;
traditionally, the information was stored in the target fields of the
branches created by the @code{LEAVE}s, by organizing these fields into a
linked list. Unfortunately, this clever trick does not provide enough
space for storing our extended control flow information. Therefore, we
introduce another stack, the leave stack. It contains the control-flow
stack entries for all unresolved @code{LEAVE}s.

Local names are kept until the end of the colon definition, even if
they are no longer visible in any control-flow path. In a few cases
this may lead to increased space needs for the locals name area, but
usually less than reclaiming this space would cost in code size.


@node ANS Forth locals, , Gforth locals, Locals
@subsection ANS Forth locals

The ANS Forth locals wordset does not define a syntax for locals, but
words that make it possible to define various syntaxes. One of the
possible syntaxes is a subset of the syntax we used in the Gforth locals
wordset, i.e.:

@example
@{ local1 local2 ... -- comment @}
@end example
or
@example
@{ local1 local2 ... @}
@end example

The order of the locals corresponds to the order in a stack comment. The
restrictions are:

@itemize @bullet
@item
Locals can only be cell-sized values (no type specifiers are allowed).
@item
Locals can be defined only outside control structures.
@item
Locals can interfere with explicit usage of the return stack. For the
exact (and long) rules, see the standard. If you don't use return stack
accessing words in a definition using locals, you will be all right. The
purpose of this rule is to make locals implementation on the return
stack easier.
@item
The whole definition must be in one line.
@end itemize

Locals defined in this way behave like @code{VALUE}s
(@pxref{Values}). I.e., they are initialized from the stack. Using their
name produces their value. Their value can be changed using @code{TO}.

Since this syntax is supported by Gforth directly, you need not do
anything to use it. If you want to port a program using this syntax to
another ANS Forth system, use @file{anslocal.fs} to implement the syntax
on the other system.

Note that a syntax shown in the standard, section A.13 looks
similar, but is quite different in having the order of locals
reversed. Beware!

The ANS Forth locals wordset itself consists of the following word

doc-(local)

The ANS Forth locals extension wordset defines a syntax, but it is so
awful that we strongly recommend not using it. We have implemented this
syntax to make porting to Gforth easy, but do not document it here. The
problem with this syntax is that the locals are defined in an order
reversed with respect to the standard stack comment notation, making
programs harder to read, and easier to misread and miswrite. The only
merit of this syntax is that it is easy to implement using the ANS Forth
locals wordset.
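
Returning to the @code{@{ ... @}} subset syntax described earlier in this
section, here is a minimal sketch of a definition that stays within the
listed restrictions (the word name @code{sum-squares} is hypothetical and
chosen only for illustration): the locals are plain cells, they are
defined outside any control structure, the definition does not touch the
return stack, and the locals definition sits on a single line.
@example
: sum-squares @{ a b -- n @} a a * b b * + ;
@end example
Under these assumptions, @code{3 4 sum-squares .} should print @code{25}.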

@node Defining Words, Wordlists, Locals, Words
@section Defining Words

@menu
* Values::
@end menu

@node Values, , Defining Words, Defining Words
@subsection Values

@node Wordlists, Files, Defining Words, Words
@section Wordlists

@node Files, Blocks, Wordlists, Words
@section Files

@node Blocks, Other I/O, Files, Words
@section Blocks

@node Other I/O, Programming Tools, Blocks, Words
@section Other I/O

@node Programming Tools, Assembler and Code words, Other I/O, Words
@section Programming Tools

@menu
* Debugging::                   Simple and quick.
* Assertions::                  Making your programs self-checking.
@end menu

@node Debugging, Assertions, Programming Tools, Programming Tools
@subsection Debugging

The simple debugging aids provided in @file{debugging.fs}
are meant to support a different style of debugging than the
tracing/stepping debuggers used in languages with long turn-around
times.

A much better (faster) way in fast-compiling languages is to add
printing code at well-selected places, let the program run, look at
the output, see where things went wrong, add more printing code, etc.,
until the bug is found.

The word @code{~~} is easy to insert. It just prints debugging
information (by default the source location and the stack contents). It
is also easy to remove (@kbd{C-x ~} in the Emacs Forth mode to
query-replace them with nothing). The deferred words
@code{printdebugdata} and @code{printdebugline} control the output of
@code{~~}.
The default source location output format works well with
Emacs' compilation mode, so you can step through the program at the
source level using @kbd{C-x `} (the advantage over a stepping debugger
is that you can step in any direction and you know where the crash has
happened or where the strange data has occurred).

Note that the default actions clobber the contents of the pictured
numeric output string, so you should not use @code{~~}, e.g., between
@code{<#} and @code{#>}.

doc-~~
doc-printdebugdata
doc-printdebugline

@node Assertions, , Debugging, Programming Tools
@subsection Assertions

It is a good idea to make your programs self-checking, in particular, if
you use an assumption (e.g., that a certain field of a data structure is
never zero) that may become wrong during maintenance. Gforth supports
assertions for this purpose. They are used like this:

@example
assert( @var{flag} )
@end example

The code between @code{assert(} and @code{)} should compute a flag that
is true if everything is all right and false otherwise. It should
not change anything else on the stack. The overall stack effect of the
assertion is @code{( -- )}. E.g.

@example
assert( 1 1 + 2 = ) \ what we learn in school
assert( dup 0<> ) \ assert that the top of stack is not zero
assert( false ) \ this code should not be reached
@end example

The need for assertions is different at different times. During
debugging, we want more checking, in production we sometimes care more
for speed. Therefore, assertions can be turned off, i.e., the assertion
becomes a comment. Depending on the importance of an assertion and the
time it takes to check it, you may want to turn off some assertions and
keep others turned on.
Gforth provides several levels of assertions for
this purpose:

doc-assert0(
doc-assert1(
doc-assert2(
doc-assert3(
doc-assert(
doc-)

@code{Assert(} is the same as @code{assert1(}. The variable
@code{assert-level} specifies the highest assertions that are turned
on. I.e., at the default @code{assert-level} of one, @code{assert0(} and
@code{assert1(} assertions perform checking, while @code{assert2(} and
@code{assert3(} assertions are treated as comments.

Note that the @code{assert-level} is evaluated at compile-time, not at
run-time. I.e., you cannot turn assertions on or off at run-time, you
have to set the @code{assert-level} appropriately before compiling a
piece of code. You can compile several pieces of code at several
@code{assert-level}s (e.g., a trusted library at level 1 and newly
written code at level 3).

doc-assert-level

If an assertion fails, a message compatible with Emacs' compilation mode
is produced and the execution is aborted (currently with @code{ABORT"}.
If there is interest, we will introduce a special throw code. But if you
intend to @code{catch} a specific condition, using @code{throw} is
probably more appropriate than an assertion).

@node Assembler and Code words, Threading Words, Programming Tools, Words
@section Assembler and Code words

Gforth provides some words for defining primitives (words written in
machine code), and for defining the machine-code equivalent of
@code{DOES>}-based defining words. However, the machine-independent
nature of Gforth poses a few problems: First of all, Gforth runs on
several architectures, so it can provide no standard assembler. What's
worse is that the register allocation not only depends on the processor,
but also on the gcc version and options used.

The words Gforth offers encapsulate some system dependences (e.g., the
header structure), so a system-independent assembler may be used in
Gforth. If you do not have an assembler, you can compile machine code
directly with @code{,} and @code{c,}.

doc-assembler
doc-code
doc-end-code
doc-;code
doc-flush-icache

If @code{flush-icache} does not work correctly, @code{code} words
etc. will not work (reliably), either.

These words are rarely used. Therefore they reside in @code{code.fs},
which is usually not loaded (except @code{flush-icache}, which is always
present). You can load it with @code{require code.fs}.

Another option for implementing normal and defining words efficiently
is: adding the wanted functionality to the source of Gforth. For normal
words you just have to edit @file{primitives}, defining words (for fast
defined words) probably require changes in @file{engine.c},
@file{kernal.fs}, @file{prims2x.fs}, and possibly @file{cross.fs}.


@node Threading Words, , Assembler and Code words, Words
@section Threading Words

These words provide access to code addresses and other threading stuff
in Gforth (and, possibly, other interpretive Forths). They more or less
abstract away the differences between direct and indirect threading
(and, for direct threading, the machine dependences). However, at
present this wordset is still incomplete. It is also pretty low-level;
some day it will hopefully be made unnecessary by an internals words set
that abstracts implementation details away completely.

doc->code-address
doc->does-code
doc-code-address!
doc-does-code!
doc-does-handler!
1711: doc-/does-handler 1712: 1713: The code addresses produced by various defining words are returned by 1714: the following words: 1715: 1716: doc-docol: 1717: doc-docon: 1718: doc-dovar: 1719: doc-douser: 1720: doc-dodefer: 1721: doc-dofield: 1722: 1723: Currently there is no installation-independent way for recognizing words 1724: defined by a @code{CREATE}...@code{DOES>} word; however, once you know 1725: that a word is defined by a @code{CREATE}...@code{DOES>} word, you can 1726: use @code{>DOES-CODE}. 1727: 1728: @node ANS conformance, Model, Words, Top 1729: @chapter ANS conformance 1730: 1731: To the best of our knowledge, Gforth is an 1732: 1733: ANS Forth System 1734: @itemize 1735: @item providing the Core Extensions word set 1736: @item providing the Block word set 1737: @item providing the Block Extensions word set 1738: @item providing the Double-Number word set 1739: @item providing the Double-Number Extensions word set 1740: @item providing the Exception word set 1741: @item providing the Exception Extensions word set 1742: @item providing the Facility word set 1743: @item providing @code{MS} and @code{TIME&DATE} from the Facility Extensions word set 1744: @item providing the File Access word set 1745: @item providing the File Access Extensions word set 1746: @item providing the Floating-Point word set 1747: @item providing the Floating-Point Extensions word set 1748: @item providing the Locals word set 1749: @item providing the Locals Extensions word set 1750: @item providing the Memory-Allocation word set 1751: @item providing the Memory-Allocation Extensions word set (that one's easy) 1752: @item providing the Programming-Tools word set 1753: @item providing @code{;code}, @code{AHEAD}, @code{ASSEMBLER}, @code{BYE}, @code{CODE}, @code{CS-PICK}, @code{CS-ROLL}, @code{STATE}, @code{[ELSE]}, @code{[IF]}, @code{[THEN]} from the Programming-Tools Extensions word set 1754: @item providing the Search-Order word set 1755: @item providing the Search-Order
Extensions word set 1756: @item providing the String word set 1757: @item providing the String Extensions word set (another easy one) 1758: @end itemize 1759: 1760: In addition, ANS Forth systems are required to document certain 1761: implementation choices. This chapter tries to meet these 1762: requirements. In many cases it gives a way to ask the system for the 1763: information instead of providing the information directly, in 1764: particular, if the information depends on the processor, the operating 1765: system or the installation options chosen, or if they are likely to 1766: change during the maintenance of Gforth. 1767: 1768: @comment The framework for the rest has been taken from pfe. 1769: 1770: @menu 1771: * The Core Words:: 1772: * The optional Block word set:: 1773: * The optional Double Number word set:: 1774: * The optional Exception word set:: 1775: * The optional Facility word set:: 1776: * The optional File-Access word set:: 1777: * The optional Floating-Point word set:: 1778: * The optional Locals word set:: 1779: * The optional Memory-Allocation word set:: 1780: * The optional Programming-Tools word set:: 1781: * The optional Search-Order word set:: 1782: @end menu 1783: 1784: 1785: @c ===================================================================== 1786: @node The Core Words, The optional Block word set, ANS conformance, ANS conformance 1787: @comment node-name, next, previous, up 1788: @section The Core Words 1789: @c ===================================================================== 1790: 1791: @menu 1792: * core-idef:: Implementation Defined Options 1793: * core-ambcond:: Ambiguous Conditions 1794: * core-other:: Other System Documentation 1795: @end menu 1796: 1797: @c --------------------------------------------------------------------- 1798: @node core-idef, core-ambcond, The Core Words, The Core Words 1799: @subsection Implementation Defined Options 1800: @c --------------------------------------------------------------------- 
1801: 1802: @table @i 1803: 1804: @item (Cell) aligned addresses: 1805: processor-dependent. Gforth's alignment words perform natural alignment 1806: (e.g., an address aligned for a datum of size 8 is divisible by 1807: 8). Unaligned accesses usually result in a @code{-23 THROW}. 1808: 1809: @item @code{EMIT} and non-graphic characters: 1810: The character is output using the C library function (actually, macro) 1811: @code{putchar}. 1812: 1813: @item character editing of @code{ACCEPT} and @code{EXPECT}: 1814: This is modeled on the GNU readline library (@pxref{Readline 1815: Interaction, , Command Line Editing, readline, The GNU Readline 1816: Library}) with Emacs-like key bindings. @kbd{Tab} deviates a little by 1817: producing a full word completion every time you type it (instead of 1818: producing the common prefix of all completions). 1819: 1820: @item character set: 1821: The character set of your computer and display device. Gforth is 1822: 8-bit-clean (but some other component in your system may cause trouble). 1823: 1824: @item Character-aligned address requirements: 1825: installation-dependent. Currently a character is represented by a C 1826: @code{unsigned char}; in the future we might switch to @code{wchar_t} 1827: (Comments on that requested). 1828: 1829: @item character-set extensions and matching of names: 1830: Any character except the ASCII NUL character can be used in a 1831: name. Matching is case-insensitive. The matching is performed using the 1832: C function @code{strncasecmp}, whose behaviour is probably influenced by 1833: the locale. E.g., the @code{C} locale does not know about accents and 1834: umlauts, so they are matched case-sensitively in that locale. For 1835: portability reasons it is best to write programs such that they work in 1836: the @code{C} locale.
Then one can use libraries written by a Polish 1837: programmer (who might use words containing ISO Latin-2 encoded 1838: characters) and by a French programmer (ISO Latin-1) in the same program 1839: (of course, @code{WORDS} will produce funny results for some of the 1840: words (which ones, depends on the font you are using)). Also, the locale 1841: you prefer may not be available in other operating systems. Hopefully, 1842: Unicode will solve these problems one day. 1843: 1844: @item conditions under which control characters match a space delimiter: 1845: If @code{WORD} is called with the space character as a delimiter, all 1846: white-space characters (as identified by the C macro @code{isspace()}) 1847: are delimiters. @code{PARSE}, on the other hand, treats space like other 1848: delimiters. @code{PARSE-WORD} treats space like @code{WORD}, but behaves 1849: like @code{PARSE} otherwise. @code{(NAME)}, which is used by the outer 1850: interpreter (aka text interpreter) by default, treats all white-space 1851: characters as delimiters. 1852: 1853: @item format of the control flow stack: 1854: The data stack is used as control flow stack. The size of a control flow 1855: stack item in cells is given by the constant @code{cs-item-size}. At the 1856: time of this writing, an item consists of a (pointer to a) locals list 1857: (third), an address in the code (second), and a tag for identifying the 1858: item (TOS). The following tags are used: @code{defstart}, 1859: @code{live-orig}, @code{dead-orig}, @code{dest}, @code{do-dest}, 1860: @code{scopestart}. 1861: 1862: @item conversion of digits > 35 1863: The characters @code{[\]^_'} are the digits with the decimal value 1864: 36@minus{}41. There is no way to input many of the larger digits. 1865: 1866: @item display after input terminates in @code{ACCEPT} and @code{EXPECT}: 1867: The cursor is moved to the end of the entered string. If the input is 1868: terminated using the @kbd{Return} key, a space is typed. 
1869: 1870: @item exception abort sequence of @code{ABORT"}: 1871: The error string is stored into the variable @code{"error} and a 1872: @code{-2 throw} is performed. 1873: 1874: @item input line terminator: 1875: For interactive input, @kbd{C-m} and @kbd{C-j} terminate lines. One of 1876: these characters is typically produced when you type the @kbd{Enter} or 1877: @kbd{Return} key. 1878: 1879: @item maximum size of a counted string: 1880: @code{s" /counted-string" environment? drop .}. Currently 255 characters 1881: on all ports, but this may change. 1882: 1883: @item maximum size of a parsed string: 1884: Given by the constant @code{/line}. Currently 255 characters. 1885: 1886: @item maximum size of a definition name, in characters: 1887: 31 1888: 1889: @item maximum string length for @code{ENVIRONMENT?}, in characters: 1890: 31 1891: 1892: @item method of selecting the user input device: 1893: The user input device is the standard input. There is currently no way to 1894: change it from within Gforth. However, the input can typically be 1895: redirected in the command line that starts Gforth. 1896: 1897: @item method of selecting the user output device: 1898: The user output device is the standard output. It cannot be redirected 1899: from within Gforth, but typically from the command line that starts 1900: Gforth. Gforth uses buffered output, so output on a terminal does not 1901: become visible before the next newline or buffer overflow. Output on 1902: non-terminals is invisible until the buffer overflows. 1903: 1904: @item methods of dictionary compilation: 1905: What are we expected to document here? 1906: 1907: @item number of bits in one address unit: 1908: @code{s" address-units-bits" environment? drop .}. 8 in all current 1909: ports. 1910: 1911: @item number representation and arithmetic: 1912: Processor-dependent. Binary two's complement on all current ports. 1913: 1914: @item ranges for integer types: 1915: Installation-dependent. 
Make environmental queries for @code{MAX-N}, 1916: @code{MAX-U}, @code{MAX-D} and @code{MAX-UD}. The lower bound for 1917: unsigned (and positive) types is 0. The lower bound for signed types on 1918: two's complement and one's complement machines can be computed 1919: by adding 1 to the upper bound. 1920: 1921: @item read-only data space regions: 1922: The whole Forth data space is writable. 1923: 1924: @item size of buffer at @code{WORD}: 1925: @code{PAD HERE - .}. 104 characters on 32-bit machines. The buffer is 1926: shared with the pictured numeric output string. If overwriting 1927: @code{PAD} is acceptable, it is as large as the remaining dictionary 1928: space, although only as much can be sensibly used as fits in a counted 1929: string. 1930: 1931: @item size of one cell in address units: 1932: @code{1 cells .}. 1933: 1934: @item size of one character in address units: 1935: @code{1 chars .}. 1 on all current ports. 1936: 1937: @item size of the keyboard terminal buffer: 1938: Varies. You can determine the size at a specific time using @code{lp@@ 1939: tib - .}. It is shared with the locals stack and TIBs of files that 1940: include the current file. You can change the amount of space for TIBs 1941: and locals stack at Gforth startup with the command line option 1942: @code{-l}. 1943: 1944: @item size of the pictured numeric output buffer: 1945: @code{PAD HERE - .}. 104 characters on 32-bit machines. The buffer is 1946: shared with @code{WORD}. 1947: 1948: @item size of the scratch area returned by @code{PAD}: 1949: The remainder of dictionary space. You can even use the unused part of 1950: the data stack space. The current size can be computed with @code{sp@@ 1951: pad - .}. 1952: 1953: @item system case-sensitivity characteristics: 1954: Dictionary searches are case insensitive. However, as explained above 1955: under @i{character-set extensions}, the matching for non-ASCII 1956: characters is determined by the locale you are using.
In the default 1957: @code{C} locale all non-ASCII characters are matched case-sensitively. 1958: 1959: @item system prompt: 1960: @code{ ok} in interpret state, @code{ compiled} in compile state. 1961: 1962: @item division rounding: 1963: installation dependent. @code{s" floored" environment? drop .}. We leave 1964: the choice to gcc (what to use for @code{/}) and to you (whether to use 1965: @code{fm/mod}, @code{sm/rem} or simply @code{/}). 1966: 1967: @item values of @code{STATE} when true: 1968: -1. 1969: 1970: @item values returned after arithmetic overflow: 1971: On two's complement machines, arithmetic is performed modulo 1972: 2**bits-per-cell for single arithmetic and 4**bits-per-cell for double 1973: arithmetic (with appropriate mapping for signed types). Division by zero 1974: typically results in a @code{-55 throw} (floating point unidentified 1975: fault), although a @code{-10 throw} (divide by zero) would be more 1976: appropriate. 1977: 1978: @item whether the current definition can be found after @t{DOES>}: 1979: No. 1980: 1981: @end table 1982: 1983: @c --------------------------------------------------------------------- 1984: @node core-ambcond, core-other, core-idef, The Core Words 1985: @subsection Ambiguous conditions 1986: @c --------------------------------------------------------------------- 1987: 1988: @table @i 1989: 1990: @item a name is neither a word nor a number: 1991: @code{-13 throw} (Undefined word) 1992: 1993: @item a definition name exceeds the maximum length allowed: 1994: @code{-19 throw} (Word name too long) 1995: 1996: @item addressing a region not inside the various data spaces of the forth system: 1997: The stacks, code space and name space are accessible. Machine code space is 1998: typically readable. Accessing other addresses gives results dependent on 1999: the operating system. On decent systems: @code{-9 throw} (Invalid memory 2000: address).
2001: 2002: @item argument type incompatible with parameter: 2003: This is usually not caught. Some words perform checks, e.g., the control 2004: flow words, and issue an @code{ABORT"} or @code{-12 THROW} (Argument type 2005: mismatch). 2006: 2007: @item attempting to obtain the execution token of a word with undefined execution semantics: 2008: You get an execution token representing the compilation semantics 2009: instead. 2010: 2011: @item dividing by zero: 2012: typically results in a @code{-55 throw} (floating point unidentified 2013: fault), although a @code{-10 throw} (divide by zero) would be more 2014: appropriate. 2015: 2016: @item insufficient data stack or return stack space: 2017: Not checked. This typically results in mysterious illegal memory 2018: accesses, producing @code{-9 throw} (Invalid memory address) or 2019: @code{-23 throw} (Address alignment exception). 2020: 2021: @item insufficient space for loop control parameters: 2022: like other return stack overflows. 2023: 2024: @item insufficient space in the dictionary: 2025: Not checked. Similar results as stack overflows. However, typically the 2026: error appears at a different place when one inserts or removes code. 2027: 2028: @item interpreting a word with undefined interpretation semantics: 2029: For some words, we defined interpretation semantics. For the others: 2030: @code{-14 throw} (Interpreting a compile-only word). Note that this is 2031: checked only by the outer (aka text) interpreter; if the word is 2032: @code{execute}d in some other way, it will typically perform its 2033: compilation semantics even in interpret state. (We could change @code{'} 2034: and relatives not to give the xt of such words, but we think that would 2035: be too restrictive). 2036: 2037: @item modifying the contents of the input buffer or a string literal: 2038: These are located in writable memory and can be modified. 2039: 2040: @item overflow of the pictured numeric output string: 2041: Not checked.
2042: 2043: @item parsed string overflow: 2044: @code{PARSE} cannot overflow. @code{WORD} does not check for overflow. 2045: 2046: @item producing a result out of range: 2047: On two's complement machines, arithmetic is performed modulo 2048: 2**bits-per-cell for single arithmetic and 4**bits-per-cell for double 2049: arithmetic (with appropriate mapping for signed types). Division by zero 2050: typically results in a @code{-55 throw} (floating point unidentified 2051: fault), although a @code{-10 throw} (divide by zero) would be more 2052: appropriate. @code{convert} and @code{>number} currently overflow 2053: silently. 2054: 2055: @item reading from an empty data or return stack: 2056: The data stack is checked by the outer (aka text) interpreter after 2057: every word executed. If it has underflowed, a @code{-4 throw} (Stack 2058: underflow) is performed. Apart from that, the stacks are not checked and 2059: underflows can result in similar behaviour as overflows (of adjacent 2060: stacks). 2061: 2062: @item unexpected end of the input buffer, resulting in an attempt to use a zero-length string as a name: 2063: @code{Create} and its descendants perform a @code{-16 throw} (Attempt to 2064: use zero-length string as a name). Words like @code{'} probably will not 2065: find what they search. Note that it is possible to create zero-length 2066: names with @code{nextname} (should it not?). 2067: 2068: @item @code{>IN} greater than input buffer: 2069: The next invocation of a parsing word returns a string with length 0. 2070: 2071: @item @code{RECURSE} appears after @code{DOES>}: 2072: Compiles a recursive call to the defining word, not to the defined word. 2073: 2074: @item argument input source different than current input source for @code{RESTORE-INPUT}: 2075: !!???If the argument input source is a valid input source then it gets 2076: restored. Otherwise causes @code{-12 THROW} which unless caught issues 2077: the message "argument type mismatch" and aborts.
2078: 2079: @item data space containing definitions gets de-allocated: 2080: Deallocation with @code{allot} is not checked. This typically results in 2081: memory access faults or execution of illegal instructions. 2082: 2083: @item data space read/write with incorrect alignment: 2084: Processor-dependent. Typically results in a @code{-23 throw} (Address 2085: alignment exception). Under Linux on a 486 or later processor with 2086: alignment turned on, incorrect alignment results in a @code{-9 throw} 2087: (Invalid memory address). There are reportedly some processors with 2088: alignment restrictions that do not report them. 2089: 2090: @item data space pointer not properly aligned, @code{,}, @code{C,}: 2091: Like other alignment errors. 2092: 2093: @item less than u+2 stack items (@code{PICK} and @code{ROLL}): 2094: Not checked. May cause an illegal memory access. 2095: 2096: @item loop control parameters not available: 2097: Not checked. The counted loop words simply assume that the items on top of the return 2098: stack are loop control parameters and behave accordingly. 2099: 2100: @item most recent definition does not have a name (@code{IMMEDIATE}): 2101: @code{abort" last word was headerless"}. 2102: 2103: @item name not defined by @code{VALUE} used by @code{TO}: 2104: @code{-32 throw} (Invalid name argument) 2105: 2106: @item name not found (@code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]}): 2107: @code{-13 throw} (Undefined word) 2108: 2109: @item parameters are not of the same type (@code{DO}, @code{?DO}, @code{WITHIN}): 2110: Gforth behaves as if they were of the same type. I.e., you can predict 2111: the behaviour by interpreting all parameters as, e.g., signed. 2112: 2113: @item @code{POSTPONE} or @code{[COMPILE]} applied to @code{TO}: 2114: Assume @code{: X POSTPONE TO ; IMMEDIATE}. @code{X} is equivalent to 2115: @code{TO}. 2116: 2117: @item String longer than a counted string returned by @code{WORD}: 2118: Not checked.
The string will be ok, but the count will, of course, 2119: contain only the least significant bits of the length. 2120: 2121: @item u greater than or equal to the number of bits in a cell (@code{LSHIFT}, @code{RSHIFT}): 2122: Processor-dependent. Typical behaviours are returning 0 and using only 2123: the low bits of the shift count. 2124: 2125: @item word not defined via @code{CREATE}: 2126: @code{>BODY} produces the PFA of the word no matter how it was defined. 2127: 2128: @code{DOES>} changes the execution semantics of the last defined word no 2129: matter how it was defined. E.g., @code{CONSTANT DOES>} is equivalent to 2130: @code{CREATE , DOES>}. 2131: 2132: @item words improperly used outside @code{<#} and @code{#>}: 2133: Not checked. As usual, you can expect memory faults. 2134: 2135: @end table 2136: 2137: 2138: @c --------------------------------------------------------------------- 2139: @node core-other, , core-ambcond, The Core Words 2140: @subsection Other system documentation 2141: @c --------------------------------------------------------------------- 2142: 2143: @table @i 2144: 2145: @item nonstandard words using @code{PAD}: 2146: None. 2147: 2148: @item operator's terminal facilities available: 2149: !!?? 2150: 2151: @item program data space available: 2152: @code{sp@@ here - .} gives the space remaining for dictionary and data 2153: stack together. 2154: 2155: @item return stack space available: 2156: !!?? 2157: 2158: @item stack space available: 2159: @code{sp@@ here - .} gives the space remaining for dictionary and data 2160: stack together. 2161: 2162: @item system dictionary space required, in address units: 2163: Type @code{here forthstart - .} after startup. At the time of this 2164: writing, this gives 70108 (bytes) on a 32-bit system.
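
The two space queries in this table can be combined into one reporting
word. The following is only a sketch: @code{.space} is a hypothetical
name, not a Gforth word, and the numbers printed depend on the
installation.

@example
\ .space is a hypothetical helper, not part of Gforth.
: .space ( -- )
  ." dictionary in use: " here forthstart - .
  ." free for dictionary and data stack: " sp@@ here - . ;
.space
@end example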
2165: @end table 2166: 2167: 2168: @c ===================================================================== 2169: @node The optional Block word set, The optional Double Number word set, The Core Words, ANS conformance 2170: @section The optional Block word set 2171: @c ===================================================================== 2172: 2173: @menu 2174: * block-idef:: Implementation Defined Options 2175: * block-ambcond:: Ambiguous Conditions 2176: * block-other:: Other System Documentation 2177: @end menu 2178: 2179: 2180: @c --------------------------------------------------------------------- 2181: @node block-idef, block-ambcond, The optional Block word set, The optional Block word set 2182: @subsection Implementation Defined Options 2183: @c --------------------------------------------------------------------- 2184: 2185: @table @i 2186: 2187: @item the format for display by @code{LIST}: 2188: First the screen number is displayed, then 16 lines of 64 characters, 2189: each line preceded by the line number. 2190: 2191: @item the length of a line affected by @code{\}: 2192: 64 characters. 2193: @end table 2194: 2195: 2196: @c --------------------------------------------------------------------- 2197: @node block-ambcond, block-other, block-idef, The optional Block word set 2198: @subsection Ambiguous conditions 2199: @c --------------------------------------------------------------------- 2200: 2201: @table @i 2202: 2203: @item correct block read was not possible: 2204: Typically results in a @code{throw} of some OS-derived value (between 2205: -512 and -2048). If the blocks file was just not long enough, blanks are 2206: supplied for the missing portion. 2207: 2208: @item I/O exception in block transfer: 2209: Typically results in a @code{throw} of some OS-derived value (between 2210: -512 and -2048). 
2211: 2212: @item invalid block number: 2213: @code{-35 throw} (Invalid block number) 2214: 2215: @item a program directly alters the contents of @code{BLK}: 2216: The input stream is switched to that other block, at the same 2217: position. If the storing to @code{BLK} happens when interpreting 2218: non-block input, the system will get quite confused when the block ends. 2219: 2220: @item no current block buffer for @code{UPDATE}: 2221: @code{UPDATE} has no effect. 2222: 2223: @end table 2224: 2225: 2226: @c --------------------------------------------------------------------- 2227: @node block-other, , block-ambcond, The optional Block word set 2228: @subsection Other system documentation 2229: @c --------------------------------------------------------------------- 2230: 2231: @table @i 2232: 2233: @item any restrictions a multiprogramming system places on the use of buffer addresses: 2234: No restrictions (yet). 2235: 2236: @item the number of blocks available for source and data: 2237: depends on your disk space. 2238: 2239: @end table 2240: 2241: 2242: @c ===================================================================== 2243: @node The optional Double Number word set, The optional Exception word set, The optional Block word set, ANS conformance 2244: @section The optional Double Number word set 2245: @c ===================================================================== 2246: 2247: @menu 2248: * double-ambcond:: Ambiguous Conditions 2249: @end menu 2250: 2251: 2252: @c --------------------------------------------------------------------- 2253: @node double-ambcond, , The optional Double Number word set, The optional Double Number word set 2254: @subsection Ambiguous conditions 2255: @c --------------------------------------------------------------------- 2256: 2257: @table @i 2258: 2259: @item @var{d} outside of range of @var{n} in @code{D>S}: 2260: The least significant cell of @var{d} is produced. 
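
As a small sketch of this behaviour (assuming the usual ANS Forth
representation of a double, with the most significant cell on top of the
stack), the following converts a double that does not fit into a single
cell:

@example
1 1      \ a double: low cell 1, high cell 1, i.e. 1 + 2**bits-per-cell
d>s .    \ the high cell is discarded, so this should print 1
@end example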
2261: 2262: @end table 2263: 2264: 2265: @c ===================================================================== 2266: @node The optional Exception word set, The optional Facility word set, The optional Double Number word set, ANS conformance 2267: @section The optional Exception word set 2268: @c ===================================================================== 2269: 2270: @menu 2271: * exception-idef:: Implementation Defined Options 2272: @end menu 2273: 2274: 2275: @c --------------------------------------------------------------------- 2276: @node exception-idef, , The optional Exception word set, The optional Exception word set 2277: @subsection Implementation Defined Options 2278: @c --------------------------------------------------------------------- 2279: 2280: @table @i 2281: @item @code{THROW}-codes used in the system: 2282: The codes -256@minus{}-511 are used for reporting signals (see 2283: @file{errore.fs}). The codes -512@minus{}-2047 are used for OS errors 2284: (for file and memory allocation operations). The mapping from OS error 2285: numbers to throw code is -512@minus{}@var{errno}. One side effect of 2286: this mapping is that undefined OS errors produce a message with a 2287: strange number; e.g., @code{-1000 THROW} results in @code{Unknown error 2288: 488} on my system. 
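
The mapping between OS error numbers and throw codes can be written down
as a sketch. The names @code{errno>throw} and @code{throw>errno} are
hypothetical helpers, not Gforth words; only the formula is taken from
the description above.

@example
\ Hypothetical helper names; the mapping is -512 - errno.
: errno>throw ( errno -- n )  -512 swap - ;
: throw>errno ( n -- errno )  -512 swap - ;  \ the mapping is its own inverse
488 errno>throw .    \ -1000
-1000 throw>errno .  \ 488
@end example

This reproduces the case mentioned above, where @code{-1000 THROW} is
reported as unknown error 488.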
2289: @end table 2290: 2291: @c ===================================================================== 2292: @node The optional Facility word set, The optional File-Access word set, The optional Exception word set, ANS conformance 2293: @section The optional Facility word set 2294: @c ===================================================================== 2295: 2296: @menu 2297: * facility-idef:: Implementation Defined Options 2298: * facility-ambcond:: Ambiguous Conditions 2299: @end menu 2300: 2301: 2302: @c --------------------------------------------------------------------- 2303: @node facility-idef, facility-ambcond, The optional Facility word set, The optional Facility word set 2304: @subsection Implementation Defined Options 2305: @c --------------------------------------------------------------------- 2306: 2307: @table @i 2308: 2309: @item encoding of keyboard events (@code{EKEY}): 2310: Not yet implemented. 2311: 2312: @item duration of a system clock tick: 2313: System dependent. With respect to @code{MS}, the time is specified in 2314: microseconds. How well the OS and the hardware implement this is 2315: another question. 2316: 2317: @item repeatability to be expected from the execution of @code{MS}: 2318: System dependent. On Unix, a lot depends on load. If the system is 2319: lightly loaded, and the delay is short enough that Gforth does not get 2320: swapped out, the performance should be acceptable. Under MS-DOS and 2321: other single-tasking systems, it should be good. 2322: 2323: @end table 2324: 2325: 2326: @c --------------------------------------------------------------------- 2327: @node facility-ambcond, , facility-idef, The optional Facility word set 2328: @subsection Ambiguous conditions 2329: @c --------------------------------------------------------------------- 2330: 2331: @table @i 2332: 2333: @item @code{AT-XY} can't be performed on user output device: 2334: Largely terminal dependent. No range checks are done on the arguments.
2335: No errors are reported. You may see some garbage appearing, you may see 2336: simply nothing happen. 2337: 2338: @end table 2339: 2340: 2341: @c ===================================================================== 2342: @node The optional File-Access word set, The optional Floating-Point word set, The optional Facility word set, ANS conformance 2343: @section The optional File-Access word set 2344: @c ===================================================================== 2345: 2346: @menu 2347: * file-idef:: Implementation Defined Options 2348: * file-ambcond:: Ambiguous Conditions 2349: @end menu 2350: 2351: 2352: @c --------------------------------------------------------------------- 2353: @node file-idef, file-ambcond, The optional File-Access word set, The optional File-Access word set 2354: @subsection Implementation Defined Options 2355: @c --------------------------------------------------------------------- 2356: 2357: @table @i 2358: 2359: @item File access methods used: 2360: @code{R/O}, @code{R/W} and @code{BIN} work as you would 2361: expect. @code{W/O} translates into the C file opening mode @code{w} (or 2362: @code{wb}): The file is cleared, if it exists, and created, if it does 2363: not (both with @code{open-file} and @code{create-file}). Under Unix 2364: @code{create-file} creates a file with 666 permissions modified by your 2365: umask. 2366: 2367: @item file exceptions: 2368: The file words do not raise exceptions (except, perhaps, memory access 2369: faults when you pass illegal addresses or file-ids). 2370: 2371: @item file line terminator: 2372: System-dependent. Gforth uses C's newline character as line 2373: terminator. What the actual character code(s) of this are is 2374: system-dependent. 2375: 2376: @item file name format 2377: System dependent. Gforth just uses the file name format of your OS. 
2378: 2379: @item information returned by @code{FILE-STATUS}: 2380: @code{FILE-STATUS} returns the most powerful file access mode allowed 2381: for the file: Either @code{R/O}, @code{W/O} or @code{R/W}. If the file 2382: cannot be accessed, @code{R/O BIN} is returned. @code{BIN} is applicable 2383: along with the returned mode. 2384: 2385: @item input file state after an exception when including source: 2386: All files that are left via the exception are closed. 2387: 2388: @item @var{ior} values and meaning: 2389: The @var{ior}s returned by the file and memory allocation words are 2390: intended as throw codes. They typically are in the range 2391: -512@minus{}-2047 of OS errors. The mapping from OS error numbers to 2392: @var{ior}s is -512@minus{}@var{errno}. 2393: 2394: @item maximum depth of file input nesting: 2395: limited by the amount of return stack, locals/TIB stack, and the number 2396: of open files available. This should not give you troubles. 2397: 2398: @item maximum size of input line: 2399: @code{/line}. Currently 255. 2400: 2401: @item methods of mapping block ranges to files: 2402: Currently, the block words automatically access the file 2403: @file{blocks.fb} in the current working directory. More sophisticated 2404: methods could be implemented if there is demand (and a volunteer). 2405: 2406: @item number of string buffers provided by @code{S"}: 2407: 1 2408: 2409: @item size of string buffer used by @code{S"}: 2410: @code{/line}. Currently 255.
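
Since there is only one buffer, a string produced by an interpreted
@code{S"} should be copied if it has to survive the next @code{S"}. A
minimal sketch, in which @code{name-buf}, @code{name-len},
@code{save-name} and @code{saved-name} are hypothetical names and the
buffer size of 255 simply mirrors the current value of @code{/line}:

@example
create name-buf 255 chars allot   \ hypothetical private buffer
variable name-len

: save-name ( c-addr u -- )       \ copy at most 255 characters
  255 min dup name-len !  name-buf swap chars move ;
: saved-name ( -- c-addr u )  name-buf name-len @@ ;

s" first" save-name    \ copy before the S" buffer is reused
s" second" type        \ may overwrite the single S" buffer
saved-name type        \ the private copy is unaffected
@end example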
2411: 2412: @end table 2413: 2414: @c --------------------------------------------------------------------- 2415: @node file-ambcond, , file-idef, The optional File-Access word set 2416: @subsection Ambiguous conditions 2417: @c --------------------------------------------------------------------- 2418: 2419: @table @i 2420: 2421: @item attempting to position a file outside its boundaries: 2422: @code{REPOSITION-FILE} is performed as usual: Afterwards, 2423: @code{FILE-POSITION} returns the value given to @code{REPOSITION-FILE}. 2424: 2425: @item attempting to read from file positions not yet written: 2426: End-of-file, i.e., zero characters are read and no error is reported. 2427: 2428: @item @var{file-id} is invalid (@code{INCLUDE-FILE}): 2429: An appropriate exception may be thrown, but a memory fault or other 2430: problem is more probable. 2431: 2432: @item I/O exception reading or closing @var{file-id} (@code{include-file}, @code{included}): 2433: The @var{ior} produced by the operation that discovered the problem is 2434: thrown. 2435: 2436: @item named file cannot be opened (@code{included}): 2437: The @var{ior} produced by @code{open-file} is thrown. 2438: 2439: @item requesting an unmapped block number: 2440: There are no unmapped legal block numbers. On some operating systems, 2441: writing a block with a large number may overflow the file system and 2442: produce an error message as a consequence. 2443: 2444: @item using @code{source-id} when @code{blk} is non-zero: 2445: @code{source-id} performs its function. Typically it will give the id of 2446: the source which loaded the block. (Better ideas?)

@end table


@c =====================================================================
@node The optional Floating-Point word set, The optional Locals word set, The optional File-Access word set, ANS conformance
@section The optional Floating-Point word set
@c =====================================================================

@menu
* floating-idef::               Implementation Defined Options
* floating-ambcond::            Ambiguous Conditions
@end menu


@c ---------------------------------------------------------------------
@node floating-idef, floating-ambcond, The optional Floating-Point word set, The optional Floating-Point word set
@subsection Implementation Defined Options
@c ---------------------------------------------------------------------

@table @i

@item format and range of floating point numbers:
System-dependent; the @code{double} type of C.

@item results of @code{REPRESENT} when @var{float} is out of range:
System-dependent; @code{REPRESENT} is implemented using the C library
function @code{ecvt()} and inherits its behaviour in this respect.

@item rounding or truncation of floating-point numbers:
What's the question?!!

@item size of floating-point stack:
@code{s" FLOATING-STACK" environment? drop .}. Can be changed at startup
with the command-line option @code{-f}.

@item width of floating-point stack:
@code{1 floats}.

@end table


@c ---------------------------------------------------------------------
@node floating-ambcond,  , floating-idef, The optional Floating-Point word set
@subsection Ambiguous conditions
@c ---------------------------------------------------------------------

@table @i

@item @code{df@@} or @code{df!} used with an address that is not double-float aligned:
System-dependent. Typically results in an alignment fault like other
alignment violations.

@item @code{f@@} or @code{f!} used with an address that is not float aligned:
System-dependent. Typically results in an alignment fault like other
alignment violations.

@item Floating-point result out of range:
System-dependent. Can result in a @code{-55 THROW} (Floating-point
unidentified fault), or can produce a special value representing, e.g.,
Infinity.

@item @code{sf@@} or @code{sf!} used with an address that is not single-float aligned:
System-dependent. Typically results in an alignment fault like other
alignment violations.

@item BASE is not decimal (@code{REPRESENT}, @code{F.}, @code{FE.}, @code{FS.}):
The floating-point number is converted into decimal nonetheless.

@item Both arguments are equal to zero (@code{FATAN2}):
System-dependent. @code{FATAN2} is implemented using the C library
function @code{atan2()}.

@item Using @code{ftan} on an argument @var{r1} where cos(@var{r1}) is zero:
System-dependent. In any case, cos(@var{r1}) will typically not be
exactly zero because of small errors, so the tan will be a very large
(or very small) but finite number.

@item @var{d} cannot be represented precisely as a float in @code{D>F}:
The result is rounded to the nearest float.

@item dividing by zero:
@code{-55 throw} (Floating-point unidentified fault).

@item exponent too big for conversion (@code{DF!}, @code{DF@@}, @code{SF!}, @code{SF@@}):
System-dependent. On IEEE-FP based systems the number is converted into
an infinity.

@item @var{float}<1 (@code{facosh}):
@code{-55 throw} (Floating-point unidentified fault).

@item @var{float}=<-1 (@code{flnp1}):
@code{-55 throw} (Floating-point unidentified fault). On IEEE-FP systems
negative infinity is typically produced for @var{float}=-1.

@item @var{float}=<0 (@code{fln}, @code{flog}):
@code{-55 throw} (Floating-point unidentified fault). On IEEE-FP systems
negative infinity is typically produced for @var{float}=0.

@item @var{float}<0 (@code{fasinh}, @code{fsqrt}):
@code{-55 throw} (Floating-point unidentified fault). @code{fasinh}
produces values for these inputs on my Linux box (Bug in the C library?)

@item |@var{float}|>1 (@code{facos}, @code{fasin}, @code{fatanh}):
@code{-55 throw} (Floating-point unidentified fault).

@item integer part of float cannot be represented by @var{d} in @code{f>d}:
@code{-55 throw} (Floating-point unidentified fault).

@item string larger than pictured numeric output area (@code{f.}, @code{fe.}, @code{fs.}):
This does not happen.
@end table



@c =====================================================================
@node The optional Locals word set, The optional Memory-Allocation word set, The optional Floating-Point word set, ANS conformance
@section The optional Locals word set
@c =====================================================================

@menu
* locals-idef::                 Implementation Defined Options
* locals-ambcond::              Ambiguous Conditions
@end menu


@c ---------------------------------------------------------------------
@node locals-idef, locals-ambcond, The optional Locals word set, The optional Locals word set
@subsection Implementation Defined Options
@c ---------------------------------------------------------------------

@table @i

@item maximum number of locals in a definition:
@code{s" #locals" environment? drop .}. Currently 15. This is a lower
bound, e.g., on a 32-bit machine there can be 41 locals of up to 8
characters. The number of locals in a definition is bounded by the size
of locals-buffer, which contains the names of the locals.

@end table


@c ---------------------------------------------------------------------
@node locals-ambcond,  , locals-idef, The optional Locals word set
@subsection Ambiguous conditions
@c ---------------------------------------------------------------------

@table @i

@item executing a named local in interpretation state:
@code{-14 throw} (Interpreting a compile-only word).

@item @var{name} not defined by @code{VALUE} or @code{(LOCAL)} (@code{TO}):
@code{-32 throw} (Invalid name argument).

@end table


@c =====================================================================
@node The optional Memory-Allocation word set, The optional Programming-Tools word set, The optional Locals word set, ANS conformance
@section The optional Memory-Allocation word set
@c =====================================================================

@menu
* memory-idef::                 Implementation Defined Options
@end menu


@c ---------------------------------------------------------------------
@node memory-idef,  , The optional Memory-Allocation word set, The optional Memory-Allocation word set
@subsection Implementation Defined Options
@c ---------------------------------------------------------------------

@table @i

@item values and meaning of @var{ior}:
The @var{ior}s returned by the file and memory allocation words are
intended as throw codes. They are typically in the range
-512@minus{}-2047 used for OS errors. The mapping from OS error numbers to
@var{ior}s is -512@minus{}@var{errno}.

@end table

@c =====================================================================
@node The optional Programming-Tools word set, The optional Search-Order word set, The optional Memory-Allocation word set, ANS conformance
@section The optional Programming-Tools word set
@c =====================================================================

@menu
* programming-idef::            Implementation Defined Options
* programming-ambcond::         Ambiguous Conditions
@end menu


@c ---------------------------------------------------------------------
@node programming-idef, programming-ambcond, The optional Programming-Tools word set, The optional Programming-Tools word set
@subsection Implementation Defined Options
@c ---------------------------------------------------------------------

@table @i

@item ending sequence for input following @code{;code} and @code{code}:
Not implemented (yet).

@item manner of processing input following @code{;code} and @code{code}:
Not implemented (yet).

@item search order capability for @code{EDITOR} and @code{ASSEMBLER}:
Not implemented (yet). If they were implemented, they would use the
search order wordset.

@item source and format of display by @code{SEE}:
The source for @code{see} is the intermediate code used by the inner
interpreter. The current @code{see} tries to output Forth source code
as well as possible.

@end table

@c ---------------------------------------------------------------------
@node programming-ambcond,  , programming-idef, The optional Programming-Tools word set
@subsection Ambiguous conditions
@c ---------------------------------------------------------------------

@table @i

@item deleting the compilation wordlist (@code{FORGET}):
Not implemented (yet).

@item fewer than @var{u}+1 items on the control flow stack (@code{CS-PICK}, @code{CS-ROLL}):
This typically results in an @code{abort"} with a descriptive error
message (may change into a @code{-22 throw} (Control structure mismatch)
in the future). You may also get a memory access error. If you are
unlucky, this ambiguous condition is not caught.

@item @var{name} can't be found (@code{forget}):
Not implemented (yet).

@item @var{name} not defined via @code{CREATE}:
@code{;code} is not implemented (yet). If it were, it would behave like
@code{DOES>} in this respect, i.e., change the execution semantics of
the last defined word no matter how it was defined.

@item @code{POSTPONE} applied to @code{[IF]}:
After defining @code{: X POSTPONE [IF] ; IMMEDIATE}, @code{X} is
equivalent to @code{[IF]}.

@item reaching the end of the input source before matching @code{[ELSE]} or @code{[THEN]}:
Continue in the same state of conditional compilation in the next outer
input source. Currently there is no warning to the user about this.

@item removing a needed definition (@code{FORGET}):
Not implemented (yet).

@end table


@c =====================================================================
@node The optional Search-Order word set,  , The optional Programming-Tools word set, ANS conformance
@section The optional Search-Order word set
@c =====================================================================

@menu
* search-idef::                 Implementation Defined Options
* search-ambcond::              Ambiguous Conditions
@end menu


@c ---------------------------------------------------------------------
@node search-idef, search-ambcond, The optional Search-Order word set, The optional Search-Order word set
@subsection Implementation Defined Options
@c ---------------------------------------------------------------------

@table @i

@item maximum number of word lists in search order:
@code{s" wordlists" environment? drop .}. Currently 16.

@item minimum search order:
@code{root root}.

@end table

@c ---------------------------------------------------------------------
@node search-ambcond,  , search-idef, The optional Search-Order word set
@subsection Ambiguous conditions
@c ---------------------------------------------------------------------

@table @i

@item changing the compilation wordlist (during compilation):
The definition is put into the wordlist that is the compilation wordlist
when @code{REVEAL} is executed (by @code{;}, @code{DOES>},
@code{RECURSIVE}, etc.).

@item search order empty (@code{previous}):
@code{abort" Vocstack empty"}.

@item too many word lists in search order (@code{also}):
@code{abort" Vocstack full"}.

@end table


@node Model, Emacs and Gforth, ANS conformance, Top
@chapter Model

@node Emacs and Gforth, Internals, Model, Top
@chapter Emacs and Gforth

Gforth comes with @file{gforth.el}, an improved version of
@file{forth.el} by Goran Rydqvist (included in the TILE package). The
improvements include better (but still not perfect) handling of
indentation. I have also added comment paragraph filling (@kbd{M-q}),
commenting (@kbd{C-x \}) and uncommenting (@kbd{C-u C-x \}) regions and
removing debugging tracers (@kbd{C-x ~}, @pxref{Debugging}). I left the
stuff I do not use alone, even though some of it only makes sense for
TILE. To get a description of these features, enter Forth mode and type
@kbd{C-h m}.

In addition, Gforth supports Emacs quite well: The source code locations
given in error messages, debugging output (from @code{~~}) and failed
assertion messages are in the right format for Emacs' compilation mode
(@pxref{Compilation, , Running Compilations under Emacs, emacs, Emacs
Manual}) so the source location corresponding to an error or other
message is only a few keystrokes away (@kbd{C-x `} for the next error,
@kbd{C-c C-c} for the error under the cursor).

Also, if you @code{include} @file{etags.fs}, a new @file{TAGS} file
(@pxref{Tags, , Tags Tables, emacs, Emacs Manual}) will be produced that
contains the definitions of all words defined afterwards. You can then
find the source for a word using @kbd{M-.}. Note that Emacs can use
several tags files at the same time (e.g., one for the Gforth sources
and one for your program).

To get all these benefits, add the following lines to your @file{.emacs}
file:

@example
(autoload 'forth-mode "gforth.el")
(setq auto-mode-alist (cons '("\\.fs\\'" .
forth-mode) auto-mode-alist))
@end example

@node Internals, Bugs, Emacs and Gforth, Top
@chapter Internals

Reading this section is not necessary for programming with Gforth. It
should be helpful for finding your way in the Gforth sources.

@menu
* Portability::
* Threading::
* Primitives::
* System Architecture::
* Performance::
@end menu

@node Portability, Threading, Internals, Internals
@section Portability

One of the main goals of the effort is availability across a wide range
of personal machines. fig-Forth, and, to a lesser extent, F83, achieved
this goal by manually coding the engine in assembly language for several
then-popular processors. This approach is very labor-intensive and the
results are short-lived due to progress in computer architecture.

Others have avoided this problem by coding in C, e.g., Mitch Bradley
(cforth), Mikael Patel (TILE) and Dirk Zoller (pfe). This approach is
particularly popular for UNIX-based Forths due to the large variety of
architectures of UNIX machines. Unfortunately an implementation in C
does not mix well with the goals of efficiency and with using
traditional techniques: Indirect or direct threading cannot be expressed
in C, and switch threading, the fastest technique available in C, is
significantly slower. Another problem with C is that it's very
cumbersome to express double integer arithmetic.

Fortunately, there is a portable language that does not have these
limitations: GNU C, the version of C processed by the GNU C compiler
(@pxref{C Extensions, , Extensions to the C Language Family, gcc.info,
GNU C Manual}).
Its labels as values feature (@pxref{Labels as Values, ,
Labels as Values, gcc.info, GNU C Manual}) makes direct and indirect
threading possible; its @code{long long} type (@pxref{Long Long, ,
Double-Word Integers, gcc.info, GNU C Manual}) corresponds to Forth's
double numbers. GNU C is available for free on all important (and many
unimportant) UNIX machines, VMS, 80386s running MS-DOS, the Amiga, and
the Atari ST, so a Forth written in GNU C can run on all these
machines.

Writing in a portable language has the reputation of producing code that
is slower than assembly. For our Forth engine we repeatedly looked at
the code produced by the compiler and eliminated most compiler-induced
inefficiencies by appropriate changes in the source-code.

However, register allocation cannot be portably influenced by the
programmer, leading to some inefficiencies on register-starved
machines. We use explicit register declarations (@pxref{Explicit Reg
Vars, , Variables in Specified Registers, gcc.info, GNU C Manual}) to
improve the speed on some machines. They are turned on by using the
@code{gcc} switch @code{-DFORCE_REG}. Unfortunately, this feature not
only depends on the machine, but also on the compiler version: On some
machines some compiler versions produce incorrect code when certain
explicit register declarations are used. So by default
@code{-DFORCE_REG} is not used.

@node Threading, Primitives, Portability, Internals
@section Threading

GNU C's labels as values extension (available since @code{gcc-2.0},
@pxref{Labels as Values, , Labels as Values, gcc.info, GNU C Manual})
makes it possible to take the address of @var{label} by writing
@code{&&@var{label}}. This address can then be used in a statement like
@code{goto *@var{address}}.
I.e., @code{goto *&&x} is the same as
@code{goto x}.

With this feature an indirect threaded NEXT looks like:
@example
cfa = *ip++;
ca = *cfa;
goto *ca;
@end example
For those unfamiliar with the names: @code{ip} is the Forth instruction
pointer; the @code{cfa} (code-field address) corresponds to ANS Forth's
execution token and points to the code field of the next word to be
executed; the @code{ca} (code address) fetched from there points to some
executable code, e.g., a primitive or the colon definition handler
@code{docol}.

Direct threading is even simpler:
@example
ca = *ip++;
goto *ca;
@end example

Of course we have packaged the whole thing neatly in macros called
@code{NEXT} and @code{NEXT1} (the part of NEXT after fetching the cfa).

@menu
* Scheduling::
* Direct or Indirect Threaded?::
* DOES>::
@end menu

@node Scheduling, Direct or Indirect Threaded?, Threading, Threading
@subsection Scheduling

There is a little complication: Pipelined and superscalar processors,
i.e., RISC and some modern CISC machines, can process independent
instructions while waiting for the results of an instruction. The
compiler usually reorders (schedules) the instructions in a way that
achieves good usage of these delay slots. However, on our first tries
the compiler did not do well on scheduling primitives. E.g., for
@code{+} implemented as
@example
n=sp[0]+sp[1];
sp++;
sp[0]=n;
NEXT;
@end example
the NEXT comes strictly after the other code, i.e., there is nearly no
scheduling.
After a little thought the problem becomes clear: The
compiler cannot know that sp and ip point to different addresses (and
the version of @code{gcc} we used would not know it even if it were
possible), so it could not move the load of the cfa above the store to
the TOS. Indeed the pointers could be the same, if code on or very near
the top of stack were executed. In the interest of speed we chose to
forbid this probably unused ``feature'' and helped the compiler in
scheduling: NEXT is divided into the loading part (@code{NEXT_P1}) and
the goto part (@code{NEXT_P2}). @code{+} now looks like:
@example
n=sp[0]+sp[1];
sp++;
NEXT_P1;
sp[0]=n;
NEXT_P2;
@end example
This can be scheduled optimally by the compiler.

This division can be turned off with the switch @code{-DCISC_NEXT}. This
switch is on by default on machines that do not profit from scheduling
(e.g., the 80386), in order to preserve registers.

@node Direct or Indirect Threaded?, DOES>, Scheduling, Threading
@subsection Direct or Indirect Threaded?

Both! After packaging the nasty details in macro definitions we
realized that we could switch between direct and indirect threading by
simply setting a compilation flag (@code{-DDIRECT_THREADED}) and
defining a few machine-specific macros for the direct-threading case.
On the Forth level we also offer access words that hide the
differences between the threading methods (@pxref{Threading Words}).

Indirect threading is implemented completely
machine-independently. Direct threading needs routines for creating
jumps to the executable code (e.g., to @code{docol} or @code{dodoes}).
These routines are inherently machine-dependent, but they do not amount
to many source lines. I.e., even porting direct threading to a new
machine is a small effort.
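
The indirect threaded NEXT and its division into a loading part and a
goto part can be tried out in isolation. The following is only a
minimal sketch and not the Gforth engine: the function
@code{run_threaded_demo}, the labels, the tiny hard-wired program and
the macro bodies are all assumptions made up for illustration. It needs
a compiler that supports labels as values (e.g., @code{gcc}).

```c
#include <assert.h>
#include <stdint.h>

typedef void *Label;               /* a code address obtained with &&label */

/* Hypothetical macro bodies: loading part and goto part of NEXT. */
#define NEXT_P1 (cfa = (Label *)*ip++)
#define NEXT_P2 do { ca = *cfa; goto *ca; } while (0)
#define NEXT    do { NEXT_P1; NEXT_P2; } while (0)

/* Runs the threaded code for "2 3 + halt" and returns the top of stack. */
intptr_t run_threaded_demo(void)
{
    /* Code fields: each one holds the code address of its primitive. */
    Label cf_lit  = &&do_lit;
    Label cf_plus = &&do_plus;
    Label cf_halt = &&do_halt;

    /* Threaded code: a sequence of cfas, with literals stored inline. */
    void *program[] = {
        &cf_lit, (void *)(intptr_t)2,
        &cf_lit, (void *)(intptr_t)3,
        &cf_plus,
        &cf_halt
    };

    void **ip = program;           /* Forth instruction pointer */
    intptr_t stack[8];
    intptr_t *sp = stack + 8;      /* stack grows downwards; sp[0] is the top */
    Label *cfa;                    /* code-field address of the current word */
    Label ca;                      /* code address fetched from the code field */
    intptr_t n;

    NEXT;                          /* start executing the threaded code */

do_lit:                            /* push the literal that follows in the thread */
    *--sp = (intptr_t)*ip++;
    NEXT;

do_plus:                           /* + ( n1 n2 -- n ) */
    n = sp[0] + sp[1];
    sp++;
    NEXT_P1;                       /* load the next cfa before the store */
    sp[0] = n;
    NEXT_P2;                       /* fetch the code address and jump */

do_halt:
    assert(sp == stack + 7);       /* exactly one item is left on the stack */
    return sp[0];
}
```

Calling @code{run_threaded_demo()} executes the four-word thread once.
In @code{do_plus} the load of the next cfa is placed before the store to
the top of stack, which is the ordering the scheduling discussion above
is after.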

@node DOES>,  , Direct or Indirect Threaded?, Threading
@subsection DOES>
One of the most complex parts of a Forth engine is @code{dodoes}, i.e.,
the chunk of code executed by every word defined by a
@code{CREATE}...@code{DOES>} pair. The main problem here is: How to find
the Forth code to be executed, i.e., the code after the @code{DOES>} (the
DOES-code)? There are two solutions:

In fig-Forth the code field points directly to @code{dodoes} and the
DOES-code address is stored in the cell after the code address
(i.e., at @code{cfa cell+}). It may seem that this solution is illegal in
Forth-79 and all later standards, because in fig-Forth this address
lies in the body (which is illegal in these standards). However, by
making the code field larger for all words this solution becomes legal
again. We use this approach for the indirect threaded version. Leaving
a cell unused in most words is a bit wasteful, but on the machines we
are targeting this is hardly a problem. The other reason for having a
code field size of two cells is to avoid having different image files
for direct and indirect threaded systems (@pxref{System Architecture}).

The other approach is that the code field points or jumps to the cell
after @code{DOES>}. In this variant there is a jump to @code{dodoes} at
this address. @code{dodoes} can then get the DOES-code address by
computing the code address, i.e., the address of the jump to
@code{dodoes}, and adding the length of that jump field. A variant of
this is to have a call to @code{dodoes} after the @code{DOES>}; then the
return address (which can be found in the return register on RISCs) is
the DOES-code address. Since the two cells available in the code field
are usually used up by the jump to the code address in direct threading,
we use this approach for direct threading.
We did not want to add another
cell to the code field.

@node Primitives, System Architecture, Threading, Internals
@section Primitives

@menu
* Automatic Generation::
* TOS Optimization::
* Produced code::
@end menu

@node Automatic Generation, TOS Optimization, Primitives, Primitives
@subsection Automatic Generation

Since the primitives are implemented in a portable language, there is no
longer any need to minimize the number of primitives. On the contrary,
having many primitives is an advantage: speed. In order to reduce the
number of errors in primitives and to make programming them easier, we
provide a tool, the primitive generator (@file{prims2x.fs}), that
automatically generates most (and sometimes all) of the C code for a
primitive from the stack effect notation. The source for a primitive
has the following form:

@format
@var{Forth-name}  @var{stack-effect}  @var{category}  [@var{pronounc.}]
[@code{""}@var{glossary entry}@code{""}]
@var{C code}
[@code{:}
@var{Forth code}]
@end format

The items in brackets are optional. The category and glossary fields
are there for generating the documentation, the Forth code is there
for manual implementations on machines without GNU C. E.g., the source
for the primitive @code{+} is:
@example
+    n1 n2 -- n    core    plus
n = n1+n2;
@end example

This looks like a specification, but in fact @code{n = n1+n2} is C
code.
Our primitive generation tool extracts a lot of information from
the stack effect notations@footnote{We use a one-stack notation, even
though we have separate data and floating-point stacks; The separate
notation can be generated easily from the unified notation.}: The number
of items popped from and pushed on the stack, their type, and by what
name they are referred to in the C code. It then generates a C code
prelude and postlude for each primitive. The final C code for @code{+}
looks like this:

@example
I_plus: /* + ( n1 n2 -- n ) */  /* label, stack effect */
/*  */                          /* documentation */
@{
DEF_CA                          /* definition of variable ca (indirect threading) */
Cell n1;                        /* definitions of variables */
Cell n2;
Cell n;
n1 = (Cell) sp[1];              /* input */
n2 = (Cell) TOS;
sp += 1;                        /* stack adjustment */
NAME("+")                       /* debugging output (with -DDEBUG) */
@{
n = n1+n2;                      /* C code taken from the source */
@}
NEXT_P1;                        /* NEXT part 1 */
TOS = (Cell)n;                  /* output */
NEXT_P2;                        /* NEXT part 2 */
@}
@end example

This looks long and inefficient, but the GNU C compiler optimizes quite
well and produces optimal code for @code{+} on, e.g., the R3000 and the
HP RISC machines: Defining the @code{n}s does not produce any code, and
using them as intermediate storage also adds no cost.

There are also other optimizations that are not illustrated by this
example: Assignments between simple variables are usually for free (copy
propagation). If one of the stack items is not used by the primitive
(e.g., in @code{drop}), the compiler eliminates the load from the stack
(dead code elimination).
On the other hand, there are some things that
the compiler does not do; therefore they are performed by
@file{prims2x.fs}: The compiler does not optimize code away that stores
a stack item to the place where it just came from (e.g., @code{over}).

While programming a primitive is usually easy, there are a few cases
where the programmer has to take the actions of the generator into
account, most notably @code{?dup}, but also words that do not (always)
fall through to NEXT.

@node TOS Optimization, Produced code, Automatic Generation, Primitives
@subsection TOS Optimization

An important optimization for stack machine emulators, e.g., Forth
engines, is keeping one or more of the top stack items in
registers. If a word has the stack effect @var{in1}...@var{inx} @code{--}
@var{out1}...@var{outy}, keeping the top @var{n} items in registers
@itemize
@item
is better than keeping @var{n-1} items, if @var{x>=n} and @var{y>=n},
due to fewer loads from and stores to the stack.
@item is slower than keeping @var{n-1} items, if @var{x<>y} and @var{x<n} and
@var{y<n}, due to additional moves between registers.
@end itemize

In particular, keeping one item in a register is never a disadvantage,
if there are enough registers. Keeping two items in registers is a
disadvantage for frequent words like @code{?branch}, constants,
variables, literals and @code{i}. Therefore our generator only produces
code that keeps zero or one items in registers. The generated C code
covers both cases; the selection between these alternatives is made at
C-compile time using the switch @code{-DUSE_TOS}. @code{TOS} in the C
code for @code{+} is just a simple variable name in the one-item case,
otherwise it is a macro that expands into @code{sp[0]}.
Note that the
GNU C compiler tries to keep simple variables like @code{TOS} in
registers, and it usually succeeds, if there are enough registers.

The primitive generator performs the TOS optimization for the
floating-point stack, too (@code{-DUSE_FTOS}). For floating-point
operations the benefit of this optimization is even larger:
floating-point operations take quite long on most processors, but can be
performed in parallel with other operations as long as their results are
not used. If the FP-TOS is kept in a register, this works. If
it is kept on the stack, i.e., in memory, the store into memory has to
wait for the result of the floating-point operation, lengthening the
execution time of the primitive considerably.

The TOS optimization makes the automatic generation of primitives a
bit more complicated. Just replacing all occurrences of @code{sp[0]} by
@code{TOS} is not sufficient. There are some special cases to
consider:
@itemize
@item In the case of @code{dup ( w -- w w )} the generator must not
eliminate the store to the original location of the item on the stack,
if the TOS optimization is turned on.
@item Primitives with stack effects of the form @code{--}
@var{out1}...@var{outy} must store the TOS to the stack at the start.
Likewise, primitives with the stack effect @var{in1}...@var{inx} @code{--}
must load the TOS from the stack at the end. But for the null stack
effect @code{--} no stores or loads should be generated.
@end itemize

@node Produced code,  , TOS Optimization, Primitives
@subsection Produced code

To see what assembly code is produced for the primitives on your machine
with your compiler and your flag settings, type @code{make engine.s} and
look at the resulting file @file{engine.s}.
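
To make the two meanings of @code{TOS} from the TOS optimization
concrete before looking at real compiler output, here is a small
stand-alone sketch. It is an illustration under assumptions, not code
produced by @file{prims2x.fs}: the function @code{demo_plus}, the stack
size, and the explicit load and write-back around the body are made up
for this example. Compiling it with or without @code{-DUSE_TOS} selects
the variable or the @code{sp[0]} macro.

```c
#include <assert.h>

typedef long Cell;

#ifndef USE_TOS
#define TOS sp[0]                  /* no caching: TOS is the top stack slot in memory */
#endif

/* Pushes a and b, runs a body for + ( n1 n2 -- n ), returns the result. */
Cell demo_plus(Cell a, Cell b)
{
    Cell stack[4];
    Cell *sp = stack + 4;          /* stack grows downwards; sp[0] is the top slot */
#ifdef USE_TOS
    Cell TOS;                      /* caching: TOS is a simple variable */
#endif

    *--sp = a;
    *--sp = b;
#ifdef USE_TOS
    TOS = sp[0];                   /* bring the top item into the cached variable */
#endif

    {                              /* body of +, in the style of the generated code */
        Cell n1 = sp[1];           /* input */
        Cell n2 = TOS;
        Cell n;
        sp += 1;                   /* stack adjustment */
        n = n1 + n2;
        TOS = n;                   /* output */
    }

#ifdef USE_TOS
    sp[0] = TOS;                   /* write the cached item back before leaving */
#endif
    return sp[0];
}
```

Both configurations return the same sum; the difference is only whether
the intermediate top-of-stack value lives in a variable, which the
compiler can keep in a register, or in memory. Comparing the assembly
output of the two builds shows the loads and stores that the cached
variant avoids.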

@node System Architecture, Performance, Primitives, Internals
@section System Architecture

Our Forth system consists not only of primitives, but also of
definitions written in Forth. Since the Forth compiler itself belongs
to those definitions, it is not possible to start the system with the
primitives and the Forth source alone. Therefore we provide the Forth
code as an image file in nearly executable form. At the start of the
system a C routine loads the image file into memory, sets up the
memory (stacks etc.) according to information in the image file, and
starts executing Forth code.

The image file format is a compromise between the goals of making it
easy to generate image files and making them portable. The easiest way
to generate an image file is to just generate a memory dump. However,
this kind of image file cannot be used on a different machine, or on
the next version of the engine on the same machine; it might not even
work with the same engine compiled by a different version of the C
compiler. We would like to have as few versions of the image file as
possible, because we do not want to distribute many versions of the
same image file, and to make it easy for the users to use their image
files on many machines. We currently need to create a different image
file for machines with different cell sizes and different byte order
(little- or big-endian)@footnote{We are considering adding information to the
image file that enables the loader to change the byte order.}.

Forth code that is going to end up in a portable image file has to
comply with some restrictions: addresses have to be stored in memory with
special words (@code{A!}, @code{A,}, etc.) in order to make the code
relocatable.
Cells, floats, etc., have to be stored at the natural
alignment boundaries@footnote{E.g., store floats (8 bytes) at an address
divisible by~8. This happens automatically in our system when you use
the ANS Forth alignment words.}, in order to avoid alignment faults on
machines with stricter alignment. The image file is produced by a
metacompiler (@file{cross.fs}).

So, unlike the image file of Mitch Bradley's @code{cforth}, our image
file is not directly executable, but has to undergo some manipulations
during loading. Address relocation is performed at image load-time, not
at run-time. The loader also has to replace tokens standing for
primitive calls with the appropriate code-field addresses (or code
addresses in the case of direct threading).

@node Performance, , System Architecture, Internals
@section Performance

On RISCs the Gforth engine is very close to optimal; i.e., it is usually
impossible to write a significantly faster engine.

On register-starved machines like the 386 architecture processors
improvements are possible, because @code{gcc} does not utilize the
registers as well as a human, even with explicit register declarations;
e.g., Bernd Beuster wrote a Forth system fragment in assembly language
and hand-tuned it for the 486; this system is 1.19 times faster on the
Sieve benchmark on a 486DX2/66 than Gforth compiled with
@code{gcc-2.6.3} with @code{-DFORCE_REG}.

However, this potential advantage of assembly language implementations
is not necessarily realized in complete Forth systems: We compared
Gforth (compiled with @code{gcc-2.6.3} and @code{-DFORCE_REG}) with
Win32Forth 1.2093 and LMI's NT Forth (Beta, May 1994), two systems
written in assembly, and with two systems written in C: PFE-0.9.11
(compiled with @code{gcc-2.6.3} with the default configuration for
Linux: @code{-O2 -fomit-frame-pointer -DUSE_REGS}) and ThisForth Beta
(compiled with @code{gcc-2.6.3 -O3 -fomit-frame-pointer}). We benchmarked
Gforth, PFE and ThisForth on a 486DX2/66 under Linux. Kenneth O'Heskin
kindly provided the results for Win32Forth and NT Forth on a 486DX2/66
with similar memory performance under Windows NT.

We used four small benchmarks: the ubiquitous Sieve; bubble-sorting and
matrix multiplication, which come from the Stanford integer benchmarks
and were translated into Forth by Martin Fraeman (we used the versions
included in the TILE Forth package); and a recursive Fibonacci number
computation for benchmarking calling performance. The following table shows
the time taken for the benchmarks scaled by the time taken by Gforth (in
other words, it shows the speedup factor that Gforth achieved over the
other systems).

@example
relative          Win32-      NT           This-
time      Gforth   Forth   Forth     PFE   Forth
sieve       1.00    1.30    1.07    1.67    2.98
bubble      1.00    1.30    1.40    1.66
matmul      1.00    1.40    1.29    2.24
fib         1.00    1.44    1.26    1.82    2.82
@end example

You may find the good performance of Gforth compared with the systems
written in assembly language quite surprising. One important reason for
the disappointing performance of these systems is probably that they are
not written optimally for the 486 (e.g., they use the @code{lods}
instruction).
In addition, Win32Forth uses a comfortable, but costly
method for relocating the Forth image: like @code{cforth}, it computes
the actual addresses at run time, resulting in two address computations
per NEXT (@pxref{System Architecture}).

The speedup of Gforth over PFE and ThisForth can be easily explained
with the self-imposed restriction to standard C (although the measured
implementation of PFE uses a GNU C extension: global register
variables), which makes efficient threading impossible. Moreover,
current C compilers have a hard time optimizing other aspects of the
ThisForth source.

Note that the performance of Gforth on 386 architecture processors
varies widely with the version of @code{gcc} used. E.g., @code{gcc-2.5.8}
failed to allocate any of the virtual machine registers into real
machine registers by itself and would not work correctly with explicit
register declarations, giving a 1.3 times slower engine (on a 486DX2/66
running the Sieve) than the one measured above.

@node Bugs, Pedigree, Internals, Top
@chapter Bugs

Known bugs are described in the file BUGS in the Gforth distribution.

If you find a bug, please send a bug report to !!. A bug report should
describe the Gforth version used (it is announced at the start of an
interactive Gforth session), the machine and operating system (on Unix
systems you can use @code{uname -a} to produce this information), the
installation options (!! a way to find them out), and a complete list of
changes you (or your installer) have made to the Gforth sources (if
any); it should contain a program (or a sequence of keyboard commands)
that reproduces the bug and a description of what you think constitutes
the buggy behaviour.

For a thorough guide on reporting bugs read @ref{Bug Reporting, , How
to Report Bugs, gcc.info, GNU C Manual}.


@node Pedigree, Word Index, Bugs, Top
@chapter Pedigree

Gforth descends from BigForth (1993) and fig-Forth. Gforth and PFE (by
Dirk Zoller) will cross-fertilize each other. Of course, a significant
part of the design of Gforth was prescribed by ANS Forth.

Bernd Paysan wrote BigForth, a child of VolksForth.

VolksForth descends from F83. !! Authors? When?

Laxen and Perry wrote F83 as a model implementation of the
Forth-83 standard. !! Pedigree? When?

A team led by Bill Ragsdale implemented fig-Forth on many processors in
1979. Dean Sanderson and Bill Ragsdale developed the original
implementation of fig-Forth based on microForth.

!! microForth pedigree

A part of the information in this section comes from @cite{The Evolution
of Forth} by Elizabeth D. Rather, Donald R. Colburn and Charles
H. Moore, presented at the HOPL-II conference and preprinted in SIGPLAN
Notices 28(3), 1993. You can find more historical and genealogical
information about Forth there.

@node Word Index, Node Index, Pedigree, Top
@chapter Word Index

This index is as incomplete as the manual. Each word is listed with
stack effect and wordset.

@printindex fn

@node Node Index, , Word Index, Top
@chapter Node Index

This index is even less complete than the manual.

@contents
@bye