| \input texinfo @c -*-texinfo-*- |
\input texinfo @c -*-texinfo-*- |
| @comment The source is gforth.ds, from which gforth.texi is generated |
@comment The source is gforth.ds, from which gforth.texi is generated |
| |
|
| |
@comment TODO: nac29jan99 - a list of things to add in the next edit: |
| |
@comment 1. x-ref all ambiguous or implementation-defined features? |
| |
@comment 2. Describe the use of Auser Avariable AConstant A, etc. |
| |
@comment 3. words in miscellaneous section need a home. |
| |
@comment 4. search for TODO for other minor and major works required. |
| |
@comment 5. [rats] change all @var to @i in Forth source so that info |
| |
@comment file looks decent. |
| |
@c Not an improvement IMO - anton |
| |
@c and anyway, this should be taken up |
| |
@c with Karl Berry (the texinfo guy) - anton |
| |
@comment .. would be useful to have a word that identified all deferred words |
| |
@comment should semantics stuff in intro be moved to another section |
| |
|
| |
@c POSTPONE, COMPILE, [COMPILE], LITERAL should have their own section |
| |
|
| @comment %**start of header (This is for running Texinfo on a region.) |
@comment %**start of header (This is for running Texinfo on a region.) |
| @setfilename gforth.info |
@setfilename gforth.info |
| @settitle Gforth Manual |
@settitle Gforth Manual |
| @direntry |
@direntry |
| * Gforth: (gforth). A fast interpreter for the Forth language. |
* Gforth: (gforth). A fast interpreter for the Forth language. |
| @end direntry |
@end direntry |
| |
@c The Texinfo manual also recommends doing this, but for Gforth it may |
| |
@c not make much sense |
| |
@c @dircategory Individual utilities |
| |
@c @direntry |
| |
@c * Gforth: (gforth)Invoking Gforth. gforth, gforth-fast, gforthmi |
| |
@c @end direntry |
| |
|
| @comment @setchapternewpage odd |
@comment @setchapternewpage odd |
| |
@comment TODO this gets left in by HTML converter |
| @macro progstyle {} |
@macro progstyle {} |
| Programming style note: |
Programming style note: |
| @end macro |
@end macro |
| |
|
| |
@macro assignment {} |
| |
@table @i |
| |
@item Assignment: |
| |
@end macro |
| |
@macro endassignment {} |
| |
@end table |
| |
@end macro |
| |
|
| @comment %**end of header (This is for running Texinfo on a region.) |
@comment %**end of header (This is for running Texinfo on a region.) |
| |
|
| |
|
| |
@comment ---------------------------------------------------------- |
| |
@comment macros for beautifying glossary entries |
| |
@comment if these are used, need to strip them out for HTML converter |
| |
@comment else they get repeated verbatim in HTML output. |
| |
@comment .. not working yet. |
| |
|
| |
@macro GLOSS-START {} |
| |
@iftex |
| |
@ninerm |
| |
@end iftex |
| |
@end macro |
| |
|
| |
@macro GLOSS-END {} |
| |
@iftex |
| |
@rm |
| |
@end iftex |
| |
@end macro |
| |
|
| |
@comment ---------------------------------------------------------- |
| |
|
| |
|
| @include version.texi |
@include version.texi |
| |
|
| @ifinfo |
@ifnottex |
| This file documents Gforth @value{VERSION} |
This file documents Gforth @value{VERSION} |
| |
|
| Copyright @copyright{} 1995-1998 Free Software Foundation, Inc. |
Copyright @copyright{} 1995--2000 Free Software Foundation, Inc. |
| |
|
| Permission is granted to make and distribute verbatim copies of |
Permission is granted to make and distribute verbatim copies of |
| this manual provided the copyright notice and this permission notice |
this manual provided the copyright notice and this permission notice |
| except that the sections entitled "Distribution" and "General Public |
except that the sections entitled "Distribution" and "General Public |
| License" may be included in a translation approved by the author instead |
License" may be included in a translation approved by the author instead |
| of in the original English. |
of in the original English. |
| @end ifinfo |
@end ifnottex |
| |
|
| @finalout |
@finalout |
| @titlepage |
@titlepage |
| @sp 2 |
@sp 2 |
| @center for version @value{VERSION} |
@center for version @value{VERSION} |
| @sp 2 |
@sp 2 |
| |
@center Neal Crook |
| @center Anton Ertl |
@center Anton Ertl |
| @center Bernd Paysan |
@center Bernd Paysan |
| @center Jens Wilke |
@center Jens Wilke |
| @sp 3 |
@sp 3 |
| @center This manual is permanently under construction |
@center This manual is permanently under construction and was last updated on 15-Mar-2000 |
| |
|
| @comment The following two commands start the copyright page. |
@comment The following two commands start the copyright page. |
| @page |
@page |
| @vskip 0pt plus 1filll |
@vskip 0pt plus 1filll |
| Copyright @copyright{} 1995--1998 Free Software Foundation, Inc. |
Copyright @copyright{} 1995--2000 Free Software Foundation, Inc. |
| |
|
| @comment !! Published by ... or You can get a copy of this manual ... |
@comment !! Published by ... or You can get a copy of this manual ... |
| |
|
| of in the original English. |
of in the original English. |
| @end titlepage |
@end titlepage |
| |
|
| |
|
| @node Top, License, (dir), (dir) |
@node Top, License, (dir), (dir) |
| @ifinfo |
@ifnottex |
| Gforth is a free implementation of ANS Forth available on many |
Gforth is a free implementation of ANS Forth available on many |
| personal machines. This manual corresponds to version @value{VERSION}. |
personal machines. This manual corresponds to version @value{VERSION}. |
| @end ifinfo |
@end ifnottex |
| |
|
| @menu |
@menu |
| * License:: |
* License:: The GPL |
| * Goals:: About the Gforth Project |
* Goals:: About the Gforth Project |
| * Other Books:: Things you might want to read |
* Gforth Environment:: Starting (and exiting) Gforth |
| * Invoking Gforth:: Starting Gforth |
* Tutorial:: Hands-on Forth Tutorial |
| |
* Introduction:: An introduction to ANS Forth |
| * Words:: Forth words available in Gforth |
* Words:: Forth words available in Gforth |
| |
* Error messages:: How to interpret them |
| * Tools:: Programming tools |
* Tools:: Programming tools |
| * ANS conformance:: Implementation-defined options etc. |
* ANS conformance:: Implementation-defined options etc. |
| |
* Standard vs Extensions:: Should I use extensions? |
| * Model:: The abstract machine of Gforth |
* Model:: The abstract machine of Gforth |
| * Integrating Gforth:: Forth as scripting language for applications |
* Integrating Gforth:: Forth as scripting language for applications |
| * Emacs and Gforth:: The Gforth Mode |
* Emacs and Gforth:: The Gforth Mode |
| * Image Files:: @code{.fi} files contain compiled code |
* Image Files:: @code{.fi} files contain compiled code |
| * Engine:: The inner interpreter and the primitives |
* Engine:: The inner interpreter and the primitives |
| |
* Binding to System Library:: |
| * Cross Compiler:: The Cross Compiler |
* Cross Compiler:: The Cross Compiler |
| * Bugs:: How to report them |
* Bugs:: How to report them |
| * Origin:: Authors and ancestors of Gforth |
* Origin:: Authors and ancestors of Gforth |
| |
* Forth-related information:: Books and places to look on the WWW |
| * Word Index:: An item for each Forth word |
* Word Index:: An item for each Forth word |
| |
* Name Index:: Forth words, only names listed |
| * Concept Index:: A menu covering many topics |
* Concept Index:: A menu covering many topics |
| |
|
| --- The Detailed Node Listing --- |
@detailmenu --- The Detailed Node Listing --- |
| |
|
| |
Gforth Environment |
| |
|
| |
* Invoking Gforth:: Getting in |
| |
* Leaving Gforth:: Getting out |
| |
* Command-line editing:: |
| |
* Environment variables:: that affect how Gforth starts up |
| |
* Gforth Files:: What gets installed and where |
| |
* Startup speed:: When 35ms is not fast enough ... |
| |
|
| |
Forth Tutorial |
| |
|
| |
* Starting Gforth Tutorial:: |
| |
* Syntax Tutorial:: |
| |
* Crash Course Tutorial:: |
| |
* Stack Tutorial:: |
| |
* Arithmetics Tutorial:: |
| |
* Stack Manipulation Tutorial:: |
| |
* Using files for Forth code Tutorial:: |
| |
* Comments Tutorial:: |
| |
* Colon Definitions Tutorial:: |
| |
* Decompilation Tutorial:: |
| |
* Stack-Effect Comments Tutorial:: |
| |
* Types Tutorial:: |
| |
* Factoring Tutorial:: |
| |
* Designing the stack effect Tutorial:: |
| |
* Local Variables Tutorial:: |
| |
* Conditional execution Tutorial:: |
| |
* Flags and Comparisons Tutorial:: |
| |
* General Loops Tutorial:: |
| |
* Counted loops Tutorial:: |
| |
* Recursion Tutorial:: |
| |
* Leaving definitions or loops Tutorial:: |
| |
* Return Stack Tutorial:: |
| |
* Memory Tutorial:: |
| |
* Characters and Strings Tutorial:: |
| |
* Alignment Tutorial:: |
| |
* Interpretation and Compilation Semantics and Immediacy Tutorial:: |
| |
* Execution Tokens Tutorial:: |
| |
* Exceptions Tutorial:: |
| |
* Defining Words Tutorial:: |
| |
* Arrays and Records Tutorial:: |
| |
* POSTPONE Tutorial:: |
| |
* Literal Tutorial:: |
| |
* Advanced macros Tutorial:: |
| |
* Compilation Tokens Tutorial:: |
| |
* Wordlists and Search Order Tutorial:: |
| |
|
| |
An Introduction to ANS Forth |
| |
|
| |
* Introducing the Text Interpreter:: |
| |
* Stacks and Postfix notation:: |
| |
* Your first definition:: |
| |
* How does that work?:: |
| |
* Forth is written in Forth:: |
| |
* Review - elements of a Forth system:: |
| |
* Where to go next:: |
| |
* Exercises:: |
| |
|
| Forth Words |
Forth Words |
| |
|
| * Notation:: |
* Notation:: |
| |
* Case insensitivity:: |
| |
* Comments:: |
| |
* Boolean Flags:: |
| * Arithmetic:: |
* Arithmetic:: |
| * Stack Manipulation:: |
* Stack Manipulation:: |
| * Memory:: |
* Memory:: |
| * Control Structures:: |
* Control Structures:: |
| * Locals:: |
|
| * Defining Words:: |
* Defining Words:: |
| * Structures:: |
* Interpretation and Compilation Semantics:: |
| * Object-oriented Forth:: |
|
| * Tokens for Words:: |
* Tokens for Words:: |
| * Wordlists:: |
* The Text Interpreter:: |
| |
* Word Lists:: |
| |
* Environmental Queries:: |
| * Files:: |
* Files:: |
| * Including Files:: |
|
| * Blocks:: |
* Blocks:: |
| * Other I/O:: |
* Other I/O:: |
| |
* Locals:: |
| |
* Structures:: |
| |
* Object-oriented Forth:: |
| * Programming Tools:: |
* Programming Tools:: |
| * Assembler and Code Words:: |
* Assembler and Code Words:: |
| * Threading Words:: |
* Threading Words:: |
| |
* Passing Commands to the OS:: |
| |
* Keeping track of Time:: |
| |
* Miscellaneous Words:: |
| |
|
| Arithmetic |
Arithmetic |
| |
|
| * Single precision:: |
* Single precision:: |
| * Bitwise operations:: |
|
| * Mixed precision:: operations with single and double-cell integers |
|
| * Double precision:: Double-cell integer arithmetic |
* Double precision:: Double-cell integer arithmetic |
| |
* Bitwise operations:: |
| |
* Numeric comparison:: |
| |
* Mixed precision:: Operations with single and double-cell integers |
| * Floating Point:: |
* Floating Point:: |
| |
|
| Stack Manipulation |
Stack Manipulation |
| |
|
| Memory |
Memory |
| |
|
| |
* Memory model:: |
| |
* Dictionary allocation:: |
| |
* Heap Allocation:: |
| * Memory Access:: |
* Memory Access:: |
| * Address arithmetic:: |
* Address arithmetic:: |
| * Memory Blocks:: |
* Memory Blocks:: |
| |
|
| Control Structures |
Control Structures |
| |
|
| * Selection:: |
* Selection:: IF ... ELSE ... ENDIF |
| * Simple Loops:: |
* Simple Loops:: BEGIN ... |
| * Counted Loops:: |
* Counted Loops:: DO |
| * Arbitrary control structures:: |
* Arbitrary control structures:: |
| * Calls and returns:: |
* Calls and returns:: |
| * Exception Handling:: |
* Exception Handling:: |
| |
|
| |
Defining Words |
| |
|
| |
* CREATE:: |
| |
* Variables:: Variables and user variables |
| |
* Constants:: |
| |
* Values:: Initialised variables |
| |
* Colon Definitions:: |
| |
* Anonymous Definitions:: Definitions without names |
| |
* Supplying names:: Passing definition names as strings |
| |
* User-defined Defining Words:: |
| |
* Deferred words:: Allow forward references |
| |
* Aliases:: |
| |
|
| |
User-defined Defining Words |
| |
|
| |
* CREATE..DOES> applications:: |
| |
* CREATE..DOES> details:: |
| |
* Advanced does> usage example:: |
| |
|
| |
Interpretation and Compilation Semantics |
| |
|
| |
* Combined words:: |
| |
|
| |
Tokens for Words |
| |
|
| |
* Execution token:: represents execution/interpretation semantics |
| |
* Compilation token:: represents compilation semantics |
| |
* Name token:: represents named words |
| |
|
| |
The Text Interpreter |
| |
|
| |
* Input Sources:: |
| |
* Number Conversion:: |
| |
* Interpret/Compile states:: |
| |
* Literals:: |
| |
* Interpreter Directives:: |
| |
|
| |
Word Lists |
| |
|
| |
* Vocabularies:: |
| |
* Why use word lists?:: |
| |
* Word list example:: |
| |
|
| |
Files |
| |
|
| |
* Forth source files:: |
| |
* General files:: |
| |
* Search Paths:: |
| |
|
| |
Search Paths |
| |
|
| |
* Source Search Paths:: |
| |
* General Search Paths:: |
| |
|
| |
Other I/O |
| |
|
| |
* Simple numeric output:: Predefined formats |
| |
* Formatted numeric output:: Formatted (pictured) output |
| |
* String Formats:: How Forth stores strings in memory |
| |
* Displaying characters and strings:: Other stuff |
| |
* Input:: Input |
| |
|
| Locals |
Locals |
| |
|
| * Gforth locals:: |
* Gforth locals:: |
| |
|
| * Where are locals visible by name?:: |
* Where are locals visible by name?:: |
| * How long do locals live?:: |
* How long do locals live?:: |
| * Programming Style:: |
* Locals programming style:: |
| * Implementation:: |
* Locals implementation:: |
| |
|
| Defining Words |
|
| |
|
| * Simple Defining Words:: |
|
| * Colon Definitions:: |
|
| * User-defined Defining Words:: |
|
| * Supplying names:: |
|
| * Interpretation and Compilation Semantics:: |
|
| |
|
| Structures |
Structures |
| |
|
| |
|
| Object-oriented Forth |
Object-oriented Forth |
| |
|
| |
* Why object-oriented programming?:: |
| |
* Object-Oriented Terminology:: |
| * Objects:: |
* Objects:: |
| * OOF:: |
* OOF:: |
| * Mini-OOF:: |
* Mini-OOF:: |
| |
* Comparison with other object models:: |
| |
|
| Objects |
The @file{objects.fs} model |
| |
|
| * Properties of the Objects model:: |
* Properties of the Objects model:: |
| * Why object-oriented programming?:: |
|
| * Object-Oriented Terminology:: |
|
| * Basic Objects Usage:: |
* Basic Objects Usage:: |
| * The class Object:: |
* The Objects base class:: |
| * Creating objects:: |
* Creating objects:: |
| * Object-Oriented Programming Style:: |
* Object-Oriented Programming Style:: |
| * Class Binding:: |
* Class Binding:: |
| * Method conveniences:: |
* Method conveniences:: |
| * Classes and Scoping:: |
* Classes and Scoping:: |
| |
* Dividing classes:: |
| * Object Interfaces:: |
* Object Interfaces:: |
| * Objects Implementation:: |
* Objects Implementation:: |
| * Comparison with other object models:: |
|
| * Objects Glossary:: |
* Objects Glossary:: |
| |
|
| OOF |
The @file{oof.fs} model |
| |
|
| * Properties of the OOF model:: |
* Properties of the OOF model:: |
| * Basic OOF Usage:: |
* Basic OOF Usage:: |
| * The base class object:: |
* The OOF base class:: |
| * Class Declaration:: |
* Class Declaration:: |
| * Class Implementation:: |
* Class Implementation:: |
| |
|
| Including Files |
The @file{mini-oof.fs} model |
| |
|
| * Words for Including:: |
* Basic Mini-OOF Usage:: |
| * Search Path:: |
* Mini-OOF Example:: |
| * Changing the Search Path:: |
* Mini-OOF Implementation:: |
| * General Search Paths:: |
|
| |
|
| Programming Tools |
Programming Tools |
| |
|
| |
* Examining:: |
| |
* Forgetting words:: |
| * Debugging:: Simple and quick. |
* Debugging:: Simple and quick. |
| * Assertions:: Making your programs self-checking. |
* Assertions:: Making your programs self-checking. |
| * Singlestep Debugger:: Executing your program word by word. |
* Singlestep Debugger:: Executing your program word by word. |
| |
|
| |
Assembler and Code Words |
| |
|
| |
* Code and ;code:: |
| |
* Common Assembler:: Assembler Syntax |
| |
* Common Disassembler:: |
| |
* 386 Assembler:: Deviations and special cases |
| |
* Alpha Assembler:: Deviations and special cases |
| |
* MIPS assembler:: Deviations and special cases |
| |
* Other assemblers:: How to write them |
| |
|
| Tools |
Tools |
| |
|
| * ANS Report:: Report the words used, sorted by wordset. |
* ANS Report:: Report the words used, sorted by wordset. |
| |
|
| Image Files |
Image Files |
| |
|
| |
* Image Licensing Issues:: Distribution terms for images. |
| * Image File Background:: Why have image files? |
* Image File Background:: Why have image files? |
| * Non-Relocatable Image Files:: don't always work. |
* Non-Relocatable Image Files:: don't always work. |
| * Data-Relocatable Image Files:: are better. |
* Data-Relocatable Image Files:: are better. |
| * Fully Relocatable Image Files:: better yet. |
* Fully Relocatable Image Files:: better yet. |
| * Stack and Dictionary Sizes:: Setting the default sizes for an image. |
* Stack and Dictionary Sizes:: Setting the default sizes for an image. |
| * Running Image Files:: @code{gforth -i @var{file}} or @var{file}. |
* Running Image Files:: @code{gforth -i @i{file}} or @i{file}. |
| * Modifying the Startup Sequence:: and turnkey applications. |
* Modifying the Startup Sequence:: and turnkey applications. |
| |
|
| Fully Relocatable Image Files |
Fully Relocatable Image Files |
| * TOS Optimization:: |
* TOS Optimization:: |
| * Produced code:: |
* Produced code:: |
| |
|
| System Libraries |
|
| |
|
| * Binding to System Library:: |
|
| |
|
| Cross Compiler |
Cross Compiler |
| |
|
| * Using the Cross Compiler:: |
* Using the Cross Compiler:: |
| * How the Cross Compiler Works:: |
* How the Cross Compiler Works:: |
| |
|
| |
Other Forth-related information |
| |
|
| |
* Internet resources:: |
| |
* Books:: |
| |
* The Forth Interest Group:: |
| |
* Conferences:: |
| |
|
| |
@end detailmenu |
| @end menu |
@end menu |
| |
|
| @node License, Goals, Top, Top |
@node License, Goals, Top, Top |
| @iftex |
@iftex |
| @unnumberedsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION |
@unnumberedsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION |
| @end iftex |
@end iftex |
| @ifinfo |
@ifnottex |
| @center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION |
@center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION |
| @end ifinfo |
@end ifnottex |
| |
|
| @enumerate 0 |
@enumerate 0 |
| @item |
@item |
| @iftex |
@iftex |
| @heading NO WARRANTY |
@heading NO WARRANTY |
| @end iftex |
@end iftex |
| @ifinfo |
@ifnottex |
| @center NO WARRANTY |
@center NO WARRANTY |
| @end ifinfo |
@end ifnottex |
| |
|
| @item |
@item |
| BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY |
BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY |
| @iftex |
@iftex |
| @heading END OF TERMS AND CONDITIONS |
@heading END OF TERMS AND CONDITIONS |
| @end iftex |
@end iftex |
| @ifinfo |
@ifnottex |
| @center END OF TERMS AND CONDITIONS |
@center END OF TERMS AND CONDITIONS |
| @end ifinfo |
@end ifnottex |
| |
|
| @page |
@page |
| @unnumberedsec How to Apply These Terms to Your New Programs |
@unnumberedsec How to Apply These Terms to Your New Programs |
| @iftex |
@iftex |
| @unnumbered Preface |
@unnumbered Preface |
| @cindex Preface |
@cindex Preface |
| This manual documents Gforth. The reader is expected to know |
This manual documents Gforth. Some introductory material is provided for |
| Forth. This manual is primarily a reference manual. @xref{Other Books} |
readers who are unfamiliar with Forth or who are migrating to Gforth |
| for introductory material. |
from other Forth compilers. However, this manual is primarily a |
| |
reference manual. |
| @end iftex |
@end iftex |
| |
|
| @node Goals, Other Books, License, Top |
@comment TODO much more blurb here. |
| |
|
| |
@c ****************************************************************** |
| |
@node Goals, Gforth Environment, License, Top |
| @comment node-name, next, previous, up |
@comment node-name, next, previous, up |
| @chapter Goals of Gforth |
@chapter Goals of Gforth |
| @cindex Goals |
@cindex goals of the Gforth project |
| The goal of the Gforth Project is to develop a standard model for |
The goal of the Gforth Project is to develop a standard model for |
| ANS Forth. This can be split into several subgoals: |
ANS Forth. This can be split into several subgoals: |
| |
|
| @itemize @bullet |
@itemize @bullet |
| @item |
@item |
| Gforth should conform to the Forth standard (ANS Forth). |
Gforth should conform to the ANS Forth Standard. |
| @item |
@item |
| It should be a model, i.e. it should define all the |
It should be a model, i.e. it should define all the |
| implementation-dependent things. |
implementation-dependent things. |
| appears to be quite popular. It has some similarities to and some |
appears to be quite popular. It has some similarities to and some |
| differences from previous models. It has some powerful features, but not |
differences from previous models. It has some powerful features, but not |
| yet everything that we envisioned. We certainly have achieved our |
yet everything that we envisioned. We certainly have achieved our |
| execution speed goals (@pxref{Performance}). It is free and available |
execution speed goals (@pxref{Performance})@footnote{However, in 1998 |
| on many machines. |
the bar was raised when the major commercial Forth vendors switched to |
| |
native code compilers.}. It is free and available on many machines. |
| |
|
| @node Other Books, Invoking Gforth, Goals, Top |
@c ****************************************************************** |
| @chapter Other books on ANS Forth |
@node Gforth Environment, Tutorial, Goals, Top |
| @cindex books on Forth |
@chapter Gforth Environment |
| |
@cindex Gforth environment |
| As the standard is relatively new, there are not many books out yet. It |
|
| is not recommended to learn Forth by using Gforth and a book that is |
|
| not written for ANS Forth, as you will not know your mistakes from the |
|
| deviations of the book. |
|
| |
|
| @cindex standard document for ANS Forth |
Note: ultimately, the Gforth man page will be auto-generated from the |
| @cindex ANS Forth document |
material in this chapter. |
| There is, of course, the standard, the definitive reference if you want to |
|
| write ANS Forth programs. It is available in printed form from the |
|
| National Standards Institute Sales Department (Tel.: USA (212) 642-4900; |
|
| Fax.: USA (212) 302-1286) as document @cite{X3.215-1994} for about $200. You |
|
| can also get it from Global Engineering Documents (Tel.: USA (800) |
|
| 854-7179; Fax.: (303) 843-9880) for about $300. |
|
| |
|
| @cite{dpANS6}, the last draft of the standard, which was then submitted |
@menu |
| to ANSI for publication is available electronically and for free in some |
* Invoking Gforth:: Getting in |
| MS Word format, and it has been converted to HTML (this is my favourite |
* Leaving Gforth:: Getting out |
| format !!url). Some pointers to these versions can be found through |
* Command-line editing:: |
| @*@url{http://www.complang.tuwien.ac.at/projects/forth.html}. |
* Environment variables:: that affect how Gforth starts up |
| |
* Gforth Files:: What gets installed and where |
| @cindex introductory book |
* Startup speed:: When 35ms is not fast enough ... |
| @cindex book, introductory |
@end menu |
| @cindex Woehr, Jack: @cite{Forth: The New Model} |
|
| @cindex @cite{Forth: The new model} (book) |
|
| @cite{Forth: The New Model} by Jack Woehr (Prentice-Hall, 1993) is an |
|
| introductory book based on a draft version of the standard. It does not |
|
| cover the whole standard. It also contains interesting background |
|
| information (Jack Woehr was in the ANS Forth Technical Committee). It is |
|
| not appropriate for complete newbies, but programmers experienced in |
|
| other languages should find it ok. |
|
| |
|
| !!Conklin, Forth programmer's handbook |
For related information about the creation of images see @ref{Image Files}. |
| |
|
| @node Invoking Gforth, Words, Other Books, Top |
@comment ---------------------------------------------- |
| @chapter Invoking Gforth |
@node Invoking Gforth, Leaving Gforth, Gforth Environment, Gforth Environment |
| |
@section Invoking Gforth |
| @cindex invoking Gforth |
@cindex invoking Gforth |
| @cindex running Gforth |
@cindex running Gforth |
| @cindex command-line options |
@cindex command-line options |
| @cindex options on the command line |
@cindex options on the command line |
| @cindex flags on the command line |
@cindex flags on the command line |
| |
|
| You will usually just say @code{gforth}. In many other cases the default |
Gforth is made up of two parts: an executable ``engine'' (named |
| |
@file{gforth} or @file{gforth-fast}) and an image file. To start it, you |
| |
will usually just say @code{gforth} -- this automatically loads the |
| |
default image file @file{gforth.fi}. In many other cases the default |
| Gforth image will be invoked like this: |
Gforth image will be invoked like this: |
| @example |
@example |
| gforth [files] [-e forth-code] |
gforth [file | -e forth-code] ... |
| @end example |
@end example |
| |
@noindent |
| This interprets the contents of the files and the Forth code in the order they |
This interprets the contents of the files and the Forth code in the order they |
| are given. |
are given. |
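
The left-to-right processing of files and @code{-e} code can be tried with
a small shell sketch. The file names @file{a.fs} and @file{b.fs} and the
temporary directory are made up for this illustration, and the script only
calls @code{gforth} if it is found on the search path; the three messages
should then appear in command-line order.

```shell
#!/bin/sh
# Illustration only: a.fs and b.fs are hypothetical files created for this demo.
dir=$(mktemp -d)
printf '%s\n' '.( from a.fs) cr' > "$dir/a.fs"
printf '%s\n' '.( from b.fs) cr' > "$dir/b.fs"

# Files and -e code are interpreted in the order given; the final
# "-e bye" leaves Gforth instead of entering interactive mode.
set -- "$dir/a.fs" -e '.( from -e) cr' "$dir/b.fs" -e bye

if command -v gforth >/dev/null 2>&1; then
    gforth "$@"
else
    # Fall back to showing the command line that would have been used.
    echo "gforth not found; would run: gforth $*"
fi

# Remove the demo files again.
rm -rf "$dir"
```

The Forth code passed to @code{-e} is quoted so that the shell hands it to
Gforth as a single argument.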
| |
|
| |
In addition to the @file{gforth} engine, there is also an engine called |
| |
@file{gforth-fast}, which is faster, but gives less informative error |
| |
messages (@pxref{Error messages}). |
| |
|
| In general, the command line looks like this: |
In general, the command line looks like this: |
| |
|
| @example |
@example |
| gforth [initialization options] [image-specific options] |
gforth[-fast] [engine options] [image options] |
| @end example |
@end example |
| |
|
| The initialization options must come before the rest of the command |
The engine options must come before the rest of the command |
| line. They are: |
line. They are: |
| |
|
| @table @code |
@table @code |
| @cindex -i, command-line option |
@cindex -i, command-line option |
| @cindex --image-file, command-line option |
@cindex --image-file, command-line option |
| @item --image-file @var{file} |
@item --image-file @i{file} |
| @itemx -i @var{file} |
@itemx -i @i{file} |
| Loads the Forth image @var{file} instead of the default |
Loads the Forth image @i{file} instead of the default |
| @file{gforth.fi} (@pxref{Image Files}). |
@file{gforth.fi} (@pxref{Image Files}). |
| |
|
| |
@cindex --appl-image, command-line option |
| |
@item --appl-image @i{file} |
| |
Loads the image @i{file} and leaves all further command-line arguments |
| |
to the image (instead of processing them as engine options). This is |
| |
useful for executable application images on Unix that are built with |
| |
@code{gforthmi --application ...}. |
| |
|
| @cindex --path, command-line option |
@cindex --path, command-line option |
| @cindex -p, command-line option |
@cindex -p, command-line option |
| @item --path @var{path} |
@item --path @i{path} |
| @itemx -p @var{path} |
@itemx -p @i{path} |
| Uses @var{path} to search for the image file and Forth source code files |
Uses @i{path} to search for the image file and Forth source code files |
| instead of the default in the environment variable @code{GFORTHPATH} or |
instead of the default in the environment variable @code{GFORTHPATH} or |
| the path specified at installation time (e.g., |
the path specified at installation time (e.g., |
| @file{/usr/local/share/gforth/0.2.0:.}). A path is given as a list of |
@file{/usr/local/share/gforth/0.2.0:.}). A path is given as a list of |
| |
|
| @cindex --dictionary-size, command-line option |
@cindex --dictionary-size, command-line option |
| @cindex -m, command-line option |
@cindex -m, command-line option |
| @cindex @var{size} parameters for command-line options |
@cindex @i{size} parameters for command-line options |
| @cindex size of the dictionary and the stacks |
@cindex size of the dictionary and the stacks |
| @item --dictionary-size @var{size} |
@item --dictionary-size @i{size} |
| @itemx -m @var{size} |
@itemx -m @i{size} |
| Allocate @var{size} space for the Forth dictionary instead of |
Allocate @i{size} space for the Forth dictionary instead of |
| using the default specified in the image (typically 256K). The |
using the default specified in the image (typically 256K). The |
| @var{size} specification consists of an integer and a unit (e.g., |
@i{size} specification for this and subsequent options consists of |
| |
an integer and a unit (e.g., |
| @code{4M}). The unit can be one of @code{b} (bytes), @code{e} (element |
@code{4M}). The unit can be one of @code{b} (bytes), @code{e} (element |
| size, in this case Cells), @code{k} (kilobytes), @code{M} (Megabytes), |
size, in this case Cells), @code{k} (kilobytes), @code{M} (Megabytes), |
| @code{G} (Gigabytes), and @code{T} (Terabytes). If no unit is specified, |
@code{G} (Gigabytes), and @code{T} (Terabytes). If no unit is specified, |
| |
|
| @cindex --data-stack-size, command-line option |
@cindex --data-stack-size, command-line option |
| @cindex -d, command-line option |
@cindex -d, command-line option |
| @item --data-stack-size @var{size} |
@item --data-stack-size @i{size} |
| @itemx -d @var{size} |
@itemx -d @i{size} |
| Allocate @var{size} space for the data stack instead of using the |
Allocate @i{size} space for the data stack instead of using the |
| default specified in the image (typically 16K). |
default specified in the image (typically 16K). |
| |
|
| @cindex --return-stack-size, command-line option |
@cindex --return-stack-size, command-line option |
| @cindex -r, command-line option |
@cindex -r, command-line option |
| @item --return-stack-size @var{size} |
@item --return-stack-size @i{size} |
| @itemx -r @var{size} |
@itemx -r @i{size} |
| Allocate @var{size} space for the return stack instead of using the |
Allocate @i{size} space for the return stack instead of using the |
| default specified in the image (typically 15K). |
default specified in the image (typically 15K). |
| |
|
| @cindex --fp-stack-size, command-line option |
@cindex --fp-stack-size, command-line option |
| @cindex -f, command-line option |
@cindex -f, command-line option |
| @item --fp-stack-size @var{size} |
@item --fp-stack-size @i{size} |
| @itemx -f @var{size} |
@itemx -f @i{size} |
| Allocate @var{size} space for the floating point stack instead of |
Allocate @i{size} space for the floating point stack instead of |
| using the default specified in the image (typically 15.5K). In this case |
using the default specified in the image (typically 15.5K). In this case |
| the unit specifier @code{e} refers to floating point numbers. |
the unit specifier @code{e} refers to floating point numbers. |
| |
|
| @cindex --locals-stack-size, command-line option |
@cindex --locals-stack-size, command-line option |
| @cindex -l, command-line option |
@cindex -l, command-line option |
| @item --locals-stack-size @var{size} |
@item --locals-stack-size @i{size} |
| @itemx -l @var{size} |
@itemx -l @i{size} |
| Allocate @var{size} space for the locals stack instead of using the |
Allocate @i{size} space for the locals stack instead of using the |
| default specified in the image (typically 14.5K). |
default specified in the image (typically 14.5K). |
| |
|
| @cindex -h, command-line option |
@cindex -h, command-line option |
| default image @file{gforth.fi} consist of a sequence of filenames and |
default image @file{gforth.fi} consist of a sequence of filenames and |
| @code{-e @var{forth-code}} options that are interpreted in the sequence |
@code{-e @var{forth-code}} options that are interpreted in the sequence |
| in which they are given. The @code{-e @var{forth-code}} or |
in which they are given. The @code{-e @var{forth-code}} or |
| @code{--evaluate @var{forth-code}} option evaluates the forth |
@code{--evaluate @var{forth-code}} option evaluates the Forth |
| code. This option takes only one argument; if you want to evaluate more |
code. This option takes only one argument; if you want to evaluate more |
| Forth words, you have to quote them or use several @code{-e}s. To exit |
Forth words, you have to quote them or use @code{-e} several times. To exit |
| after processing the command line (instead of entering interactive mode) |
after processing the command line (instead of entering interactive mode) |
| append @code{-e bye} to the command line. |
append @code{-e bye} to the command line. |
| |
|
| @cindex versions, invoking other versions of Gforth |
@cindex versions, invoking other versions of Gforth |
| If you have several versions of Gforth installed, @code{gforth} will |
If you have several versions of Gforth installed, @code{gforth} will |
| invoke the version that was installed last. @code{gforth-@var{version}} |
invoke the version that was installed last. @code{gforth-@i{version}} |
| invokes a specific version. You may want to use the option |
invokes a specific version. If your environment contains the variable |
| @code{--path}, if your environment contains the variable |
@code{GFORTHPATH}, you may want to override it by using the |
| @code{GFORTHPATH}. |
@code{--path} option. |
| |
|
| Not yet implemented: |
Not yet implemented: |
| On startup the system first executes the system initialization file |
On startup the system first executes the system initialization file |
| (unless the option @code{--no-init-file} is given; note that the system |
(unless the option @code{--no-init-file} is given; note that the system |
| resulting from using this option may not be ANS Forth conformant). Then |
resulting from using this option may not be ANS Forth conformant). Then |
| the user initialization file @file{.gforth.fs} is executed, unless the |
the user initialization file @file{.gforth.fs} is executed, unless the |
| option @code{--no-rc} is given; this file is first searched in @file{.}, |
option @code{--no-rc} is given; this file is searched for in @file{.}, |
| then in @file{~}, then in the normal path (see above). |
then in @file{~}, then in the normal path (see above). |
| |
|
| @node Words, Tools, Invoking Gforth, Top |
|
| @chapter Forth Words |
|
| @cindex Words |
|
| |
|
| @menu |
|
| * Notation:: |
|
| * Arithmetic:: |
|
| * Stack Manipulation:: |
|
| * Memory:: |
|
| * Control Structures:: |
|
| * Locals:: |
|
| * Defining Words:: |
|
| * Structures:: |
|
| * Object-oriented Forth:: |
|
| * Tokens for Words:: |
|
| * Wordlists:: |
|
| * Files:: |
|
| * Including Files:: |
|
| * Blocks:: |
|
| * Other I/O:: |
|
| * Programming Tools:: |
|
| * Assembler and Code Words:: |
|
| * Threading Words:: |
|
| @end menu |
|
| |
|
| @node Notation, Arithmetic, Words, Words |
|
| @section Notation |
|
| @cindex notation of glossary entries |
|
| @cindex format of glossary entries |
|
| @cindex glossary notation format |
|
| @cindex word glossary entry format |
|
| |
|
| The Forth words are described in this section in the glossary notation |
@comment ---------------------------------------------- |
| that has become a de-facto standard for Forth texts, i.e., |
@node Leaving Gforth, Command-line editing, Invoking Gforth, Gforth Environment |
| |
@section Leaving Gforth |
| |
@cindex Gforth - leaving |
| |
@cindex leaving Gforth |
| |
|
| |
You can leave Gforth by typing @code{bye} or @kbd{Ctrl-d} (at the start |
| |
of a line) or (if you invoked Gforth with the @code{--die-on-signal} |
| |
option) @kbd{Ctrl-c}. When you leave Gforth, all of your definitions and |
| |
data are discarded. For ways of saving the state of the system before |
| |
leaving Gforth see @ref{Image Files}. |
| |
|
| |
doc-bye |
| |
|
| |
|
| |
@comment ---------------------------------------------- |
| |
@node Command-line editing, Environment variables, Leaving Gforth, Gforth Environment |
| |
@section Command-line editing |
| |
@cindex command-line editing |
| |
|
| |
Gforth maintains a history file that records every line that you type to |
| |
the text interpreter. This file is preserved between sessions, and is |
| |
used to provide a command-line recall facility; if you type @kbd{Ctrl-p} |
| |
repeatedly you can recall successively older commands from this (or |
| |
previous) session(s). The full list of command-line editing facilities is: |
| |
|
| @format |
@itemize @bullet |
| @var{word} @var{Stack effect} @var{wordset} @var{pronunciation} |
@item |
| @end format |
@kbd{Ctrl-p} (``previous'') (or up-arrow) to recall successively older |
| @var{Description} |
commands from the history buffer. |
| |
@item |
| |
@kbd{Ctrl-n} (``next'') (or down-arrow) to recall successively newer commands |
| |
from the history buffer. |
| |
@item |
| |
@kbd{Ctrl-f} (or right-arrow) to move the cursor right, non-destructively. |
| |
@item |
| |
@kbd{Ctrl-b} (or left-arrow) to move the cursor left, non-destructively. |
| |
@item |
| |
@kbd{Ctrl-h} (backspace) to delete the character to the left of the cursor, |
| |
closing up the line. |
| |
@item |
| |
@kbd{Ctrl-k} to delete (``kill'') from the cursor to the end of the line. |
| |
@item |
| |
@kbd{Ctrl-a} to move the cursor to the start of the line. |
| |
@item |
| |
@kbd{Ctrl-e} to move the cursor to the end of the line. |
| |
@item |
| |
@key{RET} (@kbd{Ctrl-m}) or @key{LFD} (@kbd{Ctrl-j}) to submit the current |
| |
line. |
| |
@item |
| |
@key{TAB} to step through all possible full-word completions of the word |
| |
currently being typed. |
| |
@item |
| |
@kbd{Ctrl-d} on an empty line to terminate Gforth (gracefully, |
| |
using @code{bye}). |
| |
@item |
| |
@kbd{Ctrl-x} (or @kbd{Ctrl-d} on a non-empty line) to delete the |
| |
character under the cursor. |
| |
@end itemize |
| |
|
| @table @var |
When editing, displayable characters are inserted to the left of the |
| @item word |
cursor position; the line is always in ``insert'' (as opposed to |
| @cindex case insensitivity |
``overstrike'') mode. |
| The name of the word. BTW, Gforth is case insensitive, so you can |
|
| type the words in lower case (however, @pxref{core-idef}). |
|
| |
|
| @item Stack effect |
@cindex history file |
| @cindex stack effect |
@cindex @file{.gforth-history} |
| The stack effect is written in the notation @code{@var{before} -- |
On Unix systems, the history file is @file{~/.gforth-history} by |
| @var{after}}, where @var{before} and @var{after} describe the top of |
default@footnote{i.e. it is stored in the user's home directory.}. You |
| stack entries before and after the execution of the word. The rest of |
can find out the name and location of your history file using: |
| the stack is not touched by the word. The top of stack is rightmost, |
|
| i.e., a stack sequence is written as it is typed in. Note that Gforth |
|
| uses a separate floating point stack, but a unified stack |
|
| notation. Also, return stack effects are not shown in @var{stack |
|
| effect}, but in @var{Description}. The name of a stack item describes |
|
| the type and/or the function of the item. See below for a discussion of |
|
| the types. |
|
| |
|
| All words have two stack effects: A compile-time stack effect and a |
@example |
| run-time stack effect. The compile-time stack-effect of most words is |
history-file type \ Unix-class systems |
| @var{ -- }. If the compile-time stack-effect of a word deviates from |
|
| this standard behaviour, or the word does other unusual things at |
|
| compile time, both stack effects are shown; otherwise only the run-time |
|
| stack effect is shown. |
|
| |
|
| @cindex pronunciation of words |
history-file type \ Other systems |
| @item pronunciation |
history-dir type |
| How the word is pronounced. |
@end example |
| |
|
| @cindex wordset |
If you enter long definitions by hand, you can use a text editor to |
| @item wordset |
paste them out of the history file into a Forth source file for reuse at |
| The ANS Forth standard is divided into several wordsets. A standard |
a later time. |
| system need not support all of them. So, the fewer wordsets your program |
|
| uses the more portable it will be in theory. However, we suspect that |
|
| most ANS Forth systems on personal machines will feature all |
|
| wordsets. Words that are not defined in the ANS standard have |
|
| @code{gforth} or @code{gforth-internal} as wordset. @code{gforth} |
|
| describes words that will work in future releases of Gforth; |
|
| @code{gforth-internal} words are more volatile. Environmental query |
|
| strings are also displayed like words; you can recognize them by the |
|
| @code{environment} in the wordset field. |
|
| |
|
| @item Description |
Gforth never trims the size of the history file, so you should do this |
| A description of the behaviour of the word. |
periodically, if necessary. |
| @end table |
|
| |
|
| @cindex types of stack items |
@comment this is all defined in history.fs |
| @cindex stack item types |
@comment NAC TODO the ctrl-D behaviour can either do a bye or a beep.. how is that option |
| The type of a stack item is specified by the character(s) the name |
@comment chosen? |
| starts with: |
|
| |
|
| @table @code |
|
| @item f |
|
| @cindex @code{f}, stack item type |
|
| Boolean flags, i.e. @code{false} or @code{true}. |
|
| @item c |
|
| @cindex @code{c}, stack item type |
|
| Char |
|
| @item w |
|
| @cindex @code{w}, stack item type |
|
| Cell, can contain an integer or an address |
|
| @item n |
|
| @cindex @code{n}, stack item type |
|
| signed integer |
|
| @item u |
|
| @cindex @code{u}, stack item type |
|
| unsigned integer |
|
| @item d |
|
| @cindex @code{d}, stack item type |
|
| double sized signed integer |
|
| @item ud |
|
| @cindex @code{ud}, stack item type |
|
| double sized unsigned integer |
|
| @item r |
|
| @cindex @code{r}, stack item type |
|
| Float (on the FP stack) |
|
| @item a_ |
|
| @cindex @code{a_}, stack item type |
|
| Cell-aligned address |
|
| @item c_ |
|
| @cindex @code{c_}, stack item type |
|
| Char-aligned address (note that a Char may have two bytes in Windows NT) |
|
| @item f_ |
|
| @cindex @code{f_}, stack item type |
|
| Float-aligned address |
|
| @item df_ |
|
| @cindex @code{df_}, stack item type |
|
| Address aligned for IEEE double precision float |
|
| @item sf_ |
|
| @cindex @code{sf_}, stack item type |
|
| Address aligned for IEEE single precision float |
|
| @item xt |
|
| @cindex @code{xt}, stack item type |
|
| Execution token, same size as Cell |
|
| @item wid |
|
| @cindex @code{wid}, stack item type |
|
| Wordlist ID, same size as Cell |
|
| @item f83name |
|
| @cindex @code{f83name}, stack item type |
|
| Pointer to a name structure |
|
| @item " |
|
| @cindex @code{"}, stack item type |
|
| string in the input stream (not on the stack). The terminating character |
|
| is a blank by default. If it is not a blank, it is shown in @code{<>} |
|
| quotes. |
|
| @end table |
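
As a small illustration of this notation, here is a sketch of two colon
definitions with stack-effect comments. The names @code{squared} and
@code{fsquared} are made up for this example and are not Gforth words.
The prefixes @code{n} and @code{r} mark a signed integer and a float;
the float version uses the same unified notation even though its
operand lives on the floating point stack.

@example
: squared  ( n1 -- n2 )  dup * ;     \ n2 is n1 times n1
: fsquared ( r1 -- r2 )  fdup f* ;   \ r1 and r2 are on the FP stack
3 squared .                          \ prints 9
@end example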
|
| |
|
| @node Arithmetic, Stack Manipulation, Notation, Words |
@comment ---------------------------------------------- |
| @section Arithmetic |
@node Environment variables, Gforth Files, Command-line editing, Gforth Environment |
| @cindex arithmetic words |
@section Environment variables |
| |
@cindex environment variables |
| |
|
| @cindex division with potentially negative operands |
Gforth uses these environment variables: |
| Forth arithmetic is not checked, i.e., you will not hear about integer |
|
| overflow on addition or multiplication, you may hear about division by |
|
| zero if you are lucky. The operator is written after the operands, but |
|
| the operands are still in the original order. I.e., the infix @code{2-1} |
|
| corresponds to @code{2 1 -}. Forth offers a variety of division |
|
| operators. If you perform division with potentially negative operands, |
|
| you do not want to use @code{/} or @code{/mod} with its undefined |
|
| behaviour, but rather @code{fm/mod} or @code{sm/rem} (probably the |
|
| former, @pxref{Mixed precision}). |
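
The difference between the two standard division words @code{fm/mod}
(floored) and @code{sm/rem} (symmetric) only shows up with negative
operands. The following sketch divides -7 by 2 both ways. Both words
take a double-cell dividend, so @code{s>d} converts -7 first. Both
leave the quotient on top of the stack, so the first @code{.} prints the
quotient and the second prints the remainder.

@example
-7 s>d 2 fm/mod . .   \ floored:   quotient -4, remainder 1
-7 s>d 2 sm/rem . .   \ symmetric: quotient -3, remainder -1
@end example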
|
| |
|
| @menu |
@itemize @bullet |
| * Single precision:: |
@item |
| * Bitwise operations:: |
@cindex @code{GFORTHHIST} -- environment variable |
| * Mixed precision:: operations with single and double-cell integers |
@code{GFORTHHIST} -- (Unix systems only) specifies the directory in which to |
| * Double precision:: Double-cell integer arithmetic |
open/create the history file, @file{.gforth-history}. Default: |
| * Floating Point:: |
@code{$HOME}. |
| @end menu |
|
| |
|
| @node Single precision, Bitwise operations, Arithmetic, Arithmetic |
@item |
| @subsection Single precision |
@cindex @code{GFORTHPATH} -- environment variable |
| @cindex single precision arithmetic words |
@code{GFORTHPATH} -- specifies the path used when searching for the Gforth image file and |
| |
for Forth source-code files. |
| |
|
| doc-+ |
@item |
| doc-- |
@cindex @code{GFORTH} -- environment variable |
| doc-* |
@code{GFORTH} -- used by @file{gforthmi}, see @ref{gforthmi}. |
| doc-/ |
|
| doc-mod |
|
| doc-/mod |
|
| doc-negate |
|
| doc-abs |
|
| doc-min |
|
| doc-max |
|
| |
|
| @node Bitwise operations, Mixed precision, Single precision, Arithmetic |
@item |
| @subsection Bitwise operations |
@cindex @code{GFORTHD} -- environment variable |
| @cindex bitwise operation words |
@code{GFORTHD} -- used by @file{gforthmi}, see @ref{gforthmi}. |
| |
|
| doc-and |
@item |
| doc-or |
@cindex @code{TMP}, @code{TEMP} -- environment variable |
| doc-xor |
@code{TMP}, @code{TEMP} -- (non-Unix systems only) used as a potential |
| doc-invert |
location for the history file. |
| doc-2* |
@end itemize |
| doc-2/ |
|
| |
|
| @node Mixed precision, Double precision, Bitwise operations, Arithmetic |
|
| @subsection Mixed precision |
|
| @cindex mixed precision arithmetic words |
|
| |
|
| doc-m+ |
|
| doc-*/ |
|
| doc-*/mod |
|
| doc-m* |
|
| doc-um* |
|
| doc-m*/ |
|
| doc-um/mod |
|
| doc-fm/mod |
|
| doc-sm/rem |
|
| |
|
| @node Double precision, Floating Point, Mixed precision, Arithmetic |
@comment also POSIXELY_CORRECT LINES COLUMNS HOME but no interest in |
| @subsection Double precision |
@comment mentioning these. |
| @cindex double precision arithmetic words |
|
| |
|
| @cindex double-cell numbers, input format |
All the Gforth environment variables default to sensible values if they |
| @cindex input format for double-cell numbers |
are not set. |
| The outer (aka text) interpreter converts numbers containing a dot into |
|
| a double precision number. Note that only numbers with the dot as last |
|
| character are standard-conforming. |
|
| |
|
| doc-d+ |
|
| doc-d- |
|
| doc-dnegate |
|
| doc-dabs |
|
| doc-dmin |
|
| doc-dmax |
|
| |
|
| @node Floating Point, , Double precision, Arithmetic |
@comment ---------------------------------------------- |
| @subsection Floating Point |
@node Gforth Files, Startup speed, Environment variables, Gforth Environment |
| @cindex floating point arithmetic words |
@section Gforth files |
| |
@cindex Gforth files |
| |
|
| @cindex floating-point numbers, input format |
When you install Gforth on a Unix system, it installs files in these |
| @cindex input format for floating-point numbers |
locations by default: |
| The format of floating point numbers recognized by the outer (aka text) |
|
| interpreter is: a signed decimal number, possibly containing a decimal |
|
| point (@code{.}), followed by @code{E} or @code{e}, optionally followed |
|
| by a signed integer (the exponent). E.g., @code{1e} is the same as |
|
| @code{+1.0e+0}. Note that a number without @code{e} is not interpreted |
|
| as floating-point number, but as double (if the number contains a |
|
| @code{.}) or single precision integer. Also, conversions between string |
|
| and floating point numbers always use base 10, irrespective of the value |
|
| of @code{BASE} (in Gforth; for the standard this is an ambiguous |
|
| condition). If @code{BASE} contains a value greater than 14, the |
|
| @code{E} may be interpreted as digit and the number will be interpreted |
|
| as integer, unless it has a signed exponent (both @code{+} and @code{-} |
|
| are allowed as signs). |
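
The rules above can be tried out at the prompt. This is a minimal
sketch; the comments describe how the text interpreter classifies each
number according to the rules given here, not the exact output format.

@example
1e f.                \ float: 1e is the same as +1.0e+0
1.5e3 f.             \ float with exponent 3
123. d.              \ no e, but a dot: double-cell integer
123 .                \ no e, no dot: single-cell integer
hex 1e . decimal     \ BASE is 16, so 1e is the integer 1E (30 decimal)
@end example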
|
| |
|
| @cindex angles in trigonometric operations |
@itemize @bullet |
| @cindex trigonometric operations |
@item |
| Angles in floating point operations are given in radians (a full circle |
@file{/usr/local/bin/gforth} |
| has 2 pi radians). Note that Gforth has a separate floating point |
@item |
| stack, but we use the unified notation. |
@file{/usr/local/bin/gforthmi} |
| |
@item |
| |
@file{/usr/local/man/man1/gforth.1} - man page. |
| |
@item |
| |
@file{/usr/local/info} - the Info version of this manual. |
| |
@item |
| |
@file{/usr/local/lib/gforth/<version>/...} - Gforth @file{.fi} files. |
| |
@item |
| |
@file{/usr/local/share/gforth/<version>/TAGS} - Emacs TAGS file. |
| |
@item |
| |
@file{/usr/local/share/gforth/<version>/...} - Gforth source files. |
| |
@item |
| |
@file{.../emacs/site-lisp/gforth.el} - Emacs gforth mode. |
| |
@end itemize |
| |
|
| @cindex floating-point arithmetic, pitfalls |
You can select different places for installation by using |
| Floating point numbers have a number of unpleasant surprises for the |
@code{configure} options (listed with @code{configure --help}). |
| unwary (e.g., floating point addition is not associative) and even a few |
|
| for the wary. You should not use them unless you know what you are doing |
|
| or you don't care that the results you get are totally bogus. If you |
|
| want to learn about the problems of floating point numbers (and how to |
|
| avoid them), you might start with @cite{David Goldberg, What Every |
|
| Computer Scientist Should Know About Floating-Point Arithmetic, ACM |
|
| Computing Surveys 23(1):5@minus{}48, March 1991}. |
|
| |
|
| doc-f+ |
@comment ---------------------------------------------- |
| doc-f- |
@node Startup speed, , Gforth Files, Gforth Environment |
| doc-f* |
@section Startup speed |
| doc-f/ |
@cindex Startup speed |
| doc-fnegate |
@cindex speed, startup |
| doc-fabs |
|
| doc-fmax |
If Gforth is used for CGI scripts or in shell scripts, its startup |
| doc-fmin |
speed may become a problem. On a 300MHz 21064a under Linux-2.2.13 with |
| doc-floor |
glibc-2.0.7, @code{gforth -e bye} takes about 24.6ms user and 11.3ms |
| doc-fround |
system time. |
| doc-f** |
|
| doc-fsqrt |
If startup speed is a problem, you may consider the following ways to |
| doc-fexp |
improve it; or you may consider ways to reduce the number of startups |
| doc-fexpm1 |
(for example, by using Fast-CGI). |
| doc-fln |
|
| doc-flnp1 |
The first step to improve startup speed is to statically link Gforth, by |
| doc-flog |
building it with @code{XLDFLAGS=-static}. This requires more memory for |
| doc-falog |
the code and will therefore slow down the first invocation, but |
| doc-fsin |
subsequent invocations avoid the dynamic linking overhead. Another |
| doc-fcos |
disadvantage is that Gforth won't profit from library upgrades. As a |
| doc-fsincos |
result, @code{gforth-static -e bye} takes about 17.1ms user and |
| doc-ftan |
8.2ms system time. |
| doc-fasin |
|
| doc-facos |
The next step to improve startup speed is to use a non-relocatable image |
| doc-fatan |
(@pxref{Non-Relocatable Image Files}). You can create this image with |
| doc-fatan2 |
@code{gforth -e "savesystem gforthnr.fi bye"} and later use it with |
| doc-fsinh |
@code{gforth -i gforthnr.fi ...}. This avoids the relocation overhead |
| doc-fcosh |
and a part of the copy-on-write overhead. The disadvantage is that the |
| doc-ftanh |
non-relocatable image does not work if the OS gives Gforth a different |
| doc-fasinh |
address for the dictionary, for whatever reason; so you had better provide a |
| doc-facosh |
fallback on a relocatable image. @code{gforth-static -i gforthnr.fi -e |
| doc-fatanh |
bye} takes about 15.3ms user and 7.5ms system time. |
| |
|
| |
The final step is to disable dictionary hashing in Gforth. Gforth |
| |
builds the hash table on startup, which takes much of the startup |
| |
overhead. You can do this by commenting out the @code{include hash.fs} |
| |
in @file{startup.fs} and everything that requires @file{hash.fs} (at the |
| |
moment @file{table.fs} and @file{ekey.fs}) and then doing @code{make}. |
| |
The disadvantages are that functionality like @code{table} and |
| |
@code{ekey} is missing and that text interpretation (e.g., compiling) |
| |
now takes much longer. So, you should only use this method if there is |
| |
no significant text interpretation to perform (the script should be |
| |
compiled into the image, amongst other things). @code{gforth-static -i |
| |
gforthnrnh.fi -e bye} takes about 2.1ms user and 6.1ms system time. |
| |
|
| @node Stack Manipulation, Memory, Arithmetic, Words |
@c ****************************************************************** |
| @section Stack Manipulation |
@node Tutorial, Introduction, Gforth Environment, Top |
| @cindex stack manipulation words |
@chapter Forth Tutorial |
| |
@cindex Tutorial |
| |
@cindex Forth Tutorial |
| |
|
| |
@c Topics from nac's Introduction that could be mentioned: |
| |
@c press <ret> after each line |
| |
@c Prompt |
| |
@c numbers vs. words in dictionary on text interpretation |
| |
@c what happens on redefinition |
| |
@c parsing words (in particular, defining words) |
| |
|
| |
This tutorial can be used with any ANS-compliant Forth; any |
| |
Gforth-specific features are marked as such and you can skip them if you |
| |
work with another Forth. This tutorial does not explain all features of |
| |
Forth, just enough to get you started and give you some ideas about the |
| |
facilities available in Forth. Read the rest of the manual and the |
| |
standard when you are through this. |
| |
|
| |
The intended way to use this tutorial is that you work through it while |
| |
sitting in front of the console, take a look at the examples and predict |
| |
what they will do, then try them out; if the outcome is not as expected, |
| |
find out why (e.g., by trying out variations of the example), so you |
| |
understand what's going on. There are also some assignments that you |
| |
should solve. |
| |
|
| @cindex floating-point stack in the standard |
This tutorial assumes that you have programmed before and know what, |
| Gforth has a data stack (aka parameter stack) for characters, cells, |
e.g., a loop is. |
| addresses, and double cells, a floating point stack for floating point |
|
| numbers, a return stack for storing the return addresses of colon |
|
| definitions and other data, and a locals stack for storing local |
|
| variables. Note that while every sane Forth has a separate floating |
|
| point stack, this is not strictly required; an ANS Forth system could |
|
| theoretically keep floating point numbers on the data stack. As an |
|
| additional difficulty, you don't know how many cells a floating point |
|
| number takes. It is reportedly possible to write words in a way that |
|
| they work also for a unified stack model, but we do not recommend trying |
|
| it. Instead, just say that your program has an environmental dependency |
|
| on a separate FP stack. |
|
| |
|
| @cindex return stack and locals |
@c !! explain compat library |
| @cindex locals and return stack |
|
| Also, a Forth system is allowed to keep the local variables on the |
|
| return stack. This is reasonable, as local variables usually eliminate |
|
| the need to use the return stack explicitly. So, if you want to produce |
|
| a standard-compliant program and if you are using local variables in a |
|
| word, forget about return stack manipulations in that word (see the |
|
| standard document for the exact rules). |
|
| |
|
| @menu |
@menu |
| * Data stack:: |
* Starting Gforth Tutorial:: |
| * Floating point stack:: |
* Syntax Tutorial:: |
| * Return stack:: |
* Crash Course Tutorial:: |
| * Locals stack:: |
* Stack Tutorial:: |
| * Stack pointer manipulation:: |
* Arithmetics Tutorial:: |
| |
* Stack Manipulation Tutorial:: |
| |
* Using files for Forth code Tutorial:: |
| |
* Comments Tutorial:: |
| |
* Colon Definitions Tutorial:: |
| |
* Decompilation Tutorial:: |
| |
* Stack-Effect Comments Tutorial:: |
| |
* Types Tutorial:: |
| |
* Factoring Tutorial:: |
| |
* Designing the stack effect Tutorial:: |
| |
* Local Variables Tutorial:: |
| |
* Conditional execution Tutorial:: |
| |
* Flags and Comparisons Tutorial:: |
| |
* General Loops Tutorial:: |
| |
* Counted loops Tutorial:: |
| |
* Recursion Tutorial:: |
| |
* Leaving definitions or loops Tutorial:: |
| |
* Return Stack Tutorial:: |
| |
* Memory Tutorial:: |
| |
* Characters and Strings Tutorial:: |
| |
* Alignment Tutorial:: |
| |
* Interpretation and Compilation Semantics and Immediacy Tutorial:: |
| |
* Execution Tokens Tutorial:: |
| |
* Exceptions Tutorial:: |
| |
* Defining Words Tutorial:: |
| |
* Arrays and Records Tutorial:: |
| |
* POSTPONE Tutorial:: |
| |
* Literal Tutorial:: |
| |
* Advanced macros Tutorial:: |
| |
* Compilation Tokens Tutorial:: |
| |
* Wordlists and Search Order Tutorial:: |
| @end menu |
@end menu |
| |
|
| @node Data stack, Floating point stack, Stack Manipulation, Stack Manipulation |
@node Starting Gforth Tutorial, Syntax Tutorial, Tutorial, Tutorial |
| @subsection Data stack |
@section Starting Gforth |
| @cindex data stack manipulation words |
@cindex starting Gforth tutorial |
| @cindex stack manipulations words, data stack |
You can start Gforth by typing its name: |
| |
|
| doc-drop |
@example |
| doc-nip |
gforth |
| doc-dup |
@end example |
| doc-over |
|
| doc-tuck |
|
| doc-swap |
|
| doc-rot |
|
| doc--rot |
|
| doc-?dup |
|
| doc-pick |
|
| doc-roll |
|
| doc-2drop |
|
| doc-2nip |
|
| doc-2dup |
|
| doc-2over |
|
| doc-2tuck |
|
| doc-2swap |
|
| doc-2rot |
|
| |
|
| @node Floating point stack, Return stack, Data stack, Stack Manipulation |
That puts you into interactive mode; you can leave Gforth by typing |
| @subsection Floating point stack |
@code{bye}. While in Gforth, you can edit the command line and access |
| @cindex floating-point stack manipulation words |
the command line history with cursor keys, similar to bash. |
| @cindex stack manipulation words, floating-point stack |
|
| |
|
| doc-fdrop |
|
| doc-fnip |
|
| doc-fdup |
|
| doc-fover |
|
| doc-ftuck |
|
| doc-fswap |
|
| doc-frot |
|
| |
|
| @node Return stack, Locals stack, Floating point stack, Stack Manipulation |
@node Syntax Tutorial, Crash Course Tutorial, Starting Gforth Tutorial, Tutorial |
| @subsection Return stack |
@section Syntax |
| @cindex return stack manipulation words |
@cindex syntax tutorial |
| @cindex stack manipulation words, return stack |
|
| |
|
| doc->r |
A @dfn{word} is a sequence of arbitrary characters (except white |
| doc-r> |
space). Words are separated by white space. E.g., each of the |
| doc-r@ |
following lines contains exactly one word: |
| doc-rdrop |
|
| doc-2>r |
|
| doc-2r> |
|
| doc-2r@ |
|
| doc-2rdrop |
|
| |
|
| @node Locals stack, Stack pointer manipulation, Return stack, Stack Manipulation |
@example |
| @subsection Locals stack |
word |
| |
!@@#$%^&*() |
| |
1234567890 |
| |
5!a |
| |
@end example |
| |
|
| @node Stack pointer manipulation, , Locals stack, Stack Manipulation |
A frequent beginner's error is to leave out necessary white space, |
| @subsection Stack pointer manipulation |
resulting in an error like @samp{Undefined word}; so if you see such an |
| @cindex stack pointer manipulation words |
error, check if you have put spaces wherever necessary. |
| |
|
| doc-sp@ |
@example |
| doc-sp! |
." hello, world" \ correct |
| doc-fp@ |
."hello, world" \ gives an "Undefined word" error |
| doc-fp! |
@end example |
| doc-rp@ |
|
| doc-rp! |
|
| doc-lp@ |
|
| doc-lp! |
|
| |
|
| @node Memory, Control Structures, Stack Manipulation, Words |
Gforth and most other Forth systems ignore differences in case (they are |
| @section Memory |
case-insensitive), i.e., @samp{word} is the same as @samp{Word}. If |
| @cindex Memory words |
your system is case-sensitive, you may have to type all the examples |
| |
given here in upper case. |
| |
|
| @menu |
|
| * Memory Access:: |
|
| * Address arithmetic:: |
|
| * Memory Blocks:: |
|
| @end menu |
|
| |
|
| @node Memory Access, Address arithmetic, Memory, Memory |
@node Crash Course Tutorial, Stack Tutorial, Syntax Tutorial, Tutorial |
| @subsection Memory Access |
@section Crash Course |
| @cindex memory access words |
|
| |
|
| doc-@ |
Type |
| doc-! |
|
| doc-+! |
|
| doc-c@ |
|
| doc-c! |
|
| doc-2@ |
|
| doc-2! |
|
| doc-f@ |
|
| doc-f! |
|
| doc-sf@ |
|
| doc-sf! |
|
| doc-df@ |
|
| doc-df! |
|
| |
|
| @node Address arithmetic, Memory Blocks, Memory Access, Memory |
@example |
| @subsection Address arithmetic |
0 0 ! |
| @cindex address arithmetic words |
here execute |
| |
' catch >body 20 erase abort |
| |
' (quit) >body 20 erase |
| |
@end example |
| |
|
| ANS Forth does not specify the sizes of the data types. Instead, it |
The last two examples are guaranteed to destroy parts of Gforth (and |
| offers a number of words for computing sizes and doing address |
most other systems), so you had better leave Gforth afterwards (if it has |
| arithmetic. Basically, address arithmetic is performed in terms of |
not finished by itself). On some systems you may have to kill gforth |
| address units (aus); on most systems the address unit is one byte. Note |
from outside (e.g., in Unix with @code{kill}). |
| that a character may have more than one au, so @code{chars} is not a noop |
|
| (on systems where it is a noop, it compiles to nothing). |
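
As a sketch of portable address arithmetic, the following reserves
space in units computed by @code{cells} and @code{chars} instead of
hard-coded byte counts. The names @code{cellbuf} and @code{charbuf} are
made up for this example.

@example
create cellbuf  10 cells allot    \ room for 10 cells
42 cellbuf 3 cells + !            \ store 42 in the cell with index 3
cellbuf 3 cells + @@ .             \ fetch it again; prints 42
create charbuf  80 chars allot    \ room for 80 characters
@end example

Because the offsets are computed with @code{cells}, the same source
works unchanged on systems with different cell sizes.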
|
| |
|
| @cindex alignment of addresses for types |
Now that you know how to produce crashes (and that there's not much to |
| ANS Forth also defines words for aligning addresses for specific |
them), let's learn how to produce meaningful programs. |
| types. Many computers require that accesses to specific data types |
|
| must only occur at specific addresses; e.g., that cells may only be |
|
| accessed at addresses divisible by 4. Even if a machine allows unaligned |
|
| accesses, it can usually perform aligned accesses faster. |
|
| |
|
| For the performance-conscious: alignment operations are usually only |
|
| necessary during the definition of a data structure, not during the |
|
| (more frequent) accesses to it. |
|
| |
|
| ANS Forth defines no words for character-aligning addresses. This is not |
@node Stack Tutorial, Arithmetics Tutorial, Crash Course Tutorial, Tutorial |
| an oversight, but reflects the fact that addresses that are not |
@section Stack |
| char-aligned have no use in the standard and therefore will not be |
@cindex stack tutorial |
| created. |
|
| |
|
| @cindex @code{CREATE} and alignment |
The most obvious feature of Forth is the stack. When you type in a |
| The standard guarantees that addresses returned by @code{CREATE}d words |
number, it is pushed on the stack. You can display the content of the |
| are cell-aligned; in addition, Gforth guarantees that these addresses |
stack with @code{.s}. |
| are aligned for all purposes. |
|
| |
|
| Note that the standard defines a word @code{char}, which has nothing to |
@example |
| do with address arithmetic. |
1 2 .s |
| |
3 .s |
| |
@end example |
| |
|
| doc-chars |
@code{.s} displays the top-of-stack to the right, i.e., the numbers |
| doc-char+ |
appear in @code{.s} output as they appeared in the input. |
| doc-cells |
|
| doc-cell+ |
|
| doc-cell |
|
| doc-align |
|
| doc-aligned |
|
| doc-floats |
|
| doc-float+ |
|
| doc-float |
|
| doc-falign |
|
| doc-faligned |
|
| doc-sfloats |
|
| doc-sfloat+ |
|
| doc-sfalign |
|
| doc-sfaligned |
|
| doc-dfloats |
|
| doc-dfloat+ |
|
| doc-dfalign |
|
| doc-dfaligned |
|
| doc-maxalign |
|
| doc-maxaligned |
|
| doc-cfalign |
|
| doc-cfaligned |
|
| doc-address-unit-bits |
|
| |
|
| @node Memory Blocks, , Address arithmetic, Memory |
You can print the top of stack element with @code{.}. |
| @subsection Memory Blocks |
|
| @cindex memory block words |
|
| |
|
| doc-move |
@example |
| doc-erase |
1 2 3 . . . |
| |
@end example |
| |
|
| While the previous words work on address units, the rest works on |
In general, words consume their stack arguments (@code{.s} is an |
| characters. |
exception). |
| |
|
| doc-cmove |
@assignment |
| doc-cmove> |
What does the stack contain after @code{5 6 7 .}? |
| doc-fill |
@endassignment |
| doc-blank |
|
| |
|
| @node Control Structures, Locals, Memory, Words |
|
| @section Control Structures |
|
| @cindex control structures |
|
| |
|
| Control structures in Forth cannot be used in interpret state, only in |
@node Arithmetics Tutorial, Stack Manipulation Tutorial, Stack Tutorial, Tutorial |
| compile state@footnote{More precisely, they have no interpretation |
@section Arithmetics |
| semantics (@pxref{Interpretation and Compilation Semantics})}, i.e., in |
@cindex arithmetics tutorial |
| a colon definition. We do not like this limitation, but have not seen a |
|
| satisfying way around it yet, although many schemes have been proposed. |
|
| |
|
| @menu |
The words @code{+}, @code{-}, @code{*}, @code{/}, and @code{mod} always |
| * Selection:: |
operate on the top two stack items: |
| * Simple Loops:: |
|
| * Counted Loops:: |
|
| * Arbitrary control structures:: |
|
| * Calls and returns:: |
|
| * Exception Handling:: |
|
| @end menu |
|
| |
|
| @node Selection, Simple Loops, Control Structures, Control Structures |
|
| @subsection Selection |
|
| @cindex selection control structures |
|
| @cindex control structures for selection |
|
| |
|
| @cindex @code{IF} control structure |
|
| @example |
|
| @var{flag} |
|
| IF |
|
| @var{code} |
|
| ENDIF |
|
| @end example |
|
| or |
|
| @example |
@example |
| @var{flag} |
2 2 .s |
| IF |
+ .s |
| @var{code1} |
. |
| ELSE |
2 1 - . |
| @var{code2} |
7 3 mod . |
| ENDIF |
|
| @end example |
@end example |
| |
|
| You can use @code{THEN} instead of @code{ENDIF}. Indeed, @code{THEN} is |
The operands of @code{-}, @code{/}, and @code{mod} are in the same order |
| standard, and @code{ENDIF} is not, although it is quite popular. We |
as in the corresponding infix expression (this is generally the case in |
| recommend using @code{ENDIF}, because it is less confusing for people |
Forth). |
| who also know other languages (and is not prone to reinforcing negative |
|
| prejudices against Forth in these people). Adding @code{ENDIF} to a |
Parentheses are superfluous (and not available), because the order of |
| system that only supplies @code{THEN} is simple: |
the words unambiguously determines the order of evaluation and the |
| |
operands: |
| |
|
| @example |
@example |
| : endif POSTPONE then ; immediate |
3 4 + 5 * . |
| |
3 4 5 * + . |
| @end example |
@end example |
| |
|
| [According to @cite{Webster's New Encyclopedic Dictionary}, @dfn{then |
@assignment |
| (adv.)} has the following meanings: |
What are the infix expressions corresponding to the Forth code above? |
| @quotation |
Write @code{6-7*8+9} in Forth notation@footnote{This notation is also |
| ... 2b: following next after in order ... 3d: as a necessary consequence |
known as Postfix or RPN (Reverse Polish Notation).}. |
| (if you were there, then you saw them). |
@endassignment |
| @end quotation |
|
| Forth's @code{THEN} has the meaning 2b, whereas @code{THEN} in Pascal |
|
| and many other programming languages has the meaning 3d.] |
|
| |
|
| Gforth also provides the words @code{?dup-if} and @code{?dup-0=-if}, so |
To change the sign, use @code{negate}: |
| you can avoid using @code{?dup}. Using these alternatives is also more |
|
| efficient than using @code{?dup}. Definitions in plain standard Forth |
|
| for @code{ENDIF}, @code{?DUP-IF} and @code{?DUP-0=-IF} are provided in |
|
| @file{compat/control.fs}. |
|
| |
|
| @cindex @code{CASE} control structure |
|
| @example |
@example |
| @var{n} |
2 negate . |
| CASE |
|
| @var{n1} OF @var{code1} ENDOF |
|
| @var{n2} OF @var{code2} ENDOF |
|
| @dots{} |
|
| ENDCASE |
|
| @end example |
@end example |
| |
|
| Executes the first @var{codei}, where the @var{ni} is equal to |
@assignment |
| @var{n}. A default case can be added by simply writing the code after |
Convert -(-3)*4-5 to Forth. |
| the last @code{ENDOF}. It may use @var{n}, which is on top of the stack, |
@endassignment |
| but must not consume it. |
|
| |
|
| @node Simple Loops, Counted Loops, Selection, Control Structures |
@code{/mod} performs both @code{/} and @code{mod}. |
| @subsection Simple Loops |
|
| @cindex simple loops |
|
| @cindex loops without count |
|
| |
|
| @cindex @code{WHILE} loop |
|
| @example |
@example |
| BEGIN |
7 3 /mod . . |
| @var{code1} |
|
| @var{flag} |
|
| WHILE |
|
| @var{code2} |
|
| REPEAT |
|
| @end example |
@end example |
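
To see which result ends up where, here is a small sketch (the numbers
are chosen only for illustration): @code{/mod} leaves the remainder
below the quotient, so the first @code{.} prints the quotient and the
second one prints the remainder.

@example
7 3 /mod .s     \ remainder 1, with quotient 2 on top
.               \ prints the quotient
.               \ prints the remainder
@end example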
| |
|
| @var{code1} is executed and @var{flag} is computed. If it is true, |
Reference: @ref{Arithmetic}. |
| @var{code2} is executed and the loop is restarted; if @var{flag} is |
|
| false, execution continues after the @code{REPEAT}. |
|
| |
@node Stack Manipulation Tutorial, Using files for Forth code Tutorial, Arithmetics Tutorial, Tutorial |
| |
@section Stack Manipulation |
| |
@cindex stack manipulation tutorial |
| |
|
| |
Stack manipulation words rearrange the data on the stack. |
| |
|
| @cindex @code{UNTIL} loop |
|
| @example |
@example |
| BEGIN |
1 .s drop .s |
| @var{code} |
1 .s dup .s drop drop .s |
| @var{flag} |
1 2 .s over .s drop drop drop |
| UNTIL |
1 2 .s swap .s drop drop |
| |
1 2 3 .s rot .s drop drop drop |
| @end example |
@end example |
| |
|
| @var{code} is executed. The loop is restarted if @var{flag} is false. |
These are the most important stack manipulation words. There are also |
| |
variants that manipulate twice as many stack items: |
| |
|
| @cindex endless loop |
|
| @cindex loops, endless |
|
| @example |
@example |
| BEGIN |
1 2 3 4 .s 2swap .s 2drop 2drop |
| @var{code} |
|
| AGAIN |
|
| @end example |
@end example |
| |
|
| This is an endless loop. |
Two more stack manipulation words are: |
| |
|
| @node Counted Loops, Arbitrary control structures, Simple Loops, Control Structures |
|
| @subsection Counted Loops |
|
| @cindex counted loops |
|
| @cindex loops, counted |
|
| @cindex @code{DO} loops |
|
| |
|
| The basic counted loop is: |
|
| @example |
@example |
| @var{limit} @var{start} |
1 2 .s nip .s drop |
| ?DO |
1 2 .s tuck .s 2drop drop |
| @var{body} |
|
| LOOP |
|
| @end example |
@end example |
| |
|
| This performs one iteration for every integer, starting from @var{start} |
@assignment |
| and up to, but excluding @var{limit}. The counter, aka index, can be |
Replace @code{nip} and @code{tuck} with combinations of other stack |
| accessed with @code{i}. E.g., the loop |
manipulation words. |
| |
|
| @example |
@example |
| 10 0 ?DO |
Given: How do you get: |
| i . |
1 2 3 3 2 1 |
| LOOP |
1 2 3 1 2 3 2 |
| |
1 2 3 1 2 3 3 |
| |
1 2 3 1 3 3 |
| |
1 2 3 2 1 3 |
| |
1 2 3 4 4 3 2 1 |
| |
1 2 3 1 2 3 1 2 3 |
| |
1 2 3 4 1 2 3 4 1 2 |
| |
1 2 3 |
| |
1 2 3 1 2 3 4 |
| |
1 2 3 1 3 |
| @end example |
@end example |
| prints |
@endassignment |
| |
|
| @example |
@example |
| 0 1 2 3 4 5 6 7 8 9 |
5 dup * . |
| @end example |
@end example |
| The index of the innermost loop can be accessed with @code{i}, the index |
|
| of the next loop with @code{j}, and the index of the third loop with |
|
| @code{k}. |
|
| |
|
| doc-i |
|
| doc-j |
|
| doc-k |
|
| |
|
| The loop control data are kept on the return stack, so there are some |
@assignment |
| restrictions on mixing return stack accesses and counted loop |
Write 17^3 and 17^4 in Forth, without writing @code{17} more than once. |
| words. E.g., if you put values on the return stack outside the loop, you |
Write a piece of Forth code that expects two numbers on the stack |
| cannot read them inside the loop. If you put values on the return stack |
(@var{a} and @var{b}, with @var{b} on top) and computes |
| within a loop, you have to remove them before the end of the loop and |
@code{(a-b)(a+1)}. |
| before accessing the index of the loop. |
@endassignment |
| |
|
| There are several variations on the counted loop: |
Reference: @ref{Stack Manipulation}. |
| |
|
| @code{LEAVE} leaves the innermost counted loop immediately. |
|
| |
|
| If @var{start} is greater than @var{limit}, a @code{?DO} loop is entered |
@node Using files for Forth code Tutorial, Comments Tutorial, Stack Manipulation Tutorial, Tutorial |
| (and @code{LOOP} iterates until they become equal by wrap-around |
@section Using files for Forth code |
| arithmetic). This behaviour is usually not what you want. Therefore, |
@cindex loading Forth code, tutorial |
| Gforth offers @code{+DO} and @code{U+DO} (as replacements for |
@cindex files containing Forth code, tutorial |
| @code{?DO}), which do not enter the loop if @var{start} is greater than |
|
| @var{limit}; @code{+DO} is for signed loop parameters, @code{U+DO} for |
|
| unsigned loop parameters. |
|
| |
|
| @code{LOOP} can be replaced with @code{@var{n} +LOOP}; this updates the |
While working at the Forth command line is convenient for one-line |
| index by @var{n} instead of by 1. The loop is terminated when the border |
examples and short one-off code, you probably want to store your source |
| between @var{limit-1} and @var{limit} is crossed. E.g.: |
code in files for convenient editing and persistence. You can use your |
| |
favourite editor (Gforth includes Emacs support, @pxref{Emacs and |
| |
Gforth}) to create @var{file} and use |
| |
|
| @code{4 0 +DO i . 2 +LOOP} prints @code{0 2} |
@example |
| |
s" @var{file}" included |
| |
@end example |
| |
|
| @code{4 1 +DO i . 2 +LOOP} prints @code{1 3} |
to load it into your Forth system. The file name extension I use for |
| |
Forth files is @samp{.fs}. |
| |
|
| @cindex negative increment for counted loops |
You can easily start Gforth with some files loaded like this: |
| @cindex counted loops with negative increment |
|
| The behaviour of @code{@var{n} +LOOP} is peculiar when @var{n} is negative: |
|
| |
|
| @code{-1 0 ?DO i . -1 +LOOP} prints @code{0 -1} |
@example |
| |
gforth @var{file1} @var{file2} |
| |
@end example |
| |
|
| @code{ 0 0 ?DO i . -1 +LOOP} prints nothing |
If an error occurs during loading these files, Gforth terminates, |
| |
whereas an error during @code{INCLUDED} within Gforth usually gives you |
| |
a Gforth command line. Starting the Forth system every time gives you a |
| |
clean start every time, without interference from the results of earlier |
| |
tries. |
| |
|
| Therefore we recommend avoiding @code{@var{n} +LOOP} with negative |
I often put all the tests in a file, then load the code and run the |
| @var{n}. One alternative is @code{@var{u} -LOOP}, which reduces the |
tests with |
| index by @var{u} each iteration. The loop is terminated when the border |
|
| between @var{limit+1} and @var{limit} is crossed. Gforth also provides |
|
| @code{-DO} and @code{U-DO} for down-counting loops. E.g.: |
|
| |
|
| @code{-2 0 -DO i . 1 -LOOP} prints @code{0 -1} |
@example |
| |
gforth @var{code} @var{tests} -e bye |
| |
@end example |
| |
|
| @code{-1 0 -DO i . 1 -LOOP} prints @code{0} |
(often by performing this command with @kbd{C-x C-e} in Emacs). The |
| |
@code{-e bye} ensures that Gforth terminates afterwards so that I can |
| |
restart this command without ado. |
| |
|
| @code{ 0 0 -DO i . 1 -LOOP} prints nothing |
The advantage of this approach is that the tests can be repeated easily |
| |
every time the program is changed, making it easy to catch bugs |
| |
introduced by the change. |
| |
|
| Unfortunately, @code{+DO}, @code{U+DO}, @code{-DO}, @code{U-DO} and |
Reference: @ref{Forth source files}. |
| @code{-LOOP} are not in the ANS Forth standard. However, an |
|
| implementation for these words that uses only standard words is provided |
|
| in @file{compat/loops.fs}. |
|
| |
|
| @code{?DO} can also be replaced by @code{DO}. @code{DO} always enters |
|
| the loop, independent of the loop parameters. Do not use @code{DO}, even |
|
| if you know that the loop is entered in any case. Such knowledge tends |
|
| to become invalid during maintenance of a program, and then the |
|
| @code{DO} will make trouble. |
|
| |
|
| @code{UNLOOP} is used to prepare for an abnormal loop exit, e.g., via |
@node Comments Tutorial, Colon Definitions Tutorial, Using files for Forth code Tutorial, Tutorial |
| @code{EXIT}. @code{UNLOOP} removes the loop control parameters from the |
@section Comments |
| return stack so @code{EXIT} can get to its return address. |
@cindex comments tutorial |
| |
|
| @cindex @code{FOR} loops |
|
| Another counted loop is |
|
| @example |
@example |
| @var{n} |
\ That's a comment; it ends at the end of the line |
| FOR |
( Another comment; it ends here: ) .s |
| @var{body} |
|
| NEXT |
|
| @end example |
@end example |
| This is the preferred loop of native code compiler writers who are too |
|
| lazy to optimize @code{?DO} loops properly. In Gforth, this loop |
|
| iterates @var{n+1} times; @code{i} produces values starting with @var{n} |
|
| and ending with 0. Other Forth systems may behave differently, even if |
|
| they support @code{FOR} loops. To avoid problems, don't use @code{FOR} |
|
| loops. |
|
| |
|
| @node Arbitrary control structures, Calls and returns, Counted Loops, Control Structures |
@code{\} and @code{(} are ordinary Forth words and therefore have to be |
| @subsection Arbitrary control structures |
separated with white space from the following text. |
| @cindex control structures, user-defined |
|
| |
|
| @cindex control-flow stack |
@example |
| ANS Forth permits and supports using control structures in a non-nested |
\This gives an "Undefined word" error |
| way. Information about incomplete control structures is stored on the |
@end example |
| control-flow stack. This stack may be implemented on the Forth data |
|
| stack, and this is what we have done in Gforth. |
|
| |
|
| @cindex @code{orig}, control-flow stack item |
The first @code{)} ends a comment started with @code{(}, so you cannot |
| @cindex @code{dest}, control-flow stack item |
nest @code{(}-comments; and you cannot comment out text containing a |
| An @i{orig} entry represents an unresolved forward branch, a @i{dest} |
@code{)} with @code{( ... )}@footnote{therefore it's a good idea to |
| entry represents a backward branch target. A few words are the basis for |
avoid @code{)} in word names.}. |
| building any control structure possible (except control structures that |
|
| need storage, like calls, coroutines, and backtracking). |
|
| |
|
| doc-if |
I use @code{\}-comments for descriptive text and for commenting out code |
| doc-ahead |
of one or more lines; I use @code{(}-comments for describing the stack |
| doc-then |
effect, the stack contents, or for commenting out sub-line pieces of |
| doc-begin |
code. |
| doc-until |
|
| doc-again |
|
| doc-cs-pick |
|
| doc-cs-roll |
|
| |
|
| On many systems control-flow stack items take one word, in Gforth they |
The Emacs mode @file{gforth.el} (@pxref{Emacs and Gforth}) supports |
| currently take three (this may change in the future). Therefore it is a |
these uses by commenting out a region with @kbd{C-x \}, uncommenting a |
| really good idea to manipulate the control flow stack with |
region with @kbd{C-u C-x \}, and filling a @code{\}-commented region |
| @code{cs-pick} and @code{cs-roll}, not with data stack manipulation |
with @kbd{M-q}. |
| words. |
|
| |
|
| Some standard control structure words are built from these words: |
Reference: @ref{Comments}. |
| |
|
| doc-else |
|
| doc-while |
|
| doc-repeat |
|
| |
|
| Gforth adds some more control-structure words: |
@node Colon Definitions Tutorial, Decompilation Tutorial, Comments Tutorial, Tutorial |
| |
@section Colon Definitions |
| |
@cindex colon definitions, tutorial |
| |
@cindex definitions, tutorial |
| |
@cindex procedures, tutorial |
| |
@cindex functions, tutorial |
| |
|
| doc-endif |
are similar to procedures and functions in other programming languages. |
| doc-?dup-if |
|
| doc-?dup-0=-if |
|
| |
|
| Counted loop words constitute a separate group of words: |
@example |
| |
: squared ( n -- n^2 ) |
| |
dup * ; |
| |
5 squared . |
| |
7 squared . |
| |
@end example |
| |
|
| doc-?do |
@code{:} starts the colon definition; its name is @code{squared}. The |
| doc-+do |
following comment describes its stack effect. The words @code{dup *} |
| doc-u+do |
are not executed, but compiled into the definition. @code{;} ends the |
| doc--do |
colon definition. |
| doc-u-do |
|
| doc-do |
|
| doc-for |
|
| doc-loop |
|
| doc-+loop |
|
| doc--loop |
|
| doc-next |
|
| doc-leave |
|
| doc-?leave |
|
| doc-unloop |
|
| doc-done |
|
| |
|
| The standard does not allow using @code{cs-pick} and @code{cs-roll} on |
The newly-defined word can be used like any other word, including using |
| @i{do-sys}. Our system allows it, but it's your job to ensure that for |
it in other definitions: |
| every @code{?DO} etc. there is exactly one @code{UNLOOP} on any path |
|
| through the definition (@code{LOOP} etc. compile an @code{UNLOOP} on the |
|
| fall-through path). Also, you have to ensure that all @code{LEAVE}s are |
|
| resolved (by using one of the loop-ending words or @code{DONE}). |
|
| |
|
| Another group of control structure words are |
@example |
| |
: cubed ( n -- n^3 ) |
| |
dup squared * ; |
| |
-5 cubed . |
| |
: fourth-power ( n -- n^4 ) |
| |
squared squared ; |
| |
3 fourth-power . |
| |
@end example |
| |
|
| doc-case |
@assignment |
| doc-endcase |
Write colon definitions for @code{nip}, @code{tuck}, @code{negate}, and |
| doc-of |
@code{/mod} in terms of other Forth words, and check if they work (hint: |
| doc-endof |
test your tests on the originals first). Don't let the |
| |
@samp{redefined} messages spook you; they are just warnings. |
| |
@endassignment |
| |
|
| @i{case-sys} and @i{of-sys} cannot be processed using @code{cs-pick} and |
Reference: @ref{Colon Definitions}. |
| @code{cs-roll}. |
|
| |
|
| @subsubsection Programming Style |
|
| |
|
| In order to ensure readability we recommend that you do not create |
@node Decompilation Tutorial, Stack-Effect Comments Tutorial, Colon Definitions Tutorial, Tutorial |
| arbitrary control structures directly, but define new control structure |
@section Decompilation |
| words for the control structure you want and use these words in your |
@cindex decompilation tutorial |
| program. |
@cindex see tutorial |
| |
|
| E.g., instead of writing |
You can decompile colon definitions with @code{see}: |
| |
|
| @example |
@example |
| begin |
see squared |
| ... |
see cubed |
| if [ 1 cs-roll ] |
|
| ... |
|
| again then |
|
| @end example |
@end example |
| |
|
| we recommend defining control structure words, e.g., |
In Gforth @code{see} shows you a reconstruction of the source code from |
| |
the executable code. Information that was present in the source, but |
| |
not in the executable code, is lost (e.g., comments). |
| |
|
| @example |
You can also decompile the predefined words: |
| : while ( dest -- orig dest ) |
|
| POSTPONE if |
|
| 1 cs-roll ; immediate |
|
| |
|
| : repeat ( orig dest -- ) |
@example |
| POSTPONE again |
see . |
| POSTPONE then ; immediate |
see + |
| @end example |
@end example |
| |
|
| and then using these to create the control structure: |
|
| |
@node Stack-Effect Comments Tutorial, Types Tutorial, Decompilation Tutorial, Tutorial |
| |
@section Stack-Effect Comments |
| |
@cindex stack-effect comments, tutorial |
| |
@cindex --, tutorial |
| |
By convention the comment after the name of a definition describes the |
| |
stack effect: The part in front of the @samp{--} describes the state of |
| |
the stack before the execution of the definition, i.e., the parameters |
| |
that are passed into the colon definition; the part behind the @samp{--} |
| |
is the state of the stack after the execution of the definition, i.e., |
| |
the results of the definition. The stack comment only shows the top |
| |
stack items that the definition accesses and/or changes. |
| |
|
| |
You should put a correct stack effect on every definition, even if it is |
| |
just @code{( -- )}. You should also add some descriptive comment to |
| |
more complicated words (I usually do this in the lines following |
| |
@code{:}). If you don't do this, your code becomes unreadable (because |
| |
you have to work through every definition before you can understand |
| |
any). |
| |
|
| |
@assignment |
| |
The stack effect of @code{swap} can be written like this: @code{x1 x2 -- |
| |
x2 x1}. Describe the stack effect of @code{-}, @code{drop}, @code{dup}, |
| |
@code{over}, @code{rot}, @code{nip}, and @code{tuck}. Hint: When you |
| |
are done, you can compare your stack effects to those in this manual |
| |
(@pxref{Word Index}). |
| |
@endassignment |
| |
|
| |
Sometimes programmers put comments at various places in colon |
| |
definitions that describe the contents of the stack at that place (stack |
| |
comments); i.e., they are like the first part of a stack-effect |
| |
comment. E.g., |
| |
|
| @example |
@example |
| begin |
: cubed ( n -- n^3 ) |
| ... |
dup squared ( n n^2 ) * ; |
| while |
|
| ... |
|
| repeat |
|
| @end example |
@end example |
| |
|
| That's much easier to read, isn't it? Of course, @code{REPEAT} and |
In this case the stack comment is pretty superfluous, because the word |
| @code{WHILE} are predefined, so in this example it would not be |
is simple enough. If you think it would be a good idea to add such a |
| necessary to define them. |
comment to increase readability, you should also consider factoring the |
| |
word into several simpler words (@pxref{Factoring Tutorial,, |
| |
Factoring}), which typically eliminates the need for the stack comment; |
| |
however, if you decide not to refactor it, then having such a comment is |
| |
better than not having it. |
| |
|
| @node Calls and returns, Exception Handling, Arbitrary control structures, Control Structures |
The names of the stack items in stack-effect and stack comments in the |
| @subsection Calls and returns |
standard, in this manual, and in many programs specify the type through |
| @cindex calling a definition |
a type prefix, similar to Fortran and Hungarian notation. The most |
| @cindex returning from a definition |
frequent prefixes are: |
| |
|
| @cindex recursive definitions |
@table @code |
| A definition can be called simply by writing the name of the definition |
@item n |
| to be called. Note that normally a definition is invisible during its |
signed integer |
| definition. If you want to write a directly recursive definition, you |
@item u |
| can use @code{recursive} to make the current definition visible. |
unsigned integer |
| |
@item c |
| |
character |
| |
@item f |
| |
Boolean flags, i.e. @code{false} or @code{true}. |
| |
@item a-addr,a- |
| |
Cell-aligned address |
| |
@item c-addr,c- |
| |
Char-aligned address (note that a Char may have two bytes in Windows NT) |
| |
@item xt |
| |
Execution token, same size as Cell |
| |
@item w,x |
| |
Cell, can contain an integer or an address. It usually takes 32, 64 or |
| |
16 bits (depending on your platform and Forth system). A cell is more |
| |
commonly known as machine word, but the term @emph{word} already means |
| |
something different in Forth. |
| |
@item d |
| |
signed double-cell integer |
| |
@item ud |
| |
unsigned double-cell integer |
| |
@item r |
| |
Float (on the FP stack) |
| |
@end table |
| |
|
| doc-recursive |
You can find a more complete list in @ref{Notation}. |
| |
|
| Another way to perform a recursive call is |
@assignment |
| |
Write stack-effect comments for all definitions you have written up to |
| |
now. |
| |
@endassignment |
| |
|
| |
|
| |
@node Types Tutorial, Factoring Tutorial, Stack-Effect Comments Tutorial, Tutorial |
| |
@section Types |
| |
@cindex types tutorial |
| |
|
| |
In Forth the names of the operations are not overloaded; so similar |
| |
operations on different types need different names; e.g., @code{+} adds |
| |
integers, and you have to use @code{f+} to add floating-point numbers. |
| |
The following prefixes are often used for related operations on |
| |
different types: |
| |
|
| doc-recurse |
@table @code |
| |
@item (none) |
| |
signed integer |
| |
@item u |
| |
unsigned integer |
| |
@item c |
| |
character |
| |
@item d |
| |
signed double-cell integer |
| |
@item ud, du |
| |
unsigned double-cell integer |
| |
@item 2 |
| |
two cells (not necessarily double-cell numbers) |
| |
@item m, um |
| |
mixed single-cell and double-cell operations |
| |
@item f |
| |
floating-point (note that in stack comments @samp{f} represents flags, |
| |
and @samp{r} represents FP numbers). |
| |
@end table |
| |
|
| @quotation |
If there are no differences between the signed and the unsigned variant |
| @progstyle |
(e.g., for @code{+}), there is only the prefix-less variant. |
| I prefer using @code{recursive} to @code{recurse}, because calling the |
|
| definition by name is more descriptive (if the name is well-chosen) than |
|
| the somewhat cryptic @code{recurse}. E.g., in a quicksort |
|
| implementation, it is much better to read (and think) ``now sort the |
|
| partitions'' than to read ``now do a recursive call''. |
|
| @end quotation |
|
| |
|
| For mutual recursion, use @code{defer}red words, like this: |
Forth does not perform type checking, neither at compile time, nor at |
| |
run time. If you use the wrong operation, the data are interpreted |
| |
incorrectly: |
| |
|
| @example |
@example |
| defer foo |
-1 u. |
| |
|
| : bar ( ... -- ... ) |
|
| ... foo ... ; |
|
| |
|
| :noname ( ... -- ... ) |
|
| ... bar ... ; |
|
| IS foo |
|
| @end example |
@end example |
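
The same cell also compares differently depending on which word you
pick. The following sketch uses illustrative values and the comparison
words @code{<} (signed) and @code{u<} (unsigned); the printed flags
assume Gforth's representation of false as 0 and true as -1.

@example
3 -1 < .     \ signed: 3 is not less than -1, prints 0
3 -1 u< .    \ unsigned: -1 is read as the largest number, prints -1
@end example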
| |
|
| When the end of the definition is reached, it returns. An earlier return |
If you have only experience with type-checked languages until now, and |
| can be forced using |
have heard how important type-checking is, don't panic! In my |
| |
experience (and that of other Forthers), type errors in Forth code are |
| |
usually easy to find (once you get used to it), the increased vigilance |
| |
of the programmer tends to catch some harder errors in addition to most |
| |
type errors, and you never have to work around the type system, so in |
| |
most situations the lack of type-checking seems to be a win (projects to |
| |
add type checking to Forth have not caught on). |
| |
|
| doc-exit |
|
| |
|
| Don't forget to clean up the return stack and @code{UNLOOP} any |
@node Factoring Tutorial, Designing the stack effect Tutorial, Types Tutorial, Tutorial |
| outstanding @code{?DO}...@code{LOOP}s before @code{EXIT}ing. The |
@section Factoring |
| primitive compiled by @code{EXIT} is |
@cindex factoring tutorial |
| |
|
| doc-;s |
If you try to write longer definitions, you will soon find it hard to |
| |
keep track of the stack contents. Therefore, good Forth programmers |
| |
tend to write only short definitions (e.g., three lines). The art of |
| |
finding meaningful short definitions is known as factoring (as in |
| |
factoring polynomials). |
| |
|
| @node Exception Handling, , Calls and returns, Control Structures |
Well-factored programs offer additional advantages: smaller, more |
| @subsection Exception Handling |
general words are easier to test and debug and can be reused more and |
| @cindex Exceptions |
better than larger, specialized words. |
| |
|
| doc-catch |
So, if you run into difficulties with stack management, when writing |
| doc-throw |
code, try to define meaningful factors for the word, and define the word |
| |
in terms of those. Even if a factor contains only two words, it is |
| |
often helpful. |
| |
|
| @node Locals, Defining Words, Control Structures, Words |
Good factoring is not easy, and it takes some practice to get the knack |
| @section Locals |
for it; but even experienced Forth programmers often don't find the |
| @cindex locals |
right solution right away, but only when rewriting the program. So, if |
| |
you don't come up with a good solution immediately, keep trying, don't |
| |
despair. |
| |
|
| Local variables can make Forth programming more enjoyable and Forth |
@c example !! |
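
As a small illustration, here is one possible factoring of a word that
adds the squares of two numbers. The names @code{sum-of-squares-1} and
@code{sum-of-squares} are made up for this sketch; @code{squared} is the
same definition as in the colon definitions section, so loading it again
only produces a @samp{redefined} warning.

@example
\ Unfactored: the squaring is written out twice
: sum-of-squares-1 ( n1 n2 -- n3 )
  dup * swap dup * + ;

\ Factored: the repeated piece gets a name of its own
: squared ( n -- n^2 )
  dup * ;
: sum-of-squares ( n1 n2 -- n3 )
  squared swap squared + ;

3 4 sum-of-squares .     \ prints 25
@end example

The factored version is no shorter, but each line can be read and
tested on its own.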
| programs easier to read. Unfortunately, the locals of ANS Forth are |
|
| laden with restrictions. Therefore, we provide not only the ANS Forth |
|
| locals wordset, but also our own, more powerful locals wordset (we |
|
| implemented the ANS Forth locals wordset through our locals wordset). |
|
| |
|
| The ideas in this section have also been published in the paper |
|
| @cite{Automatic Scoping of Local Variables} by M. Anton Ertl, presented |
|
| at EuroForth '94; it is available at |
|
| @*@url{http://www.complang.tuwien.ac.at/papers/ertl94l.ps.gz}. |
|
| |
|
| @menu |
@node Designing the stack effect Tutorial, Local Variables Tutorial, Factoring Tutorial, Tutorial |
| * Gforth locals:: |
@section Designing the stack effect |
| * ANS Forth locals:: |
@cindex Stack effect design, tutorial |
| @end menu |
@cindex design of stack effects, tutorial |
| |
|
| @node Gforth locals, ANS Forth locals, Locals, Locals |
In other languages you can use an arbitrary order of parameters for a |
| @subsection Gforth locals |
function; and since there is only one result, you don't have to deal with |
| @cindex Gforth locals |
the order of results, either. |
| @cindex locals, Gforth style |
|
| |
|
| Locals can be defined with |
In Forth (and other stack-based languages, e.g., Postscript) the |
| |
parameter and result order of a definition is important and should be |
| |
designed well. The general guideline is to design the stack effect such |
| |
that the word is simple to use in most cases, even if that complicates |
| |
the implementation of the word. Some concrete rules are: |
| |
|
| |
@itemize @bullet |
| |
|
| |
@item |
| |
Words consume all of their parameters (e.g., @code{.}). |
| |
|
| |
@item |
| |
If there is a convention on the order of parameters (e.g., from |
| |
mathematics or another programming language), stick with it (e.g., |
| |
@code{-}). |
| |
|
| |
@item |
| |
If one parameter usually requires only a short computation (e.g., it is |
| |
a constant), pass it on the top of the stack. Conversely, parameters |
| |
that usually require a long sequence of code to compute should be passed |
| |
as the bottom (i.e., first) parameter. This makes the code easier to |
| |
read, because the reader does not need to keep track of the bottom item |
| |
through a long sequence of code (or, alternatively, through stack |
| |
manipulations). E.g., @code{!} (store, @pxref{Memory}) expects the |
| |
address on top of the stack because it is usually simpler to compute |
| |
than the stored value (often the address is just a variable). |
| |
|
| |
@item |
| |
Similarly, results that are usually consumed quickly should be returned |
| |
on the top of stack, whereas a result that is often used in long |
| |
computations should be passed as bottom result. E.g., the file words |
| |
like @code{open-file} return the error code on the top of stack, because |
| |
it is usually consumed quickly by @code{throw}; moreover, the error code |
| |
has to be checked before doing anything with the other results. |
| |
|
| |
@end itemize |
| |
|
| |
These rules are just general guidelines; don't lose sight of the overall |
| |
goal to make the words easy to use. E.g., if the convention rule |
| |
conflicts with the computation-length rule, you might decide in favour |
| |
of the convention if the word will be used rarely, and in favour of the |
| |
computation-length rule if the word will be used frequently (because |
| |
with frequent use the cost of breaking the computation-length rule would |
| |
be quite high, and frequent use makes it easier to remember an |
| |
unconventional order). |
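
The two rules about computation length can be seen in the following
sketch. The names @code{total} and @code{open-input} are invented for
the illustration, and @code{?} is used here only to display the stored
value.

@example
variable total

\ the long computation comes first, the short address comes last
3 4 + 5 * total !
total ?              \ prints 35

\ the error code is on top, so throw can consume it right away
: open-input ( c-addr u -- wfileid )
  r/o open-file throw ;
@end example

If @code{!} expected the value on top instead, the address would have to
be kept in mind (or moved with @code{swap}) across the whole
computation.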
| |
|
| |
@c example !! structure package |
| |
|
| |
|
| |
@node Local Variables Tutorial, Conditional execution Tutorial, Designing the stack effect Tutorial, Tutorial |
| |
@section Local Variables |
| |
@cindex local variables, tutorial |
| |
|
| |
You can define local variables (@emph{locals}) in a colon definition: |
| |
|
| @example |
@example |
| @{ local1 local2 ... -- comment @} |
: swap @{ a b -- b a @} |
| |
b a ; |
| |
1 2 swap .s 2drop |
| @end example |
@end example |
| or |
|
| |
(If your Forth system does not support this syntax, include |
| |
@file{compat/anslocals.fs} first). |
| |
|
| |
In this example @code{@{ a b -- b a @}} is the locals definition; it |
| |
takes two cells from the stack, puts the top of stack in @code{b} and |
| |
the next stack element in @code{a}. @code{--} starts a comment ending |
| |
with @code{@}}. After the locals definition, using the name of the |
| |
local will push its value on the stack. You can omit the comment
| 
part (@code{-- b a}):
| |
|
| @example |
@example |
| @{ local1 local2 ... @} |
: swap ( x1 x2 -- x2 x1 ) |
| |
@{ a b @} b a ; |
| @end example |
@end example |
| |
|
| E.g., |
In Gforth you can have several locals definitions, anywhere in a colon |
| |
definition; in contrast, in a standard program you can have only one |
| |
locals definition per colon definition, and that locals definition must |
| |
be outside any control structure.
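
A small sketch of the Gforth-specific freedom to use several locals
definitions in one colon definition; the word @code{sum-of-squares} and
its local names are made up for illustration. It is not expected to
work on a system that supports only standard locals:

@example
: sum-of-squares @{ a b -- n @}
  a a * @{ a2 @}    \ second locals definition (Gforth only)
  b b * @{ b2 @}    \ third locals definition
  a2 b2 + ;
3 4 sum-of-squares .   \ prints 25
@end example

Each later locals definition takes its value from the top of the stack,
just like the first one.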
| |
|
| |
With locals you can write slightly longer definitions without running |
| |
into stack trouble. However, I recommend trying to write colon |
| |
definitions without locals for exercise purposes to help you gain the |
| |
essential factoring skills. |
| |
|
| |
@assignment |
| |
Rewrite your definitions until now with locals |
| |
@endassignment |
| |
|
| |
Reference: @ref{Locals}. |
| |
|
| |
|
| |
@node Conditional execution Tutorial, Flags and Comparisons Tutorial, Local Variables Tutorial, Tutorial |
| |
@section Conditional execution |
| |
@cindex conditionals, tutorial |
| |
@cindex if, tutorial |
| |
|
| |
In Forth you can use control structures only inside colon definitions. |
| |
An @code{if}-structure looks like this: |
| |
|
| @example |
@example |
| : max @{ n1 n2 -- n3 @} |
: abs ( n1 -- +n2 ) |
| n1 n2 > if |
dup 0 < if |
| n1 |
negate |
| else |
|
| n2 |
|
| endif ; |
endif ; |
| |
5 abs . |
| |
-5 abs . |
| @end example |
@end example |
| |
|
| The similarity of locals definitions with stack comments is intended. A |
@code{if} takes a flag from the stack. If the flag is non-zero (true), |
| locals definition often replaces the stack comment of a word. The order |
the following code is performed, otherwise execution continues after the |
| of the locals corresponds to the order in a stack comment and everything |
@code{endif} (or @code{else}). @code{<} compares the top two stack |
| after the @code{--} is really a comment. |
elements and produces a flag:
| |
|
| This similarity has one disadvantage: It is too easy to confuse locals |
@example |
| declarations with stack comments, causing bugs and making them hard to |
1 2 < . |
| find. However, this problem can be avoided by appropriate coding |
2 1 < . |
| conventions: Do not use both notations in the same program. If you do, |
1 1 < . |
| they should be distinguished using additional means, e.g. by position. |
@end example |
| |
|
| @cindex types of locals |
Actually the standard name for @code{endif} is @code{then}. This |
| @cindex locals types |
tutorial presents the examples using @code{endif}, because this is often |
| The name of the local may be preceded by a type specifier, e.g., |
less confusing for people familiar with other programming languages |
| @code{F:} for a floating point value: |
where @code{then} has a different meaning. If your system does not have |
| |
@code{endif}, define it with |
| |
|
| @example |
@example |
| : CX* @{ F: Ar F: Ai F: Br F: Bi -- Cr Ci @} |
: endif postpone then ; immediate |
| \ complex multiplication |
|
| Ar Br f* Ai Bi f* f- |
|
| Ar Bi f* Ai Br f* f+ ; |
|
| @end example |
@end example |
| |
|
| @cindex flavours of locals |
You can optionally use an @code{else}-part: |
| @cindex locals flavours |
|
| @cindex value-flavoured locals |
|
| @cindex variable-flavoured locals |
|
| Gforth currently supports cells (@code{W:}, @code{W^}), doubles |
|
| (@code{D:}, @code{D^}), floats (@code{F:}, @code{F^}) and characters |
|
| (@code{C:}, @code{C^}) in two flavours: a value-flavoured local (defined |
|
| with @code{W:}, @code{D:} etc.) produces its value and can be changed |
|
| with @code{TO}. A variable-flavoured local (defined with @code{W^} etc.) |
|
| produces its address (which becomes invalid when the variable's scope is |
|
| left). E.g., the standard word @code{emit} can be defined in terms of |
|
| @code{type} like this: |
|
| |
|
| @example |
@example |
| : emit @{ C^ char* -- @} |
: min ( n1 n2 -- n ) |
| char* 1 type ; |
2dup < if |
| |
drop |
| |
else |
| |
nip |
| |
endif ; |
| |
2 3 min . |
| |
3 2 min . |
| @end example |
@end example |
| |
|
| @cindex default type of locals |
@assignment |
| @cindex locals, default type |
Write @code{min} without @code{else}-part (hint: what's the definition |
| A local without type specifier is a @code{W:} local. Both flavours of |
of @code{nip}?). |
| locals are initialized with values from the data or FP stack. |
@endassignment |
| |
|
| Currently there is no way to define locals with user-defined data |
Reference: @ref{Selection}. |
| structures, but we are working on it. |
|
| |
|
| Gforth allows defining locals everywhere in a colon definition. This |
|
| poses the following questions: |
|
| |
|
| @menu |
@node Flags and Comparisons Tutorial, General Loops Tutorial, Conditional execution Tutorial, Tutorial |
| * Where are locals visible by name?:: |
@section Flags and Comparisons |
| * How long do locals live?:: |
@cindex flags tutorial |
| * Programming Style:: |
@cindex comparison tutorial |
| * Implementation:: |
|
| @end menu |
|
| |
|
| @node Where are locals visible by name?, How long do locals live?, Gforth locals, Gforth locals |
In a false-flag all bits are clear (0 when interpreted as integer). In |
| @subsubsection Where are locals visible by name? |
a canonical true-flag all bits are set (-1 as a twos-complement signed |
| @cindex locals visibility |
integer); in many contexts (e.g., @code{if}) any non-zero value is |
| @cindex visibility of locals |
treated as a true flag.
| @cindex scope of locals |
|
| |
|
| Basically, the answer is that locals are visible where you would expect |
@example |
| it in block-structured languages, and sometimes a little longer. If you |
false . |
| want to restrict the scope of a local, enclose its definition in |
true . |
| @code{SCOPE}...@code{ENDSCOPE}. |
true hex u. decimal |
| |
@end example |
| |
|
| doc-scope |
Comparison words produce canonical flags: |
| doc-endscope |
|
| |
|
| These words behave like control structure words, so you can use them |
@example |
| with @code{CS-PICK} and @code{CS-ROLL} to restrict the scope in |
1 1 = . |
| arbitrary ways. |
1 0= . |
| |
0 1 < . |
| |
0 0 < . |
| |
-1 1 u< . \ type error, u< interprets -1 as large unsigned number |
| |
-1 1 < . |
| |
@end example |
| |
|
| If you want a more exact answer to the visibility question, here's the |
Gforth supports all combinations of the prefixes @code{0 u d d0 du f f0} |
| basic principle: A local is visible in all places that can only be |
(or none) and the comparisons @code{= <> < > <= >=}. Only a part of |
| reached through the definition of the local@footnote{In compiler |
these combinations are standard (for details see the standard, |
| construction terminology, all places dominated by the definition of the |
@ref{Numeric comparison}, @ref{Floating Point} or @ref{Word Index}). |
| local.}. In other words, it is not visible in places that can be reached |
|
| without going through the definition of the local. E.g., locals defined |
|
| in @code{IF}...@code{ENDIF} are visible until the @code{ENDIF}, locals |
|
| defined in @code{BEGIN}...@code{UNTIL} are visible after the |
|
| @code{UNTIL} (until, e.g., a subsequent @code{ENDSCOPE}). |
|
| |
|
| The reasoning behind this solution is: We want to have the locals |
You can use @code{and or xor invert} as operations on
| visible as long as it is meaningful. The user can always make the |
canonical flags. Actually they are bitwise operations: |
| visibility shorter by using explicit scoping. In a place that can |
|
| only be reached through the definition of a local, the meaning of a |
|
| local name is clear. In other places it is not: How is the local |
|
| initialized at the control flow path that does not contain the |
|
| definition? Which local is meant, if the same name is defined twice in |
|
| two independent control flow paths? |
|
| |
|
| This should be enough detail for nearly all users, so you can skip the |
@example |
| rest of this section. If you really must know all the gory details and |
1 2 and . |
| options, read on. |
1 2 or . |
| |
1 3 xor . |
| |
1 invert . |
| |
@end example |
| |
|
| In order to implement this rule, the compiler has to know which places |
You can convert a zero/non-zero flag into a canonical flag with |
| are unreachable. It knows this automatically after @code{AHEAD}, |
@code{0<>} (and complement it on the way with @code{0=}). |
| @code{AGAIN}, @code{EXIT} and @code{LEAVE}; in other cases (e.g., after |
|
| most @code{THROW}s), you can use the word @code{UNREACHABLE} to tell the |
|
| compiler that the control flow never reaches that place. If |
|
| @code{UNREACHABLE} is not used where it could, the only consequence is |
|
| that the visibility of some locals is more limited than the rule above |
|
| says. If @code{UNREACHABLE} is used where it should not (i.e., if you |
|
| lie to the compiler), buggy code will be produced. |
|
| |
|
| doc-unreachable |
@example |
| |
1 0= . |
| |
1 0<> . |
| |
@end example |
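
Because the Boolean words work bitwise, combining two non-canonical
flags can give a surprising result. The following sketch shows the
pitfall and how converting to canonical flags avoids it:

@example
1 2 and .            \ prints 0, although if treats both 1 and 2 as true
1 0<> 2 0<> and .    \ prints -1, both operands are canonical flags
@end example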
| |
|
| |
You can use the all-bits-set feature of canonical flags and the bitwise |
| |
operation of the Boolean operations to avoid @code{if}s: |
| |
|
| Another problem with this rule is that at @code{BEGIN}, the compiler |
|
| does not know which locals will be visible on the incoming |
|
| back-edge. All problems discussed in the following are due to this |
|
| ignorance of the compiler (we discuss the problems using @code{BEGIN} |
|
| loops as examples; the discussion also applies to @code{?DO} and other |
|
| loops). Perhaps the most insidious example is: |
|
| @example |
@example |
| AHEAD |
: foo ( n1 -- n2 ) |
| BEGIN |
0= if |
| x |
14 |
| [ 1 CS-ROLL ] THEN |
else |
| @{ x @} |
0 |
| ... |
endif ; |
| UNTIL |
0 foo . |
| |
1 foo . |
| |
|
| |
: foo ( n1 -- n2 ) |
| |
0= 14 and ; |
| |
0 foo . |
| |
1 foo . |
| @end example |
@end example |
| |
|
| This should be legal according to the visibility rule. The use of |
@assignment |
| @code{x} can only be reached through the definition; but that appears |
Write @code{min} without @code{if}. |
| textually below the use. |
@endassignment |
| |
|
| From this example it is clear that the visibility rules cannot be fully |
For reference, see @ref{Boolean Flags}, @ref{Numeric comparison}, and |
| implemented without major headaches. Our implementation treats common |
@ref{Bitwise operations}. |
| cases as advertised and the exceptions are treated in a safe way: The |
|
| compiler makes a reasonable guess about the locals visible after a |
|
| @code{BEGIN}; if it is too pessimistic, the |
@node General Loops Tutorial, Counted loops Tutorial, Flags and Comparisons Tutorial, Tutorial |
| user will get a spurious error about the local not being defined; if the |
@section General Loops |
| compiler is too optimistic, it will notice this later and issue a |
@cindex loops, indefinite, tutorial |
| warning. In the case above the compiler would complain about @code{x} |
|
| being undefined at its use. You can see from the obscure examples in |
The endless loop is the simplest one:
| this section that it takes quite unusual control structures to get the |
|
| compiler into trouble, and even then it will often do fine. |
|
| |
|
| If the @code{BEGIN} is reachable from above, the most optimistic guess |
|
| is that all locals visible before the @code{BEGIN} will also be |
|
| visible after the @code{BEGIN}. This guess is valid for all loops that |
|
| are entered only through the @code{BEGIN}, in particular, for normal |
|
| @code{BEGIN}...@code{WHILE}...@code{REPEAT} and |
|
| @code{BEGIN}...@code{UNTIL} loops and it is implemented in our |
|
| compiler. When the branch to the @code{BEGIN} is finally generated by |
|
| @code{AGAIN} or @code{UNTIL}, the compiler checks the guess and |
|
| warns the user if it was too optimistic: |
|
| @example |
@example |
| IF |
: endless ( -- ) |
| @{ x @} |
0 begin |
| BEGIN |
dup . 1+ |
| \ x ? |
again ; |
| [ 1 cs-roll ] THEN |
endless |
| ... |
|
| UNTIL |
|
| @end example |
@end example |
| |
|
| Here, @code{x} lives only until the @code{BEGIN}, but the compiler |
Terminate this loop by pressing @kbd{Ctrl-C} (in Gforth). @code{begin} |
| optimistically assumes that it lives until the @code{THEN}. It notices |
does nothing at run-time, @code{again} jumps back to @code{begin}. |
| this difference when it compiles the @code{UNTIL} and issues a |
|
| warning. The user can avoid the warning, and make sure that @code{x} |
A loop with one exit at any place looks like this: |
| is not used in the wrong area by using explicit scoping: |
|
| @example |
@example |
| IF |
: log2 ( +n1 -- n2 ) |
| SCOPE |
\ logarithmus dualis of n1>0, rounded down to the next integer |
| @{ x @} |
assert( dup 0> ) |
| ENDSCOPE |
2/ 0 begin |
| BEGIN |
over 0> while |
| [ 1 cs-roll ] THEN |
1+ swap 2/ swap |
| ... |
repeat |
| UNTIL |
nip ; |
| |
7 log2 . |
| |
8 log2 . |
| @end example |
@end example |
| |
|
| Since the guess is optimistic, there will be no spurious error messages |
At run-time @code{while} consumes a flag; if it is 0, execution |
| about undefined locals. |
continues behind the @code{repeat}; if the flag is non-zero, execution |
| |
continues behind the @code{while}. @code{Repeat} jumps back to |
| |
@code{begin}, just like @code{again}. |
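
As a further small sketch (the word @code{countdown} is made up for
illustration), here is a loop that tests its condition before each
iteration, so it does nothing for an argument of 0 or less:

@example
: countdown ( n -- )
  begin
    dup 0> while
      dup . 1-
  repeat
  drop ;
3 countdown   \ prints 3 2 1
0 countdown   \ prints nothing
@end example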
| |
|
| If the @code{BEGIN} is not reachable from above (e.g., after |
In Forth there are many combinations/abbreviations, like @code{1+}. |
| @code{AHEAD} or @code{EXIT}), the compiler cannot even make an |
However, @code{2/} is not one of them; it shifts its argument right by
| optimistic guess, as the locals visible after the @code{BEGIN} may be |
one bit (arithmetic shift right): |
| defined later. Therefore, the compiler assumes that no locals are |
|
| visible after the @code{BEGIN}. However, the user can use |
|
| @code{ASSUME-LIVE} to make the compiler assume that the same locals are |
|
| visible at the BEGIN as at the point where the top control-flow stack |
|
| item was created. |
|
| |
|
| doc-assume-live |
@example |
| |
-5 2 / . |
| |
-5 2/ . |
| |
@end example |
| |
|
| |
@code{assert(} is not a standard word, but you can get it on systems other
| |
than Gforth by including @file{compat/assert.fs}.  You can see what it
| |
does by trying |
| |
|
| E.g., |
|
| @example |
@example |
| @{ x @} |
0 log2 . |
| AHEAD |
|
| ASSUME-LIVE |
|
| BEGIN |
|
| x |
|
| [ 1 CS-ROLL ] THEN |
|
| ... |
|
| UNTIL |
|
| @end example |
@end example |
| |
|
| Other cases where the locals are defined before the @code{BEGIN} can be |
Here's a loop with an exit at the end: |
| handled by inserting an appropriate @code{CS-ROLL} before the |
|
| @code{ASSUME-LIVE} (and changing the control-flow stack manipulation |
|
| behind the @code{ASSUME-LIVE}). |
|
| |
|
| Cases where locals are defined after the @code{BEGIN} (but should be |
|
| visible immediately after the @code{BEGIN}) can only be handled by |
|
| rearranging the loop. E.g., the ``most insidious'' example above can be |
|
| arranged into: |
|
| @example |
@example |
| BEGIN |
: log2 ( +n1 -- n2 ) |
| @{ x @} |
\ logarithmus dualis of n1>0, rounded down to the next integer |
| ... 0= |
assert( dup 0 > ) |
| WHILE |
-1 begin |
| x |
1+ swap 2/ swap |
| REPEAT |
over 0 <= |
| |
until |
| |
nip ; |
| @end example |
@end example |
| |
|
| @node How long do locals live?, Programming Style, Where are locals visible by name?, Gforth locals |
@code{Until} consumes a flag; if it is zero, execution continues at
| @subsubsection How long do locals live? |
the @code{begin}, otherwise after the @code{until}. |
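
Since the test comes at the end, the body of such a loop always runs at
least once. A minimal sketch (the word @code{countup} is made up for
illustration; @code{<=} is one of the Gforth comparison words):

@example
: countup ( n -- )
  0 begin
    dup . 1+
    2dup <=
  until
  2drop ;
3 countup   \ prints 0 1 2
0 countup   \ prints 0, because the body runs once before the test
@end example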
| @cindex locals lifetime |
|
| @cindex lifetime of locals |
|
| |
|
| The right answer for the lifetime question would be: A local lives at |
@assignment |
| least as long as it can be accessed. For a value-flavoured local this |
Write a definition for computing the greatest common divisor. |
| means: until the end of its visibility. However, a variable-flavoured |
@endassignment |
| local could be accessed through its address far beyond its visibility |
|
| scope. Ultimately, this would mean that such locals would have to be |
|
| garbage collected. Since this entails un-Forth-like implementation |
|
| complexities, I adopted the same cowardly solution as some other |
|
| languages (e.g., C): The local lives only as long as it is visible; |
|
| afterwards its address is invalid (and programs that access it |
|
| afterwards are erroneous). |
|
| |
|
| @node Programming Style, Implementation, How long do locals live?, Gforth locals |
Reference: @ref{Simple Loops}. |
| @subsubsection Programming Style |
|
| @cindex locals programming style |
|
| @cindex programming style, locals |
|
| |
|
| The freedom to define locals anywhere has the potential to change |
|
| programming styles dramatically. In particular, the need to use the |
|
| return stack for intermediate storage vanishes. Moreover, all stack |
|
| manipulations (except @code{PICK}s and @code{ROLL}s with run-time |
|
| determined arguments) can be eliminated: If the stack items are in the |
|
| wrong order, just write a locals definition for all of them; then |
|
| write the items in the order you want. |
|
| |
|
| This seems a little far-fetched and eliminating stack manipulations is |
|
| unlikely to become a conscious programming objective. Still, the number |
|
| of stack manipulations will be reduced dramatically if local variables |
|
| are used liberally (e.g., compare @code{max} in @ref{Gforth locals} with |
|
| a traditional implementation of @code{max}). |
|
| |
|
| This shows one potential benefit of locals: making Forth programs more |
|
| readable. Of course, this benefit will only be realized if the |
|
| programmers continue to honour the principle of factoring instead of |
|
| using the added latitude to make the words longer. |
|
| |
|
| @cindex single-assignment style for locals |
|
| Using @code{TO} can and should be avoided. Without @code{TO}, |
|
| every value-flavoured local has only a single assignment and many |
|
| advantages of functional languages apply to Forth. I.e., programs are |
|
| easier to analyse, to optimize and to read: It is clear from the |
|
| definition what the local stands for, it does not turn into something |
|
| different later. |
|
| |
|
| E.g., a definition using @code{TO} might look like this: |
@node Counted loops Tutorial, Recursion Tutorial, General Loops Tutorial, Tutorial |
| @example |
@section Counted loops |
| : strcmp @{ addr1 u1 addr2 u2 -- n @} |
@cindex loops, counted, tutorial |
| u1 u2 min 0 |
|
| ?do |
|
| addr1 c@@ addr2 c@@ - |
|
| ?dup-if |
|
| unloop exit |
|
| then |
|
| addr1 char+ TO addr1 |
|
| addr2 char+ TO addr2 |
|
| loop |
|
| u1 u2 - ; |
|
| @end example |
|
| Here, @code{TO} is used to update @code{addr1} and @code{addr2} at |
|
| every loop iteration. @code{strcmp} is a typical example of the |
|
| readability problems of using @code{TO}. When you start reading |
|
| @code{strcmp}, you think that @code{addr1} refers to the start of the |
|
| string. Only near the end of the loop you realize that it is something |
|
| else. |
|
| |
|
| This can be avoided by defining two locals at the start of the loop that |
|
| are initialized with the right value for the current iteration. |
|
| @example |
@example |
| : strcmp @{ addr1 u1 addr2 u2 -- n @} |
: ^ ( n1 u -- n ) |
| addr1 addr2 |
 \ n = the uth power of n1
| u1 u2 min 0 |
1 swap 0 u+do |
| ?do @{ s1 s2 @} |
over * |
| s1 c@@ s2 c@@ - |
|
| ?dup-if |
|
| unloop exit |
|
| then |
|
| s1 char+ s2 char+ |
|
| loop |
loop |
| 2drop |
nip ; |
| u1 u2 - ; |
3 2 ^ . |
| |
4 3 ^ . |
| @end example |
@end example |
| Here it is clear from the start that @code{s1} has a different value |
|
| in every loop iteration. |
|
| |
|
| @node Implementation, , Programming Style, Gforth locals |
@code{U+do} (from @file{compat/loops.fs}, if your Forth system doesn't |
| @subsubsection Implementation |
have it) takes two numbers from the stack @code{( u3 u4 -- )}, and then
| @cindex locals implementation |
performs the code between @code{u+do} and @code{loop} for @code{u3-u4} |
| @cindex implementation of locals |
times (or not at all, if @code{u3-u4<0}). |
| |
|
| @cindex locals stack |
You can see the stack effect design rules at work in the stack effect of |
| Gforth uses an extra locals stack. The most compelling reason for |
the loop start words: Since the start value of the loop is more |
| this is that the return stack is not float-aligned; using an extra stack |
frequently constant than the end value, the start value is passed on |
| also eliminates the problems and restrictions of using the return stack |
the top-of-stack. |
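
A short sketch of this design rule in use (the word @code{stars} is
made up for illustration): the caller supplies the end value, and the
constant start value 0 is pushed last, directly before the loop start
word:

@example
: stars ( u -- )
  0 u+do
    [char] * emit
  loop ;
5 stars   \ prints *****
0 stars   \ prints nothing, the loop is not entered
@end example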
| as locals stack. Like the other stacks, the locals stack grows toward |
|
| lower addresses. A few primitives allow an efficient implementation: |
|
| |
|
| doc-@local# |
You can access the counter of a counted loop with @code{i}: |
| doc-f@local# |
|
| doc-laddr# |
|
| doc-lp+!# |
|
| doc-lp! |
|
| doc->l |
|
| doc-f>l |
|
| |
|
| In addition to these primitives, some specializations of these |
@example |
| primitives for commonly occurring inline arguments are provided for |
: fac ( u -- u! ) |
| efficiency reasons, e.g., @code{@@local0} as specialization of |
1 swap 1+ 1 u+do |
| @code{@@local#} for the inline argument 0. The following compiling words |
i * |
| compile the right specialized version, or the general version, as |
loop ; |
| appropriate: |
5 fac . |
| |
7 fac . |
| |
@end example |
| |
|
| doc-compile-@local |
There is also @code{+do}, which expects signed numbers (important for |
| doc-compile-f@local |
deciding whether to enter the loop). |
| doc-compile-lp+! |
|
| |
|
| Combinations of conditional branches and @code{lp+!#} like |
@assignment |
| @code{?branch-lp+!#} (the locals pointer is only changed if the branch |
Write a definition for computing the nth Fibonacci number. |
| is taken) are provided for efficiency and correctness in loops. |
@endassignment |
| |
|
| A special area in the dictionary space is reserved for keeping the |
You can also use increments other than 1: |
| local variable names. @code{@{} switches the dictionary pointer to this |
|
| area and @code{@}} switches it back and generates the locals |
|
| initializing code. @code{W:} etc.@ are normal defining words. This |
|
| special area is cleared at the start of every colon definition. |
|
| |
|
| @cindex wordlist for defining locals |
@example |
| A special feature of Gforth's dictionary is used to implement the |
: up2 ( n1 n2 -- ) |
| definition of locals without type specifiers: every wordlist (aka |
+do |
| vocabulary) has its own methods for searching |
i . |
| etc. (@pxref{Wordlists}). For the present purpose we defined a wordlist |
2 +loop ; |
| with a special search method: When it is searched for a word, it |
10 0 up2 |
| actually creates that word using @code{W:}. @code{@{} changes the search |
|
| order to first search the wordlist containing @code{@}}, @code{W:} etc., |
|
| and then the wordlist for defining locals without type specifiers. |
|
| |
|
| The lifetime rules support a stack discipline within a colon |
: down2 ( n1 n2 -- ) |
| definition: The lifetime of a local is either nested with other locals |
-do |
| lifetimes or it does not overlap them. |
i . |
| |
2 -loop ; |
| |
0 10 down2 |
| |
@end example |
| |
|
| At @code{BEGIN}, @code{IF}, and @code{AHEAD} no code for locals stack |
Reference: @ref{Counted Loops}. |
| pointer manipulation is generated. Between control structure words |
|
| locals definitions can push locals onto the locals stack. @code{AGAIN} |
|
| is the simplest of the other three control flow words. It has to |
|
| restore the locals stack depth of the corresponding @code{BEGIN} |
|
| before branching. The code looks like this: |
|
| @format |
|
| @code{lp+!#} current-locals-size @minus{} dest-locals-size |
|
| @code{branch} <begin> |
|
| @end format |
|
| |
|
| @code{UNTIL} is a little more complicated: If it branches back, it |
|
| must adjust the stack just like @code{AGAIN}. But if it falls through, |
|
| the locals stack must not be changed. The compiler generates the |
|
| following code: |
|
| @format |
|
| @code{?branch-lp+!#} <begin> current-locals-size @minus{} dest-locals-size |
|
| @end format |
|
| The locals stack pointer is only adjusted if the branch is taken. |
|
| |
|
| @code{THEN} can produce somewhat inefficient code: |
@node Recursion Tutorial, Leaving definitions or loops Tutorial, Counted loops Tutorial, Tutorial |
| @format |
@section Recursion |
| @code{lp+!#} current-locals-size @minus{} orig-locals-size |
@cindex recursion tutorial |
| <orig target>: |
|
| @code{lp+!#} orig-locals-size @minus{} new-locals-size |
|
| @end format |
|
| The second @code{lp+!#} adjusts the locals stack pointer from the |
|
| level at the @var{orig} point to the level after the @code{THEN}. The |
|
| first @code{lp+!#} adjusts the locals stack pointer from the current |
|
| level to the level at the orig point, so the complete effect is an |
|
| adjustment from the current level to the right level after the |
|
| @code{THEN}. |
|
| |
|
| @cindex locals information on the control-flow stack |
Usually the name of a definition is not visible in the definition; but |
| @cindex control-flow stack items, locals information |
earlier definitions are usually visible: |
| In a conventional Forth implementation a dest control-flow stack entry |
|
| is just the target address and an orig entry is just the address to be |
|
| patched. Our locals implementation adds a wordlist to every orig or dest |
|
| item. It is the list of locals visible (or assumed visible) at the point |
|
| described by the entry. Our implementation also adds a tag to identify |
|
| the kind of entry, in particular to differentiate between live and dead |
|
| (reachable and unreachable) orig entries. |
|
| |
|
| A few unusual operations have to be performed on locals wordlists: |
@example |
| |
1 0 / . \ "Floating-point unidentified fault" in Gforth on most platforms |
| |
: / ( n1 n2 -- n ) |
| |
dup 0= if |
| |
-10 throw \ report division by zero |
| |
endif |
| |
/ \ old version |
| |
; |
| |
1 0 / |
| |
@end example |
| |
|
| doc-common-list |
For recursive definitions you can use @code{recursive} (non-standard) or |
| doc-sub-list? |
@code{recurse}: |
| doc-list-size |
|
| |
|
| Several features of our locals wordlist implementation make these |
@example |
| operations easy to implement: The locals wordlists are organised as |
: fac1 ( n -- n! ) recursive |
| linked lists; the tails of these lists are shared, if the lists |
dup 0> if |
| contain some of the same locals; and the address of a name is greater |
dup 1- fac1 * |
| than the address of the names behind it in the list. |
else |
| |
drop 1 |
| |
endif ; |
| |
7 fac1 . |
| |
|
| Another important implementation detail is the variable |
: fac2 ( n -- n! ) |
| @code{dead-code}. It is used by @code{BEGIN} and @code{THEN} to |
dup 0> if |
| determine if they can be reached directly or only through the branch |
dup 1- recurse * |
| that they resolve. @code{dead-code} is set by @code{UNREACHABLE}, |
else |
| @code{AHEAD}, @code{EXIT} etc., and cleared at the start of a colon |
drop 1 |
| definition, by @code{BEGIN} and usually by @code{THEN}. |
endif ; |
| |
8 fac2 . |
| |
@end example |
| |
|
| Counted loops are similar to other loops in most respects, but |
@assignment |
| @code{LEAVE} requires special attention: It performs basically the same |
Write a recursive definition for computing the nth Fibonacci number. |
| service as @code{AHEAD}, but it does not create a control-flow stack |
@endassignment |
| entry. Therefore the information has to be stored elsewhere; |
|
| traditionally, the information was stored in the target fields of the |
|
| branches created by the @code{LEAVE}s, by organizing these fields into a |
|
| linked list. Unfortunately, this clever trick does not provide enough |
|
| space for storing our extended control flow information. Therefore, we |
|
| introduce another stack, the leave stack. It contains the control-flow |
|
| stack entries for all unresolved @code{LEAVE}s. |
|
| |
|
| Local names are kept until the end of the colon definition, even if |
Reference (including indirect recursion): @xref{Calls and returns}. |
| they are no longer visible in any control-flow path. In a few cases |
|
| this may lead to increased space needs for the locals name area, but |
|
| usually less than reclaiming this space would cost in code size. |
|
| |
|
| |
|
| @node ANS Forth locals, , Gforth locals, Locals |
@node Leaving definitions or loops Tutorial, Return Stack Tutorial, Recursion Tutorial, Tutorial |
| @subsection ANS Forth locals |
@section Leaving definitions or loops |
| @cindex locals, ANS Forth style |
@cindex leaving definitions, tutorial |
| |
@cindex leaving loops, tutorial |
| |
|
| The ANS Forth locals wordset does not define a syntax for locals, but |
@code{EXIT} exits the current definition right away. For every counted |
| words that make it possible to define various syntaxes. One of the |
loop that is left in this way, an @code{UNLOOP} has to be performed |
| possible syntaxes is a subset of the syntax we used in the Gforth locals |
before the @code{EXIT}: |
| wordset, i.e.: |
|
| |
|
| |
@c !! real examples |
| @example |
@example |
| @{ local1 local2 ... -- comment @} |
: ... |
| @end example |
... u+do |
| or |
... if |
| @example |
... unloop exit |
| @{ local1 local2 ... @} |
endif |
| |
... |
| |
loop |
| |
... ; |
| @end example |
@end example |
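
A concrete sketch of this pattern, assuming a Forth system with
@code{u+do} and @code{endif}; the names @code{has-zero?} and
@code{samples} are made up for illustration:

@example
: has-zero? ( addr u -- f )
  \ f is true if one of the u cells starting at addr contains 0
  0 u+do
    dup i cells + @@ 0= if
      drop true unloop exit
    endif
  loop
  drop false ;
create samples 3 , 0 , 5 ,
samples 3 has-zero? .   \ prints -1
@end example

The @code{unloop} discards the loop parameters before @code{exit}
returns to the caller.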
| |
|
| The order of the locals corresponds to the order in a stack comment. The |
@code{LEAVE} leaves the innermost counted loop right away: |
| restrictions are: |
|
| |
|
| @itemize @bullet |
@example |
| @item |
: ... |
| Locals can only be cell-sized values (no type specifiers are allowed). |
... u+do |
| @item |
... if |
| Locals can be defined only outside control structures. |
... leave |
| @item |
endif |
| Locals can interfere with explicit usage of the return stack. For the |
... |
| exact (and long) rules, see the standard. If you don't use return stack |
loop |
| accessing words in a definition using locals, you will be all right. The |
... ; |
| purpose of this rule is to make locals implementation on the return |
@end example |
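
A concrete sketch of @code{leave}, again assuming @code{u+do} and
@code{endif}; the names @code{print-until-zero} and @code{readings} are
made up for illustration. Unlike @code{exit}, @code{leave} continues
with the code after @code{loop}, so the final @code{drop} is still
performed:

@example
: print-until-zero ( addr u -- )
  \ print the cells starting at addr, stopping at the first 0
  0 u+do
    dup i cells + @@
    dup 0= if
      drop leave
    endif
    .
  loop
  drop ;
create readings 3 , 5 , 0 , 7 ,
readings 4 print-until-zero   \ prints 3 5
@end example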
| stack easier. |
|
| @item |
|
| The whole definition must be in one line. |
|
| @end itemize |
|
| |
|
| Locals defined in this way behave like @code{VALUE}s (@xref{Simple |
@c !! example |
| Defining Words}). I.e., they are initialized from the stack. Using their |
|
| name produces their value. Their value can be changed using @code{TO}. |
|
| |
|
| Since this syntax is supported by Gforth directly, you need not do |
Reference: @ref{Calls and returns}, @ref{Counted Loops}. |
| anything to use it. If you want to port a program using this syntax to |
|
| another ANS Forth system, use @file{compat/anslocal.fs} to implement the |
|
| syntax on the other system. |
|
| |
|
| Note that a syntax shown in the standard, section A.13 looks |
|
| similar, but is quite different in having the order of locals |
|
| reversed. Beware! |
|
| |
|
| The ANS Forth locals wordset itself consists of the following word |
@node Return Stack Tutorial, Memory Tutorial, Leaving definitions or loops Tutorial, Tutorial |
| |
@section Return Stack |
| |
@cindex return stack tutorial |
| |
|
| doc-(local) |
In addition to the data stack Forth also has a second stack, the return |
| |
stack; most Forth systems store the return addresses of procedure calls |
| |
there (thus its name). Programmers can also use this stack: |
| |
|
| The ANS Forth locals extension wordset defines a syntax, but it is so |
@example |
| awful that we strongly recommend not to use it. We have implemented this |
: foo ( n1 n2 -- ) |
| syntax to make porting to Gforth easy, but do not document it here. The |
.s |
| problem with this syntax is that the locals are defined in an order |
>r .s |
| reversed with respect to the standard stack comment notation, making |
r@@ . |
| programs harder to read, and easier to misread and miswrite. The only |
>r .s |
| merit of this syntax is that it is easy to implement using the ANS Forth |
r@@ . |
| locals wordset. |
r> . |
| |
r@@ . |
| |
r> . ; |
| |
1 2 foo |
| |
@end example |
| |
|
| @node Defining Words, Structures, Locals, Words |
@code{>r} takes an element from the data stack and pushes it onto the |
| @section Defining Words |
return stack; conversely, @code{r>} moves an element from the return to
| @cindex defining words |
the data stack; @code{r@@} pushes a copy of the top of the return stack |
| |
on the data stack.
| |
|
| @menu |
Forth programmers usually use the return stack for storing data |
| * Simple Defining Words:: |
temporarily, if using the data stack alone would be too complex, and |
| * Colon Definitions:: |
factoring and locals are not an option: |
| * User-defined Defining Words:: |
|
| * Supplying names:: |
|
| * Interpretation and Compilation Semantics:: |
|
| @end menu |
|
| |
|
| @node Simple Defining Words, Colon Definitions, Defining Words, Defining Words |
@example |
| @subsection Simple Defining Words |
: 2swap ( x1 x2 x3 x4 -- x3 x4 x1 x2 ) |
| @cindex simple defining words |
rot >r rot r> ; |
| @cindex defining words, simple |
@end example |
| |
|
| doc-constant |
The return address of the definition and the loop control parameters of |
| doc-2constant |
counted loops usually reside on the return stack, so you have to take |
| doc-fconstant |
all items, that you have pushed on the return stack in a colon |
| doc-variable |
definition or counted loop, from the return stack before the definition |
| doc-2variable |
or loop ends. You cannot access items that you pushed on the return |
| doc-fvariable |
stack outside some definition or loop within that definition or loop.
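
As a small sketch of the balancing rule inside a counted loop (the name
@code{sum-below} is hypothetical, and parking the index on the return
stack serves no purpose here beyond illustration):

@example
: sum-below ( u -- sum )
  0 swap 0 ?do
    i >r   \ push inside the loop body ...
    r> +   \ ... and pop again before LOOP needs its parameters
  loop ;

5 sum-below .
@end example

This should print 10; moving the @code{r>} after the @code{loop} would
violate the rule above.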
| doc-create |
|
| doc-user |
|
| doc-value |
|
| doc-to |
|
| doc-defer |
|
| doc-is |
|
| |
|
| @node Colon Definitions, User-defined Defining Words, Simple Defining Words, Defining Words |
If you miscount the return stack items, this usually ends in a crash: |
| @subsection Colon Definitions |
|
| @cindex colon definitions |
|
| |
|
| @example |
@example |
| : name ( ... -- ... ) |
: crash ( n -- ) |
| word1 word2 word3 ; |
>r ; |
| |
5 crash |
| @end example |
@end example |
| |
|
| creates a word called @code{name}, that, upon execution, executes |
You cannot mix using locals and using the return stack (according to the |
| @code{word1 word2 word3}. @code{name} is a @dfn{(colon) definition}. |
standard; Gforth has no problem). However, they solve the same |
| |
problems, so this shouldn't be an issue. |
| |
|
| The explanation above is somewhat superficial. @xref{Interpretation and |
@assignment |
| Compilation Semantics} for an in-depth discussion of some of the issues |
Can you rewrite any of the definitions you wrote until now in a better |
| involved. |
way using the return stack? |
| |
@endassignment |
| |
|
| doc-: |
Reference: @ref{Return stack}. |
| doc-; |
|
| |
|
| @node User-defined Defining Words, Supplying names, Colon Definitions, Defining Words |
|
| @subsection User-defined Defining Words |
|
| @cindex user-defined defining words |
|
| @cindex defining words, user-defined |
|
| |
|
| You can create new defining words simply by wrapping defining-time code |
@node Memory Tutorial, Characters and Strings Tutorial, Return Stack Tutorial, Tutorial |
| around existing defining words and putting the sequence in a colon |
@section Memory |
| definition. |
@cindex memory access/allocation tutorial |
| |
|
| @cindex @code{CREATE} ... @code{DOES>} |
You can create a global variable @code{v} with |
| If you want the words defined with your defining words to behave |
|
| differently from words defined with standard defining words, you can |
|
| write your defining word like this: |
|
| |
|
| @example |
@example |
| : def-word ( "name" -- ) |
variable v ( -- addr ) |
| Create @var{code1} |
@end example |
| DOES> ( ... -- ... ) |
|
| @var{code2} ; |
|
| |
|
| def-word name |
@code{v} pushes the address of a cell in memory on the stack. This cell |
| |
was reserved by @code{variable}. You can use @code{!} (store) to store |
| |
values into this cell and @code{@@} (fetch) to load the value from the |
| |
cell onto the stack:
| |
|
| |
@example |
| |
v . |
| |
5 v ! .s |
| |
v @@ . |
| @end example |
@end example |
| |
|
| Technically, this fragment defines a defining word @code{def-word}, and |
You can see a raw dump of memory with @code{dump}: |
| a word @code{name}; when you execute @code{name}, the address of the |
|
| body of @code{name} is put on the data stack and @var{code2} is executed |
|
| (the address of the body of @code{name} is the address @code{HERE} |
|
| returns immediately after the @code{CREATE}). |
|
| |
|
| In other words, if you make the following definitions: |
@example |
| |
v 1 cells .s dump |
| |
@end example |
| |
|
| |
@code{Cells ( n1 -- n2 )} gives you the number of bytes (or, more |
| |
generally, address units (aus)) that @code{n1 cells} occupy. You can |
| |
also reserve more memory: |
| |
|
| @example |
@example |
| : def-word1 ( "name" -- ) |
create v2 20 cells allot |
| Create @var{code1} ; |
v2 20 cells dump |
| |
@end example |
| |
|
| : action1 ( ... -- ... ) |
creates a word @code{v2} and reserves 20 uninitialized cells; the |
| @var{code2} ; |
address pushed by @code{v2} points to the start of these 20 cells. You |
| |
can use address arithmetic to access these cells: |
| |
|
| |
@example |
| |
3 v2 5 cells + ! |
| |
v2 20 cells dump |
| |
@end example |
| |
|
| |
You can reserve and initialize memory with @code{,}: |
| |
|
| def-word name1 |
@example |
| |
create v3 |
| |
5 , 4 , 3 , 2 , 1 , |
| |
v3 @@ . |
| |
v3 cell+ @@ . |
| |
v3 2 cells + @@ . |
| |
v3 5 cells dump |
| @end example |
@end example |
| |
|
| Using @code{name1 action1} is equivalent to using @code{name}. |
@assignment |
| |
Write a definition @code{vsum ( addr u -- n )} that computes the sum of |
| |
@code{u} cells, with the first of these cells at @code{addr}, the next |
| |
one at @code{addr cell+} etc. |
| |
@endassignment |
| |
|
| E.g., you can implement @code{Constant} in this way: |
You can also reserve memory without creating a new word: |
| |
|
| @example |
@example |
| : constant ( w "name" -- ) |
here 10 cells allot . |
| create , |
here . |
| DOES> ( -- w ) |
|
| @@ ; |
|
| @end example |
@end example |
| |
|
| When you create a constant with @code{5 constant five}, first a new word |
@code{Here} pushes the start address of the memory area. You should |
| @code{five} is created, then the value 5 is laid down in the body of |
store it somewhere, or you will have a hard time finding the memory area |
| @code{five} with @code{,}. When @code{five} is invoked, the address of |
again. |
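
One way to keep the address, sketched here with the hypothetical name
@code{buf}, is to turn it into a constant right away:

@example
here 10 cells allot constant buf  \ remember the start address
7 buf 3 cells + !
buf 3 cells + @@ .
@end example

This should print 7. Defining @code{buf} creates a new word, so the
reserved area is no longer the most recent item in the dictionary.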
| the body is put on the stack, and @code{@@} retrieves the value 5. |
|
| |
|
| @cindex stack effect of @code{DOES>}-parts |
@code{Allot} manages dictionary memory. The dictionary memory contains |
| @cindex @code{DOES>}-parts, stack effect |
the system's data structures for words etc. on Gforth and most other |
| In the example above the stack comment after the @code{DOES>} specifies |
Forth systems. It is managed like a stack: You can free the memory that |
| the stack effect of the defined words, not the stack effect of the |
you have just @code{allot}ed with |
| following code (the following code expects the address of the body on |
|
| the top of stack, which is not reflected in the stack comment). This is |
|
| the convention that I use and recommend (it clashes a bit with using |
|
| locals declarations for stack effect specification, though). |
|
| |
|
| @subsubsection Applications of @code{CREATE..DOES>} |
@example |
| @cindex @code{CREATE} ... @code{DOES>}, applications |
-10 cells allot |
| |
here . |
| |
@end example |
| |
|
| You may wonder how to use this feature. Here are some usage patterns: |
Note that you cannot do this if you have created a new word in the |
| |
meantime (because then your @code{allot}ed memory is no longer on the |
| |
top of the dictionary ``stack''). |
| |
|
| |
Alternatively, you can use @code{allocate} and @code{free} which allow |
| |
freeing memory in any order: |
| |
|
| @cindex factoring similar colon definitions |
|
| When you see a sequence of code occurring several times, and you can |
|
| identify a meaning, you will factor it out as a colon definition. When |
|
| you see similar colon definitions, you can factor them using |
|
| @code{CREATE..DOES>}. E.g., an assembler usually defines several words |
|
| that look very similar: |
|
| @example |
@example |
| : ori, ( reg-target reg-source n -- ) |
10 cells allocate throw .s |
| 0 asm-reg-reg-imm ; |
20 cells allocate throw .s |
| : andi, ( reg-target reg-source n -- ) |
swap |
| 1 asm-reg-reg-imm ; |
free throw |
| |
free throw |
| @end example |
@end example |
| |
|
| This could be factored with: |
The @code{throw}s deal with errors (e.g., out of memory). |
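
In practice you will usually keep the address somewhere instead of
juggling it on the stack. A minimal sketch, assuming the hypothetical
name @code{heapbuf}:

@example
10 cells allocate throw constant heapbuf
42 heapbuf !
heapbuf @@ .
heapbuf free throw
@end example

This should print 42; after the @code{free}, the address in
@code{heapbuf} must not be used any more.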
| |
|
| |
And there is also a |
| |
@uref{http://www.complang.tuwien.ac.at/forth/garbage-collection.zip, |
| |
garbage collector}, which eliminates the need to @code{free} memory |
| |
explicitly. |
| |
|
| |
Reference: @ref{Memory}. |
| |
|
| |
|
| |
@node Characters and Strings Tutorial, Alignment Tutorial, Memory Tutorial, Tutorial |
| |
@section Characters and Strings |
| |
@cindex strings tutorial |
| |
@cindex characters tutorial |
| |
|
| |
On the stack characters take up a cell, like numbers. In memory they |
| |
have their own size (one 8-bit byte on most systems), and therefore |
| |
require their own words for memory access: |
| |
|
| @example |
@example |
| : reg-reg-imm ( op-code -- ) |
create v4 |
| create , |
104 c, 97 c, 108 c, 108 c, 111 c, |
| DOES> ( reg-target reg-source n -- ) |
v4 4 chars + c@@ . |
| @@ asm-reg-reg-imm ; |
v4 5 chars dump |
| |
@end example |
| |
|
| 0 reg-reg-imm ori, |
The preferred representation of strings on the stack is @code{addr |
| 1 reg-reg-imm andi, |
u-count}, where @code{addr} is the address of the first character and |
| |
@code{u-count} is the number of characters in the string. |
| |
|
| |
@example |
| |
v4 5 type |
| @end example |
@end example |
| |
|
| @cindex currying |
You get a string constant with |
| Another view of @code{CREATE..DOES>} is to consider it as a crude way to |
|
| supply a part of the parameters for a word (known as @dfn{currying} in |
|
| the functional language community). E.g., @code{+} needs two |
|
| parameters. Creating versions of @code{+} with one parameter fixed can |
|
| be done like this: |
|
| @example |
@example |
| : curry+ ( n1 -- ) |
s" hello, world" .s |
| create , |
type |
| DOES> ( n2 -- n1+n2 ) |
@end example |
| @@ + ; |
|
| |
|
| 3 curry+ 3+ |
Make sure you have a space between @code{s"} and the string; @code{s"} |
| -2 curry+ 2- |
is a normal Forth word and must be delimited with white space (try what |
| |
happens when you remove the space). |
| |
|
| |
However, this interpretive use of @code{s"} is quite restricted: the |
| |
string exists only until the next call of @code{s"} (some Forth systems |
| |
keep more than one of these strings, but usually they still have a |
| |
limited lifetime). |
| |
|
| |
@example |
| |
s" hello," s" world" .s |
| |
type |
| |
type |
| @end example |
@end example |
| |
|
| @subsubsection The gory details of @code{CREATE..DOES>} |
You can also use @code{s"} in a definition, and the resulting |
| @cindex @code{CREATE} ... @code{DOES>}, details |
strings then live forever (well, for as long as the definition): |
| |
|
| doc-does> |
@example |
| |
: foo s" hello," s" world" ; |
| |
foo .s |
| |
type |
| |
type |
| |
@end example |
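
Because a compiled string stays around, a word can act as a named
string constant. A minimal sketch (the name @code{greeting} is
hypothetical):

@example
: greeting ( -- addr u )
  s" hello, world" ;

greeting type
greeting nip .
@end example

The second line of use drops the address and prints the count, which
should be 12 here.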
| |
|
| |
@assignment |
| |
@code{Emit ( c -- )} types @code{c} as character (not a number). |
| |
Implement @code{type ( addr u -- )}. |
| |
@endassignment |
| |
|
| |
Reference: @ref{Memory Blocks}. |
| |
|
| |
|
| |
@node Alignment Tutorial, Interpretation and Compilation Semantics and Immediacy Tutorial, Characters and Strings Tutorial, Tutorial |
| |
@section Alignment |
| |
@cindex alignment tutorial |
| |
@cindex memory alignment tutorial |
| |
|
| |
On many processors cells have to be aligned in memory, if you want to |
| |
access them with @code{@@} and @code{!} (and even if the processor does |
| |
not require alignment, access to aligned cells is faster). |
| |
|
| |
@code{Create} aligns @code{here} (i.e., the place where the next |
| |
allocation will occur, and that the @code{create}d word points to). |
| |
Likewise, the memory produced by @code{allocate} starts at an aligned |
| |
address. Adding a number of @code{cells} to an aligned address produces |
| |
another aligned address. |
| |
|
| |
However, address arithmetic involving @code{char+} and @code{chars} can |
| |
create an address that is not cell-aligned. @code{Aligned ( addr -- |
| |
a-addr )} produces the next aligned address: |
| |
|
| @cindex @code{DOES>} in a separate definition |
|
| This means that you need not use @code{CREATE} and @code{DOES>} in the |
|
| same definition; E.g., you can put the @code{DOES>}-part in a separate |
|
| definition. This allows us to, e.g., select among different DOES>-parts: |
|
| @example |
@example |
| : does1 |
v3 char+ aligned .s @@ . |
| DOES> ( ... -- ... ) |
v3 char+ .s @@ . |
| ... ; |
@end example |
| |
|
| : does2 |
Similarly, @code{align} advances @code{here} to the next aligned |
| DOES> ( ... -- ... ) |
address: |
| ... ; |
|
| |
|
| : def-word ( ... -- ... ) |
@example |
| create ... |
create v5 97 c, |
| IF |
here . |
| does1 |
align here . |
| ELSE |
1000 , |
| does2 |
|
| ENDIF ; |
|
| @end example |
@end example |
| |
|
| @cindex @code{DOES>} in interpretation state |
Note that you should use aligned addresses even if your processor does |
| In a standard program you can apply a @code{DOES>}-part only if the last |
not require them, if you want your program to be portable. |
| word was defined with @code{CREATE}. In Gforth, the @code{DOES>}-part |
|
| will override the behaviour of the last word defined in any case. In a |
Reference: @ref{Address arithmetic}. |
| standard program, you can use @code{DOES>} only in a colon |
|
| definition. In Gforth, you can also use it in interpretation state, in a |
|
| kind of one-shot mode: |
@node Interpretation and Compilation Semantics and Immediacy Tutorial, Execution Tokens Tutorial, Alignment Tutorial, Tutorial |
| |
@section Interpretation and Compilation Semantics and Immediacy |
| |
@cindex semantics tutorial |
| |
@cindex interpretation semantics tutorial |
| |
@cindex compilation semantics tutorial |
| |
@cindex immediate, tutorial |
| |
|
| |
When a word is compiled, it behaves differently from being interpreted. |
| |
E.g., consider @code{+}: |
| |
|
| @example |
@example |
| CREATE name ( ... -- ... ) |
1 2 + . |
| @var{initialization} |
: foo + ; |
| DOES> |
|
| @var{code} ; |
|
| @end example |
@end example |
| This is equivalent to the standard |
|
| |
These two behaviours are known as compilation and interpretation |
| |
semantics. For normal words (e.g., @code{+}), the compilation semantics |
| |
is to append the interpretation semantics to the currently defined word |
| |
(@code{foo} in the example above). I.e., when @code{foo} is executed |
| |
later, the interpretation semantics of @code{+} (i.e., adding two |
| |
numbers) will be performed. |
| |
|
| |
However, there are words with non-default compilation semantics, e.g., |
| |
the control-flow words like @code{if}. You can use @code{immediate} to |
| |
change the compilation semantics of the last defined word to be equal to |
| |
the interpretation semantics: |
| |
|
| @example |
@example |
| :noname |
: [FOO] ( -- ) |
| DOES> |
5 . ; immediate |
| @var{code} ; |
|
| CREATE name EXECUTE ( ... -- ... ) |
[FOO] |
| @var{initialization} |
: bar ( -- ) |
| |
[FOO] ; |
| |
bar |
| |
see bar |
| @end example |
@end example |
| |
|
| You can get the address of the body of a word with |
Two conventions to mark words with non-default compilation semantics are
| |
names with brackets (more frequently used) and to write them all in |
| |
upper case (less frequently used). |
| |
|
| doc->body |
In Gforth (and many other systems) you can also remove the |
| |
interpretation semantics with @code{compile-only} (the compilation |
| |
semantics is derived from the original interpretation semantics): |
| |
|
| @node Supplying names, Interpretation and Compilation Semantics, User-defined Defining Words, Defining Words |
@example |
| @subsection Supplying names for the defined words |
: flip ( -- ) |
| @cindex names for defined words |
6 . ; compile-only \ but not immediate |
| @cindex defining words, name parameter |
flip |
| |
|
| @cindex defining words, name given in a string |
: flop ( -- ) |
| By default, defining words take the names for the defined words from the |
flip ; |
| input stream. Sometimes you want to supply the name from a string. You |
flop |
| can do this with |
@end example |
| |
|
| doc-nextname |
In this example the interpretation semantics of @code{flop} is equal to |
| |
the original interpretation semantics of @code{flip}. |
| |
|
| E.g., |
The text interpreter has two states: in interpret state, it performs the |
| |
interpretation semantics of words it encounters; in compile state, it |
| |
performs the compilation semantics of these words. |
| |
|
| |
Among other things, @code{:} switches into compile state, and @code{;} |
| |
switches back to interpret state. They contain the factors @code{]} |
| |
(switch to compile state) and @code{[} (switch to interpret state), that |
| |
do nothing but switch the state. |
| |
|
| @example |
@example |
| s" foo" nextname create |
: xxx ( -- ) |
| |
[ 5 . ] |
| |
; |
| |
|
| |
xxx |
| |
see xxx |
| @end example |
@end example |
| is equivalent to |
|
| |
These brackets are also the source of the naming convention mentioned |
| |
above. |
| |
|
| |
Reference: @ref{Interpretation and Compilation Semantics}. |
| |
|
| |
|
| |
@node Execution Tokens Tutorial, Exceptions Tutorial, Interpretation and Compilation Semantics and Immediacy Tutorial, Tutorial |
| |
@section Execution Tokens |
| |
@cindex execution tokens tutorial |
| |
@cindex XT tutorial |
| |
|
| |
@code{' word} gives you the execution token (XT) of a word. The XT is a |
| |
cell representing the interpretation semantics of a word. You can |
| |
execute this semantics with @code{execute}: |
| |
|
| @example |
@example |
| create foo |
' + .s |
| |
1 2 rot execute . |
| @end example |
@end example |
| |
|
| @cindex defining words without name |
The XT is similar to a function pointer in C. However, parameter |
| Sometimes you want to define a word without a name. You can do this with |
passing through the stack makes it a little more flexible: |
| |
|
| doc-noname |
@example |
| |
: map-array ( ... addr u xt -- ... ) |
| |
\ executes xt ( ... x -- ... ) for every element of the array starting |
| |
\ at addr and containing u elements |
| |
@{ xt @} |
| |
cells over + swap ?do |
| |
i @@ xt execute |
| |
1 cells +loop ; |
| |
|
| @cindex execution token of last defined word |
create a 3 , 4 , 2 , -1 , 4 , |
| To make any use of the newly defined word, you need its execution |
a 5 ' . map-array .s |
| token. You can get it with |
0 a 5 ' + map-array . |
| |
s" max-n" environment? drop .s |
| |
a 5 ' min map-array . |
| |
@end example |
| |
|
| doc-lastxt |
You can use @code{map-array} with the XTs of words that consume one element
| |
more than they produce. In theory you can also use it with other XTs, |
| |
but the stack effect then depends on the size of the array, which is |
| |
hard to understand. |
| |
|
| |
Since XTs are cell-sized, you can store them in memory and manipulate |
| |
them on the stack like other cells. You can also compile the XT into a |
| |
word with @code{compile,}: |
| |
|
| E.g., you can initialize a deferred word with an anonymous colon |
|
| definition: |
|
| @example |
@example |
| Defer deferred |
: foo1 ( n1 n2 -- n ) |
| noname : ( ... -- ... ) |
[ ' + compile, ] ; |
| ... ; |
see foo1
| lastxt IS deferred |
|
| @end example |
@end example |
| |
|
| @code{lastxt} also works when the last word was not defined as |
This is non-standard, because @code{compile,} has no compilation |
| @code{noname}. |
semantics in the standard, but it works in good Forth systems. For the |
| |
broken ones, use |
| |
|
| The standard has also recognized the need for anonymous words and |
@example |
| provides |
: [compile,] compile, ; immediate |
| |
|
| doc-:noname |
: foo1 ( n1 n2 -- n ) |
| |
[ ' + ] [compile,] ; |
| |
see foo1
| |
@end example |
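
Since XTs are ordinary cells, they can also be kept in a variable and
swapped at run-time. A minimal sketch, assuming the hypothetical
variable name @code{binop}:

@example
variable binop

' + binop !
3 4 binop @@ execute .

' * binop !
3 4 binop @@ execute .
@end example

This should print 7 and then 12.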
| |
|
| |
@code{'} is a word with default compilation semantics; it parses the |
| |
next word when its interpretation semantics are executed, not during |
| |
compilation: |
| |
|
| This leaves the execution token for the word on the stack after the |
|
| closing @code{;}. You can rewrite the last example with @code{:noname}: |
|
| @example |
@example |
| Defer deferred |
: foo ( -- xt ) |
| :noname ( ... -- ... ) |
' ; |
| ... ; |
see foo |
| IS deferred |
: bar ( ... "word" -- ... ) |
| |
' execute ; |
| |
see bar |
| |
1 2 bar + . |
| @end example |
@end example |
| |
|
| @node Interpretation and Compilation Semantics, , Supplying names, Defining Words |
You often want to parse a word during compilation and compile its XT so |
| @subsection Interpretation and Compilation Semantics |
it will be pushed on the stack at run-time. @code{[']} does this: |
| @cindex semantics, interpretation and compilation |
|
| |
@example |
| |
: xt-+ ( -- xt ) |
| |
['] + ; |
| |
see xt-+ |
| |
1 2 xt-+ execute . |
| |
@end example |
| |
|
| |
Many programmers tend to see @code{'} and the word it parses as one |
| |
unit, and expect it to behave like @code{[']} when compiled, and are |
| |
confused by the actual behaviour. If you are, just remember that the |
| |
Forth system just takes @code{'} as one unit and has no idea that it is |
| |
a parsing word (attempts to accommodate programmers on this issue have
| |
usually resulted in even worse pitfalls, see |
| |
@uref{http://www.complang.tuwien.ac.at/papers/ertl98.ps.gz, |
| |
@code{State}-smartness---Why it is evil and How to Exorcise it}). |
| |
|
| |
Note that the state of the interpreter does not come into play when |
| |
creating and executing XTs. I.e., even when you execute @code{'} in |
| |
compile state, it still gives you the interpretation semantics. And |
| |
whatever that state is, @code{execute} performs the semantics |
| |
represented by the XT (i.e., for XTs produced with @code{'} the |
| |
interpretation semantics). |
| |
|
| |
Reference: @ref{Tokens for Words}. |
| |
|
| |
|
| |
@node Exceptions Tutorial, Defining Words Tutorial, Execution Tokens Tutorial, Tutorial |
| |
@section Exceptions |
| |
@cindex exceptions tutorial |
| |
|
| |
@code{throw ( n -- )} causes an exception unless @code{n} is zero.
| |
|
| |
@example |
| |
100 throw .s |
| |
0 throw .s |
| |
@end example |
| |
|
| |
@code{catch ( ... xt -- ... n )} behaves similarly to @code{execute}, but
| |
it catches exceptions and pushes the number of the exception on the |
| |
stack (or 0, if the xt executed without exception). If there was an |
| |
exception, the stacks have the same depth as when entering @code{catch}: |
| |
|
| |
@example |
| |
.s |
| |
3 0 ' / catch .s |
| |
3 2 ' / catch .s |
| |
@end example |
| |
|
| |
@assignment |
| |
Try the same with @code{execute} instead of @code{catch}. |
| |
@endassignment |
| |
|
| |
@code{Throw} always jumps to the dynamically next enclosing |
| |
@code{catch}, even if it has to leave several call levels to achieve |
| |
this: |
| |
|
| |
@example |
| |
: foo 100 throw ; |
| |
: foo1 foo ." after foo" ; |
| |
: bar ['] foo1 catch ; |
| |
bar . |
| |
@end example |
| |
|
| |
It is often important to restore a value upon leaving a definition, even |
| |
if the definition is left through an exception. You can ensure this |
| |
like this: |
| |
|
| |
@example |
| |
: ... |
| |
save-x |
| |
['] word-changing-x catch ( ... n ) |
| |
restore-x |
| |
( ... n ) throw ; |
| |
@end example |
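
As a concrete sketch of this pattern, the following word prints a number
in hexadecimal and restores @code{base} even if printing fails. The name
@code{print-hex} is hypothetical, and saving the old value on the return
stack is just one possible choice for @code{save-x}:

@example
: print-hex ( n -- )
  base @@ >r          \ save-x
  hex ['] . catch     ( ... n )
  r> base !           \ restore-x
  throw ;

255 print-hex
@end example

This should print FF and leave @code{base} unchanged.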
| |
|
| |
Gforth provides an alternative syntax in addition to @code{catch}: |
| |
@code{try ... recover ... endtry}. If the code between @code{try} and |
| |
@code{recover} has an exception, the stack depths are restored, the |
| |
exception number is pushed on the stack, and the code between |
| |
@code{recover} and @code{endtry} is performed. E.g., the definition for |
| |
@code{catch} is |
| |
|
| |
@example |
| |
: catch ( x1 .. xn xt -- y1 .. ym 0 / z1 .. zn error ) \ exception |
| |
try |
| |
execute 0 |
| |
recover |
| |
nip |
| |
endtry ; |
| |
@end example |
| |
|
| |
The equivalent to the restoration code above is |
| |
|
| |
@example |
| |
: ... |
| |
save-x |
| |
try |
| |
word-changing-x |
| |
  endtry
| |
restore-x |
| |
throw ; |
| |
@end example |
| |
|
| |
As you can see, the @code{recover} part is optional. |
| |
|
| |
Reference: @ref{Exception Handling}. |
| |
|
| |
|
| |
@node Defining Words Tutorial, Arrays and Records Tutorial, Exceptions Tutorial, Tutorial |
| |
@section Defining Words |
| |
@cindex defining words tutorial |
| |
@cindex does> tutorial |
| |
@cindex create...does> tutorial |
| |
|
| |
@c before semantics? |
| |
|
| |
@code{:}, @code{create}, and @code{variable} are definition words: They |
| |
define other words. @code{Constant} is another definition word: |
| |
|
| |
@example |
| |
5 constant foo |
| |
foo . |
| |
@end example |
| |
|
| |
You can also use the prefixes @code{2} (double-cell) and @code{f} |
| |
(floating point) with @code{variable} and @code{constant}. |
| |
|
| |
You can also define your own defining words. E.g.: |
| |
|
| |
@example |
| |
: variable ( "name" -- ) |
| |
create 0 , ; |
| |
@end example |
| |
|
| |
You can also define defining words that create words that do something |
| |
other than just producing their address: |
| |
|
| |
@example |
| |
: constant ( n "name" -- ) |
| |
create , |
| |
does> ( -- n ) |
| |
( addr ) @@ ; |
| |
|
| |
5 constant foo |
| |
foo . |
| |
@end example |
| |
|
| |
The definition of @code{constant} above ends at the @code{does>}; i.e., |
| |
@code{does>} replaces @code{;}, but it also does something else: It |
| |
changes the last defined word such that it pushes the address of the |
| |
body of the word and then performs the code after the @code{does>} |
| |
whenever it is called. |
| |
|
| |
In the example above, @code{constant} uses @code{,} to store 5 into the |
| |
body of @code{foo}. When @code{foo} executes, it pushes the address of |
| |
the body onto the stack, then (in the code after the @code{does>}) |
| |
fetches the 5 from there. |
| |
|
| |
The stack comment near the @code{does>} reflects the stack effect of the |
| |
defined word, not the stack effect of the code after the @code{does>} |
| |
(the difference is that the code expects the address of the body that |
| |
the stack comment does not show). |
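
The same mechanism works for any stored parameter. A minimal sketch,
with the hypothetical names @code{scaled} and @code{triple}:

@example
: scaled ( n "name" -- )
  create ,
does> ( n1 -- n2 )
  ( n1 addr ) @@ * ;

3 scaled triple
7 triple .
@end example

Here @code{,} stores 3 in the body of @code{triple}; executing
@code{triple} pushes the body address, fetches the 3 and multiplies, so
this should print 21.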
| |
|
| |
You can use these definition words to do factoring in cases that involve |
| |
(other) definition words. E.g., a field offset is always added to an |
| |
address. Instead of defining |
| |
|
| |
@example |
| |
2 cells constant offset-field1 |
| |
@end example |
| |
|
| |
and using this like |
| |
|
| |
@example |
| |
( addr ) offset-field1 + |
| |
@end example |
| |
|
| |
you can define a definition word |
| |
|
| |
@example |
| |
: simple-field ( n "name" -- ) |
| |
create , |
| |
does> ( n1 -- n1+n ) |
| |
( addr ) @@ + ; |
| |
@end example |
| |
|
| |
Definition and use of field offsets now look like this: |
| |
|
| |
@example |
| |
2 cells simple-field field1 |
| |
create mystruct 4 cells allot |
| |
mystruct .s field1 .s drop |
| |
@end example |
| |
|
| |
If you want to do something with the word without performing the code |
| |
after the @code{does>}, you can access the body of a @code{create}d word |
| |
with @code{>body ( xt -- addr )}: |
| |
|
| |
@example |
| |
: value ( n "name" -- ) |
| |
create , |
| |
does> ( -- n1 ) |
| |
@@ ; |
| |
: to ( n "name" -- ) |
| |
' >body ! ; |
| |
|
| |
5 value foo |
| |
foo . |
| |
7 to foo |
| |
foo . |
| |
@end example |
| |
|
| |
@assignment |
| |
Define @code{defer ( "name" -- )}, which creates a word that stores an |
| |
XT (at the start the XT of @code{abort}), and upon execution |
| |
@code{execute}s the XT. Define @code{is ( xt "name" -- )} that stores |
| |
@code{xt} into @code{name}, a word defined with @code{defer}. Indirect |
| |
recursion is one application of @code{defer}. |
| |
@endassignment |
| |
|
| |
Reference: @ref{User-defined Defining Words}. |
| |
|
| |
|
| |
@node Arrays and Records Tutorial, POSTPONE Tutorial, Defining Words Tutorial, Tutorial |
| |
@section Arrays and Records |
| |
@cindex arrays tutorial |
| |
@cindex records tutorial |
| |
@cindex structs tutorial |
| |
|
| |
Forth has no standard words for defining data structures such as arrays |
| |
and records (structs in C terminology), but you can build them yourself |
| |
based on address arithmetic. You can also define words for defining |
| |
arrays and records (@pxref{Defining Words Tutorial,, Defining Words}). |
| |
|
| |
One of the first projects a Forth newcomer sets out upon when learning |
| |
about defining words is an array defining word (possibly for |
| |
n-dimensional arrays). Go ahead and do it, I did it, too; you will |
| |
learn something from it. However, don't be disappointed when you later |
| |
learn that you have little use for these words (inappropriate use would |
| |
be even worse).  I have not yet found a set of useful array words;
| |
the needs are just too diverse, and named, global arrays (the result of |
| |
naive use of defining words) are often not flexible enough (e.g., |
| |
consider how to pass them as parameters). Another such project is a set |
| |
of words to help deal with strings.
| |
|
| |
On the other hand, there is a useful set of record words, and it has |
| |
been defined in @file{compat/struct.fs}; these words are predefined in |
| |
Gforth.  They are explained in depth elsewhere in this manual
| 
(@pxref{Structures}).  The @code{simple-field} example above is
| |
a simplified variant of the fields in this package.
| |
|
| |
|
| |
@node POSTPONE Tutorial, Literal Tutorial, Arrays and Records Tutorial, Tutorial |
| |
@section @code{POSTPONE} |
| |
@cindex postpone tutorial |
| |
|
| |
You can compile the compilation semantics (instead of compiling the |
| |
interpretation semantics) of a word with @code{POSTPONE}: |
| |
|
| |
@example |
| |
: MY-+ ( Compilation: -- ; Run-time of compiled code: n1 n2 -- n ) |
| |
POSTPONE + ; immediate |
| |
: foo ( n1 n2 -- n ) |
| |
MY-+ ; |
| |
1 2 foo . |
| |
see foo |
| |
@end example |
| |
|
| |
During the definition of @code{foo} the text interpreter performs the |
| |
compilation semantics of @code{MY-+}, which performs the compilation |
| |
semantics of @code{+}, i.e., it compiles @code{+} into @code{foo}. |
| |
|
| |
This example also displays separate stack comments for the compilation |
| |
semantics and for the stack effect of the compiled code. For words with |
| |
default compilation semantics these stack effects are usually not |
| |
displayed; the stack effect of the compilation semantics is always |
| |
@code{( -- )} for these words, and the stack effect for the compiled code is |
| |
the stack effect of the interpretation semantics. |
| |
|
| |
Note that the state of the interpreter does not come into play when |
| |
performing the compilation semantics in this way. You can also perform |
| |
it interpretively, e.g.: |
| |
|
| |
@example |
| |
: foo2 ( n1 n2 -- n ) |
| |
[ MY-+ ] ; |
| |
1 2 foo2 . |
| |
see foo2 |
| |
@end example |
| |
|
| |
However, there are some broken Forth systems where this does not always |
| |
work, and therefore this practice has been declared non-standard in |
| |
1999. |
| |
@c !! repair.fs |
| |
|
| |
Here is another example for using @code{POSTPONE}: |
| |
|
| |
@example |
| |
: MY-- ( Compilation: -- ; Run-time of compiled code: n1 n2 -- n ) |
| |
POSTPONE negate POSTPONE + ; immediate compile-only |
| |
: bar ( n1 n2 -- n ) |
| |
MY-- ; |
| |
2 1 bar . |
| |
see bar |
| |
@end example |
| |
|
| |
You can define @code{ENDIF} in this way: |
| |
|
| |
@example |
| |
: ENDIF ( Compilation: orig -- ) |
| |
POSTPONE then ; immediate |
| |
@end example |
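
The same pattern works for control structures that add their own
run-time behaviour. As a minimal sketch (the names @code{UNLESS} and
@code{warn-empty} are made up for this illustration and are not part of
Gforth), the following word compiles @code{0= if}, so that the code up
to the matching @code{then} runs when the flag is false:

@example
: UNLESS ( Compilation: -- orig ; Run-time of compiled code: f -- )
  POSTPONE 0= POSTPONE if ; immediate
: warn-empty ( u -- )
  UNLESS ." empty" then ;
0 warn-empty
5 warn-empty
see warn-empty
@end example

Only the first call should print @code{empty}.
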
| |
|
| |
@assignment |
| |
Write @code{MY-2DUP} that has compilation semantics equivalent to |
| |
@code{2dup}, but compiles @code{over over}. |
| |
@endassignment |
| |
|
| |
@c !! @xref{Macros} for reference |
| |
|
| |
|
| |
@node Literal Tutorial, Advanced macros Tutorial, POSTPONE Tutorial, Tutorial |
| |
@section @code{Literal} |
| |
@cindex literal tutorial |
| |
|
| |
You cannot @code{POSTPONE} numbers: |
| |
|
| |
@example |
| |
: [FOO] POSTPONE 500 ; immediate |
| |
@end example |
| |
|
| |
Instead, you can use @code{LITERAL (compilation: n --; run-time: -- n )}: |
| |
|
| |
@example |
| |
: [FOO] ( compilation: --; run-time: -- n ) |
| |
500 POSTPONE literal ; immediate |
| |
|
| |
: flip [FOO] ; |
| |
flip . |
| |
see flip |
| |
@end example |
| |
|
| |
@code{LITERAL} consumes a number at compile-time (when its compilation |
| |
semantics are executed) and pushes it at run-time (when the code it |
| |
compiled is executed). A frequent use of @code{LITERAL} is to compile a |
| |
number computed at compile time into the current word: |
| |
|
| |
@example |
| |
: bar ( -- n ) |
| |
[ 2 2 + ] literal ; |
| |
see bar |
| |
@end example |
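
As a further sketch of the same idiom (the word name
@code{seconds-per-day} is only an illustration), the multiplication
below is performed once, while the definition is being compiled, and
only the result is stored in the word:

@example
: seconds-per-day ( -- n )
  [ 24 60 * 60 * ] literal ;
seconds-per-day .
see seconds-per-day
@end example

This should print @code{86400}, and the decompiled definition should
contain just that number rather than the three factors.
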
| |
|
| |
@assignment |
| |
Write @code{]L} which allows writing the example above as @code{: bar ( |
| |
-- n ) [ 2 2 + ]L ;} |
| |
@endassignment |
| |
|
| |
@c !! @xref{Macros} for reference |
| |
|
| |
|
| |
@node Advanced macros Tutorial, Compilation Tokens Tutorial, Literal Tutorial, Tutorial |
| |
@section Advanced macros |
| |
@cindex macros, advanced tutorial |
| |
@cindex run-time code generation, tutorial |
| |
|
| |
Reconsider @code{map-array} from @ref{Execution Tokens Tutorial,, |
| |
Execution Tokens}. It frequently performs @code{execute}, a relatively |
| |
expensive operation in some Forth implementations. You can use |
| |
@code{compile,} and @code{POSTPONE} to eliminate these @code{execute}s |
| |
and produce a word that contains the word to be performed directly: |
| |
|
| |
@c use ]] ... [[ |
| |
@example |
| |
: compile-map-array ( compilation: xt -- ; run-time: ... addr u -- ... ) |
| |
\ at run-time, execute xt ( ... x -- ... ) for each element of the |
| |
\ array beginning at addr and containing u elements |
| |
@{ xt @} |
| |
POSTPONE cells POSTPONE over POSTPONE + POSTPONE swap POSTPONE ?do |
| |
POSTPONE i POSTPONE @@ xt compile, |
| |
1 cells POSTPONE literal POSTPONE +loop ; |
| |
|
| |
: sum-array ( addr u -- n ) |
| |
0 rot rot [ ' + compile-map-array ] ; |
| |
see sum-array |
| |
a 5 sum-array . |
| |
@end example |
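
Because @code{compile-map-array} takes an arbitrary execution token,
the same macro can generate other loops. A small sketch, assuming the
array @code{a} from the Execution Tokens section is still defined
(@code{print-array} is a name chosen here for illustration):

@example
: print-array ( addr u -- )
  [ ' . compile-map-array ] ;
see print-array
a 5 print-array
@end example

The decompiled loop should contain a direct call of @code{.} instead of
an @code{execute}.
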
| |
|
| |
You can use the full power of Forth for generating the code; here's an |
| |
example where the code is generated in a loop: |
| |
|
| |
@example |
| |
: compile-vmul-step ( compilation: n --; run-time: n1 addr1 -- n2 addr2 ) |
| |
\ n2=n1+(addr1)*n, addr2=addr1+cell |
| |
POSTPONE tuck POSTPONE @@ |
| |
POSTPONE literal POSTPONE * POSTPONE + |
| |
POSTPONE swap POSTPONE cell+ ; |
| |
|
| |
: compile-vmul ( compilation: addr1 u -- ; run-time: addr2 -- n ) |
| |
\ n=v1*v2 (inner product), where the v_i are represented as addr_i u |
| |
0 postpone literal postpone swap |
| |
[ ' compile-vmul-step compile-map-array ] |
| |
postpone drop ; |
| |
see compile-vmul |
| |
|
| |
: a-vmul ( addr -- n ) |
| |
\ n=a*v, where v is a vector that's as long as a and starts at addr |
| |
[ a 5 compile-vmul ] ; |
| |
see a-vmul |
| |
a a-vmul . |
| |
@end example |
| |
|
| |
This example uses @code{compile-map-array} to show off, but you could |
| |
also use @code{map-array} instead (try it now!). |
| |
|
| |
You can use this technique for efficient multiplication of large |
| |
matrices. In matrix multiplication, you multiply every row of one |
| |
matrix with every column of the other matrix. You can generate the code |
| |
for one row once, and use it for every column. The only downside of |
| |
this technique is that it is cumbersome to recover the memory consumed |
| |
by the generated code when you are done (and in more complicated cases |
| |
it is not possible portably). |
| |
|
| |
@c !! @xref{Macros} for reference |
| |
|
| |
|
| |
@node Compilation Tokens Tutorial, Wordlists and Search Order Tutorial, Advanced macros Tutorial, Tutorial |
| |
@section Compilation Tokens |
| |
@cindex compilation tokens, tutorial |
| |
@cindex CT, tutorial |
| |
|
| |
This section is Gforth-specific. You can skip it. |
| |
|
| |
@code{' word compile,} compiles the interpretation semantics. For words |
| |
with default compilation semantics this is the same as performing the |
| |
compilation semantics. To represent the compilation semantics of other |
| |
words (e.g., words like @code{if} that have no interpretation |
| |
semantics), Gforth has the concept of a compilation token (CT, |
| |
consisting of two cells), and words @code{comp'} and @code{[comp']}. |
| |
You can perform the compilation semantics represented by a CT with |
| |
@code{execute}: |
| |
|
| |
@example |
| |
: foo2 ( n1 n2 -- n ) |
| |
[ comp' + execute ] ; |
| |
see foo2 |
| |
@end example |
| |
|
| |
You can compile the compilation semantics represented by a CT with |
| |
@code{postpone,}: |
| |
|
| |
@example |
| |
: foo3 ( -- ) |
| |
[ comp' + postpone, ] ; |
| |
see foo3 |
| |
@end example |
| |
|
| |
@code{[ comp' word postpone, ]} is equivalent to @code{POSTPONE word}. |
| |
@code{comp'} is particularly useful for words that have no |
| |
interpretation semantics: |
| |
|
| |
@example |
| |
' if |
| |
comp' if .s 2drop |
| |
@end example |
| |
|
| |
Reference: @ref{Tokens for Words}. |
| |
|
| |
|
| |
@node Wordlists and Search Order Tutorial, , Compilation Tokens Tutorial, Tutorial |
| |
@section Wordlists and Search Order |
| |
@cindex wordlists tutorial |
| |
@cindex search order, tutorial |
| |
|
| |
The dictionary is not just a memory area that allows you to allocate |
| |
memory with @code{allot}, it also contains the Forth words, arranged in |
| |
several wordlists. When searching for a word in a wordlist, |
| |
conceptually you start searching at the youngest and proceed towards |
| |
older words (in reality most systems nowadays use hash-tables); i.e., if |
| |
you define a word with the same name as an older word, the new word |
| |
shadows the older word. |
| |
|
| |
Which wordlists are searched in which order is determined by the search |
| |
order. You can display the search order with @code{order}. It displays |
| |
first the search order, starting with the wordlist searched first, then |
| |
it displays the wordlist that will contain newly defined words. |
| |
|
| |
You can create a new, empty wordlist with @code{wordlist ( -- wid )}: |
| |
|
| |
@example |
| |
wordlist constant mywords |
| |
@end example |
| |
|
| |
@code{Set-current ( wid -- )} sets the wordlist that will contain newly |
| |
defined words (the @emph{current} wordlist): |
| |
|
| |
@example |
| |
mywords set-current |
| |
order |
| |
@end example |
| |
|
| |
Gforth does not display a name for the wordlist in @code{mywords} |
| |
because this wordlist was created anonymously with @code{wordlist}. |
| |
|
| |
You can get the current wordlist with @code{get-current ( -- wid )}. If |
| |
you want to put something into a specific wordlist without overall |
| |
effect on the current wordlist, this typically looks like this: |
| |
|
| |
@example |
| |
get-current mywords set-current ( wid ) |
| |
create someword |
| |
( wid ) set-current |
| |
@end example |
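
To check where a word ended up, you can look it up in a specific
wordlist with the standard word @code{search-wordlist ( c-addr u wid --
0 | xt 1 | xt -1 )}. The following sketch assumes that @code{mywords}
from above has not yet been added to the search order; the name
@code{hidden-answer} is invented for this example:

@example
get-current mywords set-current ( wid )
: hidden-answer ( -- n ) 42 ;
( wid ) set-current
s" hidden-answer" mywords search-wordlist ( xt -1 )
drop execute .
@end example

The last line should print @code{42}, whereas typing
@code{hidden-answer} directly is reported as an undefined word until
@code{mywords} is part of the search order.
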
| |
|
| |
You can write the search order with @code{set-order ( wid1 .. widn n -- |
| |
)} and read it with @code{get-order ( -- wid1 .. widn n )}. The first |
| |
searched wordlist is topmost. |
| |
|
| |
@example |
| |
get-order mywords swap 1+ set-order |
| |
order |
| |
@end example |
| |
|
| |
Yes, the order of wordlists in the output of @code{order} is reversed |
| |
from stack comments and the output of @code{.s} and thus unintuitive. |
| |
|
| |
@assignment |
| |
Define @code{>order ( wid -- )} which adds @code{wid} as first searched |
| |
wordlist to the search order. Define @code{previous ( -- )}, which |
| |
removes the first searched wordlist from the search order. Experiment |
| |
with boundary conditions (you will see some crashes or situations that |
| |
are hard or impossible to leave). |
| |
@endassignment |
| |
|
| |
The search order is a powerful foundation for providing features similar |
| |
to Modula-2 modules and C++ namespaces. However, trying to modularize |
| |
programs in this way has disadvantages for debugging and reuse/factoring |
| |
that outweigh the advantages in my experience (I don't do huge projects, |
| |
though). These disadvantages are not so clear in other |
| |
languages/programming environments, because these languages are not so |
| |
strong in debugging and reuse. |
| |
|
| |
@c !! example |
| |
|
| |
Reference: @ref{Word Lists}. |
| |
|
| |
@c ****************************************************************** |
| |
@node Introduction, Words, Tutorial, Top |
| |
@comment node-name, next, previous, up |
| |
@chapter An Introduction to ANS Forth |
| |
@cindex Forth - an introduction |
| |
|
| |
The primary purpose of this manual is to document Gforth. However, since |
| |
Forth is not a widely-known language and there is a lack of up-to-date |
| |
teaching material, it seems worthwhile to provide some introductory |
| |
material. For other sources of Forth-related |
| |
information, see @ref{Forth-related information}. |
| |
|
| |
The examples in this section should work on any ANS Forth; the |
| |
output shown was produced using Gforth. Each example attempts to |
| |
reproduce the exact output that Gforth produces. If you try out the |
| |
examples (and you should), what you should type is shown @kbd{like this} |
| |
and Gforth's response is shown @code{like this}. The single exception is |
| |
that, where the example shows @key{RET} it means that you should |
| |
press the ``carriage return'' key. Unfortunately, some output formats for |
| |
this manual cannot show the difference between @kbd{this} and |
| |
@code{this} which will make trying out the examples harder (but not |
| |
impossible). |
| |
|
| |
Forth is an unusual language. It provides an interactive development |
| |
environment which includes both an interpreter and compiler. Forth |
| |
programming style encourages you to break a problem down into many |
| |
@cindex factoring |
| |
small fragments (@dfn{factoring}), and then to develop and test each |
| |
fragment interactively. Forth advocates assert that breaking the |
| |
edit-compile-test cycle used by conventional programming languages can |
| |
lead to great productivity improvements. |
| |
|
| |
@menu |
| |
* Introducing the Text Interpreter:: |
| |
* Stacks and Postfix notation:: |
| |
* Your first definition:: |
| |
* How does that work?:: |
| |
* Forth is written in Forth:: |
| |
* Review - elements of a Forth system:: |
| |
* Where to go next:: |
| |
* Exercises:: |
| |
@end menu |
| |
|
| |
@comment ---------------------------------------------- |
| |
@node Introducing the Text Interpreter, Stacks and Postfix notation, Introduction, Introduction |
| |
@section Introducing the Text Interpreter |
| |
@cindex text interpreter |
| |
@cindex outer interpreter |
| |
|
| |
@c IMO this is too detailed and the pace is too slow for |
| |
@c an introduction. If you know German, take a look at |
| |
@c http://www.complang.tuwien.ac.at/anton/lvas/skriptum-stack.html |
| |
@c to see how I do it - anton |
| |
|
| |
@c nac-> Where I have accepted your comments 100% and modified the text |
| |
@c accordingly, I have deleted your comments. Elsewhere I have added a |
| |
@c response like this to attempt to rationalise what I have done. Of |
| |
@c course, this is a very clumsy mechanism for something that would be |
| |
@c done far more efficiently over a beer. Please delete any dialogue |
| |
@c you consider closed. |
| |
|
| |
When you invoke the Forth image, you will see a startup banner printed |
| |
and nothing else (if you have Gforth installed on your system, try |
| |
invoking it now, by typing @kbd{gforth@key{RET}}). Forth is now running |
| |
its command line interpreter, which is called the @dfn{Text Interpreter} |
| |
(also known as the @dfn{Outer Interpreter}). (You will learn a lot |
| |
about the text interpreter as you read through this chapter; for more |
| |
detail @pxref{The Text Interpreter}). |
| |
|
| |
Although it's not obvious, Forth is actually waiting for your |
| |
input. Type a number and press the @key{RET} key: |
| |
|
| |
@example |
| |
@kbd{45@key{RET}} ok |
| |
@end example |
| |
|
| |
Rather than give you a prompt to invite you to input something, the text |
| |
interpreter prints a status message @i{after} it has processed a line |
| |
of input. The status message in this case (``@code{ ok}'' followed by |
| |
carriage-return) indicates that the text interpreter was able to process |
| |
all of your input successfully. Now type something illegal: |
| |
|
| |
@example |
| |
@kbd{qwer341@key{RET}} |
| |
:1: Undefined word |
| |
qwer341 |
| |
^^^^^^^ |
| |
$400D2BA8 Bounce |
| |
$400DBDA8 no.extensions |
| |
@end example |
| |
|
| |
The exact text, other than the ``Undefined word'', may differ slightly on |
| |
your system, but the effect is the same; when the text interpreter |
| |
detects an error, it discards any remaining text on a line, resets |
| |
certain internal state and prints an error message. For a detailed description of error messages see @ref{Error |
| |
messages}. |
| |
|
| |
The text interpreter waits for you to press carriage-return, and then |
| |
processes your input line. Starting at the beginning of the line, it |
| |
breaks the line into groups of characters separated by spaces. For each |
| |
group of characters in turn, it makes two attempts to do something: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
@cindex name dictionary |
| |
It tries to treat it as a command. It does this by searching a @dfn{name |
| |
dictionary}. If the group of characters matches an entry in the name |
| |
dictionary, the name dictionary provides the text interpreter with |
| |
information that allows the text interpreter to perform some actions. In |
| |
Forth jargon, we say that the group |
| |
@cindex word |
| |
@cindex definition |
| |
@cindex execution token |
| |
@cindex xt |
| |
of characters names a @dfn{word}, that the dictionary search returns an |
| |
@dfn{execution token (xt)} corresponding to the @dfn{definition} of the |
| |
word, and that the text interpreter executes the xt. Often, the terms |
| |
@dfn{word} and @dfn{definition} are used interchangeably. |
| |
@item |
| |
If the text interpreter fails to find a match in the name dictionary, it |
| |
tries to treat the group of characters as a number in the current number |
| |
base (when you start up Forth, the current number base is base 10). If |
| |
the group of characters legitimately represents a number, the text |
| |
interpreter pushes the number onto a stack (we'll learn more about that |
| |
in the next section). |
| |
@end itemize |
| |
|
| |
If the text interpreter is unable to do either of these things with any |
| |
group of characters, it discards the group of characters and the rest of |
| |
the line, then prints an error message. If the text interpreter reaches |
| |
the end of the line without error, it prints the status message ``@code{ ok}'' |
| |
followed by carriage-return. |
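
The second attempt depends on the current number base. As a small
illustration (it uses the standard words @code{hex}, @code{decimal} and
@code{.}, which have not been introduced yet; @code{.} prints a number
and is explained later in this chapter), the group @code{1f} is not
found in the name dictionary and is a legal number only while the base
is 16:

@example
@kbd{hex 1f decimal .@key{RET}} 31 ok
@kbd{1f@key{RET}}
@end example

The first line should print 31 (the decimal value of hexadecimal 1f);
the second line should produce an ``Undefined word'' error, because
@code{1f} is not a number in base 10.
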
| |
|
| |
This is the simplest command we can give to the text interpreter: |
| |
|
| |
@example |
| |
@key{RET} ok |
| |
@end example |
| |
|
| |
The text interpreter did everything we asked it to do (nothing) without |
| |
an error, so it said that everything is ``@code{ ok}''. Try a slightly longer |
| |
command: |
| |
|
| |
@example |
| |
@kbd{12 dup fred dup@key{RET}} |
| |
:1: Undefined word |
| |
12 dup fred dup |
| |
^^^^ |
| |
$400D2BA8 Bounce |
| |
$400DBDA8 no.extensions |
| |
@end example |
| |
|
| |
When you press the carriage-return key, the text interpreter starts to |
| |
work its way along the line: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
When it gets to the space after the @code{2}, it takes the group of |
| |
characters @code{12} and looks them up in the name |
| |
dictionary@footnote{We can't tell if it found them or not, but assume |
| |
for now that it did not}. There is no match for this group of characters |
| |
in the name dictionary, so it tries to treat them as a number. It is |
| |
able to do this successfully, so it puts the number, 12, ``on the stack'' |
| |
(whatever that means). |
| |
@item |
| |
The text interpreter resumes scanning the line and gets the next group |
| |
of characters, @code{dup}. It looks it up in the name dictionary and |
| |
(you'll have to take my word for this) finds it, and executes the word |
| |
@code{dup} (whatever that means). |
| |
@item |
| |
Once again, the text interpreter resumes scanning the line and gets the |
| |
group of characters @code{fred}. It looks them up in the name |
| |
dictionary, but can't find them. It tries to treat them as a number, but |
| |
they don't represent any legal number. |
| |
@end itemize |
| |
|
| |
At this point, the text interpreter gives up and prints an error |
| |
message. The error message shows exactly how far the text interpreter |
| |
got in processing the line. In particular, it shows that the text |
| |
interpreter made no attempt to do anything with the final character |
| |
group, @code{dup}, even though we have good reason to believe that the |
| |
text interpreter would have no problem looking that word up and |
| |
executing it a second time. |
| |
|
| |
|
| |
@comment ---------------------------------------------- |
| |
@node Stacks and Postfix notation, Your first definition, Introducing the Text Interpreter, Introduction |
| |
@section Stacks, postfix notation and parameter passing |
| |
@cindex text interpreter |
| |
@cindex outer interpreter |
| |
|
| |
In procedural programming languages (like C and Pascal), the |
| |
building-block of programs is the @dfn{function} or @dfn{procedure}. These |
| |
functions or procedures are called with @dfn{explicit parameters}. For |
| |
example, in C we might write: |
| |
|
| |
@example |
| |
total = total + new_volume(length,height,depth); |
| |
@end example |
| |
|
| |
@noindent |
| |
where new_volume is a function-call to another piece of code, and total, |
| |
length, height and depth are all variables. length, height and depth are |
| |
parameters to the function-call. |
| |
|
| |
In Forth, the equivalent of the function or procedure is the |
| |
@dfn{definition} and parameters are implicitly passed between |
| |
definitions using a shared stack that is visible to the |
| |
programmer. Although Forth does support variables, the existence of the |
| |
stack means that they are used far less often than in most other |
| |
programming languages. When the text interpreter encounters a number, it |
| |
will place (@dfn{push}) it on the stack. There are several stacks (the |
| |
actual number is implementation-dependent ...) and the particular stack |
| |
used for any operation is implied unambiguously by the operation being |
| |
performed. The stack used for all integer operations is called the @dfn{data |
| |
stack} and, since this is the stack used most commonly, references to |
| |
``the data stack'' are often abbreviated to ``the stack''. |
| |
|
| |
The stacks have a last-in, first-out (LIFO) organisation. If you type: |
| |
|
| |
@example |
| |
@kbd{1 2 3@key{RET}} ok |
| |
@end example |
| |
|
| |
Then this instructs the text interpreter to place three numbers on the |
| |
(data) stack. An analogy for the behaviour of the stack is to take a |
| |
pack of playing cards and deal out the ace (1), 2 and 3 into a pile on |
| |
the table. The 3 was the last card onto the pile (``last-in'') and if |
| |
you take a card off the pile then, unless you're prepared to fiddle a |
| |
bit, the card that you take off will be the 3 (``first-out''). The |
| |
number that will be first-out of the stack is called the @dfn{top of |
| |
stack}, which |
| |
@cindex TOS definition |
| |
is often abbreviated to @dfn{TOS}. |
| |
|
| |
To understand how parameters are passed in Forth, consider the |
| |
behaviour of the definition @code{+} (pronounced ``plus''). You will not |
| |
be surprised to learn that this definition performs addition. More |
| |
precisely, it adds two numbers together and produces a result. Where does |
| |
it get the two numbers from? It takes the top two numbers off the |
| |
stack. Where does it place the result? On the stack. You can act-out the |
| |
behaviour of @code{+} with your playing cards like this: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
Pick up two cards from the stack on the table |
| |
@item |
| |
Stare at them intently and ask yourself ``what @i{is} the sum of these two |
| |
numbers'' |
| |
@item |
| |
Decide that the answer is 5 |
| |
@item |
| |
Shuffle the two cards back into the pack and find a 5 |
| |
@item |
| |
Put a 5 on the remaining ace that's on the table. |
| |
@end itemize |
| |
|
| |
If you don't have a pack of cards handy but you do have Forth running, |
| |
you can use the definition @code{.s} to show the current state of the stack, |
| |
without affecting the stack. Type: |
| |
|
| |
@example |
| |
@kbd{clearstack 1 2 3@key{RET}} ok |
| |
@kbd{.s@key{RET}} <3> 1 2 3 ok |
| |
@end example |
| |
|
| |
The text interpreter looks up the word @code{clearstack} and executes |
| |
it; it tidies up the stack and removes any entries that may have been |
| |
left on it by earlier examples. The text interpreter pushes each of the |
| |
three numbers in turn onto the stack. Finally, the text interpreter |
| |
looks up the word @code{.s} and executes it. The effect of executing |
| |
@code{.s} is to print the ``<3>'' (the total number of items on the stack) |
| |
followed by a list of all the items on the stack; the item on the far |
| |
right-hand side is the TOS. |
| |
|
| |
You can now type: |
| |
|
| |
@example |
| |
@kbd{+ .s@key{RET}} <2> 1 5 ok |
| |
@end example |
| |
|
| |
@noindent |
| |
which is correct; there are now 2 items on the stack and the result of |
| |
the addition is 5. |
| |
|
| |
If you're playing with cards, try doing a second addition: pick up the |
| |
two cards, work out that their sum is 6, shuffle them into the pack, |
| |
look for a 6 and place that on the table. You now have just one item on |
| |
the stack. What happens if you try to do a third addition? Pick up the |
| |
first card, pick up the second card -- ah! There is no second card. This |
| |
is called a @dfn{stack underflow} and constitutes an error. If you try to |
| |
do the same thing with Forth it will report an error (probably a Stack |
| |
Underflow or an Invalid Memory Address error). |
| |
|
| |
The opposite situation to a stack underflow is a @dfn{stack overflow}, |
| |
which simply accepts that there is a finite amount of storage space |
| |
reserved for the stack. To stretch the playing card analogy, if you had |
| |
enough packs of cards and you piled the cards up on the table, you would |
| |
eventually be unable to add another card; you'd hit the ceiling. Gforth |
| |
allows you to set the maximum size of the stacks. In general, the only |
| |
time that you will get a stack overflow is because a definition has a |
| |
bug in it and is generating data on the stack uncontrollably. |
| |
|
| |
There's one final use for the playing card analogy. If you model your |
| |
stack using a pack of playing cards, the maximum number of items on |
| |
your stack will be 52 (I assume you didn't use the Joker). The maximum |
| |
@i{value} of any item on the stack is 13 (the King). In fact, the only |
| |
possible numbers are positive integer numbers 1 through 13; you can't |
| |
have (for example) 0 or 27 or 3.52 or -2. If you change the way you |
| |
think about some of the cards, you can accommodate different |
| |
numbers. For example, you could think of the Jack as representing 0, |
| |
the Queen as representing -1 and the King as representing -2. Your |
| |
@i{range} remains unchanged (you can still only represent a total of 13 |
| |
numbers) but the numbers that you can represent are -2 through 10. |
| |
|
| |
In that analogy, the limit was the amount of information that a single |
| |
stack entry could hold, and Forth has a similar limit. In Forth, the |
| |
size of a stack entry is called a @dfn{cell}. The actual size of a cell is |
| |
implementation dependent and affects the maximum value that a stack |
| |
entry can hold. A Standard Forth provides a cell size of at least |
| |
16 bits, and most desktop systems use a cell size of 32 or 64 bits. |
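
You can ask the system for its cell size. As a sketch (the standard
word @code{cells} has not been introduced in this chapter; it converts
a number of cells into the corresponding number of address units, which
are bytes on most systems):

@example
@kbd{clearstack 1 cells .s@key{RET}} <1> 8 ok
@end example

The 8 shown here is what a 64-bit system would typically report; a
32-bit system would show 4 instead.
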
| |
|
| |
Forth does not do any type checking for you, so you are free to |
| |
manipulate and combine stack items in any way you wish. A convenient way |
| |
of treating stack items is as 2's complement signed integers, and that |
| |
is what Standard words like @code{+} do. Therefore you can type: |
| |
|
| |
@example |
| |
@kbd{-5 12 + .s@key{RET}} <1> 7 ok |
| |
@end example |
| |
|
| |
If you use numbers and definitions like @code{+} in order to turn Forth |
| |
into a great big pocket calculator, you will realise that it's rather |
| |
different from a normal calculator. Rather than typing 2 + 3 = you had |
| |
to type 2 3 + (ignore the fact that you had to use @code{.s} to see the |
| |
result). The terminology used to describe this difference is to say that |
| |
your calculator uses @dfn{Infix Notation} (parameters and operators are |
| |
mixed) whilst Forth uses @dfn{Postfix Notation} (parameters and |
| |
operators are separate), also called @dfn{Reverse Polish Notation}. |
| |
|
| |
Whilst postfix notation might look confusing to begin with, it has |
| |
several important advantages: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
it is unambiguous |
| |
@item |
| |
it is more concise |
| |
@item |
| |
it fits naturally with a stack-based system |
| |
@end itemize |
| |
|
| |
To examine these claims in more detail, consider these sums: |
| |
|
| |
@example |
| |
6 + 5 * 4 = |
| |
4 * 5 + 6 = |
| |
@end example |
| |
|
| |
If you're just learning maths or your maths is very rusty, you will |
| |
probably come up with the answer 44 for the first and 26 for the |
| |
second. If you are a bit of a whizz at maths you will remember the |
| |
@i{convention} that multiplication takes precedence over addition, and |
| |
you'd come up with the answer 26 both times. To explain the answer 26 |
| |
to someone who got the answer 44, you'd probably rewrite the first sum |
| |
like this: |
| |
|
| |
@example |
| |
6 + (5 * 4) = |
| |
@end example |
| |
|
| |
If what you really wanted was to perform the addition before the |
| |
multiplication, you would have to use parentheses to force it. |
| |
|
| |
If you did the first two sums on a pocket calculator you would probably |
| |
get the right answers, unless you were very cautious and entered them using |
| |
these keystroke sequences: |
| |
|
| |
@example |
| |
6 + 5 = * 4 = |
| |
4 * 5 = + 6 = |
| |
@end example |
| |
|
| |
Postfix notation is unambiguous because the order that the operators |
| |
are applied is always explicit; that also means that parentheses are |
| |
never required. The operators are @i{active} (the act of quoting the |
| |
operator makes the operation occur) which removes the need for ``=''. |
| |
|
| |
The sum 6 + 5 * 4 can be written (in postfix notation) in two |
| |
equivalent ways: |
| |
|
| |
@example |
| |
6 5 4 * + or: |
| |
5 4 * 6 + |
| |
@end example |
| |
|
| |
An important thing that you should notice about this notation is that |
| |
the @i{order} of the numbers does not change; if you want to subtract |
| |
2 from 10 you type @code{10 2 -}. |
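
You can check these claims with the words introduced so far. The
transcript below is a sketch of such a session (the exact spacing of
Gforth's output may differ):

@example
@kbd{clearstack 6 5 4 * + .s@key{RET}} <1> 26 ok
@kbd{clearstack 5 4 * 6 + .s@key{RET}} <1> 26 ok
@kbd{clearstack 6 5 + 4 * .s@key{RET}} <1> 44 ok
@kbd{clearstack 10 2 - .s@key{RET}} <1> 8 ok
@end example

The third line performs the addition first, which is what the
parenthesised infix form @code{(6 + 5) * 4} expresses.
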
| |
|
| |
The reason that Forth uses postfix notation is very simple to explain: it |
| |
makes the implementation extremely simple, and it follows naturally from |
| |
using the stack as a mechanism for passing parameters. Another way of |
| |
thinking about this is to realise that all Forth definitions are |
| |
@i{active}; they execute as they are encountered by the text |
| |
interpreter. The result of this is that the syntax of Forth is trivially |
| |
simple. |
| |
|
| |
|
| |
|
| |
@comment ---------------------------------------------- |
| |
@node Your first definition, How does that work?, Stacks and Postfix notation, Introduction |
| |
@section Your first Forth definition |
| |
@cindex first definition |
| |
|
| |
Until now, the examples we've seen have been trivial; we've just been |
| |
using Forth as a bigger-than-pocket calculator. Also, each calculation |
| |
we've shown has been a ``one-off'' -- to repeat it we'd need to type it in |
| |
again@footnote{That's not quite true. If you press the up-arrow key on |
| |
your keyboard you should be able to scroll back to any earlier command, |
| |
edit it and re-enter it.}. In this section we'll see how to add new |
| |
words to Forth's vocabulary. |
| |
|
| |
The easiest way to create a new word is to use a @dfn{colon |
| |
definition}. We'll define a few and try them out before worrying too |
| |
much about how they work. Try typing in these examples; be careful to |
| |
copy the spaces accurately: |
| |
|
| |
@example |
| |
: add-two 2 + . ; |
| |
: greet ." Hello and welcome" ; |
| |
: demo 5 add-two ; |
| |
@end example |
| |
|
| |
@noindent |
| |
Now try them out: |
| |
|
| |
@example |
| |
@kbd{greet@key{RET}} Hello and welcome ok |
| |
@kbd{greet greet@key{RET}} Hello and welcomeHello and welcome ok |
| |
@kbd{4 add-two@key{RET}} 6 ok |
| |
@kbd{demo@key{RET}} 7 ok |
| |
@kbd{9 greet demo add-two@key{RET}} Hello and welcome7 11 ok |
| |
@end example |
| |
|
| |
The first new thing that we've introduced here is the pair of words |
| |
@code{:} and @code{;}. These are used to start and terminate a new |
| |
definition, respectively. The first word after the @code{:} is the name |
| |
for the new definition. |
| |
|
| |
As you can see from the examples, a definition is built up of words that |
| |
have already been defined; Forth makes no distinction between |
| |
definitions that existed when you started the system up, and those that |
| |
you define yourself. |
| |
|
| |
The examples also introduce the words @code{.} (dot), @code{."} |
| |
(dot-quote) and @code{dup} (dewp). Dot takes the value from the top of |
| |
the stack and displays it. It's like @code{.s} except that it only |
| |
displays the top item of the stack and it is destructive; after it has |
| |
executed, the number is no longer on the stack. There is always one |
| |
space printed after the number, and no spaces before it. Dot-quote |
| |
defines a string (a sequence of characters) that will be printed when |
| |
the word is executed. The string can contain any printable characters |
| |
except @code{"}. A @code{"} has a special function; it is not a Forth |
| |
word but it acts as a delimiter (the way that delimiters work is |
| |
described in the next section). Finally, @code{dup} duplicates the value |
| |
at the top of the stack. Try typing @code{5 dup .s} to see what it does. |
| |
|
| |
We already know that the text interpreter searches through the |
| |
dictionary to locate names. If you've followed the examples earlier, you |
| |
will already have a definition called @code{add-two}. Let's try modifying |
| |
it by typing in a new definition: |
| |
|
| |
@example |
| |
@kbd{: add-two dup . ." + 2 =" 2 + . ;@key{RET}} redefined add-two ok |
| |
@end example |
| |
|
| |
Forth recognised that we were defining a word that already exists, and |
| |
printed a message to warn us of that fact. Let's try out the new |
| |
definition: |
| |
|
| |
@example |
| |
@kbd{9 add-two@key{RET}} 9 + 2 =11 ok |
| |
@end example |
| |
|
| |
@noindent |
| |
All that we've actually done here, though, is to create a new |
| |
definition, with a particular name. The fact that there was already a |
| |
definition with the same name did not make any difference to the way |
| |
that the new definition was created (except that Forth printed a warning |
| |
message). The old definition of @code{add-two} still exists (try @code{demo} |
| |
again to see that this is true). Any new definition will use the new |
| |
definition of @code{add-two}, but old definitions continue to use the |
| |
version that already existed at the time that they were compiled. |
| |
|
| |
Before you go on to the next section, try defining and redefining some |
| |
words of your own. |
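
The way that old definitions keep the version they were compiled with
can be seen in a short experiment. This is only a sketch; the names
@code{hello} and @code{twice} are made up for illustration and are not
part of Gforth:

@example
: hello ." hi " ;
: twice hello hello ;
: hello ." bye " ;    \ Gforth warns that hello is redefined
twice                 \ prints: hi hi
hello                 \ prints: bye
@end example

@noindent
@code{twice} was compiled while the first @code{hello} was the most
recent one, so it keeps using that version.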
| |
|
| |
@comment ---------------------------------------------- |
| |
@node How does that work?, Forth is written in Forth, Your first definition, Introduction |
| |
@section How does that work? |
| |
@cindex parsing words |
| |
|
| |
@c That's pretty deep (IMO way too deep) for an introduction. - anton |
| |
|
| |
@c Is it a good idea to talk about the interpretation semantics of a |
| |
@c number? We don't have an xt to go along with it. - anton |
| |
|
| |
@c Now that I have eliminated execution semantics, I wonder if it would not |
| |
@c be better to keep them (or add run-time semantics), to make it easier to |
| |
@c explain what compilation semantics usually does. - anton |
| |
|
| |
@c nac-> I removed the term ``default compilation semantics'' from the |
| |
@c introductory chapter. Removing ``execution semantics'' was making |
| |
@c everything simpler to explain, then I think the use of this term made |
| |
@c everything more complex again. I replaced it with ``default |
| |
@c semantics'' (which is used elsewhere in the manual) by which I mean |
| |
@c ``a definition that has neither the immediate nor the compile-only |
| |
@c flag set''. I reworded big chunks of the ``how does that work'' |
| |
@c section (and, unusually for me, I think I even made it shorter!). See |
| |
@c what you think -- I know I have not addressed your primary concern |
| |
@c that it is too heavy-going for an introduction. From what I understood |
| |
@c of your course notes it looks as though they might be a good framework. |
| |
@c Things that I've tried to capture here are some things that came as a |
| |
@c great revelation here when I first understood them. Also, I like the |
| |
@c fact that a very simple code example shows up almost all of the issues |
| |
@c that you need to understand to see how Forth works. That's unique and |
| |
@c worthwhile to emphasise. |
| |
|
| |
Now we're going to take another look at the definition of @code{add-two} |
| |
from the previous section. From our knowledge of the way that the text |
| |
interpreter works, we would have expected this result when we tried to |
| |
define @code{add-two}: |
| |
|
| |
@example |
| |
@kbd{: add-two 2 + . ;@key{RET}} |
| |
^^^^^^^ |
| |
Error: Undefined word |
| |
@end example |
| |
|
| |
The reason that this didn't happen is bound up in the way that @code{:} |
| |
works. The word @code{:} does two special things. The first special |
| |
thing that it does prevents the text interpreter from ever seeing the |
| |
characters @code{add-two}. The text interpreter uses a variable called |
| |
@cindex modifying >IN |
| |
@code{>IN} (pronounced ``to-in'') to keep track of where it is in the |
| |
input line. When it encounters the word @code{:} it behaves in exactly |
| |
the same way as it does for any other word; it looks it up in the name |
| |
dictionary, finds its xt and executes it. When @code{:} executes, it |
| |
looks at the input buffer, finds the word @code{add-two} and advances the |
| |
value of @code{>IN} to point past it. It then does some other stuff |
| |
associated with creating the new definition (including creating an entry |
| |
for @code{add-two} in the name dictionary). When the execution of @code{:} |
| |
completes, control returns to the text interpreter, which is oblivious |
| |
to the fact that it has been tricked into ignoring part of the input |
| |
line. |
| |
|
| |
@cindex parsing words |
| |
Words like @code{:} -- words that advance the value of @code{>IN} and so |
| |
prevent the text interpreter from acting on the whole of the input line |
| |
-- are called @dfn{parsing words}. |
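
As a rough sketch, you can write a parsing word of your own with the
standard words @code{bl}, @code{word}, @code{count} and @code{type},
none of which has been introduced in this chapter. The name
@code{say-next} is hypothetical:

@example
: say-next ( "name" -- )  bl word count type ;
say-next hello        \ prints: hello
@end example

@noindent
The text interpreter never sees @code{hello}, so it does not complain
about an undefined word; @code{say-next} has already consumed those
characters from the input line.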
| |
|
| |
@cindex @code{state} - effect on the text interpreter |
| |
@cindex text interpreter - effect of state |
| |
The second special thing that @code{:} does is change the value of a |
| |
variable called @code{state}, which affects the way that the text |
| |
interpreter behaves. When Gforth starts up, @code{state} has the value |
| |
0, and the text interpreter is said to be @dfn{interpreting}. During a |
| |
colon definition (started with @code{:}), @code{state} is set to -1 and |
| |
the text interpreter is said to be @dfn{compiling}. |
| |
|
| |
In this example, the text interpreter is compiling when it processes the |
| |
string ``@code{2 + . ;}''. It still breaks the string down into |
| |
character sequences in the same way. However, instead of pushing the |
| |
number @code{2} onto the stack, it lays down (@dfn{compiles}) some magic |
| |
into the definition of @code{add-two} that will make the number @code{2} get |
| |
pushed onto the stack when @code{add-two} is @dfn{executed}. Similarly, |
| |
the behaviours of @code{+} and @code{.} are also compiled into the |
| |
definition. |
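
A small experiment makes the difference between compiling and executing
visible (the name @code{show-seven} is made up for this sketch):

@example
: show-seven 7 . ;    \ nothing is printed; 7 and . are only compiled
show-seven            \ now the compiled code runs and prints: 7
@end example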
| |
|
| |
One category of words doesn't get compiled. These so-called @dfn{immediate |
| |
words} get executed (performed @i{now}) regardless of whether the text |
| |
interpreter is interpreting or compiling. The word @code{;} is an |
| |
immediate word. Rather than being compiled into the definition, it |
| |
executes. Its effect is to terminate the current definition, which |
| |
includes changing the value of @code{state} back to 0. |
| |
|
| |
When you execute @code{add-two}, it has a @dfn{run-time effect} that is |
| |
exactly the same as if you had typed @code{2 + . @key{RET}} outside of a |
| |
definition. |
| |
|
| |
In Forth, every word or number can be described in terms of two |
| |
properties: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
@cindex interpretation semantics |
| |
Its @dfn{interpretation semantics} describe how it will behave when the |
| |
text interpreter encounters it in @dfn{interpret} state. The |
| |
interpretation semantics of a word are represented by an @dfn{execution |
| |
token}. |
| |
@item |
| |
@cindex compilation semantics |
| |
Its @dfn{compilation semantics} describe how it will behave when the |
| |
text interpreter encounters it in @dfn{compile} state. The compilation |
| |
semantics of a word are represented in an implementation-dependent way; |
| |
Gforth uses a @dfn{compilation token}. |
| |
@end itemize |
| |
|
| |
@noindent |
| |
Numbers are always treated in a fixed way: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
When the number is @dfn{interpreted}, its behaviour is to push the |
| |
number onto the stack. |
| |
@item |
| |
When the number is @dfn{compiled}, a piece of code is appended to the |
| |
current definition that pushes the number when it runs. (In other words, |
| |
the compilation semantics of a number are to postpone its interpretation |
| |
semantics until the run-time of the definition that it is being compiled |
| |
into.) |
| |
@end itemize |
| |
|
| |
Words don't behave in such a regular way, but most have @i{default |
| |
semantics} which means that they behave like this: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
The @dfn{interpretation semantics} of the word are to do something useful. |
| |
@item |
| |
The @dfn{compilation semantics} of the word are to append its |
| |
@dfn{interpretation semantics} to the current definition (so that its |
| |
run-time behaviour is to do something useful). |
| |
@end itemize |
| |
|
| |
@cindex immediate words |
| |
The actual behaviour of any particular word can be controlled by using |
| |
the words @code{immediate} and @code{compile-only} when the word is |
| |
defined. These words set flags in the name dictionary entry of the most |
| |
recently defined word, and these flags are retrieved by the text |
| |
interpreter when it finds the word in the name dictionary. |
| |
|
| |
A word that is marked as @dfn{immediate} has compilation semantics that |
| |
are identical to its interpretation semantics. In other words, it |
| |
behaves like this: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
The @dfn{interpretation semantics} of the word are to do something useful. |
| |
@item |
| |
The @dfn{compilation semantics} of the word are to do something useful |
| |
(and actually the same thing); i.e., it is executed during compilation. |
| |
@end itemize |
| |
|
| |
Marking a word as @dfn{compile-only} prohibits the text interpreter from |
| |
performing the interpretation semantics of the word directly; an attempt |
| |
to do so will generate an error. It is never necessary to use |
| |
@code{compile-only} (and it is not even part of ANS Forth, though it is |
| |
provided by many implementations) but it is good etiquette to apply it |
| |
to a word that will not behave correctly (and might have unexpected |
| |
side-effects) in interpret state. For example, it is only legal to use |
| |
the conditional word @code{IF} within a definition. If you forget this |
| |
and try to use it elsewhere, the fact that (in Gforth) it is marked as |
| |
@code{compile-only} allows the text interpreter to generate a helpful |
| |
error message rather than subjecting you to the consequences of your |
| |
folly. |
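
A minimal sketch of both flags in use, assuming Gforth's
@code{compile-only} and three hypothetical word names:

@example
: loud ." Now! " ; immediate          \ executes even while compiling
: quiet ." later " ; compile-only     \ only usable inside a definition
: both loud quiet ;                   \ prints: Now!   (while both is compiled)
both                                  \ prints: later
@end example

@noindent
Typing @code{quiet} on its own at the command line should make Gforth
report an error instead of executing the word.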
| |
|
| |
This example shows the difference between an immediate and a |
| |
non-immediate word: |
| |
|
| |
@example |
| |
: show-state state @@ . ; |
| |
: show-state-now show-state ; immediate |
| |
: word1 show-state ; |
| |
: word2 show-state-now ; |
| |
@end example |
| |
|
| |
The word @code{immediate} after the definition of @code{show-state-now} |
| |
makes that word an immediate word. These definitions introduce a new |
| |
word: @code{@@} (pronounced ``fetch''). This word fetches the value of a |
| |
variable, and leaves it on the stack. Therefore, the behaviour of |
| |
@code{show-state} is to print a number that represents the current value |
| |
of @code{state}. |
| |
|
| |
When you execute @code{word1}, it prints the number 0, indicating that |
| |
the system is interpreting. When the text interpreter compiled the |
| |
definition of @code{word1}, it encountered @code{show-state} whose |
| |
compilation semantics are to append its interpretation semantics to the |
| |
current definition. When you execute @code{word1}, it performs the |
| |
interpretation semantics of @code{show-state}. At the time that @code{word1} |
| |
(and therefore @code{show-state}) are executed, the system is |
| |
interpreting. |
| |
|
| |
When you pressed @key{RET} after entering the definition of @code{word2}, |
| |
you should have seen the number -1 printed, followed by ``@code{ |
| |
ok}''. When the text interpreter compiled the definition of |
| |
@code{word2}, it encountered @code{show-state-now}, an immediate word, |
| |
whose compilation semantics are therefore to perform its interpretation |
| |
semantics. It is executed straight away (even before the text |
| |
interpreter has moved on to process another group of characters; the |
| |
@code{;} in this example). The effect of executing it is to display the |
| |
value of @code{state} @i{at the time that the definition of} |
| |
@code{word2} @i{is being compiled}. Printing -1 demonstrates that the |
| |
system is compiling at this time. If you execute @code{word2} it does |
| |
nothing at all. |
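
Following the description above, the transcript for the definition of
@code{word2} and for running the two words should look roughly like this:

@example
@kbd{: word2 show-state-now ;@key{RET}} -1 ok
@kbd{word1@key{RET}} 0 ok
@kbd{word2@key{RET}} ok
@end example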
| |
|
| |
@cindex @code{."}, how it works |
| |
Before leaving the subject of immediate words, consider the behaviour of |
| |
@code{."} in the definition of @code{greet}, in the previous |
| |
section. This word is both a parsing word and an immediate word. Notice |
| |
that there is a space between @code{."} and the start of the text |
| |
@code{Hello and welcome}, but that there is no space between the last |
| |
letter of @code{welcome} and the @code{"} character. The reason for this |
| |
is that @code{."} is a Forth word; it must have a space after it so that |
| |
the text interpreter can identify it. The @code{"} is not a Forth word; |
| |
it is a @dfn{delimiter}. The examples earlier show that, when the string |
| |
is displayed, there is neither a space before the @code{H} nor after the |
| |
@code{e}. Since @code{."} is an immediate word, it executes at the time |
| |
that @code{greet} is defined. When it executes, its behaviour is to |
| |
search forward in the input line looking for the delimiter. When it |
| |
finds the delimiter, it updates @code{>IN} to point past the |
| |
delimiter. It also compiles some magic code into the definition of |
| |
@code{greet}; the xt of a run-time routine that prints a text string. It |
| |
compiles the string @code{Hello and welcome} into memory so that it is |
| |
available to be printed later. When the text interpreter gains control, |
| |
the next word it finds in the input stream is @code{;} and so it |
| |
terminates the definition of @code{greet}. |
| |
|
| |
|
| |
@comment ---------------------------------------------- |
| |
@node Forth is written in Forth, Review - elements of a Forth system, How does that work?, Introduction |
| |
@section Forth is written in Forth |
| |
@cindex structure of Forth programs |
| |
|
| |
When you start up a Forth compiler, a large number of definitions |
| |
already exist. In Forth, you develop a new application using bottom-up |
| |
programming techniques to create new definitions that are defined in |
| |
terms of existing definitions. As you create each definition you can |
| |
test and debug it interactively. |
| |
|
| |
If you have tried out the examples in this section, you will probably |
| |
have typed them in by hand; when you leave Gforth, your definitions will |
| |
be lost. You can avoid this by using a text editor to enter Forth source |
| |
code into a file, and then loading code from the file using |
| |
@code{include} (@pxref{Forth source files}). A Forth source file is |
| |
processed by the text interpreter, just as though you had typed it in by |
| |
hand@footnote{Actually, there are some subtle differences -- see |
| |
@ref{The Text Interpreter}.}. |
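
For example, assuming you have saved the following line in a file called
@file{hello.fs} (a hypothetical name) in the current directory:

@example
: greet ." Hello and welcome" ;
@end example

@noindent
you could load the file and run the definition roughly like this:

@example
@kbd{include hello.fs@key{RET}} ok
@kbd{greet@key{RET}} Hello and welcome ok
@end example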
| |
|
| |
Gforth also supports the traditional Forth alternative to using text |
| |
files for program entry (@pxref{Blocks}). |
| |
|
| |
In common with many, if not most, Forth compilers, most of Gforth is |
| |
actually written in Forth. All of the @file{.fs} files in the |
| |
installation directory@footnote{For example, |
| |
@file{/usr/local/share/gforth...}} are Forth source files, which you can |
| |
study to see examples of Forth programming. |
| |
|
| |
Gforth maintains a history file that records every line that you type to |
| |
the text interpreter. This file is preserved between sessions, and is |
| |
used to provide a command-line recall facility. If you enter long |
| |
definitions by hand, you can use a text editor to paste them out of the |
| |
history file into a Forth source file for reuse at a later time |
| |
(for more information @pxref{Command-line editing}). |
| |
|
| |
|
| |
@comment ---------------------------------------------- |
| |
@node Review - elements of a Forth system, Where to go next, Forth is written in Forth, Introduction |
| |
@section Review - elements of a Forth system |
| |
@cindex elements of a Forth system |
| |
|
| |
To summarise this chapter: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
Forth programs use @dfn{factoring} to break a problem down into small |
| |
fragments called @dfn{words} or @dfn{definitions}. |
| |
@item |
| |
Forth program development is an interactive process. |
| |
@item |
| |
The main command loop that accepts input, and controls both |
| |
interpretation and compilation, is called the @dfn{text interpreter} |
| |
(also known as the @dfn{outer interpreter}). |
| |
@item |
| |
Forth has a very simple syntax, consisting of words and numbers |
| |
separated by spaces or carriage-return characters. Any additional syntax |
| |
is imposed by @dfn{parsing words}. |
| |
@item |
| |
Forth uses a stack to pass parameters between words. As a result, it |
| |
uses postfix notation. |
| |
@item |
| |
To use a word that has previously been defined, the text interpreter |
| |
searches for the word in the @dfn{name dictionary}. |
| |
@item |
| |
Words have @dfn{interpretation semantics} and @dfn{compilation semantics}. |
| |
@item |
| |
The text interpreter uses the value of @code{state} to select between |
| |
the use of the @dfn{interpretation semantics} and the @dfn{compilation |
| |
semantics} of a word that it encounters. |
| |
@item |
| |
The relationship between the @dfn{interpretation semantics} and |
| |
@dfn{compilation semantics} for a word |
| |
depends upon the way in which the word was defined (for example, whether |
| |
it is an @dfn{immediate} word). |
| |
@item |
| |
Forth definitions can be implemented in Forth (called @dfn{high-level |
| |
definitions}) or in some other way (usually a lower-level language and |
| |
as a result often called @dfn{low-level definitions}, @dfn{code |
| |
definitions} or @dfn{primitives}). |
| |
@item |
| |
Many Forth systems are implemented mainly in Forth. |
| |
@end itemize |
| |
|
| |
|
| |
@comment ---------------------------------------------- |
| |
@node Where to go next, Exercises, Review - elements of a Forth system, Introduction |
| |
@section Where to go next |
| |
@cindex where to go next |
| |
|
| |
Amazing as it may seem, if you have read (and understood) this far, you |
| |
know almost all the fundamentals about the inner workings of a Forth |
| |
system. You certainly know enough to be able to read and understand the |
| |
rest of this manual and the ANS Forth document, to learn more about the |
| |
facilities that Forth in general and Gforth in particular provide. Even |
| |
scarier, you know almost enough to implement your own Forth system. |
| |
However, that's not a good idea just yet... better to try writing some |
| |
programs in Gforth. |
| |
|
| |
Forth has such a rich vocabulary that it can be hard to know where to |
| |
start in learning it. This section suggests a few sets of words that are |
| |
enough to write small but useful programs. Use the word index in this |
| |
document to learn more about each word, then try it out and try to write |
| |
small definitions using it. Start by experimenting with these words: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
Arithmetic: @code{+ - * / /MOD */ ABS INVERT} |
| |
@item |
| |
Comparison: @code{MIN MAX =} |
| |
@item |
| |
Logic: @code{AND OR XOR NOT} |
| |
@item |
| |
Stack manipulation: @code{DUP DROP SWAP OVER} |
| |
@item |
| |
Loops and decisions: @code{IF ELSE ENDIF ?DO I LOOP} |
| |
@item |
| |
Input/Output: @code{. ." EMIT CR KEY} |
| |
@item |
| |
Defining words: @code{: ; CREATE} |
| |
@item |
| |
Memory allocation words: @code{ALLOT ,} |
| |
@item |
| |
Tools: @code{SEE WORDS .S MARKER} |
| |
@end itemize |
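
As a rough sketch of what this first group is enough for, here are three
small definitions. The names @code{stars}, @code{square} and
@code{.sign} are made up for this example, and @code{0<} is a standard
word that is not in the list above:

@example
: stars ( n -- )  0 ?do 42 emit loop ;     \ 42 is the ASCII code for *
: square ( n -- n*n )  dup * ;
: .sign ( n -- )  0< if ." negative" else ." not negative" endif ;

5 stars cr      \ prints: *****
7 square .      \ prints: 49
-3 .sign        \ prints: negative
@end example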
| |
|
| |
When you have mastered those, go on to: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
More defining words: @code{VARIABLE CONSTANT VALUE TO CREATE DOES>} |
| |
@item |
| |
Memory access: @code{@@ !} |
| |
@end itemize |
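
A short sketch of this second group, using hypothetical names
(@code{counter}, @code{bump}, @code{ten} and @code{speed}):

@example
variable counter              \ reserve one cell
0 counter !                   \ store 0 in it
: bump ( -- )  counter @@ 1 + counter ! ;
bump bump counter @@ .         \ prints: 2

10 constant ten
5 value speed
ten speed + .                 \ prints: 15
20 to speed
speed .                       \ prints: 20
@end example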
| |
|
| |
When you have mastered these, there's nothing for it but to read through |
| |
the whole of this manual and find out what you've missed. |
| |
|
| |
@comment ---------------------------------------------- |
| |
@node Exercises, , Where to go next, Introduction |
| |
@section Exercises |
| |
@cindex exercises |
| |
|
| |
TODO: provide a set of programming exercises linked into the stuff done |
| |
already and into other sections of the manual. Provide solutions to all |
| |
the exercises in a .fs file in the distribution. |
| |
|
| |
@c Get some inspiration from Starting Forth and Kelly&Spies. |
| |
|
| |
@c exercises: |
| |
@c 1. take inches and convert to feet and inches. |
| |
@c 2. take temperature and convert from Fahrenheit to Celsius; |
| |
@c may need to care about symmetric vs floored?? |
| |
@c 3. take input line and do character substitution |
| |
@c to encipher or decipher |
| |
@c 4. as above but work on a file for in and out |
| |
@c 5. take input line and convert to pig-latin |
| |
@c |
| |
@c think of sets of things to exercise then come up with |
| |
@c problems that need those things. |
| |
|
| |
|
| |
@c ****************************************************************** |
| |
@node Words, Error messages, Introduction, Top |
| |
@chapter Forth Words |
| |
@cindex words |
| |
|
| |
@menu |
| |
* Notation:: |
| |
* Case insensitivity:: |
| |
* Comments:: |
| |
* Boolean Flags:: |
| |
* Arithmetic:: |
| |
* Stack Manipulation:: |
| |
* Memory:: |
| |
* Control Structures:: |
| |
* Defining Words:: |
| |
* Interpretation and Compilation Semantics:: |
| |
* Tokens for Words:: |
| |
* The Text Interpreter:: |
| |
* Word Lists:: |
| |
* Environmental Queries:: |
| |
* Files:: |
| |
* Blocks:: |
| |
* Other I/O:: |
| |
* Locals:: |
| |
* Structures:: |
| |
* Object-oriented Forth:: |
| |
* Programming Tools:: |
| |
* Assembler and Code Words:: |
| |
* Threading Words:: |
| |
* Passing Commands to the OS:: |
| |
* Keeping track of Time:: |
| |
* Miscellaneous Words:: |
| |
@end menu |
| |
|
| |
@node Notation, Case insensitivity, Words, Words |
| |
@section Notation |
| |
@cindex notation of glossary entries |
| |
@cindex format of glossary entries |
| |
@cindex glossary notation format |
| |
@cindex word glossary entry format |
| |
|
| |
The Forth words are described in this section in the glossary notation |
| |
that has become a de-facto standard for Forth texts: |
| |
|
| |
@format |
| |
@i{word} @i{Stack effect} @i{wordset} @i{pronunciation} |
| |
@end format |
| |
@i{Description} |
| |
|
| |
@table @var |
| |
@item word |
| |
The name of the word. |
| |
|
| |
@item Stack effect |
| |
@cindex stack effect |
| |
The stack effect is written in the notation @code{@i{before} -- |
| |
@i{after}}, where @i{before} and @i{after} describe the top of |
| |
stack entries before and after the execution of the word. The rest of |
| |
the stack is not touched by the word. The top of stack is rightmost, |
| |
i.e., a stack sequence is written as it is typed in. Note that Gforth |
| |
uses a separate floating point stack, but a unified stack |
| |
notation. Also, return stack effects are not shown in @i{stack |
| |
effect}, but in @i{Description}. The name of a stack item describes |
| |
the type and/or the function of the item. See below for a discussion of |
| |
the types. |
| |
|
| |
All words have two stack effects: A compile-time stack effect and a |
| |
run-time stack effect. The compile-time stack-effect of most words is |
| |
@i{ -- }. If the compile-time stack-effect of a word deviates from |
| |
this standard behaviour, or the word does other unusual things at |
| |
compile time, both stack effects are shown; otherwise only the run-time |
| |
stack effect is shown. |
| |
|
| |
@cindex pronunciation of words |
| |
@item pronunciation |
| |
How the word is pronounced. |
| |
|
| |
@cindex wordset |
| |
@cindex environment wordset |
| |
@item wordset |
| |
The ANS Forth standard is divided into several word sets. A standard |
| |
system need not support all of them. Therefore, in theory, the fewer |
| |
word sets your program uses the more portable it will be. However, we |
| |
suspect that most ANS Forth systems on personal machines will feature |
| |
all word sets. Words that are not defined in ANS Forth have |
| |
@code{gforth} or @code{gforth-internal} as word set. @code{gforth} |
| |
describes words that will work in future releases of Gforth; |
| |
@code{gforth-internal} words are more volatile. Environmental query |
| |
strings are also displayed like words; you can recognize them by the |
| |
@code{environment} in the word set field. |
| |
|
| |
@item Description |
| |
A description of the behaviour of the word. |
| |
@end table |
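
In source code, the same @i{before} -- @i{after} notation is
conventionally written as a comment after the name of a definition. A
sketch with two made-up words:

@example
: square ( n1 -- n2 )  dup * ;
: sum-of-squares ( n1 n2 -- n3 )  square swap square + ;
3 4 sum-of-squares .    \ prints: 25
@end example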
| |
|
| |
@cindex types of stack items |
| |
@cindex stack item types |
| |
The type of a stack item is specified by the character(s) the name |
| |
starts with: |
| |
|
| |
@table @code |
| |
@item f |
| |
@cindex @code{f}, stack item type |
| |
Boolean flags, i.e. @code{false} or @code{true}. |
| |
@item c |
| |
@cindex @code{c}, stack item type |
| |
Char |
| |
@item w |
| |
@cindex @code{w}, stack item type |
| |
Cell, can contain an integer or an address |
| |
@item n |
| |
@cindex @code{n}, stack item type |
| |
signed integer |
| |
@item u |
| |
@cindex @code{u}, stack item type |
| |
unsigned integer |
| |
@item d |
| |
@cindex @code{d}, stack item type |
| |
double sized signed integer |
| |
@item ud |
| |
@cindex @code{ud}, stack item type |
| |
double sized unsigned integer |
| |
@item r |
| |
@cindex @code{r}, stack item type |
| |
Float (on the FP stack) |
| |
@item a- |
| |
@cindex @code{a_}, stack item type |
| |
Cell-aligned address |
| |
@item c- |
| |
@cindex @code{c_}, stack item type |
| |
Char-aligned address (note that a Char may have two bytes in Windows NT) |
| |
@item f- |
| |
@cindex @code{f_}, stack item type |
| |
Float-aligned address |
| |
@item df- |
| |
@cindex @code{df_}, stack item type |
| |
Address aligned for IEEE double precision float |
| |
@item sf- |
| |
@cindex @code{sf_}, stack item type |
| |
Address aligned for IEEE single precision float |
| |
@item xt |
| |
@cindex @code{xt}, stack item type |
| |
Execution token, same size as Cell |
| |
@item wid |
| |
@cindex @code{wid}, stack item type |
| |
Word list ID, same size as Cell |
| |
@item ior, wior |
| |
@cindex ior type description |
| |
@cindex wior type description |
| |
I/O result code, cell-sized. In Gforth, you can @code{throw} iors. |
| |
@item f83name |
| |
@cindex @code{f83name}, stack item type |
| |
Pointer to a name structure |
| |
@item " |
| |
@cindex @code{"}, stack item type |
| |
string in the input stream (not on the stack). The terminating character |
| |
is a blank by default. If it is not a blank, it is shown in @code{<>} |
| |
quotes. |
| |
@end table |
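
For instance, the type prefixes let a reader see at a glance what a word
consumes and produces. A sketch with two hypothetical definitions, using
the standard words @code{within}, @code{2dup} and @code{cell+}:

@example
: is-digit? ( c -- f )  48 58 within ;         \ the digits 0 to 9 are ASCII 48 to 57
: fill-pair ( w a-addr -- )  2dup ! cell+ ! ;  \ store w in two consecutive cells

53 is-digit? .    \ prints: -1   (53 is the character 5)
65 is-digit? .    \ prints: 0    (65 is the character A)
@end example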
| |
|
| |
@comment ---------------------------------------------- |
| |
@node Case insensitivity, Comments, Notation, Words |
| |
@section Case insensitivity |
| |
@cindex case sensitivity |
| |
@cindex upper and lower case |
| |
|
| |
Gforth is case-insensitive; you can enter definitions and invoke |
| |
Standard words using upper, lower or mixed case (however, |
| |
@pxref{core-idef, Implementation-defined options, Implementation-defined |
| |
options}). |
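
For example, in Gforth all three of the following invocations find the
same definition (the name @code{Ping} is made up):

@example
: Ping ." ping " ;
ping PING Ping        \ prints: ping ping ping
@end example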
| |
|
| |
ANS Forth only @i{requires} implementations to recognise Standard words |
| |
when they are typed entirely in upper case. Therefore, a Standard |
| |
program must use upper case for all Standard words. You can use whatever |
| |
case you like for words that you define, but in a Standard program you |
| |
have to use the words in the same case that you defined them. |
| |
|
| |
Gforth supports case sensitivity through @code{table}s (case-sensitive |
| |
wordlists, @pxref{Word Lists}). |
| |
|
| |
Two people have asked how to convert Gforth to be case-sensitive; while |
| |
we think this is a bad idea, you can change all wordlists into tables |
| |
like this: |
| |
|
| |
@example |
| |
' table-find forth-wordlist wordlist-map @@ ! |
| |
@end example |
| |
|
| |
Note that you now have to type the predefined words in the same case |
| |
that we defined them, which varies. You may want to convert them |
| |
to your favourite case before doing this operation (I won't explain how, |
| |
because if you are even contemplating doing this, you'd better have |
| |
enough knowledge of Forth systems to know this already). |
| |
|
| |
@node Comments, Boolean Flags, Case insensitivity, Words |
| |
@section Comments |
| |
@cindex comments |
| |
|
| |
Forth supports two styles of comment: the traditional @i{in-line} comment, |
| |
@code{(}, and its modern cousin, the @i{comment to end of line}, @code{\}. |
| |
|
| |
|
| |
doc-( |
| |
doc-\ |
| |
doc-\G |
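
A brief sketch showing both styles in a definition (the name
@code{plus-two} is hypothetical):

@example
: plus-two ( n -- n+2 )   \ the parenthesised part is an in-line comment
  2 + ;                   \ this comment runs to the end of the line
( an in-line comment can also stand on its own )
@end example

@noindent
Both @code{(} and @code{\} are Forth words, so each must be followed by
a space.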
| |
|
| |
|
| |
@node Boolean Flags, Arithmetic, Comments, Words |
| |
@section Boolean Flags |
| |
@cindex Boolean flags |
| |
|
| |
A Boolean flag is cell-sized. A cell with all bits clear represents the |
| |
flag @code{false} and a cell with all bits set represents the flag |
| |
@code{true}. Words that check a flag (for example, @code{IF}) will treat |
| |
a cell that has @i{any} bit set as @code{true}. |
| |
@c on and off to Memory? |
| |
@c true and false to "Bitwise operations" or "Numeric comparison"? |
| |
|
| |
doc-true |
| |
doc-false |
| |
doc-on |
| |
doc-off |
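
A few lines to try, using a hypothetical word @code{yes-or-no} and a
hypothetical variable @code{lamp}:

@example
true .               \ prints: -1   (all bits set)
false .              \ prints: 0
: yes-or-no ( x -- )  if ." yes" else ." no" endif ;
5 yes-or-no          \ prints: yes  (any non-zero cell counts as true)
0 yes-or-no          \ prints: no
variable lamp
lamp on   lamp @@ .   \ prints: -1
lamp off  lamp @@ .   \ prints: 0
@end example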
| |
|
| |
|
| |
@node Arithmetic, Stack Manipulation, Boolean Flags, Words |
| |
@section Arithmetic |
| |
@cindex arithmetic words |
| |
|
| |
@cindex division with potentially negative operands |
| |
Forth arithmetic is not checked, i.e., you will not hear about integer |
| |
overflow on addition or multiplication; you may hear about division by |
| |
zero if you are lucky. The operator is written after the operands, but |
| |
the operands are still in the original order. I.e., the infix @code{2-1} |
| |
corresponds to @code{2 1 -}. Forth offers a variety of division |
| |
operators. If you perform division with potentially negative operands, |
| |
you do not want to use @code{/} or @code{/mod} with its undefined |
| |
behaviour, but rather @code{fm/mod} or @code{sm/rem} (probably the |
| |
former, @pxref{Mixed precision}). |
| |
@comment TODO discuss the different division forms and the std approach |
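
The difference between the division words shows up as soon as one
operand is negative. A sketch using the standard words @code{s>d} (to
make the double-cell dividend), @code{fm/mod} (floored division) and
@code{sm/rem} (symmetric division):

@example
2 1 - .                 \ prints: 1
-7 s>d 2 fm/mod . .     \ prints: -4 1     (quotient, then remainder)
-7 s>d 2 sm/rem . .     \ prints: -3 -1
@end example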
| |
|
| |
@menu |
| |
* Single precision:: |
| |
* Double precision:: Double-cell integer arithmetic |
| |
* Bitwise operations:: |
| |
* Numeric comparison:: |
| |
* Mixed precision:: Operations with single and double-cell integers |
| |
* Floating Point:: |
| |
@end menu |
| |
|
| |
@node Single precision, Double precision, Arithmetic, Arithmetic |
| |
@subsection Single precision |
| |
@cindex single precision arithmetic words |
| |
|
| |
@c !! cell undefined |
| |
|
| |
By default, numbers in Forth are single-precision integers that are one |
| |
cell in size. They can be signed or unsigned, depending upon how you |
| |
treat them. For the rules used by the text interpreter for recognising |
| |
single-precision integers see @ref{Number Conversion}. |
| |
|
| |
These words are all defined for signed operands, but some of them also |
| |
work for unsigned numbers: @code{+}, @code{1+}, @code{-}, @code{1-}, |
| |
@code{*}. |
| |
|
| |
doc-+ |
| |
doc-1+ |
| |
doc-- |
| |
doc-1- |
| |
doc-* |
| |
doc-/ |
| |
doc-mod |
| |
doc-/mod |
| |
doc-negate |
| |
doc-abs |
| |
doc-min |
| |
doc-max |
| |
doc-floored |
| |
|
| |
|
| |
@node Double precision, Bitwise operations, Single precision, Arithmetic |
| |
@subsection Double precision |
| |
@cindex double precision arithmetic words |
| |
|
| |
For the rules used by the text interpreter for |
| |
recognising double-precision integers, see @ref{Number Conversion}. |
| |
|
| |
A double precision number is represented by a cell pair, with the most |
| |
significant cell at the TOS. It is trivial to convert an unsigned single |
| |
to a double: simply push a @code{0} onto the TOS. Since numbers are |
| |
represented by Gforth using 2's complement arithmetic, converting a |
| |
signed single to a (signed) double requires sign-extension across the |
| |
most significant cell. This can be achieved using @code{s>d}. The moral |
| |
of the story is that you cannot convert a number without knowing whether |
| |
it represents an unsigned or a signed number. |
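
The following sketch illustrates the point; it assumes the standard
word @code{d.} for printing a signed double:

@example
5 0 d.       \ unsigned single 5, extended with 0: prints 5
-1 s>d d.    \ signed single -1, sign-extended:    prints -1
-1 0 d.      \ -1 treated as unsigned: prints a large positive number
@end example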
| |
|
| |
These words are all defined for signed operands, but some of them also |
| |
work for unsigned numbers: @code{d+}, @code{d-}. |
| |
|
| |
doc-s>d |
| |
doc-d>s |
| |
doc-d+ |
| |
doc-d- |
| |
doc-dnegate |
| |
doc-dabs |
| |
doc-dmin |
| |
doc-dmax |
| |
|
| |
|
| |
@node Bitwise operations, Numeric comparison, Double precision, Arithmetic |
| |
@subsection Bitwise operations |
| |
@cindex bitwise operation words |
| |
|
| |
|
| |
doc-and |
| |
doc-or |
| |
doc-xor |
| |
doc-invert |
| |
doc-lshift |
| |
doc-rshift |
| |
doc-2* |
| |
doc-d2* |
| |
doc-2/ |
| |
doc-d2/ |
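
A few illustrative uses follow (a sketch; the operands are arbitrary
and the results are shown in decimal):

@example
12 10 and .     \ prints 8
12 10 or .      \ prints 14
12 10 xor .     \ prints 6
0 invert .      \ prints -1 (all bits set, in 2's complement)
1 3 lshift .    \ prints 8
-8 2/ .         \ prints -4 (2/ is an arithmetic shift)
@end example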
| |
|
| |
|
| |
@node Numeric comparison, Mixed precision, Bitwise operations, Arithmetic |
| |
@subsection Numeric comparison |
| |
@cindex numeric comparison words |
| |
|
| |
Note that the words that compare for equality (@code{= <> 0= 0<> d= d<> |
| |
d0= d0<>}) work for both signed and unsigned numbers.
| |
|
| |
doc-< |
| |
doc-<= |
| |
doc-<> |
| |
doc-= |
| |
doc-> |
| |
doc->= |
| |
|
| |
doc-0< |
| |
doc-0<= |
| |
doc-0<> |
| |
doc-0= |
| |
doc-0> |
| |
doc-0>= |
| |
|
| |
doc-u< |
| |
doc-u<= |
| |
@c u<> and u= exist but are the same as <> and = |
| |
@c doc-u<> |
| |
@c doc-u= |
| |
doc-u> |
| |
doc-u>= |
| |
|
| |
doc-within |
| |
|
| |
doc-d< |
| |
doc-d<= |
| |
doc-d<> |
| |
doc-d= |
| |
doc-d> |
| |
doc-d>= |
| |
|
| |
doc-d0< |
| |
doc-d0<= |
| |
doc-d0<> |
| |
doc-d0= |
| |
doc-d0> |
| |
doc-d0>= |
| |
|
| |
doc-du< |
| |
doc-du<= |
| |
@c du<> and du= exist but are the same as d<> and d= |
| |
@c doc-du<> |
| |
@c doc-du= |
| |
doc-du> |
| |
doc-du>= |
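
The difference between the signed and unsigned comparison words shows
up as soon as the most significant bit is set. A small sketch (flags
print as @minus{}1 for true and 0 for false):

@example
-1 1 <  .        \ prints -1: as signed numbers, -1 is less than 1
-1 1 u< .        \ prints 0: as an unsigned number, -1 is the largest cell value
3 1 5 within .   \ prints -1: 1 <= 3 < 5
5 1 5 within .   \ prints 0: the upper bound is excluded
@end example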
| |
|
| |
|
| |
@node Mixed precision, Floating Point, Numeric comparison, Arithmetic |
| |
@subsection Mixed precision |
| |
@cindex mixed precision arithmetic words |
| |
|
| |
|
| |
doc-m+ |
| |
doc-*/ |
| |
doc-*/mod |
| |
doc-m* |
| |
doc-um* |
| |
doc-m*/ |
| |
doc-um/mod |
| |
doc-fm/mod |
| |
doc-sm/rem |
| |
|
| |
|
| |
@node Floating Point, , Mixed precision, Arithmetic |
| |
@subsection Floating Point |
| |
@cindex floating point arithmetic words |
| |
|
| |
For the rules used by the text interpreter for |
| |
recognising floating-point numbers see @ref{Number Conversion}. |
| |
|
| |
Gforth has a separate floating point stack, but the documentation uses |
| |
the unified notation.@footnote{It's easy to generate the separate |
| |
notation from that by just separating the floating-point numbers out: |
| |
e.g. @code{( n r1 u r2 -- r3 )} becomes @code{( n u -- ) ( F: r1 r2 -- |
| |
r3 )}.} |
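
For example (a sketch; the values are arbitrary), integers and
floating-point numbers can be interleaved in the input, because they go
to different stacks:

@example
2 1.5e 3 2.5e    \ data stack: 2 3     FP stack: 1.5 2.5
f+ +             \ data stack: 5       FP stack: 4
. f.             \ prints the integer 5, then the floating-point 4
@end example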
| |
|
| |
@cindex floating-point arithmetic, pitfalls |
| |
Floating point numbers have a number of unpleasant surprises for the |
| |
unwary (e.g., floating point addition is not associative) and even a few |
| |
for the wary. You should not use them unless you know what you are doing |
| |
or you don't care that the results you get are totally bogus. If you |
| |
want to learn about the problems of floating point numbers (and how to |
| |
avoid them), you might start with @cite{David Goldberg, |
| |
@uref{http://www.validgh.com/goldberg/paper.ps,What Every Computer |
| |
Scientist Should Know About Floating-Point Arithmetic}, ACM Computing |
| |
Surveys 23(1):5@minus{}48, March 1991}. |
| |
|
| |
|
| |
doc-d>f |
| |
doc-f>d |
| |
doc-f+ |
| |
doc-f- |
| |
doc-f* |
| |
doc-f/ |
| |
doc-fnegate |
| |
doc-fabs |
| |
doc-fmax |
| |
doc-fmin |
| |
doc-floor |
| |
doc-fround |
| |
doc-f** |
| |
doc-fsqrt |
| |
doc-fexp |
| |
doc-fexpm1 |
| |
doc-fln |
| |
doc-flnp1 |
| |
doc-flog |
| |
doc-falog |
| |
doc-f2* |
| |
doc-f2/ |
| |
doc-1/f |
| |
doc-precision |
| |
doc-set-precision |
| |
|
| |
@cindex angles in trigonometric operations |
| |
@cindex trigonometric operations |
| |
Angles in floating point operations are given in radians (a full circle |
| |
has 2 pi radians). |
| |
|
| |
doc-fsin |
| |
doc-fcos |
| |
doc-fsincos |
| |
doc-ftan |
| |
doc-fasin |
| |
doc-facos |
| |
doc-fatan |
| |
doc-fatan2 |
| |
doc-fsinh |
| |
doc-fcosh |
| |
doc-ftanh |
| |
doc-fasinh |
| |
doc-facosh |
| |
doc-fatanh |
| |
doc-pi |
| |
|
| |
@cindex equality of floats |
| |
@cindex floating-point comparisons |
| |
One particular problem with floating-point arithmetic is that comparison |
| |
for equality often fails when you would expect it to succeed. For this |
| |
reason approximate equality is often preferred (but you still have to |
| |
know what you are doing). Also note that IEEE NaNs may compare |
| |
differently from what you might expect. The comparison words are: |
| |
|
| |
doc-f~rel |
| |
doc-f~abs |
| |
doc-f~ |
| |
doc-f= |
| |
doc-f<> |
| |
|
| |
doc-f< |
| |
doc-f<= |
| |
doc-f> |
| |
doc-f>= |
| |
|
| |
doc-f0< |
| |
doc-f0<= |
| |
doc-f0<> |
| |
doc-f0= |
| |
doc-f0> |
| |
doc-f0>= |
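
A sketch of the problem and of an approximate comparison, assuming
IEEE double-precision floats (the tolerance @code{1e-9} is an arbitrary
choice for illustration):

@example
0.1e 0.2e f+ 0.3e f= .         \ prints 0: the sum is not exactly 0.3
0.1e 0.2e f+ 0.3e 1e-9 f~ .    \ prints -1: equal within 1e-9 (absolute)
@end example

@noindent
With a positive third argument, @code{f~} performs an absolute
comparison, like @code{f~abs}.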
| |
|
| |
|
| |
@node Stack Manipulation, Memory, Arithmetic, Words |
| |
@section Stack Manipulation |
| |
@cindex stack manipulation words |
| |
|
| |
@cindex floating-point stack in the standard |
| |
Gforth maintains a number of separate stacks: |
| |
|
| |
@cindex data stack |
| |
@cindex parameter stack |
| |
@itemize @bullet |
| |
@item |
| |
A data stack (also known as the @dfn{parameter stack}) -- for |
| |
characters, cells, addresses, and double cells. |
| |
|
| |
@cindex floating-point stack |
| |
@item |
| |
A floating point stack -- for holding floating point (FP) numbers. |
| |
|
| |
@cindex return stack |
| |
@item |
| |
A return stack -- for holding the return addresses of colon |
| |
definitions and other (non-FP) data. |
| |
|
| |
@cindex locals stack |
| |
@item |
| |
A locals stack -- for holding local variables. |
| |
@end itemize |
| |
|
| |
@menu |
| |
* Data stack:: |
| |
* Floating point stack:: |
| |
* Return stack:: |
| |
* Locals stack:: |
| |
* Stack pointer manipulation:: |
| |
@end menu |
| |
|
| |
@node Data stack, Floating point stack, Stack Manipulation, Stack Manipulation |
| |
@subsection Data stack |
| |
@cindex data stack manipulation words |
| |
@cindex stack manipulations words, data stack |
| |
|
| |
|
| |
doc-drop |
| |
doc-nip |
| |
doc-dup |
| |
doc-over |
| |
doc-tuck |
| |
doc-swap |
| |
doc-pick |
| |
doc-rot |
| |
doc--rot |
| |
doc-?dup |
| |
doc-roll |
| |
doc-2drop |
| |
doc-2nip |
| |
doc-2dup |
| |
doc-2over |
| |
doc-2tuck |
| |
doc-2swap |
| |
doc-2rot |
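
The effect of some of these words is sketched below with stack
comments. Each line is meant to start with an empty stack, and the
rightmost item is the top of the stack:

@example
1 2 swap        \ -- 2 1
1 2 over        \ -- 1 2 1
1 2 nip         \ -- 2
1 2 tuck        \ -- 2 1 2
1 2 3 rot       \ -- 2 3 1
1 2 3 -rot      \ -- 3 1 2
1 2 3 2 pick    \ -- 1 2 3 1
@end example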
| |
|
| |
|
| |
@node Floating point stack, Return stack, Data stack, Stack Manipulation |
| |
@subsection Floating point stack |
| |
@cindex floating-point stack manipulation words |
| |
@cindex stack manipulation words, floating-point stack |
| |
|
| |
Whilst every sane Forth has a separate floating-point stack, it is not |
| |
strictly required; an ANS Forth system could theoretically keep |
| |
floating-point numbers on the data stack. As an additional difficulty, |
| |
you don't know how many cells a floating-point number takes. It is |
| |
reportedly possible to write words in a way that they work also for a |
| |
unified stack model, but we do not recommend trying it. Instead, just |
| |
say that your program has an environmental dependency on a separate |
| |
floating-point stack. |
| |
|
| |
doc-floating-stack |
| |
|
| |
doc-fdrop |
| |
doc-fnip |
| |
doc-fdup |
| |
doc-fover |
| |
doc-ftuck |
| |
doc-fswap |
| |
doc-fpick |
| |
doc-frot |
| |
|
| |
|
| |
@node Return stack, Locals stack, Floating point stack, Stack Manipulation |
| |
@subsection Return stack |
| |
@cindex return stack manipulation words |
| |
@cindex stack manipulation words, return stack |
| |
|
| |
@cindex return stack and locals |
| |
@cindex locals and return stack |
| |
A Forth system is allowed to keep local variables on the |
| |
return stack. This is reasonable, as local variables usually eliminate |
| |
the need to use the return stack explicitly. So, if you want to produce |
| |
a standard compliant program and you are using local variables in a |
| |
word, forget about return stack manipulations in that word (refer to the |
| |
standard document for the exact rules). |
| |
|
| |
doc->r |
| |
doc-r> |
| |
doc-r@ |
| |
doc-rdrop |
| |
doc-2>r |
| |
doc-2r> |
| |
doc-2r@ |
| |
doc-2rdrop |
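
A typical use of the return stack is to park an item temporarily
within one definition. In the following sketch (the name @code{third}
is made up for this example) every @code{>r} is matched by an
@code{r>} before the definition ends:

@example
: third ( a b c -- a b c a )
    >r          \ a b      R: c
    over        \ a b a    R: c
    r>          \ a b a c
    swap ;      \ a b c a
@end example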
| |
|
| |
|
| |
@node Locals stack, Stack pointer manipulation, Return stack, Stack Manipulation |
| |
@subsection Locals stack |
| |
|
| |
Gforth uses an extra locals stack. It is described, along with the |
| |
reasons for its existence, in @ref{Locals implementation}. |
| |
|
| |
@node Stack pointer manipulation, , Locals stack, Stack Manipulation |
| |
@subsection Stack pointer manipulation |
| |
@cindex stack pointer manipulation words |
| |
|
| |
@c removed s0 r0 l0 -- they are obsolete aliases for sp0 rp0 lp0 |
| |
doc-sp0 |
| |
doc-sp@ |
| |
doc-sp! |
| |
doc-fp0 |
| |
doc-fp@ |
| |
doc-fp! |
| |
doc-rp0 |
| |
doc-rp@ |
| |
doc-rp! |
| |
doc-lp0 |
| |
doc-lp@ |
| |
doc-lp! |
| |
|
| |
|
| |
@node Memory, Control Structures, Stack Manipulation, Words |
| |
@section Memory |
| |
@cindex memory words |
| |
|
| |
@menu |
| |
* Memory model:: |
| |
* Dictionary allocation:: |
| |
* Heap Allocation:: |
| |
* Memory Access:: |
| |
* Address arithmetic:: |
| |
* Memory Blocks:: |
| |
@end menu |
| |
|
| |
In addition to the standard Forth memory allocation words, there is also |
| |
a @uref{http://www.complang.tuwien.ac.at/forth/garbage-collection.zip, |
| |
garbage collector}. |
| |
|
| |
@node Memory model, Dictionary allocation, Memory, Memory |
| |
@subsection ANS Forth and Gforth memory models |
| |
|
| |
@c The ANS Forth description is a mess (e.g., is the heap part of |
| |
@c the dictionary?), so let's not stick too closely to it.
| |
|
| |
ANS Forth considers a Forth system as consisting of several address |
| |
spaces, of which only @dfn{data space} is managed and accessible with |
| |
the memory words. Memory not necessarily in data space includes the |
| |
stacks, the code (called code space) and the headers (called name |
| |
space). In Gforth everything is in data space, but the code for the |
| |
primitives is usually read-only. |
| |
|
| |
Data space is divided into a number of areas: The (data space portion of |
| |
the) dictionary@footnote{Sometimes, the term @dfn{dictionary} is used to |
| |
refer to the search data structure embodied in word lists and headers, |
| |
because it is used for looking up names, just as you would in a |
| |
conventional dictionary.}, the heap, and a number of system-allocated |
| |
buffers. |
| |
|
| |
@cindex address arithmetic restrictions, ANS vs. Gforth |
| |
@cindex contiguous regions, ANS vs. Gforth |
| |
In ANS Forth data space is also divided into contiguous regions. You |
| |
can only use address arithmetic within a contiguous region, not between |
| |
them. Usually each allocation gives you one contiguous region, but the |
| |
dictionary allocation words have additional rules (@pxref{Dictionary |
| |
allocation}). |
| |
|
| |
Gforth provides one big address space, and address arithmetic can be
performed between any addresses. However, in the dictionary, headers and
code are interleaved with data, so the contiguous data space regions
there are almost exclusively those that ANS Forth describes as
contiguous. You can be sure, though, that the dictionary is allocated
towards increasing addresses even between contiguous regions. The memory
order of allocations in the heap is platform-dependent (and possibly
different from one run to the next).
| |
|
| |
|
| |
@node Dictionary allocation, Heap Allocation, Memory model, Memory |
| |
@subsection Dictionary allocation |
| |
@cindex reserving data space |
| |
@cindex data space - reserving some |
| |
|
| |
Dictionary allocation is a stack-oriented allocation scheme, i.e., if |
| |
you want to deallocate X, you also deallocate everything |
| |
allocated after X. |
| |
|
| |
@cindex contiguous regions in dictionary allocation |
| |
The allocations using the words below are contiguous and grow the region |
| |
towards increasing addresses. Other words that allocate dictionary |
| |
memory of any kind (i.e., defining words including @code{:noname}) end |
| |
the contiguous region and start a new one. |
| |
|
| |
In ANS Forth only @code{create}d words are guaranteed to produce an |
| |
address that is the start of the following contiguous region. In |
| |
particular, the cell allocated by @code{variable} is not guaranteed to |
| |
be contiguous with following @code{allot}ed memory. |
| |
|
| |
You can deallocate memory by using @code{allot} with a negative argument |
| |
(with some restrictions, see @code{allot}). For larger deallocations use |
| |
@code{marker}. |
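
A sketch of both forms of deallocation (the names @code{scratch-mark}
and @code{scratch} are made up for this example):

@example
marker scratch-mark              \ remember the current dictionary state
create scratch 100 cells allot   \ reserve 100 cells after scratch
-10 cells allot                  \ give the last 10 cells back
scratch-mark                     \ forget scratch (and scratch-mark itself)
@end example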
| |
|
| |
|
| |
doc-here |
| |
doc-unused |
| |
doc-allot |
| |
doc-c, |
| |
doc-f, |
| |
doc-, |
| |
doc-2, |
| |
|
| |
Memory accesses have to be aligned (@pxref{Address arithmetic}), so you
should allocate memory in an aligned way, too: before allocating a cell,
@code{here} must be cell-aligned, etc. The words below align
@code{here} if it is not already aligned. Basically, @code{here} is
already aligned for a type only if the last allocation was a multiple of
the size of this type and @code{here} was aligned for this type before
that allocation.
| |
|
| |
After freshly @code{create}ing a word, @code{here} is @code{align}ed in |
| |
ANS Forth (@code{maxalign}ed in Gforth). |
| |
|
| |
doc-align |
| |
doc-falign |
| |
doc-sfalign |
| |
doc-dfalign |
| |
doc-maxalign |
| |
doc-cfalign |
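
For example, the following sketch lays out a character followed by a
cell (the name @code{rec} is made up for this example):

@example
create rec          \ here is aligned after create
  char A c,         \ one char; here may now be unaligned for cells
  align             \ pad here up to the next cell boundary
  1234 ,            \ the cell is stored at an aligned address
rec c@@ emit              \ prints A
rec char+ aligned @@ .    \ prints 1234
@end example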
| |
|
| |
|
| |
@node Heap Allocation, Memory Access, Dictionary allocation, Memory |
| |
@subsection Heap allocation |
| |
@cindex heap allocation |
| |
@cindex dynamic allocation of memory |
| |
@cindex memory-allocation word set |
| |
|
| |
@cindex contiguous regions and heap allocation |
| |
Heap allocation supports deallocation of allocated memory in any |
| |
order. Dictionary allocation is not affected by it (i.e., it does not |
| |
end a contiguous region). In Gforth, these words are implemented using |
| |
the standard C library calls malloc(), free() and realloc().
| |
|
| |
The memory region produced by one invocation of @code{allocate} or |
| |
@code{resize} is internally contiguous. There is no contiguity between |
| |
such a region and any other region (including others allocated from the |
| |
heap). |
| |
|
| |
doc-allocate |
| |
doc-free |
| |
doc-resize |
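
A typical usage pattern is sketched below. Each word returns an I/O
result code (@i{wior}) that is zero on success; passing it to
@code{throw} is one common way of handling failure:

@example
100 cells allocate throw   ( addr )    \ get room for 100 cells
dup 100 cells erase        ( addr )    \ the memory is not initialised
200 cells resize throw     ( addr2 )   \ grow it; the address may change
free throw                 ( )         \ give the memory back
@end example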
| |
|
| |
|
| |
@node Memory Access, Address arithmetic, Heap Allocation, Memory |
| |
@subsection Memory Access |
| |
@cindex memory access words |
| |
|
| |
doc-@ |
| |
doc-! |
| |
doc-+! |
| |
doc-c@ |
| |
doc-c! |
| |
doc-2@ |
| |
doc-2! |
| |
doc-f@ |
| |
doc-f! |
| |
doc-sf@ |
| |
doc-sf! |
| |
doc-df@ |
| |
doc-df! |
| |
|
| |
|
| |
@node Address arithmetic, Memory Blocks, Memory Access, Memory |
| |
@subsection Address arithmetic |
| |
@cindex address arithmetic words |
| |
|
| |
Address arithmetic is the foundation on which you can build data |
| |
structures like arrays, records (@pxref{Structures}) and objects |
| |
(@pxref{Object-oriented Forth}). |
| |
|
| |
@cindex address unit |
| |
@cindex au (address unit) |
| |
ANS Forth does not specify the sizes of the data types. Instead, it |
| |
offers a number of words for computing sizes and doing address |
| |
arithmetic. Address arithmetic is performed in terms of address units |
| |
(aus); on most systems the address unit is one byte. Note that a
character may occupy more than one au, so @code{chars} is not
necessarily a no-op (on platforms where it is a no-op, it compiles to
nothing).
| |
|
| |
The basic address arithmetic words are @code{+} and @code{-}. E.g., if |
| |
you have the address of a cell, perform @code{1 cells +}, and you will |
| |
have the address of the next cell. |
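
For instance (a sketch; @code{pair} is a made-up name):

@example
create pair  11 , 22 ,     \ two consecutive cells
pair @@ .                  \ prints 11
pair 1 cells + @@ .        \ prints 22
pair cell+ @@ .            \ the same address, so this prints 22, too
@end example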
| |
|
| |
@cindex contiguous regions and address arithmetic |
| |
In ANS Forth you can perform address arithmetic only within a contiguous |
| |
region, i.e., if you have an address into one region, you can only add |
| |
and subtract such that the result is still within the region; you can |
| |
only subtract or compare addresses from within the same contiguous |
| |
region. Reasons: several contiguous regions can be arranged in memory |
| |
in any way; on segmented systems addresses may have unusual |
| |
representations, such that address arithmetic only works within a |
| |
region. Gforth provides a few more guarantees (linear address space, |
| |
dictionary grows upwards), but in general I have found it easy to stay |
| |
within contiguous regions (exception: computing and comparing to the |
| |
address just beyond the end of an array). |
| |
|
| |
@cindex alignment of addresses for types |
| |
ANS Forth also defines words for aligning addresses for specific |
| |
types. Many computers require that accesses to specific data types |
| |
must only occur at specific addresses; e.g., that cells may only be |
| |
accessed at addresses divisible by 4. Even if a machine allows unaligned |
| |
accesses, it can usually perform aligned accesses faster. |
| |
|
| |
For the performance-conscious: alignment operations are usually only |
| |
necessary during the definition of a data structure, not during the |
| |
(more frequent) accesses to it. |
| |
|
| |
ANS Forth defines no words for character-aligning addresses. This is not |
| |
an oversight, but reflects the fact that addresses that are not |
| |
char-aligned have no use in the standard and therefore will not be |
| |
created. |
| |
|
| |
@cindex @code{CREATE} and alignment |
| |
ANS Forth guarantees that addresses returned by @code{CREATE}d words |
| |
are cell-aligned; in addition, Gforth guarantees that these addresses |
| |
are aligned for all purposes. |
| |
|
| |
Note that the ANS Forth word @code{char} has nothing to do with address |
| |
arithmetic. |
| |
|
| |
|
| |
doc-chars |
| |
doc-char+ |
| |
doc-cells |
| |
doc-cell+ |
| |
doc-cell |
| |
doc-aligned |
| |
doc-floats |
| |
doc-float+ |
| |
doc-float |
| |
doc-faligned |
| |
doc-sfloats |
| |
doc-sfloat+ |
| |
doc-sfaligned |
| |
doc-dfloats |
| |
doc-dfloat+ |
| |
doc-dfaligned |
| |
doc-maxaligned |
| |
doc-cfaligned |
| |
doc-address-unit-bits |
| |
|
| |
|
| |
@node Memory Blocks, , Address arithmetic, Memory |
| |
@subsection Memory Blocks |
| |
@cindex memory block words |
| |
@cindex character strings - moving and copying |
| |
|
| |
Memory blocks often represent character strings; for ways of storing
| |
character strings in memory see @ref{String Formats}. For other |
| |
string-processing words see @ref{Displaying characters and strings}. |
| |
|
| |
A few of these words work on address unit blocks. In that case, you |
| |
usually have to insert @code{CHARS} before the word when working on |
| |
character strings. Most words work on character blocks, and expect a |
| |
char-aligned address. |
| |
|
| |
When copying characters between overlapping memory regions, use |
| |
@code{chars move} or choose carefully between @code{cmove} and |
| |
@code{cmove>}. |
| |
|
| |
doc-move |
| |
doc-erase |
| |
doc-cmove |
| |
doc-cmove> |
| |
doc-fill |
| |
doc-blank |
| |
doc-compare |
| |
doc-search |
| |
doc--trailing |
| |
doc-/string |
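
A few of these words in use (a sketch; @code{buf} is a made-up name,
and @code{s"} is used interpretively here, which Gforth supports):

@example
create buf 16 chars allot
buf 16 blank                      \ fill all 16 chars with spaces
s" hello" buf swap chars move     \ copy 5 chars into buf
buf 5 type                        \ prints hello
buf 16 -trailing type             \ prints hello (trailing spaces dropped)
s" hello world" 6 /string type    \ prints world
@end example

@noindent
Note the @code{chars} before @code{move}, which counts in address
units, whereas @code{type} and @code{blank} count in characters.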
| |
|
| |
|
| |
@comment TODO examples |
| |
|
| |
|
| |
@node Control Structures, Defining Words, Memory, Words |
| |
@section Control Structures |
| |
@cindex control structures |
| |
|
| |
Control structures in Forth cannot be used interpretively, only in a |
| |
colon definition@footnote{To be precise, they have no interpretation |
| |
semantics (@pxref{Interpretation and Compilation Semantics}).}. We do |
| |
not like this limitation, but have not seen a satisfying way around it |
| |
yet, although many schemes have been proposed. |
| |
|
| |
@menu |
| |
* Selection:: IF ... ELSE ... ENDIF |
| |
* Simple Loops:: BEGIN ... |
| |
* Counted Loops:: DO |
| |
* Arbitrary control structures:: |
| |
* Calls and returns:: |
| |
* Exception Handling:: |
| |
@end menu |
| |
|
| |
@node Selection, Simple Loops, Control Structures, Control Structures |
| |
@subsection Selection |
| |
@cindex selection control structures |
| |
@cindex control structures for selection |
| |
|
| |
@cindex @code{IF} control structure |
| |
@example |
| |
@i{flag} |
| |
IF |
| |
@i{code} |
| |
ENDIF |
| |
@end example |
| |
@noindent |
| |
|
| |
If @i{flag} is non-zero (as far as @code{IF} etc. are concerned, a cell |
| |
with any bit set represents truth) @i{code} is executed. |
| |
|
| |
@example |
| |
@i{flag} |
| |
IF |
| |
@i{code1} |
| |
ELSE |
| |
@i{code2} |
| |
ENDIF |
| |
@end example |
| |
|
| |
If @i{flag} is true, @i{code1} is executed, otherwise @i{code2} is
| |
executed. |
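
A concrete instance might look like this (a sketch; the word
@code{.sign} is made up for this example):

@example
: .sign ( n -- )
    0< IF ." negative" ELSE ." non-negative" ENDIF ;
-5 .sign     \ prints negative
 5 .sign     \ prints non-negative
@end example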
| |
|
| |
You can use @code{THEN} instead of @code{ENDIF}. Indeed, @code{THEN} is |
| |
standard, and @code{ENDIF} is not, although it is quite popular. We |
| |
recommend using @code{ENDIF}, because it is less confusing for people |
| |
who also know other languages (and is not prone to reinforcing negative |
| |
prejudices against Forth in these people). Adding @code{ENDIF} to a |
| |
system that only supplies @code{THEN} is simple: |
| |
@example |
| |
: ENDIF POSTPONE THEN ; immediate |
| |
@end example |
| |
|
| |
[According to @cite{Webster's New Encyclopedic Dictionary}, @dfn{then |
| |
(adv.)} has the following meanings: |
| |
@quotation |
| |
... 2b: following next after in order ... 3d: as a necessary consequence |
| |
(if you were there, then you saw them). |
| |
@end quotation |
| |
Forth's @code{THEN} has the meaning 2b, whereas @code{THEN} in Pascal |
| |
and many other programming languages has the meaning 3d.] |
| |
|
| |
Gforth also provides the words @code{?DUP-IF} and @code{?DUP-0=-IF}, so |
| |
you can avoid using @code{?dup}. Using these alternatives is also more |
| |
efficient than using @code{?dup}. Definitions in ANS Forth |
| |
for @code{ENDIF}, @code{?DUP-IF} and @code{?DUP-0=-IF} are provided in |
| |
@file{compat/control.fs}. |
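
For example (a sketch; @code{.nonzero} is a made-up name):

@example
: .nonzero ( n -- )
    ?DUP-IF . ENDIF ;    \ same effect as:  ?dup IF . ENDIF
5 .nonzero     \ prints 5
0 .nonzero     \ prints nothing
@end example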
| |
|
| |
@cindex @code{CASE} control structure |
| |
@example |
| |
@i{n} |
| |
CASE |
| |
@i{n1} OF @i{code1} ENDOF |
| |
@i{n2} OF @i{code2} ENDOF |
| |
@dots{} |
| |
( n ) @i{default-code} ( n ) |
| |
ENDCASE |
| |
@end example |
| |
|
| |
Executes the first @i{codei}, where the @i{ni} is equal to @i{n}. If no |
| |
@i{ni} matches, the optional @i{default-code} is executed. The optional |
| |
default case can be added by simply writing the code after the last |
| |
@code{ENDOF}. It may use @i{n}, which is on top of the stack, but must |
| |
not consume it. |
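
A small sketch with a default case (the word @code{.small} is made up
for this example):

@example
: .small ( n -- )
    CASE
        0 OF ." zero" ENDOF
        1 OF ." one"  ENDOF
        ." other: " dup .    \ default code: uses n without consuming it
    ENDCASE ;
1 .small     \ prints one
7 .small     \ prints other: 7
@end example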
| |
|
| |
@progstyle |
| |
To keep the code understandable, you should ensure that on all paths |
| |
through a selection construct the stack is changed in the same way |
| |
(with respect to the number and types of stack items consumed and pushed).
| |
|
| |
@node Simple Loops, Counted Loops, Selection, Control Structures |
| |
@subsection Simple Loops |
| |
@cindex simple loops |
| |
@cindex loops without count |
| |
|
| |
@cindex @code{WHILE} loop |
| |
@example |
| |
BEGIN |
| |
@i{code1} |
| |
@i{flag} |
| |
WHILE |
| |
@i{code2} |
| |
REPEAT |
| |
@end example |
| |
|
| |
@i{code1} is executed and @i{flag} is computed. If it is true, |
| |
@i{code2} is executed and the loop is restarted; if @i{flag} is
| |
false, execution continues after the @code{REPEAT}. |
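
For example (a sketch; @code{countdown} is a made-up name):

@example
: countdown ( n -- )
    BEGIN
        dup 0>          \ code1, leaving the flag
    WHILE
        dup . 1-        \ code2
    REPEAT
    drop ;
3 countdown     \ prints 3 2 1
0 countdown     \ prints nothing
@end example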
| |
|
| |
@cindex @code{UNTIL} loop |
| |
@example |
| |
BEGIN |
| |
@i{code} |
| |
@i{flag} |
| |
UNTIL |
| |
@end example |
| |
|
| |
@i{code} is executed. The loop is restarted if @i{flag} is false.
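
For example (a sketch; @code{countdown1} is a made-up name). Unlike a
@code{WHILE} loop, the body is always executed at least once:

@example
: countdown1 ( n -- )
    BEGIN
        dup . 1-        \ code
        dup 0=          \ flag
    UNTIL
    drop ;
3 countdown1    \ prints 3 2 1
@end example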
| |
|
| |
@progstyle |
| |
To keep the code understandable, a complete iteration of the loop should |
| |
not change the number and types of the items on the stacks. |
| |
|
| |
@cindex endless loop |
| |
@cindex loops, endless |
| |
@example |
| |
BEGIN |
| |
@i{code} |
| |
AGAIN |
| |
@end example |
| |
|
| |
This is an endless loop. |
| |
|
| |
@node Counted Loops, Arbitrary control structures, Simple Loops, Control Structures |
| |
@subsection Counted Loops |
| |
@cindex counted loops |
| |
@cindex loops, counted |
| |
@cindex @code{DO} loops |
| |
|
| |
The basic counted loop is: |
| |
@example |
| |
@i{limit} @i{start} |
| |
?DO |
| |
@i{body} |
| |
LOOP |
| |
@end example |
| |
|
| |
This performs one iteration for every integer, starting from @i{start} |
| |
and up to, but excluding @i{limit}. The counter, or @i{index}, can be |
| |
accessed with @code{i}. For example, the loop: |
| |
@example |
| |
10 0 ?DO |
| |
i . |
| |
LOOP |
| |
@end example |
| |
@noindent |
| |
prints @code{0 1 2 3 4 5 6 7 8 9} |
| |
|
| |
The index of the innermost loop can be accessed with @code{i}, the index |
| |
of the next loop with @code{j}, and the index of the third loop with |
| |
@code{k}. |
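
For example, in the following sketch (the word @code{grid} is made up
for this example) the inner loop body prints the outer index with
@code{j} and the inner index with @code{i}:

@example
: grid ( -- )
    2 0 ?DO
        3 0 ?DO
            j . i .
        LOOP
        cr
    LOOP ;
grid
@end example

@noindent
This prints @code{0 0 0 1 0 2} on the first line and @code{1 0 1 1 1 2}
on the second.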
| |
|
| |
|
| |
doc-i |
| |
doc-j |
| |
doc-k |
| |
|
| |
|
| |
The loop control data are kept on the return stack, so there are some |
| |
restrictions on mixing return stack accesses and counted loop words. In |
| |
particular, if you put values on the return stack outside the loop, you
cannot read them inside the loop@footnote{Well, not in a way that is
portable.}. If you put values on the return stack within a loop, you
| |
have to remove them before the end of the loop and before accessing the |
| |
index of the loop. |
| |
|
| |
There are several variations on the counted loop: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
@code{LEAVE} leaves the innermost counted loop immediately; execution |
| |
continues after the associated @code{LOOP} or @code{NEXT}. For example: |
| |
|
| |
@example |
| |
10 0 ?DO i DUP . 3 = IF LEAVE THEN LOOP |
| |
@end example |
| |
prints @code{0 1 2 3} |
| |
|
| |
|
| |
@item |
| |
@code{UNLOOP} prepares for an abnormal loop exit, e.g., via |
| |
@code{EXIT}. @code{UNLOOP} removes the loop control parameters from the |
| |
return stack so @code{EXIT} can get to its return address. For example: |
| |
|
| |
@example |
| |
: demo 10 0 ?DO i DUP . 3 = IF UNLOOP EXIT THEN LOOP ." Done" ; |
| |
@end example |
| |
prints @code{0 1 2 3} |
| |
|
| |
|
| |
@item |
| |
If @i{start} is greater than @i{limit}, a @code{?DO} loop is entered |
| |
(and @code{LOOP} iterates until they become equal by wrap-around |
| |
arithmetic). This behaviour is usually not what you want. Therefore, |
| |
Gforth offers @code{+DO} and @code{U+DO} (as replacements for |
| |
@code{?DO}), which do not enter the loop if @i{start} is greater than |
| |
@i{limit}; @code{+DO} is for signed loop parameters, @code{U+DO} for |
| |
unsigned loop parameters. |
| |
|
| |
@item |
| |
@code{?DO} can be replaced by @code{DO}. @code{DO} always enters |
| |
the loop, independent of the loop parameters. Do not use @code{DO}, even |
| |
if you know that the loop is entered in any case. Such knowledge tends |
| |
to become invalid during maintenance of a program, and then the |
| |
@code{DO} will make trouble. |
| |
|
| |
@item |
| |
@code{LOOP} can be replaced with @code{@i{n} +LOOP}; this updates the |
| |
index by @i{n} instead of by 1. The loop is terminated when the border |
| |
between @i{limit-1} and @i{limit} is crossed. E.g.: |
| |
|
| |
@example |
| |
4 0 +DO i . 2 +LOOP |
| |
@end example |
| |
@noindent |
| |
prints @code{0 2} |
| |
|
| |
@example |
| |
4 1 +DO i . 2 +LOOP |
| |
@end example |
| |
@noindent |
| |
prints @code{1 3} |
| |
|
| |
@item |
| |
@cindex negative increment for counted loops |
| |
@cindex counted loops with negative increment |
| |
The behaviour of @code{@i{n} +LOOP} is peculiar when @i{n} is negative: |
| |
|
| |
@example |
| |
-1 0 ?DO i . -1 +LOOP |
| |
@end example |
| |
@noindent |
| |
prints @code{0 -1} |
| |
|
| |
@example |
| |
0 0 ?DO i . -1 +LOOP |
| |
@end example |
| |
prints nothing. |
| |
|
| |
Therefore we recommend avoiding @code{@i{n} +LOOP} with negative |
| |
@i{n}. One alternative is @code{@i{u} -LOOP}, which reduces the |
| |
index by @i{u} each iteration. The loop is terminated when the border |
| |
between @i{limit+1} and @i{limit} is crossed. Gforth also provides |
| |
@code{-DO} and @code{U-DO} for down-counting loops. E.g.: |
| |
|
| |
@example |
| |
-2 0 -DO i . 1 -LOOP |
| |
@end example |
| |
@noindent |
| |
prints @code{0 -1} |
| |
|
| |
@example |
| |
-1 0 -DO i . 1 -LOOP |
| |
@end example |
| |
@noindent |
| |
prints @code{0} |
| |
|
| |
@example |
| |
0 0 -DO i . 1 -LOOP |
| |
@end example |
| |
@noindent |
| |
prints nothing. |
| |
|
| |
@end itemize |
| |
|
| |
Unfortunately, @code{+DO}, @code{U+DO}, @code{-DO}, @code{U-DO} and |
| |
@code{-LOOP} are not defined in ANS Forth. However, an implementation |
| |
for these words that uses only standard words is provided in |
| |
@file{compat/loops.fs}. |
| |
|
| |
|
| |
@cindex @code{FOR} loops |
| |
Another counted loop is: |
| |
@example |
| |
@i{n} |
| |
FOR |
| |
@i{body} |
| |
NEXT |
| |
@end example |
| |
This is the preferred loop of native code compiler writers who are too |
| |
lazy to optimize @code{?DO} loops properly. This loop structure is not |
| |
defined in ANS Forth. In Gforth, this loop iterates @i{n+1} times; |
| |
@code{i} produces values starting with @i{n} and ending with 0. Other |
| |
Forth systems may behave differently, even if they support @code{FOR} |
| |
loops. To avoid problems, don't use @code{FOR} loops. |
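
To illustrate the Gforth behaviour described above (a sketch; the word
@code{t} is made up for this example, and other systems may print
something different):

@example
: t ( -- )  3 FOR i . NEXT ;
t     \ in Gforth: prints 3 2 1 0
@end example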
| |
|
| |
@node Arbitrary control structures, Calls and returns, Counted Loops, Control Structures |
| |
@subsection Arbitrary control structures |
| |
@cindex control structures, user-defined |
| |
|
| |
@cindex control-flow stack |
| |
ANS Forth permits and supports using control structures in a non-nested |
| |
way. Information about incomplete control structures is stored on the |
| |
control-flow stack. This stack may be implemented on the Forth data |
| |
stack, and this is what we have done in Gforth. |
| |
|
| |
@cindex @code{orig}, control-flow stack item |
| |
@cindex @code{dest}, control-flow stack item |
| |
An @i{orig} entry represents an unresolved forward branch, a @i{dest} |
| |
entry represents a backward branch target. A few words are the basis for |
| |
building any control structure possible (except control structures that |
| |
need storage, like calls, coroutines, and backtracking). |
| |
|
| |
|
| |
doc-if |
| |
doc-ahead |
| |
doc-then |
| |
doc-begin |
| |
doc-until |
| |
doc-again |
| |
doc-cs-pick |
| |
doc-cs-roll |
| |
|
| |
|
| |
The Standard words @code{CS-PICK} and @code{CS-ROLL} allow you to |
| |
manipulate the control-flow stack in a portable way. Without them, you |
| |
would need to know how many stack items are occupied by a control-flow |
| |
entry (many systems use one cell; in Gforth they currently take three,
| |
but this may change in the future). |
| |
|
| |
Some standard control structure words are built from these words: |
| |
|
| |
|
| |
doc-else |
| |
doc-while |
| |
doc-repeat |
| |
|
| |
|
| |
@noindent |
| |
Gforth adds some more control-structure words: |
| |
|
| |
|
| |
doc-endif |
| |
doc-?dup-if |
| |
doc-?dup-0=-if |
| |
|
| |
|
| |
@noindent |
| |
Counted loop words constitute a separate group of words: |
| |
|
| |
|
| |
doc-?do |
| |
doc-+do |
| |
doc-u+do |
| |
doc--do |
| |
doc-u-do |
| |
doc-do |
| |
doc-for |
| |
doc-loop |
| |
doc-+loop |
| |
doc--loop |
| |
doc-next |
| |
doc-leave |
| |
doc-?leave |
| |
doc-unloop |
| |
doc-done |
| |
|
| |
|
| |
The standard does not allow using @code{CS-PICK} and @code{CS-ROLL} on |
| |
@i{do-sys}. Gforth allows it, but it's your job to ensure that for |
| |
every @code{?DO} etc. there is exactly one @code{UNLOOP} on any path |
| |
through the definition (@code{LOOP} etc. compile an @code{UNLOOP} on the |
| |
fall-through path). Also, you have to ensure that all @code{LEAVE}s are |
| |
resolved (by using one of the loop-ending words or @code{DONE}). |
| |
|
| |
@noindent |
| |
Another group of control structure words is: |
| |
|
| |
|
| |
doc-case |
| |
doc-endcase |
| |
doc-of |
| |
doc-endof |
| |
|
| |
|
| |
@i{case-sys} and @i{of-sys} cannot be processed using @code{CS-PICK} and |
| |
@code{CS-ROLL}. |
| |
|
| |
@subsubsection Programming Style |
| |
@cindex control structures programming style |
| |
@cindex programming style, arbitrary control structures |
| |
|
| |
In order to ensure readability we recommend that you do not create |
| |
arbitrary control structures directly, but define new control structure |
| |
words for the control structure you want and use these words in your |
| |
program. For example, instead of writing: |
| |
|
| |
@example |
| |
BEGIN |
| |
... |
| |
IF [ 1 CS-ROLL ] |
| |
... |
| |
AGAIN THEN |
| |
@end example |
| |
|
| |
@noindent |
| |
we recommend defining control structure words, e.g., |
| |
|
| |
@example |
| |
: WHILE ( dest -- orig dest ) |
| |
POSTPONE IF |
| |
1 CS-ROLL ; immediate |
| |
|
| |
: REPEAT ( orig dest -- ) |
| |
POSTPONE AGAIN |
| |
POSTPONE THEN ; immediate |
| |
@end example |
| |
|
| |
@noindent |
| |
and then using these to create the control structure: |
| |
|
| |
@example |
| |
BEGIN |
| |
... |
| |
WHILE |
| |
... |
| |
REPEAT |
| |
@end example |
| |
|
| |
That's much easier to read, isn't it? Of course, @code{REPEAT} and |
| |
@code{WHILE} are predefined, so in this example it would not be |
| |
necessary to define them. |
| |
|
| |
@node Calls and returns, Exception Handling, Arbitrary control structures, Control Structures |
| |
@subsection Calls and returns |
| |
@cindex calling a definition |
| |
@cindex returning from a definition |
| |
|
| |
@cindex recursive definitions |
| |
A definition can be called simply by writing the name of the definition |
| |
to be called. Normally a definition is invisible during its own |
| |
definition. If you want to write a directly recursive definition, you |
| |
can use @code{recursive} to make the current definition visible, or |
| |
@code{recurse} to call the current definition directly. |
| |
|
| |
|
| |
doc-recursive |
| |
doc-recurse |
| |
|
| |
|
| |
@comment TODO add example of the two recursion methods |
| |
@quotation |
| |
@progstyle |
| |
I prefer using @code{recursive} to @code{recurse}, because calling the |
| |
definition by name is more descriptive (if the name is well-chosen) than |
| |
the somewhat cryptic @code{recurse}. E.g., in a quicksort |
| |
implementation, it is much better to read (and think) ``now sort the |
| |
partitions'' than to read ``now do a recursive call''. |
| |
@end quotation |
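
The two approaches can be compared side by side. The following is a
minimal sketch; the names @code{fact1} and @code{fact2} are hypothetical
and chosen only for illustration:

@example
\ recursive makes fact1 visible inside its own definition
: fact1 ( n -- n! ) recursive
  dup 1 > IF dup 1- fact1 * ELSE drop 1 THEN ;

\ recurse calls the current definition without naming it
: fact2 ( n -- n! )
  dup 1 > IF dup 1- recurse * ELSE drop 1 THEN ;

5 fact1 .  \ prints 120
5 fact2 .  \ prints 120
@end example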
| |
|
| |
For mutual recursion, use @code{Defer}red words, like this: |
| |
|
| |
@example |
| |
Defer foo |
| |
|
| |
: bar ( ... -- ... ) |
| |
... foo ... ; |
| |
|
| |
:noname ( ... -- ... ) |
| |
... bar ... ; |
| |
IS foo |
| |
@end example |
| |
|
| |
Deferred words are discussed in more detail in @ref{Deferred words}. |
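
As a concrete (if contrived) illustration of the pattern above, here is
a sketch of two mutually recursive words; the names @code{even?} and
@code{odd?} are hypothetical:

@example
Defer odd?

: even? ( u -- flag )
  dup 0= IF drop true ELSE 1- odd? THEN ;

:noname ( u -- flag )
  dup 0= IF drop false ELSE 1- even? THEN ;
IS odd?

4 even? .  \ prints -1 (true)
7 even? .  \ prints 0 (false)
@end example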
| |
|
| |
The current definition returns control to the calling definition when |
| |
the end of the definition is reached or @code{EXIT} is encountered. |
| |
|
| |
doc-exit |
| |
doc-;s |
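
An early return with @code{EXIT} might look like the following sketch
(the word @code{sign-name} is a hypothetical example). Note that inside
a counted loop you have to @code{UNLOOP} before you @code{EXIT}.

@example
: sign-name ( n -- )
  dup 0< IF drop ." negative" EXIT THEN  \ leave the definition early
  0= IF ." zero" EXIT THEN
  ." positive" ;                         \ normal return at the end

-3 sign-name  \ prints negative
0 sign-name   \ prints zero
@end example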
| |
|
| |
|
| |
@node Exception Handling, , Calls and returns, Control Structures |
| |
@subsection Exception Handling |
| |
@cindex exceptions |
| |
|
| |
@c quit is a very bad idea for error handling, |
| |
@c because it does not translate into a THROW |
| |
@c it also does not belong into this chapter |
| |
|
| |
If a word detects an error condition that it cannot handle, it can |
| |
@code{throw} an exception. In the simplest case, this will terminate |
| |
your program, and report an appropriate error. |
| |
|
| |
doc-throw |
| |
|
| |
@code{Throw} consumes a cell-sized error number on the stack. There are |
| |
some predefined error numbers in ANS Forth (see @file{errors.fs}). In |
| |
Gforth (and most other systems) you can use the iors produced by various |
| |
words as error numbers (e.g., a typical use of @code{allocate} is |
| |
@code{allocate throw}). Gforth also provides the word @code{exception} |
| |
to define your own error numbers (with decent error reporting); an ANS |
| |
Forth version of this word (but without the error messages) is available |
| |
in @file{compat/except.fs}. And finally, you can use your own error |
| |
numbers (anything outside the range -4095..0), but won't get nice error |
| |
messages, only numbers. For example, try: |
| |
|
| |
@example |
| |
-10 throw \ ANS defined |
| |
-267 throw \ system defined |
| |
s" my error" exception throw \ user defined |
| |
7 throw \ arbitrary number |
| |
@end example |
| |
|
| |
doc---exception-exception |
| |
|
| |
A common idiom to @code{THROW} a specific error if a flag is true is |
| |
this: |
| |
|
| |
@example |
| |
@code{( flag ) 0<> @i{errno} and throw} |
| |
@end example |
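
For instance, a division word that checks its divisor could be sketched
as follows. The word @code{checked-/} is a hypothetical name; -10 is
the ANS Forth throw code for division by zero.

@example
: checked-/ ( n1 n2 -- n3 )
  dup 0= -10 and throw  \ flag is true only for a zero divisor
  / ;

8 2 checked-/ .  \ prints 4
8 0 checked-/    \ throws -10
@end example

Here @code{0=} already produces a well-formed flag, so the @code{0<>}
from the idiom is not needed; @code{0 throw} does nothing.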
| |
|
| |
Your program can provide exception handlers to catch exceptions. An |
| |
exception handler can be used to correct the problem, or to clean up |
| |
some data structures and just throw the exception to the next exception |
| |
handler. Note that @code{throw} jumps to the dynamically innermost |
| |
exception handler. The system's exception handler is outermost, and just |
| |
prints an error and restarts command-line interpretation (or, in batch |
| |
mode (i.e., while processing the shell command line), leaves Gforth). |
| |
|
| |
The ANS Forth way to catch exceptions is @code{catch}: |
| |
|
| |
doc-catch |
| |
|
| |
The most common use of exception handlers is to clean up the state when |
| |
an error happens. E.g., |
| |
|
| |
@example |
| |
base @@ >r hex \ actually the hex should be inside foo, or we h |
| |
['] foo catch ( nerror|0 ) |
| |
r> base ! |
| |
( nerror|0 ) throw \ pass it on |
| |
@end example |
| |
|
| |
A use of @code{catch} for handling the error @code{myerror} might look |
| |
like this: |
| |
|
| |
@example |
| |
['] foo catch |
| |
CASE |
| |
myerror OF ... ( do something about it ) ENDOF |
| |
dup throw \ default: pass other errors on, do nothing on non-errors |
| |
ENDCASE |
| |
@end example |
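
A complete version of this pattern might look like the sketch below.
The names @code{myerror}, @code{foo} and @code{safe-foo}, and the limit
of 100, are assumptions made for the sake of the example.

@example
s" value out of range" exception Constant myerror

: foo ( n -- n )
  dup 100 > myerror and throw ;  \ throw myerror if n > 100

: safe-foo ( n1 -- n2 )
  ['] foo catch ( x nerror|0 )
  CASE
    myerror OF drop 100 ENDOF  \ replace the stale cell by the limit
    dup throw                  \ pass other errors on
  ENDCASE ;

42 safe-foo .   \ prints 42
150 safe-foo .  \ prints 100
@end example

After a caught @code{throw} the stack depth is the same as before
@code{catch} took its xt, but the contents of those cells are not
guaranteed; that is why the handler drops the cell instead of reusing
it.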
| |
|
| |
Having to wrap the code into a separate word is often cumbersome, |
| |
therefore Gforth provides an alternative syntax: |
| |
|
| |
@example |
| |
TRY |
| |
@i{code1} |
| |
RECOVER \ optional |
| |
@i{code2} \ optional |
| |
ENDTRY |
| |
@end example |
| |
|
| |
This performs @i{code1}. If @i{code1} completes normally, execution |
| |
continues after the @code{endtry}. If @i{code1} throws, the stacks are |
| |
reset to the state during @code{try}, the throw value is pushed on the |
| |
data stack, and execution continues at @i{code2}, and finally falls |
| |
through the @code{endtry} into the following code. If there is no |
| |
@code{recover} clause, this works like an empty recover clause. |
| |
|
| |
doc-try |
| |
doc-recover |
| |
doc-endtry |
| |
|
| |
The cleanup example from above in this syntax: |
| |
|
| |
@example |
| |
base @@ >r TRY |
| |
hex foo \ now the hex is placed correctly |
| |
0 \ value for throw |
| |
ENDTRY |
| |
r> base ! throw |
| |
@end example |
| |
|
| |
And here's the error handling example: |
| |
|
| |
@example |
| |
TRY |
| |
foo |
| |
RECOVER |
| |
CASE |
| |
myerror OF ... ( do something about it ) ENDOF |
| |
throw \ pass other errors on |
| |
ENDCASE |
| |
ENDTRY |
| |
@end example |
| |
|
| |
@progstyle |
| |
As usual, you should ensure that the stack depth is statically known at |
| |
the end: either after the @code{throw} for passing on errors, or after |
| |
the @code{ENDTRY} (or, if you use @code{catch}, after the end of the |
| |
selection construct for handling the error). |
| |
|
| |
There are two alternatives to @code{throw}: @code{Abort"} is conditional |
| |
and you can provide an error message. @code{Abort} just produces an |
| |
``Aborted'' error. |
| |
|
| |
The problem with these words is that exception handlers cannot |
| |
differentiate between different @code{abort"}s; they just look like |
| |
@code{-2 throw} to them (the error message cannot be accessed by |
| |
standard programs). Similarly, @code{abort} looks like @code{-1 throw} to |
| |
exception handlers. |
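
The following sketch shows what a handler sees; the word
@code{require-positive} and its message are hypothetical:

@example
: require-positive ( n -- n )
  dup 0> 0= abort" value must be positive" ;

5 require-positive .                \ prints 5
-5 ' require-positive catch . drop  \ prints -2; the message is lost
@end example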
| |
|
| |
doc-abort" |
| |
doc-abort |
| |
|
| |
|
| |
|
| |
@c ------------------------------------------------------------- |
| |
@node Defining Words, Interpretation and Compilation Semantics, Control Structures, Words |
| |
@section Defining Words |
| |
@cindex defining words |
| |
|
| |
Defining words are used to extend Forth by creating new entries in the dictionary. |
| |
|
| |
@menu |
| |
* CREATE:: |
| |
* Variables:: Variables and user variables |
| |
* Constants:: |
| |
* Values:: Initialised variables |
| |
* Colon Definitions:: |
| |
* Anonymous Definitions:: Definitions without names |
| |
* Supplying names:: Passing definition names as strings |
| |
* User-defined Defining Words:: |
| |
* Deferred words:: Allow forward references |
| |
* Aliases:: |
| |
@end menu |
| |
|
| |
@node CREATE, Variables, Defining Words, Defining Words |
| |
@subsection @code{CREATE} |
| |
@cindex simple defining words |
| |
@cindex defining words, simple |
| |
|
| |
Defining words are used to create new entries in the dictionary. The |
| |
simplest defining word is @code{CREATE}. @code{CREATE} is used like |
| |
this: |
| |
|
| |
@example |
| |
CREATE new-word1 |
| |
@end example |
| |
|
| |
@code{CREATE} is a parsing word, i.e., it takes an argument from the |
| |
input stream (@code{new-word1} in our example). It generates a |
| |
dictionary entry for @code{new-word1}. When @code{new-word1} is |
| |
executed, all that it does is leave an address on the stack. The address |
| |
represents the value of the data space pointer (@code{HERE}) at the time |
| |
that @code{new-word1} was defined. Therefore, @code{CREATE} is a way of |
| |
associating a name with the address of a region of memory. |
| |
|
| |
doc-create |
| |
|
| |
Note that ANS Forth guarantees only for @code{create} that its body |
| |
is in dictionary data space (i.e., where @code{here}, @code{allot} |
| |
etc. work, @pxref{Dictionary allocation}). Also, in ANS Forth only |
| |
@code{create}d words can be modified with @code{does>} |
| |
(@pxref{User-defined Defining Words}). And in ANS Forth @code{>body} |
| |
can only be applied to @code{create}d words. |
| |
|
| |
By extending this example to reserve some memory in data space, we end |
| |
up with something like a @i{variable}. Here are two different ways to do |
| |
it: |
| |
|
| |
@example |
| |
CREATE new-word2 1 cells allot \ reserve 1 cell - initial value undefined |
| |
CREATE new-word3 4 , \ reserve 1 cell and initialise it (to 4) |
| |
@end example |
| |
|
| |
The variable can be examined and modified using @code{@@} (``fetch'') and |
| |
@code{!} (``store'') like this: |
| |
|
| |
@example |
| |
new-word2 @@ . \ get address, fetch from it and display |
| |
1234 new-word2 ! \ new value, get address, store to it |
| |
@end example |
| |
|
| |
@cindex arrays |
| |
A similar mechanism can be used to create arrays. For example, an |
| |
80-character text input buffer: |
| |
|
| |
@example |
| |
CREATE text-buf 80 chars allot |
| |
|
| |
text-buf 0 chars c@@ \ the 1st character (offset 0) |
| |
text-buf 3 chars c@@ \ the 4th character (offset 3) |
| |
@end example |
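
The same approach works for arrays of cells. A minimal sketch, with the
hypothetical names @code{squares} and @code{fill-squares}:

@example
CREATE squares 10 cells allot  \ room for 10 cells

: fill-squares ( -- )
  10 0 DO  i i *  squares i cells +  !  LOOP ;

fill-squares
squares 3 cells + @@ .  \ prints 9
@end example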
| |
|
| |
You can build arbitrarily complex data structures by allocating |
| |
appropriate areas of memory. For further discussions of this, and to |
| |
learn about some Gforth tools that make it easier, |
| |
see @ref{Structures}. |
| |
|
| |
|
| |
@node Variables, Constants, CREATE, Defining Words |
| |
@subsection Variables |
| |
@cindex variables |
| |
|
| |
The previous section showed how a sequence of commands could be used to |
| |
generate a variable. As a final refinement, the whole code sequence can |
| |
be wrapped up in a defining word (pre-empting the subject of the next |
| |
section), making it easier to create new variables: |
| |
|
| |
@example |
| |
: myvariableX ( "name" -- a-addr ) CREATE 1 cells allot ; |
| |
: myvariable0 ( "name" -- a-addr ) CREATE 0 , ; |
| |
|
| |
myvariableX foo \ variable foo starts off with an unknown value |
| |
myvariable0 joe \ whilst joe is initialised to 0 |
| |
|
| |
45 3 * foo ! \ set foo to 135 |
| |
1234 joe ! \ set joe to 1234 |
| |
3 joe +! \ increment joe by 3.. to 1237 |
| |
@end example |
| |
|
| |
Not surprisingly, there is no need to define @code{myvariable}, since |
| |
Forth already has a definition @code{Variable}. ANS Forth does not |
| |
guarantee that a @code{Variable} is initialised when it is created |
| |
(i.e., it may behave like @code{myvariableX}). In contrast, Gforth's |
| |
@code{Variable} initialises the variable to 0 (i.e., it behaves exactly |
| |
like @code{myvariable0}). Forth also provides @code{2Variable} and |
| |
@code{fvariable} for double and floating-point variables, respectively |
| |
-- they are initialised to 0. and 0e in Gforth. If you use a @code{Variable} to |
| |
store a boolean, you can use @code{on} and @code{off} to toggle its |
| |
state. |
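
A short sketch of these words in use (the variable names are
hypothetical):

@example
Variable verbose
verbose on   verbose @@ .  \ prints -1 (true)
verbose off  verbose @@ .  \ prints 0 (false)

2Variable big
123456789. big 2!  big 2@@ d.  \ prints 123456789

fvariable ratio
1.5e ratio f!  ratio f@@ f.    \ prints 1.5
@end example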
| |
|
| |
doc-variable |
| |
doc-2variable |
| |
doc-fvariable |
| |
|
| |
@cindex user variables |
| |
@cindex user space |
| |
The defining word @code{User} behaves in the same way as @code{Variable}. |
| |
The difference is that it reserves space in @i{user (data) space} rather |
| |
than normal data space. In a Forth system that has a multi-tasker, each |
| |
task has its own set of user variables. |
| |
|
| |
doc-user |
| |
@c doc-udp |
| |
@c doc-uallot |
| |
|
| |
@comment TODO is that stuff about user variables strictly correct? Is it |
| |
@comment just terminal tasks that have user variables? |
| |
@comment should document tasker.fs (with some examples) elsewhere |
| |
@comment in this manual, then expand on user space and user variables. |
| |
|
| |
@node Constants, Values, Variables, Defining Words |
| |
@subsection Constants |
| |
@cindex constants |
| |
|
| |
@code{Constant} allows you to declare a fixed value and refer to it by |
| |
name. For example: |
| |
|
| |
@example |
| |
12 Constant INCHES-PER-FOOT |
| |
3E+08 fconstant SPEED-O-LIGHT |
| |
@end example |
| |
|
| |
A @code{Variable} can be both read and written, so its run-time |
| |
behaviour is to supply an address through which its current value can be |
| |
manipulated. In contrast, the value of a @code{Constant} cannot be |
| |
changed once it has been declared@footnote{Well, often it can be -- but |
| |
not in a Standard, portable way. It's safer to use a @code{Value} (read |
| |
on).} so it's not necessary to supply the address -- it is more |
| |
efficient to return the value of the constant directly. That's exactly |
| |
what happens; the run-time effect of a constant is to put its value on |
| |
the top of the stack (you can find one |
| |
way of implementing @code{Constant} in @ref{User-defined Defining Words}). |
| |
|
| |
Forth also provides @code{2Constant} and @code{fconstant} for defining |
| |
double and floating-point constants, respectively. |
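
For example (the constant names here are made up for illustration):

@example
1000000. 2Constant MILLION      \ a double-cell constant
2.54e fconstant CM-PER-INCH     \ a floating-point constant

MILLION d.      \ prints 1000000
CM-PER-INCH f.  \ prints 2.54
@end example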
| |
|
| |
doc-constant |
| |
doc-2constant |
| |
doc-fconstant |
| |
|
| |
@c that's too deep, and it's not necessarily true for all ANS Forths. - anton |
| |
@c nac-> How could that not be true in an ANS Forth? You can't define a |
| |
@c constant, use it and then delete the definition of the constant.. |
| |
|
| |
@c anton->An ANS Forth system can compile a constant to a literal; On |
| |
@c decompilation you would see only the number, just as if it had been used |
| |
@c in the first place. The word will stay, of course, but it will only be |
| |
@c used by the text interpreter (no run-time duties, except when it is |
| |
@c POSTPONEd or somesuch). |
| |
|
| |
@c nac: |
| |
@c I agree that it's rather deep, but IMO it is an important difference |
| |
@c relative to other programming languages.. often it's annoying: it |
| |
@c certainly changes my programming style relative to C. |
| |
|
| |
@c anton: In what way? |
| |
|
| |
Constants in Forth behave differently from their equivalents in other |
| |
programming languages. In other languages, a constant (such as an EQU in |
| |
assembler or a #define in C) only exists at compile-time; in the |
| |
executable program the constant has been translated into an absolute |
| |
number and, unless you are using a symbolic debugger, it's impossible to |
| |
know what abstract thing that number represents. In Forth a constant has |
| |
an entry in the header space and remains there after the code that uses |
| |
it has been defined. In fact, it must remain in the dictionary since it |
| |
has run-time duties to perform. For example: |
| |
|
| |
@example |
| |
12 Constant INCHES-PER-FOOT |
| |
: FEET-TO-INCHES ( n1 -- n2 ) INCHES-PER-FOOT * ; |
| |
@end example |
| |
|
| |
@cindex in-lining of constants |
| |
When @code{FEET-TO-INCHES} is executed, it will in turn execute the xt |
| |
associated with the constant @code{INCHES-PER-FOOT}. If you use |
| |
@code{see} to decompile the definition of @code{FEET-TO-INCHES}, you can |
| |
see that it makes a call to @code{INCHES-PER-FOOT}. Some Forth compilers |
| |
attempt to optimise constants by in-lining them where they are used. You |
| |
can force Gforth to in-line a constant like this: |
| |
|
| |
@example |
| |
: FEET-TO-INCHES ( n1 -- n2 ) [ INCHES-PER-FOOT ] LITERAL * ; |
| |
@end example |
| |
|
| |
If you use @code{see} to decompile @i{this} version of |
| |
@code{FEET-TO-INCHES}, you can see that @code{INCHES-PER-FOOT} is no |
| |
longer present. To understand how this works, read |
| |
@ref{Interpret/Compile states}, and @ref{Literals}. |
| |
|
| |
In-lining constants in this way might improve execution time |
| |
fractionally, and can ensure that a constant is now only referenced at |
| |
compile-time. However, the definition of the constant still remains in |
| |
the dictionary. Some Forth compilers provide a mechanism for controlling |
| |
a second dictionary for holding transient words such that this second |
| |
dictionary can be deleted later in order to recover memory |
| |
space. However, there is no standard way of doing this. |
| |
|
| |
|
| |
@node Values, Colon Definitions, Constants, Defining Words |
| |
@subsection Values |
| |
@cindex values |
| |
|
| |
A @code{Value} behaves like a @code{Constant}, but it can be changed. |
| |
@code{TO} is a parsing word that changes a @code{Value}. In Gforth |
| |
(not in ANS Forth) you can access (and change) a @code{value} also with |
| |
@code{>body}. |
| |
|
| |
Here are some |
| |
examples: |
| |
|
| |
@example |
| |
12 Value APPLES \ Define APPLES with an initial value of 12 |
| |
34 TO APPLES \ Change the value of APPLES. TO is a parsing word |
| |
1 ' APPLES >body +! \ Increment APPLES. Non-standard usage. |
| |
APPLES \ puts 35 on the top of the stack. |
| |
@end example |
| |
|
| |
doc-value |
| |
doc-to |
| |
|
| |
|
| |
|
| |
@node Colon Definitions, Anonymous Definitions, Values, Defining Words |
| |
@subsection Colon Definitions |
| |
@cindex colon definitions |
| |
|
| |
@example |
| |
: name ( ... -- ... ) |
| |
word1 word2 word3 ; |
| |
@end example |
| |
|
| |
@noindent |
| |
Creates a word called @code{name} that, upon execution, executes |
| |
@code{word1 word2 word3}. @code{name} is a @dfn{(colon) definition}. |
| |
|
| |
The explanation above is somewhat superficial. For simple examples of |
| |
colon definitions see @ref{Your first definition}. For an in-depth |
| |
discussion of some of the issues involved, see @ref{Interpretation and |
| |
Compilation Semantics}. |
| |
|
| |
doc-: |
| |
doc-; |
| |
|
| |
|
| |
@node Anonymous Definitions, Supplying names, Colon Definitions, Defining Words |
| |
@subsection Anonymous Definitions |
| |
@cindex colon definitions |
| |
@cindex defining words without name |
| |
|
| |
Sometimes you want to define an @dfn{anonymous word}; a word without a |
| |
name. You can do this with: |
| |
|
| |
doc-:noname |
| |
|
| |
This leaves the execution token for the word on the stack after the |
| |
closing @code{;}. Here's an example in which a deferred word is |
| |
initialised with an @code{xt} from an anonymous colon definition: |
| |
|
| |
@example |
| |
Defer deferred |
| |
:noname ( ... -- ... ) |
| |
... ; |
| |
IS deferred |
| |
@end example |
| |
|
| |
@noindent |
| |
Gforth provides an alternative way of doing this, using two separate |
| |
words: |
| |
|
| |
doc-noname |
| |
@cindex execution token of last defined word |
| |
doc-lastxt |
| |
|
| |
@noindent |
| |
The previous example can be rewritten using @code{noname} and |
| |
@code{lastxt}: |
| |
|
| |
@example |
| |
Defer deferred |
| |
noname : ( ... -- ... ) |
| |
... ; |
| |
lastxt IS deferred |
| |
@end example |
| |
|
| |
@noindent |
| |
@code{noname} works with any defining word, not just @code{:}. |
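
For instance, the following sketch creates a nameless @code{Create}d
word and reaches its body through @code{lastxt}:

@example
noname Create 42 ,   \ anonymous word with one initialised cell
lastxt execute @@ .   \ execute pushes the body address; prints 42
@end example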
| |
|
| |
@code{lastxt} also works when the last word was not defined as |
| |
@code{noname}. It does not work for combined words, though. It also has |
| |
the useful property that it is valid as soon as the header for a |
| |
definition has been built. Thus: |
| |
|
| |
@example |
| |
lastxt . : foo [ lastxt . ] ; ' foo . |
| |
@end example |
| |
|
| |
@noindent |
| |
prints 3 numbers; the last two are the same. |
| |
|
| |
@node Supplying names, User-defined Defining Words, Anonymous Definitions, Defining Words |
| |
@subsection Supplying the name of a defined word |
| |
@cindex names for defined words |
| |
@cindex defining words, name given in a string |
| |
|
| |
By default, a defining word takes the name for the defined word from the |
| |
input stream. Sometimes you want to supply the name from a string. You |
| |
can do this with: |
| |
|
| |
doc-nextname |
| |
|
| |
For example: |
| |
|
| |
@example |
| |
s" foo" nextname create |
| |
@end example |
| |
|
| |
@noindent |
| |
is equivalent to: |
| |
|
| |
@example |
| |
create foo |
| |
@end example |
| |
|
| |
@noindent |
| |
@code{nextname} works with any defining word. |
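
For example, this sketch defines a constant whose name comes from a
string (the name @code{answer} is hypothetical):

@example
s" answer" nextname 42 Constant
answer .  \ prints 42
@end example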
| |
|
| |
|
| |
@node User-defined Defining Words, Deferred words, Supplying names, Defining Words |
| |
@subsection User-defined Defining Words |
| |
@cindex user-defined defining words |
| |
@cindex defining words, user-defined |
| |
|
| |
You can create a new defining word by wrapping defining-time code around |
| |
an existing defining word and putting the sequence in a colon |
| |
definition. |
| |
|
| |
@c anton: This example is very complex and leads in a quite different |
| |
@c direction from the CREATE-DOES> stuff that follows. It should probably |
| |
@c be done elsewhere, or as a subsubsection of this subsection (or as a |
| |
@c subsection of Defining Words) |
| |
|
| |
For example, suppose that you have a word @code{stats} that |
| |
gathers statistics about colon definitions given the @i{xt} of the |
| |
definition, and you want every colon definition in your application to |
| |
make a call to @code{stats}. You can define and use a new version of |
| |
@code{:} like this: |
| |
|
| |
@example |
| |
: stats ( xt -- ) DUP ." (Gathering statistics for " . ." )" |
| |
... ; \ other code |
| |
|
| |
: my: : lastxt postpone literal ['] stats compile, ; |
| |
|
| |
my: foo + - ; |
| |
@end example |
| |
|
| |
When @code{foo} is defined using @code{my:} these steps occur: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
@code{my:} is executed. |
| |
@item |
| |
The @code{:} within the definition (the one between @code{my:} and |
| |
@code{lastxt}) is executed, and does just what it always does; it parses |
| |
the input stream for a name, builds a dictionary header for the name |
| |
@code{foo} and switches @code{state} from interpret to compile. |
| |
@item |
| |
The word @code{lastxt} is executed. It puts the @i{xt} for the word that is |
| |
being defined -- @code{foo} -- onto the stack. |
| |
@item |
| |
The code that was produced by @code{postpone literal} is executed; this |
| |
causes the value on the stack to be compiled as a literal in the code |
| |
area of @code{foo}. |
| |
@item |
| |
The code @code{['] stats} compiles a literal into the definition of |
| |
@code{my:}. When @code{compile,} is executed, that literal -- the |
| |
execution token for @code{stats} -- is laid down in the code area of |
| |
@code{foo}, following the literal@footnote{Strictly speaking, the |
| |
mechanism that @code{compile,} uses to convert an @i{xt} into something |
| |
in the code area is implementation-dependent. A threaded implementation |
| |
might spit out the execution token directly whilst another |
| |
implementation might spit out a native code sequence.}. |
| |
@item |
| |
At this point, the execution of @code{my:} is complete, and control |
| |
returns to the text interpreter. The text interpreter is in compile |
| |
state, so subsequent text @code{+ -} is compiled into the definition of |
| |
@code{foo} and the @code{;} terminates the definition as always. |
| |
@end itemize |
| |
|
| |
You can use @code{see} to decompile a word that was defined using |
| |
@code{my:} and see how it is different from a normal @code{:} |
| |
definition. For example: |
| |
|
| |
@example |
| |
: bar + - ; \ like foo but using : rather than my: |
| |
see bar |
| |
: bar |
| |
+ - ; |
| |
see foo |
| |
: foo |
| |
107645672 stats + - ; |
| |
|
| |
\ use ' stats . to show that 107645672 is the xt for stats |
| |
@end example |
| |
|
| |
You can use techniques like this to make new defining words in terms of |
| |
@i{any} existing defining word. |
| |
|
| |
|
| |
@cindex defining defining words |
| |
@cindex @code{CREATE} ... @code{DOES>} |
| |
If you want the words defined with your defining words to behave |
| |
differently from words defined with standard defining words, you can |
| |
write your defining word like this: |
| |
|
| |
@example |
| |
: def-word ( "name" -- ) |
| |
CREATE @i{code1} |
| |
DOES> ( ... -- ... ) |
| |
@i{code2} ; |
| |
|
| |
def-word name |
| |
@end example |
| |
|
| |
@cindex child words |
| |
This fragment defines a @dfn{defining word} @code{def-word} and then |
| |
executes it. When @code{def-word} executes, it @code{CREATE}s a new |
| |
word, @code{name}, and executes the code @i{code1}. The code @i{code2} |
| |
is not executed at this time. The word @code{name} is sometimes called a |
| |
@dfn{child} of @code{def-word}. |
| |
|
| |
When you execute @code{name}, the address of the body of @code{name} is |
| |
put on the data stack and @i{code2} is executed (the address of the body |
| |
of @code{name} is the address @code{HERE} returns immediately after the |
| |
@code{CREATE}, i.e., the address a @code{create}d word returns by |
| |
default). |
| |
|
| |
@c anton: |
| |
@c www.dictionary.com says: |
| |
@c at·a·vism: 1.The reappearance of a characteristic in an organism after |
| |
@c several generations of absence, usually caused by the chance |
| |
@c recombination of genes. 2.An individual or a part that exhibits |
| |
@c atavism. Also called throwback. 3.The return of a trait or recurrence |
| |
@c of previous behavior after a period of absence. |
| |
@c |
| |
@c Doesn't seem to fit. |
| |
|
| |
@c @cindex atavism in child words |
| |
You can use @code{def-word} to define a set of child words that behave |
| |
similarly; they all have a common run-time behaviour determined by |
| |
@i{code2}. Typically, the @i{code1} sequence builds a data area in the |
| |
body of the child word. The structure of the data is common to all |
| |
children of @code{def-word}, but the data values are specific -- and |
| |
private -- to each child word. When a child word is executed, the |
| |
address of its private data area is passed as a parameter on TOS to be |
| |
used and manipulated@footnote{It is legitimate both to read and write to |
| |
this data area.} by @i{code2}. |
| |
|
| |
The two fragments of code that make up the defining words act (are |
| |
executed) at two completely separate times: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
At @i{define time}, the defining word executes @i{code1} to generate a |
| |
child word. |
| |
@item |
| |
At @i{child execution time}, when a child word is invoked, @i{code2} |
| |
is executed, using parameters (data) that are private and specific to |
| |
the child word. |
| |
@end itemize |
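
The two times can be seen in a small sketch of an array-defining word.
The names @code{array} and @code{scores} are hypothetical, and no bounds
checking is done:

@example
: array ( n "name" -- )
  CREATE cells allot    \ define time: reserve n cells in the child
  DOES> ( i -- a-addr )
    swap cells + ;      \ child execution time: index into the body

10 array scores         \ runs the CREATE part
42 3 scores !           \ runs the DOES> part to get the address
3 scores @@ .            \ prints 42
@end example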
| |
|
| |
Another way of understanding the behaviour of @code{def-word} and |
| |
@code{name} is to say that, if you make the following definitions: |
| |
@example |
| |
: def-word1 ( "name" -- ) |
| |
CREATE @i{code1} ; |
| |
|
| |
: action1 ( ... -- ... ) |
| |
@i{code2} ; |
| |
|
| |
def-word1 name1 |
| |
@end example |
| |
|
| |
@noindent |
| |
then using @code{name1 action1} is equivalent to using @code{name}. |
| |
|
| |
The classic example is that you can define @code{CONSTANT} in this way: |
| |
|
| |
@example |
| |
: CONSTANT ( w "name" -- ) |
| |
CREATE , |
| |
DOES> ( -- w ) |
| |
@@ ; |
| |
@end example |
| |
|
| |
@comment There is a beautiful description of how this works and what |
| |
@comment it does in the Forthwrite 100th edition.. as well as an elegant |
| |
@comment commentary on the Counting Fruits problem. |
| |
|
| |
When you create a constant with @code{5 CONSTANT five}, a set of |
| |
define-time actions take place; first a new word @code{five} is created, |
| |
then the value 5 is laid down in the body of @code{five} with |
| |
@code{,}. When @code{five} is executed, the address of the body is put on |
| |
the stack, and @code{@@} retrieves the value 5. The word @code{five} has |
| |
no code of its own; it simply contains a data field and a pointer to the |
| |
code that follows @code{DOES>} in its defining word. That makes words |
| |
created in this way very compact. |
| |
|
| |
The final example in this section is intended to remind you that space |
| |
reserved in @code{CREATE}d words is @i{data} space and therefore can be |
| |
both read and written by a Standard program@footnote{Exercise: use this |
| |
example as a starting point for your own implementation of @code{Value} |
| |
and @code{TO} -- if you get stuck, investigate the behaviour of @code{'} and |
| |
@code{[']}.}: |
| |
|
| |
@example |
| |
: foo ( "name" -- ) |
| |
CREATE -1 , |
| |
DOES> ( -- ) |
| |
@@ . ; |
| |
|
| |
foo first-word |
| |
foo second-word |
| |
|
| |
123 ' first-word >BODY ! |
| |
@end example |
| |
|
| |
If @code{first-word} had been a @code{CREATE}d word, we could simply |
| |
have executed it to get the address of its data field. However, since it |
| |
was defined to have @code{DOES>} actions, its execution semantics are to |
| |
perform those @code{DOES>} actions. To get the address of its data field |
| |
it's necessary to use @code{'} to get its xt, then @code{>BODY} to |
| |
translate the xt into the address of the data field. When you execute |
| |
@code{first-word}, it will display @code{123}. When you execute |
| |
@code{second-word} it will display @code{-1}. |
| |
|
| |
@cindex stack effect of @code{DOES>}-parts |
| |
@cindex @code{DOES>}-parts, stack effect |
| |
In the examples above the stack comment after the @code{DOES>} specifies |
| |
the stack effect of the defined words, not the stack effect of the |
| |
following code (the following code expects the address of the body on |
| |
the top of stack, which is not reflected in the stack comment). This is |
| |
the convention that I use and recommend (it clashes a bit with using |
| |
locals declarations for stack effect specification, though). |
| |
|
| |
@menu |
| |
* CREATE..DOES> applications:: |
| |
* CREATE..DOES> details:: |
| |
* Advanced does> usage example:: |
| |
@end menu |
| |
|
| |
@node CREATE..DOES> applications, CREATE..DOES> details, User-defined Defining Words, User-defined Defining Words |
| |
@subsubsection Applications of @code{CREATE..DOES>} |
| |
@cindex @code{CREATE} ... @code{DOES>}, applications |
| |
|
| |
You may wonder how to use this feature. Here are some usage patterns: |
| |
|
| |
@cindex factoring similar colon definitions |
| |
When you see a sequence of code occurring several times, and you can |
| |
identify a meaning, you will factor it out as a colon definition. When |
| |
you see similar colon definitions, you can factor them using |
| |
@code{CREATE..DOES>}. E.g., an assembler usually defines several words |
| |
that look very similar: |
| |
@example |
| |
: ori, ( reg-target reg-source n -- ) |
| |
0 asm-reg-reg-imm ; |
| |
: andi, ( reg-target reg-source n -- ) |
| |
1 asm-reg-reg-imm ; |
| |
@end example |
| |
|
| |
@noindent |
| |
This could be factored with: |
| |
@example |
| |
: reg-reg-imm ( op-code -- ) |
| |
CREATE , |
| |
DOES> ( reg-target reg-source n -- ) |
| |
@@ asm-reg-reg-imm ; |
| |
|
| |
0 reg-reg-imm ori, |
| |
1 reg-reg-imm andi, |
| |
@end example |
| |
|
| |
@cindex currying |
| |
Another view of @code{CREATE..DOES>} is to consider it as a crude way to |
| |
supply a part of the parameters for a word (known as @dfn{currying} in |
| |
the functional language community). E.g., @code{+} needs two |
| |
parameters. Creating versions of @code{+} with one parameter fixed can |
| |
be done like this: |
| |
@example |
| |
: curry+ ( n1 -- ) |
| |
CREATE , |
| |
DOES> ( n2 -- n1+n2 ) |
| |
@@ + ; |
| |
|
| |
3 curry+ 3+ |
| |
-2 curry+ 2- |
| |
@end example |
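
@noindent
Assuming the definitions above have been loaded, the curried words can
be used like this:

@example
10 3+ .  \ displays 13
10 2- .  \ displays 8
@end example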
| |
|
| |
@node CREATE..DOES> details, Advanced does> usage example, CREATE..DOES> applications, User-defined Defining Words |
| |
@subsubsection The gory details of @code{CREATE..DOES>} |
| |
@cindex @code{CREATE} ... @code{DOES>}, details |
| |
|
| |
doc-does> |
| |
|
| |
@cindex @code{DOES>} in a separate definition |
| |
This means that you need not use @code{CREATE} and @code{DOES>} in the |
| |
same definition; you can put the @code{DOES>}-part in a separate |
| |
definition. This allows us to, e.g., select among different @code{DOES>}-parts: |
| |
@example |
| |
: does1 |
| |
DOES> ( ... -- ... ) |
| |
... ; |
| |
|
| |
: does2 |
| |
DOES> ( ... -- ... ) |
| |
... ; |
| |
|
| |
: def-word ( ... -- ... ) |
| |
create ... |
| |
IF |
| |
does1 |
| |
ELSE |
| |
does2 |
| |
ENDIF ; |
| |
@end example |
| |
|
| |
In this example, the selection of whether to use @code{does1} or |
| |
@code{does2} is made at definition-time; at the time that the child word is |
| |
@code{CREATE}d. |
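
A concrete instance of this scheme might look as follows. It is a
minimal sketch; @code{does-show}, @code{does-fetch} and
@code{flex-const} are names invented for this illustration:

@example
: does-show ( -- )
  DOES> ( -- )
    @@ . ;

: does-fetch ( -- )
  DOES> ( -- n )
    @@ ;

: flex-const ( n flag "name" -- )
  \ flag selects the behaviour of the word being defined
  CREATE swap ,
  IF
    does-show
  ELSE
    does-fetch
  ENDIF ;

5 true  flex-const show5
7 false flex-const seven

show5    \ displays 5
seven .  \ displays 7
@end example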
| |
|
| |
@cindex @code{DOES>} in interpretation state |
| |
In a standard program you can apply a @code{DOES>}-part only if the last |
| |
word was defined with @code{CREATE}. In Gforth, the @code{DOES>}-part |
| |
will override the behaviour of the last word defined in any case. In a |
| |
standard program, you can use @code{DOES>} only in a colon |
| |
definition. In Gforth, you can also use it in interpretation state, in a |
| |
kind of one-shot mode; for example: |
| |
@example |
| |
CREATE name ( ... -- ... ) |
| |
@i{initialization} |
| |
DOES> |
| |
@i{code} ; |
| |
@end example |
| |
|
| |
@noindent |
| |
is equivalent to the standard: |
| |
@example |
| |
:noname |
| |
DOES> |
| |
@i{code} ; |
| |
CREATE name EXECUTE ( ... -- ... ) |
| |
@i{initialization} |
| |
@end example |
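
For instance, the following sketch defines a word that pushes the value
stored in its body. It is Gforth-specific, since it relies on the
interpretation-state @code{DOES>} described above; the name
@code{answer} is arbitrary:

@example
CREATE answer ( -- n )
  42 ,
DOES>
  @@ ;

answer .  \ displays 42
@end example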
| |
|
| |
doc->body |
| |
|
| |
@node Advanced does> usage example, , CREATE..DOES> details, User-defined Defining Words |
| |
@subsubsection Advanced does> usage example |
| |
|
| |
The MIPS disassembler (@file{arch/mips/disasm.fs}) contains many words |
| |
for disassembling instructions that follow a very repetitive scheme: |
| |
|
| |
@example |
| |
:noname @var{disasm-operands} s" @var{inst-name}" type ; |
| |
@var{entry-num} cells @var{table} + ! |
| |
@end example |
| |
|
| |
Of course, this inspires the idea to factor out the commonalities to |
| |
allow a definition like |
| |
|
| |
@example |
| |
@var{disasm-operands} @var{entry-num} @var{table} define-inst @var{inst-name} |
| |
@end example |
| |
|
| |
The parameters @var{disasm-operands} and @var{table} are usually |
| |
correlated. Moreover, before I wrote the disassembler, there already |
| |
existed code that defines instructions like this: |
| |
|
| |
@example |
| |
@var{entry-num} @var{inst-format} @var{inst-name} |
| |
@end example |
| |
|
| |
This code comes from the assembler and resides in |
| |
@file{arch/mips/insts.fs}. |
| |
|
| |
So I had to define the @var{inst-format} words that performed the scheme |
| |
above when executed. At first I chose to use run-time code-generation: |
| |
|
| |
@example |
| |
: @var{inst-format} ( entry-num "name" -- ; compiled code: addr w -- ) |
| |
:noname Postpone @var{disasm-operands} |
| |
name Postpone sliteral Postpone type Postpone ; |
| |
swap cells @var{table} + ! ; |
| |
@end example |
| |
|
| |
Note that this supplies the other two parameters of the scheme above. |
| |
|
| |
An alternative would have been to write this using |
| |
@code{create}/@code{does>}: |
| |
|
| |
@example |
| |
: @var{inst-format} ( entry-num "name" -- ) |
| |
here name string, ( entry-num c-addr ) \ parse and save "name" |
| |
noname create , ( entry-num ) |
| |
lastxt swap cells @var{table} + ! |
| |
does> ( addr w -- ) |
| |
\ disassemble instruction w at addr |
| |
@@ >r |
| |
@var{disasm-operands} |
| |
r> count type ; |
| |
@end example |
| |
|
| |
Somehow the first solution is simpler, mainly because it's simpler to |
| |
shift a string from definition-time to use-time with @code{sliteral} |
| |
than with @code{string,} and friends. |
| |
|
| |
I wrote a lot of words following this scheme and soon thought about |
| |
factoring out the commonalities among them. Note that this requires a |
| |
two-level defining word, i.e., a word that defines ordinary defining |
| |
words. |
| |
|
| |
This time a solution involving @code{postpone} and friends seemed more |
| |
difficult (try it as an exercise), so I decided to use a |
| |
@code{create}/@code{does>} word; since I was already at it, I also used |
| |
@code{create}/@code{does>} for the lower level (try using |
| |
@code{postpone} etc. as an exercise), resulting in the following |
| |
definition: |
| |
|
| |
@example |
| |
: define-format ( disasm-xt table-xt -- ) |
| |
\ define an instruction format that uses disasm-xt for |
| |
\ disassembling and enters the defined instructions into table |
| |
\ table-xt |
| |
create 2, |
| |
does> ( u "inst" -- ) |
| |
\ defines an anonymous word for disassembling instruction inst, |
| |
\ and enters it as u-th entry into table-xt |
| |
2@@ swap here name string, ( u table-xt disasm-xt c-addr ) \ remember string |
| |
noname create 2, \ define anonymous word |
| |
execute lastxt swap ! \ enter xt of defined word into table-xt |
| |
does> ( addr w -- ) |
| |
\ disassemble instruction w at addr |
| |
2@@ >r ( addr w disasm-xt R: c-addr ) |
| |
execute ( R: c-addr ) \ disassemble operands |
| |
r> count type ; \ print name |
| |
@end example |
| |
|
| |
Note that the tables here (in contrast to above) do the @code{cells +} |
| |
by themselves (that's why you have to pass an xt). This word is used in |
| |
the following way: |
| |
|
| |
@example |
| |
' @var{disasm-operands} ' @var{table} define-format @var{inst-format} |
| |
@end example |
| |
|
| |
As shown above, the defined instruction format is then used like this: |
| |
|
| |
@example |
| |
@var{entry-num} @var{inst-format} @var{inst-name} |
| |
@end example |
| |
|
| |
In terms of currying, this kind of two-level defining word provides the |
| |
parameters in three stages: first @var{disasm-operands} and @var{table}, |
| |
then @var{entry-num} and @var{inst-name}, finally @code{addr w}, i.e., |
| |
the instruction to be disassembled. |
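
A much smaller two-level defining word may help to see these stages in
isolation. The following is only a sketch, unrelated to the
disassembler; all names in it are invented for this illustration:

@example
: define-scaler ( factor "name" -- )
  \ define a defining word that scales its argument by factor
  create ,
does> ( n "name" -- )
  \ define a word that pushes n*factor
  @@ * create ,
does> ( -- n*factor )
  @@ ;

12 define-scaler dozens:   \ first stage: supply factor
3 dozens: three-dozen      \ second stage: supply n
three-dozen .              \ finally use the result: displays 36
@end example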
| |
|
| |
Of course this did not quite fit all the instruction format names used |
| |
in @file{insts.fs}, so I had to define a few wrappers that conditioned |
| |
the parameters into the right form. |
| |
|
| |
If you have trouble following this section, don't worry. First, this is |
| |
involved and takes time (and probably some playing around) to |
| |
understand; second, this is the first two-level |
| |
@code{create}/@code{does>} word I have written in seventeen years of |
| |
Forth; and if I did not have @file{insts.fs} to start with, I may well |
| |
have elected to use just a one-level defining word (with some repeating |
| |
of parameters when using the defining word). So it is not necessary to |
| |
understand this, but it may improve your understanding of Forth. |
| |
|
| |
|
| |
@node Deferred words, Aliases, User-defined Defining Words, Defining Words |
| |
@subsection Deferred words |
| |
@cindex deferred words |
| |
|
| |
The defining word @code{Defer} allows you to define a word by name |
| |
without defining its behaviour; the definition of its behaviour is |
| |
deferred. Here are two situations where this can be useful: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
Where you want to allow the behaviour of a word to be altered later, and |
| |
for all precompiled references to the word to change when its behaviour |
| |
is changed. |
| |
@item |
| |
For mutual recursion; see @ref{Calls and returns}. |
| |
@end itemize |
| |
|
| |
In the following example, @code{foo} always invokes the version of |
| |
@code{greet} that prints ``@code{Good morning}'' whilst @code{bar} |
| |
always invokes the version that prints ``@code{Hello}''. There is no way |
| |
of getting @code{foo} to use the later version without re-ordering the |
| |
source code and recompiling it. |
| |
|
| |
@example |
| |
: greet ." Good morning" ; |
| |
: foo ... greet ... ; |
| |
: greet ." Hello" ; |
| |
: bar ... greet ... ; |
| |
@end example |
| |
|
| |
This problem can be solved by defining @code{greet} as a @code{Defer}red |
| |
word. The behaviour of a @code{Defer}red word can be defined and |
| |
redefined at any time by using @code{IS} to associate the xt of a |
| |
previously-defined word with it. The previous example becomes: |
| |
|
| |
@example |
| |
Defer greet ( -- ) |
| |
: foo ... greet ... ; |
| |
: bar ... greet ... ; |
| |
: greet1 ( -- ) ." Good morning" ; |
| |
: greet2 ( -- ) ." Hello" ; |
| |
' greet2 <IS> greet \ make greet behave like greet2 |
| |
@end example |
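
To see the effect, here is a complete sketch in which the elided parts
of @code{foo} are replaced by a @code{cr} (an arbitrary choice for this
illustration):

@example
Defer greet ( -- )
: foo ( -- ) greet cr ;
: greet1 ( -- ) ." Good morning" ;
: greet2 ( -- ) ." Hello" ;

' greet1 <IS> greet
foo  \ displays Good morning
' greet2 <IS> greet
foo  \ displays Hello; foo was not recompiled
@end example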
| |
|
| |
@progstyle |
| |
You should write a stack comment for every deferred word, and put only |
| |
XTs into deferred words that conform to this stack effect. Otherwise |
| |
it's too difficult to use the deferred word. |
| |
|
| |
A deferred word can be used to improve the statistics-gathering example |
| |
from @ref{User-defined Defining Words}; rather than edit the |
| |
application's source code to change every @code{:} to a @code{my:}, do |
| |
this: |
| |
|
| |
@example |
| |
: real: : ; \ retain access to the original |
| |
defer : \ redefine as a deferred word |
| |
' my: <IS> : \ use special version of : |
| |
\ |
| |
\ load application here |
| |
\ |
| |
' real: <IS> : \ go back to the original |
| |
@end example |
| |
|
| |
|
| |
One thing to note is that @code{<IS>} consumes its name when it is |
| |
executed. If you want to specify the name at compile time, use |
| |
@code{[IS]}: |
| |
|
| |
@example |
| |
: set-greet ( xt -- ) |
| |
[IS] greet ; |
| |
|
| |
' greet1 set-greet |
| |
@end example |
| |
|
| |
A deferred word can only inherit execution semantics from the xt |
| |
(because that is all that an xt can represent -- for more discussion of |
| |
this @pxref{Tokens for Words}); by default it will have default |
| |
interpretation and compilation semantics derived from these execution |
| |
semantics. However, you can change the interpretation and compilation |
| |
semantics of the deferred word in the usual ways: |
| |
|
| |
@example |
| |
: bar .... ; compile-only |
| |
Defer fred immediate |
| |
Defer jim |
| |
|
| |
' bar <IS> jim \ jim has default semantics |
| |
' bar <IS> fred \ fred is immediate |
| |
@end example |
| |
|
| |
doc-defer |
| |
doc-<is> |
| |
doc-[is] |
| |
doc-is |
| |
@comment TODO document these: what's defers [is] |
| |
doc-what's |
| |
doc-defers |
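
As a sketch of how @code{what's} and @code{defers} might be used,
assume that @code{greet} from the examples above currently behaves like
@code{greet2}. The names @code{old-greet} and @code{greet3} are
invented for this illustration:

@example
what's greet Constant old-greet  \ remember the current behaviour

: greet3 ( -- )
  \ defers binds the behaviour that greet has right now
  defers greet ."  and welcome" ;

' greet3 <IS> greet
greet                 \ displays Hello and welcome
old-greet <IS> greet  \ restore the previous behaviour
@end example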
| |
|
| |
@c Use @code{words-deferred} to see a list of deferred words. |
| |
|
| |
Definitions in ANS Forth for @code{defer}, @code{<is>} and @code{[is]} |
| |
are provided in @file{compat/defer.fs}. |
| |
|
| |
|
| |
@node Aliases, , Deferred words, Defining Words |
| |
@subsection Aliases |
| |
@cindex aliases |
| |
|
| |
The defining word @code{Alias} allows you to define a word by name that |
| |
has the same behaviour as some other word. Here are two situations where |
| |
this can be useful: |
| |
|
| |
@itemize @bullet |
| |
@item |
| |
When you want access to a word's definition from a different word list |
| |
(for an example of this, see the definition of the @code{Root} word list |
| |
in the Gforth source). |
| |
@item |
| |
When you want to create a synonym; a definition that can be known by |
| |
either of two names (for example, @code{THEN} and @code{ENDIF} are |
| |
aliases). |
| |
@end itemize |
| |
|
| |
Like deferred words, an alias has default compilation and interpretation |
| |
semantics at the beginning (not the modifications of the other word), |
| |
but you can change them in the usual ways (@code{immediate}, |
| |
@code{compile-only}). For example: |
| |
|
| |
@example |
| |
: foo ... ; immediate |
| |
|
| |
' foo Alias bar \ bar is not an immediate word |
| |
' foo Alias fooby immediate \ fooby is an immediate word |
| |
@end example |
| |
|
| |
Words that are aliases have the same xt, different headers in the |
| |
dictionary, and consequently different name tokens (@pxref{Tokens for |
| |
Words}) and possibly different immediate flags. An alias can only have |
| |
default or immediate compilation semantics; you can define aliases for |
| |
combined words with @code{interpret/compile:} -- see @ref{Combined words}. |
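
A minimal sketch of the synonym case (the names @code{square} and
@code{sq} are invented for this illustration):

@example
: square ( n -- n*n ) dup * ;
' square Alias sq

4 sq .  \ displays 16
@end example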
| |
|
| |
doc-alias |
| |
|
| |
|
| |
@node Interpretation and Compilation Semantics, Tokens for Words, Defining Words, Words |
| |
@section Interpretation and Compilation Semantics |
| |
@cindex semantics, interpretation and compilation |
| |
|
| |
@c !! state and ' are used without explanation |
| |
@c example for immediate/compile-only? or is the tutorial enough |
| |
|
| |
@cindex interpretation semantics |
| |
The @dfn{interpretation semantics} of a (named) word are what the text |
| |
interpreter does when it encounters the word in interpret state. They also |
| |
appear in some other contexts, e.g., the execution token returned by |
| |
@code{' @i{word}} identifies the interpretation semantics of @i{word} |
| |
(in other words, @code{' @i{word} execute} is equivalent to |
| |
interpret-state text interpretation of @code{@i{word}}). |
| |
|
| |
@cindex compilation semantics |
| |
The @dfn{compilation semantics} of a (named) word are what the text |
| |
interpreter does when it encounters the word in compile state. They also |
| |
appear in other contexts, e.g., @code{POSTPONE @i{word}} |
| |
compiles@footnote{In standard terminology, ``appends to the current |
| |
definition''.} the compilation semantics of @i{word}. |
| |
|
| |
@cindex execution semantics |
| |
The standard also talks about @dfn{execution semantics}. They are used |
| |
only for defining the interpretation and compilation semantics of many |
| |
words. By default, the interpretation semantics of a word are to |
| |
@code{execute} its execution semantics, and the compilation semantics of |
| |
a word are to @code{compile,} its execution semantics.@footnote{In |
| |
standard terminology: The default interpretation semantics are its |
| |
execution semantics; the default compilation semantics are to append its |
| |
execution semantics to the execution semantics of the current |
| |
definition.} |
| |
|
| |
Unnamed words (@pxref{Anonymous Definitions}) cannot be encountered by |
| |
the text interpreter, ticked, or @code{postpone}d, so they have no |
| |
interpretation or compilation semantics. Their behaviour is represented |
| |
by their XT (@pxref{Tokens for Words}), and we call it execution |
| |
semantics, too. |
| |
|
| |
@comment TODO expand, make it co-operate with new sections on text interpreter. |
| |
|
| |
@cindex immediate words |
| |
@cindex compile-only words |
| |
You can change the semantics of the most-recently defined word: |
| |
|
| |
|
| |
doc-immediate |
| |
doc-compile-only |
| |
doc-restrict |
| |
|
| |
|
| |
Note that ticking (@code{'}) a compile-only word gives an error |
| |
(``Interpreting a compile-only word''). |
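
The following sketch shows the effect of @code{immediate}; the names
@code{announce} and @code{test} are arbitrary:

@example
: announce ( -- ) ." compiling " ; immediate

: test ( -- ) announce ;  \ displays compiling while test is compiled
test                      \ displays nothing: announce is not part of test
@end example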
| |
|
| |
@menu |
| |
* Combined words:: |
| |
@end menu |
| |
|
| |
|
| |
@node Combined words, , Interpretation and Compilation Semantics, Interpretation and Compilation Semantics |
| |
@subsection Combined Words |
| |
@cindex combined words |
| |
|
| |
Gforth allows you to define @dfn{combined words} -- words that have an |
| |
arbitrary combination of interpretation and compilation semantics. |
| |
|
| |
doc-interpret/compile: |
| |
|
| |
This feature was introduced for implementing @code{TO} and @code{S"}. I |
| |
recommend that you do not define such words, as cute as they may be: |
| |
they make it hard to get at both parts of the word in some contexts. |
| |
E.g., assume you want to get an execution token for the compilation |
| |
part. Instead, define two words, one that embodies the interpretation |
| |
part, and one that embodies the compilation part. Once you have done |
| |
that, you can define a combined word with @code{interpret/compile:} for |
| |
the convenience of your users. |
| |
|
| |
You might try to use this feature to provide an optimizing |
| |
implementation of the default compilation semantics of a word. For |
| |
example, by defining: |
| |
@example |
| |
:noname |
| |
foo bar ; |
| |
:noname |
| |
POSTPONE foo POSTPONE bar ; |
| |
interpret/compile: opti-foobar |
| |
@end example |
| |
|
| |
@noindent |
| |
as an optimizing version of: |
| |
|
| |
@example |
| |
: foobar |
| |
foo bar ; |
| |
@end example |
| |
|
| |
Unfortunately, this does not work correctly with @code{[compile]}, |
| |
because @code{[compile]} assumes that the compilation semantics of all |
| |
@code{interpret/compile:} words are non-default. I.e., @code{[compile] |
| |
opti-foobar} would compile compilation semantics, whereas |
| |
@code{[compile] foobar} would compile interpretation semantics. |
| |
|
| |
@cindex state-smart words (are a bad idea) |
| |
Some people try to use @dfn{state-smart} words to emulate the feature provided |
| |
by @code{interpret/compile:} (words are state-smart if they check |
| |
@code{STATE} during execution). E.g., they would try to code |
| |
@code{foobar} like this: |
| |
|
| |
@example |
| |
: foobar |
| |
STATE @@ |
| |
IF ( compilation state ) |
| |
POSTPONE foo POSTPONE bar |
| |
ELSE |
| |
foo bar |
| |
ENDIF ; immediate |
| |
@end example |
| |
|
| |
Although this works if @code{foobar} is only processed by the text |
| |
interpreter, it does not work in other contexts (like @code{'} or |
| |
@code{POSTPONE}). E.g., @code{' foobar} will produce an execution token |
| |
for a state-smart word, not for the interpretation semantics of the |
| |
original @code{foobar}; when you execute this execution token (directly |
| |
with @code{EXECUTE} or indirectly through @code{COMPILE,}) in compile |
| |
state, the result will not be what you expected (i.e., it will not |
| |
perform @code{foo bar}). State-smart words are a bad idea. Simply don't |
| |
write them@footnote{For a more detailed discussion of this topic, see |
| |
M. Anton Ertl, |
| |
@cite{@uref{http://www.complang.tuwien.ac.at/papers/ertl98.ps.gz,@code{State}-smartness---Why |
| |
it is Evil and How to Exorcise it}}, EuroForth '98.}! |
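
The failure can be provoked as follows. This is only a sketch: it
assumes the state-smart @code{foobar} above, simple stand-ins for
@code{foo} and @code{bar} that are defined before it, and two made-up
names, @code{run-foobar} and @code{test}:

@example
: foo ( -- ) ." foo " ;
: bar ( -- ) ." bar " ;
\ ... state-smart definition of foobar as above ...

: run-foobar ( -- )
  \ intended to perform foo bar whenever it is executed
  [ ' foobar ] literal execute ; immediate

: test ( -- ) run-foobar ;  \ displays nothing; foo bar are compiled into test
test                        \ displays foo bar
@end example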
| |
|
| |
@cindex defining words with arbitrary semantics combinations |
| |
It is also possible to write defining words that define words with |
| |
arbitrary combinations of interpretation and compilation semantics. In |
| |
general, they look like this: |
| |
|
| |
@example |
| |
: def-word |
| |
create-interpret/compile |
| |
@i{code1} |
| |
interpretation> |
| |
@i{code2} |
| |
<interpretation |
| |
compilation> |
| |
@i{code3} |
| |
<compilation ; |
| |
@end example |
| |
|
| |
For a @i{word} defined with @code{def-word}, the interpretation |
| |
semantics are to push the address of the body of @i{word} and perform |
| |
@i{code2}, and the compilation semantics are to push the address of |
| |
the body of @i{word} and perform @i{code3}. E.g., @code{constant} |
| |
can also be defined like this (except that the defined constants don't |
| |
behave correctly when @code{[compile]}d): |
| |
|
| |
@example |
| |
: constant ( n "name" -- ) |
| |
create-interpret/compile |
| |
, |
| |
interpretation> ( -- n ) |
| |
@@ |
| |
<interpretation |
| |
compilation> ( compilation. -- ; run-time. -- n ) |
| |
@@ postpone literal |
| |
<compilation ; |
| |
@end example |
| |
|
| |
|
| |
doc-create-interpret/compile |
| |
doc-interpretation> |
| |
doc-<interpretation |
| |
doc-compilation> |
| |
doc-<compilation |
| |
|
| |
|
| |
Words defined with @code{interpret/compile:} and |
| |
@code{create-interpret/compile} have an extended header structure that |
| |
differs from other words; however, unless you try to access them with |
| |
plain address arithmetic, you should not notice this. Words for |
| |
accessing the header structure usually know how to deal with this; e.g., |
| |
@code{'} @i{word} @code{>body} also gives you the body of a word created |
| |
with @code{create-interpret/compile}. |
| |
|
| |
|
| |
doc-postpone |
| |
|
| |
@comment TODO -- expand glossary text for POSTPONE |
| |
|
| |
|
| |
@c ------------------------------------------------------------- |
| |
@node Tokens for Words, The Text Interpreter, Interpretation and Compilation Semantics, Words |
| |
@section Tokens for Words |
| |
@cindex tokens for words |
| |
|
| |
This section describes the creation and use of tokens that represent |
| |
words. |
| |
|
| |
@menu |
| |
* Execution token:: represents execution/interpretation semantics |
| |
* Compilation token:: represents compilation semantics |
| |
* Name token:: represents named words |
| |
@end menu |
| |
|
| |
@node Execution token, Compilation token, Tokens for Words, Tokens for Words |
| |
@subsection Execution token |
| |
|
| |
@cindex xt |
| |
@cindex execution token |
| |
An @dfn{execution token} (@i{XT}) represents some behaviour of a word. |
| |
You can use @code{execute} to invoke this behaviour. |
| |
|
| |
@cindex tick (') |
| |
You can use @code{'} to get an execution token that represents the |
| |
interpretation semantics of a named word: |
| |
|
| |
@example |
| |
5 ' . |
| |
execute |
|