gforth/doc/gforth.ds - diff

Diff for /gforth/doc/gforth.ds between versions 1.25 and 1.26

-version 1.25, 1999/03/11 22:52:11
+version 1.26, 1999/03/23 20:24:21
  Line 32  Programming style note:
  @ifinfo
  This file documents Gforth @value{VERSION}
- Copyright @copyright{} 1995-1998 Free Software Foundation, Inc.
+ Copyright @copyright{} 1995-1999 Free Software Foundation, Inc.
       Permission is granted to make and distribute verbatim copies of
       this manual provided the copyright notice and this permission notice
  Line 71  Copyright @copyright{} 1995-1998 Free So
  @center Jens Wilke
  @center Neal Crook
  @sp 3
- @center This manual is permanently under construction and was last updated on 16-Feb-1999
+ @center This manual is permanently under construction and was last updated on 23-Mar-1999
  @comment  The following two commands start the copyright page.
  @page
  Line 107  personal machines. This manual correspon
  @menu
  * License::                     The GPL
- * Introduction::                An introduction to ANS Forth
  * Goals::                       About the Gforth Project
+ * Introduction::                An introduction to ANS Forth
  * Invoking Gforth::             Starting (and exiting) Gforth
  * Words::                       Forth words available in Gforth
  * Error messages::              How to interpret them
  Line 129  personal machines. This manual correspon
  @detailmenu --- The Detailed Node Listing ---
+ Goals of Gforth
+ * Gforth Extensions Sinful?::
  An Introduction to ANS Forth
  * Introducing the Text Interpreter::
- Line 139  An Introduction to ANS Forth
+ Line 143  An Introduction to ANS Forth
  * Review - elements of a Forth system::
  * Exercises::
- Goals of Gforth
- * Gforth Extensions Sinful?::
  Forth Words
  * Notation::
  Line 152  Forth Words
  * Stack Manipulation::
  * Memory::
  * Control Structures::
- * Locals::
  * Defining Words::
  * The Text Interpreter::
- * Structures::
- * Object-oriented Forth::
  * Tokens for Words::
  * Word Lists::
  * Environmental Queries::
  * Files::
- * Including Files::
  * Blocks::
  * Other I/O::
  * Programming Tools::
  * Assembler and Code Words::
  * Threading Words::
+ * Locals::
+ * Structures::
+ * Object-oriented Forth::
  * Passing Commands to the OS::
  * Miscellaneous Words::
- Line 202  Control Structures
+ Line 201  Control Structures
  * Calls and returns::
  * Exception Handling::
- Locals
- * Gforth locals::
- * ANS Forth locals::
- Gforth locals
- * Where are locals visible by name?::
- * How long do locals live?::
- * Programming Style::
- * Implementation::
  Defining Words
  * Simple Defining Words::
- Line 229  The Text Interpreter
+ Line 216  The Text Interpreter
  * Literals::
  * Interpreter Directives::
+ Word Lists
+ * Why use word lists?::
+ * Word list examples::
+ Files
+ * Forth source files::
+ * General files::
+ * Search Paths::
+ * Forth Search Paths::
+ * General Search Paths::
+ Other I/O
+ * Simple numeric output::
+ * Formatted numeric output::
+ * String Formats::
+ * Displaying characters and strings::
+ * Input::
+ Programming Tools
+ * Debugging::                   Simple and quick.
+ * Assertions::                  Making your programs self-checking.
+ * Singlestep Debugger::         Executing your program word by word.
+ Locals
+ * Gforth locals::
+ * ANS Forth locals::
+ Gforth locals
+ * Where are locals visible by name?::
+ * How long do locals live?::
+ * Programming Style::
+ * Implementation::
  Structures
  * Why explicit structure support?::
- Line 274  The @file{mini-oof.fs} model
+ Line 300  The @file{mini-oof.fs} model
  * Mini-OOF Example::
  * Mini-OOF Implementation::
- Word Lists
- * Why use word lists?::
- * Word list examples::
- Including Files
- * Words for Including::
- * Search Path::
- * Forth Search Paths::
- * General Search Paths::
- Other I/O
- * Simple numeric output::       Predefined formats
- * Formatted numeric output::    Formatted (pictured) output
- * String Formats::              How Forth stores strings in memory
- * Displaying characters and strings:: Other stuff
- * Input::                       Input
- Programming Tools
- * Debugging::                   Simple and quick.
- * Assertions::                  Making your programs self-checking.
- * Singlestep Debugger::         Executing your program word by word.
  Tools
  * ANS Report::                  Report the words used, sorted by wordset.
  Line 422  Other Forth-related information
  @end detailmenu
  @end menu
- @node License, Introduction, Top, Top
+ @node License, Goals, Top, Top
  @unnumbered GNU GENERAL PUBLIC LICENSE
  @center Version 2, June 1991
  Line 823  from other Forth compilers. However, thi
  reference manual.
  @end iftex
- @c ----------------------------------------------------------
- @node    Introduction, Goals, License, Top
+ @c ******************************************************************
+ @node Goals, Introduction, License, Top
+ @comment node-name,     next,           previous, up
+ @chapter Goals of Gforth
+ @cindex goals of the Gforth project
+ The goal of the Gforth Project is to develop a standard model for
+ ANS Forth. This can be split into several subgoals:
+ @itemize @bullet
+ @item
+ Gforth should conform to the ANS Forth Standard.
+ @item
+ It should be a model, i.e. it should define all the
+ implementation-dependent things.
+ @item
+ It should become standard, i.e. widely accepted and used. This goal
+ is the most difficult one.
+ @end itemize
+ To achieve these goals Gforth should be
+ @itemize @bullet
+ @item
+ Similar to previous models (fig-Forth, F83)
+ @item
+ Powerful. It should provide for all the things that are considered
+ necessary today and even some that are not yet considered necessary.
+ @item
+ Efficient. It should not get the reputation of being exceptionally
+ slow.
+ @item
+ Free.
+ @item
+ Available on many machines/easy to port.
+ @end itemize
+ Have we achieved these goals? Gforth conforms to the ANS Forth
+ standard. It may be considered a model, but we have not yet documented
+ which parts of the model are stable and which parts we are likely to
+ change. It certainly has not yet become a de facto standard, but it
+ appears to be quite popular. It has some similarities to and some
+ differences from previous models. It has some powerful features, but not
+ yet everything that we envisioned. We certainly have achieved our
+ execution speed goals (@pxref{Performance}).  It is free and available
+ on many machines.
+ @menu
+ * Gforth Extensions Sinful?::
+ @end menu
+ @node Gforth Extensions Sinful?, , Goals, Goals
+ @comment node-name,     next,           previous, up
+ @section Is it a Sin to use Gforth Extensions?
+ @cindex Gforth extensions
+ If you've been paying attention, you will have realised that there is an
+ ANS (American National Standard) for Forth. As you read through the rest
+ of this manual, you will see documentation for @var{Standard} words, and
+ documentation for some appealing Gforth @var{extensions}. You might ask
+ yourself the question: @var{``Given that there is a standard, would I be
+ committing a sin to use (non-Standard) Gforth extensions?''}
+ The answer to that question is somewhat pragmatic and somewhat
+ philosophical. Consider these points:
+ @itemize @bullet
+ @item
+ A number of the Gforth extensions can be implemented in ANS Forth using
+ files provided in the @file{compat/} directory. These are mentioned in
+ the text in passing.
+ @item
+ Forth has a rich historical precedent for programmers taking advantage
+ of implementation-dependent features of their tools (for example,
+ relying on a knowledge of the dictionary structure). Sometimes these
+ techniques are necessary to extract every last bit of performance from
+ the hardware, sometimes they are just a programming shorthand.
+ @item
+ The best way to break the rules is to know what the rules are. To learn
+ the rules, there is no substitute for studying the text of the Standard
+ itself. In particular, Appendix A of the Standard (@var{Rationale})
+ provides a valuable insight into the thought processes of the technical
+ committee.
+ @item
+ The best reason to break a rule is because you have to; because it's
+ more productive to do that, because it makes your code run fast enough
+ or because you can see no Standard way to achieve what you want to
+ achieve.
+ @end itemize
+ The tool @file{ans-report.fs} (@pxref{ANS Report}) makes it easy to
+ analyse your program and determine what non-Standard definitions it
+ relies upon.
+ @c ******************************************************************
+ @node    Introduction, Invoking Gforth, Goals, Top
  @comment node-name,     next,           previous, up
  @chapter An Introduction to ANS Forth
  @cindex Forth - an introduction
- Line 835  teaching material, it seems worthwhile t
+ Line 928  teaching material, it seems worthwhile t
  material. @xref{Forth-related information} for other sources of Forth-related
  information.
- The examples in this section should work on any ANS Standard Forth, the
+ The examples in this section should work on any ANS Forth; the
- output shown was produced using Gforth. In each example, I have tried to
+ output shown was produced using Gforth. Each example attempts to
  reproduce the exact output that Gforth produces. If you try out the
  examples (and you should), what you should type is shown @kbd{like this}
  and Gforth's response is shown @code{like this}. The single exception is
  that, where the example shows @kbd{<return>} it means that you should
- press the "carriage return" key. Unfortunatley, some output formats for
+ press the ``carriage return'' key. Unfortunately, some output formats for
  this manual cannot show the difference between @kbd{this} and
  @code{this} which will make trying out the examples harder (but not
  impossible).
- Line 864  lead to great productivity improvements.
+ Line 957  lead to great productivity improvements.
  * Review - elements of a Forth system::
  * Exercises::
  @end menu
- @comment TODO add these sections to the top xref lists
  @comment ----------------------------------------------
  @node Introducing the Text Interpreter, Stacks and Postfix notation, Introduction, Introduction
- Line 876  When you invoke the Forth image, you wil
+ Line 968  When you invoke the Forth image, you wil
  and nothing else (if you have Gforth installed on your system, try
  invoking it now, by typing @kbd{gforth<return>}). Forth is now running
  its command line interpreter, which is called the @var{Text Interpreter}
- (also known as the @var{Outer Interpreter}).  (@pxref{The Text
+ (also known as the @var{Outer Interpreter}).  (You will learn a lot
- Interpreter} describes it in more detail, but we will learn more about
+ about the text interpreter as you read through this chapter,
- its behaviour as we go through this chapter).
+ but @pxref{The Text Interpreter} for more detail).
- Although it may not be obvious, Forth is actually waiting for your
+ Although it's not obvious, Forth is actually waiting for your
  input. Type a number and press the <return> key:
  @example
- Line 889  input. Type a number and press the <retu
+ Line 981  input. Type a number and press the <retu
  Rather than give you a prompt to invite you to input something, the text
  interpreter prints a status message @var{after} it has processed a line
- of input. The status message in this case (" ok" followed by
+ of input. The status message in this case (``@code{ ok}'' followed by
  carriage-return) indicates that the text interpreter was able to process
  all of your input successfully. Now type something illegal:
  @example
  @kbd{qwer341<return>}
+ :1: Undefined word
+ qwer341
  ^^^^^^^
- Error: Undefined word
+ $400D2BA8 Bounce
+ $400DBDA8 no.extensions
  @end example
- When the text interpreter detects an error, it discards any remaining
+ The exact text, other than the ``Undefined word'', may differ slightly on
- text on a line, resets certain internal state and prints an error
+ your system, but the effect is the same; when the text interpreter
- message.
+ detects an error, it discards any remaining text on a line, resets
+ certain internal state and prints an error message.
- The text interpreter works on input one line at a time. Starting at
- the beginning of the line, it breaks the line into groups of characters
+ The text interpreter waits for you to press carriage-return, and then
- separated by spaces. For each group of characters in turn, it makes two
+ processes your input line. Starting at the beginning of the line, it
- attempts to do something:
+ breaks the line into groups of characters separated by spaces. For each
+ group of characters in turn, it makes two attempts to do something:
  @itemize @bullet
  @item
- Line 933  in the next section).
+ Line 1029  in the next section).
  @end itemize
  If the text interpreter is unable to do either of these things with any
- group of characters, it discards the rest of the line and print an error
+ group of characters, it discards the group of characters and the rest of
- message. If the text interpreter reaches the end of the line without
+ the line, then prints an error message. If the text interpreter reaches
- error, it prints the status message " ok" followed by carriage-return.
+ the end of the line without error, it prints the status message ``@code{ ok}''
+ followed by carriage-return.
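
The lookup order described above (dictionary first, number conversion second) can be observed directly. The following is a minimal sketch, assuming an interactive Gforth session. The word named 42 is a hypothetical definition introduced only for illustration; it relies on a colon definition and on the words "." and ." which are covered later in the chapter. Some systems may print a warning when a word is given a name that looks like a number.

```forth
\ No word named 42 is in the dictionary yet, so the text
\ interpreter converts it to a number and pushes it on the stack.
42 .                    \ displays 42

\ Hypothetical: define a word whose name looks like a number.
: 42 ." forty-two" ;

\ The dictionary lookup now succeeds, so the word is executed
\ and no number conversion is attempted.
42                      \ displays forty-two
```

This is why the manual's footnote says that you cannot tell from the outside whether the lookup of a group of characters succeeded: the same characters can be a number in one session and a word in another.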
  This is the simplest command we can give to the text interpreter:
- Line 944  This is the simplest command we can give
+ Line 1041  This is the simplest command we can give
  @end example
  The text interpreter did everything we asked it to do (nothing) without
- an error, so it said that everything is "ok". Try a slightly longer
+ an error, so it said that everything is ``@code{ ok}''. Try a slightly longer
  command:
  @example
  @kbd{12 dup fred dup<return>}
+ :1: Undefined word
+ 12 dup fred dup
         ^^^^
- Error: Undefined word
+ $400D2BA8 Bounce
+ $400DBDA8 no.extensions
  @end example
- When you pres the <return> key, the text interpreter starts to work its
+ When you press the carriage-return key, the text interpreter starts to
- way along the line.
+ work its way along the line:
  @itemize @bullet
  @item
- Line 963  characters @code{12} and looks them up i
+ Line 1063  characters @code{12} and looks them up i
  dictionary@footnote{We can't tell if it found them or not, but assume
  for now that it did not}. There is no match for this group of characters
  in the name dictionary, so it tries to treat them as a number. It is
- able to do this successfully, so it puts the number, 12, "on the stack"
+ able to do this successfully, so it puts the number, 12, ``on the stack''
  (whatever that means).
  @item
  The text interpreter resumes scanning the line and gets the next group
- of characters, @code{dup}. It looks them up in the name dictionary and
+ of characters, @code{dup}. It looks it up in the name dictionary and
- (you'll have to take my word for this) finds them, and executes the word
+ (you'll have to take my word for this) finds it, and executes the word
  @code{dup} (whatever that means).
  @item
  Once again, the text interpreter resumes scanning the line and gets the
- Line 993  and executing it a second time.
+ Line 1093  and executing it a second time.
  @cindex outer interpreter
  In procedural programming languages (like C and Pascal), the
- building-block of programs is the function or procedure. These
+ building-block of programs is the @var{function} or @var{procedure}. These
- functions or procedures are called with explicit parameters. For
+ functions or procedures are called with @var{explicit parameters}. For
  example, in C we might write:
  @example
  total = total + new_volume(length,height,depth);
  @end example
- where total, length, height, depth are all variables and new_volume is
+ @noindent
- a function-call to another piece of code.
+ where new_volume is a function-call to another piece of code, and total,
+ length, height and depth are all variables. length, height and depth are
+ parameters to the function-call.
- In Forth, the equivalent to the function or procedure is the
+ In Forth, the equivalent of the function or procedure is the
  @var{definition} and parameters are implicitly passed between
  definitions using a shared stack that is visible to the
  programmer. Although Forth does support variables, the existence of the
- Line 1015  actual number is implementation-dependen
+ Line 1117  actual number is implementation-dependen
  used for any operation is implied unambiguously by the operation being
  performed. The stack used for all integer operations is called the @var{data
  stack} and, since this is the stack used most commonly, references to
- "the data stack" are often abbreviated to "the stack".
+ ``the data stack'' are often abbreviated to ``the stack''.
  The stacks have a last-in, first-out (LIFO) organisation. If you type:
- Line 1023  The stacks have a last-in, first-out (LI
+ Line 1125  The stacks have a last-in, first-out (LI
  @kbd{1 2 3<return>}  ok
  @end example
- Then you (well, the text interpreter, really) have placed three numbers
+ Then this instructs the text interpreter to place three numbers on the
- on the (data) stack. An analogy for the behaviour of the stack is to
+ (data) stack. An analogy for the behaviour of the stack is to take a
- take a pack of playing cards and deal out the ace (1), 2 and 3 into a
+ pack of playing cards and deal out the ace (1), 2 and 3 into a pile on
- pile on the table. The 3 was the last card onto the pile ("last-in") and
+ the table. The 3 was the last card onto the pile (``last-in'') and if
- if you take a card off the pile then, unless you're prepared to fiddle a
+ you take a card off the pile then, unless you're prepared to fiddle a
- bit, the card that you take off will be the 3 ("first-out"). The number
+ bit, the card that you take off will be the 3 (``first-out''). The
- that will be first-out of the stack is called the "top of stack", which
+ number that will be first-out of the stack is called the @var{top of
+ stack}, which
+ @cindex TOS definition
  is often abbreviated to @var{TOS}.
- To see how parameters are passed in Forth, we will consider the
+ To understand how parameters are passed in Forth, consider the
- behaviour of the definition @code{+} (pronounced "plus"). You will not be
+ behaviour of the definition @code{+} (pronounced ``plus''). You will not
- surprised to learn that this definition performs addition. More
+ be surprised to learn that this definition performs addition. More
  precisely, it adds two numbers together and produces a result. Where does
- it get the two numbers from? It takes the first two numbers off the
+ it get the two numbers from? It takes the top two numbers off the
  stack. Where does it place the result? On the stack. You can act-out the
  behaviour of @code{+} with your playing cards like this:
  @itemize @bullet
  @item
- Pick up two cards from the stack
+ Pick up two cards from the stack on the table
  @item
- Stare at them intently and ask yourself "what *is* the sum of these two
+ Stare at them intently and ask yourself ``what @var{is} the sum of these two
- numbers"
+ numbers''
  @item
  Decide that the answer is 5
  @item
- Line 1055  Put a 5 on the remaining ace that's on t
+ Line 1159  Put a 5 on the remaining ace that's on t
  @end itemize
  If you don't have a pack of cards handy but you do have Forth running,
- you can use the definition .s to show the current state of the stack,
+ you can use the definition @code{.s} to show the current state of the stack,
  without affecting the stack. Type:
  @example
  @kbd{clearstack 1 2 3<return>} ok
- @kbd{.s<return> <3> 1 2 3 } ok
+ @kbd{.s<return>} <3> 1 2 3  ok
  @end example
  The text interpreter looks up the word @code{clearstack} and executes
- Line 1068  it; it tidies up the stack and removes a
+ Line 1172  it; it tidies up the stack and removes a
  left on it by earlier examples. The text interpreter pushes each of the
  three numbers in turn onto the stack. Finally, the text interpreter
  looks up the word @code{.s} and executes it. The effect of executing
- @code{.s} is to print the "<3>" (the total number of items on the stack)
+ @code{.s} is to print the ``<3>'' (the total number of items on the stack)
- followed by a list of all the items and the item on the far right-hand
+ followed by a list of all the items on the stack; the item on the far
- side is the TOS.
+ right-hand side is the TOS.
  You can now type:
- + .s<return> <2> 1 5  ok
+ @example
+ @kbd{+ .s<return>} <2> 1 5  ok
+ @end example
+ @noindent
  which is correct; there are now 2 items on the stack and the result of
  the addition is 5.
- If you're playing with cards, try doing a second addition; pick up the
+ If you're playing with cards, try doing a second addition: pick up the
  two cards, work out that their sum is 6, shuffle them into the pack,
- look for a 6 and place that on the table. You now have just one item
+ look for a 6 and place that on the table. You now have just one item on
- on the stack. What happens if you try to do a third addition? Pick up
+ the stack. What happens if you try to do a third addition? Pick up the
- the first card, pick up the second card - ah. There is no second
+ first card, pick up the second card -- ah! There is no second card. This
- card. This is called a "stack underflow" and consitutes an error. If
+ is called a @var{stack underflow} and constitutes an error. If you try to
- you try to do the same thing with Forth it will report an error
+ do the same thing with Forth it will report an error (probably a Stack
- (probably a Stack Underflow or an Invalid Memory Address error).
+ Underflow or an Invalid Memory Address error).
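
The two additions from the card example can be replayed in Gforth. This is a small sketch, assuming a Gforth session: clearstack and .s are the words used in the examples above, while depth (number of items on the stack) and "." (print and remove the top item) are Standard words that this passage has not introduced.

```forth
clearstack 1 2 3    \ stack: 1 2 3
+                   \ 2 and 3 are replaced by their sum; stack: 1 5
+                   \ 1 and 5 are replaced by their sum; stack: 6
.s                  \ displays <1> 6
depth .             \ displays 1; a third + would find no second item
```

Checking the depth before the third addition shows the underflow coming without having to trigger the error itself.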
- The opposite situation to a stack underflow is a stack overflow, which
+ The opposite situation to a stack underflow is a @var{stack overflow},
- simply accepts that there is a finite amount of storage space reserved
+ which simply accepts that there is a finite amount of storage space
- for the stack. To stretch the playing card analogy, if you had enough
+ reserved for the stack. To stretch the playing card analogy, if you had
- packs of cards and you piled the cards up on the table, you would
+ enough packs of cards and you piled the cards up on the table, you would
- eventually be unable to add another card; you'd hit the
+ eventually be unable to add another card; you'd hit the ceiling. Gforth
- ceiling. Gforth allows you to set the maximum size of the stacks. In
+ allows you to set the maximum size of the stacks. In general, the only
- general, the only time that you will get a stack overflow is because a
+ time that you will get a stack overflow is because a definition has a
- definition has a bug in it and is generating data on the stack
+ bug in it and is generating data on the stack uncontrollably.
- uncontrollably.
  There's one final use for the playing card analogy. If you model your
  stack using a pack of playing cards, the maximum number of items on
  your stack will be 52 (I assume you didn't use the Joker). The maximum
- *value* of any item on the stack is 13 (the King). In fact, the only
+ @var{value} of any item on the stack is 13 (the King). In fact, the only
  possible numbers are positive integer numbers 1 through 13; you can't
  have (for example) 0 or 27 or 3.52 or -2. If you change the way you
  think about some of the cards, you can accommodate different
- Line 1112  numbers) but the numbers that you can re
+ Line 1218  numbers) but the numbers that you can re
  In that analogy, the limit was the amount of information that a single
  stack entry could hold, and Forth has a similar limit. In Forth, the
- size of a stack entry is called a "cell". The actual size of a cell is
+ size of a stack entry is called a @var{cell}. The actual size of a cell is
  implementation dependent and affects the maximum value that a stack
  entry can hold. A Standard Forth provides a cell size of at least
  16-bits, and most desktop systems use a cell size of 32-bits.
- Line 1120  entry can hold. A Standard Forth provide
+ Line 1226  entry can hold. A Standard Forth provide
  Forth does not do any type checking for you, so you are free to
  manipulate and combine stack items in any way you wish. A convenient
  way of treating stack items is as 2's complement signed integers, and
- that is what Standard words like "+" do. Therefore you can type:
+ that is what Standard words like ``+'' do. Therefore you can type:
- -5 12 + .s<return> <1> 7  ok
+ @example
+ @kbd{-5 12 + .s<return>} <1> 7  ok
+ @end example
- If you use numbers and definitions like "+" in order to turn Forth
+ If you use numbers and definitions like ``+'' in order to turn Forth
  into a great big pocket calculator, you will realise that it's rather
  different from a normal calculator. Rather than typing 2 + 3 = you had
- to type 2 3 + (ignore the fact that you had to use .s to see the
+ to type 2 3 + (ignore the fact that you had to use @code{.s} to see the
  result). The terminology used to describe this difference is to say
- that your calculator uses "Infix Notation" (parameters and operators
+ that your calculator uses @var{Infix Notation} (parameters and operators
- are mixed) whilst Forth uses "Postfix Notation" (parameters and
+ are mixed) whilst Forth uses @var{Postfix Notation} (parameters and
- operators are separate), also called "Reverse Polish Notation".
+ operators are separate), also called @var{Reverse Polish Notation}.
  Whilst postfix notation might look confusing to begin with, it has
  several important advantages:
- - it is unambiguous
+ @itemize @bullet
- - it is more concise
+ @item
- - it fits naturally with a stack-based system
+ it is unambiguous
+ @item
+ it is more concise
+ @item
+ it fits naturally with a stack-based system
+ @end itemize
  To examine these claims in more detail, consider these sums:
+ @example
  6 + 5 * 4 =
  4 * 5 + 6 =
+ @end example
  If you're just learning maths or your maths is very rusty, you will
  probably come up with the answer 44 for the first and 26 for the
  second. If you are a bit of a whizz at maths you will remember the
- *convention* that multiplication takes precendence over addition, and
+ @var{convention} that multiplication takes precedence over addition, and
  you'd come up with the answer 26 both times. To explain the answer 26
  to someone who got the answer 44, you'd probably rewrite the first sum
  like this:
+ @example
  6 + (5 * 4) =
+ @end example
  If what you really wanted was to perform the addition before the
  multiplication, you would have to use parentheses to force it.
- Line 1167  these keystroke sequences:
+ Line 1284  these keystroke sequences:
  Postfix notation is unambiguous because the order that the operators
  are applied is always explicit; that also means that parentheses are
- never required. The operators are *active* (the act of quoting the
+ never required. The operators are @var{active} (the act of quoting the
- operator makes the operation occur) which removes the need for "=".
+ operator makes the operation occur) which removes the need for ``=''.
  The sum 6 + 5 * 4 can be written (in postfix notation) in two
  equivalent ways:
+ @example
  6 5 4 * +      or:
  5 4 * 6 +
+ @end example
  An important thing that you should notice about this notation is that
  the @var{order} of the numbers does not change; if you want to subtract
  2 from 10 you type @code{10 2 -}.
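
The sums discussed above can be checked at the Gforth prompt. This is a brief sketch; it assumes the Standard word "." (print and remove the top of stack), which this passage has not introduced, in place of .s.

```forth
6 5 4 * + .     \ 5*4 = 20, then 6+20: displays 26
5 4 * 6 + .     \ 5*4 = 20, then 20+6: displays 26
6 5 + 4 * .     \ addition first, no parentheses needed: displays 44
10 2 - .        \ operand order is preserved: displays 8
```

The third line is the postfix form of (6 + 5) * 4: the order of the operators alone decides the result, so neither a precedence convention nor parentheses are needed.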
- The reason why Forth uses postfix notation is very simple to explain: it
+ The reason that Forth uses postfix notation is very simple to explain: it
  makes the implementation extremely simple, and it follows naturally from
  using the stack as a mechanism for passing parameters. Another way of
  thinking about this is to realise that all Forth definitions are
  @var{active}; they execute as they are encountered by the text
- interpreter. The result of this is that the syntax of Forth is almost
+ interpreter. The result of this is that the syntax of Forth is trivially
- trivially simple.
+ simple.
- Line 1197  trivially simple.
+ Line 1316  trivially simple.
  Until now, the examples we've seen have been trivial; we've just been
  using Forth as a bigger-than-pocket calculator. Also, each calculation
- we've shown has been a "one-off" -- to repeat it we'd need to type it in
+ we've shown has been a ``one-off'' -- to repeat it we'd need to type it in
  again@footnote{That's not quite true. If you press the up-arrow key on
  your keyboard you should be able to scroll back to any earlier command,
  edit it and re-enter it.} In this section we'll see how to add new
  words to Forth's vocabulary.
- The easiest way to create a new word is to use a "colon
+ The easiest way to create a new word is to use a @var{colon
- definition". We'll define a few and try them out before we worry too
+ definition}. We'll define a few and try them out before we worry too
  much about how they work. Try typing in these examples; be careful to
  copy the spaces accurately:
- Line 1341  magic to make that xt or number get exec
+ Line 1460  magic to make that xt or number get exec
  at the time that @code{add-two} is @var{executed}. Therefore, when you
  execute @code{add-two} its @var{run-time effect} is exactly the same as
  if you had typed @code{2 + .} outside of a definition, and pressed
- <return>.
+ carriage-return.
  In Forth, every word or number can be described in terms of three
  properties:
- Line 1468  example). The effect of executing it are
+ Line 1587  example). The effect of executing it are
  compilation state at this time. If you execute @code{word2} it does
  nothing at all.
- @cindex ." -- how it works
+ @cindex @code{."}, how it works
  Before leaving the subject of immediate words, consider the behaviour of
  @code{."} in the definition of @code{greet}, in the previous
  section. This word is both a parsing word and an immediate word. Notice
- Line 1480  the text interpreter can identify it. Th
+ Line 1599  the text interpreter can identify it. Th
  it is a @var{delimiter}. The examples earlier show that, when the string
  is displayed, there is neither a space before the @code{H} nor after the
  @code{e}. Since @code{."} is an immediate word, it executes at the time
- that @code{greet is defined}. When it executes, it searches forward in
+ that @code{greet} is defined. When it executes, it searches forward in
  the input line looking for the delimiter. When it finds the delimiter,
  it updates @code{>in} to point past the delimiter. It also compiles some
  magic code into the definition of @code{greet}; the xt of a run-time
- Line 1506  If you have tried out the examples in th
+ Line 1625  If you have tried out the examples in th
  have typed them in by hand; when you leave Gforth, your definitions will
  be deleted. You can avoid this by using a text editor to enter Forth
  source code into a file, and then load all of the code from the file
- using @code{include} (@xref{Including Files}). A Forth source
+ using @code{include} (@pxref{Forth source files}). A Forth source
  file is processed by the text interpreter, just as though you had typed
  it in by hand@footnote{Actually, there are some subtle differences, like
  the fact that it doesn't print @code{ ok} at the end of each line}.
- Line 1526  long definitions by hand, you can use a
+ Line 1645  long definitions by hand, you can use a
  the history file into a Forth source file for reuse at a later time.
  @cindex history file
- @cindex .gforth-history
+ @cindex @file{.gforth-history}
- @cindex GFORTHHIST
+ @cindex @code{GFORTHHIST} environment variable
+ @cindex environment variables
  You can find out the name of your history file using @code{history-file
  type }. On non-Unix systems you can find the location of the file using
- @code{history-dir type }@footnote{The environment variable GFORTHHIST
+ @code{history-dir type }@footnote{The environment variable @code{GFORTHHIST}
  determines the location of the file.}
- Line 1552  Forth program development is an interact
+ Line 1672  Forth program development is an interact
  @item
  The main command loop that accepts input, and controls both
  interpretation and compilation, is called the @var{text interpreter}
- (also known as the @var{outer interpreter}.
+ (also known as the @var{outer interpreter}).
  @item
  Forth has a very simple syntax, consisting of words and numbers
  separated by spaces or carriage-return characters. Any additional syntax
- Line 1573  semantics} of a word that it encounters.
+ Line 1693  semantics} of a word that it encounters.
  @item
  The relationship between the @var{interpretation semantics}, @var{compilation semantics}
  and @var{execution semantics} for a word depends upon the way in which
- the word was defined (for example, whether it is an @var{immediate} word.
+ the word was defined (for example, whether it is an @var{immediate} word).
  @item
  Forth definitions can be implemented in Forth (called @var{high-level
  definitions}) or in some other way (usually a lower-level language and
- Line 1583  definitions} or @var{primitives}).
+ Line 1703  definitions} or @var{primitives}).
  Many Forth systems are implemented mainly in Forth.
  @item
  You now know enough to read and understand the rest of this manual and
- the ANS Forth Standard.
+ the ANS Forth document.
  @end itemize
- Line 1609  provides. Even scarier, you know almost
+ Line 1729  provides. Even scarier, you know almost
  system. However, that's not a good idea just yet; better to try writing
  some programs in Gforth.
- The large number of Forth words available in ANS Standard Forth and
+ The large number of Forth words available in ANS Forth and
  Gforth makes learning Forth somewhat daunting. To make the problem
  easier, use the index of this manual to learn more about these words:
- Line 1622  all the exercises in a .fs file in the d
+ Line 1742  all the exercises in a .fs file in the d
  inspiration from Starting Forth and Kelly&Spies.
+ @c ******************************************************************
+ @node Invoking Gforth, Words, Introduction, Top
+ @chapter Invoking Gforth
+ @cindex Gforth - invoking
+ @cindex invoking Gforth
+ @cindex running Gforth
+ @cindex command-line options
+ @cindex options on the command line
+ @cindex flags on the command line
- @c ----------------------------------------------------------
+ You will usually just say @code{gforth}. In many other cases the default
- @node Goals, Invoking Gforth, Introduction, Top
+ Gforth image will be invoked like this:
- @comment node-name,     next,           previous, up
+ @example
- @chapter Goals of Gforth
+ gforth [files] [-e forth-code]
- @cindex Goals
+ @end example
- The goal of the Gforth Project is to develop a standard model for
+ This interprets the contents of the files and the Forth code in the order they
- ANS Forth. This can be split into several subgoals:
+ are given.
- @itemize @bullet
- @item
- Gforth should conform to the ANS Forth Standard.
- @item
- It should be a model, i.e. it should define all the
- implementation-dependent things.
- @item
- It should become standard, i.e. widely accepted and used. This goal
- is the most difficult one.
- @end itemize
- To achieve these goals Gforth should be
- @itemize @bullet
- @item
- Similar to previous models (fig-Forth, F83)
- @item
- Powerful. It should provide for all the things that are considered
- necessary today and even some that are not yet considered necessary.
- @item
- Efficient. It should not get the reputation of being exceptionally
- slow.
- @item
- Free.
- @item
- Available on many machines/easy to port.
- @end itemize
- Have we achieved these goals? Gforth conforms to the ANS Forth
+ In general, the command line looks like this:
- standard. It may be considered a model, but we have not yet documented
- which parts of the model are stable and which parts we are likely to
- change. It certainly has not yet become a de facto standard, but it
- appears to be quite popular. It has some similarities to and some
- differences from previous models. It has some powerful features, but not
- yet everything that we envisioned. We certainly have achieved our
- execution speed goals (@pxref{Performance}).  It is free and available
- on many machines.
- @menu
+ @example
- * Gforth Extensions Sinful?::
+ gforth [initialization options] [image-specific options]
- @end menu
+ @end example
- @node Gforth Extensions Sinful?, , Goals, Goals
+ The initialization options must come before the rest of the command
- @comment node-name,     next,           previous, up
+ line. They are:
- @section Is it a Sin to use Gforth Extensions?
- @cindex Gforth extensions
- If you've been paying attention, you will have realised that there is an
- ANS Standard for Forth. As you read through the rest of this manual, you
- will see documentation for @var{Standard} words, and documentation for
- some appealing Gforth @var{extensions}. You might ask yourself the
- question: @var{"Given that there is a standard, would I be committing a
- sin to use (non-Standard) Gforth extensions?"}
- The answer to that question is somewhat pragmatic and somewhat
- philosophical. Consider these points:
- @itemize @bullet
- @item
- A number of the Gforth extensions can be implemented in ANS Standard
- Forth using files provided in the @file{compat/} directory. These are
- mentioned in the text in passing.
- @item
- Forth has a rich historical precedent for programmers taking advantage
- of implementation-dependent features of their tools (for example,
- relying on a knowledge of the dictionary structure). Sometimes these
- techniques are necessary to extract every last bit of performance from
- the hardware, sometimes they are just a programming shorthand.
- @item
- The best way to break the rules is to know what the rules are. To learn
- the rules, there is no substitute for studying the text of the Standard
- itself. In particular, Appendix A of the Standard (@var{Rationale})
- provides a valuable insight into the thought processes of the technical
- committee.
- @item
- The best reason to break a rule is because you have to; because it's
- more productive to do that, because it makes your code run fast enough
- or because you can see no Standard way to achieve what you want to
- achieve.
- @end itemize
- The tool @file{ans-report.fs} (@pxref{ANS Report}) makes it easy to
- analyse your program and determine what non-Standard definitions it
- relies upon.
- @c ----------------------------------------------------------
- @node Invoking Gforth, Words, Goals, Top
- @chapter Invoking Gforth
- @cindex Gforth - invoking
- @cindex invoking Gforth
- @cindex running Gforth
- @cindex command-line options
- @cindex options on the command line
- @cindex flags on the command line
- You will usually just say @code{gforth}. In many other cases the default
- Gforth image will be invoked like this:
- @example
- gforth [files] [-e forth-code]
- @end example
- This interprets the contents of the files and the Forth code in the order they
- are given.
- In general, the command line looks like this:
- @example
- gforth [initialization options] [image-specific options]
- @end example
- The initialization options must come before the rest of the command
- line. They are:
  @table @code
  @cindex -i, command-line option
- Line 1856  default image @file{gforth.fi} consist o
+ Line 1881  default image @file{gforth.fi} consist o
  in which they are given. The @code{-e @var{forth-code}} or
  @code{--evaluate @var{forth-code}} option evaluates the Forth
  code. This option takes only one argument; if you want to evaluate more
- Forth words, you have to quote them or use several @code{-e}s. To exit
+ Forth words, you have to quote them or use @code{-e} several times. To exit
  after processing the command line (instead of entering interactive mode)
  append @code{-e bye} to the command line.
- Line 1900  doc-bye
+ Line 1925  doc-bye
  @comment some are in .c files.
+ @c ******************************************************************
  @node Words, Error messages, Invoking Gforth, Top
  @chapter Forth Words
- @cindex Words
+ @cindex words
  @menu
  * Notation::
- Line 1912  doc-bye
+ Line 1938  doc-bye
  * Stack Manipulation::
  * Memory::
  * Control Structures::
- * Locals::
  * Defining Words::
  * The Text Interpreter::
- * Structures::
- * Object-oriented Forth::
  * Tokens for Words::
  * Word Lists::
  * Environmental Queries::
  * Files::
- * Including Files::
  * Blocks::
  * Other I/O::
  * Programming Tools::
  * Assembler and Code Words::
  * Threading Words::
+ * Locals::
+ * Structures::
+ * Object-oriented Forth::
  * Passing Commands to the OS::
  * Miscellaneous Words::
  @end menu
- Line 1948  that has become a de-facto standard for
+ Line 1973  that has become a de-facto standard for
  @table @var
  @item word
- @cindex case insensitivity
+ @cindex case-sensitivity
- The name of the word. BTW, Gforth is case insensitive, so you can
+ The name of the word. Gforth is case-insensitive, so you can type the
- type the words in in lower case (However, @pxref{core-idef}).
+ words in lower case (however, @pxref{core-idef,
+ Implementation-defined options, Implementation-defined options}).
  @item Stack effect
  @cindex stack effect
- Line 1982  The ANS Forth standard is divided into s
+ Line 2008  The ANS Forth standard is divided into s
  system need not support all of them. Therefore, in theory, the fewer
  word sets your program uses the more portable it will be. However, we
  suspect that most ANS Forth systems on personal machines will feature
- all word sets. Words that are not defined in the ANS standard have
+ all word sets. Words that are not defined in ANS Forth have
  @code{gforth} or @code{gforth-internal} as word set. @code{gforth}
  describes words that will work in future releases of Gforth;
  @code{gforth-internal} words are more volatile. Environmental query
- Line 2056  quotes.
+ Line 2082  quotes.
  @node Comments, Boolean Flags, Notation, Words
  @section Comments
- @cindex Comments
+ @cindex comments
- Forth supports two styles of comment; the traditional "in-line" comment,
+ Forth supports two styles of comment: the traditional @var{in-line} comment,
- @code{(} and its modern cousin, the "comment to end of line"; @code{\}.
+ @code{(} and its modern cousin, the @var{comment to end of line}, @code{\}.
  doc-(
  doc-\
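A minimal sketch of both comment styles in a single definition. The word
@code{add3} is a hypothetical name used only for illustration.

```forth
: add3 ( n1 n2 n3 -- sum )  \ in-line stack comment, then a comment to end of line
    + + ;                   \ add the top two items, then add the third
1 2 3 add3 .                \ prints 6
```

The parenthesised comment ends at the closing @code{)}, so code may follow it
on the same line; everything after @code{\} is ignored up to the end of the line.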
- Line 2067  doc-\G
+ Line 2093  doc-\G
  @node Boolean Flags, Arithmetic, Comments, Words
  @section Boolean Flags
- @cindex Boolean Flags
+ @cindex Boolean flags
  A Boolean flag is cell-sized. A cell with all bits clear represents the
  flag @code{false} and a cell with all bits set represents the flag
- @code{true}. Words that check a flag (for example, @var{IF}) will treat
+ @code{true}. Words that check a flag (for example, @code{IF}) will treat
  a cell that has @var{any} bit set as @code{true}.
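The following sketch illustrates the rule that any non-zero cell is treated
as true by a flag-checking word. The word @code{truthy?} is a hypothetical
helper, not part of Gforth.

```forth
: truthy? ( x -- flag )     \ normalise any cell to a canonical flag
    if true else false then ;
0  truthy? .                \ all bits clear: prints 0
-1 truthy? .                \ all bits set:   prints -1
4  truthy? .                \ one bit set:    prints -1
```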
  doc-true
- Line 2092  operators. If you perform division with
+ Line 2118  operators. If you perform division with
  you do not want to use @code{/} or @code{/mod} with its undefined
  behaviour, but rather @code{fm/mod} or @code{sm/rem} (probably the
  former, @pxref{Mixed precision}).
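As a small illustration of why the choice matters, the sketch below divides
-7 by 2 with the floored word @code{fm/mod} and with the symmetric word, which
ANS Forth names @code{sm/rem}. Both take a double-cell dividend, hence the
@code{s>d}.

```forth
-7 s>d 2 fm/mod . .   \ floored:   prints -4 1   (quotient, then remainder)
-7 s>d 2 sm/rem . .   \ symmetric: prints -3 -1  (quotient, then remainder)
```

Both words leave the remainder below the quotient, so the first @code{.}
prints the quotient.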
+ @comment TODO discuss the different division forms and the std approach
  @menu
  * Single precision::
- Line 2107  former, @pxref{Mixed precision}).
+ Line 2134  former, @pxref{Mixed precision}).
  @cindex single precision arithmetic words
  By default, numbers in Forth are single-precision integers that are 1
- CELL in size. They can be signed or unsigned, depending upon how you
+ cell in size. They can be signed or unsigned, depending upon how you
  treat them. @xref{Number Conversion}, for the rules used by the text
  interpreter for recognising single-precision integers.
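A short sketch of what "depending upon how you treat them" means in practice:
the same cell compares differently under the signed word @code{<} and the
unsigned word @code{u<}.

```forth
-1 1 <  .   \ signed comparison:   -1 is less than 1, prints -1 (true)
-1 1 u< .   \ unsigned comparison: all bits set is the largest value, prints 0
```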
- Line 2148  doc-d2/
+ Line 2175  doc-d2/
  recognising double-precision integers.
  A double precision number is represented by a cell pair, with the most
- significant digit at the TOS. It is trivial to convert an unsigned single
+ significant cell at the TOS. It is trivial to convert an unsigned
- to an (unsigned) double; simply push a @code{0} onto the TOS. Since numbers
+ single to an (unsigned) double; simply push a @code{0} onto the
- are represented by Gforth using 2's complement arithmetic, converting
+ TOS. Since numbers are represented by Gforth using 2's complement
- a signed single to a (signed) double requires sign-extension across the
+ arithmetic, converting a signed single to a (signed) double requires
- most significant digit. This can be achieved using @code{s>d}. The moral
+ sign-extension across the most significant cell. This can be achieved
- of the story is that you cannot convert a number without knowing what that
+ using @code{s>d}. The moral of the story is that you cannot convert a
- number represents.
+ number without knowing whether it represents an unsigned or a
+ signed number.
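The difference between the two conversions can be sketched as follows; the
last line shows what goes wrong when a signed single is widened as though it
were unsigned.

```forth
5 0 d.      \ unsigned single widened by pushing 0: prints 5
-5 s>d d.   \ signed single sign-extended with s>d: prints -5
-5 0 d.     \ wrong conversion: prints a large positive number
```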
  doc-s>d
  doc-d+
- Line 2228  recognising floating-point numbers.
+ Line 2256  recognising floating-point numbers.
  @cindex angles in trigonometric operations
  @cindex trigonometric operations
  Angles in floating point operations are given in radians (a full circle
- has 2 pi radians). Note, that Gforth has a separate floating point
+ has 2 pi radians). Gforth has a separate floating point
- stack, but we use the unified notation.
+ stack, but the documentation uses the unified notation.
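A minimal sketch of working in radians. The word @code{deg>rad} is a
hypothetical helper, and @code{pi} is assumed to be available as a
floating-point constant, as it is in Gforth.

```forth
: deg>rad ( F: degrees -- radians )   \ 180 degrees equals pi radians
    pi f* 180e f/ ;
90e deg>rad fsin f.                   \ sine of a quarter circle
```

The operands never touch the data stack here; they live on the separate
floating-point stack.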
  @cindex floating-point arithmetic, pitfalls
  Floating point numbers have a number of unpleasant surprises for the
- Line 2398  doc-2rdrop
+ Line 2426  doc-2rdrop
  @node Locals stack, Stack pointer manipulation, Return stack, Stack Manipulation
  @subsection Locals stack
+ @comment TODO
  @node Stack pointer manipulation,  , Locals stack, Stack Manipulation
  @subsection Stack pointer manipulation
- Line 2421  doc-lp!
+ Line 2450  doc-lp!
  @node Memory, Control Structures, Stack Manipulation, Words
  @section Memory
- @cindex Memory words
+ @cindex memory words
  @menu
  * Memory Access::
- Line 2475  char-aligned have no use in the standard
+ Line 2504  char-aligned have no use in the standard
  created.
  @cindex @code{CREATE} and alignment
- The standard guarantees that addresses returned by @code{CREATE}d words
+ ANS Forth guarantees that addresses returned by @code{CREATE}d words
  are cell-aligned; in addition, Gforth guarantees that these addresses
  are aligned for all purposes.
- Note that the standard defines a word @code{char}, which has nothing to
+ Note that the ANS Forth word @code{char} has nothing to do with address
- do with address arithmetic.
+ arithmetic.
  doc-chars
  doc-char+
- Line 2542  doc-blank
+ Line 2571  doc-blank
  doc-compare
  doc-search
- @node Control Structures, Locals, Memory, Words
+ @node Control Structures, Defining Words, Memory, Words
  @section Control Structures
  @cindex control structures
- Line 2605  and many other programming languages has
+ Line 2634  and many other programming languages has
  Gforth also provides the words @code{?DUP-IF} and @code{?DUP-0=-IF}, so
  you can avoid using @code{?dup}. Using these alternatives is also more
- efficient than using @code{?dup}. Definitions in ANS Standard Forth
+ efficient than using @code{?dup}. Definitions in ANS Forth
  for @code{ENDIF}, @code{?DUP-IF} and @code{?DUP-0=-IF} are provided in
  @file{compat/control.fs}.
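As a sketch of the Gforth-specific @code{?DUP-IF}, the hypothetical word
@code{safe-divisor} below replaces a zero divisor with 1 and leaves any other
value unchanged. With standard words it would be written
@code{?dup if else 1 then}.

```forth
: safe-divisor ( n -- n' )   \ n' is n, or 1 if n was zero
    ?dup-if else 1 then ;
5 safe-divisor .             \ prints 5
0 safe-divisor .             \ prints 1
```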
- Line 2804  prints nothing.
+ Line 2833  prints nothing.
  @end itemize
  Unfortunately, @code{+DO}, @code{U+DO}, @code{-DO}, @code{U-DO} and
- @code{-LOOP} are not in the ANS Forth standard. However, an
+ @code{-LOOP} are not defined in ANS Forth. However, an implementation
- implementation for these words that uses only standard words is provided
+ for these words that uses only standard words is provided in
- in @file{compat/loops.fs}.
+ @file{compat/loops.fs}.
  @cindex @code{FOR} loops
- Another counted loop is
+ Another counted loop is:
  @example
  @var{n}
  FOR
- Line 2819  FOR
+ Line 2847  FOR
  NEXT
  @end example
  This is the preferred loop of native code compiler writers who are too
- lazy to optimize @code{?DO} loops properly. In Gforth, this loop
+ lazy to optimize @code{?DO} loops properly. This loop structure is not
- iterates @var{n+1} times; @code{i} produces values starting with @var{n}
+ defined in ANS Forth. In Gforth, this loop iterates @var{n+1} times;
- and ending with 0. Other Forth systems may behave differently, even if
+ @code{i} produces values starting with @var{n} and ending with 0. Other
- they support @code{FOR} loops. To avoid problems, don't use @code{FOR}
+ Forth systems may behave differently, even if they support @code{FOR}
- loops.
+ loops. To avoid problems, don't use @code{FOR} loops.
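For illustration only, the following sketch shows the Gforth behaviour just
described. The words @code{countdown} and @code{for-count} are hypothetical,
and other systems may give different results.

```forth
: countdown ( n -- )             \ print the index on every iteration
    for i . next ;
: for-count ( n -- iterations )  \ count how often the loop body runs
    0 swap for 1+ next ;
3 countdown                      \ in Gforth: prints 3 2 1 0
3 for-count .                    \ in Gforth: prints 4
```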
  @node Arbitrary control structures, Calls and returns, Counted Loops, Control Structures
  @subsection Arbitrary control structures
- Line 2857  would need to know how many stack items
+ Line 2885  would need to know how many stack items
  entry (many systems use one cell. In Gforth they currently take three,
  but this may change in the future).
  Some standard control structure words are built from these words:
  doc-else
- Line 2895  through the definition (@code{LOOP} etc.
+ Line 2922  through the definition (@code{LOOP} etc.
  fall-through path). Also, you have to ensure that all @code{LEAVE}s are
  resolved (by using one of the loop-ending words or @code{DONE}).
- Another group of control structure words are
+ Another group of control structure words are:
  doc-case
  doc-endcase
- Line 2910  doc-endof
+ Line 2937  doc-endof
  In order to ensure readability we recommend that you do not create
  arbitrary control structures directly, but define new control structure
  words for the control structure you want and use these words in your
- program.
+ program. For example, instead of writing:
- E.g., instead of writing:
  @example
- begin
+ BEGIN
    ...
- if [ 1 cs-roll ]
+ IF [ 1 CS-ROLL ]
    ...
- again then
+ AGAIN THEN
  @end example
  @noindent
  we recommend defining control structure words, e.g.,
  @example
- : while ( dest -- orig dest )
+ : WHILE ( DEST -- ORIG DEST )
-  POSTPONE if
+  POSTPONE IF
-  1 cs-roll ; immediate
+  1 CS-ROLL ; immediate
- : repeat ( orig dest -- )
+ : REPEAT ( orig dest -- )
-  POSTPONE again
+  POSTPONE AGAIN
-  POSTPONE then ; immediate
+  POSTPONE THEN ; immediate
  @end example
  @noindent
  and then using these to create the control structure:
  @example
- begin
+ BEGIN
    ...
- while
+ WHILE
    ...
- repeat
+ REPEAT
  @end example
  That's much easier to read, isn't it? Of course, @code{REPEAT} and
- Line 2957  necessary to define them.
+ Line 2982  necessary to define them.
  @cindex recursive definitions
  A definition can be called simply by writing the name of the definition
- to be called. Note that normally a definition is invisible during its
+ to be called. Normally a definition is invisible during its own
  definition. If you want to write a directly recursive definition, you
- can use @code{recursive} to make the current definition visible.
+ can use @code{recursive} to make the current definition visible, or
+ @code{recurse} to call the current definition directly.
  doc-recursive
- Another way to perform a recursive call is
  doc-recurse
  @comment TODO add example of the two recursion methods
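A sketch of the two recursion methods, using a factorial as the example. The
names @code{fact1} and @code{fact2} are hypothetical.

```forth
: fact1 ( n -- n! )  recursive   \ make fact1 visible inside its own definition
    dup 1 > if  dup 1- fact1 *  else  drop 1  then ;
: fact2 ( n -- n! )              \ recurse calls the current definition
    dup 1 > if  dup 1- recurse *  else  drop 1  then ;
5 fact1 .                        \ prints 120
5 fact2 .                        \ prints 120
```

@code{recurse} is an ANS Forth word, whereas @code{recursive} is a Gforth
extension; the first form can be easier to read because the call is by name.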
- Line 2993  defer foo
+ Line 3016  defer foo
  IS foo
  @end example
- When the end of the definition is reached, it returns. An earlier return
+ The current definition returns control to the calling definition when
- can be forced using
+ the end of the definition is reached or @code{EXIT} is encountered.
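A minimal sketch of early returns with @code{EXIT}. The word @code{sgn} is a
hypothetical example that returns -1, 0 or 1 according to the sign of its
argument.

```forth
: sgn ( n -- -1|0|1 )
    dup 0< if  drop -1 exit  then   \ negative: return early
    0= if  0 exit  then             \ zero: return early
    1 ;                             \ positive: return at the end
-7 sgn .    \ prints -1
0 sgn .     \ prints 0
9 sgn .     \ prints 1
```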
  doc-exit
- Don't forget to clean up the return stack and @code{UNLOOP} any
- outstanding @code{?DO}...@code{LOOP}s before @code{EXIT}ing.
  doc-;s
  @node Exception Handling,  , Calls and returns, Control Structures
  @subsection Exception Handling
- @cindex Exceptions
+ @cindex exceptions
- @comment TODO examples and blurb
- doc-catch
- doc-throw
- @comment TODO -- think this will alllcate you a new THROW code?
- @comment for reserving new exception numbers. Note the existence of compat/exception.fs
- doc---exception-exception
- doc-quit
- doc-abort
- doc-abort"
+ If your program detects a fatal error condition, the simplest action
+ that it can take is to @code{quit}. This resets the return stack and
+ restarts the text interpreter, but does not print any error message.
- @node Locals, Defining Words, Control Structures, Words
+ The next stage in severity is to execute @code{abort}, which has the
- @section Locals
+ same effect as @code{quit}, with the addition that it resets the data
- @cindex locals
+ stack.
- Local variables can make Forth programming more enjoyable and Forth
- programs easier to read. Unfortunately, the locals of ANS Forth are
- laden with restrictions. Therefore, we provide not only the ANS Forth
- locals wordset, but also our own, more powerful locals wordset (we
- implemented the ANS Forth locals wordset through our locals wordset).
- The ideas in this section have also been published in the paper
+ A slightly more sophisticated approach is to use @code{abort"}, which
- @cite{Automatic Scoping of Local Variables} by M. Anton Ertl, presented
+ compiles a string to be used as an error message and does a conditional
- at EuroForth '94; it is available at
+ @code{abort} at run-time. For example:
- @*@url{http://www.complang.tuwien.ac.at/papers/ertl94l.ps.gz}.
- @menu
+ @example
- * Gforth locals::
+ @kbd{: checker abort" That flag was true" ." A false flag" ;<return>}  ok
- * ANS Forth locals::
+ @kbd{0 checker<return>} A false flag ok
- @end menu
+ @kbd{1 checker<return>}
+ :1: That flag was true
+ 1 checker
+   ^^^^^^^
+ $400D1648 throw
+ $400E4660
+ @end example
- @node Gforth locals, ANS Forth locals, Locals, Locals
+ These simple techniques allow a program to react to a fatal error
- @subsection Gforth locals
+ condition, but they are not exactly user-friendly. The ANS Forth
- @cindex Gforth locals
+ Exception word set provides the pair of words @code{throw} and
- @cindex locals, Gforth style
+ @code{catch}, which can be used to provide sophisticated error-handling.
- Locals can be defined with
+ @code{catch} has a similar behaviour to @code{execute}, in that it takes
+ an @var{xt} as a parameter and starts execution of the xt. However,
+ before passing control to the xt, @code{catch} pushes an
+ @var{exception frame} onto the @var{exception stack}. This exception
+ frame is used to restore the system to a known state if a detected error
+ occurs during the execution of the xt. A typical way to use @code{catch}
+ would be:
  @example
- @{ local1 local2 ... -- comment @}
+ ... ['] foo catch IF ...
  @end example
- or
+ Whilst @code{foo} executes, it can call other words to any level of
+ nesting, as usual.  If @code{foo} (and all the words that it calls)
+ execute successfully, control will ultimately pass to the word following
+ the @code{catch}, and there will be a 0 (a @code{false} flag) at
+ TOS. However, if any word detects an error, it can terminate the
+ execution of @code{foo} by pushing an error code onto the stack and then
+ performing a @code{throw}. The execution of @code{throw} will pass
+ control to the word following the @code{catch}, but this time the TOS
+ will hold the error code. Therefore, the @code{IF} in the example
+ can be used to determine whether @code{foo} executed successfully.
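The pattern can be sketched as follows. The words @code{must-be-positive} and
@code{try-positive}, and the throw code 100, are hypothetical choices made for
this illustration.

```forth
: must-be-positive ( n -- n )   \ throw code 100 for negative input
    dup 0< if  100 throw  then ;
: try-positive ( n -- n | -1 )  \ return n, or -1 if an error was thrown
    ['] must-be-positive catch
    if  drop -1  then ;         \ non-zero code: discard the restored cell
5 try-positive .                \ prints 5
-5 try-positive .               \ prints -1
```

After a @code{throw}, the stack depth is restored to what it was when
@code{catch} was called, but the contents of those cells are not guaranteed;
that is why the error branch drops one cell before pushing its own result.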
+ This simple example shows how you can use @code{throw} and @code{catch}
+ to ``take over'' exception handling from the system:
  @example
- @{ local1 local2 ... @}
+ : my-div ['] / catch if ." DIVIDE ERROR" else ." OK.. " . then ;
  @end example
- E.g.,
+ The next example is more sophisticated and shows a multi-level
+ @code{throw} and @code{catch}. To understand this example, start at the
+ definition of @code{top-level} and work backwards:
  @example
- : max @{ n1 n2 -- n3 @}
+ : lowest-level ( -- c )
-  n1 n2 > if
+     key dup 27 = if
-    n1
+         1 throw \ ESCAPE key pressed
-  else
+     else
-    n2
+         ." lowest-level successful" CR
-  endif ;
+     then
+ ;
+ : lower-level ( -- c )
+     lowest-level
+     \ at this level consider a CTRL-U to be a fatal error
+     dup 21 = if \ CTRL-U
+         throw
+     else
+         ." lower-level successful" CR
+     then
+ ;
+ : low-level ( -- c )
+     ['] lower-level catch
+     ?dup if
+         \ error occurred - do we recognise it?
+         dup 1 = if
+             \ ESCAPE key pressed.. pretend it was an E
+             drop [char] E
+         else throw \ propagate the error upwards
+         then
+     then
+     ." low-level successful" CR
+ ;
+ : top-level ( -- )
+     CR ['] low-level catch \ CATCH is used like EXECUTE
+     ?dup if \ error occurred..
+         ." Error " . ." occurred - contact your supplier"
+     else
+         ." The '" emit ." ' key was pressed" CR
+     then
+ ;
  @end example
- The similarity of locals definitions with stack comments is intended. A
+ The ANS Forth document assigns @code{throw} codes thus:
- locals definition often replaces the stack comment of a word. The order
- of the locals corresponds to the order in a stack comment and everything
- after the @code{--} is really a comment.
- This similarity has one disadvantage: It is too easy to confuse locals
+ @itemize @bullet
- declarations with stack comments, causing bugs and making them hard to
+ @item
- find. However, this problem can be avoided by appropriate coding
+ codes in the range -1 -- -255 are reserved to be assigned by the
- conventions: Do not use both notations in the same program. If you do,
+ Standard. Assignments for codes in the range -1 -- -58 are currently
- they should be distinguished using additional means, e.g. by position.
+ documented in the Standard. In particular, @code{-1 throw} is equivalent
+ to @code{abort} and @code{-2 throw} is equivalent to @code{abort"}.
+ @item
+ codes in the range -256 -- -4095 are reserved to be assigned by the system.
+ @item
+ all other codes may be assigned by programs.
+ @end itemize
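A short sketch of these ranges in use. The words @code{try-abort} and
@code{fail}, and the code 1000, are hypothetical; the first result assumes a
system such as Gforth in which @code{abort} is implemented as @code{-1 throw}.

```forth
: try-abort ( -- n )  ['] abort catch ;   \ catch the code used by abort
: fail ( -- )  1000 throw ;               \ a positive, program-assigned code
try-abort .                               \ prints -1
' fail catch .                            \ prints 1000
```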
- @cindex types of locals
+ Gforth provides the word @code{exception} as a mechanism for assigning
- @cindex locals types
+ system throw codes to applications. This allows multiple applications to
- The name of the local may be preceded by a type specifier, e.g.,
+ co-exist in memory without any clash of @code{throw} codes. A definition
- @code{F:} for a floating point value:
+ of @code{exception} in ANS Forth is provided in
+ @file{compat/exception.fs}.
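A sketch of how an application might reserve its own code, assuming that
@code{exception} has the stack effect @code{( addr u -- n )} and returns a
previously unused code from the system range. The names @code{widget-error}
and @code{need-widget} are hypothetical.

```forth
s" out of widgets" exception constant widget-error  \ reserve a fresh code
: need-widget ( -- )  widget-error throw ;          \ signal the error
' need-widget catch widget-error = .                \ prints -1 (true)
```

Because the code is allocated at load time, it should always be referred to
by name and never written as a literal number.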
- @example
- : CX* @{ F: Ar F: Ai F: Br F: Bi -- Cr Ci @}
- \ complex multiplication
-  Ar Br f* Ai Bi f* f-
-  Ar Bi f* Ai Br f* f+ ;
- @end example
- @cindex flavours of locals
+ doc-quit
- @cindex locals flavours
+ doc-abort
- @cindex value-flavoured locals
+ doc-abort"
- @cindex variable-flavoured locals
- Gforth currently supports cells (@code{W:}, @code{W^}), doubles
- (@code{D:}, @code{D^}), floats (@code{F:}, @code{F^}) and characters
- (@code{C:}, @code{C^}) in two flavours: a value-flavoured local (defined
- with @code{W:}, @code{D:} etc.) produces its value and can be changed
- with @code{TO}. A variable-flavoured local (defined with @code{W^} etc.)
- produces its address (which becomes invalid when the variable's scope is
- left). E.g., the standard word @code{emit} can be defined in terms of
- @code{type} like this:
- @example
+ doc-catch
- : emit @{ C^ char* -- @}
+ doc-throw
-     char* 1 type ;
+ doc---exception-exception
- @end example
- @cindex default type of locals
- @cindex locals, default type
- A local without type specifier is a @code{W:} local. Both flavours of
- locals are initialized with values from the data or FP stack.
- Currently there is no way to define locals with user-defined data
+ @c -------------------------------------------------------------
- structures, but we are working on it.
+ @node Defining Words, The Text Interpreter, Control Structures, Words
+ @section Defining Words
+ @cindex defining words
- Gforth allows defining locals everywhere in a colon definition. This
+ @comment TODO much more intro material here. 3 classes: colon defn, variables/constants
- poses the following questions:
+ @comment values, user-defined defining words.
  @menu
- * Where are locals visible by name?::
+ * Simple Defining Words::
- * How long do locals live?::
+ * Colon Definitions::
- * Programming Style::
+ * User-defined Defining Words::
- * Implementation::
+ * Supplying names::
+ * Interpretation and Compilation Semantics::
  @end menu
- @node Where are locals visible by name?, How long do locals live?, Gforth locals, Gforth locals
+ @node Simple Defining Words, Colon Definitions, Defining Words, Defining Words
- @subsubsection Where are locals visible by name?
+ @subsection Simple Defining Words
- @cindex locals visibility
+ @cindex simple defining words
- @cindex visibility of locals
+ @cindex defining words, simple
- @cindex scope of locals
- Basically, the answer is that locals are visible where you would expect
- it in block-structured languages, and sometimes a little longer. If you
- want to restrict the scope of a local, enclose its definition in
- @code{SCOPE}...@code{ENDSCOPE}.
- doc-scope
+ doc-constant
- doc-endscope
+ doc-2constant
+ doc-fconstant
+ doc-variable
+ doc-2variable
+ doc-fvariable
+ doc-create
+ doc-user
+ doc-value
+ doc-to
+ doc-defer
+ doc-is
- These words behave like control structure words, so you can use them
+ Definitions in ANS Forth for @code{defer}, @code{<is>} and
- with @code{CS-PICK} and @code{CS-ROLL} to restrict the scope in
+ @code{[is]} are provided in @file{compat/defer.fs}.
- arbitrary ways.
+ @comment TODO - what do the two "is" words do?
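A brief sketch of some of these defining words in use. All of the names
defined here (@code{ten}, @code{counter}, @code{limit}, @code{greet} and
@code{hello}) are hypothetical.

```forth
10 constant ten            \ ten pushes 10
variable counter           \ counter pushes the address of a cell
0 counter !  1 counter +!  \ initialise, then increment
5 value limit              \ limit pushes 5 ...
7 to limit                 \ ... until TO changes it to 7
defer greet                \ greet has no behaviour yet
: hello ( -- )  ." hello" ;
' hello is greet           \ now greet behaves like hello
ten . counter @ . limit .  \ prints 10 1 7
greet                      \ prints hello
```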
- If you want a more exact answer to the visibility question, here's the
+ @node Colon Definitions, User-defined Defining Words, Simple Defining Words, Defining Words
- basic principle: A local is visible in all places that can only be
+ @subsection Colon Definitions
- reached through the definition of the local@footnote{In compiler
+ @cindex colon definitions
- construction terminology, all places dominated by the definition of the
- local.}. In other words, it is not visible in places that can be reached
- without going through the definition of the local. E.g., locals defined
- in @code{IF}...@code{ENDIF} are visible until the @code{ENDIF}, locals
- defined in @code{BEGIN}...@code{UNTIL} are visible after the
- @code{UNTIL} (until, e.g., a subsequent @code{ENDSCOPE}).
- The reasoning behind this solution is: We want to have the locals
+ @example
- visible as long as it is meaningful. The user can always make the
+ : name ( ... -- ... )
- visibility shorter by using explicit scoping. In a place that can
+     word1 word2 word3 ;
- only be reached through the definition of a local, the meaning of a
+ @end example
- local name is clear. In other places it is not: How is the local
- initialized at the control flow path that does not contain the
- definition? Which local is meant, if the same name is defined twice in
- two independent control flow paths?
- This should be enough detail for nearly all users, so you can skip the
+ creates a word called @code{name}, that, upon execution, executes
- rest of this section. If you really must know all the gory details and
+ @code{word1 word2 word3}. @code{name} is a @dfn{(colon) definition}.
- options, read on.
- In order to implement this rule, the compiler has to know which places
+ The explanation above is somewhat superficial. @xref{Interpretation and
- are unreachable. It knows this automatically after @code{AHEAD},
+ Compilation Semantics}, for an in-depth discussion of some of the issues
- @code{AGAIN}, @code{EXIT} and @code{LEAVE}; in other cases (e.g., after
+ involved.
- most @code{THROW}s), you can use the word @code{UNREACHABLE} to tell the
- compiler that the control flow never reaches that place. If
- @code{UNREACHABLE} is not used where it could, the only consequence is
- that the visibility of some locals is more limited than the rule above
- says. If @code{UNREACHABLE} is used where it should not (i.e., if you
- lie to the compiler), buggy code will be produced.
- doc-unreachable
+ doc-:
+ doc-;
- Another problem with this rule is that at @code{BEGIN}, the compiler
+ @node User-defined Defining Words, Supplying names, Colon Definitions, Defining Words
- does not know which locals will be visible on the incoming
+ @subsection User-defined Defining Words
- back-edge. All problems discussed in the following are due to this
+ @cindex user-defined defining words
- ignorance of the compiler (we discuss the problems using @code{BEGIN}
+ @cindex defining words, user-defined
- loops as examples; the discussion also applies to @code{?DO} and other
- loops). Perhaps the most insidious example is:
- @example
- AHEAD
- BEGIN
-   x
- [ 1 CS-ROLL ] THEN
-   @{ x @}
-   ...
- UNTIL
- @end example
- This should be legal according to the visibility rule. The use of
+ You can create new defining words simply by wrapping defining-time code
- @code{x} can only be reached through the definition; but that appears
+ around existing defining words and putting the sequence in a colon
- textually below the use.
+ definition.
- From this example it is clear that the visibility rules cannot be fully
+ @comment TODO example
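+ As a minimal sketch of this pattern (the word @code{init-variable} is
+ hypothetical and not part of Gforth), the following wraps defining-time
+ code around @code{Create} to build a variable with an initial value:
+ @example
+ \ init-variable is an illustrative name, not a Gforth word
+ : init-variable ( x "name" -- )
+     Create  \ parse "name" and create a word for it
+     , ;     \ defining-time code: store x in the body of "name"
+ 5 init-variable foo
+ foo @@ .  \ prints 5
+ @end example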
- implemented without major headaches. Our implementation treats common
- cases as advertised and the exceptions are treated in a safe way: The
- compiler makes a reasonable guess about the locals visible after a
- @code{BEGIN}; if it is too pessimistic, the
- user will get a spurious error about the local not being defined; if the
- compiler is too optimistic, it will notice this later and issue a
- warning. In the case above the compiler would complain about @code{x}
- being undefined at its use. You can see from the obscure examples in
- this section that it takes quite unusual control structures to get the
- compiler into trouble, and even then it will often do fine.
- If the @code{BEGIN} is reachable from above, the most optimistic guess
+ @cindex @code{CREATE} ... @code{DOES>}
- is that all locals visible before the @code{BEGIN} will also be
+ If you want the words defined with your defining words to behave
- visible after the @code{BEGIN}. This guess is valid for all loops that
+ differently from words defined with standard defining words, you can
- are entered only through the @code{BEGIN}, in particular, for normal
+ write your defining word like this:
- @code{BEGIN}...@code{WHILE}...@code{REPEAT} and
- @code{BEGIN}...@code{UNTIL} loops and it is implemented in our
- compiler. When the branch to the @code{BEGIN} is finally generated by
- @code{AGAIN} or @code{UNTIL}, the compiler checks the guess and
- warns the user if it was too optimistic:
- @example
- IF
-   @{ x @}
- BEGIN
-   \ x ?
- [ 1 cs-roll ] THEN
-   ...
- UNTIL
- @end example
- Here, @code{x} lives only until the @code{BEGIN}, but the compiler
- optimistically assumes that it lives until the @code{THEN}. It notices
- this difference when it compiles the @code{UNTIL} and issues a
- warning. The user can avoid the warning, and make sure that @code{x}
- is not used in the wrong area by using explicit scoping:
  @example
- IF
+ : def-word ( "name" -- )
-   SCOPE
+     Create @var{code1}
-   @{ x @}
+ DOES> ( ... -- ... )
-   ENDSCOPE
+     @var{code2} ;
- BEGIN
- [ 1 cs-roll ] THEN
-   ...
- UNTIL
- @end example
- Since the guess is optimistic, there will be no spurious error messages
+ def-word name
- about undefined locals.
+ @end example
- If the @code{BEGIN} is not reachable from above (e.g., after
+ Technically, this fragment defines a defining word @code{def-word}, and
- @code{AHEAD} or @code{EXIT}), the compiler cannot even make an
+ a word @code{name}; when you execute @code{name}, the address of the
- optimistic guess, as the locals visible after the @code{BEGIN} may be
+ body of @code{name} is put on the data stack and @var{code2} is executed
- defined later. Therefore, the compiler assumes that no locals are
+ (the address of the body of @code{name} is the address @code{HERE}
- visible after the @code{BEGIN}. However, the user can use
+ returns immediately after the @code{CREATE}). The word @code{name} is
- @code{ASSUME-LIVE} to make the compiler assume that the same locals are
+ sometimes called a @var{child} of @code{def-word}.
- visible at the BEGIN as at the point where the top control-flow stack
- item was created.
- doc-assume-live
+ In other words, if you make the following definitions:
- E.g.,
  @example
- @{ x @}
+ : def-word1 ( "name" -- )
- AHEAD
+     Create @var{code1} ;
- ASSUME-LIVE
- BEGIN
+ : action1 ( ... -- ... )
-   x
+     @var{code2} ;
- [ 1 CS-ROLL ] THEN
-   ...
+ def-word1 name1
- UNTIL
  @end example
- Other cases where the locals are defined before the @code{BEGIN} can be
+ Using @code{name1 action1} is equivalent to using @code{name}.
- handled by inserting an appropriate @code{CS-ROLL} before the
- @code{ASSUME-LIVE} (and changing the control-flow stack manipulation
+ The classic example is that you can define @code{Constant} in this way:
- behind the @code{ASSUME-LIVE}).
- Cases where locals are defined after the @code{BEGIN} (but should be
- visible immediately after the @code{BEGIN}) can only be handled by
- rearranging the loop. E.g., the ``most insidious'' example above can be
- arranged into:
  @example
- BEGIN
+ : constant ( w "name" -- )
-   @{ x @}
+     create ,
-   ... 0=
+ DOES> ( -- w )
- WHILE
+     @@ ;
-   x
- REPEAT
  @end example
- @node How long do locals live?, Programming Style, Where are locals visible by name?, Gforth locals
+ @comment that is the classic example.. maybe it should be earlier. There
- @subsubsection How long do locals live?
+ @comment is a beautiful description of how this works and what it does in
- @cindex locals lifetime
+ @comment the Forthwrite 100th edition.
- @cindex lifetime of locals
- The right answer for the lifetime question would be: A local lives at
- least as long as it can be accessed. For a value-flavoured local this
- means: until the end of its visibility. However, a variable-flavoured
- local could be accessed through its address far beyond its visibility
- scope. Ultimately, this would mean that such locals would have to be
- garbage collected. Since this entails un-Forth-like implementation
- complexities, I adopted the same cowardly solution as some other
- languages (e.g., C): The local lives only as long as it is visible;
- afterwards its address is invalid (and programs that access it
- afterwards are erroneous).
- @node Programming Style, Implementation, How long do locals live?, Gforth locals
+ When you create a constant with @code{5 constant five}, first a new word
- @subsubsection Programming Style
+ @code{five} is created, then the value 5 is laid down in the body of
- @cindex locals programming style
+ @code{five} with @code{,}. When @code{five} is invoked, the address of
- @cindex programming style, locals
+ the body is put on the stack, and @code{@@} retrieves the value 5.
- The freedom to define locals anywhere has the potential to change
+ @cindex stack effect of @code{DOES>}-parts
- programming styles dramatically. In particular, the need to use the
+ @cindex @code{DOES>}-parts, stack effect
- return stack for intermediate storage vanishes. Moreover, all stack
+ In the example above the stack comment after the @code{DOES>} specifies
- manipulations (except @code{PICK}s and @code{ROLL}s with run-time
+ the stack effect of the defined words, not the stack effect of the
- determined arguments) can be eliminated: If the stack items are in the
+ following code (the following code expects the address of the body on
- wrong order, just write a locals definition for all of them; then
+ the top of stack, which is not reflected in the stack comment). This is
- write the items in the order you want.
+ the convention that I use and recommend (it clashes a bit with using
+ locals declarations for stack effect specification, though).
- This seems a little far-fetched and eliminating stack manipulations is
+ @subsubsection Applications of @code{CREATE..DOES>}
- unlikely to become a conscious programming objective. Still, the number
+ @cindex @code{CREATE} ... @code{DOES>}, applications
- of stack manipulations will be reduced dramatically if local variables
- are used liberally (e.g., compare @code{max} in @ref{Gforth locals} with
- a traditional implementation of @code{max}).
- This shows one potential benefit of locals: making Forth programs more
+ You may wonder how to use this feature. Here are some usage patterns:
- readable. Of course, this benefit will only be realized if the
- programmers continue to honour the principle of factoring instead of
- using the added latitude to make the words longer.
- @cindex single-assignment style for locals
+ @cindex factoring similar colon definitions
- Using @code{TO} can and should be avoided.  Without @code{TO},
+ When you see a sequence of code occurring several times, and you can
- every value-flavoured local has only a single assignment and many
+ identify a meaning, you will factor it out as a colon definition. When
- advantages of functional languages apply to Forth. I.e., programs are
+ you see similar colon definitions, you can factor them using
- easier to analyse, to optimize and to read: It is clear from the
+ @code{CREATE..DOES>}. E.g., an assembler usually defines several words
- definition what the local stands for, it does not turn into something
+ that look very similar:
- different later.
+ @example
+ : ori, ( reg-target reg-source n -- )
+     0 asm-reg-reg-imm ;
+ : andi, ( reg-target reg-source n -- )
+     1 asm-reg-reg-imm ;
+ @end example
- E.g., a definition using @code{TO} might look like this:
+ @noindent
+ This could be factored with:
  @example
- : strcmp @{ addr1 u1 addr2 u2 -- n @}
+ : reg-reg-imm ( op-code -- )
-  u1 u2 min 0
+     CREATE ,
-  ?do
+ DOES> ( reg-target reg-source n -- )
-    addr1 c@@ addr2 c@@ -
+     @@ asm-reg-reg-imm ;
-    ?dup-if
-      unloop exit
+ 0 reg-reg-imm ori,
-    then
+ 1 reg-reg-imm andi,
-    addr1 char+ TO addr1
-    addr2 char+ TO addr2
-  loop
-  u1 u2 - ;
  @end example
- Here, @code{TO} is used to update @code{addr1} and @code{addr2} at
- every loop iteration. @code{strcmp} is a typical example of the
- readability problems of using @code{TO}. When you start reading
- @code{strcmp}, you think that @code{addr1} refers to the start of the
- string. Only near the end of the loop you realize that it is something
- else.
- This can be avoided by defining two locals at the start of the loop that
+ @cindex currying
- are initialized with the right value for the current iteration.
+ Another view of @code{CREATE..DOES>} is to consider it as a crude way to
+ supply a part of the parameters for a word (known as @dfn{currying} in
+ the functional language community). E.g., @code{+} needs two
+ parameters. Creating versions of @code{+} with one parameter fixed can
+ be done like this:
  @example
- : strcmp @{ addr1 u1 addr2 u2 -- n @}
+ : curry+ ( n1 -- )
-  addr1 addr2
+     CREATE ,
-  u1 u2 min 0
+ DOES> ( n2 -- n1+n2 )
-  ?do @{ s1 s2 @}
+     @@ + ;
-    s1 c@@ s2 c@@ -
-    ?dup-if
-      unloop exit
-    then
-    s1 char+ s2 char+
-  loop
-  2drop
-  u1 u2 - ;
- @end example
- Here it is clear from the start that @code{s1} has a different value
- in every loop iteration.
- @node Implementation,  , Programming Style, Gforth locals
+ 3 curry+ 3+
- @subsubsection Implementation
+ -2 curry+ 2-
- @cindex locals implementation
+ @end example
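+ A short usage sketch of this pattern; the child word @code{10+} is an
+ illustrative name, and @code{curry+} is repeated from the example above
+ only so that the fragment is self-contained:
+ @example
+ : curry+ ( n1 -- )
+     CREATE ,   \ store the fixed parameter n1 in the child's body
+ DOES> ( n2 -- n1+n2 )
+     @@ + ;     \ fetch n1 from the body and add it to n2
+ 10 curry+ 10+  \ 10+ is a hypothetical child word
+ 5 10+ .        \ prints 15
+ @end example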
- @cindex implementation of locals
- @cindex locals stack
+ @subsubsection The gory details of @code{CREATE..DOES>}
- Gforth uses an extra locals stack. The most compelling reason for
+ @cindex @code{CREATE} ... @code{DOES>}, details
- this is that the return stack is not float-aligned; using an extra stack
- also eliminates the problems and restrictions of using the return stack
- as locals stack. Like the other stacks, the locals stack grows toward
- lower addresses. A few primitives allow an efficient implementation:
- doc-@local#
+ doc-does>
- doc-f@local#
- doc-laddr#
- doc-lp+!#
- doc-lp!
- doc->l
- doc-f>l
- In addition to these primitives, some specializations of these
+ @cindex @code{DOES>} in a separate definition
- primitives for commonly occurring inline arguments are provided for
+ This means that you need not use @code{CREATE} and @code{DOES>} in the
- efficiency reasons, e.g., @code{@@local0} as specialization of
+ same definition; you can put the @code{DOES>}-part in a separate
- @code{@@local#} for the inline argument 0. The following compiling words
+ definition. This allows us to, e.g., select among different @code{DOES>}-parts:
- compile the right specialized version, or the general version, as
+ @example
- appropriate:
+ : does1
+ DOES> ( ... -- ... )
+     ... ;
- doc-compile-@local
+ : does2
- doc-compile-f@local
+ DOES> ( ... -- ... )
- doc-compile-lp+!
+     ... ;
- Combinations of conditional branches and @code{lp+!#} like
+ : def-word ( ... -- ... )
- @code{?branch-lp+!#} (the locals pointer is only changed if the branch
+     create ...
- is taken) are provided for efficiency and correctness in loops.
+     IF
+        does1
+     ELSE
+        does2
+     ENDIF ;
+ @end example
- A special area in the dictionary space is reserved for keeping the
+ In this example, the selection of whether to use @code{does1} or
- local variable names. @code{@{} switches the dictionary pointer to this
+ @code{does2} is made at definition-time, i.e., at the time that the child word is
- area and @code{@}} switches it back and generates the locals
+ @code{Create}d.
- initializing code. @code{W:} etc.@ are normal defining words. This
- special area is cleared at the start of every colon definition.
- @cindex word list for defining locals
+ @cindex @code{DOES>} in interpretation state
- A special feature of Gforth's dictionary is used to implement the
+ In a standard program you can apply a @code{DOES>}-part only if the last
- definition of locals without type specifiers: every word list (aka
+ word was defined with @code{CREATE}. In Gforth, the @code{DOES>}-part
- vocabulary) has its own methods for searching
+ will override the behaviour of the last word defined in any case. In a
- etc. (@pxref{Word Lists}). For the present purpose we defined a word list
+ standard program, you can use @code{DOES>} only in a colon
- with a special search method: When it is searched for a word, it
+ definition. In Gforth, you can also use it in interpretation state, in a
- actually creates that word using @code{W:}. @code{@{} changes the search
+ kind of one-shot mode; for example:
- order to first search the word list containing @code{@}}, @code{W:} etc.,
+ @example
- and then the word list for defining locals without type specifiers.
+ CREATE name ( ... -- ... )
+   @var{initialization}
+ DOES>
+   @var{code} ;
+ @end example
- The lifetime rules support a stack discipline within a colon
+ @noindent
- definition: The lifetime of a local is either nested with other locals
+ is equivalent to the standard:
- lifetimes or it does not overlap them.
+ @example
+ :noname
+ DOES>
+     @var{code} ;
+ CREATE name EXECUTE ( ... -- ... )
+     @var{initialization}
+ @end example
- At @code{BEGIN}, @code{IF}, and @code{AHEAD} no code for locals stack
+ You can get the address of the body of a word with:
- pointer manipulation is generated. Between control structure words
- locals definitions can push locals onto the locals stack. @code{AGAIN}
- is the simplest of the other three control flow words. It has to
- restore the locals stack depth of the corresponding @code{BEGIN}
- before branching. The code looks like this:
- @format
- @code{lp+!#} current-locals-size @minus{} dest-locals-size
- @code{branch} <begin>
- @end format
- @code{UNTIL} is a little more complicated: If it branches back, it
+ doc->body
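+ For instance, a minimal sketch (the name @code{holder} is hypothetical):
+ @example
+ Create holder  42 ,  \ the body of holder is one cell containing 42
+ ' holder >body @@ .  \ prints 42
+ holder @@ .          \ prints 42: a CREATEd word pushes its body address
+ @end example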
- must adjust the stack just like @code{AGAIN}. But if it falls through,
- the locals stack must not be changed. The compiler generates the
- following code:
- @format
- @code{?branch-lp+!#} <begin> current-locals-size @minus{} dest-locals-size
- @end format
- The locals stack pointer is only adjusted if the branch is taken.
- @code{THEN} can produce somewhat inefficient code:
+ @node Supplying names, Interpretation and Compilation Semantics, User-defined Defining Words, Defining Words
- @format
+ @subsection Supplying names for the defined words
- @code{lp+!#} current-locals-size @minus{} orig-locals-size
+ @cindex names for defined words
- <orig target>:
+ @cindex defining words, name parameter
- @code{lp+!#} orig-locals-size @minus{} new-locals-size
- @end format
- The second @code{lp+!#} adjusts the locals stack pointer from the
- level at the @var{orig} point to the level after the @code{THEN}. The
- first @code{lp+!#} adjusts the locals stack pointer from the current
- level to the level at the orig point, so the complete effect is an
- adjustment from the current level to the right level after the
- @code{THEN}.
- @cindex locals information on the control-flow stack
+ @cindex defining words, name given in a string
- @cindex control-flow stack items, locals information
+ By default, defining words take the names for the defined words from the
- In a conventional Forth implementation a dest control-flow stack entry
+ input stream. Sometimes you want to supply the name from a string. You
- is just the target address and an orig entry is just the address to be
+ can do this with:
- patched. Our locals implementation adds a word list to every orig or dest
- item. It is the list of locals visible (or assumed visible) at the point
- described by the entry. Our implementation also adds a tag to identify
- the kind of entry, in particular to differentiate between live and dead
- (reachable and unreachable) orig entries.
- A few unusual operations have to be performed on locals word lists:
+ doc-nextname
- doc-common-list
+ For example:
- doc-sub-list?
- doc-list-size
- Several features of our locals word list implementation make these
+ @example
- operations easy to implement: The locals word lists are organised as
+ s" foo" nextname create
- linked lists; the tails of these lists are shared, if the lists
+ @end example
- contain some of the same locals; and the address of a name is greater
+ @noindent
- than the address of the names behind it in the list.
+ is equivalent to:
+ @example
+ create foo
+ @end example
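+ A sketch with another defining word, assuming (as the description of
+ @code{nextname} suggests) that it affects whichever defining word runs
+ next; the name @code{seven} is illustrative:
+ @example
+ s" seven" nextname 7 Constant  \ Constant takes its name from the string
+ seven .                        \ display the value of the new constant
+ @end example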
- Another important implementation detail is the variable
+ @cindex defining words without name
- @code{dead-code}. It is used by @code{BEGIN} and @code{THEN} to
+ Sometimes you want to define an @dfn{anonymous word}, that is, a word without a
- determine if they can be reached directly or only through the branch
+ name. You can do this with:
- that they resolve. @code{dead-code} is set by @code{UNREACHABLE},
- @code{AHEAD}, @code{EXIT} etc., and cleared at the start of a colon
- definition, by @code{BEGIN} and usually by @code{THEN}.
- Counted loops are similar to other loops in most respects, but
+ doc-:noname
- @code{LEAVE} requires special attention: It performs basically the same
- service as @code{AHEAD}, but it does not create a control-flow stack
- entry. Therefore the information has to be stored elsewhere;
- traditionally, the information was stored in the target fields of the
- branches created by the @code{LEAVE}s, by organizing these fields into a
- linked list. Unfortunately, this clever trick does not provide enough
- space for storing our extended control flow information. Therefore, we
- introduce another stack, the leave stack. It contains the control-flow
- stack entries for all unresolved @code{LEAVE}s.
- Local names are kept until the end of the colon definition, even if
+ This leaves the execution token for the word on the stack after the
- they are no longer visible in any control-flow path. In a few cases
+ closing @code{;}. Here's an example in which a deferred word is
- this may lead to increased space needs for the locals name area, but
+ initialised with an @code{xt} from an anonymous colon definition:
- usually less than reclaiming this space would cost in code size.
+ @example
+ Defer deferred
+ :noname ( ... -- ... )
+   ... ;
+ IS deferred
+ @end example
+ Gforth provides an alternative way of doing this, using two separate
+ words:
- @node ANS Forth locals,  , Gforth locals, Locals
+ doc-noname
- @subsection ANS Forth locals
+ @cindex execution token of last defined word
- @cindex locals, ANS Forth style
+ doc-lastxt
- The ANS Forth locals wordset does not define a syntax for locals, but
+ The previous example can be rewritten using @code{noname} and
- words that make it possible to define various syntaxes. One of the
+ @code{lastxt}:
- possible syntaxes is a subset of the syntax we used in the Gforth locals
- wordset, i.e.:
  @example
- @{ local1 local2 ... -- comment @}
+ Defer deferred
- @end example
+ noname : ( ... -- ... )
- @noindent
+   ... ;
- or
+ lastxt IS deferred
- @example
- @{ local1 local2 ... @}
  @end example
- The order of the locals corresponds to the order in a stack comment. The
+ @code{lastxt} also works when the last word was not defined as
- restrictions are:
+ @code{noname}.
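+ For example, in this sketch (the name @code{greet} is illustrative),
+ @code{lastxt} yields the execution token of an ordinary named colon
+ definition:
+ @example
+ : greet ( -- ) ." hello" ;
+ lastxt execute  \ runs greet, printing hello
+ @end example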
- @itemize @bullet
- @item
- Locals can only be cell-sized values (no type specifiers are allowed).
- @item
- Locals can be defined only outside control structures.
- @item
- Locals can interfere with explicit usage of the return stack. For the
- exact (and long) rules, see the standard. If you don't use return stack
- accessing words in a definition using locals, you will be all right. The
- purpose of this rule is to make locals implementation on the return
- stack easier.
- @item
- The whole definition must be in one line.
- @end itemize
- Locals defined in this way behave like @code{VALUE}s (@xref{Simple
+ @node Interpretation and Compilation Semantics,  , Supplying names, Defining Words
- Defining Words}). I.e., they are initialized from the stack. Using their
+ @subsection Interpretation and Compilation Semantics
- name produces their value. Their value can be changed using @code{TO}.
+ @cindex semantics, interpretation and compilation
- Since this syntax is supported by Gforth directly, you need not do
- anything to use it. If you want to port a program using this syntax to
- another ANS Forth system, use @file{compat/anslocal.fs} to implement the
- syntax on the other system.
- Note that a syntax shown in the standard, section A.13 looks
+ @cindex interpretation semantics
- similar, but is quite different in having the order of locals
+ The @dfn{interpretation semantics} of a word are what the text
- reversed. Beware!
+ interpreter does when it encounters the word in interpret state. They also
+ appear in some other contexts, e.g., the execution token returned by
+ @code{' @var{word}} identifies the interpretation semantics of
+ @var{word} (in other words, @code{' @var{word} execute} is equivalent to
+ interpret-state text interpretation of @code{@var{word}}).
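+ The equivalence can be sketched as follows (the name @code{greet} is
+ illustrative):
+ @example
+ : greet ( -- ) ." hello" ;
+ greet            \ text interpreter in interpret state: prints hello
+ ' greet execute  \ same effect through the execution token
+ @end example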
- The ANS Forth locals wordset itself consists of a word:
+ @cindex compilation semantics
+ The @dfn{compilation semantics} of a word are what the text interpreter
+ does when it encounters the word in compile state. They also appear in
+ other contexts, e.g., @code{POSTPONE @var{word}} compiles@footnote{In
+ standard terminology, ``appends to the current definition''.} the
+ compilation semantics of @var{word}.
- doc-(local)
+ @cindex execution semantics
+ The standard also talks about @dfn{execution semantics}. They are used
+ only for defining the interpretation and compilation semantics of many
+ words. By default, the interpretation semantics of a word are to
+ @code{execute} its execution semantics, and the compilation semantics of
+ a word are to @code{compile,} its execution semantics.@footnote{In
+ standard terminology: The default interpretation semantics are its
+ execution semantics; the default compilation semantics are to append its
+ execution semantics to the execution semantics of the current
+ definition.}
- The ANS Forth locals extension wordset defines a syntax using @code{locals|}, but it is so
+ @comment TODO expand, make it co-operate with new sections on text interpreter.
- awful that we strongly recommend not to use it. We have implemented this
- syntax to make porting to Gforth easy, but do not document it here. The
- problem with this syntax is that the locals are defined in an order
- reversed with respect to the standard stack comment notation, making
- programs harder to read, and easier to misread and miswrite. The only
- merit of this syntax is that it is easy to implement using the ANS Forth
- locals wordset.
- @node Defining Words, The Text Interpreter, Locals, Words
+ @cindex immediate words
- @section Defining Words
+ @cindex compile-only words
- @cindex defining words
+ You can change the semantics of the most-recently defined word:
- @menu
+ doc-immediate
- * Simple Defining Words::
+ doc-compile-only
- * Colon Definitions::
+ doc-restrict
- * User-defined Defining Words::
- * Supplying names::
- * Interpretation and Compilation Semantics::
- @end menu
- @node Simple Defining Words, Colon Definitions, Defining Words, Defining Words
+ Note that ticking (@code{'}) a compile-only word gives an error
- @subsection Simple Defining Words
+ (``Interpreting a compile-only word'').
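+ As a minimal sketch of the effect of @code{immediate} (the names
+ @code{shout} and @code{test} are hypothetical):
+ @example
+ : shout ( -- ) ." compiling! " ; immediate
+ : test ( -- ) shout ;  \ shout runs now, while test is being compiled
+ test                   \ prints nothing: no code for shout was compiled
+ @end example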
- @cindex simple defining words
- @cindex defining words, simple
- doc-constant
+ Gforth also allows you to define words with arbitrary combinations of
- doc-2constant
+ interpretation and compilation semantics.
- doc-fconstant
- doc-variable
- doc-2variable
- doc-fvariable
- doc-create
- doc-user
- doc-value
- doc-to
- doc-defer
- doc-is
- Definitions in ANS Standard Forth for @code{defer}, @code{<is>} and
+ doc-interpret/compile:
- @code{[is]} are provided in @file{compat/defer.fs}. TODO - what do
- the two is words do?
- @node Colon Definitions, User-defined Defining Words, Simple Defining Words, Defining Words
+ This feature was introduced for implementing @code{TO} and @code{S"}. I
- @subsection Colon Definitions
+ recommend that you do not define such words, as cute as they may be:
- @cindex colon definitions
+ they make it hard to get at both parts of the word in some contexts.
+ E.g., assume you want to get an execution token for the compilation
+ part. Instead, define two words, one that embodies the interpretation
+ part, and one that embodies the compilation part.  Once you have done
+ that, you can define a combined word with @code{interpret/compile:} for
+ the convenience of your users.
+ You might try to use this feature to provide an optimizing
+ implementation of the default compilation semantics of a word. For
+ example, by defining:
  @example
- : name ( ... -- ... )
+ :noname
-     word1 word2 word3 ;
+    foo bar ;
+ :noname
+    POSTPONE foo POSTPONE bar ;
+ interpret/compile: foobar
  @end example
- creates a word called @code{name}, that, upon execution, executes
+ @noindent
- @code{word1 word2 word3}. @code{name} is a @dfn{(colon) definition}.
+ as an optimizing version of:
- The explanation above is somewhat superficial. @xref{Interpretation and
- Compilation Semantics} for an in-depth discussion of some of the issues
- involved.
- doc-:
- doc-;
- @node User-defined Defining Words, Supplying names, Colon Definitions, Defining Words
- @subsection User-defined Defining Words
- @cindex user-defined defining words
- @cindex defining words, user-defined
- You can create new defining words simply by wrapping defining-time code
+ @example
- around existing defining words and putting the sequence in a colon
+ : foobar
- definition.
+     foo bar ;
+ @end example
- @comment TODO example
+ Unfortunately, this does not work correctly with @code{[compile]},
+ because @code{[compile]} assumes that the compilation semantics of all
+ @code{interpret/compile:} words are non-default. I.e., @code{[compile]
+ foobar} would compile the compilation semantics for the optimizing
+ @code{foobar}, whereas it would compile the interpretation semantics for
+ the non-optimizing @code{foobar}.
- @cindex @code{CREATE} ... @code{DOES>}
+ @cindex state-smart words (are a bad idea)
- If you want the words defined with your defining words to behave
+ Some people try to use @var{state-smart} words to emulate the feature provided
- differently from words defined with standard defining words, you can
+ by @code{interpret/compile:} (words are state-smart if they check
- write your defining word like this:
+ @code{STATE} during execution). E.g., they would try to code
+ @code{foobar} like this:
  @example
- : def-word ( "name" -- )
+ : foobar
-     Create @var{code1}
+   STATE @@
- DOES> ( ... -- ... )
+   IF ( compilation state )
-     @var{code2} ;
+     POSTPONE foo POSTPONE bar
+   ELSE
- def-word name
+     foo bar
+   ENDIF ; immediate
  @end example
- Technically, this fragment defines a defining word @code{def-word}, and
+ Although this works if @code{foobar} is only processed by the text
- a word @code{name}; when you execute @code{name}, the address of the
+ interpreter, it does not work in other contexts (like @code{'} or
- body of @code{name} is put on the data stack and @var{code2} is executed
+ @code{POSTPONE}). E.g., @code{' foobar} will produce an execution token
- (the address of the body of @code{name} is the address @code{HERE}
+ for a state-smart word, not for the interpretation semantics of the
- returns immediately after the @code{CREATE}). The word @code{name} is
+ original @code{foobar}; when you execute this execution token (directly
- sometimes called a @var{child} of @code{def-word}.
+ with @code{EXECUTE} or indirectly through @code{COMPILE,}) in compile
+ state, the result will not be what you expected (i.e., it will not
+ perform @code{foo bar}). State-smart words are a bad idea. Simply don't
+ write them@footnote{For a more detailed discussion of this topic, see
+ @cite{@code{State}-smartness -- Why it is Evil and How to Exorcise it} by Anton
+ Ertl; presented at EuroForth '98 and available from
+ @url{http://www.complang.tuwien.ac.at/papers/}}!
- In other words, if you make the following definitions:
+ @cindex defining words with arbitrary semantics combinations
+ It is also possible to write defining words that define words with
+ arbitrary combinations of interpretation and compilation semantics. In
+ general, they look like this:
  @example
- : def-word1 ( "name" -- )
+ : def-word
-     Create @var{code1} ;
+     create-interpret/compile
+     @var{code1}
- : action1 ( ... -- ... )
+ interpretation>
-     @var{code2} ;
+     @var{code2}
+ <interpretation
- def-word1 name1
+ compilation>
+     @var{code3}
+ <compilation ;
  @end example
- Using @code{name1 action1} is equivalent to using @code{name}.
+ For a @var{word} defined with @code{def-word}, the interpretation
+ semantics are to push the address of the body of @var{word} and perform
- E.g., you can implement @code{Constant} in this way:
+ @var{code2}, and the compilation semantics are to push the address of
+ the body of @var{word} and perform @var{code3}. E.g., @code{constant}
+ can also be defined like this (except that the defined constants don't
+ behave correctly when @code{[compile]}d):
  @example
- : constant ( w "name" -- )
+ : constant ( n "name" -- )
-     create ,
+     create-interpret/compile
- DOES> ( -- w )
+     ,
-     @@ ;
+ interpretation> ( -- n )
+     @@
+ <interpretation
+ compilation> ( compilation. -- ; run-time. -- n )
+     @@ postpone literal
+ <compilation ;
  @end example
- @comment that is the classic example.. maybe it should be earlier. There
+ doc-create-interpret/compile
- @comment is a beautiful description of how this works and what it does in
+ doc-interpretation>
- @comment the Forthwrite 100th edition.
+ doc-<interpretation
+ doc-compilation>
+ doc-<compilation
- When you create a constant with @code{5 constant five}, first a new word
+ Note that words defined with @code{interpret/compile:} and
- @code{five} is created, then the value 5 is laid down in the body of
+ @code{create-interpret/compile} have an extended header structure that
- @code{five} with @code{,}. When @code{five} is invoked, the address of
+ differs from other words; however, unless you try to access them with
- the body is put on the stack, and @code{@@} retrieves the value 5.
+ plain address arithmetic, you should not notice this. Words for
+ accessing the header structure usually know how to deal with this; e.g.,
+ @code{' word >body} also gives you the body of a word created with
+ @code{create-interpret/compile}.
- @cindex stack effect of @code{DOES>}-parts
+ @c ----------------------------------------------------------
- @cindex @code{DOES>}-parts, stack effect
+ @node The Text Interpreter, Tokens for Words, Defining Words, Words
- In the example above the stack comment after the @code{DOES>} specifies
+ @section  The Text Interpreter
- the stack effect of the defined words, not the stack effect of the
+ @cindex interpreter - outer
- following code (the following code expects the address of the body on
+ @cindex text interpreter
- the top of stack, which is not reflected in the stack comment). This is
+ @cindex outer interpreter
- the convention that I use and recommend (it clashes a bit with using
- locals declarations for stack effect specification, though).
- @subsubsection Applications of @code{CREATE..DOES>}
+ Intro blah.
- @cindex @code{CREATE} ... @code{DOES>}, applications
- You may wonder how to use this feature. Here are some usage patterns:
+ @comment TODO
- @cindex factoring similar colon definitions
+ doc->in
- When you see a sequence of code occurring several times, and you can
+ doc-tib
- identify a meaning, you will factor it out as a colon definition. When
+ doc-#tib
- you see similar colon definitions, you can factor them using
+ doc-span
- @code{CREATE..DOES>}. E.g., an assembler usually defines several words
+ doc-restore-input
- that look very similar:
+ doc-save-input
- @example
+ doc-source
- : ori, ( reg-target reg-source n -- )
+ doc-source-id
-asm-reg-reg-imm ;
- : andi, ( reg-target reg-source n -- )
-asm-reg-reg-imm ;
- @end example
- @noindent
- This could be factored with:
- @example
- : reg-reg-imm ( op-code -- )
-     CREATE ,
- DOES> ( reg-target reg-source n -- )
-     @@ asm-reg-reg-imm ;
-reg-reg-imm ori,
-reg-reg-imm andi,
- @end example
- @cindex currying
- Another view of @code{CREATE..DOES>} is to consider it as a crude way to
- supply a part of the parameters for a word (known as @dfn{currying} in
- the functional language community). E.g., @code{+} needs two
- parameters. Creating versions of @code{+} with one parameter fixed can
- be done like this:
- @example
- : curry+ ( n1 -- )
-     CREATE ,
- DOES> ( n2 -- n1+n2 )
-     @@ + ;
-curry+ 3+
- -2 curry+ 2-
- @end example
- @subsubsection The gory details of @code{CREATE..DOES>}
+ @menu
- @cindex @code{CREATE} ... @code{DOES>}, details
+ * Number Conversion::
+ * Interpret/Compile states::
+ * Literals::
+ * Interpreter Directives::
+ @end menu
- doc-does>
+ @comment TODO
- @cindex @code{DOES>} in a separate definition
+ The text interpreter works on input one line at a time. Starting at
- This means that you need not use @code{CREATE} and @code{DOES>} in the
+ the beginning of the line, it skips leading spaces (called
- same definition; you can put the @code{DOES>}-part in a separate
+ @var{delimiters}) then parses a string (a sequence of non-space
- definition. This allows us to, e.g., select among different DOES>-parts:
+ characters) until it either reaches a space character or it
- @example
+ reaches the end of the line. Having parsed a string, it then makes two
- : does1
+ attempts to do something with it:
- DOES> ( ... -- ... )
-     ... ;
- : does2
+ * It looks the string up in a dictionary of definitions. If the string
- DOES> ( ... -- ... )
+   is found in the dictionary, the string names a @var{definition} (also
-     ... ;
+   known as a @var{word}) and the dictionary search will return an
+   @var{Execution token} (xt) for the definition and some flags that show
+   when the definition can be used legally. If the definition can be
+   legally executed in @var{Interpret} mode then the text interpreter will
+   use the xt to execute it, otherwise it will issue an error
+   message. The dictionary is described in more detail in <blah>.
- : def-word ( ... -- ... )
+ * If the string is not found in the dictionary, the text interpreter
-     create ...
+   attempts to treat it as a number in the current radix (base 10 after
-     IF
+   initial startup). If the string represents a legal number in the
-        does1
+   current radix, the number is pushed onto the appropriate parameter
-     ELSE
+   stack. Stacks are discussed in more detail in <blah>. Number
-        does2
+   conversion is described in more detail in <section about +, -
-     ENDIF ;
+   numbers and different number formats>.
- @end example
- In this example, the selection of whether to use @code{does1} or
+ If both of these attempts fail, the remainder of the input line is
- @code{does2} is made at compile-time; at the time that the child word is
+ discarded and the text interpreter issues an error message. If one of
- @code{Create}d.
+ these attempts succeeds, the text interpreter repeats the parsing
+ process until the end of the line has been reached. At this point,
+ it prints the status message ``  ok'' and waits for more input.
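+ A short interactive session illustrates this. It is only a sketch: the
+ exact wording of the error message depends on the Gforth version, and
+ @code{no-such-word} is assumed to be undefined.
+ @example
+ 2 3 + .           \ 2 and 3 are converted to numbers; + and . are found
+                   \ in the dictionary; prints 5, then ok
+ 2 no-such-word 3  \ no-such-word is neither a definition nor a number:
+                   \ an error is reported and the trailing 3 is discarded
+ @end example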
- @cindex @code{DOES>} in interpretation state
+ There are two important things to note about the behaviour of the text
- In a standard program you can apply a @code{DOES>}-part only if the last
+ interpreter:
- word was defined with @code{CREATE}. In Gforth, the @code{DOES>}-part
- will override the behaviour of the last word defined in any case. In a
- standard program, you can use @code{DOES>} only in a colon
- definition. In Gforth, you can also use it in interpretation state, in a
- kind of one-shot mode; for example:
- @example
- CREATE name ( ... -- ... )
-   @var{initialization}
- DOES>
-   @var{code} ;
- @end example
- @noindent
+ * it processes each input string to completion before parsing
- is equivalent to the standard:
+   additional characters from the input line.
- @example
- :noname
- DOES>
-     @var{code} ;
- CREATE name EXECUTE ( ... -- ... )
-     @var{initialization}
- @end example
- You can get the address of the body of a word with:
+ * it keeps track of its position in the input line using a variable
+   (called >IN, pronounced ``to-in''). The value of >IN can be modified
+   by the execution of definitions in the input line. This means that
+   definitions can ``trick'' the text interpreter either into skipping
+   sections of the input line or into parsing a section of the
+   input line more than once.
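+ Both effects can be sketched in ANS Forth (the names @code{skip-rest},
+ @code{once} and @code{again?} are made up for this illustration):
+ @example
+ \ Skip the rest of the line by moving >IN to the end of the input.
+ : skip-rest ( -- )  source nip >in ! ;
+ 1 . skip-rest 2 . 3 .    \ only 1 is printed
+ \ Parse the line a second time by resetting >IN to 0 (once only,
+ \ otherwise the line would be interpreted forever).
+ variable once  0 once !
+ : again? ( -- )  once @@ 0= if  true once !  0 >in !  then ;
+ 1 . again?               \ 1 is printed twice
+ @end example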
- doc->body
- @node Supplying names, Interpretation and Compilation Semantics, User-defined Defining Words, Defining Words
+ @node Number Conversion, Interpret/Compile states, The Text Interpreter, The Text Interpreter
- @subsection Supplying names for the defined words
+ @subsection Number Conversion
- @cindex names for defined words
+ @cindex number conversion
- @cindex defining words, name parameter
+ @cindex double-cell numbers, input format
+ @cindex input format for double-cell numbers
+ @cindex single-cell numbers, input format
+ @cindex input format for single-cell numbers
+ @cindex floating-point numbers, input format
+ @cindex input format for floating-point numbers
- @cindex defining words, name given in a string
+ If the text interpreter fails to find a particular string in the name
- By default, defining words take the names for the defined words from the
+ dictionary, it attempts to convert it to a number using a set of rules.
- input stream. Sometimes you want to supply the name from a string. You
- can do this with:
- doc-nextname
+ Let <digit> represent any character that is a legal digit in the current
+ number base (for example, 0-9 when the number base is decimal or 0-9, A-F
+ when the number base is hexadecimal).
- For example:
+ Let <decimal digit> represent any character in the range 0-9.
- @example
+ @comment TODO need to extend the next defn to support fp format
- s" foo" nextname create
+ Let @{+ | -@} represent the optional presence of either a @code{+} or
- @end example
+ @code{-} character.
- @noindent
- is equivalent to:
- @example
- create foo
- @end example
- @cindex defining words without name
+ Let * represent any number of instances of the previous character
- Sometimes you want to define an @var{anonymous word}; a word without a
+ (including none).
- name. You can do this with:
- doc-:noname
+ Let any other character represent itself.
- This leaves the execution token for the word on the stack after the
+ Now, the conversion rules are:
- closing @code{;}. Here's an example in which a deferred word is
- initialised with an @code{xt} from an anonymous colon definition:
- @example
- Defer deferred
- :noname ( ... -- ... )
-   ... ;
- IS deferred
- @end example
- Gforth provides an alternative way of doing this, using two separate
+ @itemize @bullet
- words:
+ @item
+ A string of the form <digit><digit>* is treated as a single-precision
+ (CELL-sized) positive integer. Examples are 0 123 6784532 32343212343456 42
+ @item
+ A string of the form -<digit><digit>* is treated as a single-precision
+ (CELL-sized) negative integer, and is represented using 2's-complement
+ arithmetic. Examples are -45 -5681 -0
+ @item
+ A string of the form <digit><digit>*.<digit>* is treated as a double-precision
+ (double-CELL-sized) positive integer. Examples are 3465. 3.465 34.65
+ (and note that these all represent the same number).
+ @item
+ A string of the form -<digit><digit>*.<digit>* is treated as a
+ double-precision (double-CELL-sized) negative integer, and is
+ represented using 2's-complement arithmetic. Examples are -3465. -3.465
+ -34.65 (and note that these all represent the same number).
+ @item
+ A string of the form @{+ | -@}<decimal digit>@{.@}<decimal digit>*@{e | E@}@{+
+ | -@}<decimal digit><decimal digit>* is treated as a floating-point
+ number. Examples are 1e0 1.e 1.e0 +1e+0 (which all represent the same
+ number) +12.E-4
+ @end itemize
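+ The rules can be tried out interactively. This is a sketch that assumes
+ the number base is decimal; @code{d.} and @code{f.} print a double-cell
+ and a floating-point number respectively:
+ @example
+ 123 .       \ single-cell integer; prints 123
+ -45 .       \ negative single-cell integer; prints -45
+ 3465. d.    \ double-cell integer; prints 3465
+ 34.65 d.    \ by the rule above, the same double-cell integer 3465
+ 1e0 f.      \ floating-point number, taken from the floating-point stack
+ @end example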
- doc-noname
+ By default, the number base used for integer number conversion is given
- @cindex execution token of last defined word
+ by the contents of a variable named @code{BASE}. Base 10 (decimal) is
- doc-lastxt
+ always used for floating-point number conversion.
- The previous example can be rewritten using @code{noname} and
+ doc-base
- @code{lastxt}:
+ doc-hex
+ doc-decimal
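+ For example (a sketch; each line assumes that the number base is
+ decimal when the line starts):
+ @example
+ hex ff decimal .         \ ff is converted in base 16; prints 255
+ 2 base ! 101 decimal .   \ 101 is converted in base 2; prints 5
+ hex 10 decimal .         \ 10 is converted in base 16; prints 16
+ @end example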
- @example
+ @cindex '-prefix for character strings
- Defer deferred
+ @cindex &-prefix for decimal numbers
- noname : ( ... -- ... )
+ @cindex %-prefix for binary numbers
-   ... ;
+ @cindex $-prefix for hexadecimal numbers
- lastxt IS deferred
+ Gforth allows you to override the value of @code{BASE} by using a prefix
- @end example
+ before the first digit of an (integer) number. Four prefixes are
+ supported:
- @code{lastxt} also works when the last word was not defined as
+ @itemize @bullet
- @code{noname}.
+ @item
+ @code{&} -- decimal number
+ @item
+ @code{%} -- binary number
+ @item
+ @code{$} -- hexadecimal number
+ @item
+ @code{'} -- base 256 number
+ @end itemize
+ Here are some examples, with the equivalent decimal number shown after
+ in parentheses:
- @node Interpretation and Compilation Semantics,  , Supplying names, Defining Words
+ -$41 (-65), %1001101 (77), %1001.0001 (145 - a double-precision number),
- @subsection Interpretation and Compilation Semantics
+ 'AB (16706; ascii A is 65, ascii B is 66, number is 65*256 + 66),
- @cindex semantics, interpretation and compilation
+ 'ab (24930; ascii a is 97, ascii b is 98, number is 97*256 + 98),
+ &905 (905), $abc (2748), $ABC (2748).
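+ The prefixes can be checked interactively. In this sketch the number
+ base is deliberately set to hexadecimal first, to show that a prefix
+ overrides @code{BASE} for that one number only:
+ @example
+ hex
+ &10 .     \ decimal 10; prints A in hexadecimal
+ %1010 .   \ binary 1010; prints A
+ $10 .     \ hexadecimal 10; prints 10
+ 10 .      \ no prefix, so BASE (16) applies; prints 10
+ decimal
+ 'A .      \ base 256, one character; prints 65
+ @end example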
- @cindex interpretation semantics
+ @cindex number conversion - traps for the unwary
- The @dfn{interpretation semantics} of a word are what the text
+ Number conversion has a number of traps for the unwary:
- interpreter does when it encounters the word in interpret state. It also
- appears in some other contexts, e.g., the execution token returned by
- @code{' @var{word}} identifies the interpretation semantics of
- @var{word} (in other words, @code{' @var{word} execute} is equivalent to
- interpret-state text interpretation of @code{@var{word}}).
- @cindex compilation semantics
+ @itemize @bullet
- The @dfn{compilation semantics} of a word are what the text interpreter
+ @item
- does when it encounters the word in compile state. It also appears in
+ You cannot determine the current number base using the code sequence
- other contexts, e.g, @code{POSTPONE @var{word}} compiles@footnote{In
+ @code{BASE @@ .} -- any number base prints as 10 when shown in the current number
- standard terminology, ``appends to the current definition''.} the
+ base. Instead, use something like @code{BASE @@ DECIMAL DUP . BASE !}
- compilation semantics of @var{word}.
+ @item
+ If the number base is set to a value greater than 14 (for example,
- @cindex execution semantics
+ hexadecimal), the number 123E4 is ambiguous; the conversion rules allow
- The standard also talks about @dfn{execution semantics}. They are used
+ it to be interpreted as either a single-precision integer or a
- only for defining the interpretation and compilation semantics of many
+ floating-point number (Gforth treats it as an integer). The ambiguity
- words. By default, the interpretation semantics of a word are to
+ can be resolved by explicitly stating the sign of the mantissa and/or
- @code{execute} its execution semantics, and the compilation semantics of
+ exponent: 123E+4 or +123E4 -- if the number base is decimal, no
- a word are to @code{compile,} its execution semantics.@footnote{In
+ ambiguity arises; either representation will be treated as a
- standard terminology: The default interpretation semantics are its
+ floating-point number.
- execution semantics; the default compilation semantics are to append its
+ @item
- execution semantics to the execution semantics of the current
+ There is a word @code{bin} but it does @var{not} set the number base!
- definition.}
+ It is used to specify file types.
+ @item
+ ANS Forth requires the @code{.} of a double-precision number to
+ be the final character in the string. Allowing the @code{.} to be
+ anywhere after the first digit is a Gforth extension.
+ @item
+ The number conversion process does not check for overflow.
+ @item
+ In Gforth, number conversion to floating-point numbers always uses base
+ 10, irrespective of the value of @code{BASE}. In ANS Forth,
+ conversion to floating-point numbers whilst the value of
+ @code{BASE} is not 10 is an ambiguous condition.
+ @end itemize
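+ The first trap can be demonstrated, and avoided, as follows. The word
+ @code{.base} is a hypothetical helper that packages the sequence
+ suggested above:
+ @example
+ hex  base @@ .  decimal     \ prints 10, whatever the base is
+ : .base ( -- )  base @@ decimal dup . base ! ;
+ hex  .base  decimal        \ prints 16
+ @end example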
- @comment TODO expand, make it co-operate with new sections on text interpreter.
- @cindex immediate words
+ @node Interpret/Compile states, Literals, Number Conversion, The Text Interpreter
- You can change the compilation semantics into @code{execute}ing the
+ @subsection Interpret/Compile states
- execution semantics with
+ @cindex Interpret/Compile states
- doc-immediate
+ @comment TODO Intro blah.
- @cindex compile-only words
+ doc-state
- You can remove the interpretation semantics of a word with
+ doc-[
+ doc-]
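+ As a sketch of how @code{[} and @code{]} switch states within a colon
+ definition (the name @code{demo} is illustrative):
+ @example
+ : demo ( -- )
+     [ 1 2 + . ]     \ interpreted while demo is being compiled; prints 3
+     ." running" ;   \ compiled as usual
+ demo                \ prints running
+ @end example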
- doc-compile-only
- doc-restrict
- Note that ticking (@code{'}) compile-only words gives an error
+ @node Literals, Interpreter Directives, Interpret/Compile states, The Text Interpreter
- (``Interpreting a compile-only word'').
+ @subsection Literals
+ @cindex Literals
- Gforth also allows you to define words with arbitrary combinations of
+ @comment TODO Intro blah.
- interpretation and compilation semantics.
- doc-interpret/compile:
+ doc-literal
+ doc-]L
+ doc-2literal
+ doc-fliteral
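+ A typical use is to compute a value once, while the definition is being
+ compiled, rather than every time it runs. A sketch (the word names are
+ illustrative):
+ @example
+ : secs/day ( -- n )  [ 60 60 * 24 * ] literal ;
+ secs/day .            \ prints 86400
+ : secs/day' ( -- n )  [ 60 60 * 24 * ]L ;      \ ]L acts like ] literal
+ : big ( -- d )  [ 100000. ] 2literal ;
+ big d.                \ prints 100000
+ @end example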
- This feature was introduced for implementing @code{TO} and @code{S"}. I
+ @node Interpreter Directives, ,Literals, The Text Interpreter
- recommend that you do not define such words, as cute as they may be:
+ @subsection Interpreter Directives
- they make it hard to get at both parts of the word in some contexts.
+ @cindex interpreter directives
- E.g., assume you want to get an execution token for the compilation
- part. Instead, define two words, one that embodies the interpretation
- part, and one that embodies the compilation part.  Once you have done
- that, you can define a combined word with @code{interpret/compile:} for
- the convenience of your users.
- You also might try to  with this feature, like this:
+ These words are usually used outside of definitions; for example, to
+ control which parts of a source file are processed by the text
+ interpreter. There are only a few ANS Forth Standard words, but Gforth
+ supplements these with a rich set of immediate control structure words
+ to compensate for the fact that the non-immediate versions can only be
+ used in compile state (@pxref{Control Structures}).
- You might try to use this feature to provide an optimizing
+ doc-[IF]
- implementation of the default compilation semantics of a word. For
+ doc-[ELSE]
- example, by defining:
+ doc-[THEN]
- @example
+ doc-[ENDIF]
- :noname
-    foo bar ;
- :noname
-    POSTPONE foo POSTPONE bar ;
- interpret/compile: foobar
- @end example
- @noindent
+ doc-[IFDEF]
- as an optimizing version of:
+ doc-[IFUNDEF]
- @example
+ doc-[?DO]
- : foobar
+ doc-[DO]
-     foo bar ;
+ doc-[FOR]
- @end example
+ doc-[LOOP]
+ doc-[+LOOP]
+ doc-[NEXT]
- Unfortunately, this does not work correctly with @code{[compile]},
+ doc-[BEGIN]
- because @code{[compile]} assumes that the compilation semantics of all
+ doc-[UNTIL]
- @code{interpret/compile:} words are non-default. I.e., @code{[compile]
+ doc-[AGAIN]
- foobar} would compile the compilation semantics for the optimizing
+ doc-[WHILE]
- @code{foobar}, whereas it would compile the interpretation semantics for
+ doc-[REPEAT]
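+ A common use is to define a word only if the system does not already
+ provide it, or to select code that suits the host system. A sketch (the
+ word @code{3dup} and the messages are illustrative):
+ @example
+ [IFUNDEF] 3dup
+ : 3dup ( a b c -- a b c a b c )  2 pick 2 pick 2 pick ;
+ [THEN]
+ 1 cells 4 = [IF]
+     .( a cell occupies 4 address units )
+ [ELSE]
+     .( a cell does not occupy 4 address units )
+ [THEN]
+ @end example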
- the non-optimizing @code{foobar}.
- @cindex state-smart words (are a bad idea)
+ @c -------------------------------------------------------------
- Some people try to use @var{state-smart} words to emulate the feature provided
+ @node Tokens for Words, Word Lists, The Text Interpreter, Words
- by @code{interpret/compile:} (words are state-smart if they check
+ @section Tokens for Words
- @code{STATE} during execution). E.g., they would try to code
+ @cindex tokens for words
- @code{foobar} like this:
- @example
+ This section describes the creation and use of tokens that represent
- : foobar
+ words on the stack (and in data space).
-   STATE @@
-   IF ( compilation state )
-     POSTPONE foo POSTPONE bar
-   ELSE
-     foo bar
-   ENDIF ; immediate
- @end example
- Although this works if @code{foobar} is only processed by the text
+ Named words have interpretation and compilation semantics. Unnamed words
- interpreter, it does not work in other contexts (like @code{'} or
+ just have execution semantics.
- @code{POSTPONE}). E.g., @code{' foobar} will produce an execution token
- for a state-smart word, not for the interpretation semantics of the
- original @code{foobar}; when you execute this execution token (directly
- with @code{EXECUTE} or indirectly through @code{COMPILE,}) in compile
- state, the result will not be what you expected (i.e., it will not
- perform @code{foo bar}). State-smart words are a bad idea. Simply don't
- write them@footnote{For a more detailed discussion of this topic, see
- @cite{@code{State}-smartness -- Why it is Evil and How to Exorcise it} by Anton
- Ertl; presented at EuroForth '98 and available from
- @url{http://www.complang.tuwien.ac.at/papers/}}!
- @cindex defining words with arbitrary semantics combinations
+ @comment TODO ?normally interpretation semantics are the execution semantics.
- It is also possible to write defining words that define words with
+ @comment this should all be covered in earlier ss
- arbitrary combinations of interpretation and compilation semantics. In
- general, they look like this:
- @example
+ @cindex execution token
- : def-word
+ An @dfn{execution token} represents the execution semantics of an
-     create-interpret/compile
+ unnamed word. An execution token occupies one cell. As explained in
-     @var{code1}
+ @ref{Supplying names}, the execution token of the last word
- interpretation>
+ defined can be produced with @code{lastxt}.
-     @var{code2}
- <interpretation
- compilation>
-     @var{code3}
- <compilation ;
- @end example
- For a @var{word} defined with @code{def-word}, the interpretation
+ doc-execute
- semantics are to push the address of the body of @var{word} and perform
+ doc-compile,
- @var{code2}, and the compilation semantics are to push the address of
- the body of @var{word} and perform @var{code3}. E.g., @code{constant}
- can also be defined like this (except that the defined constants don't
- behave correctly when @code{[compile]}d):
- @example
+ @cindex code field address
- : constant ( n "name" -- )
+ @cindex CFA
-     create-interpret/compile
+ In Gforth, the abstract data type @emph{execution token} is implemented
-     ,
+ as a code field address (CFA).
- interpretation> ( -- n )
+ @comment TODO note that the standard does not say what it represents..
-     @@
+ @comment and you cannot necessarily compile it in all Forths (eg native
- <interpretation
+ @comment compilers?).
- compilation> ( compilation. -- ; run-time. -- n )
-     @@ postpone literal
- <compilation ;
- @end example
- doc-create-interpret/compile
+ The interpretation semantics of a named word are also represented by an
- doc-interpretation>
+ execution token. You can get it with:
- doc-<interpretation
- doc-compilation>
- doc-<compilation
- Note that words defined with @code{interpret/compile:} and
+ doc-[']
- @code{create-interpret/compile} have an extended header structure that
+ doc-'
- differs from other words; however, unless you try to access them with
- plain address arithmetic, you should not notice this. Words for
- accessing the header structure usually know how to deal with this; e.g.,
- @code{' word >body} also gives you the body of a word created with
- @code{create-interpret/compile}.
- @c ----------------------------------------------------------
+ For literals, you use @code{'} in interpreted code and @code{[']} in
- @node The Text Interpreter, Structures, Defining Words, Words
+ compiled code. Gforth's @code{'} and @code{[']} behave somewhat unusually
- @section  The Text Interpreter
+ by complaining about compile-only words. To get an execution token for a
- @cindex interpreter - outer
+ compiling word @var{X}, use @code{COMP' @var{X} drop} or @code{[COMP']
- @cindex text interpreter
+ @var{X} drop}.
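+ For example (a sketch; @code{apply-plus} is an illustrative name):
+ @example
+ 2 3 ' + execute .                              \ interpreted code; prints 5
+ : apply-plus ( n1 n2 -- n3 )  ['] + execute ;  \ compiled code
+ 2 3 apply-plus .                               \ prints 5
+ @end example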
- @cindex outer interpreter
- Intro blah.
+ @cindex compilation token
+ The compilation semantics are represented by a @dfn{compilation token}
+ consisting of two cells: @var{w xt}. The top cell @var{xt} is an
+ execution token. The compilation semantics represented by the
+ compilation token can be performed with @code{execute}, which consumes
+ the whole compilation token, with an additional stack effect determined
+ by the represented compilation semantics.
- @comment TODO
+ doc-[comp']
+ doc-comp'
- doc->in
+ You can compile the compilation semantics with @code{postpone,}. I.e.,
- doc-tib
+ @code{COMP' @var{word} POSTPONE,} is equivalent to @code{POSTPONE
- doc-#tib
+ @var{word}}.
- doc-span
- doc-restore-input
- doc-save-input
- doc-source
- doc-source-id
+ doc-postpone,
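+ The equivalence can be sketched with two macros that each compile
+ @code{dup} twice into the definition being compiled. The names
+ @code{dupdup-a}, @code{dupdup-b} and @code{triple} are illustrative, and
+ the second version assumes that @code{comp'} and @code{postpone,} behave
+ as described above:
+ @example
+ : dupdup-a ( -- )  postpone dup  postpone dup ; immediate
+ : dupdup-b ( -- )
+     [ comp' dup postpone, ]  [ comp' dup postpone, ] ; immediate
+ : triple ( x -- x x x )  dupdup-b ;
+ 5 triple . . .      \ prints 5 5 5
+ @end example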
- @menu
+ At present, the @var{w} part of a compilation token is an execution
- * Number Conversion::
+ token, and the @var{xt} part represents either @code{execute} or
- * Interpret/Compile states::
+ @code{compile,}. However, don't rely on that knowledge, unless necessary;
- * Literals::
+ we may introduce unusual compilation tokens in the future (e.g.,
- * Interpreter Directives::
+ compilation tokens representing the compilation semantics of literals).
- @end menu
- @comment TODO
+ @cindex name token
+ @cindex name field address
+ @cindex NFA
+ Named words are also represented by the @dfn{name token} (@var{nt}). The abstract
+ data type @emph{name token} is implemented as a name field address (NFA).
- The text interpreter works on input one line at a time. Starting at
+ doc-find-name
- the beginning of the line, it skips leading spaces (called
+ doc-name>int
- "delimiters") then parses a string (a sequence of non-space
+ doc-name?int
- characters) until it either reaches a space character or it
+ doc-name>comp
- reaches the end of the line. Having parsed a string, it then makes two
+ doc-name>string
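+ A sketch of how these words fit together (the helper @code{.found} is
+ hypothetical, and @code{dup} is assumed to be in the search order):
+ @example
+ : .found ( c-addr u -- )
+     find-name ?dup if  name>string type  else  ." not found"  then ;
+ s" dup" .found                             \ prints the stored name of dup
+ 5 s" dup" find-name name>int execute . .   \ executes dup; prints 5 5
+ @end example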
- attempts to do something with it:
- * It looks the string up in a dictionary of definitions. If the string
+ @c -------------------------------------------------------------
-   is found in the dictionary, the string names a "definition" (also
+ @node Word Lists, Environmental Queries, Tokens for Words, Words
-   known as a "word") and the dictionary search will return an
+ @section Word Lists
-   "Execution token" (xt) for the definition and some flags that show
+ @cindex word lists
-   when the definition can be used legally. If the definition can be
+ @cindex name dictionary
-   legally executed in "Interpret" mode then the text interpreter will
-   use the xt to execute it, otherwise it will issue an error
-   message. The dictionary is described in more detail in <blah>.
- * If the string is not found in the dictionary, the text interpreter
+ @cindex wid
-   attempts to treat it as a number in the current radix (base 10 after
+ All definitions other than those created by @code{:noname} have an entry
-   initial startup). If the string represents a legal number in the
+ in the name dictionary. The name dictionary is fragmented into a number
-   current radix, the number is pushed onto the appropriate parameter
+ of parts, called @var{word lists}. A word list is identified by a
-   stack. Stacks are discussed in more detail in <blah>. Number
+ cell-sized word list identifier (@var{wid}) in much the same way as a
-   conversion is described in more detail in <section about +, -
+ file is identified by a file handle. The numerical value of the wid has
-   numbers and different number formats>.
+ no (portable) meaning, and might change from session to session.
- If both of these attempts fail, the remainder of the input line is
+ @cindex compilation word list
- discarded and the text interpreter issues an error message. If one of
+ At any one time, a single word list is defined as the word list to which
- these attempts succeeds, the text interpreter repeats the parsing
+ all new definitions will be added -- this is called the @var{compilation
- process until the end of the line has been reached. At this point,
+ word list}. When Gforth is started, the compilation word list is the
- it prints the status message "  ok" and waits for more input.
+ word list called @code{FORTH-WORDLIST}.
- There are two important things to note about the behaviour of the text
+ @cindex search order stack
- interpreter:
+ Forth maintains a stack of word lists, representing the @var{search
+ order}.  When the name dictionary is searched (for example, when
+ attempting to find a word's execution token during compilation), only
+ those word lists that are currently in the search order are
+ searched. The most recently-defined word in the word list at the top of
+ the word list stack is searched first, and the search proceeds until
+ either the word is located or the oldest definition in the word list at
+ the bottom of the stack is reached. Definitions of the word may exist in
+ more than one word list; the search order determines which version will
+ be found.
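+ These mechanisms can be sketched with ANS Forth search-order words only
+ (the names @code{my-words} and @code{hello} are made up for this
+ illustration):
+ @example
+ wordlist constant my-words         \ create a new, empty word list
+ get-current  my-words set-current  \ make it the compilation word list
+ : hello ( -- )  ." hello" ;        \ hello is added to my-words
+ set-current                        \ restore the old compilation word list
+ get-order my-words swap 1+ set-order   \ push my-words onto the search order
+ hello                              \ found in my-words; prints hello
+ previous                           \ remove my-words from the search order
+ @end example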
- * it processes each input string to completion before parsing
+ The ANS Forth Standard ``Search order'' word set is intended to provide a
-   additional characters from the input line.
+ set of low-level tools that allow various different schemes to be
+ implemented. Gforth provides @code{vocabulary}, a traditional Forth
+ word.  @file{compat/vocabulary.fs} provides an implementation in ANS
+ Standard Forth.
- * it keeps track of its position in the input line using a variable
+ TODO: locals section refers to here, saying that every word list (aka
-   (called >IN, pronounced "to-in"). The value of >IN can be modified
+ vocabulary) has its own methods for searching etc. Need to document that.
-   by the execution of definitions in the input line. This means that
-   definitions can "trick" the text interpreter either into skipping
-   sections of the input line or into parsing a section of the
-   input line more than once.
+ doc-forth-wordlist
+ doc-definitions
+ doc-get-current
+ doc-set-current
- @node Number Conversion, Interpret/Compile states, The Text Interpreter, The Text Interpreter
+ @comment TODO when a defn (like set-order) is instanced twice, the second instance gets documented.
- @subsection Number Conversion
+ @comment In general that might be fine, but in this example (search.fs) the second instance is an
- @cindex Number conversion
+ @comment alias, so it would not naturally have documentation
- @cindex double-cell numbers, input format
+ @comment .. the fix to that is to add a specific prefix, like the object-orientation stuff does.
- @cindex input format for double-cell numbers
- @cindex single-cell numbers, input format
- @cindex input format for single-cell numbers
- @cindex floating-point numbers, input format
- @cindex input format for floating-point numbers
- If the text interpreter fails to find a particular string in the name
+ doc-get-order
- dictionary, it attempts to convert it to a number using a set of rules.
+ doc-set-order
+ doc-wordlist
+ doc-also
+ doc-forth
+ doc-only
+ doc-order
+ doc-previous
- Let <digit> represent any character that is a legal digit in the current
+ doc-find
- number base (for example, 0-9 when the number base is decimal or 0-9, A-F
+ doc-search-wordlist
- when the number base is hexadecimal).
- Let <decimal digit> represent any character in the range 0-9.
+ doc-words
+ doc-vlist
- @comment TODO need to extend the next defn to support fp format
+ doc-mappedwordlist
- Let @{+ | -@} represent the optional presence of either a @code{+} or
+ doc-root
- @code{-} character.
+ doc-vocabulary
+ doc-seal
+ doc-vocs
+ doc-current
+ doc-context
- Let * represent any number of instances of the previous character
+ @menu
- (including none).
+ * Why use word lists?::
+ * Word list examples::
+ @end menu
- Let any other character represent itself.
+ @node Why use word lists?, Word list examples, Word Lists, Word Lists
+ @subsection Why use word lists?
+ @cindex word lists - why use them?
- Now, the conversion rules are:
+ There are several reasons for using multiple word lists:
  @itemize @bullet
  @item
- A string of the form <digit><digit>* is treated as a single-precision
+ To improve compilation speed by reducing the number of name dictionary
- (CELL-sized) positive integer. Examples are 0 123 6784532 32343212343456 42
+ entries that must be searched. This is achieved by creating a new
- @item
+ word list that contains all of the definitions that are used in the
- A string of the form -<digit><digit>* is treated as a single-precision
+ definition of a Forth system but which would not usually be used by
- (CELL-sized) negative integer, and is represented using 2's-complement
+ programs running on that system. That word list would be in the search
- arithmetic. Examples are -45 -5681 -0
+ order when the Forth system was compiled but would be removed from the
- @item
+ search order for normal operation. This can be a useful technique for
- A string of the form <digit><digit>*.<digit>* is treated as a double-precision
+ low-performance systems (for example, 8-bit processors in embedded
- (double-CELL-sized) positive integer. Examples are 3465. 3.465 34.65
+ systems) but is unlikely to be necessary in high-performance desktop
- (and note that these all represent the same number).
+ systems.
  @item
- A string of the form -<digit><digit>*.<digit>* is treated as a
+ To prevent a set of words from being used outside the context in which
- double-precision (double-CELL-sized) negative integer, and is
+ they are valid. Two classic examples of this are an integrated editor
- represented using 2's-complement arithmetic. Examples are -3465. -3.465
+ (all of the edit commands are defined in a separate word list; the
- -34.65 (and note that these all represent the same number).
+ search order is set to the editor word list when the editor is invoked;
+ the old search order is restored when the editor is terminated) and an
+ integrated assembler (the op-codes for the machine are defined in a
+ separate word list which is used when a @code{CODE} word is defined).
  @item
- A string of the form @{+ | -@}<decimal digit>@{.@}<decimal digit>*@{e | E@}@{+
+ To prevent a name-space clash between multiple definitions with the same
- | -@}<decimal digit><decimal digit>* is treated as floating-point
+ name. For example, when building a cross-compiler you might have a word
- number. Examples are 1e0 1.e 1.e0 +1e+0 (which all represent the same
+ @code{IF} that generates conditional code for your target system. By
- number) +12.E-4
+ placing this definition in a different word list you can control whether
+ the host system's @code{IF} or the target system's @code{IF} gets used in
+ any particular context by controlling the order of the word lists on the
+ search order stack.
  @end itemize
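+ The name-clash case can be sketched with ANS Forth search-order words
+ only. The names @code{target-words} and @code{greet} are hypothetical,
+ chosen purely for illustration; the same name is defined once in a
+ separate word list and once in the normal compilation word list, and the
+ search order decides which definition is found:
+ @example
+ wordlist constant target-words     \ hypothetical word list
+ get-current                        \ remember the compilation word list
+ target-words set-current
+ : greet ." target greet" ;         \ goes into target-words
+ set-current                        \ restore the compilation word list
+ : greet ." host greet" ;           \ goes into the usual word list
+ greet                              \ finds the host definition
+ get-order target-words swap 1+ set-order  \ push target-words on top
+ greet                              \ now finds the target definition
+ previous                           \ drop target-words again
+ greet                              \ back to the host definition
+ @end example
+ Pushing the word list with @code{get-order ... set-order} rather than
+ replacing the whole search order keeps the rest of the system visible
+ while the alternative definition is active.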
- By default, the number base used for integer number conversion is given
+ @node Word list examples, ,Why use word lists?, Word Lists
- by the contents of a variable named @code{BASE}. Base 10 (decimal) is
+ @subsection Word list examples
- always used for floating-point number conversion.
+ @cindex word lists - examples
- doc-base
- doc-hex
- doc-decimal
- @cindex '-prefix for character strings
+ Here is an example of creating and using a new wordlist using ANS
- @cindex &-prefix for decimal numbers
+ Forth Standard words:
- @cindex %-prefix for binary numbers
- @cindex $-prefix for hexadecimal numbers
- Gforth allows you to override the value of @code{BASE} by using a prefix
- before the first digit of an (integer) number. Four prefixes are
- supported:
- @itemize @bullet
+ @example
- @item
+ wordlist constant my-new-words-wordlist
- @code{&} -- decimal number
+ : my-new-words get-order nip my-new-words-wordlist swap set-order ;
- @item
- @code{%} -- binary number
- @item
- @code{$} -- hexadecimal number
- @item
- @code{'} -- base 256 number
- @end itemize
- Here are some examples, with the equivalent decimal number shown after
+ \ add it to the search order
- in braces:
+ also my-new-words
- -$41 (-65) %1001101 (205) %1001.0001 (145 - a double-precision number)
+ \ alternatively, add it to the search order and make it
- 'AB (16706; ascii A is 65, ascii B is 66, number is 65*256 + 66)
+ \ the compilation word list
- 'ab (24930; ascii a is 97, ascii B is 98, number is 97*256 + 98)
+ also my-new-words definitions
- &905 (905) $abc (2478) $ABC (2478)
+ \ type "order" to see the problem
+ @end example
- @cindex Number conversion - traps for the unwary
+ The problem with this example is that @code{order} has no way to
- Number conversion has a number of traps for the unwary:
+ associate the name @code{my-new-words} with the wid of the word list (in
+ Gforth, @code{order} and @code{vocs} will display @code{???}  for a wid
+ that has no associated name). There is no Standard way of associating a
+ name with a wid.
- @itemize @bullet
+ In Gforth, this example can be re-coded using @code{vocabulary}, which
- @item
+ associates a name with a wid:
- You cannot determine the current number base using the code sequence
- @code{BASE @@ .} -- the number base is always 10 in the current number
- base. Instead, use something like @code{BASE @@ DECIMAL DUP . BASE !}
- @item
- If the number base is set to a value greater than 14 (for example,
- hexadecimal), the number 123E4 is ambiguous; the conversion rules allow
- it to be intepreted as either a single-precision integer or a
- floating-point number (Gforth treats it as an integer). The ambiguity
- can be resolved by explicitly stating the sign of the mantissa and/or
- exponent: 123E+4 or +123E4 -- if the number base is decimal, no
- ambiguity arises; either representation will be treated as a
- floating-point number.
- @item
- There is a word @code{bin} but it does @var{not} set the number base!
- It is used to specify file types.
- @item
- ANS Forth Standard requires the @code{.} of a double-precision number to
- be the final character in the string. Allowing the @code{.} to be
- anywhere after the first digit is a Gforth extension.
- @item
- The number conversion process does not check for overflow.
- @item
- In Gforth, number conversion to floating-point numbers always use base
- 10, irrespective of the value of @code{BASE}. For the ANS Forth
- Standard, conversion to floating-point numbers whilst the value of
- @code{BASE} is not 10 is an ambiguous condition.
- @end itemize
+ @example
+ vocabulary my-new-words
- @node Interpret/Compile states, Literals, Number Conversion, The Text Interpreter
+ \ add it to the search order
- @subsection Interpret/Compile states
+ my-new-words
- @cindex Interpret/Compile states
- @comment TODO
+ \ alternatively, add it to the search order and make it
- Intro blah.
+ \ the compilation word list
+ my-new-words definitions
+ \ type "order" to see that the problem is solved
+ @end example
- doc-state
+ @c -------------------------------------------------------------
- doc-[
+ @node Environmental Queries, Files, Word Lists, Words
- doc-]
+ @section Environmental Queries
+ @cindex environmental queries
+ @comment TODO more index entries
- @node Literals, Interpreter Directives, Interpret/Compile states, The Text Interpreter
- @subsection Literals
- @cindex Literals
- @comment TODO
- Intro blah.
- doc-literal
- doc-]L
- doc-2literal
- doc-fliteral
- @node Interpreter Directives, ,Literals, The Text Interpreter
- @subsection Interpreter Directives
- @cindex Interpreter Directives
- These words are usually used outside of definitions; for example, to
- control which parts of a source file are processed by the text
- interpreter. There are only a few ANS Forth Standard words, but Gforth
- supplements these with a rich set of immediate control structure words
- to compensate for the fact that the non-immediate versions can only be
- used in compile state (@pxref{Control Structures}).
- doc-[IF]
- doc-[ELSE]
- doc-[THEN]
- doc-[ENDIF]
- doc-[IFDEF]
- doc-[IFUNDEF]
- doc-[?DO]
- doc-[DO]
- doc-[FOR]
- doc-[LOOP]
- doc-[+LOOP]
- doc-[NEXT]
- doc-[BEGIN]
+ ANS Forth introduced the idea of ``environmental queries'' as a way
- doc-[UNTIL]
+ for a program running on a system to determine certain characteristics of the system.
- doc-[AGAIN]
+ The Standard specifies a number of strings that might be recognised by a system.
- doc-[WHILE]
- doc-[REPEAT]
+ The Standard requires that the name space used for environmental queries
+ be distinct from the name space used for definitions.
- @c ----------------------------------------------------------
+ Typically, environmental queries are supported by creating a set of
- @node Structures, Object-oriented Forth, The Text Interpreter, Words
+ definitions in a word list that is @var{only} used during environmental
- @section  Structures
+ queries; that is what Gforth does. There is no Standard way of adding
- @cindex structures
+ definitions to the set of recognised environmental queries, but any
- @cindex records
+ implementation that supports the loading of optional word sets must have
+ some mechanism for doing this (after loading the word set, the
+ associated environmental query string must return @code{true}). In
+ Gforth, the word list used to honour environmental queries can be
+ manipulated just like any other word list.
- This section presents the structure package that comes with Gforth. A
+ doc-environment?
- version of the package implemented in ANS Standard Forth is available in
+ doc-environment-wordlist
- @file{compat/struct.fs}. This package was inspired by a posting on
- comp.lang.forth in 1989 (unfortunately I don't remember, by whom;
- possibly John Hayes). A version of this section has been published in
- ???. Marcel Hendrix provided helpful comments.
- @menu
+ doc-gforth
- * Why explicit structure support?::
+ doc-os-class
- * Structure Usage::
- * Structure Naming Convention::
- * Structure Implementation::
- * Structure Glossary::
- @end menu
- @node Why explicit structure support?, Structure Usage, Structures, Structures
+ Note that, whilst the documentation for (e.g.) @code{gforth} shows it
- @subsection Why explicit structure support?
+ returning two items on the stack, querying it using @code{environment?}
+ will return an additional item; the @code{true} flag that shows that the
+ string was recognised.
- @cindex address arithmetic for structures
+ @comment TODO Document the standard strings or note where they are documented herein
- @cindex structures using address arithmetic
- If we want to use a structure containing several fields, we could simply
- reserve memory for it, and access the fields using address arithmetic
- (@pxref{Address arithmetic}). As an example, consider a structure with
- the following fields
- @table @code
+ Here are some examples of using environmental queries:
- @item a
- is a float
- @item b
- is a cell
- @item c
- is a float
- @end table
- Given the (float-aligned) base address of the structure we get the
+ @example
- address of the field
+ s" address-unit-bits" environment? 0=
+ [IF]
+      cr .( environmental attribute address-unit-bits unknown... ) cr
+ [THEN]
- @table @code
+ s" block" environment? [IF] DROP include block.fs [THEN]
- @item a
- without doing anything further.
- @item b
- with @code{float+}
- @item c
- with @code{float+ cell+ faligned}
- @end table
- It is easy to see that this can become quite tiring.
+ s" gforth" environment? [IF] 2DROP include compat/vocabulary.fs [THEN]
- Moreover, it is not very readable, because seeing a
+ s" gforth" environment? [IF] .( Gforth version ) TYPE
- @code{cell+} tells us neither which kind of structure is
+                         [ELSE] .( Not Gforth..) [THEN]
- accessed nor what field is accessed; we have to somehow infer the kind
+ @end example
- of structure, and then look up in the documentation, which field of
- that structure corresponds to that offset.
- Finally, this kind of address arithmetic also causes maintenance
- troubles: If you add or delete a field somewhere in the middle of the
- structure, you have to find and change all computations for the fields
- afterwards.
- So, instead of using @code{cell+} and friends directly, how
+ Here is an example of adding a definition to the environment word list:
- about storing the offsets in constants:
  @example
- 0 constant a-offset
+ get-current environment-wordlist set-current
- 0 float+ constant b-offset
+ true constant block
- 0 float+ cell+ faligned c-offset
+ true constant block-ext
+ set-current
  @end example
- Now we can get the address of field @code{x} with @code{x-offset
+ You can see what definitions are in the environment word list like this:
- +}. This is much better in all respects. Of course, you still
- have to change all later offset definitions if you add a field. You can
- fix this by declaring the offsets in the following way:
  @example
- 0 constant a-offset
+ get-order 1+ environment-wordlist swap set-order words previous
- a-offset float+ constant b-offset
- b-offset cell+ faligned constant c-offset
  @end example
- Since we always use the offsets with @code{+}, we could use a defining
- word @code{cfield} that includes the @code{+} in the action of the
- defined word:
- @example
+ @c -------------------------------------------------------------
- : cfield ( n "name" -- )
+ @node Files, Blocks, Environmental Queries, Words
-     create ,
+ @section Files
- does> ( name execution: addr1 -- addr2 )
-     @@ + ;
- 0 cfield a
+ Gforth provides facilities for accessing files that are stored in the
- 0 a float+ cfield b
+ host operating system's file-system. Files that are processed by Gforth
- 0 b cell+ faligned cfield c
+ can be divided into two categories:
- @end example
- Instead of @code{x-offset +}, we now simply write @code{x}.
+ @itemize @bullet
+ @item
+ Files that are processed by the Text Interpreter (@var{Forth source files}).
+ @item
+ Files that are processed by some other program (@var{general files}).
+ @end itemize
- The structure field words now can be used quite nicely. However,
+ @menu
- their definition is still a bit cumbersome: We have to repeat the
+ * Forth source files::
- name, the information about size and alignment is distributed before
+ * General files::
- and after the field definitions etc.  The structure package presented
+ * Search Paths::
- here addresses these problems.
+ * Forth Search Paths::
+ * General Search Paths::
+ @end menu
- @node Structure Usage, Structure Naming Convention, Why explicit structure support?, Structures
- @subsection Structure Usage
- @cindex structure usage
- @cindex @code{field} usage
+ @c -------------------------------------------------------------
- @cindex @code{struct} usage
+ @node Forth source files, General files, Files, Files
- @cindex @code{end-struct} usage
+ @subsection Forth source files
- You can define a structure for a (data-less) linked list with:
+ @cindex including files
- @example
+ @cindex Forth source files
- struct
-     cell% field list-next
- end-struct list%
- @end example
- With the address of the list node on the stack, you can compute the
+ The simplest way to interpret the contents of a file is to use one of
- address of the field that contains the address of the next node with
+ these two formats:
- @code{list-next}. E.g., you can determine the length of a list
- with:
  @example
- : list-length ( list -- n )
+ include mysource.fs
- \ "list" is a pointer to the first element of a linked list
+ s" mysource.fs" included
- \ "n" is the length of the list
-     0 begin ( list1 n1 )
-         over
-     while ( list1 n1 )
-         1+ swap list-next @@ swap
-     repeat
-     nip ;
  @end example
- You can reserve memory for a list node in the dictionary with
+ Sometimes you want to include a file only if it is not included already
- @code{list% %allot}, which leaves the address of the list node on the
+ (by, say, another source file). In that case, you can use one of these
- stack. For the equivalent allocation on the heap you can use @code{list%
+ formats:
- %alloc} (or, for an @code{allocate}-like stack effect (i.e., with ior),
- use @code{list% %allocate}). You can get the the size of a list
- node with @code{list% %size} and its alignment with @code{list%
- %alignment}.
- Note that in ANS Forth the body of a @code{create}d word is
- @code{aligned} but not necessarily @code{faligned};
- therefore, if you do a:
  @example
- create @emph{name} foo% %allot
+ require mysource.fs
+ needs mysource.fs
+ s" mysource.fs" required
  @end example
- @noindent
+ @cindex stack effect of included files
- then the memory alloted for @code{foo%} is
+ @cindex including files, stack effect
- guaranteed to start at the body of @code{@emph{name}} only if
+ I recommend that you write your source files such that interpreting them
- @code{foo%} contains only character, cell and double fields.
+ does not change the stack. This allows using these files with
+ @code{required} and friends without complications. For example:
- @cindex strcutures containing structures
- You can include a structure @code{foo%} as a field of
- another structure, like this:
  @example
- struct
+ 1 require foo.fs drop
- ...
-     foo% field ...
- ...
- end-struct ...
  @end example
- @cindex structure extension
- @cindex extended records
- Instead of starting with an empty structure, you can extend an
- existing structure. E.g., a plain linked list without data, as defined
- above, is hardly useful; You can extend it to a linked list of integers,
- like this:@footnote{This feature is also known as @emph{extended
- records}. It is the main innovation in the Oberon language; in other
- words, adding this feature to Modula-2 led Wirth to create a new
- language, write a new compiler etc.  Adding this feature to Forth just
- required a few lines of code.}
- @example
+ doc-include-file
- list%
+ doc-included
-     cell% field intlist-int
+ doc-include
- end-struct intlist%
+ @comment TODO describe what happens on error. Describes how the require
- @end example
+ @comment stuff works and describe how to clear/reset the history (eg
+ @comment for debug). Might want to include that in the MARKER example.
+ doc-required
+ doc-require
+ doc-needs
- @code{intlist%} is a structure with two fields:
+ A definition in ANS Forth for @code{required} is provided in
- @code{list-next} and @code{intlist-int}.
+ @file{compat/required.fs}.
- @cindex structures containing arrays
+ @c -------------------------------------------------------------
- You can specify an array type containing @emph{n} elements of
+ @node General files, Search Paths, Forth source files, Files
- type @code{foo%} like this:
+ @subsection General files
+ @cindex general files
+ @cindex file-handling
- @example
+ Files are opened/created by name and type. The following types are
- foo% @emph{n} *
+ recognised:
- @end example
- You can use this array type in any place where you can use a normal
+ doc-r/o
- type, e.g., when defining a @code{field}, or with
+ doc-r/w
- @code{%allot}.
+ doc-w/o
+ doc-bin
- @cindex first field optimization
+ Opening or creating a file returns a file identifier,
- The first field is at the base address of a structure and the word
+ @var{wfileid} that is used for all other file commands. All file
- for this field (e.g., @code{list-next}) actually does not change
+ commands also return a status value, @var{wior}, that is 0 for a
- the address on the stack. You may be tempted to leave it away in the
+ successful operation and an implementation-defined non-zero value in the
- interest of run-time and space efficiency. This is not necessary,
+ case of an error.
- because the structure package optimizes this case and compiling such
- words does not generate any code. So, in the interest of readability
- and maintainability you should include the word for the field when
- accessing the field.
- @node Structure Naming Convention, Structure Implementation, Structure Usage, Structures
+ doc-open-file
- @subsection Structure Naming Convention
+ doc-create-file
- @cindex structure naming conventions
- The field names that come to (my) mind are often quite generic, and,
+ doc-close-file
- if used, would cause frequent name clashes. E.g., many structures
+ doc-delete-file
- probably contain a @code{counter} field. The structure names
+ doc-rename-file
- that come to (my) mind are often also the logical choice for the names
+ doc-read-file
- of words that create such a structure.
+ doc-read-line
+ doc-write-file
+ doc-write-line
+ doc-emit-file
+ doc-flush-file
- Therefore, I have adopted the following naming conventions:
+ doc-file-status
+ doc-file-position
+ doc-reposition-file
+ doc-file-size
+ doc-resize-file
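+ As a minimal sketch of how the file words fit together, the following
+ creates a file, writes one line to it, and reads the line back. The file
+ name @file{greeting.txt} and the names @code{greeting-fid},
+ @code{line-buf}, @code{write-greeting} and @code{show-greeting} are
+ hypothetical. Each @var{wior} is passed to @code{throw}, so any non-zero
+ status aborts the operation:
+ @example
+ 0 value greeting-fid               \ holds the wfileid
+ create line-buf 82 chars allot     \ 80 characters plus 2 for line end
+ : write-greeting ( -- )
+   s" greeting.txt" w/o create-file throw to greeting-fid
+   s" Hello from Gforth" greeting-fid write-line throw
+   greeting-fid close-file throw ;
+ : show-greeting ( -- )
+   s" greeting.txt" r/o open-file throw to greeting-fid
+   line-buf 80 greeting-fid read-line throw  ( u2 flag )
+   drop line-buf swap type cr       \ ignore the flag, display the line
+   greeting-fid close-file throw ;
+ write-greeting show-greeting
+ @end example
+ The buffer is two characters longer than the length passed to
+ @code{read-line} because the Standard allows the line terminator to be
+ stored there temporarily.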
- @itemize @bullet
+ @c ---------------------------------------------------------
- @cindex field naming convention
+ @node Search Paths, Forth Search Paths, General files, Files
- @item
+ @subsection Search Paths
- The names of fields are of the form
+ @cindex path for @code{included}
- @code{@emph{struct}-@emph{field}}, where
+ @cindex file search path
- @code{@emph{struct}} is the basic name of the structure, and
+ @cindex @code{include} search path
- @code{@emph{field}} is the basic name of the field. You can
+ @cindex search path for files
- think of field words as converting the (address of the)
- structure into the (address of the) field.
- @cindex structure naming convention
+ @comment what uses these search paths.. just include and friends?
- @item
+ If you specify an absolute filename (i.e., a filename starting with
- The names of structures are of the form
+ @file{/} or @file{~}, or with @file{:} in the second position (as in
- @code{@emph{struct}%}, where
+ @samp{C:...})) for @code{included} and friends, that file is included
- @code{@emph{struct}} is the basic name of the structure.
+ just as you would expect.
- @end itemize
- This naming convention does not work that well for fields of extended
+ For relative filenames, Gforth uses a search path similar to Forth's
- structures; e.g., the integer list structure has a field
+ search order (@pxref{Word Lists}). It tries to find the given filename
- @code{intlist-int}, but has @code{list-next}, not
+ in the directories present in the path, and includes the first one it
- @code{intlist-next}.
+ finds. There are separate search paths for Forth source files and
+ general files.
- @node Structure Implementation, Structure Glossary, Structure Naming Convention, Structures
+ If the search path contains the directory @file{.} (as it should), this
- @subsection Structure Implementation
+ refers to the directory that the present file was @code{included}
- @cindex structure implementation
+ from. This allows files to include other files relative to their own
- @cindex implementation of structures
+ position (irrespective of the current working directory or the absolute
+ position).  This feature is essential for libraries consisting of
+ several files, where a file may include other files from the library.
+ It corresponds to @code{#include "..."} in C. If the current input
+ source is not a file, @file{.} refers to the directory of the innermost
+ file being included, or, if there is no file being included, to the
+ current working directory.
- The central idea in the implementation is to pass the data about the
+ Use @file{~+} to refer to the current working directory (as in
- structure being built on the stack, not in some global
+ @code{bash}).
- variable. Everything else falls into place naturally once this design
- decision is made.
- The type description on the stack is of the form @emph{align
+ If the filename starts with @file{./}, the search path is not searched
- size}. Keeping the size on the top-of-stack makes dealing with arrays
+ (just as with absolute filenames), and the @file{.} has the same meaning
- very simple.
+ as described above.
- @code{field} is a defining word that uses @code{Create}
+ @c ---------------------------------------------------------
- and @code{DOES>}. The body of the field contains the offset
+ @node Forth Search Paths, General Search Paths, Search Paths, Files
- of the field, and the normal @code{DOES>} action is simply:
+ @subsubsection Forth Search Paths
+ @cindex search path control - forth
+ The search path is initialized when you start Gforth (@pxref{Invoking
+ Gforth}). You can display it and change it using these words:
+ doc-.fpath
+ doc-fpath+
+ doc-fpath=
+ doc-open-fpath-file
+ Here is an example of using @code{fpath} and @code{require}:
  @example
- @ +
+ fpath= /usr/lib/forth/|./
+ require timer.fs
  @end example
- @noindent
+ @c ---------------------------------------------------------
- i.e., add the offset to the address, giving the stack effect
+ @node General Search Paths,  , Forth Search Paths, Files
- @var{addr1 -- addr2} for a field.
+ @subsubsection General Search Paths
+ @cindex search path control - for user applications
- @cindex first field optimization, implementation
+ Your application may need to search for files in several directories, like
- This simple structure is slightly complicated by the optimization
+ @code{included} does. To facilitate this, Gforth allows you to define
- for fields with offset 0, which requires a different
+ and use your own search paths, by providing generic equivalents of the
- @code{DOES>}-part (because we cannot rely on there being
+ Forth search path words:
- something on the stack if such a field is invoked during
- compilation). Therefore, we put the different @code{DOES>}-parts
- in separate words, and decide which one to invoke based on the
- offset. For a zero offset, the field is basically a noop; it is
- immediate, and therefore no code is generated when it is compiled.
- @node Structure Glossary,  , Structure Implementation, Structures
+ doc-.path
- @subsection Structure Glossary
+ doc-path+
- @cindex structure glossary
+ doc-path=
+ doc-open-path-file
- doc-%align
+ Here's an example of creating a search path:
- doc-%alignment
- doc-%alloc
+ @example
- doc-%allocate
+ \ Make a buffer for the path:
- doc-%allot
+ create mypath   100 chars ,     \ maximum length (is checked)
- doc-cell%
+                 0 ,             \ real len
- doc-char%
+                 100 chars allot \ space for path
- doc-dfloat%
+ @end example
- doc-double%
- doc-end-struct
- doc-field
- doc-float%
- doc-nalign
- doc-sfloat%
- doc-%size
- doc-struct
  @c -------------------------------------------------------------
- @node Object-oriented Forth, Tokens for Words, Structures, Words
+ @node Blocks, Other I/O, Files, Words
- @section Object-oriented Forth
+ @section Blocks
- Gforth comes with three packets for object-oriented programming:
+ This chapter describes how to use block files within Gforth.
- @file{objects.fs}, @file{oof.fs}, and @file{mini-oof.fs}; none of them
- is preloaded, so you have to @code{include} them before use. The most
+ Block files are a traditional means of data and source storage in
- important differences between these packets (and others) are discussed
+ Forth. They have been very important in resource-starved computers
- in @ref{Comparison with other object models}. All packets are written
+ without an OS in the past. Gforth does not encourage the use of blocks as
- in ANS Forth and can be used with any other ANS Forth.
+ source, and provides blocks only for backward compatibility. The ANS
+ standard requires blocks to be available when files are.
+ @comment TODO what about errors on open-blocks?
+ doc-open-blocks
+ doc-use
+ doc-scr
+ doc-blk
+ doc-get-block-fid
+ doc-block-position
+ doc-update
+ doc-save-buffers
+ doc-save-buffer
+ doc-empty-buffers
+ doc-empty-buffer
+ doc-flush
+ doc-get-buffer
+ doc---block-block
+ doc-buffer
+ doc-updated?
+ doc-list
+ doc-load
+ doc-thru
+ doc-+load
+ doc-+thru
+ doc---block--->
+ doc-block-included
+ @c -------------------------------------------------------------
+ @node Other I/O, Programming Tools, Blocks, Words
+ @section Other I/O
+ @comment TODO more index entries
  @menu
- * Why object-oriented programming?::
+ * Simple numeric output::       Predefined formats
- * Object-Oriented Terminology::
+ * Formatted numeric output::    Formatted (pictured) output
- * Objects::
+ * String Formats::              How Forth stores strings in memory
- * OOF::
+ * Displaying characters and strings:: Other stuff
- * Mini-OOF::
+ * Input::                       Input
- * Comparison with other object models::
  @end menu
+ @node Simple numeric output, Formatted numeric output, Other I/O, Other I/O
+ @subsection Simple numeric output
+ @cindex simple numeric output
+ @comment TODO more index entries
- @node Why object-oriented programming?, Object-Oriented Terminology, , Object-oriented Forth
+ The simplest output functions are those that display numbers from the
- @subsubsection Why object-oriented programming?
+ data or floating-point stacks. Floating-point output is always displayed
- @cindex object-oriented programming motivation
+ using base 10. Numbers displayed from the data stack use the value stored
- @cindex motivation for object-oriented programming
+ in @code{base}.
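+ A short sketch of the effect of @code{base} on integer output, using
+ only Standard words (the values are arbitrary illustrations):
+ @example
+ decimal
+ 255 .                \ displays 255
+ 255 hex . decimal    \ displays FF; 255 was parsed while base was 10
+ 42 5 .r              \ displays 42 right-aligned in a 5-character field
+ -1 u.                \ displays the largest unsigned single-cell number
+ @end example
+ Note that @code{base} also governs number input, so the number in the
+ second line is entered before @code{hex} is executed, and
+ @code{decimal} restores the usual base afterwards.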
- Often we have to deal with several data structures (@emph{objects}),
+ doc-.
- that have to be treated similarly in some respects, but differently in
+ doc-dec.
- others. Graphical objects are the textbook example: circles, triangles,
+ doc-hex.
- dinosaurs, icons, and others, and we may want to add more during program
+ doc-u.
- development. We want to apply some operations to any graphical object,
+ doc-.r
- e.g., @code{draw} for displaying it on the screen. However, @code{draw}
+ doc-u.r
- has to do something different for every kind of object.
+ doc-d.
- @comment TODO add some other operations eg perimeter, area
+ doc-ud.
- @comment and tie in to concrete examples later..
+ doc-d.r
+ doc-ud.r
+ doc-f.
+ doc-fe.
+ doc-fs.
- We could implement @code{draw} as a big @code{CASE}
+ Examples of printing the number 1234.5678E23 in the different floating-point output
- control structure that executes the appropriate code depending on the
+ formats are shown below:
- kind of object to be drawn. This would be not be very elegant, and,
- moreover, we would have to change @code{draw} every time we add
- a new kind of graphical object (say, a spaceship).
- What we would rather do is: When defining spaceships, we would tell
+ @example
- the system: "Here's how you @code{draw} a spaceship; you figure
+ f. 123456779999999000000000000.
- out the rest."
+ fe. 123.456779999999E24
+ fs. 1.23456779999999E26
- This is the problem that all systems solve that (rightfully) call
+ @end example
- themselves object-oriented; the object-oriented packages presented here
- solve this problem (and not much else).
- @comment TODO ?list properties of oo systems.. oo vs o-based?
- @node Object-Oriented Terminology, Objects, Why object-oriented programming?, Object-oriented Forth
- @subsubsection Object-Oriented Terminology
- @cindex object-oriented terminology
- @cindex terminology for object-oriented programming
- This section is mainly for reference, so you don't have to understand
- all of it right away.  The terminology is mainly Smalltalk-inspired.  In
- short:
- @table @emph
- @cindex class
- @item class
- a data structure definition with some extras.
- @cindex object
+ @node Formatted numeric output, String Formats, Simple numeric output, Other I/O
- @item object
+ @subsection Formatted numeric output
- an instance of the data structure described by the class definition.
+ @cindex Formatted numeric output
+ @cindex pictured numeric output
+ @comment TODO more index entries
- @cindex instance variables
+ Forth traditionally uses a technique called @var{pictured numeric
- @item instance variables
+ output} for formatted printing of integers.  In this technique, digits
- fields of the data structure.
+ are extracted from the number (using the current output radix defined by
+ @code{base}), converted to ASCII codes and appended to a string that is
+ built in a scratch-pad area of memory (@pxref{core-idef,
+ Implementation-defined options, Implementation-defined
+ options}). Arbitrary characters can be appended to the string during the
+ extraction process. The completed string is specified by an address
+ and length and can be manipulated (@code{TYPE}ed, copied, modified)
+ under program control.
- @cindex selector
+ All of the words described in the previous section for simple numeric
- @cindex method selector
+ output are implemented in Gforth using pictured numeric output.
- @cindex virtual function
- @item selector
- (or @emph{method selector}) a word (e.g.,
- @code{draw}) that performs an operation on a variety of data
- structures (classes). A selector describes @emph{what} operation to
- perform. In C++ terminology: a (pure) virtual function.
- @cindex method
+ Three important things to remember about Pictured Numeric Output:
- @item method
- the concrete definition that performs the operation
- described by the selector for a specific class. A method specifies
- @emph{how} the operation is performed for a specific class.
- @cindex selector invocation
+ @itemize @bullet
- @cindex message send
+ @item
- @cindex invoking a selector
+ It always operates on double-precision numbers; to display a single-precision number,
- @item selector invocation
+ convert it first (@pxref{Double precision}, for ways of doing this).
- a call of a selector. One argument of the call (the TOS (top-of-stack))
+ @item
- is used for determining which method is used. In Smalltalk terminology:
+ It always treats the double-precision number as though it were unsigned. Refer to
- a message (consisting of the selector and the other arguments) is sent
+ the examples below for ways of printing signed numbers.
- to the object.
+ @item
+ The string is built up from right to left; least significant digit first.
+ @end itemize
- @cindex receiving object
+ doc-<#
- @item receiving object
+ doc-#
- the object used for determining the method executed by a selector
+ doc-#s
- invocation. In the @file{objects.fs} model, it is the object that is on
+ doc-hold
- the TOS when the selector is invoked. (@emph{Receiving} comes from
+ doc-sign
- the Smalltalk @emph{message} terminology.)
+ doc-#>
- @cindex child class
+ doc-represent
- @cindex parent class
- @cindex inheritance
- @item child class
- a class that has (@emph{inherits}) all properties (instance variables,
- selectors, methods) from a @emph{parent class}. In Smalltalk
- terminology: The subclass inherits from the superclass. In C++
- terminology: The derived class inherits from the base class.
- @end table
+ Here are some examples of using pictured numeric output:
- @c If you wonder about the message sending terminology, it comes from
+ @example
- @c a time when each object had it's own task and objects communicated via
+ : my-u. ( u -- )
- @c message passing; eventually the Smalltalk developers realized that
+   \ Simplest use of pns.. behaves like Standard u.
- @c they can do most things through simple (indirect) calls. They kept the
+   0              \ convert to unsigned double
- @c terminology.
+   <#             \ start conversion
+   #s             \ convert all digits
+   #>             \ complete conversion
+   TYPE SPACE ;   \ display, with trailing space
+ : cents-only ( u -- )
+   0              \ convert to unsigned double
+   <#             \ start conversion
+   # #            \ convert two least-significant digits
+   #>             \ complete conversion, discard other digits
+   TYPE SPACE ;   \ display, with trailing space
- @node Objects, OOF, Object-Oriented Terminology, Object-oriented Forth
+ : dollars-and-cents ( u -- )
- @subsection The @file{objects.fs} model
+   0              \ convert to unsigned double
- @cindex objects
+   <#             \ start conversion
- @cindex object-oriented programming
+   # #            \ convert two least-significant digits
+   [char] . hold  \ insert decimal point
+   #s             \ convert remaining digits
+   [char] $ hold  \ append currency symbol
+   #>             \ complete conversion
+   TYPE SPACE ;   \ display, with trailing space
- @cindex @file{objects.fs}
+ : my-. ( n -- )
- @cindex @file{oof.fs}
+   \ handling negatives.. behaves like Standard .
+   s>d            \ convert to signed double
+   swap over dabs \ leave sign byte followed by unsigned double
+   <#             \ start conversion
+   #s             \ convert all digits
+   rot sign       \ get at sign byte, append "-" if needed
+   #>             \ complete conversion
+   TYPE SPACE ;   \ display, with trailing space
- This section describes the @file{objects.fs} packet. This material also has been published in @cite{Yet Another Forth Objects Package} by Anton Ertl and appeared in Forth Dimensions 19(2), pages 37--43 (@url{http://www.complang.tuwien.ac.at/forth/objects/objects.html}).
+ : account. ( n -- )
- @c McKewan's and Zsoter's packages
+   \ accountants don't like minus signs, they use parentheses
+   \ for negative numbers
+   s>d            \ convert to signed double
+   swap over dabs \ leave sign byte followed by unsigned double
+   <#             \ start conversion
+   2 pick         \ get copy of sign byte
+   0< IF [char] ) hold THEN \ right-most character of output
+   #s             \ convert all digits
+   rot            \ get at sign byte
+   0< IF [char] ( hold THEN
+   #>             \ complete conversion
+   TYPE SPACE ;   \ display, with trailing space
+ @end example
- This section assumes that you have read @ref{Structures}.
+ Here are some examples of using these words:
- The techniques on which this model is based have been used to implement
+ @example
- the parser generator, Gray, and have also been used in Gforth for
+ 1 my-u. 1
- implementing the various flavours of word lists (hashed or not,
+ hex -1 my-u. decimal FFFFFFFF
- case-sensitive or not, special-purpose word lists for locals etc.).
+ 1 cents-only 01
+ 1234 cents-only 34
+ 2 dollars-and-cents $0.02
+ 1234 dollars-and-cents $12.34
+ 123 my-. 123
+ -123 my-. -123
+ 123 account. 123
+ -456 account. (456)
+ @end example
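+ Because the string is built from right to left, the radix can even be
+ changed in mid-conversion. The following is a minimal sketch, not part
+ of Gforth: the name @code{.mm:ss} is made up for this illustration, and
+ it assumes that the argument is a non-negative number of seconds and
+ that the current radix is decimal.
+ @example
+ : .mm:ss ( u -- )  \ hypothetical word: display u seconds as m:ss
+   0              \ convert to unsigned double
+   <#             \ start conversion
+   #              \ units digit of the seconds, in decimal
+   6 base ! #     \ tens digit of the seconds, in radix 6
+   decimal        \ back to decimal for the minutes
+   [char] : hold  \ insert separator
+   #s             \ convert the minutes
+   #>             \ complete conversion
+   TYPE SPACE ;   \ display, with trailing space
+ @end example
+ With this definition, @code{125 .mm:ss} should display @code{2:05}.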
- @menu
+ @node String Formats, Displaying characters and strings, Formatted numeric output, Other I/O
- * Properties of the Objects model::
+ @subsection String Formats
- * Basic Objects Usage::
+ @cindex string formats
- * The Objects base class::
- * Creating objects::
- * Object-Oriented Programming Style::
- * Class Binding::
- * Method conveniences::
- * Classes and Scoping::
- * Object Interfaces::
- * Objects Implementation::
- * Objects Glossary::
- @end menu
- Marcel Hendrix provided helpful comments on this section. Andras Zsoter
+ @comment TODO more index entries
- and Bernd Paysan helped me with the related works section.
- @node Properties of the Objects model, Basic Objects Usage, Objects, Objects
+ Forth commonly uses two different methods for representing a string:
- @subsubsection Properties of the @file{objects.fs} model
- @cindex @file{objects.fs} properties
  @itemize @bullet
  @item
- It is straightforward to pass objects on the stack. Passing
+ @cindex address of counted string
- selectors on the stack is a little less convenient, but possible.
+ As a @var{counted string}, represented by a @var{c-addr}. The char
+ addressed by @var{c-addr} contains a character-count, @var{n}, of the
+ string and the string occupies the subsequent @var{n} char addresses in
+ memory.
+ @item
+ As a cell pair on the stack, @var{c-addr u}, where @var{u} is the length
+ of the string in characters, and @var{c-addr} is the address of the
+ first byte of the string.
+ @end itemize
- @item
+ ANS Forth encourages the use of the second format when representing
- Objects are just data structures in memory, and are referenced by their
+ strings on the stack, whilst conceding that the counted string format
- address. You can create words for objects with normal defining words
+ remains useful as a way of storing strings in memory.
- like @code{constant}. Likewise, there is no difference between instance
- variables that contain objects and those that contain other data.
- @item
+ doc-count
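+ As a minimal sketch of moving between the two formats (the name
+ @code{greeting} is made up for this illustration), a counted string can
+ be laid down in memory by hand and then converted to the @var{c-addr u}
+ form with @code{count}:
+ @example
+ create greeting            \ hypothetical counted string in the dictionary
+   5 c,                     \ count byte
+   char h c, char e c, char l c, char l c, char o c,
+ greeting count type        \ c-addr -- c-addr+1 u; displays hello
+ @end example
+ Here @code{greeting} alone is the counted-string form; only after
+ @code{count} can the string be passed to words such as @code{type} that
+ expect an address and a length.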
- Late binding is efficient and easy to use.
- @item
+ @xref{Memory Blocks}, for words that move, copy and search
- It avoids parsing, and thus avoids problems with state-smartness
+ for strings. @xref{Displaying characters and strings}, for words that
- and reduced extensibility; for convenience there are a few parsing
+ display characters and strings.
- words, but they have non-parsing counterparts. There are also a few
- defining words that parse. This is hard to avoid, because all standard
- defining words parse (except @code{:noname}); however, such
- words are not as bad as many other parsing words, because they are not
- state-smart.
- @item
- It does not try to incorporate everything. It does a few things and does
- them well (IMO). In particular, this model was not designed to support
- information hiding (although it has features that may help); you can use
- a separate package for achieving this.
- @item
+ @node Displaying characters and strings, Input, String Formats, Other I/O
- It is layered; you don't have to learn and use all features to use this
+ @subsection Displaying characters and strings
- model. Only a few features are necessary (@xref{Basic Objects Usage},
+ @cindex displaying characters and strings
- @xref{The Objects base class}, @xref{Creating objects}.), the others
+ @cindex compiling characters and strings
- are optional and independent of each other.
+ @cindex cursor control
- @item
+ @comment TODO more index entries
- An implementation in ANS Forth is available.
- @end itemize
+ This section starts with a glossary of Forth words and ends with a set
+ of examples.
+ doc-bl
+ doc-space
+ doc-spaces
+ doc-emit
+ doc-toupper
+ doc-."
+ doc-.(
+ doc-type
+ doc-cr
+ doc-at-xy
+ doc-page
+ doc-s"
+ doc-c"
+ doc-char
+ doc-[char]
+ doc-sliteral
- @node Basic Objects Usage, The Objects base class, Properties of the Objects model, Objects
+ As an example, consider the following text, stored in a file @file{test.fs}:
- @subsubsection Basic @file{objects.fs} Usage
- @cindex basic objects usage
- @cindex objects, basic usage
- You can define a class for graphical objects like this:
- @cindex @code{class} usage
- @cindex @code{end-class} usage
- @cindex @code{selector} usage
  @example
- object class \ "object" is the parent class
+ .( text-1)
-   selector draw ( x y graphical -- )
+ : my-word
- end-class graphical
+   ." text-2" cr
+   .( text-3)
+ ;
+ ." text-4"
+ : my-char
+   [char] ALPHABET emit
+   char emit
+ ;
  @end example
- This code defines a class @code{graphical} with an
+ When you load this code into Gforth, the following output is generated:
- operation @code{draw}.  We can perform the operation
- @code{draw} on any @code{graphical} object, e.g.:
  @example
-100 t-rex draw
+ @kbd{include test.fs <return>} text-1text-3text-4 ok
  @end example
- @noindent
+ @itemize @bullet
- where @code{t-rex} is a word (say, a constant) that produces a
+ @item
- graphical object.
+ Messages @code{text-1} and @code{text-3} are displayed because @code{.(}
+ is an immediate word; it behaves in the same way whether it is used inside
+ or outside a colon definition.
+ @item
+ Message @code{text-4} is displayed because of Gforth's added interpretation
+ semantics for @code{."}.
+ @item
+ Message @code{text-2} is @var{not} displayed, because the text interpreter
+ performs the compilation semantics for @code{."} within the definition of
+ @code{my-word}.
+ @end itemize
- @comment nac TODO add a 2nd operation eg perimeter.. and use for
+ Here are some examples of executing @code{my-word} and @code{my-char}:
- @comment a concrete example
- @cindex abstract class
+ @example
- How do we create a graphical object? With the present definitions,
+ @kbd{my-word <return>} text-2
- we cannot create a useful graphical object. The class
+  ok
- @code{graphical} describes graphical objects in general, but not
+ @kbd{my-char fred <return>} Af ok
- any concrete graphical object type (C++ users would call it an
+ @kbd{my-char jim <return>} Aj ok
- @emph{abstract class}); e.g., there is no method for the selector
+ @end example
- @code{draw} in the class @code{graphical}.
- For concrete graphical objects, we define child classes of the
+ @itemize @bullet
- class @code{graphical}, e.g.:
+ @item
+ Message @code{text-2} is displayed because of the run-time behaviour of
+ @code{."}.
+ @item
+ @code{[char]} compiles the ``A'' from ``ALPHABET'' and puts its display code
+ on the stack at run-time. @code{emit} always displays the character
+ when @code{my-char} is executed.
+ @item
+ @code{char} parses a string at run-time and the second @code{emit} displays
+ the first character of the string.
+ @item
+ If you type @code{see my-char} you can see that @code{[char]} discarded
+ the text ``LPHABET'' and only compiled the display code for ``A'' into the
+ definition of @code{my-char}.
+ @end itemize
- @cindex @code{overrides} usage
- @cindex @code{field} usage in class definition
- @example
- graphical class \ "graphical" is the parent class
-   cell% field circle-radius
- :noname ( x y circle -- )
-   circle-radius @@ draw-circle ;
- overrides draw
- :noname ( n-radius circle -- )
+ @node Input, , Displaying characters and strings, Other I/O
-   circle-radius ! ;
+ @subsection Input
- overrides construct
+ @cindex input
+ @comment TODO more index entries
- end-class circle
+ Blah on traditional and recommended string formats.
- @end example
- Here we define a class @code{circle} as a child of @code{graphical},
+ doc--trailing
- with field @code{circle-radius} (which behaves just like a field
+ doc-/string
- (@pxref{Structures}); it defines (using @code{overrides}) new methods
+ doc-convert
- for the selectors @code{draw} and @code{construct} (@code{construct} is
+ doc->number
- defined in @code{object}, the parent class of @code{graphical}).
+ doc->float
+ doc-accept
+ doc-query
+ doc-expect
+ doc-evaluate
+ doc-key
+ doc-key?
- Now we can create a circle on the heap (i.e.,
+ TODO reference the block move stuff elsewhere
- @code{allocate}d memory) with:
- @cindex @code{heap-new} usage
+ TODO convert and >number might be better in the numeric input section.
- @example
- 50 circle heap-new constant my-circle
- @end example
- @noindent
+ TODO maybe some of these shouldn't be here but should be in a ``parsing'' section
- @code{heap-new} invokes @code{construct}, thus
- initializing the field @code{circle-radius} with 50. We can draw
- this new circle at (100,100) with:
- @example
- 100 100 my-circle draw
- @end example
- @cindex selector invocation, restrictions
+ @c -------------------------------------------------------------
- @cindex class definition, restrictions
+ @node Programming Tools, Assembler and Code Words, Other I/O, Words
- Note: You can only invoke a selector if the object on the TOS
+ @section Programming Tools
- (the receiving object) belongs to the class where the selector was
+ @cindex programming tools
- defined or one of its descendents; e.g., you can invoke
- @code{draw} only for objects belonging to @code{graphical}
- or its descendents (e.g., @code{circle}).  Immediately before
- @code{end-class}, the search order has to be the same as
- immediately after @code{class}.
- @node The Objects base class, Creating objects, Basic Objects Usage, Objects
+ @menu
- @subsubsection The @file{object.fs} base class
+ * Debugging::                   Simple and quick.
- @cindex @code{object} class
+ * Assertions::                  Making your programs self-checking.
+ * Singlestep Debugger::         Executing your program word by word.
+ @end menu
- When you define a class, you have to specify a parent class.  So how do
+ @node Debugging, Assertions, Programming Tools, Programming Tools
- you start defining classes? There is one class available from the start:
+ @subsection Debugging
- @code{object}. It is ancestor for all classes and so is the
+ @cindex debugging
- only class that has no parent. It has two selectors: @code{construct}
- and @code{print}.
- @node Creating objects, Object-Oriented Programming Style, The Objects base class, Objects
+ Languages with a slow edit/compile/link/test development loop tend to
- @subsubsection Creating objects
+ require sophisticated tracing/stepping debuggers to facilitate
- @cindex creating objects
+ productive debugging.
- @cindex object creation
- @cindex object allocation options
- @cindex @code{heap-new} discussion
+ A much better (faster) way in fast-compiling languages is to add
- @cindex @code{dict-new} discussion
+ printing code at well-selected places, let the program run, look at
- @cindex @code{construct} discussion
+ the output, see where things went wrong, add more printing code, etc.,
- You can create and initialize an object of a class on the heap with
+ until the bug is found.
- @code{heap-new} ( ... class -- object ) and in the dictionary
- (allocation with @code{allot}) with @code{dict-new} (
- ... class -- object ). Both words invoke @code{construct}, which
- consumes the stack items indicated by "..." above.
- @cindex @code{init-object} discussion
+ The simple debugging aids provided in @file{debugs.fs}
- @cindex @code{class-inst-size} discussion
+ are meant to support this style of debugging. In addition, there are
- If you want to allocate memory for an object yourself, you can get its
+ words for non-destructively inspecting the stack and memory:
- alignment and size with @code{class-inst-size 2@@} ( class --
- align size ). Once you have memory for an object, you can initialize
- it with @code{init-object} ( ... class object -- );
- @code{construct} does only a part of the necessary work.
- @node Object-Oriented Programming Style, Class Binding, Creating objects, Objects
+ doc-.s
- @subsubsection Object-Oriented Programming Style
+ doc-f.s
- @cindex object-oriented programming style
- This section is not exhaustive.
+ There is a word @code{.r} but it does @var{not} display the return
+ stack! It is used for formatted numeric output.
- @cindex stack effects of selectors
+ doc-depth
- @cindex selectors and stack effects
+ doc-fdepth
- In general, it is a good idea to ensure that all methods for the
+ doc-clearstack
- same selector have the same stack effect: when you invoke a selector,
+ doc-?
- you often have no idea which method will be invoked, so, unless all
+ doc-dump
- methods have the same stack effect, you will not know the stack effect
- of the selector invocation.
- One exception to this rule is methods for the selector
+ The word @code{~~} prints debugging information (by default the source
- @code{construct}. We know which method is invoked, because we
+ location and the stack contents). It is easy to insert. If you use Emacs
- specify the class to be constructed at the same place. Actually, I
+ it is also easy to remove (@kbd{C-x ~} in the Emacs Forth mode to
- defined @code{construct} as a selector only to give the users a
+ query-replace them with nothing). The deferred words
- convenient way to specify initialization. The way it is used, a
+ @code{printdebugdata} and @code{printdebugline} control the output of
- mechanism different from selector invocation would be more natural
+ @code{~~}. The default source location output format works well with
- (but probably would take more code and more space to explain).
+ Emacs' compilation mode, so you can step through the program at the
+ source level using @kbd{C-x `} (the advantage over a stepping debugger
+ is that you can step in any direction and you know where the crash has
+ happened or where the strange data has occurred).
- @node Class Binding, Method conveniences, Object-Oriented Programming Style, Objects
+ The default actions of @code{~~} clobber the contents of the pictured
- @subsubsection Class Binding
+ numeric output string, so you should not use @code{~~}, e.g., between
- @cindex class binding
+ @code{<#} and @code{#>}.
- @cindex early binding
- @cindex late binding
+ doc-~~
- Normal selector invocations determine the method at run-time depending
+ doc-printdebugdata
- on the class of the receiving object. This run-time selection is called
+ doc-printdebugline
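+ As a minimal sketch of this style of debugging (the word
+ @code{sum-of-squares} is made up for this illustration), @code{~~} can
+ be dropped between the steps of a definition to watch the stack evolve:
+ @example
+ : sum-of-squares ( n1 n2 -- n3 )  \ hypothetical word
+   ~~ dup *       \ show stack on entry, then square n2
+   ~~ swap dup *  \ show stack again, then square n1
+   ~~ + ;         \ show both squares, then add them
+ 3 4 sum-of-squares .
+ @end example
+ Each @code{~~} reports its source location and the current stack
+ contents; the final @code{.} then displays 25.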
- @var{late binding}.
- Sometimes it's preferable to invoke a different method. For example,
+ doc-see
- you might want to use the simple method for @code{print}ing
+ doc-marker
- @code{object}s instead of the possibly long-winded @code{print} method
- of the receiver class. You can achieve this by replacing the invocation
+ Here's an example of using @code{marker} at the start of a source file
- of @code{print} with:
+ that you are debugging; it ensures that you only ever have one copy of
+ the file's definitions compiled at any time:
- @cindex @code{[bind]} usage
  @example
- [bind] object print
+ [IFDEF] my-code
- @end example
+     my-code
+ [ENDIF]
- @noindent
+ marker my-code
- in compiled code or:
- @cindex @code{bind} usage
+ \ .. definitions start here
- @example
+ \ .
- bind object print
+ \ .
+ \ end
  @end example
- @cindex class binding, alternative to
- @noindent
- in interpreted code. Alternatively, you can define the method with a
- name (e.g., @code{print-object}), and then invoke it through the
- name. Class binding is just a (often more convenient) way to achieve
- the same effect; it avoids name clutter and allows you to invoke
- methods directly without naming them first.
- @cindex superclass binding
- @cindex parent class binding
- A frequent use of class binding is this: When we define a method
- for a selector, we often want the method to do what the selector does
- in the parent class, and a little more. There is a special word for
- this purpose: @code{[parent]}; @code{[parent]
- @emph{selector}} is equivalent to @code{[bind] @emph{parent
- selector}}, where @code{@emph{parent}} is the parent
- class of the current class. E.g., a method definition might look like:
- @cindex @code{[parent]} usage
- @example
- :noname
-   dup [parent] foo \ do parent's foo on the receiving object
-   ... \ do some more
- ; overrides foo
- @end example
- @cindex class binding as optimization
+ @node Assertions, Singlestep Debugger, Debugging, Programming Tools
- In @cite{Object-oriented programming in ANS Forth} (Forth Dimensions,
+ @subsection Assertions
- March 1997), Andrew McKewan presents class binding as an optimization
+ @cindex assertions
- technique. I recommend not using it for this purpose unless you are in
- an emergency. Late binding is pretty fast with this model anyway, so the
- benefit of using class binding is small; the cost of using class binding
- where it is not appropriate is reduced maintainability.
- While we are at programming style questions: You should bind
+ It is a good idea to make your programs self-checking, especially if you
- selectors only to ancestor classes of the receiving object. E.g., say,
+ make an assumption that may become invalid during maintenance (for
- you know that the receiving object is of class @code{foo} or its
+ example, that a certain field of a data structure is never zero). Gforth
- descendents; then you should bind only to @code{foo} and its
+ supports @var{assertions} for this purpose. They are used like this:
- ancestors.
- @node Method conveniences, Classes and Scoping, Class Binding, Objects
+ @example
- @subsubsection Method conveniences
+ assert( @var{flag} )
- @cindex method conveniences
+ @end example
- In a method you usually access the receiving object pretty often.  If
+ The code between @code{assert(} and @code{)} should compute a flag that
- you define the method as a plain colon definition (e.g., with
+ should be true if everything is all right and false otherwise. It should
- @code{:noname}), you may have to do a lot of stack
+ not change anything else on the stack. The overall stack effect of the
- gymnastics. To avoid this, you can define the method with @code{m:
+ assertion is @code{( -- )}. E.g.
- ... ;m}. E.g., you could define the method for
- @code{draw}ing a @code{circle} with
- @cindex @code{this} usage
- @cindex @code{m:} usage
- @cindex @code{;m} usage
  @example
- m: ( x y circle -- )
+ assert( 1 1 + 2 = ) \ what we learn in school
-   ( x y ) this circle-radius @@ draw-circle ;m
+ assert( dup 0<> ) \ assert that the top of stack is not zero
+ assert( false ) \ this code should not be reached
  @end example
- @cindex @code{exit} in @code{m: ... ;m}
+ The need for assertions is different at different times. During
- @cindex @code{exitm} discussion
+ debugging, we want more checking, in production we sometimes care more
- @cindex @code{catch} in @code{m: ... ;m}
+ for speed. Therefore, assertions can be turned off, i.e., the assertion
- When this method is executed, the receiver object is removed from the
+ becomes a comment. Depending on the importance of an assertion and the
- stack; you can access it with @code{this} (admittedly, in this
+ time it takes to check it, you may want to turn off some assertions and
- example the use of @code{m: ... ;m} offers no advantage). Note
+ keep others turned on. Gforth provides several levels of assertions for
- that I specify the stack effect for the whole method (i.e. including
+ this purpose:
- the receiver object), not just for the code between @code{m:}
- and @code{;m}. You cannot use @code{exit} in
- @code{m:...;m}; instead, use
- @code{exitm}.@footnote{Moreover, for any word that calls
- @code{catch} and was defined before loading
- @code{objects.fs}, you have to redefine it like I redefined
- @code{catch}: @code{: catch this >r catch r> to-this ;}}
- @cindex @code{inst-var} usage
+ doc-assert0(
- You will frequently use sequences of the form @code{this
+ doc-assert1(
- @emph{field}} (in the example above: @code{this
+ doc-assert2(
- circle-radius}). If you use the field only in this way, you can
+ doc-assert3(
- define it with @code{inst-var} and eliminate the
+ doc-assert(
- @code{this} before the field name. E.g., the @code{circle}
+ doc-)
- class above could also be defined with:
- @example
+ The variable @code{assert-level} specifies the highest assertions that
- graphical class
+ are turned on. I.e., at the default @code{assert-level} of one,
-   cell% inst-var radius
+ @code{assert0(} and @code{assert1(} assertions perform checking, while
+ @code{assert2(} and @code{assert3(} assertions are treated as comments.
+ The value of @code{assert-level} is evaluated at compile-time, not at
+ run-time. Therefore you cannot turn assertions on or off at run-time;
+ you have to set the @code{assert-level} appropriately before compiling a
+ piece of code. You can compile different pieces of code at different
+ @code{assert-level}s (e.g., a trusted library at level 1 and
+ newly-written code at level 3).
- m: ( x y circle -- )
+ doc-assert-level
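+ The compile-time nature of @code{assert-level} can be sketched as
+ follows; the word @code{safe/} is made up for this illustration:
+ @example
+ 3 assert-level !         \ compile the following code with all assertions
+ : safe/ ( n1 n2 -- n3 )  \ hypothetical word
+   assert3( dup 0<> )     \ the divisor must not be zero
+   / ;
+ 1 assert-level !         \ back to the default level
+ @end example
+ Because the level was 3 when @code{safe/} was compiled, its assertion
+ stays active even after @code{assert-level} has been reset to 1.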
-   radius @@ draw-circle ;m
- overrides draw
- m: ( n-radius circle -- )
+ If an assertion fails, a message compatible with Emacs' compilation mode
-   radius ! ;m
+ is produced and the execution is aborted (currently with @code{ABORT"}.
- overrides construct
+ If there is interest, we will introduce a special throw code. But if you
+ intend to @code{catch} a specific condition, using @code{throw} is
+ probably more appropriate than an assertion).
- end-class circle
+ Definitions in ANS Forth for these assertion words are provided
- @end example
+ in @file{compat/assert.fs}.
- @code{radius} can only be used in @code{circle} and its
- descendent classes and inside @code{m:...;m}.
- @cindex @code{inst-value} usage
+ @node Singlestep Debugger, , Assertions, Programming Tools
- You can also define fields with @code{inst-value}, which is
+ @subsection Singlestep Debugger
- to @code{inst-var} what @code{value} is to
+ @cindex singlestep Debugger
- @code{variable}.  You can change the value of such a field with
+ @cindex debugging Singlestep
- @code{[to-inst]}.  E.g., we could also define the class
+ @cindex @code{dbg}
- @code{circle} like this:
+ @cindex @code{BREAK:}
+ @cindex @code{BREAK"}
- @example
+ When you create a new word there's often the need to check whether it
- graphical class
+ behaves correctly or not. You can do this by typing @code{dbg
-   inst-value radius
+ badword}. A debug session might look like this:
- m: ( x y circle -- )
+ @example
-   radius draw-circle ;m
+ : badword 0 DO i . LOOP ;  ok
- overrides draw
+ 2 dbg badword
+ : badword
+ Scanning code...
- m: ( n-radius circle -- )
+ Nesting debugger ready!
-   [to-inst] radius ;m
- overrides construct
- end-class circle
+D4738  8049BC4 0              -> [ 2 ] 00002 00000
+D4740  8049F68 DO             -> [ 0 ]
+D4744  804A0C8 i              -> [ 1 ] 00000
+D4748 400C5E60 .              -> 0 [ 0 ]
+D474C  8049D0C LOOP           -> [ 0 ]
+D4744  804A0C8 i              -> [ 1 ] 00001
+D4748 400C5E60 .              -> 1 [ 0 ]
+D474C  8049D0C LOOP           -> [ 0 ]
+D4758  804B384 ;              ->  ok
  @end example
+ Each line displayed is one step. You always have to hit return to
+ execute the next word that is displayed. If you want to step into the
+ next word instead of executing it as a whole, type @kbd{n} for
+ @code{nest}. Here is an overview of the keys that are available:
- @node Classes and Scoping, Object Interfaces, Method conveniences, Objects
+ @table @i
- @subsubsection Classes and Scoping
- @cindex classes and scoping
- @cindex scoping and classes
- Inheritance is frequent, unlike structure extension. This exacerbates
+ @item <return>
- the problem with the field name convention (@pxref{Structure Naming
+ Next; Execute the next word.
- Convention}): One always has to remember in which class the field was
- originally defined; changing a part of the class structure would require
- changes for renaming in otherwise unaffected code.
- @cindex @code{inst-var} visibility
+ @item n
- @cindex @code{inst-value} visibility
+ Nest; Single step through next word.
- To solve this problem, I added a scoping mechanism (which was not in my
- original charter): A field defined with @code{inst-var} (or
- @code{inst-value}) is visible only in the class where it is defined and in
- the descendent classes of this class.  Using such fields only makes
- sense in @code{m:}-defined methods in these classes anyway.
- This scoping mechanism allows us to use the unadorned field name,
+ @item u
- because name clashes with unrelated words become much less likely.
+ Unnest; Stop debugging and execute rest of word. If we got to this word
+ with nest, continue debugging with the calling word.
- @cindex @code{protected} discussion
+ @item d
- @cindex @code{private} discussion
+ Done; Stop debugging and execute rest.
- Once we have this mechanism, we can also use it for controlling the
- visibility of other words: All words defined after
- @code{protected} are visible only in the current class and its
- descendents. @code{public} restores the compilation
- (i.e. @code{current}) word list that was in effect before. If you
- have several @code{protected}s without an intervening
- @code{public} or @code{set-current}, @code{public}
- will restore the compilation word list in effect before the first of
- these @code{protected}s.
- @node Object Interfaces, Objects Implementation, Classes and Scoping, Objects
+ @item s
- @subsubsection Object Interfaces
+ Stop; Abort immediately.
- @cindex object interfaces
- @cindex interfaces for objects
- In this model you can only call selectors defined in the class of the
+ @end table
- receiving objects or in one of its ancestors. If you call a selector
- with a receiving object that is not in one of these classes, the
- result is undefined; if you are lucky, the program crashes
- immediately.
- @cindex selectors common to hardly-related classes
+ Debugging large applications with this mechanism is very difficult, because
- Now consider the case when you want to have a selector (or several)
+ you have to nest very deeply into the program before the interesting part
- available in two classes: You would have to add the selector to a
+ begins. This takes a lot of time.
- common ancestor class, in the worst case to @code{object}. You
- may not want to do this, e.g., because someone else is responsible for
- this ancestor class.
- The solution for this problem is interfaces. An interface is a
+ To do it more directly, put a @code{BREAK:} command into your source code.
- collection of selectors. If a class implements an interface, the
+ When program execution reaches @code{BREAK:} the single step debugger is
- selectors become available to the class and its descendents. A class
+ invoked and you have all the features described above.
- can implement an unlimited number of interfaces. For the problem
- discussed above, we would define an interface for the selector(s), and
- both classes would implement the interface.
- As an example, consider an interface @code{storage} for
+ If you have more than one part to debug, it is useful to know where the
- writing objects to disk and getting them back, and a class
+ program has stopped. You can do this with the
- @code{foo} that implements it. The code would look like this:
+ @code{BREAK" string"} command. This behaves like @code{BREAK:} except that
+ @var{string} is typed out when the ``breakpoint'' is reached.
- @cindex @code{interface} usage
+ doc-dbg
- @cindex @code{end-interface} usage
+ doc-BREAK:
- @cindex @code{implementation} usage
+ doc-BREAK"
- @example
- interface
-   selector write ( file object -- )
-   selector read1 ( file object -- )
- end-interface storage
- bar class
-   storage implementation
- ... overrides write
+ @c -------------------------------------------------------------
- ... overrides read
+ @node Assembler and Code Words, Threading Words, Programming Tools, Words
- ...
+ @section Assembler and Code Words
- end-class foo
+ @cindex assembler
- @end example
+ @cindex code words
- @noindent
+ Gforth provides some words for defining primitives (words written in
- (I would add a word @code{read} @var{( file -- object )} that uses
+ machine code), and for defining the machine-code equivalent of
- @code{read1} internally, but that's beyond the point illustrated
+ @code{DOES>}-based defining words. However, the machine-independent
- here.)
+ nature of Gforth poses a few problems: First of all, Gforth runs on
+ several architectures, so it can provide no standard assembler. What's
+ worse is that the register allocation not only depends on the processor,
+ but also on the @code{gcc} version and options used.
- Note that you cannot use @code{protected} in an interface; and
+ The words that Gforth offers encapsulate some system dependences (e.g., the
- of course you cannot define fields.
+ header structure), so a system-independent assembler may be used in
+ Gforth. If you do not have an assembler, you can compile machine code
+ directly with @code{,} and @code{c,}.
- In the Neon model, all selectors are available for all classes;
+ doc-assembler
- therefore it does not need interfaces. The price you pay in this model
+ doc-code
- is slower late binding, and therefore, added complexity to avoid late
+ doc-end-code
- binding.
+ doc-;code
+ doc-flush-icache
- @node Objects Implementation, Objects Glossary, Object Interfaces, Objects
+ If @code{flush-icache} does not work correctly, @code{code} words
- @subsubsection @file{objects.fs} Implementation
+ etc. will not work (reliably), either.
- @cindex @file{objects.fs} implementation
- @cindex @code{object-map} discussion
+ @code{flush-icache} is always present. The other words are rarely used
- An object is a piece of memory, like one of the data structures
+ and reside in @code{code.fs}, which is usually not loaded. You can load
- described with @code{struct...end-struct}. It has a field
+ it with @code{require code.fs}.
- @code{object-map} that points to the method map for the object's
- class.
- @cindex method map
+ @cindex registers of the inner interpreter
- @cindex virtual function table
+ In the assembly code you will want to refer to the inner interpreter's
- The @emph{method map}@footnote{This is Self terminology; in C++
+ registers (e.g., the data stack pointer) and you may want to use other
- terminology: virtual function table.} is an array that contains the
+ registers for temporary storage. Unfortunately, the register allocation
- execution tokens (@var{xt}s) of the methods for the object's class. Each
+ is installation-dependent.
- selector contains an offset into a method map.
- @cindex @code{selector} implementation, class
+ The easiest solution is to use explicit register declarations
- @code{selector} is a defining word that uses
+ (@pxref{Explicit Reg Vars, , Variables in Specified Registers, gcc.info,
- @code{CREATE} and @code{DOES>}. The body of the
+ GNU C Manual}) for all of the inner interpreter's registers: You have to
- selector contains the offset; the @code{does>} action for a
+ compile Gforth with @code{-DFORCE_REG} (configure option
- class selector is, basically:
+ @code{--enable-force-reg}) and the appropriate declarations must be
+ present in the @code{machine.h} file (see @code{mips.h} for an example;
+ you can find a full list of all declarable register symbols with
+ @code{grep register engine.c}). If you give explicit registers to all
+ variables that are declared at the beginning of @code{engine()}, you
+ should be able to use the other caller-saved registers for temporary
+ storage. Alternatively, you can use the @code{gcc} option
+ @code{-ffixed-REG} (@pxref{Code Gen Options, , Options for Code
+ Generation Conventions, gcc.info, GNU C Manual}) to reserve a register
+ (however, this restriction on register allocation may slow Gforth
+ significantly).
- @example
+ If this solution is not viable (e.g., because @code{gcc} does not allow
- ( object addr ) @@ over object-map @@ + @@ execute
+ you to explicitly declare all the registers you need), you have to find
- @end example
+ out by looking at the code where the inner interpreter's registers
+ reside and which registers can be used for temporary storage. You can
+ get an assembly listing of the engine's code with @code{make engine.s}.
- Since @code{object-map} is the first field of the object, it
+ In any case, it is good practice to abstract your assembly code from the
- does not generate any code. As you can see, calling a selector has a
+ actual register allocation. E.g., if the data stack pointer resides in
- small, constant cost.
+ register @code{$17}, create an alias for this register called @code{sp},
+ and use that in your assembly code.
- @cindex @code{current-interface} discussion
+ @cindex code words, portable
- @cindex class implementation and representation
+ Another option for implementing normal and defining words efficiently
- A class is basically a @code{struct} combined with a method
+ is to add the desired functionality to the source of Gforth. For normal
- map. During the class definition the alignment and size of the class
+ words you just have to edit @file{primitives} (@pxref{Automatic
- are passed on the stack, just as with @code{struct}s, so
+ Generation}). Defining words (equivalent to @code{;CODE} words, for fast
- @code{field} can also be used for defining class
+ defined words) may require changes in @file{engine.c}, @file{kernel.fs},
- fields. However, passing more items on the stack would be
+ @file{prims2x.fs}, and possibly @file{cross.fs}.
- inconvenient, so @code{class} builds a data structure in memory,
- which is accessed through the variable
- @code{current-interface}. After its definition is complete, the
- class is represented on the stack by a pointer (e.g., as parameter for
- a child class definition).
- A new class starts off with the alignment and size of its parent,
- and a copy of the parent's method map. Defining new fields extends the
- size and alignment; likewise, defining new selectors extends the
- method map. @code{overrides} just stores a new @var{xt} in the method
- map at the offset given by the selector.
- @cindex class binding, implementation
+ @c -------------------------------------------------------------
- Class binding just gets the @var{xt} at the offset given by the selector
+ @node Threading Words, Locals, Assembler and Code Words, Words
- from the class's method map and @code{compile,}s (in the case of
+ @section Threading Words
- @code{[bind]}) it.
+ @cindex threading words
- @cindex @code{this} implementation
+ @cindex code address
- @cindex @code{catch} and @code{this}
+ These words provide access to code addresses and other threading stuff
- @cindex @code{this} and @code{catch}
+ in Gforth (and, possibly, other interpretive Forths). They more or less
- I implemented @code{this} as a @code{value}. At the
+ abstract away the differences between direct and indirect threading
- start of an @code{m:...;m} method the old @code{this} is
+ (and, for direct threading, the machine dependences). However, at
- stored to the return stack and restored at the end; and the object on
+ present this wordset is still incomplete. It is also pretty low-level;
- the TOS is stored @code{TO this}. This technique has one
+ some day it will hopefully be made unnecessary by an internals wordset
- disadvantage: If the user does not leave the method via
+ that abstracts implementation details away completely.
- @code{;m}, but via @code{throw} or @code{exit},
- @code{this} is not restored (and @code{exit} may
- crash). To deal with the @code{throw} problem, I have redefined
- @code{catch} to save and restore @code{this}; the same
- should be done with any word that can catch an exception. As for
- @code{exit}, I simply forbid it (as a replacement, there is
- @code{exitm}).
- @cindex @code{inst-var} implementation
+ doc-threading-method
- @code{inst-var} is just the same as @code{field}, with
+ doc->code-address
- a different @code{does>} action:
+ doc->does-code
- @example
+ doc-code-address!
- @@ this +
+ doc-does-code!
- @end example
+ doc-does-handler!
- Similar for @code{inst-value}.
+ doc-/does-handler
- @cindex class scoping implementation
+ The code addresses produced by various defining words are produced by
- Each class also has a word list that contains the words defined with
+ the following words:
- @code{inst-var} and @code{inst-value}, and its protected
- words. It also has a pointer to its parent. @code{class} pushes
- the word lists of the class and all its ancestors onto the search order stack,
- and @code{end-class} drops them.
- @cindex interface implementation
+ doc-docol:
- An interface is like a class without fields, parent and protected
+ doc-docon:
- words; i.e., it just has a method map. If a class implements an
+ doc-dovar:
- interface, its method map contains a pointer to the method map of the
+ doc-douser:
- interface. The positive offsets in the map are reserved for class
+ doc-dodefer:
- methods, therefore interface map pointers have negative
+ doc-dofield:
- offsets. Interfaces have offsets that are unique throughout the
- system, unlike class selectors, whose offsets are only unique for the
- classes where the selector is available (invokable).
- This structure means that interface selectors have to perform one
+ You can recognize words defined by a @code{CREATE}...@code{DOES>} word
- indirection more than class selectors to find their method. Their body
+ with @code{>does-code}. If the word was defined in that way, the value
- contains the interface map pointer offset in the class method map, and
+ returned is non-zero and identifies the @code{DOES>} used by the
- the method offset in the interface method map. The
+ defining word.
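+ A small sketch of this test, assuming the behaviour described above
+ (@code{constant-like} and @code{seven} are hypothetical names chosen for
+ illustration):
+ @example
+ : constant-like ( n "name" -- )  CREATE , DOES> @@ ;
+ 7 constant-like seven
+ \ flag: was the word built by a CREATE...DOES> defining word?
+ ' seven >does-code 0<> .
+ ' dup   >does-code 0<> .
+ @end example
+ According to the description, the first flag should be true and the
+ second false, since @code{dup} is not a child of a @code{DOES>} word.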
- @code{does>} action for an interface selector is, basically:
+ @comment TODO should that be ``identifies the xt of the DOES> ??''
- @example
+ @c -------------------------------------------------------------
- ( object selector-body )
+ @node Locals, Structures, Threading Words, Words
- 2dup selector-interface @@ ( object selector-body object interface-offset )
+ @section Locals
- swap object-map @@ + @@ ( object selector-body map )
+ @cindex locals
- swap selector-offset @@ + @@ execute
- @end example
- where @code{object-map} and @code{selector-offset} are
+ Local variables can make Forth programming more enjoyable and Forth
- first fields and generate no code.
+ programs easier to read. Unfortunately, the locals of ANS Forth are
+ laden with restrictions. Therefore, we provide not only the ANS Forth
+ locals wordset, but also our own, more powerful locals wordset (we
+ implemented the ANS Forth locals wordset through our locals wordset).
- As a concrete example, consider the following code:
+ The ideas in this section have also been published in the paper
+ @cite{Automatic Scoping of Local Variables} by M. Anton Ertl, presented
+ at EuroForth '94; it is available at
+ @*@url{http://www.complang.tuwien.ac.at/papers/ertl94l.ps.gz}.
- @example
+ @menu
- interface
+ * Gforth locals::
-   selector if1sel1
+ * ANS Forth locals::
-   selector if1sel2
+ @end menu
- end-interface if1
- object class
+ @node Gforth locals, ANS Forth locals, Locals, Locals
-   if1 implementation
+ @subsection Gforth locals
-   selector cl1sel1
+ @cindex Gforth locals
-   cell% inst-var cl1iv1
+ @cindex locals, Gforth style
- ' m1 overrides construct
+ Locals can be defined with
- ' m2 overrides if1sel1
- ' m3 overrides if1sel2
- ' m4 overrides cl1sel2
- end-class cl1
- create obj1 object dict-new drop
+ @example
- create obj2 cl1    dict-new drop
+ @{ local1 local2 ... -- comment @}
+ @end example
+ or
+ @example
+ @{ local1 local2 ... @}
  @end example
- The data structure created by this code (including the data structure
+ E.g.,
- for @code{object}) is shown in the <a
+ @example
- href="objects-implementation.eps">figure</a>, assuming a cell size of 4.
+ : max @{ n1 n2 -- n3 @}
- @comment nac TODO add this diagram..
+  n1 n2 > if
+    n1
+  else
+    n2
+  endif ;
+ @end example
- @node Objects Glossary,  , Objects Implementation, Objects
+ The similarity of locals definitions with stack comments is intended. A
- @subsubsection @file{objects.fs} Glossary
+ locals definition often replaces the stack comment of a word. The order
- @cindex @file{objects.fs} Glossary
+ of the locals corresponds to the order in a stack comment and everything
+ after the @code{--} is really a comment.
- doc---objects-bind
+ This similarity has one disadvantage: It is too easy to confuse locals
- doc---objects-<bind>
+ declarations with stack comments, causing bugs and making them hard to
- doc---objects-bind'
+ find. However, this problem can be avoided by appropriate coding
- doc---objects-[bind]
+ conventions: Do not use both notations in the same program. If you do,
- doc---objects-class
+ they should be distinguished using additional means, e.g. by position.
- doc---objects-class->map
- doc---objects-class-inst-size
- doc---objects-class-override!
- doc---objects-construct
- doc---objects-current'
- doc---objects-[current]
- doc---objects-current-interface
- doc---objects-dict-new
- doc---objects-drop-order
- doc---objects-end-class
- doc---objects-end-class-noname
- doc---objects-end-interface
- doc---objects-end-interface-noname
- doc---objects-exitm
- doc---objects-heap-new
- doc---objects-implementation
- doc---objects-init-object
- doc---objects-inst-value
- doc---objects-inst-var
- doc---objects-interface
- doc---objects-;m
- doc---objects-m:
- doc---objects-method
- doc---objects-object
- doc---objects-overrides
- doc---objects-[parent]
- doc---objects-print
- doc---objects-protected
- doc---objects-public
- doc---objects-push-order
- doc---objects-selector
- doc---objects-this
- doc---objects-<to-inst>
- doc---objects-[to-inst]
- doc---objects-to-this
- doc---objects-xt-new
- @c -------------------------------------------------------------
+ @cindex types of locals
- @node OOF, Mini-OOF, Objects, Object-oriented Forth
+ @cindex locals types
- @subsection The @file{oof.fs} model
+ The name of the local may be preceded by a type specifier, e.g.,
- @cindex oof
+ @code{F:} for a floating point value:
- @cindex object-oriented programming
- @cindex @file{objects.fs}
+ @example
- @cindex @file{oof.fs}
+ : CX* @{ F: Ar F: Ai F: Br F: Bi -- Cr Ci @}
+ \ complex multiplication
+  Ar Br f* Ai Bi f* f-
+  Ar Bi f* Ai Br f* f+ ;
+ @end example
+ @cindex flavours of locals
+ @cindex locals flavours
+ @cindex value-flavoured locals
+ @cindex variable-flavoured locals
+ Gforth currently supports cells (@code{W:}, @code{W^}), doubles
+ (@code{D:}, @code{D^}), floats (@code{F:}, @code{F^}) and characters
+ (@code{C:}, @code{C^}) in two flavours: a value-flavoured local (defined
+ with @code{W:}, @code{D:} etc.) produces its value and can be changed
+ with @code{TO}. A variable-flavoured local (defined with @code{W^} etc.)
+ produces its address (which becomes invalid when the variable's scope is
+ left). E.g., the standard word @code{emit} can be defined in terms of
+ @code{type} like this:
- This section describes the @file{oof.fs} package.
+ @example
+ : emit @{ C^ char* -- @}
+     char* 1 type ;
+ @end example
- The package described in this section has been used in bigFORTH since 1991, and
+ @cindex default type of locals
- used for two large applications: a chromatographic system used to
+ @cindex locals, default type
- create new medicaments, and a graphic user interface library (MINOS).
+ A local without type specifier is a @code{W:} local. Both flavours of
+ locals are initialized with values from the data or FP stack.
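+ The two flavours can be contrasted with a short sketch (the words
+ @code{bump-value} and @code{bump-var} are hypothetical and only serve as
+ an illustration):
+ @example
+ : bump-value @{ W: n -- n2 @}
+   n 1+ TO n        \ value-flavoured: n pushes its value, TO changes it
+   n ;
+ : bump-var @{ W^ n -- n2 @}
+   1 n +!           \ variable-flavoured: n pushes its address
+   n @@ ;
+ @end example
+ Both words are intended to return their argument incremented by one;
+ only the way the local is accessed differs.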
- You can find a description (in German) of @file{oof.fs} in @cite{Object
+ Currently there is no way to define locals with user-defined data
- oriented bigFORTH} by Bernd Paysan, published in @cite{Vierte Dimension}
+ structures, but we are working on it.
-(2), 1994.
+ Gforth allows defining locals everywhere in a colon definition. This
+ poses the following questions:
  @menu
- * Properties of the OOF model::
+ * Where are locals visible by name?::
- * Basic OOF Usage::
+ * How long do locals live?::
- * The OOF base class::
+ * Programming Style::
- * Class Declaration::
+ * Implementation::
- * Class Implementation::
  @end menu
- @node Properties of the OOF model, Basic OOF Usage, OOF, OOF
+ @node Where are locals visible by name?, How long do locals live?, Gforth locals, Gforth locals
- @subsubsection Properties of the @file{oof.fs} model
+ @subsubsection Where are locals visible by name?
- @cindex @file{oof.fs} properties
+ @cindex locals visibility
+ @cindex visibility of locals
+ @cindex scope of locals
- @itemize @bullet
+ Basically, the answer is that locals are visible where you would expect
- @item
+ them to be in block-structured languages, and sometimes a little longer. If you
- This model combines object oriented programming with information
+ want to restrict the scope of a local, enclose its definition in
- hiding. It helps you write large applications, where scoping is
+ @code{SCOPE}...@code{ENDSCOPE}.
- necessary, because it provides class-oriented scoping.
- @item
+ doc-scope
- Named objects, object pointers, and object arrays can be created;
+ doc-endscope
- selector invocation uses the "object selector" syntax. Selector invocation
- to objects and/or selectors on the stack is a bit less convenient, but
- possible.
- @item
+ These words behave like control structure words, so you can use them
- Selector invocation and instance variable usage of the active object is
+ with @code{CS-PICK} and @code{CS-ROLL} to restrict the scope in
- straightforward, since both make use of the active object.
+ arbitrary ways.
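+ A minimal sketch of explicit scoping (the word @code{sum-squared} is a
+ hypothetical example, not part of Gforth):
+ @example
+ : sum-squared ( n1 n2 -- n3 )
+   SCOPE
+     @{ a b @}
+     a b +
+   ENDSCOPE
+   \ a and b are no longer visible from here on
+   dup * ;
+ @end example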
- @item
+ If you want a more exact answer to the visibility question, here's the
- Late binding is efficient and easy to use.
+ basic principle: A local is visible in all places that can only be
+ reached through the definition of the local@footnote{In compiler
+ construction terminology, all places dominated by the definition of the
+ local.}. In other words, it is not visible in places that can be reached
+ without going through the definition of the local. E.g., locals defined
+ in @code{IF}...@code{ENDIF} are visible until the @code{ENDIF}, locals
+ defined in @code{BEGIN}...@code{UNTIL} are visible after the
+ @code{UNTIL} (until, e.g., a subsequent @code{ENDSCOPE}).
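+ The two cases just mentioned can be sketched as follows (the words
+ @code{show-abs} and @code{count-to-3} are hypothetical illustrations of
+ the rule, assuming the visibility behaviour described above):
+ @example
+ : show-abs ( n -- )
+   dup 0< IF
+     negate @{ m @}    \ m is visible only within this branch
+     m .
+   ELSE
+     .                 \ m is not visible here
+   ENDIF ;
+ : count-to-3 ( -- )
+   0 BEGIN
+     1+ dup @{ k @}
+     k 3 >=
+   UNTIL
+   drop
+   k . ;               \ k is still visible after the UNTIL
+ @end example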
- @item
+ The reasoning behind this solution is: We want to have the locals
- State-smart objects parse selectors. However, extensibility is provided
+ visible as long as it is meaningful. The user can always make the
- using a (parsing) selector @code{postpone} and a selector @code{'}.
+ visibility shorter by using explicit scoping. In a place that can
+ only be reached through the definition of a local, the meaning of a
+ local name is clear. In other places it is not: How is the local
+ initialized at the control flow path that does not contain the
+ definition? Which local is meant, if the same name is defined twice in
+ two independent control flow paths?
- @item
+ This should be enough detail for nearly all users, so you can skip the
- An implementation in ANS Forth is available.
+ rest of this section. If you really must know all the gory details and
+ options, read on.
- @end itemize
+ In order to implement this rule, the compiler has to know which places
+ are unreachable. It knows this automatically after @code{AHEAD},
+ @code{AGAIN}, @code{EXIT} and @code{LEAVE}; in other cases (e.g., after
+ most @code{THROW}s), you can use the word @code{UNREACHABLE} to tell the
+ compiler that the control flow never reaches that place. If
+ @code{UNREACHABLE} is not used where it could, the only consequence is
+ that the visibility of some locals is more limited than the rule above
+ says. If @code{UNREACHABLE} is used where it should not (i.e., if you
+ lie to the compiler), buggy code will be produced.
+ doc-unreachable
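+ As a sketch of how @code{UNREACHABLE} might be used after a
+ @code{THROW} (the word @code{positive-only} is hypothetical, and the
+ throw code -24 is the standard code for an invalid numeric argument):
+ @example
+ : positive-only ( n -- n )
+   dup 0> IF
+     @{ n @}
+   ELSE
+     -24 throw UNREACHABLE   \ control flow never continues past here
+   ENDIF
+   n ;                       \ reachable only through the definition of n
+ @end example
+ Without the @code{UNREACHABLE}, the compiler would have to assume that
+ the @code{ELSE} branch can reach the @code{ENDIF}, so @code{n} would not
+ be visible afterwards.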
- @node Basic OOF Usage, The OOF base class, Properties of the OOF model, OOF
+ Another problem with this rule is that at @code{BEGIN}, the compiler
- @subsubsection Basic @file{oof.fs} Usage
+ does not know which locals will be visible on the incoming
- @cindex @file{oof.fs} usage
+ back-edge. All problems discussed in the following are due to this
+ ignorance of the compiler (we discuss the problems using @code{BEGIN}
+ loops as examples; the discussion also applies to @code{?DO} and other
+ loops). Perhaps the most insidious example is:
+ @example
+ AHEAD
+ BEGIN
+   x
+ [ 1 CS-ROLL ] THEN
+   @{ x @}
+   ...
+ UNTIL
+ @end example
- This section uses the same example as for @code{objects} (@pxref{Basic Objects Usage}).
+ This should be legal according to the visibility rule. The use of
+ @code{x} can only be reached through the definition; but that appears
+ textually below the use.
- You can define a class for graphical objects like this:
+ From this example it is clear that the visibility rules cannot be fully
+ implemented without major headaches. Our implementation treats common
+ cases as advertised and the exceptions are treated in a safe way: The
+ compiler makes a reasonable guess about the locals visible after a
+ @code{BEGIN}; if it is too pessimistic, the
+ user will get a spurious error about the local not being defined; if the
+ compiler is too optimistic, it will notice this later and issue a
+ warning. In the case above the compiler would complain about @code{x}
+ being undefined at its use. You can see from the obscure examples in
+ this section that it takes quite unusual control structures to get the
+ compiler into trouble, and even then it will often do fine.
- @cindex @code{class} usage
+ If the @code{BEGIN} is reachable from above, the most optimistic guess
- @cindex @code{class;} usage
+ is that all locals visible before the @code{BEGIN} will also be
- @cindex @code{method} usage
+ visible after the @code{BEGIN}. This guess is valid for all loops that
+ are entered only through the @code{BEGIN}, in particular, for normal
+ @code{BEGIN}...@code{WHILE}...@code{REPEAT} and
+ @code{BEGIN}...@code{UNTIL} loops and it is implemented in our
+ compiler. When the branch to the @code{BEGIN} is finally generated by
+ @code{AGAIN} or @code{UNTIL}, the compiler checks the guess and
+ warns the user if it was too optimistic:
  @example
- object class graphical \ "object" is the parent class
+ IF
-   method draw ( x y graphical -- )
+   @{ x @}
- class;
+ BEGIN
+   \ x ?
+ [ 1 cs-roll ] THEN
+   ...
+ UNTIL
  @end example
- This code defines a class @code{graphical} with an
+ Here, @code{x} lives only until the @code{BEGIN}, but the compiler
- operation @code{draw}.  We can perform the operation
+ optimistically assumes that it lives until the @code{THEN}. It notices
- @code{draw} on any @code{graphical} object, e.g.:
+ this difference when it compiles the @code{UNTIL} and issues a
+ warning. The user can avoid the warning, and make sure that @code{x}
+ is not used in the wrong area by using explicit scoping:
  @example
- 100 100 t-rex draw
+ IF
+   SCOPE
+   @{ x @}
+   ENDSCOPE
+ BEGIN
+ [ 1 cs-roll ] THEN
+   ...
+ UNTIL
  @end example
- @noindent
+ Since the guess is optimistic, there will be no spurious error messages
- where @code{t-rex} is an object or object pointer, created with e.g.
+ about undefined locals.
- @code{graphical : t-rex}.
- @cindex abstract class
+ If the @code{BEGIN} is not reachable from above (e.g., after
- How do we create a graphical object? With the present definitions,
+ @code{AHEAD} or @code{EXIT}), the compiler cannot even make an
- we cannot create a useful graphical object. The class
+ optimistic guess, as the locals visible after the @code{BEGIN} may be
- @code{graphical} describes graphical objects in general, but not
+ defined later. Therefore, the compiler assumes that no locals are
- any concrete graphical object type (C++ users would call it an
+ visible after the @code{BEGIN}. However, the user can use
- @emph{abstract class}); e.g., there is no method for the selector
+ @code{ASSUME-LIVE} to make the compiler assume that the same locals are
- @code{draw} in the class @code{graphical}.
+ visible at the @code{BEGIN} as at the point where the top control-flow stack
+ item was created.
- For concrete graphical objects, we define child classes of the
+ doc-assume-live
- class @code{graphical}, e.g.:
+ E.g.,
  @example
- graphical class circle \ "graphical" is the parent class
+ @{ x @}
-   cell var circle-radius
+ AHEAD
- how:
+ ASSUME-LIVE
-   : draw ( x y -- )
+ BEGIN
-     circle-radius @@ draw-circle ;
+   x
+ [ 1 CS-ROLL ] THEN
-   : init ( n-radius -- )
+   ...
-     circle-radius ! ;
+ UNTIL
- class;
  @end example
- Here we define a class @code{circle} as a child of @code{graphical},
+ Other cases where the locals are defined before the @code{BEGIN} can be
- with a field @code{circle-radius}; it defines new methods for the
+ handled by inserting an appropriate @code{CS-ROLL} before the
- selectors @code{draw} and @code{init} (@code{init} is defined in
+ @code{ASSUME-LIVE} (and changing the control-flow stack manipulation
- @code{object}, the parent class of @code{graphical}).
+ behind the @code{ASSUME-LIVE}).
- Now we can create a circle in the dictionary with
+ Cases where locals are defined after the @code{BEGIN} (but should be
+ visible immediately after the @code{BEGIN}) can only be handled by
+ rearranging the loop. E.g., the ``most insidious'' example above can be
+ arranged into:
  @example
- 50 circle : my-circle
+ BEGIN
+   @{ x @}
+   ... 0=
+ WHILE
+   x
+ REPEAT
  @end example
- @noindent
+ @node How long do locals live?, Programming Style, Where are locals visible by name?, Gforth locals
- @code{:} invokes @code{init}, thus initializing the field
+ @subsubsection How long do locals live?
- @code{circle-radius} with 50. We can draw this new circle at (100,100)
+ @cindex locals lifetime
- with:
+ @cindex lifetime of locals
- @example
+ The right answer for the lifetime question would be: A local lives at
- 100 100 my-circle draw
+ least as long as it can be accessed. For a value-flavoured local this
- @end example
+ means: until the end of its visibility. However, a variable-flavoured
+ local could be accessed through its address far beyond its visibility
+ scope. Ultimately, this would mean that such locals would have to be
+ garbage collected. Since this entails un-Forth-like implementation
+ complexities, I adopted the same cowardly solution as some other
+ languages (e.g., C): The local lives only as long as it is visible;
+ afterwards its address is invalid (and programs that access it
+ afterwards are erroneous).
- @cindex selector invocation, restrictions
+ @node Programming Style, Implementation, How long do locals live?, Gforth locals
- @cindex class definition, restrictions
+ @subsubsection Programming Style
- Note: You can only invoke a selector if the receiving object belongs to
+ @cindex locals programming style
- the class where the selector was defined or one of its descendents;
+ @cindex programming style, locals
- e.g., you can invoke @code{draw} only for objects belonging to
- @code{graphical} or its descendents (e.g., @code{circle}). The scoping
- mechanism will check if you try to invoke a selector that is not
- defined in this class hierarchy, so you'll get an error at compilation
- time.
+ The freedom to define locals anywhere has the potential to change
+ programming styles dramatically. In particular, the need to use the
+ return stack for intermediate storage vanishes. Moreover, all stack
+ manipulations (except @code{PICK}s and @code{ROLL}s with run-time
+ determined arguments) can be eliminated: If the stack items are in the
+ wrong order, just write a locals definition for all of them; then
+ write the items in the order you want.
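+ For instance, a reordering that would otherwise need a stack
+ manipulation word can be written as a locals definition followed by the
+ items in the desired order (@code{reorder} is a hypothetical word used
+ only for illustration):
+ @example
+ : reorder @{ a b c -- b c a @}
+   b c a ;
+ @end example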
- @node The OOF base class, Class Declaration, Basic OOF Usage, OOF
+ This seems a little far-fetched and eliminating stack manipulations is
- @subsubsection The @file{oof.fs} base class
+ unlikely to become a conscious programming objective. Still, the number
- @cindex @file{oof.fs} base class
+ of stack manipulations will be reduced dramatically if local variables
+ are used liberally (e.g., compare @code{max} in @ref{Gforth locals} with
+ a traditional implementation of @code{max}).
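+ For comparison, one possible traditional definition without locals is
+ sketched below (named @code{max-trad} here, a hypothetical name chosen
+ to avoid redefining the standard word):
+ @example
+ : max-trad ( n1 n2 -- n3 )
+   2dup < if
+     swap
+   then
+   drop ;
+ @end example
+ The reader has to track the stack contents mentally through
+ @code{2dup}, @code{swap} and @code{drop}, whereas the locals version
+ names each operand.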
- When you define a class, you have to specify a parent class.  So how do
+ This shows one potential benefit of locals: making Forth programs more
- you start defining classes? There is one class available from the start:
+ readable. Of course, this benefit will only be realized if the
- @code{object}. You have to use it as ancestor for all classes. It is the
+ programmers continue to honour the principle of factoring instead of
- only class that has no parent. Classes are also objects, except that
+ using the added latitude to make the words longer.
- they don't have instance variables; class manipulation such as
- inheritance or changing definitions of a class is handled through
- selectors of the class @code{object}.
- @code{object} provides a number of selectors:
+ @cindex single-assignment style for locals
+ Using @code{TO} can and should be avoided.  Without @code{TO},
+ every value-flavoured local has only a single assignment and many
+ advantages of functional languages apply to Forth. I.e., programs are
+ easier to analyse, to optimize and to read: It is clear from the
+ definition what the local stands for, it does not turn into something
+ different later.
- @itemize @bullet
+ E.g., a definition using @code{TO} might look like this:
- @item
+ @example
- @code{class} for subclassing, @code{definitions} to add definitions
+ : strcmp @{ addr1 u1 addr2 u2 -- n @}
- later on, and @code{class?} to get type information (is the class a
+  u1 u2 min 0
- subclass of the class passed on the stack?).
+  ?do
- doc---object-class
+    addr1 c@@ addr2 c@@ -
- doc---object-definitions
+    ?dup-if
- doc---object-class?
+      unloop exit
+    then
+    addr1 char+ TO addr1
+    addr2 char+ TO addr2
+  loop
+  u1 u2 - ;
+ @end example
+ Here, @code{TO} is used to update @code{addr1} and @code{addr2} at
+ every loop iteration. @code{strcmp} is a typical example of the
+ readability problems of using @code{TO}. When you start reading
+ @code{strcmp}, you think that @code{addr1} refers to the start of the
+ string. Only near the end of the loop do you realize that it is something
+ else.
- @item
+ This can be avoided by defining two locals at the start of the loop that
- @code{init} and @code{dispose} as constructor and destructor of the
+ are initialized with the right value for the current iteration.
- object. @code{init} is invoked after the object's memory is allocated,
+ @example
- while @code{dispose} also handles deallocation. Thus if you redefine
+ : strcmp @{ addr1 u1 addr2 u2 -- n @}
- @code{dispose}, you have to call the parent's dispose with @code{super
+  addr1 addr2
- dispose}, too.
+  u1 u2 min 0
- doc---object-init
+  ?do @{ s1 s2 @}
- doc---object-dispose
+    s1 c@@ s2 c@@ -
+    ?dup-if
+      unloop exit
+    then
+    s1 char+ s2 char+
+  loop
+  2drop
+  u1 u2 - ;
+ @end example
+ Here it is clear from the start that @code{s1} has a different value
+ in every loop iteration.
- @item
+ @node Implementation,  , Programming Style, Gforth locals
- @code{new}, @code{new[]}, @code{:}, @code{ptr}, @code{asptr}, and
+ @subsubsection Implementation
- @code{[]} to create named and unnamed objects and object arrays or
+ @cindex locals implementation
- object pointers.
+ @cindex implementation of locals
- doc---object-new
- doc---object-new[]
- doc---object-:
- doc---object-ptr
- doc---object-asptr
- doc---object-[]
- @item
+ @cindex locals stack
- @code{::} and @code{super} for explicit scoping. You should use explicit
+ Gforth uses an extra locals stack. The most compelling reason for
- scoping only for super classes or classes with the same set of instance
+ this is that the return stack is not float-aligned; using an extra stack
- variables. Explicitly-scoped selectors use early binding.
+ also eliminates the problems and restrictions of using the return stack
- doc---object-::
+ as locals stack. Like the other stacks, the locals stack grows toward
- doc---object-super
+ lower addresses. A few primitives allow an efficient implementation:
- @item
+ doc-@local#
- @code{self} to get the address of the object
+ doc-f@local#
- doc---object-self
+ doc-laddr#
+ doc-lp+!#
+ doc-lp!
+ doc->l
+ doc-f>l
- @item
+ In addition to these primitives, some specializations of these
- @code{bind}, @code{bound}, @code{link}, and @code{is} to assign object
+ primitives for commonly occurring inline arguments are provided for
- pointers and instance defers.
+ efficiency reasons, e.g., @code{@@local0} as specialization of
- doc---object-bind
+ @code{@@local#} for the inline argument 0. The following compiling words
- doc---object-bound
+ compile the right specialized version, or the general version, as
- doc---object-link
+ appropriate:
- doc---object-is
- @item
+ doc-compile-@local
- @code{'} to obtain selector tokens, @code{send} to invoke selectors
+ doc-compile-f@local
- from the stack, and @code{postpone} to generate selector invocation code.
+ doc-compile-lp+!
- doc---object-'
- doc---object-postpone
- @item
+ Combinations of conditional branches and @code{lp+!#} like
- @code{with} and @code{endwith} to select the active object from the
+ @code{?branch-lp+!#} (the locals pointer is only changed if the branch
- stack, and enable its scope. Using @code{with} and @code{endwith}
+ is taken) are provided for efficiency and correctness in loops.
- also allows you to create code using selector @code{postpone} without being
- trapped by the state-smart objects.
- doc---object-with
- doc---object-endwith
- @end itemize
+ A special area in the dictionary space is reserved for keeping the
+ local variable names. @code{@{} switches the dictionary pointer to this
+ area and @code{@}} switches it back and generates the locals
+ initializing code. @code{W:} etc.@ are normal defining words. This
+ special area is cleared at the start of every colon definition.
- @node Class Declaration, Class Implementation, The OOF base class, OOF
+ @cindex word list for defining locals
- @subsubsection Class Declaration
+ A special feature of Gforth's dictionary is used to implement the
- @cindex class declaration
+ definition of locals without type specifiers: every word list (aka
+ vocabulary) has its own methods for searching
+ etc. (@pxref{Word Lists}). For the present purpose we defined a word list
+ with a special search method: When it is searched for a word, it
+ actually creates that word using @code{W:}. @code{@{} changes the search
+ order to first search the word list containing @code{@}}, @code{W:} etc.,
+ and then the word list for defining locals without type specifiers.
- @itemize @bullet
+ The lifetime rules support a stack discipline within a colon
- @item
+ definition: The lifetime of a local is either nested with other locals'
- Instance variables
+ lifetimes or it does not overlap them.
- doc---oof-var
- @item
+ At @code{BEGIN}, @code{IF}, and @code{AHEAD} no code for locals stack
- Object pointers
+ pointer manipulation is generated. Between control structure words
- doc---oof-ptr
+ locals definitions can push locals onto the locals stack. @code{AGAIN}
- doc---oof-asptr
+ is the simplest of the other three control flow words. It has to
+ restore the locals stack depth of the corresponding @code{BEGIN}
- @item
+ before branching. The code looks like this:
- Instance defers
+ @format
- doc---oof-defer
+ @code{lp+!#} current-locals-size @minus{} dest-locals-size
+ @code{branch} <begin>
- @item
+ @end format
- Method selectors
- doc---oof-early
- doc---oof-method
- @item
- Class-wide variables
- doc---oof-static
- @item
+ @code{UNTIL} is a little more complicated: If it branches back, it
- End declaration
+ must adjust the stack just like @code{AGAIN}. But if it falls through,
- doc---oof-how:
+ the locals stack must not be changed. The compiler generates the
- doc---oof-class;
+ following code:
+ @format
+ @code{?branch-lp+!#} <begin> current-locals-size @minus{} dest-locals-size
+ @end format
+ The locals stack pointer is only adjusted if the branch is taken.
- @end itemize
+ @code{THEN} can produce somewhat inefficient code:
+ @format
+ @code{lp+!#} current-locals-size @minus{} orig-locals-size
+ <orig target>:
+ @code{lp+!#} orig-locals-size @minus{} new-locals-size
+ @end format
+ The second @code{lp+!#} adjusts the locals stack pointer from the
+ level at the @var{orig} point to the level after the @code{THEN}. The
+ first @code{lp+!#} adjusts the locals stack pointer from the current
+ level to the level at the orig point, so the complete effect is an
+ adjustment from the current level to the right level after the
+ @code{THEN}.
- @c -------------------------------------------------------------
+ @cindex locals information on the control-flow stack
- @node Class Implementation,  , Class Declaration, OOF
+ @cindex control-flow stack items, locals information
- @subsubsection Class Implementation
+ In a conventional Forth implementation a dest control-flow stack entry
- @cindex class implementation
+ is just the target address and an orig entry is just the address to be
+ patched. Our locals implementation adds a word list to every orig or dest
+ item. It is the list of locals visible (or assumed visible) at the point
+ described by the entry. Our implementation also adds a tag to identify
+ the kind of entry, in particular to differentiate between live and dead
+ (reachable and unreachable) orig entries.
- @c -------------------------------------------------------------
+ A few unusual operations have to be performed on locals word lists:
- @node Mini-OOF, Comparison with other object models, OOF, Object-oriented Forth
- @subsection The @file{mini-oof.fs} model
- @cindex mini-oof
- Gforth's third object oriented Forth package is a 12-liner. It uses a
+ doc-common-list
- mixture of the @file{object.fs} and the @file{oof.fs} syntax,
+ doc-sub-list?
- and reduces to the bare minimum of features. This is based on a posting
+ doc-list-size
- of Bernd Paysan in comp.arch.
- @menu
+ Several features of our locals word list implementation make these
- * Basic Mini-OOF Usage::
+ operations easy to implement: The locals word lists are organised as
- * Mini-OOF Example::
+ linked lists; the tails of these lists are shared, if the lists
- * Mini-OOF Implementation::
+ contain some of the same locals; and the address of a name is greater
- @end menu
+ than the address of the names behind it in the list.
- @c -------------------------------------------------------------
+ Another important implementation detail is the variable
- @node Basic Mini-OOF Usage, Mini-OOF Example, , Mini-OOF
+ @code{dead-code}. It is used by @code{BEGIN} and @code{THEN} to
- @subsubsection Basic @file{mini-oof.fs} Usage
+ determine if they can be reached directly or only through the branch
- @cindex mini-oof usage
+ that they resolve. @code{dead-code} is set by @code{UNREACHABLE},
+ @code{AHEAD}, @code{EXIT} etc., and cleared at the start of a colon
+ definition, by @code{BEGIN} and usually by @code{THEN}.
- There is a base class (@code{class}, which allocates one cell
+ Counted loops are similar to other loops in most respects, but
- for the object pointer) plus seven other words: to define a method, a
+ @code{LEAVE} requires special attention: It performs basically the same
- variable, a class; to end a class, to resolve binding, to allocate an
+ service as @code{AHEAD}, but it does not create a control-flow stack
- object and to compile a class method.
+ entry. Therefore the information has to be stored elsewhere;
- @comment TODO better description of the last one
+ traditionally, the information was stored in the target fields of the
+ branches created by the @code{LEAVE}s, by organizing these fields into a
+ linked list. Unfortunately, this clever trick does not provide enough
+ space for storing our extended control flow information. Therefore, we
+ introduce another stack, the leave stack. It contains the control-flow
+ stack entries for all unresolved @code{LEAVE}s.
- doc-object
+ Local names are kept until the end of the colon definition, even if
- doc-method
+ they are no longer visible in any control-flow path. In a few cases
- doc-var
+ this may lead to increased space needs for the locals name area, but
- doc-class
+ usually less than reclaiming this space would cost in code size.
- doc-end-class
- doc-defines
- doc-new
- doc-::
- @c -------------------------------------------------------------
+ @node ANS Forth locals,  , Gforth locals, Locals
- @node Mini-OOF Example, Mini-OOF Implementation, Basic Mini-OOF Usage, Mini-OOF
+ @subsection ANS Forth locals
- @subsubsection Mini-OOF Example
+ @cindex locals, ANS Forth style
- @cindex mini-oof example
- A short example shows how to use this package.
+ The ANS Forth locals wordset does not define a syntax for locals, but
- @comment nac TODO could flesh this out with some comments from the Forthwrite article
+ words that make it possible to define various syntaxes. One of the
+ possible syntaxes is a subset of the syntax we used in the Gforth locals
+ wordset, i.e.:
  @example
- object class
+ @{ local1 local2 ... -- comment @}
-   method init
-   method draw
- end-class graphical
  @end example
+ @noindent
- This code defines a class @code{graphical} with an
+ or
- operation @code{draw}.  We can perform the operation
- @code{draw} on any @code{graphical} object, e.g.:
  @example
- 100 100 t-rex draw
+ @{ local1 local2 ... @}
  @end example
- where @code{t-rex} is an object or object pointer, created with e.g.
+ The order of the locals corresponds to the order in a stack comment. The
- @code{graphical new Constant t-rex}.
+ restrictions are:
- For concrete graphical objects, we define child classes of the
- class @code{graphical}, e.g.:
- @example
+ @itemize @bullet
- graphical class
+ @item
-   cell var circle-radius
+ Locals can only be cell-sized values (no type specifiers are allowed).
- end-class circle \ "graphical" is the parent class
+ @item
+ Locals can be defined only outside control structures.
+ @item
+ Locals can interfere with explicit usage of the return stack. For the
+ exact (and long) rules, see the standard. If you don't use return stack
+ accessing words in a definition using locals, you will be all right. The
+ purpose of this rule is to make locals implementation on the return
+ stack easier.
+ @item
+ The whole definition must be in one line.
+ @end itemize
- :noname ( x y -- )
+ Locals defined in this way behave like @code{VALUE}s (@pxref{Simple
-   circle-radius @@ draw-circle ; circle defines draw
+ Defining Words}). I.e., they are initialized from the stack. Using their
- :noname ( r -- )
+ name produces their value. Their value can be changed using @code{TO}.
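+ As a small sketch of this behaviour (the word @code{sum-to} is
+ hypothetical, not part of Gforth), the following definition initializes
+ two locals from the stack, reads them by name, and updates them with
+ @code{TO}:
+ @example
+ : sum-to @{ n acc -- sum @}  \ n counts down, acc is the running total
+   begin
+     n 0>
+   while
+     acc n + TO acc
+     n 1- TO n
+   repeat
+   acc ;
+ 4 0 sum-to .  \ adds 4+3+2+1 to the initial total 0
+ @end example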
-   circle-radius ! ; circle defines init
- @end example
- There is no implicit init method, so we have to define one. The creation
+ Since this syntax is supported by Gforth directly, you need not do
- code of the object now has to call init explicitly.
+ anything to use it. If you want to port a program using this syntax to
+ another ANS Forth system, use @file{compat/anslocal.fs} to implement the
+ syntax on the other system.
- @example
+ Note that a syntax shown in the standard, section A.13, looks
- circle new Constant my-circle
+ similar, but is quite different in having the order of locals
-my-circle init
+ reversed. Beware!
- @end example
- It is also possible to add a function to create named objects with
+ The ANS Forth locals wordset itself consists of a word:
- automatic call of @code{init}, given that all objects have @code{init}
- on the same place:
- @example
+ doc-(local)
- : new: ( .. o "name" -- )
-     new dup Constant init ;
-circle new: large-circle
- @end example
- We can draw this new circle at (100,100) with:
+ The ANS Forth locals extension wordset defines a syntax using @code{locals|}, but it is so
+ awful that we strongly recommend not using it. We have implemented this
+ syntax to make porting to Gforth easy, but do not document it here. The
+ problem with this syntax is that the locals are defined in an order
+ reversed with respect to the standard stack comment notation, making
+ programs harder to read, and easier to misread and miswrite. The only
+ merit of this syntax is that it is easy to implement using the ANS Forth
+ locals wordset.
- @example
- 100 100 my-circle draw
- @end example
- @node Mini-OOF Implementation, , Mini-OOF Example, Mini-OOF
+ @c ----------------------------------------------------------
- @subsubsection @file{mini-oof.fs} Implementation
+ @node Structures, Object-oriented Forth, Locals, Words
+ @section  Structures
+ @cindex structures
+ @cindex records
- Object-oriented systems with late binding typically use a
+ This section presents the structure package that comes with Gforth. A
- "vtable"-approach: the first variable in each object is a pointer to a
+ version of the package implemented in ANS Forth is available in
- table, which contains the methods as function pointers. The vtable
+ @file{compat/struct.fs}. This package was inspired by a posting on
- may also contain other information.
+ comp.lang.forth in 1989 (unfortunately I don't remember by whom;
+ possibly John Hayes). A version of this section has been published in
+ ???. Marcel Hendrix provided helpful comments.
- So first, let's declare methods:
+ @menu
+ * Why explicit structure support?::
+ * Structure Usage::
+ * Structure Naming Convention::
+ * Structure Implementation::
+ * Structure Glossary::
+ @end menu
- @example
+ @node Why explicit structure support?, Structure Usage, Structures, Structures
- : method ( m v -- m' v ) Create  over , swap cell+ swap
+ @subsection Why explicit structure support?
-   DOES> ( ... o -- ... ) @ over @ + @ execute ;
- @end example
- During method declaration, the number of methods and instance
+ @cindex address arithmetic for structures
- variables is on the stack (in address units). @code{method} creates
+ @cindex structures using address arithmetic
- one method and increments the method number. To execute a method, it
+ If we want to use a structure containing several fields, we could simply
- takes the object, fetches the vtable pointer, adds the offset, and
+ reserve memory for it, and access the fields using address arithmetic
- executes the @var{xt} stored there. Each method takes the object it is
+ (@pxref{Address arithmetic}). As an example, consider a structure with
- invoked from as top of stack parameter. The method itself should
+ the following fields
- consume that object.
- Now, we also have to declare instance variables
+ @table @code
+ @item a
+ is a float
+ @item b
+ is a cell
+ @item c
+ is a float
+ @end table
- @example
+ Given the (float-aligned) base address of the structure we get the
- : var ( m v size -- m v' ) Create  over , +
+ address of the field
-   DOES> ( o -- addr ) @ + ;
- @end example
- As before, a word is created with the current offset. Instance
+ @table @code
- variables can have different sizes (cells, floats, doubles, chars), so
+ @item a
- all we do is take the size and add it to the offset. If your machine
+ without doing anything further.
- has alignment restrictions, put the proper @code{aligned} or
+ @item b
- @code{faligned} before the variable, to adjust the variable
+ with @code{float+}
- offset. That's why it is on the top of stack.
+ @item c
+ with @code{float+ cell+ faligned}
+ @end table
- We need a starting point (the base object) and some syntactic sugar:
+ It is easy to see that this can become quite tiring.
- @example
+ Moreover, it is not very readable, because seeing a
- Create object  1 cells , 2 cells ,
+ @code{cell+} tells us neither which kind of structure is
- : class ( class -- class methods vars ) dup 2@ ;
+ accessed nor what field is accessed; we have to somehow infer the kind
- @end example
+ of structure, and then look up in the documentation which field of
+ that structure corresponds to that offset.
- For inheritance, the vtable of the parent object has to be
+ Finally, this kind of address arithmetic also causes maintenance
- copied when a new, derived class is declared. This gives all the
+ troubles: If you add or delete a field somewhere in the middle of the
- methods of the parent class, which can be overridden, though.
+ structure, you have to find and change all computations for the fields
+ afterwards.
+ So, instead of using @code{cell+} and friends directly, how
+ about storing the offsets in constants:
  @example
- : end-class  ( class methods vars -- )
+ 0 constant a-offset
-   Create  here >r , dup , 2 cells ?DO ['] noop , 1 cells +LOOP
+ 0 float+ constant b-offset
-   cell+ dup cell+ r> rot @ 2 cells /string move ;
+ 0 float+ cell+ faligned constant c-offset
  @end example
- The first line creates the vtable, initialized with
+ Now we can get the address of field @code{x} with @code{x-offset
- @code{noop}s. The second line is the inheritance mechanism, it
+ +}. This is much better in all respects. Of course, you still
- copies the xts from the parent vtable.
+ have to change all later offset definitions if you add a field. You can
+ fix this by declaring the offsets in the following way:
- We still have no way to define new methods, let's do that now:
  @example
- : defines ( xt class -- ) ' >body @ + ! ;
+ 0 constant a-offset
+ a-offset float+ constant b-offset
+ b-offset cell+ faligned constant c-offset
  @end example
- To allocate a new object, we need a word, too:
+ Since we always use the offsets with @code{+}, we could use a defining
+ word @code{cfield} that includes the @code{+} in the action of the
+ defined word:
  @example
- : new ( class -- o )  here over @ allot swap over ! ;
+ : cfield ( n "name" -- )
+     create ,
+ does> ( name execution: addr1 -- addr2 )
+     @@ + ;
+ 0 cfield a
+ 0 a float+ cfield b
+ 0 b cell+ faligned cfield c
  @end example
- Sometimes derived classes want to access the method of the
+ Instead of @code{x-offset +}, we now simply write @code{x}.
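+ As a minimal usage sketch (the name @code{rec} is hypothetical, and the
+ code assumes the fields @code{a}, @code{b} and @code{c} defined above),
+ a record can be allotted at a float-aligned address and accessed through
+ the field words:
+ @example
+ falign here  0 c float+ allot  constant rec  \ size: offset of c plus one float
+ 1.5e rec a f!   \ store into the float field a
+ 42   rec b !    \ store into the cell field b
+ 2.5e rec c f!   \ store into the float field c
+ rec b @@ .      \ fetch field b again
+ @end example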
- parent object. There are two ways to achieve this with Mini-OOF:
- first, you could use named words, and second, you could look up the
+ The structure field words now can be used quite nicely. However,
- vtable of the parent object.
+ their definition is still a bit cumbersome: We have to repeat the
+ name, the information about size and alignment is distributed before
+ and after the field definitions etc.  The structure package presented
+ here addresses these problems.
+ @node Structure Usage, Structure Naming Convention, Why explicit structure support?, Structures
+ @subsection Structure Usage
+ @cindex structure usage
+ @cindex @code{field} usage
+ @cindex @code{struct} usage
+ @cindex @code{end-struct} usage
+ You can define a structure for a (data-less) linked list with:
  @example
- : :: ( class "name" -- ) ' >body @ + @ compile, ;
+ struct
+     cell% field list-next
+ end-struct list%
  @end example
+ With the address of the list node on the stack, you can compute the
- Nothing can be more confusing than a good example, so here is
+ address of the field that contains the address of the next node with
- one. First let's declare a text object (called
+ @code{list-next}. E.g., you can determine the length of a list
- @code{button}), that stores text and position:
+ with:
  @example
- object class
+ : list-length ( list -- n )
-   cell var text
+ \ "list" is a pointer to the first element of a linked list
-   cell var len
+ \ "n" is the length of the list
-   cell var x
+     0 BEGIN ( list1 n1 )
-   cell var y
+         over
-   method init
+     WHILE ( list1 n1 )
-   method draw
+         1+ swap list-next @@ swap
- end-class button
+     REPEAT
+     nip ;
  @end example
- @noindent
+ You can reserve memory for a list node in the dictionary with
- Now, implement the two methods, @code{draw} and @code{init}:
+ @code{list% %allot}, which leaves the address of the list node on the
+ stack. For the equivalent allocation on the heap you can use @code{list%
+ %alloc} (or, for an @code{allocate}-like stack effect (i.e., with ior),
+ use @code{list% %allocate}). You can get the size of a list
+ node with @code{list% %size} and its alignment with @code{list%
+ %alignment}.
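+ As a small sketch of heap allocation (the word @code{list-cons} is
+ hypothetical, not part of the structure package), the following word
+ prepends a freshly allocated node to a list. It assumes that
+ @code{%alloc} reports failure by throwing an exception rather than by
+ returning an ior:
+ @example
+ : list-cons ( list1 -- list2 )
+     list% %alloc        \ ( list1 node )
+     tuck list-next ! ;  \ store list1 in the new node, leaving the node
+ 0 list-cons list-cons list-cons  \ a list with three nodes
+ list-length .
+ @end example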
+ Note that in ANS Forth the body of a @code{create}d word is
+ @code{aligned} but not necessarily @code{faligned};
+ therefore, if you do a:
  @example
- :noname ( o -- )
+ create @emph{name} foo% %allot
-  >r r@ x @ r@ y @ at-xy  r@ text @ r> len @ type ;
-  button defines draw
- :noname ( addr u o -- )
-  >r 0 r@ x ! 0 r@ y ! r@ len ! r> text ! ;
-  button defines init
  @end example
  @noindent
- To demonstrate inheritance, we define a class @code{bold-button}, with no
+ then the memory allotted for @code{foo%} is
- new data and no new methods.
+ guaranteed to start at the body of @code{@emph{name}} only if
+ @code{foo%} contains only character, cell and double fields.
+ @cindex structures containing structures
+ You can include a structure @code{foo%} as a field of
+ another structure, like this:
  @example
- button class
+ struct
- end-class bold-button
+ ...
+     foo% field ...
- : bold   27 emit ." [1m" ;
+ ...
- : normal 27 emit ." [0m" ;
+ end-struct ...
+ @end example
- @noindent
+ @cindex structure extension
- The class @code{bold-button} has a different draw method to
+ @cindex extended records
- @code{button}, but the new method is defined in terms of the draw method
+ Instead of starting with an empty structure, you can extend an
- for @code{button}:
+ existing structure. E.g., a plain linked list without data, as defined
+ above, is hardly useful; you can extend it to a linked list of integers,
+ like this:@footnote{This feature is also known as @emph{extended
+ records}. It is the main innovation in the Oberon language; in other
+ words, adding this feature to Modula-2 led Wirth to create a new
+ language, write a new compiler etc.  Adding this feature to Forth just
+ required a few lines of code.}
- :noname bold [ button :: draw ] normal ; bold-button defines draw
+ @example
+ list%
+     cell% field intlist-int
+ end-struct intlist%
  @end example
- @noindent
+ @code{intlist%} is a structure with two fields:
- Finally, create two objects and apply methods:
+ @code{list-next} and @code{intlist-int}.
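+ A minimal sketch of how the extended structure might be used (the words
+ @code{intlist-cons} and @code{intlist-sum} are hypothetical helpers, not
+ part of Gforth):
+ @example
+ : intlist-cons ( list1 n -- list2 )
+     intlist% %alloc        \ ( list1 n node )
+     tuck intlist-int !     \ ( list1 node )
+     tuck list-next ! ;     \ ( node )
+ : intlist-sum ( list -- n )
+     0 swap                 \ ( sum list )
+     BEGIN
+         dup
+     WHILE
+         dup intlist-int @@ rot + swap  \ add this node's integer to sum
+         list-next @@                   \ advance to the next node
+     REPEAT
+     drop ;
+ 0 1 intlist-cons 2 intlist-cons 3 intlist-cons
+ dup list-length .  \ list-length works unchanged on the extended list
+ intlist-sum .
+ @end example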
+ @cindex structures containing arrays
+ You can specify an array type containing @emph{n} elements of
+ type @code{foo%} like this:
  @example
- button new Constant foo
+ foo% @emph{n} *
- s" thin foo" foo init
- page
- foo draw
- bold-button new Constant bar
- s" fat bar" bar init
-bar y !
- bar draw
  @end example
+ You can use this array type in any place where you can use a normal
+ type, e.g., when defining a @code{field}, or with
+ @code{%allot}.
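+ For instance, a structure with an array field could be sketched like
+ this (the names @code{vec%}, @code{vec-len}, @code{vec-data},
+ @code{vec-elem} and @code{my-vec} are hypothetical):
+ @example
+ struct
+     cell% field vec-len
+     cell% 8 * field vec-data  \ room for 8 cells
+ end-struct vec%
+ : vec-elem ( i vec -- addr )  \ address of element i of the array field
+     vec-data swap cells + ;
+ vec% %allot constant my-vec
+ 42 3 my-vec vec-elem !
+ 3 my-vec vec-elem @@ .
+ @end example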
- @node Comparison with other object models, , Mini-OOF, Object-oriented Forth
+ @cindex first field optimization
- @subsubsection Comparison with other object models
+ The first field is at the base address of a structure and the word
- @cindex comparison of object models
+ for this field (e.g., @code{list-next}) actually does not change
- @cindex object models, comparison
+ the address on the stack. You may be tempted to leave it out in the
+ interest of run-time and space efficiency. This is not necessary,
+ because the structure package optimizes this case and compiling such
+ words does not generate any code. So, in the interest of readability
+ and maintainability you should include the word for the field when
+ accessing the field.
- Many object-oriented Forth extensions have been proposed (@cite{A survey
+ @node Structure Naming Convention, Structure Implementation, Structure Usage, Structures
- of object-oriented Forths} (SIGPLAN Notices, April 1996) by Bradford
+ @subsection Structure Naming Convention
- J. Rodriguez and W. F. S. Poehlman lists 17). This section discusses the
+ @cindex structure naming convention
- relation of the object models described here to two well-known and two
- closely-related (by the use of method maps) models.
- @cindex Neon model
+ The field names that come to (my) mind are often quite generic, and,
- The most popular model currently seems to be the Neon model (see
+ if used, would cause frequent name clashes. E.g., many structures
- @cite{Object-oriented programming in ANS Forth} (Forth Dimensions, March
+ probably contain a @code{counter} field. The structure names
-) by Andrew McKewan) but this model has a number of limitations
+ that come to (my) mind are often also the logical choice for the names
- @footnote{A longer version of this critique can be
+ of words that create such a structure.
- found in @cite{On Standardizing Object-Oriented Forth Extensions} (Forth
- Dimensions, May 1997) by Anton Ertl.}:
+ Therefore, I have adopted the following naming conventions:
  @itemize @bullet
+ @cindex field naming convention
  @item
- It uses a @code{@emph{selector
+ The names of fields are of the form
- object}} syntax, which makes it unnatural to pass objects on the
+ @code{@emph{struct}-@emph{field}}, where
- stack.
+ @code{@emph{struct}} is the basic name of the structure, and
+ @code{@emph{field}} is the basic name of the field. You can
+ think of field words as converting the (address of the)
+ structure into the (address of the) field.
+ @cindex structure naming convention
  @item
- It requires that the selector parses the input stream (at
+ The names of structures are of the form
- compile time); this leads to reduced extensibility and to bugs that are
+ @code{@emph{struct}%}, where
- hard to find.
+ @code{@emph{struct}} is the basic name of the structure.
+ @end itemize
- @item
+ This naming convention does not work that well for fields of extended
- It allows using every selector to every object;
+ structures; e.g., the integer list structure has a field
- this eliminates the need for classes, but makes it harder to create
+ @code{intlist-int}, but has @code{list-next}, not
- efficient implementations.
+ @code{intlist-next}.
- @end itemize
- @cindex Pountain's object-oriented model
+ @node Structure Implementation, Structure Glossary, Structure Naming Convention, Structures
- Another well-known publication is @cite{Object-Oriented Forth} (Academic
+ @subsection Structure Implementation
- Press, London, 1987) by Dick Pountain. However, it is not really about
+ @cindex structure implementation
- object-oriented programming, because it hardly deals with late
+ @cindex implementation of structures
- binding. Instead, it focuses on features like information hiding and
- overloading that are characteristic of modular languages like Ada (83).
- @cindex Zsoter's object-oriented model
+ The central idea in the implementation is to pass the data about the
- In @cite{Does late binding have to be slow?} (Forth Dimensions 18(1) 1996, pages 31-35)
+ structure being built on the stack, not in some global
- Andras Zsoter describes a model that makes heavy use of an active object
+ variable. Everything else falls into place naturally once this design
- (like @code{this} in @file{objects.fs}): The active object is not only
+ decision is made.
- used for accessing all fields, but also specifies the receiving object
- of every selector invocation; you have to change the active object
- explicitly with @code{@{ ... @}}, whereas in @file{objects.fs} it
- changes more or less implicitly at @code{m: ... ;m}. Such a change at
- the method entry point is unnecessary with Zsoter's model, because
- the receiving object is the active object already. On the other hand, the explicit
- change is absolutely necessary in that model, because otherwise no one
- could ever change the active object. An ANS Forth implementation of this
- model is available at @url{http://www.forth.org/fig/oopf.html}.
- @cindex @file{oof.fs}, differences to other models
+ The type description on the stack is of the form @emph{align
- The @file{oof.fs} model combines information hiding and overloading
+ size}. Keeping the size on the top-of-stack makes dealing with arrays
- resolution (by keeping names in various word lists) with object-oriented
+ very simple.
- programming. It sets the active object implicitly on method entry, but
- also allows explicit changing (with @code{>o...o>} or with
- @code{with...endwith}). It uses parsing and state-smart objects and
- classes for resolving overloading and for early binding: the object or
- class parses the selector and determines the method from this. If the
- selector is not parsed by an object or class, it performs a call to the
- selector for the active object (late binding), like Zsoter's model.
- Fields are always accessed through the active object. The big
- disadvantage of this model is the parsing and the state-smartness, which
- reduces extensibility and increases the opportunities for subtle bugs;
- essentially, you are only safe if you never tick or @code{postpone} an
- object or class (Bernd disagrees, but I (Anton) am not convinced).
- @cindex @file{mini-oof.fs}, differences to other models
+ @code{field} is a defining word that uses @code{Create}
- The @file{mini-oof.fs} model is quite similar to a very stripped-down version of
+ and @code{DOES>}. The body of the field contains the offset
- the @file{objects.fs} model, but syntactically it is a mixture of the @file{objects.fs} and
+ of the field, and the normal @code{DOES>} action is simply:
- @file{oof.fs} models.
+ @example
+ @@ +
+ @end example
+ @noindent
+ i.e., add the offset to the address, giving the stack effect
+ @var{addr1 -- addr2} for a field.
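+ A simplified sketch of such a defining word, ignoring the optimization
+ for offset 0 discussed below (@code{simple-field}, @code{pair-tag},
+ @code{pair-value} and @code{pair%} are hypothetical names, and this is
+ not the definition used in the structure package; it assumes that
+ alignments are powers of 2 and uses @code{naligned} from the glossary):
+ @example
+ : simple-field ( align1 offset1 align size "name" -- align2 offset2 )
+     create
+     >r >r               \ ( align1 offset1 ) ( R: size align )
+     r@ naligned dup ,   \ ( align1 offset ) store the aligned offset
+     swap r> max         \ ( offset align2 ) the stricter alignment wins
+     swap r> +           \ ( align2 offset2 ) next free offset
+ does> ( addr1 -- addr2 )
+     @@ + ;
+ struct
+     char% simple-field pair-tag
+     cell% simple-field pair-value
+ end-struct pair%
+ @end example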
- @c -------------------------------------------------------------
+ @cindex first field optimization, implementation
- @node Tokens for Words, Word Lists, Object-oriented Forth, Words
+ This simple structure is slightly complicated by the optimization
- @section Tokens for Words
+ for fields with offset 0, which requires a different
- @cindex tokens for words
+ @code{DOES>}-part (because we cannot rely on there being
+ something on the stack if such a field is invoked during
+ compilation). Therefore, we put the different @code{DOES>}-parts
+ in separate words, and decide which one to invoke based on the
+ offset. For a zero offset, the field is basically a noop; it is
+ immediate, and therefore no code is generated when it is compiled.
- This chapter describes the creation and use of tokens that represent
+ @node Structure Glossary,  , Structure Implementation, Structures
- words on the stack (and in data space).
+ @subsection Structure Glossary
+ @cindex structure glossary
- Named words have interpretation and compilation semantics. Unnamed words
+ doc-%align
- just have execution semantics.
+ doc-%alignment
+ doc-%alloc
+ doc-%allocate
+ doc-%allot
+ doc-cell%
+ doc-char%
+ doc-dfloat%
+ doc-double%
+ doc-end-struct
+ doc-field
+ doc-float%
+ doc-naligned
+ doc-sfloat%
+ doc-%size
+ doc-struct
- @comment TODO ?normally interpretation semantics are the execution semantics.
+ @c -------------------------------------------------------------
- @comment this should all be covered in earlier ss
+ @node Object-oriented Forth, Passing Commands to the OS, Structures, Words
+ @section Object-oriented Forth
- @cindex execution token
+ Gforth comes with three packages for object-oriented programming:
- An @dfn{execution token} represents the execution semantics of an
+ @file{objects.fs}, @file{oof.fs}, and @file{mini-oof.fs}; none of them
- unnamed word. An execution token occupies one cell. As explained in
+ is preloaded, so you have to @code{include} them before use. The most
- @ref{Supplying names}, the execution token of the last word
+ important differences between these packages (and others) are discussed
- defined can be produced with @code{lastxt}.
+ in @ref{Comparison with other object models}. All packages are written
+ in ANS Forth and can be used with any other ANS Forth.
- You can perform the semantics represented by an execution token with:
+ @menu
- doc-execute
+ * Why object-oriented programming?::
- You can compile the word with:
+ * Object-Oriented Terminology::
- doc-compile,
+ * Objects::
+ * OOF::
+ * Mini-OOF::
+ * Comparison with other object models::
+ @end menu
- @cindex code field address
- @cindex CFA
- In Gforth, the abstract data type @emph{execution token} is implemented
- as CFA (code field address).
- @comment TODO note that the standard does not say what it represents..
- @comment and you cannot necessarily compile it in all Forths (eg native
- @comment compilers?).
- The interpretation semantics of a named word are also represented by an
+ @node Why object-oriented programming?, Object-Oriented Terminology, , Object-oriented Forth
- execution token. You can get it with
+ @subsubsection Why object-oriented programming?
+ @cindex object-oriented programming motivation
+ @cindex motivation for object-oriented programming
- doc-[']
+ Often we have to deal with several data structures (@emph{objects})
- doc-'
+ that have to be treated similarly in some respects, but differently in
+ others. Graphical objects are the textbook example: circles, triangles,
+ dinosaurs, icons, and others, and we may want to add more during program
+ development. We want to apply some operations to any graphical object,
+ e.g., @code{draw} for displaying it on the screen. However, @code{draw}
+ has to do something different for every kind of object.
+ @comment TODO add some other operations eg perimeter, area
+ @comment and tie in to concrete examples later..
- For literals, you use @code{'} in interpreted code and @code{[']} in
+ We could implement @code{draw} as a big @code{CASE}
- compiled code. Gforth's @code{'} and @code{[']} behave somewhat unusually
+ control structure that executes the appropriate code depending on the
- by complaining about compile-only words. To get an execution token for a
+ kind of object to be drawn. This would not be very elegant, and,
- compiling word @var{X}, use @code{COMP' @var{X} drop} or @code{[COMP']
+ moreover, we would have to change @code{draw} every time we add
- @var{X} drop}.
+ a new kind of graphical object (say, a spaceship).
- @cindex compilation token
+ What we would rather do is: When defining spaceships, we would tell
- The compilation semantics are represented by a @dfn{compilation token}
+ the system: ``Here's how you @code{draw} a spaceship; you figure
- consisting of two cells: @var{w xt}. The top cell @var{xt} is an
+ out the rest''.
- execution token. The compilation semantics represented by the
- compilation token can be performed with @code{execute}, which consumes
- the whole compilation token, with an additional stack effect determined
- by the represented compilation semantics.
- doc-[comp']
+ This is the problem solved by all systems that (rightfully) call
- doc-comp'
+ themselves object-oriented; the object-oriented packages presented here
+ solve this problem (and not much else).
+ @comment TODO ?list properties of oo systems.. oo vs o-based?
- You can compile the compilation semantics with @code{postpone,}. I.e.,
+ @node Object-Oriented Terminology, Objects, Why object-oriented programming?, Object-oriented Forth
- @code{COMP' @var{word} POSTPONE,} is equivalent to @code{POSTPONE
+ @subsubsection Object-Oriented Terminology
- @var{word}}.
+ @cindex object-oriented terminology
+ @cindex terminology for object-oriented programming
- doc-postpone,
+ This section is mainly for reference, so you don't have to understand
+ all of it right away.  The terminology is mainly Smalltalk-inspired.  In
+ short:
- At present, the @var{w} part of a compilation token is an execution
+ @table @emph
- token, and the @var{xt} part represents either @code{execute} or
+ @cindex class
- @code{compile,}. However, don't rely on that knowledge, unless necessary;
+ @item class
- we may introduce unusual compilation tokens in the future (e.g.,
+ a data structure definition with some extras.
- compilation tokens representing the compilation semantics of literals).
- @cindex name token
+ @cindex object
- @cindex name field address
+ @item object
- @cindex NFA
+ an instance of the data structure described by the class definition.
- Named words are also represented by the @dfn{name token}. The abstract
- data type @emph{name token} is implemented as NFA (name field address).
- doc-find-name
+ @cindex instance variables
- doc-name>int
+ @item instance variables
- doc-name?int
+ fields of the data structure.
- doc-name>comp
- doc-name>string
- @node Word Lists, Environmental Queries, Tokens for Words, Words
+ @cindex selector
- @section Word Lists
+ @cindex method selector
- @cindex word lists
+ @cindex virtual function
- @cindex name dictionary
+ @item selector
+ (or @emph{method selector}) a word (e.g.,
+ @code{draw}) that performs an operation on a variety of data
+ structures (classes). A selector describes @emph{what} operation to
+ perform. In C++ terminology: a (pure) virtual function.
- @cindex wid
+ @cindex method
- All definitions other than those created by @code{:noname} have an entry
+ @item method
- in the name dictionary. The name dictionary is fragmented into a number
+ the concrete definition that performs the operation
- of parts, called @var{word lists}. A word list is identified by a
+ described by the selector for a specific class. A method specifies
- cell-sized word list identifier (@var{wid}) in much the same way as a
+ @emph{how} the operation is performed for a specific class.
- file is identified by a file handle. The numerical value of the wid has
- no (portable) meaning, and might change from session to session.
- @cindex compilation word list
+ @cindex selector invocation
- At any one time, a single word list is defined as the word list to which
+ @cindex message send
- all new definitions will be added -- this is called the @var{compilation
+ @cindex invoking a selector
- word list}. When Gforth is started, the compilation word list is the
+ @item selector invocation
- word list called @code{FORTH-WORDLIST}.
+ a call of a selector. One argument of the call (the TOS (top-of-stack))
+ is used for determining which method is used. In Smalltalk terminology:
+ a message (consisting of the selector and the other arguments) is sent
+ to the object.
- @cindex search order stack
+ @cindex receiving object
- Forth maintains a stack of word lists, representing the @var{search
+ @item receiving object
- order}.  When the name dictionary is searched (for example, when
+ the object used for determining the method executed by a selector
- attempting to find a word's execution token during compilation), only
+ invocation. In the @file{objects.fs} model, it is the object that is on
- those word lists that are currently in the search order are
+ the TOS when the selector is invoked. (@emph{Receiving} comes from
- searched. The most recently-defined word in the word list at the top of
+ the Smalltalk @emph{message} terminology.)
- the word list stack is searched first, and the search proceeds until
- either the word is located or the oldest definition in the word list at
- the bottom of the stack is reached. Definitions of the word may exist in
- more than one word list; the search order determines which version will
- be found.
- The ANS Forth Standard "Search order" word set is intended to provide a
+ @cindex child class
- set of low-level tools that allow various different schemes to be
+ @cindex parent class
- implemented. Gforth provides @code{vocabulary}, a traditional Forth
+ @cindex inheritance
- word.  @file{compat/vocabulary.fs} provides an implementation in ANS
+ @item child class
- Standard Forth.
+ a class that has (@emph{inherits}) all properties (instance variables,
+ selectors, methods) from a @emph{parent class}. In Smalltalk
+ terminology: The subclass inherits from the superclass. In C++
+ terminology: The derived class inherits from the base class.
- TODO: locals section refers to here, saying that every word list (aka
+ @end table
- vocabulary) has its own methods for searching etc. Need to document that.
- doc-forth-wordlist
+ @c If you wonder about the message sending terminology, it comes from
- doc-definitions
+ @c a time when each object had its own task and objects communicated via
- doc-get-current
+ @c message passing; eventually the Smalltalk developers realized that
- doc-set-current
+ @c they can do most things through simple (indirect) calls. They kept the
+ @c terminology.
- @comment TODO when a defn (like set-order) is instanced twice, the second instance gets documented.
- @comment In general that might be fine, but in this example (search.fs) the second instance is an
- @comment alias, so it would not naturally have documentation
- doc-get-order
+ @node Objects, OOF, Object-Oriented Terminology, Object-oriented Forth
- doc-set-order
+ @subsection The @file{objects.fs} model
- doc-wordlist
+ @cindex objects
- doc-also
+ @cindex object-oriented programming
- doc-forth
- doc-only
- doc-order
- doc-previous
- doc-find
+ @cindex @file{objects.fs}
- doc-search-wordlist
+ @cindex @file{oof.fs}
- doc-words
+ This section describes the @file{objects.fs} package. This material also has been published in @cite{Yet Another Forth Objects Package} by Anton Ertl and appeared in Forth Dimensions 19(2), pages 37--43 (@url{http://www.complang.tuwien.ac.at/forth/objects/objects.html}).
- doc-vlist
+ @c McKewan's and Zsoter's packages
+ This section assumes that you have read @ref{Structures}.
+ The techniques on which this model is based have been used to implement
+ the parser generator, Gray, and have also been used in Gforth for
+ implementing the various flavours of word lists (hashed or not,
+ case-sensitive or not, special-purpose word lists for locals etc.).
- doc-mappedwordlist
- doc-root
- doc-vocabulary
- doc-seal
- doc-vocs
- doc-current
- doc-context
  @menu
- * Why use word lists?::
+ * Properties of the Objects model::
- * Word list examples::
+ * Basic Objects Usage::
+ * The Objects base class::
+ * Creating objects::
+ * Object-Oriented Programming Style::
+ * Class Binding::
+ * Method conveniences::
+ * Classes and Scoping::
+ * Object Interfaces::
+ * Objects Implementation::
+ * Objects Glossary::
  @end menu
- @node Why use word lists?, Word list examples, Word Lists, Word Lists
+ Marcel Hendrix provided helpful comments on this section. Andras Zsoter
- @subsection Why use word lists?
+ and Bernd Paysan helped me with the related works section.
- @cindex word lists - why use them?
- There are several reasons for using multiple word lists:
+ @node Properties of the Objects model, Basic Objects Usage, Objects, Objects
+ @subsubsection Properties of the @file{objects.fs} model
+ @cindex @file{objects.fs} properties
  @itemize @bullet
  @item
- To improve compilation speed by reducing the number of name dictionary
+ It is straightforward to pass objects on the stack. Passing
- entries that must be searched. This is achieved by creating a new
+ selectors on the stack is a little less convenient, but possible.
- word list that contains all of the definitions that are used in the
- definition of a Forth system but which would not usually be used by
- programs running on that system. That word list would be on the search
- list when the Forth system was compiled but would be removed from the
- search list for normal operation. This can be a useful technique for
- low-performance systems (for example, 8-bit processors in embedded
- systems) but is unlikely to be necessary in high-performance desktop
- systems.
  @item
- To prevent a set of words from being used outside the context in which
+ Objects are just data structures in memory, and are referenced by their
- they are valid. Two classic examples of this are an integrated editor
+ address. You can create words for objects with normal defining words
- (all of the edit commands are defined in a separate word list; the
+ like @code{constant}. Likewise, there is no difference between instance
- search order is set to the editor word list when the editor is invoked;
+ variables that contain objects and those that contain other data.
- the old search order is restored when the editor is terminated) and an
- integrated assembler (the op-codes for the machine are defined in a
- separate word list which is used when a @code{CODE} word is defined).
  @item
- To prevent a name-space clash between multiple definitions with the same
+ Late binding is efficient and easy to use.
- name. For example, when building a cross-compiler you might have a word
- @code{IF} that generates conditional code for your target system. By
- placing this definition in a different word list you can control whether
- the host system's @code{IF} or the target system's @code{IF} get used in
- any particular context by controlling the order of the word lists on the
- search order stack.
- @end itemize
- @node Word list examples, ,Why use word lists?, Word Lists
+ @item
- @subsection Word list examples
+ It avoids parsing, and thus avoids problems with state-smartness
- @cindex word lists - examples
+ and reduced extensibility; for convenience there are a few parsing
+ words, but they have non-parsing counterparts. There are also a few
+ defining words that parse. This is hard to avoid, because all standard
+ defining words parse (except @code{:noname}); however, such
+ words are not as bad as many other parsing words, because they are not
+ state-smart.
- Here is an example of creating and using a new wordlist using ANS
+ @item
- Standard words:
+ It does not try to incorporate everything. It does a few things and does
+ them well (IMO). In particular, this model was not designed to support
+ information hiding (although it has features that may help); you can use
+ a separate package for achieving this.
- @example
+ @item
- wordlist constant my-new-words-wordlist
+ It is layered; you don't have to learn and use all features to use this
- : my-new-words get-order nip my-new-words-wordlist swap set-order ;
+ model. Only a few features are necessary (@pxref{Basic Objects Usage},
+ @ref{The Objects base class}, and @ref{Creating objects}); the others
+ are optional and independent of each other.
- \ add it to the search order
+ @item
- also my-new-words
+ An implementation in ANS Forth is available.
- \ alternatively, add it to the search order and make it
+ @end itemize
- \ the compilation word list
- also my-new-words definitions
- \ type "order" to see the problem
- @end example
- The problem with this example is that @code{order} has no way to
- associate the name @code{my-new-words} with the wid of the word list (in
- Gforth, @code{order} and @code{vocs} will display @code{???}  for a wid
- that has no associated name). There is no Standard way of associating a
- name with a wid.
- In Gforth, this example can be re-coded using @code{vocabulary}, which
+ @node Basic Objects Usage, The Objects base class, Properties of the Objects model, Objects
- associates a name with a wid:
+ @subsubsection Basic @file{objects.fs} Usage
+ @cindex basic objects usage
+ @cindex objects, basic usage
+ You can define a class for graphical objects like this:
+ @cindex @code{class} usage
+ @cindex @code{end-class} usage
+ @cindex @code{selector} usage
  @example
- vocabulary my-new-words
+ object class \ "object" is the parent class
+   selector draw ( x y graphical -- )
+ end-class graphical
+ @end example
- \ add it to the search order
+ This code defines a class @code{graphical} with an
- my-new-words
+ operation @code{draw}.  We can perform the operation
+ @code{draw} on any @code{graphical} object, e.g.:
- \ alternatively, add it to the search order and make it
+ @example
- \ the compilation word list
+ 100 100 t-rex draw
- my-new-words definitions
- \ type "order" to see that the problem is solved
  @end example
+ @noindent
+ where @code{t-rex} is a word (say, a constant) that produces a
+ graphical object.
- @node Environmental Queries, Files, Word Lists, Words
+ @comment nac TODO add a 2nd operation eg perimeter.. and use for
- @section Environmental Queries
+ @comment a concrete example
- @cindex environmental queries
- @comment TODO more index entries
- The ANS Standard introduced the idea of "environmental queries" as a way
+ @cindex abstract class
- for a program running on a system to determine certain characteristics of the system.
+ How do we create a graphical object? With the present definitions,
- The Standard specifies a number of strings that might be recognised by a system.
+ we cannot create a useful graphical object. The class
+ @code{graphical} describes graphical objects in general, but not
+ any concrete graphical object type (C++ users would call it an
+ @emph{abstract class}); e.g., there is no method for the selector
+ @code{draw} in the class @code{graphical}.
- The Standard requires that the name space used for environmental queries
+ For concrete graphical objects, we define child classes of the
- be distinct from the name space used for definitions.
+ class @code{graphical}, e.g.:
- Typically, environmental queries are supported by creating a set of
+ @cindex @code{overrides} usage
- definitions in a word set that is @var{only} used during environmental
+ @cindex @code{field} usage in class definition
- queries; that is what Gforth does. There is no Standard way of adding
+ @example
- definitions to the set of recognised environmental queries, but any
+ graphical class \ "graphical" is the parent class
- implementation that supports the loading of optional word sets must have
+   cell% field circle-radius
- some mechanism for doing this (after loading the word set, the
- associated environmental query string must return @code{true}). In
- Gforth, the word set used to honour environmental queries can be
- manipulated just like any other word set.
- doc-environment?
+ :noname ( x y circle -- )
- doc-environment-wordlist
+   circle-radius @@ draw-circle ;
+ overrides draw
- doc-gforth
+ :noname ( n-radius circle -- )
- doc-os-class
+   circle-radius ! ;
+ overrides construct
- Note that, whilst the documentation for (eg) @code{gforth} shows it
+ end-class circle
- returning two items on the stack, querying it using @code{environment?}
+ @end example
- will return an additional item; the @code{true} flag that shows that the
- string was recognised.
- TODO Document the standard strings or note where they are documented herein
+ Here we define a class @code{circle} as a child of @code{graphical},
+ with field @code{circle-radius} (which behaves just like a field;
+ @pxref{Structures}); it defines (using @code{overrides}) new methods
+ for the selectors @code{draw} and @code{construct} (@code{construct} is
+ defined in @code{object}, the parent class of @code{graphical}).
- Here are some examples of using environmental queries:
+ Now we can create a circle on the heap (i.e.,
+ @code{allocate}d memory) with:
+ @cindex @code{heap-new} usage
  @example
- s" address-unit-bits" environment? 0=
+ 50 circle heap-new constant my-circle
- [IF]
+ @end example
-      cr .( environmental attribute address-unit-bits unknown... ) cr
- [THEN]
- s" block" environment? [IF] DROP include block.fs [THEN]
- s" gforth" environment? [IF] 2DROP include compat/vocabulary.fs [THEN]
- s" gforth" environment? [IF] .( Gforth version ) TYPE [ELSE] .( Not Gforth..) [THEN]
+ @noindent
+ @code{heap-new} invokes @code{construct}, thus
+ initializing the field @code{circle-radius} with 50. We can draw
+ this new circle at (100,100) with:
+ @example
+ 100 100 my-circle draw
  @end example
+ @cindex selector invocation, restrictions
+ @cindex class definition, restrictions
+ Note: You can only invoke a selector if the object on the TOS
+ (the receiving object) belongs to the class where the selector was
+ defined or one of its descendants; e.g., you can invoke
+ @code{draw} only for objects belonging to @code{graphical}
+ or its descendants (e.g., @code{circle}).  Immediately before
+ @code{end-class}, the search order has to be the same as
+ immediately after @code{class}.
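+ As a rough sketch of what selectors buy us, the following code defines
+ a word that works for any descendant of @code{graphical}. It assumes
+ that @code{require} can find @file{objects.fs}; the class
+ @code{square} and the words @code{draw-square} and
+ @code{draw-at-origin} are hypothetical and serve only as illustration.
+ @example
+ require objects.fs
+
+ \ hypothetical stand-in for a real drawing primitive
+ : draw-square ( x y n-side -- )
+   ." square, side " . ." at y=" . ." x=" . cr ;
+
+ object class
+   selector draw ( x y graphical -- )
+ end-class graphical
+
+ graphical class
+   cell% field square-side
+ :noname ( x y square -- )
+   square-side @@ draw-square ;
+ overrides draw
+ :noname ( n-side square -- )
+   square-side ! ;
+ overrides construct
+ end-class square
+
+ \ knows nothing about squares; relies only on the selector
+ : draw-at-origin ( graphical -- )
+   0 0 rot draw ;
+
+ 20 square heap-new constant my-square
+ my-square draw-at-origin
+ @end example
+ @noindent
+ Adding another child class of @code{graphical} later requires no change
+ to @code{draw-at-origin}.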
- Here is an example of adding a definition to the environment word list:
+ @node The Objects base class, Creating objects, Basic Objects Usage, Objects
+ @subsubsection The @file{objects.fs} base class
+ @cindex @code{object} class
- @example
+ When you define a class, you have to specify a parent class.  So how do
- get-current environment-wordlist set-current
+ you start defining classes? There is one class available from the start:
- true constant block
+ @code{object}. It is the ancestor of all classes and so is the
- true constant block-ext
+ only class that has no parent. It has two selectors: @code{construct}
- set-current
+ and @code{print}.
- @end example
- You can see what definitions are in the environment word list like this:
+ @node Creating objects, Object-Oriented Programming Style, The Objects base class, Objects
+ @subsubsection Creating objects
+ @cindex creating objects
+ @cindex object creation
+ @cindex object allocation options
- @example
+ @cindex @code{heap-new} discussion
- get-order 1+ environment-wordlist swap set-order words previous
+ @cindex @code{dict-new} discussion
- @end example
+ @cindex @code{construct} discussion
+ You can create and initialize an object of a class on the heap with
+ @code{heap-new} ( ... class -- object ) and in the dictionary
+ (allocation with @code{allot}) with @code{dict-new} (
+ ... class -- object ). Both words invoke @code{construct}, which
+ consumes the stack items indicated by "..." above.
+ @cindex @code{init-object} discussion
+ @cindex @code{class-inst-size} discussion
+ If you want to allocate memory for an object yourself, you can get its
+ alignment and size with @code{class-inst-size 2@@} ( class --
+ align size ). Once you have memory for an object, you can initialize
+ it with @code{init-object} ( ... class object -- );
+ @code{construct} does only a part of the necessary work.
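+ The three ways of creating an object can be sketched as follows. The
+ example uses the class @code{object} itself, whose @code{construct}
+ method takes no extra arguments, so it does not depend on other
+ definitions; the constant names are hypothetical, and it is assumed
+ that @code{require} can find @file{objects.fs}.
+ @example
+ require objects.fs
+
+ object heap-new constant heap-obj  \ memory from allocate
+ object dict-new constant dict-obj  \ memory from allot
+
+ \ get the memory ourselves, then initialize it
+ object class-inst-size 2@@ ( align size )
+ nip allocate throw constant raw-obj
+ object raw-obj init-object
+
+ raw-obj print
+ @end example
+ @noindent
+ The sketch discards the alignment because it relies on
+ @code{allocate} returning suitably aligned memory; if you carve the
+ object out of a larger buffer, you have to honour the alignment
+ yourself.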
+ @node Object-Oriented Programming Style, Class Binding, Creating objects, Objects
+ @subsubsection Object-Oriented Programming Style
+ @cindex object-oriented programming style
- @node Files, Including Files, Environmental Queries, Words
+ This section is not exhaustive.
- @section Files
- This chapter describes how to operate on files from Forth.
+ @cindex stack effects of selectors
+ @cindex selectors and stack effects
+ In general, it is a good idea to ensure that all methods for the
+ same selector have the same stack effect: when you invoke a selector,
+ you often have no idea which method will be invoked, so, unless all
+ methods have the same stack effect, you will not know the stack effect
+ of the selector invocation.
- Files are opened/created by name and type. The following types are
+ One exception to this rule is methods for the selector
- recognised:
+ @code{construct}. We know which method is invoked, because we
+ specify the class to be constructed at the same place. Actually, I
+ defined @code{construct} as a selector only to give the users a
+ convenient way to specify initialization. The way it is used, a
+ mechanism different from selector invocation would be more natural
+ (but probably would take more code and more space to explain).
- doc-r/o
+ @node Class Binding, Method conveniences, Object-Oriented Programming Style, Objects
- doc-r/w
+ @subsubsection Class Binding
- doc-w/o
+ @cindex class binding
- doc-bin
+ @cindex early binding
- When a file is opened/created, it returns a file identifier,
+ @cindex late binding
- @var{wfileid} that is used for all other file commands. All file
+ Normal selector invocations determine the method at run-time depending
- commands also return a status value, @var{wior}, that is 0 for a
+ on the class of the receiving object. This run-time selection is called
- successful operation and an implementation-defined non-zero value in the
+ @dfn{late binding}.
- case of an error.
- doc-open-file
+ Sometimes it's preferable to invoke a different method. For example,
- doc-create-file
+ you might want to use the simple method for @code{print}ing
+ @code{object}s instead of the possibly long-winded @code{print} method
+ of the receiver class. You can achieve this by replacing the invocation
+ of @code{print} with:
- doc-close-file
+ @cindex @code{[bind]} usage
- doc-delete-file
+ @example
- doc-rename-file
+ [bind] object print
- doc-read-file
+ @end example
- doc-read-line
- doc-write-file
- doc-write-line
- doc-emit-file
- doc-flush-file
- doc-file-status
+ @noindent
- doc-file-position
+ in compiled code or:
- doc-reposition-file
- doc-file-size
- doc-resize-file
- @node Including Files, Blocks, Files, Words
+ @cindex @code{bind} usage
- @section Including Files
+ @example
- @cindex including files
+ bind object print
+ @end example
- @menu
+ @cindex class binding, alternative to
- * Words for Including::
+ @noindent
- * Search Path::
+ in interpreted code. Alternatively, you can define the method with a
- * Forth Search Paths::
+ name (e.g., @code{print-object}), and then invoke it through the
- * General Search Paths::
+ name. Class binding is just an (often more convenient) way to achieve
- @end menu
+ the same effect; it avoids name clutter and allows you to invoke
+ methods directly without naming them first.
- @node Words for Including, Search Path, Including Files, Including Files
+ @cindex superclass binding
- @subsection Words for Including
+ @cindex parent class binding
+ A frequent use of class binding is this: When we define a method
+ for a selector, we often want the method to do what the selector does
+ in the parent class, and a little more. There is a special word for
+ this purpose: @code{[parent]}; @code{[parent]
+ @emph{selector}} is equivalent to @code{[bind] @emph{parent
+ selector}}, where @code{@emph{parent}} is the parent
+ class of the current class. E.g., a method definition might look like:
- doc-include-file
+ @cindex @code{[parent]} usage
- doc-included
+ @example
- doc-include
+ :noname
+   dup [parent] foo \ do parent's foo on the receiving object
+   ... \ do some more
+ ; overrides foo
+ @end example
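+ A small self-contained sketch of both forms of binding follows. The
+ class @code{thing} and the constant @code{my-thing} are hypothetical,
+ and it is assumed that @code{require} can find @file{objects.fs}.
+ @example
+ require objects.fs
+
+ object class
+ :noname ( object -- )
+   ." a thing, built on: " [parent] print ;
+ overrides print
+ end-class thing
+
+ thing heap-new constant my-thing
+
+ my-thing print              \ late binding: thing's method
+ my-thing bind object print  \ class binding: object's method
+ @end example
+ @noindent
+ Here the method for @code{print} consumes the receiving object in the
+ @code{[parent]} call, so no @code{dup} is needed.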
- Usually you want to include a file only if it is not included already
+ @cindex class binding as optimization
- (by, say, another source file):
+ In @cite{Object-oriented programming in ANS Forth} (Forth Dimensions,
- @comment TODO describe what happens on error. Describes how the require
+ March 1997), Andrew McKewan presents class binding as an optimization
- @comment stuff works and describe how to clear/reset the history (eg
+ technique. I recommend not using it for this purpose unless you are in
- @comment for debug). Might want to include that in the MARKER example.
+ an emergency. Late binding is pretty fast with this model anyway, so the
+ benefit of using class binding is small; the cost of using class binding
+ where it is not appropriate is reduced maintainability.
- doc-required
+ While we are at programming style questions: You should bind
- doc-require
+ selectors only to ancestor classes of the receiving object. E.g., suppose
- doc-needs
+ you know that the receiving object is of class @code{foo} or its
+ descendants; then you should bind only to @code{foo} and its
+ ancestors.
- A definition in ANS Standard Forth for @code{required} is provided in
+ @node Method conveniences, Classes and Scoping, Class Binding, Objects
- @file{compat/required.fs}.
+ @subsubsection Method conveniences
+ @cindex method conveniences
- @cindex stack effect of included files
+ In a method you usually access the receiving object pretty often.  If
- @cindex including files, stack effect
+ you define the method as a plain colon definition (e.g., with
- I recommend that you write your source files such that interpreting them
+ @code{:noname}), you may have to do a lot of stack
- does not change the stack. This allows using these files with
+ gymnastics. To avoid this, you can define the method with @code{m:
- @code{required} and friends without complications. E.g.,
+ ... ;m}. E.g., you could define the method for
+ @code{draw}ing a @code{circle} with
+ @cindex @code{this} usage
+ @cindex @code{m:} usage
+ @cindex @code{;m} usage
  @example
- 1 require foo.fs drop
+ m: ( x y circle -- )
+   ( x y ) this circle-radius @@ draw-circle ;m
  @end example
- @node Search Path, Forth Search Paths, Words for Including, Including Files
+ @cindex @code{exit} in @code{m: ... ;m}
- @subsection Search Path
+ @cindex @code{exitm} discussion
- @cindex path for @code{included}
+ @cindex @code{catch} in @code{m: ... ;m}
- @cindex file search path
+ When this method is executed, the receiver object is removed from the
- @cindex include search path
+ stack; you can access it with @code{this} (admittedly, in this
- @cindex search path for files
+ example the use of @code{m: ... ;m} offers no advantage). Note
+ that I specify the stack effect for the whole method (i.e., including
- @comment what uses these search paths.. just include and friends?
+ the receiver object), not just for the code between @code{m:}
- If you specify an absolute filename (i.e., a filename starting with
+ and @code{;m}. You cannot use @code{exit} in
- @file{/} or @file{~}, or with @file{:} in the second position (as in
+ @code{m:...;m}; instead, use
- @samp{C:...})) for @code{included} and friends, that file is included
+ @code{exitm}.@footnote{Moreover, for any word that calls
- just as you would expect.
+ @code{catch} and was defined before loading
+ @file{objects.fs}, you have to redefine it like I redefined
+ @code{catch}: @code{: catch this >r catch r> to-this ;}}
- For relative filenames, Gforth uses a search path similar to Forth's
+ @cindex @code{inst-var} usage
- search order (@pxref{Word Lists}). It tries to find the given filename in
+ You will frequently use sequences of the form @code{this
- the directories present in the path, and includes the first one it
+ @emph{field}} (in the example above: @code{this
- finds.
+ circle-radius}). If you use the field only in this way, you can
+ define it with @code{inst-var} and eliminate the
+ @code{this} before the field name. E.g., the @code{circle}
+ class above could also be defined with:
- If the search path contains the directory @file{.} (as it should), this
+ @example
- refers to the directory that the present file was @code{included}
+ graphical class
- from. This allows files to include other files relative to their own
+   cell% inst-var radius
- position (irrespective of the current working directory or the absolute
- position).  This feature is essential for libraries consisting of
- several files, where a file may include other files from the library.
- It corresponds to @code{#include "..."} in C. If the current input
- source is not a file, @file{.} refers to the directory of the innermost
- file being included, or, if there is no file being included, to the
- current working directory.
- Use @file{~+} to refer to the current working directory (as in the
+ m: ( x y circle -- )
- @code{bash}).
+   radius @@ draw-circle ;m
+ overrides draw
- If the filename starts with @file{./}, the search path is not searched
+ m: ( n-radius circle -- )
- (just as with absolute filenames), and the @file{.} has the same meaning
+   radius ! ;m
- as described above.
+ overrides construct
- @node Forth Search Paths, General Search Paths, Search Path, Including Files
+ end-class circle
- @subsection Forth Search Paths
+ @end example
- @cindex search path control - forth
- The search path is initialized when you start Gforth (@pxref{Invoking
+ @code{radius} can only be used in @code{circle} and its
- Gforth}). You can display it with
+ descendent classes and inside @code{m:...;m}.
- doc-.fpath
+ @cindex @code{inst-value} usage
+ You can also define fields with @code{inst-value}, which is
+ to @code{inst-var} what @code{value} is to
+ @code{variable}.  You can change the value of such a field with
+ @code{[to-inst]}.  E.g., we could also define the class
+ @code{circle} like this:
- You can change it later with the following words:
+ @example
+ graphical class
+   inst-value radius
- doc-fpath+
+ m: ( x y circle -- )
- doc-fpath=
+   radius draw-circle ;m
+ overrides draw
- Using fpath and require would look like:
- @example
+ m: ( n-radius circle -- )
- fpath= /usr/lib/forth/|./
+   [to-inst] radius ;m
+ overrides construct
- require timer.fs
+ end-class circle
  @end example
- If you have the need to look for a file in the Forth search path, you could
- use this Gforth feature in your application:
- doc-open-fpath-file
+ @node Classes and Scoping, Object Interfaces, Method conveniences, Objects
+ @subsubsection Classes and Scoping
+ @cindex classes and scoping
+ @cindex scoping and classes
- @node General Search Paths,  , Forth Search Paths, Including Files
+ Inheritance is frequent, unlike structure extension. This exacerbates
- @subsection General Search Paths
+ the problem with the field name convention (@pxref{Structure Naming
- @cindex search path control - for user applications
+ Convention}): One always has to remember in which class the field was
+ originally defined; changing a part of the class structure would require
+ changes for renaming in otherwise unaffected code.
- Your application may need to search files in several directories, like
+ @cindex @code{inst-var} visibility
- @code{included} does. For this purpose you can define and use your own
+ @cindex @code{inst-value} visibility
- search paths. Create a search path like this:
+ To solve this problem, I added a scoping mechanism (which was not in my
+ original charter): A field defined with @code{inst-var} (or
+ @code{inst-value}) is visible only in the class where it is defined and in
+ the descendent classes of this class.  Using such fields only makes
+ sense in @code{m:}-defined methods in these classes anyway.
- @example
+ This scoping mechanism allows us to use the unadorned field name,
- \ Make a buffer for the path:
+ because name clashes with unrelated words become much less likely.
- create mypath   100 chars ,     \ maximum length (is checked)
-                 0         ,     \ real len
-                 100 chars allot \ space for path
- @end example
- You have the same functions for the forth search path in a generic version
+ @cindex @code{protected} discussion
- for different paths.
+ @cindex @code{private} discussion
+ Once we have this mechanism, we can also use it for controlling the
+ visibility of other words: All words defined after
+ @code{protected} are visible only in the current class and its
+ descendents. @code{public} restores the compilation
+ (i.e. @code{current}) word list that was in effect before. If you
+ have several @code{protected}s without an intervening
+ @code{public} or @code{set-current}, @code{public}
+ will restore the compilation word list in effect before the first of
+ these @code{protected}s.
- Gforth also provides generic equivalents of the Forth search path words:
+ @node Object Interfaces, Objects Implementation, Classes and Scoping, Objects
+ @subsubsection Object Interfaces
+ @cindex object interfaces
+ @cindex interfaces for objects
- doc-.path
+ In this model you can only call selectors defined in the class of the
- doc-path+
+ receiving objects or in one of its ancestors. If you call a selector
- doc-path=
+ with a receiving object that is not in one of these classes, the
- doc-open-path-file
+ result is undefined; if you are lucky, the program crashes
+ immediately.
+ @cindex selectors common to hardly-related classes
+ Now consider the case when you want to have a selector (or several)
+ available in two classes: You would have to add the selector to a
+ common ancestor class, in the worst case to @code{object}. You
+ may not want to do this, e.g., because someone else is responsible for
+ this ancestor class.
- @node Blocks, Other I/O, Including Files, Words
+ The solution for this problem is interfaces. An interface is a
- @section Blocks
+ collection of selectors. If a class implements an interface, the
+ selectors become available to the class and its descendents. A class
+ can implement an unlimited number of interfaces. For the problem
+ discussed above, we would define an interface for the selector(s), and
+ both classes would implement the interface.
- This chapter describes how to use block files within Gforth.
+ As an example, consider an interface @code{storage} for
+ writing objects to disk and getting them back, and a class
+ @code{foo} that implements it. The code would look like this:
- Block files are traditionally means of data and source storage in
+ @cindex @code{interface} usage
- Forth. They have been very important in resource-starved computers
+ @cindex @code{end-interface} usage
- without OS in the past. Gforth doesn't encourage to use blocks as
+ @cindex @code{implementation} usage
- source, and provides blocks only for backward compatibility. The ANS
+ @example
- standard requires blocks to be available when files are.
+ interface
+   selector write ( file object -- )
+   selector read1 ( file object -- )
+ end-interface storage
- @comment TODO what about errors on open-blocks?
+ bar class
- doc-open-blocks
+   storage implementation
- doc-use
- doc-scr
- doc-blk
- doc-get-block-fid
- doc-block-position
- doc-update
- doc-save-buffers
- doc-save-buffer
- doc-empty-buffers
- doc-empty-buffer
- doc-flush
- doc-get-buffer
- doc---block-block
- doc-buffer
- doc-updated?
- doc-list
- doc-load
- doc-thru
- doc-+load
- doc-+thru
- doc---block--->
- doc-block-included
- @node Other I/O, Programming Tools, Blocks, Words
+ ... overrides write
- @section Other I/O
+ ... overrides read1
- @comment TODO more index entries
+ ...
+ end-class foo
+ @end example
- @menu
+ @noindent
- * Simple numeric output::       Predefined formats
+ (I would add a word @code{read} @var{( file -- object )} that uses
- * Formatted numeric output::    Formatted (pictured) output
+ @code{read1} internally, but that's beyond the point illustrated
- * String Formats::              How Forth stores strings in memory
+ here.)
- * Displaying characters and strings:: Other stuff
- * Input::                       Input
- @end menu
- @node Simple numeric output, Formatted numeric output, Other I/O, Other I/O
+ Note that you cannot use @code{protected} in an interface; and
- @subsection Simple numeric output
+ of course you cannot define fields.
- @cindex Simple numeric output
- @comment TODO more index entries
- The simplest output functions are those that display numbers from the
+ In the Neon model, all selectors are available for all classes;
- data or floating-point stacks. Floating-point output is always displayed
+ therefore it does not need interfaces. The price you pay in this model
- using base 10. Numbers displayed from the data stack use the value stored
+ is slower late binding, and therefore, added complexity to avoid late
- in @code{base}.
+ binding.
- doc-.
+ @node Objects Implementation, Objects Glossary, Object Interfaces, Objects
- doc-dec.
+ @subsubsection @file{objects.fs} Implementation
- doc-hex.
+ @cindex @file{objects.fs} implementation
- doc-u.
- doc-.r
- doc-u.r
- doc-d.
- doc-ud.
- doc-d.r
- doc-ud.r
- doc-f.
- doc-fe.
- doc-fs.
- Examples of printing the number 1234.5678E23 in the different floating-point output
+ @cindex @code{object-map} discussion
- formats are shown below:
+ An object is a piece of memory, like one of the data structures
+ described with @code{struct...end-struct}. It has a field
+ @code{object-map} that points to the method map for the object's
+ class.
+ @cindex method map
+ @cindex virtual function table
+ The @emph{method map}@footnote{This is Self terminology; in C++
+ terminology: virtual function table.} is an array that contains the
+ execution tokens (@var{xt}s) of the methods for the object's class. Each
+ selector contains an offset into a method map.
+ @cindex @code{selector} implementation, class
+ @code{selector} is a defining word that uses
+ @code{CREATE} and @code{DOES>}. The body of the
+ selector contains the offset; the @code{does>} action for a
+ class selector is, basically:
  @example
- f. 123456779999999000000000000.
+ ( object addr ) @@ over object-map @@ + @@ execute
- fe. 123.456779999999E24
- fs. 1.23456779999999E26
  @end example
+ Since @code{object-map} is the first field of the object, it
+ does not generate any code. As you can see, calling a selector has a
+ small, constant cost.
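+ As a rough illustration of this dispatch sequence (not the actual
+ @file{objects.fs} source), the following plain ANS Forth sketch builds a
+ two-entry method map by hand. The names @code{my-map}, @code{my-obj},
+ @code{my-selector}, @code{ten}, @code{twenty}, @code{get-ten} and
+ @code{get-twenty} are hypothetical; the first cell of the object plays
+ the role of @code{object-map}:
+ @example
+ : ten    ( object -- n ) drop 10 ;
+ : twenty ( object -- n ) drop 20 ;
+ create my-map  ' ten , ' twenty ,      \ method map: an array of xts
+ create my-obj  my-map ,                \ first cell points to the map
+ : my-selector ( offset "name" -- )
+   create ,                             \ body holds the map offset
+   does> ( object addr -- ... )
+     @@ over @@ + @@ execute ;
+ 0 cells my-selector get-ten
+ 1 cells my-selector get-twenty
+ my-obj get-ten .  my-obj get-twenty .  \ prints 10 20
+ @end example
+ Each call costs one fetch of the offset, one fetch of the map pointer,
+ one addition, one fetch of the @var{xt} and an @code{execute},
+ regardless of how many methods the class has.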
- @node Formatted numeric output, String Formats, Simple numeric output, Other I/O
+ @cindex @code{current-interface} discussion
- @subsection Formatted numeric output
+ @cindex class implementation and representation
- @cindex Formatted numeric output
+ A class is basically a @code{struct} combined with a method
- @cindex pictured numeric output
+ map. During the class definition the alignment and size of the class
- @comment TODO more index entries
+ are passed on the stack, just as with @code{struct}s, so
+ @code{field} can also be used for defining class
+ fields. However, passing more items on the stack would be
+ inconvenient, so @code{class} builds a data structure in memory,
+ which is accessed through the variable
+ @code{current-interface}. After its definition is complete, the
+ class is represented on the stack by a pointer (e.g., as parameter for
+ a child class definition).
- Forth traditionally uses a technique called @var{pictured numeric
+ A new class starts off with the alignment and size of its parent,
- output} for formatted printing of integers.  In this technique,
+ and a copy of the parent's method map. Defining new fields extends the
- digits are extracted from the number (using the current output radix
+ size and alignment; likewise, defining new selectors extends the
- defined by @code{base}), converted to ASCII codes and appended to a
+ method map. @code{overrides} just stores a new @var{xt} in the method
- string that is built in a scratch-pad area of memory
+ map at the offset given by the selector.
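+ The effect of inheriting and overriding can be sketched in plain ANS
+ Forth as follows. This is only a model under simplifying assumptions
+ (a one-entry map, no size or alignment bookkeeping); the names
+ @code{parent-map}, @code{child-map}, @code{area-default},
+ @code{area-square}, @code{area} and @code{sq} are hypothetical:
+ @example
+ : area-default ( object -- n ) drop 0 ;
+ : area-square  ( object -- n ) cell+ @@ dup * ;
+ create parent-map  ' area-default ,    \ parent's one-entry method map
+ create child-map   parent-map @@ ,     \ child starts with a copy
+ ' area-square child-map 0 cells + !    \ override: new xt, same offset
+ create sq  child-map , 3 ,             \ map pointer, then one field
+ : area ( object -- n ) dup @@ 0 cells + @@ execute ;
+ sq area .                              \ prints 9
+ @end example
+ Because the child works on its own copy of the map, storing the new
+ @var{xt} leaves the parent's map untouched.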
- (@pxref{core-idef,Implementation-defined options}). During the extraction
- sequence, other arbitrary characters can be appended to the string. The
- completed string is specified by an address and length and can
- be manipulated (@code{TYPE}ed, copied, modified) under program control.
- All of the words described in the previous section for simple numeric
+ @cindex class binding, implementation
- output are implemented in Gforth using pictured numeric output.
+ Class binding just gets the @var{xt} at the offset given by the selector
+ from the class's method map and @code{compile,}s (in the case of
+ @code{[bind]}) it.
- Three important things to remember about Pictured Numeric Output:
+ @cindex @code{this} implementation
+ @cindex @code{catch} and @code{this}
+ @cindex @code{this} and @code{catch}
+ I implemented @code{this} as a @code{value}. At the
+ start of an @code{m:...;m} method the old @code{this} is
+ stored to the return stack and restored at the end; and the object on
+ the TOS is stored @code{TO this}. This technique has one
+ disadvantage: If the user does not leave the method via
+ @code{;m}, but via @code{throw} or @code{exit},
+ @code{this} is not restored (and @code{exit} may
+ crash). To deal with the @code{throw} problem, I have redefined
+ @code{catch} to save and restore @code{this}; the same
+ should be done with any word that can catch an exception. As for
+ @code{exit}, I simply forbid it (as a replacement, there is
+ @code{exitm}).
- @itemize @bullet
+ @cindex @code{inst-var} implementation
- @item
+ @code{inst-var} is just the same as @code{field}, with
- It always operates on double-precision numbers; to display a single-precision number,
+ a different @code{DOES>} action:
- convert it first (@pxref{Double precision} for ways of doing this).
+ @example
- @item
+ @@ this +
- It always treats the double-precision number as though it were unsigned. Refer to
+ @end example
- the examples below for ways of printing signed numbers.
+ Similar for @code{inst-value}.
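+ A minimal sketch of this idea in plain ANS Forth, assuming a simplified
+ defining word that takes only an offset (the real @code{inst-var} also
+ maintains alignment and size on the stack). The names
+ @code{this-obj}, @code{my-inst-var} and @code{c1} are hypothetical, and
+ @code{this-obj} stands in for @code{this}:
+ @example
+ 0 value this-obj                       \ stands in for this
+ : my-inst-var ( offset "name" -- )
+   create ,  does> ( -- addr ) @@ this-obj + ;
+ 1 cells my-inst-var radius             \ field at offset 1 cell
+ create c1  0 , 0 ,                     \ map slot, then radius
+ c1 to this-obj
+ 50 radius !  radius @@ .               \ prints 50
+ @end example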
- @item
- The string is built up from right to left; least significant digit first.
- @end itemize
- doc-<#
+ @cindex class scoping implementation
- doc-#
+ Each class also has a word list that contains the words defined with
- doc-#s
+ @code{inst-var} and @code{inst-value}, and its protected
- doc-hold
+ words. It also has a pointer to its parent. @code{class} pushes
- doc-sign
+ the word lists of the class and all its ancestors onto the search order stack,
- doc-#>
+ and @code{end-class} drops them.
- doc-represent
+ @cindex interface implementation
+ An interface is like a class without fields, parent and protected
+ words; i.e., it just has a method map. If a class implements an
+ interface, its method map contains a pointer to the method map of the
+ interface. The positive offsets in the map are reserved for class
+ methods, therefore interface map pointers have negative
+ offsets. Interfaces have offsets that are unique throughout the
+ system, unlike class selectors, whose offsets are only unique for the
+ classes where the selector is available (invokable).
- Here are some examples of using pictured numeric output:
+ This structure means that interface selectors have to perform one
+ indirection more than class selectors to find their method. Their body
+ contains the interface map pointer offset in the class method map, and
+ the method offset in the interface method map. The
+ @code{does>} action for an interface selector is, basically:
  @example
- : my-u. ( u -- )
+ ( object selector-body )
-   \ Simplest use of pns.. behaves like Standard u.
+ 2dup selector-interface @@ ( object selector-body object interface-offset )
-   0              \ convert to unsigned double
+ swap object-map @@ + @@ ( object selector-body map )
-   <#             \ start conversion
+ swap selector-offset @@ + @@ execute
-   #s             \ convert all digits
+ @end example
-   #>             \ complete conversion
-   TYPE SPACE ;   \ display, with trailing space
- : cents-only ( u -- )
+ where @code{object-map} and @code{selector-offset} are
-   0              \ convert to unsigned double
+ first fields and generate no code.
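+ The extra indirection can be modelled in plain ANS Forth like this. It
+ is a sketch under stated assumptions, not the @file{objects.fs} code:
+ the interface map pointer is placed one cell before the start of the
+ class map, and all names (@code{if-map}, @code{cl-mem}, @code{cl-map},
+ @code{obj}, @code{if-selector}, @code{if-hello}, @code{greet}) are
+ hypothetical:
+ @example
+ : if-hello ( object -- n ) drop 42 ;
+ create if-map  ' if-hello ,            \ interface method map
+ create cl-mem  if-map , 0 ,            \ interface ptr, then class slot
+ cl-mem cell+ constant cl-map           \ class map starts after the ptr
+ create obj  cl-map ,
+ : if-selector ( if-offset method-offset "name" -- )
+   create , ,                           \ method offset first
+   does> ( object body -- ... )
+     2dup cell+ @@                      \ object body object if-offset
+     swap @@ + @@                       \ object body if-map
+     swap @@ + @@ execute ;
+ -1 cells 0 cells if-selector greet
+ obj greet .                            \ prints 42
+ @end example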
-   <#             \ start conversion
-   # #            \ convert two least-significant digits
-   #>             \ complete conversion, discard other digits
-   TYPE SPACE ;   \ display, with trailing space
- : dollars-and-cents ( u -- )
+ As a concrete example, consider the following code:
-   0              \ convert to unsigned double
-   <#             \ start conversion
-   # #            \ convert two least-significant digits
-   [char] . hold  \ insert decimal point
-   #s             \ convert remaining digits
-   [char] $ hold  \ append currency symbol
-   #>             \ complete conversion
-   TYPE SPACE ;   \ display, with trailing space
- : my-. ( n -- )
+ @example
-   \ handling negatives.. behaves like Standard .
+ interface
-   s>d            \ convert to signed double
+   selector if1sel1
-   swap over dabs \ leave sign byte followed by unsigned double
+   selector if1sel2
-   <#             \ start conversion
+ end-interface if1
-   #s             \ convert all digits
-   rot sign       \ get at sign byte, append "-" if needed
-   #>             \ complete conversion
-   TYPE SPACE ;   \ display, with trailing space
- : account. ( n -- )
+ object class
-   \ accountants don't like minus signs, they use braces
+   if1 implementation
-   \ for negative numbers
+   selector cl1sel1
-   s>d            \ convert to signed double
+   cell% inst-var cl1iv1
-   swap over dabs \ leave sign byte followed by unsigned double
-   <#             \ start conversion
-   2 pick         \ get copy of sign byte
-   0< IF [char] ) hold THEN \ right-most character of output
-   #s             \ convert all digits
-   rot            \ get at sign byte
-   0< IF [char] ( hold THEN
-   #>             \ complete conversion
-   TYPE SPACE ;   \ display, with trailing space
- @end example
- Here are some examples of using these words:
+ ' m1 overrides construct
+ ' m2 overrides if1sel1
+ ' m3 overrides if1sel2
+ ' m4 overrides cl1sel1
+ end-class cl1
- @example
+ create obj1 object dict-new drop
- 1 my-u. 1
+ create obj2 cl1    dict-new drop
- hex -1 my-u. decimal FFFFFFFF
-cents-only 01
-cents-only 34
- 2 dollars-and-cents $0.02
- 1234 dollars-and-cents $12.34
- 123 my-. 123
- -123 my-. -123
- 123 account. 123
- -456 account. (456)
  @end example
+ The data structure created by this code (including the data structure
+ for @code{object}) is shown in the figure
+ (@file{objects-implementation.eps}), assuming a cell size of 4.
+ @comment nac TODO add this diagram..
+ @node Objects Glossary,  , Objects Implementation, Objects
+ @subsubsection @file{objects.fs} Glossary
+ @cindex @file{objects.fs} Glossary
+ doc---objects-bind
+ doc---objects-<bind>
+ doc---objects-bind'
+ doc---objects-[bind]
+ doc---objects-class
+ doc---objects-class->map
+ doc---objects-class-inst-size
+ doc---objects-class-override!
+ doc---objects-construct
+ doc---objects-current'
+ doc---objects-[current]
+ doc---objects-current-interface
+ doc---objects-dict-new
+ doc---objects-drop-order
+ doc---objects-end-class
+ doc---objects-end-class-noname
+ doc---objects-end-interface
+ doc---objects-end-interface-noname
+ doc---objects-exitm
+ doc---objects-heap-new
+ doc---objects-implementation
+ doc---objects-init-object
+ doc---objects-inst-value
+ doc---objects-inst-var
+ doc---objects-interface
+ doc---objects-;m
+ doc---objects-m:
+ doc---objects-method
+ doc---objects-object
+ doc---objects-overrides
+ doc---objects-[parent]
+ doc---objects-print
+ doc---objects-protected
+ doc---objects-public
+ doc---objects-push-order
+ doc---objects-selector
+ doc---objects-this
+ doc---objects-<to-inst>
+ doc---objects-[to-inst]
+ doc---objects-to-this
+ doc---objects-xt-new
+ @c -------------------------------------------------------------
+ @node OOF, Mini-OOF, Objects, Object-oriented Forth
+ @subsection The @file{oof.fs} model
+ @cindex oof
+ @cindex object-oriented programming
+ @cindex @file{objects.fs}
+ @cindex @file{oof.fs}
+ This section describes the @file{oof.fs} package.
+ The package has been in use in bigFORTH since 1991 and has been used for
+ two large applications: a chromatographic system used to
+ create new medicaments, and a graphical user interface library (MINOS).
+ You can find a description (in German) of @file{oof.fs} in @cite{Object
+ oriented bigFORTH} by Bernd Paysan, published in @cite{Vierte Dimension}
+(2), 1994.
+ @menu
+ * Properties of the OOF model::
+ * Basic OOF Usage::
+ * The OOF base class::
+ * Class Declaration::
+ * Class Implementation::
+ @end menu
+ @node Properties of the OOF model, Basic OOF Usage, OOF, OOF
+ @subsubsection Properties of the @file{oof.fs} model
+ @cindex @file{oof.fs} properties
+ @itemize @bullet
+ @item
+ This model combines object oriented programming with information
+ hiding. It helps you write large applications, where scoping is
+ necessary, because it provides class-oriented scoping.
- @node String Formats, Displaying characters and strings, Formatted numeric output, Other I/O
+ @item
- @subsection String Formats
+ Named objects, object pointers, and object arrays can be created;
- @cindex string formats
+ selector invocation uses the ``object selector'' syntax. Selector invocation
+ to objects and/or selectors on the stack is a bit less convenient, but
+ possible.
- @comment TODO more index entries
+ @item
+ Selector invocation and instance variable usage of the active object is
+ straightforward, since both make use of the active object.
- Forth commonly uses two different methods for representing a string:
+ @item
+ Late binding is efficient and easy to use.
- @itemize @bullet
  @item
- @cindex address of counted string
+ State-smart objects parse selectors. However, extensibility is provided
- As a @var{counted string}, represented by a c-addr. The char addressed
+ using a (parsing) selector @code{postpone} and a selector @code{'}.
- by c-addr contains a character-count, n,  of the string and the string
- occupies the subsequent n char addresses in memory.
  @item
- As cell pair on the stack; c-addr u, where u is the length of the string
+ An implementation in ANS Forth is available.
- in characters, and c-addr is the address of the first byte of the string.
  @end itemize
- The ANS Forth Standard encourages the use of the second format when
- representing strings on the stack, whilst conceding that the counted
- string format remains useful as a way of storing strings in memory.
- doc-count
+ @node Basic OOF Usage, The OOF base class, Properties of the OOF model, OOF
+ @subsubsection Basic @file{oof.fs} Usage
+ @cindex @file{oof.fs} usage
- @xref{Memory Blocks} for words that move, copy and search
+ This section uses the same example as for @code{objects} (@pxref{Basic Objects Usage}).
- for strings. @xref{Displaying characters and strings,} for words that
- display characters and strings.
+ You can define a class for graphical objects like this:
- @node Displaying characters and strings, Input, String Formats, Other I/O
+ @cindex @code{class} usage
- @subsection Displaying characters and strings
+ @cindex @code{class;} usage
- @cindex displaying characters and strings
+ @cindex @code{method} usage
- @cindex compiling characters and strings
+ @example
- @cindex cursor control
+ object class graphical \ "object" is the parent class
+   method draw ( x y graphical -- )
+ class;
+ @end example
- @comment TODO more index entries
+ This code defines a class @code{graphical} with an
+ operation @code{draw}.  We can perform the operation
+ @code{draw} on any @code{graphical} object, e.g.:
- This section starts with a glossary of Forth words and ends with a set
+ @example
- of examples.
+ 100 100 t-rex draw
+ @end example
- doc-bl
+ @noindent
- doc-space
+ where @code{t-rex} is an object or object pointer, created with e.g.
- doc-spaces
+ @code{graphical : t-rex}.
- doc-emit
- doc-toupper
- doc-."
- doc-.(
- doc-type
- doc-cr
- doc-at-xy
- doc-page
- doc-s"
- doc-c"
- doc-char
- doc-[char]
- doc-sliteral
- As an example, consider the following text, stored in a file @file{test.fs}:
+ @cindex abstract class
+ How do we create a graphical object? With the present definitions,
+ we cannot create a useful graphical object. The class
+ @code{graphical} describes graphical objects in general, but not
+ any concrete graphical object type (C++ users would call it an
+ @emph{abstract class}); e.g., there is no method for the selector
+ @code{draw} in the class @code{graphical}.
+ For concrete graphical objects, we define child classes of the
+ class @code{graphical}, e.g.:
  @example
- .( text-1)
+ graphical class circle \ "graphical" is the parent class
- : my-word
+   cell var circle-radius
-   ." text-2" cr
+ how:
-   .( text-3)
+   : draw ( x y -- )
- ;
+     circle-radius @@ draw-circle ;
- ." text-4"
+   : init ( n-radius -- )
+     circle-radius ! ;
+ class;
+ @end example
- : my-char
+ Here we define a class @code{circle} as a child of @code{graphical},
-   [char] ALPHABET emit
+ with a field @code{circle-radius}; it defines new methods for the
-   char emit
+ selectors @code{draw} and @code{init} (@code{init} is defined in
- ;
+ @code{object}, the parent class of @code{graphical}).
+ Now we can create a circle in the dictionary with:
+ @example
+ 50 circle : my-circle
  @end example
- When you load this code into Gforth, the following output is generated:
+ @noindent
+ @code{:} invokes @code{init}, thus initializing the field
+ @code{circle-radius} with 50. We can draw this new circle at (100,100)
+ with:
  @example
- @kbd{include test.fs} text-1text-3text-4 ok
+ 100 100 my-circle draw
  @end example
+ @cindex selector invocation, restrictions
+ @cindex class definition, restrictions
+ Note: You can only invoke a selector if the receiving object belongs to
+ the class where the selector was defined or one of its descendents;
+ e.g., you can invoke @code{draw} only for objects belonging to
+ @code{graphical} or its descendents (e.g., @code{circle}). The scoping
+ mechanism will check if you try to invoke a selector that is not
+ defined in this class hierarchy, so you'll get an error at compilation
+ time.
+ @node The OOF base class, Class Declaration, Basic OOF Usage, OOF
+ @subsubsection The @file{oof.fs} base class
+ @cindex @file{oof.fs} base class
+ When you define a class, you have to specify a parent class.  So how do
+ you start defining classes? There is one class available from the start:
+ @code{object}. You have to use it as ancestor for all classes. It is the
+ only class that has no parent. Classes are also objects, except that
+ they don't have instance variables; class manipulation such as
+ inheritance or changing definitions of a class is handled through
+ selectors of the class @code{object}.
+ @code{object} provides a number of selectors:
  @itemize @bullet
  @item
- Messages @code{text-1} and @code{text-3} are displayed because @code{.(}
+ @code{class} for subclassing, @code{definitions} to add definitions
- is an immediate word; it behaves in the same way whether it is used inside
+ later on, and @code{class?} to get type information (is the class a
- or outside a colon definition.
+ subclass of the class passed on the stack?).
- @item
+ doc---object-class
- Message @code{text-4} is displayed because of Gforth's added interpretation
+ doc---object-definitions
- semantics for @code{."}.
+ doc---object-class?
  @item
- Message @code{text-2} is @var{not} displayed, because the text interpreter
+ @code{init} and @code{dispose} as constructor and destructor of the
- performs the compilation semantics for @code{."} within the definition of
+ object. @code{init} is invoked after the object's memory is allocated,
- @code{my-word}.
+ while @code{dispose} also handles deallocation. Thus if you redefine
- @end itemize
+ @code{dispose}, you have to call the parent's dispose with @code{super
+ dispose}, too.
+ doc---object-init
+ doc---object-dispose
- Here are some examples of executing @code{my-word} and @code{my-char}:
+ @item
+ @code{new}, @code{new[]}, @code{:}, @code{ptr}, @code{asptr}, and
+ @code{[]} to create named and unnamed objects and object arrays or
+ object pointers.
+ doc---object-new
+ doc---object-new[]
+ doc---object-:
+ doc---object-ptr
+ doc---object-asptr
+ doc---object-[]
- @example
+ @item
- my-word text-2
+ @code{::} and @code{super} for explicit scoping. You should use explicit
-  ok
+ scoping only for super classes or classes with the same set of instance
- @kbd{my-char fred} Af ok
+ variables. Explicitly-scoped selectors use early binding.
- @kbd{my-char jim} Aj ok
+ doc---object-::
- @end example
+ doc---object-super
- @itemize @bullet
  @item
- Message @code{text-2} is displayed because of the run-time behaviour of
+ @code{self} to get the address of the object
- @code{."}.
+ doc---object-self
  @item
- @code{[char]} compiles the "A" from "ALPHABET" and puts its display code
+ @code{bind}, @code{bound}, @code{link}, and @code{is} to assign object
- on the stack at run-time. @code{emit} always displays the character
+ pointers and instance defers.
- when @code{my-char} is executed.
+ doc---object-bind
+ doc---object-bound
+ doc---object-link
+ doc---object-is
  @item
- @code{char} parses a string at run-time and the second @code{emit} displays
+ @code{'} to obtain selector tokens, @code{send} to invoke selectors
- the first character of the string.
+ from the stack, and @code{postpone} to generate selector invocation code.
+ doc---object-'
+ doc---object-postpone
  @item
- If you type @code{see my-char} you can see that @code{[char]} discarded
+ @code{with} and @code{endwith} to select the active object from the
- the text "LPHABET" and only compiled the display code for "A" into the
+ stack, and enable its scope. Using @code{with} and @code{endwith}
- definition of @code{my-char}.
+ also allows you to create code using selector @code{postpone} without being
+ trapped by the state-smart objects.
+ doc---object-with
+ doc---object-endwith
  @end itemize
+ @node Class Declaration, Class Implementation, The OOF base class, OOF
+ @subsubsection Class Declaration
+ @cindex class declaration
+ @itemize @bullet
+ @item
+ Instance variables
+ doc---oof-var
+ @item
+ Object pointers
+ doc---oof-ptr
+ doc---oof-asptr
- @node Input, , Displaying characters and strings, Other I/O
+ @item
- @subsection Input
+ Instance defers
- @cindex Input
+ doc---oof-defer
- @comment TODO more index entries
- Blah on traditional and recommended string formats.
+ @item
+ Method selectors
+ doc---oof-early
+ doc---oof-method
- doc--trailing
+ @item
- doc-/string
+ Class-wide variables
- doc-convert
+ doc---oof-static
- doc->number
- doc->float
- doc-accept
- doc-query
- doc-expect
- doc-evaluate
- doc-key
- doc-key?
- TODO reference the block move stuff elsewhere
+ @item
+ End declaration
+ doc---oof-how:
+ doc---oof-class;
- TODO convert and >number might be better in the numeric input section.
+ @end itemize
- TODO maybe some of these shouldn't be here but should be in a "parsing" section
+ @c -------------------------------------------------------------
+ @node Class Implementation,  , Class Declaration, OOF
+ @subsubsection Class Implementation
+ @cindex class implementation
+ @c -------------------------------------------------------------
+ @node Mini-OOF, Comparison with other object models, OOF, Object-oriented Forth
+ @subsection The @file{mini-oof.fs} model
+ @cindex mini-oof
- @node Programming Tools, Assembler and Code Words, Other I/O, Words
+ Gforth's third object oriented Forth package is a 12-liner. It uses a
- @section Programming Tools
+ mixture of the @file{objects.fs} and the @file{oof.fs} syntax,
- @cindex programming tools
+ and is reduced to the bare minimum of features. It is based on a posting
+ by Bernd Paysan in comp.arch.
  @menu
- * Debugging::                   Simple and quick.
+ * Basic Mini-OOF Usage::
- * Assertions::                  Making your programs self-checking.
+ * Mini-OOF Example::
- * Singlestep Debugger::         Executing your program word by word.
+ * Mini-OOF Implementation::
  @end menu
- @node Debugging, Assertions, Programming Tools, Programming Tools
+ @c -------------------------------------------------------------
- @subsection Debugging
+ @node Basic Mini-OOF Usage, Mini-OOF Example, , Mini-OOF
- @cindex debugging
+ @subsubsection Basic @file{mini-oof.fs} Usage
+ @cindex mini-oof usage
- Languages with a slow edit/compile/link/test development loop tend to
- require sophisticated tracing/stepping debuggers to facilitate
- productive debugging.
- A much better (faster) way in fast-compiling languages is to add
+ There is a base class (@code{object}, which allocates one cell
- printing code at well-selected places, let the program run, look at
+ for the object pointer) plus seven other words: to define a method, a
- the output, see where things went wrong, add more printing code, etc.,
+ variable, a class; to end a class, to resolve binding, to allocate an
- until the bug is found.
+ object and to compile a class method.
+ @comment TODO better description of the last one
- The simple debugging aids provided in @file{debugs.fs}
+ doc-object
- are meant to support this style of debugging. In addition, there are
+ doc-method
- words for non-destructively inspecting the stack and memory:
+ doc-var
+ doc-class
+ doc-end-class
+ doc-defines
+ doc-new
+ doc-::
- doc-.s
- doc-f.s
- There is a word @code{.r} but it does @var{not} display the return
+ @c -------------------------------------------------------------
- stack! It is used for formatted numeric output.
+ @node Mini-OOF Example, Mini-OOF Implementation, Basic Mini-OOF Usage, Mini-OOF
+ @subsubsection Mini-OOF Example
+ @cindex mini-oof example
- doc-depth
+ A short example shows how to use this package. This example, in slightly
- doc-fdepth
+ extended form, is supplied as @file{moof-exm.fs}.
- doc-clearstack
+ @comment nac TODO could flesh this out with some comments from the Forthwrite article
- doc-?
- doc-dump
- The word @code{~~} prints debugging information (by default the source
+ @example
- location and the stack contents). It is easy to insert. If you use Emacs
+ object class
- it is also easy to remove (@kbd{C-x ~} in the Emacs Forth mode to
+   method init
- query-replace them with nothing). The deferred words
+   method draw
- @code{printdebugdata} and @code{printdebugline} control the output of
+ end-class graphical
- @code{~~}. The default source location output format works well with
+ @end example
- Emacs' compilation mode, so you can step through the program at the
- source level using @kbd{C-x `} (the advantage over a stepping debugger
- is that you can step in any direction and you know where the crash has
- happened or where the strange data has occurred).
- Note that the default actions clobber the contents of the pictured
+ This code defines a class @code{graphical} with an
- numeric output string, so you should not use @code{~~}, e.g., between
+ operation @code{draw}.  We can perform the operation
- @code{<#} and @code{#>}.
+ @code{draw} on any @code{graphical} object, e.g.:
- doc-~~
+ @example
- doc-printdebugdata
+ 100 100 t-rex draw
- doc-printdebugline
+ @end example
- doc-see
+ where @code{t-rex} is an object or object pointer, created with e.g.
- doc-marker
+ @code{graphical new Constant t-rex}.
- Here's an example of using @code{marker} at the start of a source file
+ For concrete graphical objects, we define child classes of the
- that you are debugging; it ensures that you only ever have one copy of
+ class @code{graphical}, e.g.:
- the file's definitions compiled at any time:
  @example
- [IFDEF] my-code
+ graphical class
-     my-code
+   cell var circle-radius
- [ENDIF]
+ end-class circle \ "graphical" is the parent class
- marker my-code
- \ .. definitions start here
+ :noname ( x y -- )
- \ .
+   circle-radius @@ draw-circle ; circle defines draw
- \ .
+ :noname ( r -- )
- \ end
+   circle-radius ! ; circle defines init
  @end example
+ There is no implicit @code{init} method, so we have to define one. The
+ code that creates the object now has to call @code{init} explicitly.
- @node Assertions, Singlestep Debugger, Debugging, Programming Tools
- @subsection Assertions
- @cindex assertions
- It is a good idea to make your programs self-checking, in particular, if
- you use an assumption (e.g., that a certain field of a data structure is
- never zero) that may become wrong during maintenance. Gforth supports
- assertions for this purpose. They are used like this:
  @example
- assert( @var{flag} )
+ circle new Constant my-circle
+ 50 my-circle init
  @end example
- The code between @code{assert(} and @code{)} should compute a flag, that
+ It is also possible to add a function to create named objects with
- should be true if everything is alright and false otherwise. It should
+ automatic call of @code{init}, given that all objects have @code{init}
- not change anything else on the stack. The overall stack effect of the
+ in the same place:
- assertion is @code{( -- )}. E.g.
  @example
- assert( 1 1 + 2 = ) \ what we learn in school
+ : new: ( .. o "name" -- )
- assert( dup 0<> ) \ assert that the top of stack is not zero
+     new dup Constant init ;
- assert( false ) \ this code should not be reached
+ 80 circle new: large-circle
  @end example
- The need for assertions is different at different times. During
+ We can draw this new circle at (100,100) with:
- debugging, we want more checking, in production we sometimes care more
- for speed. Therefore, assertions can be turned off, i.e., the assertion
- becomes a comment. Depending on the importance of an assertion and the
- time it takes to check it, you may want to turn off some assertions and
- keep others turned on. Gforth provides several levels of assertions for
- this purpose:
- doc-assert0(
+ @example
- doc-assert1(
+ 100 100 my-circle draw
- doc-assert2(
+ @end example
- doc-assert3(
- doc-assert(
- doc-)
- @code{Assert(} is the same as @code{assert1(}. The variable
+ @node Mini-OOF Implementation, , Mini-OOF Example, Mini-OOF
- @code{assert-level} specifies the highest assertions that are turned
+ @subsubsection @file{mini-oof.fs} Implementation
- on. I.e., at the default @code{assert-level} of one, @code{assert0(} and
- @code{assert1(} assertions perform checking, while @code{assert2(} and
- @code{assert3(} assertions are treated as comments.
- Note that the @code{assert-level} is evaluated at compile-time, not at
- run-time. I.e., you cannot turn assertions on or off at run-time, you
- have to set the @code{assert-level} appropriately before compiling a
- piece of code. You can compile several pieces of code at several
- @code{assert-level}s (e.g., a trusted library at level 1 and newly
- written code at level 3).
- doc-assert-level
+ Object-oriented systems with late binding typically use a
+ ``vtable''-approach: the first variable in each object is a pointer to a
+ table, which contains the methods as function pointers. The vtable
+ may also contain other information.
- If an assertion fails, a message compatible with Emacs' compilation mode
+ So first, let's declare methods:
- is produced and the execution is aborted (currently with @code{ABORT"}.
- If there is interest, we will introduce a special throw code. But if you
- intend to @code{catch} a specific condition, using @code{throw} is
- probably more appropriate than an assertion).
- Definitions in ANS Standard Forth for these assertion words are provided
+ @example
- in @file{compat/assert.fs}.
+ : method ( m v -- m' v ) Create  over , swap cell+ swap
+   DOES> ( ... o -- ... ) @@ over @@ + @@ execute ;
+ @end example
+ During method declaration, the number of methods and instance
+ variables is on the stack (in address units). @code{method} creates
+ one method and increments the method number. To execute a method, it
+ takes the object, fetches the vtable pointer, adds the offset, and
+ executes the @var{xt} stored there. Each method takes the object it is
+ invoked from as top of stack parameter. The method itself should
+ consume that object.
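+ As a hedged illustration of this dispatch (a sketch, not part of
+ @file{mini-oof.fs}; it assumes that file can be loaded, and the names
+ @code{greeter}, @code{greet} and @code{g} are made up for this example),
+ the last line below spells out by hand the steps that the @code{DOES>}
+ part of @code{method} performs:
+ @example
+ require mini-oof.fs
+ object class
+   method greet
+ end-class greeter
+ :noname ( o -- ) drop ." hi" ; greeter defines greet
+ greeter new Constant g
+ g greet                                  \ ordinary late-bound call
+ g dup @@ ' greet >body @@ + @@ execute   \ same call: vtable, offset, xt
+ @end example
+ Both of the last two lines should run the same definition, because the
+ second one merely fetches the vtable pointer from the object, adds the
+ offset stored in the body of @code{greet}, and executes the xt found there.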
- @node Singlestep Debugger, , Assertions, Programming Tools
+ Now, we also have to declare instance variables:
- @subsection Singlestep Debugger
- @cindex singlestep Debugger
- @cindex debugging Singlestep
- @cindex @code{dbg}
- @cindex @code{BREAK:}
- @cindex @code{BREAK"}
- When a new word is created there's often the need to check whether it behaves
+ @example
- correctly or not. You can do this by typing @code{dbg badword}.
+ : var ( m v size -- m v' ) Create  over , +
+   DOES> ( o -- addr ) @@ + ;
+ @end example
- doc-dbg
+ As before, a word is created with the current offset. Instance
+ variables can have different sizes (cells, floats, doubles, chars), so
+ all we do is take the size and add it to the offset. If your machine
+ has alignment restrictions, put the proper @code{aligned} or
+ @code{faligned} before the variable, to adjust the variable
+ offset. That's why it is on the top of stack.
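+ A small sketch of that advice follows. The class @code{sample} and its
+ field names are hypothetical, and the sketch assumes that the object
+ itself starts at a suitably aligned address:
+ @example
+ object class
+   1 chars var tag       \ leaves the offset unaligned
+   aligned               \ round the offset up to a cell boundary
+   cell var hits
+   faligned              \ round the offset up to a float boundary
+   1 floats var weight
+ end-class sample
+ @end example
+ Since the variable offset is the top stack item between declarations,
+ @code{aligned} and @code{faligned} can be applied to it directly.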
- This might look like:
+ We need a starting point (the base object) and some syntactic sugar:
  @example
- : badword 0 DO i . LOOP ;  ok
+ Create object  1 cells , 2 cells ,
-dbg badword
+ : class ( class -- class methods vars ) dup 2@@ ;
- : badword
+ @end example
- Scanning code...
- Nesting debugger ready!
+ For inheritance, the vtable of the parent object has to be
+ copied when a new, derived class is declared. This gives all the
+ methods of the parent class, which can be overridden, though.
-D4738  8049BC4 0              -> [ 2 ] 00002 00000
+ @example
-D4740  8049F68 DO             -> [ 0 ]
+ : end-class  ( class methods vars -- )
-D4744  804A0C8 i              -> [ 1 ] 00000
+   Create  here >r , dup , 2 cells ?DO ['] noop , 1 cells +LOOP
-D4748 400C5E60 .              -> 0 [ 0 ]
+   cell+ dup cell+ r> rot @@ 2 cells /string move ;
-D474C  8049D0C LOOP           -> [ 0 ]
-D4744  804A0C8 i              -> [ 1 ] 00001
-D4748 400C5E60 .              -> 1 [ 0 ]
-D474C  8049D0C LOOP           -> [ 0 ]
-D4758  804B384 ;              ->  ok
  @end example
- Each line displayed is one step. You always have to hit return to
+ The first line creates the vtable, initialized with
- execute the next word that is displayed. If you don't want to execute
+ @code{noop}s. The second line is the inheritance mechanism; it
- the next word in a whole, you have to type @kbd{n} for @code{nest}. Here is
+ copies the xts from the parent vtable.
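+ Two consequences of this can be tried interactively. The following is
+ only a sketch: it assumes that @file{mini-oof.fs} is loaded, and the
+ names @code{thing}, @code{poke} and @code{child-thing} are invented for
+ the example.
+ @example
+ object class
+   method poke
+ end-class thing
+ thing new dup poke = .   \ poke is still noop, so the object stays on the stack
+ :noname ( o -- ) drop ." poked" ; thing defines poke
+ thing class
+ end-class child-thing    \ copies the xt of poke from thing
+ child-thing new poke     \ runs the definition inherited from thing
+ @end example
+ Because the copy is made when @code{end-class} runs, a method that the
+ parent class defines only afterwards does not show up in the child.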
- an overview what keys are available:
- @table @i
+ We still have no way to define new methods; let's do that now:
- @item <return>
+ @example
- Next; Execute the next word.
+ : defines ( xt class -- ) ' >body @@ + ! ;
+ @end example
- @item n
+ To allocate a new object, we need a word, too:
- Nest; Single step through next word.
- @item u
+ @example
- Unnest; Stop debugging and execute rest of word. If we got to this word
+ : new ( class -- o )  here over @@ allot swap over ! ;
- with nest, continue debugging with the calling word.
+ @end example
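+ Objects created with @code{new} live in the dictionary. If heap
+ allocation is wanted instead, a variant along the following lines should
+ work; @code{heap-new} is a hypothetical name that is not part of
+ @file{mini-oof.fs}, and the sketch relies only on the ANS Forth
+ memory-allocation word @code{allocate}:
+ @example
+ : heap-new ( class -- o )
+   dup @@ allocate throw   \ the first cell of a class holds the object size
+   swap over ! ;           \ store the class (vtable) pointer in the object
+ @end example
+ An object obtained this way can later be released with @code{free throw}.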
- @item d
+ Sometimes derived classes want to access the method of the
- Done; Stop debugging and execute rest.
+ parent object. There are two ways to achieve this with Mini-OOF:
+ first, you could use named words, and second, you could look up the
+ vtable of the parent object.
- @item s
+ @example
- Stopp; Abort immediately.
+ : :: ( class "name" -- ) ' >body @@ + @@ compile, ;
+ @end example
- @end table
- Debugging large application with this mechanism is very difficult, because
+ Nothing can be more confusing than a good example, so here is
- you have to nest very deep into the program before the interesting part
+ one. First let's declare a text object (called
- begins. This takes a lot of time.
+ @code{button}), that stores text and position:
- To do it more directly put a @code{BREAK:} command into your source code.
+ @example
- When program execution reaches @code{BREAK:} the single step debugger is
+ object class
- invoked and you have all the features described above.
+   cell var text
+   cell var len
+   cell var x
+   cell var y
+   method init
+   method draw
+ end-class button
+ @end example
- If you have more than one part to debug it is useful to know where the
+ @noindent
- program has stopped at the moment. You can do this by the
+ Now, implement the two methods, @code{draw} and @code{init}:
- @code{BREAK" string"} command. This behaves like @code{BREAK:} except that
- string is typed out when the ``breakpoint'' is reached.
- @node Assembler and Code Words, Threading Words, Programming Tools, Words
+ @example
- @section Assembler and Code Words
+ :noname ( o -- )
- @cindex assembler
+  >r r@@ x @@ r@@ y @@ at-xy  r@@ text @@ r> len @@ type ;
- @cindex code words
+  button defines draw
+ :noname ( addr u o -- )
+  >r 0 r@@ x ! 0 r@@ y ! r@@ len ! r> text ! ;
+  button defines init
+ @end example
- Gforth provides some words for defining primitives (words written in
+ @noindent
- machine code), and for defining the machine-code equivalent of
+ To demonstrate inheritance, we define a class @code{bold-button}, with no
- @code{DOES>}-based defining words. However, the machine-independent
+ new data and no new methods:
- nature of Gforth poses a few problems: First of all, Gforth runs on
- several architectures, so it can provide no standard assembler. What's
- worse is that the register allocation not only depends on the processor,
- but also on the @code{gcc} version and options used.
- The words that Gforth offers encapsulate some system dependences (e.g., the
+ @example
- header structure), so a system-independent assembler may be used in
+ button class
- Gforth. If you do not have an assembler, you can compile machine code
+ end-class bold-button
- directly with @code{,} and @code{c,}.
- doc-assembler
+ : bold   27 emit ." [1m" ;
- doc-code
+ : normal 27 emit ." [0m" ;
- doc-end-code
+ @end example
- doc-;code
- doc-flush-icache
- If @code{flush-icache} does not work correctly, @code{code} words
+ @noindent
- etc. will not work (reliably), either.
+ The class @code{bold-button} has a different draw method to
+ @code{button}, but the new method is defined in terms of the draw method
+ for @code{button}:
- These words are rarely used. Therefore they reside in @code{code.fs},
+ @example
- which is usually not loaded (except @code{flush-icache}, which is always
+ :noname bold [ button :: draw ] normal ; bold-button defines draw
- present). You can load them with @code{require code.fs}.
+ @end example
- @cindex registers of the inner interpreter
+ @noindent
- In the assembly code you will want to refer to the inner interpreter's
+ Finally, create two objects and apply methods:
- registers (e.g., the data stack pointer) and you may want to use other
- registers for temporary storage. Unfortunately, the register allocation
- is installation-dependent.
- The easiest solution is to use explicit register declarations
+ @example
- (@pxref{Explicit Reg Vars, , Variables in Specified Registers, gcc.info,
+ button new Constant foo
- GNU C Manual}) for all of the inner interpreter's registers: You have to
+ s" thin foo" foo init
- compile Gforth with @code{-DFORCE_REG} (configure option
+ page
- @code{--enable-force-reg}) and the appropriate declarations must be
+ foo draw
- present in the @code{machine.h} file (see @code{mips.h} for an example;
+ bold-button new Constant bar
- you can find a full list of all declarable register symbols with
+ s" fat bar" bar init
- @code{grep register engine.c}). If you give explicit registers to all
+ 1 bar y !
- variables that are declared at the beginning of @code{engine()}, you
+ bar draw
- should be able to use the other caller-saved registers for temporary
+ @end example
- storage. Alternatively, you can use the @code{gcc} option
- @code{-ffixed-REG} (@pxref{Code Gen Options, , Options for Code
- Generation Conventions, gcc.info, GNU C Manual}) to reserve a register
- (however, this restriction on register allocation may slow Gforth
- significantly).
- If this solution is not viable (e.g., because @code{gcc} does not allow
- you to explicitly declare all the registers you need), you have to find
- out by looking at the code where the inner interpreter's registers
- reside and which registers can be used for temporary storage. You can
- get an assembly listing of the engine's code with @code{make engine.s}.
- In any case, it is good practice to abstract your assembly code from the
+ @node Comparison with other object models, , Mini-OOF, Object-oriented Forth
- actual register allocation. E.g., if the data stack pointer resides in
+ @subsubsection Comparison with other object models
- register @code{$17}, create an alias for this register called @code{sp},
+ @cindex comparison of object models
- and use that in your assembly code.
+ @cindex object models, comparison
- @cindex code words, portable
+ Many object-oriented Forth extensions have been proposed (@cite{A survey
- Another option for implementing normal and defining words efficiently
+ of object-oriented Forths} (SIGPLAN Notices, April 1996) by Bradford
- is: adding the wanted functionality to the source of Gforth. For normal
+ J. Rodriguez and W. F. S. Poehlman lists 17). This section discusses the
- words you just have to edit @file{primitives} (@pxref{Automatic
+ relation of the object models described here to two well-known and two
- Generation}), defining words (equivalent to @code{;CODE} words, for fast
+ closely-related (by the use of method maps) models.
- defined words) may require changes in @file{engine.c}, @file{kernel.fs},
- @file{prims2x.fs}, and possibly @file{cross.fs}.
+ @cindex Neon model
+ The most popular model currently seems to be the Neon model (see
+ @cite{Object-oriented programming in ANS Forth} (Forth Dimensions, March
+ 1997) by Andrew McKewan) but this model has a number of limitations
+ @footnote{A longer version of this critique can be
+ found in @cite{On Standardizing Object-Oriented Forth Extensions} (Forth
+ Dimensions, May 1997) by Anton Ertl.}:
- @node Threading Words, Passing Commands to the OS, Assembler and Code Words, Words
+ @itemize @bullet
- @section Threading Words
+ @item
- @cindex threading words
+ It uses a @code{@emph{selector
+ object}} syntax, which makes it unnatural to pass objects on the
+ stack.
- @cindex code address
+ @item
- These words provide access to code addresses and other threading stuff
+ It requires that the selector parses the input stream (at
- in Gforth (and, possibly, other interpretive Forths). It more or less
+ compile time); this leads to reduced extensibility and to bugs that are
- abstracts away the differences between direct and indirect threading
+ hard to find.
- (and, for direct threading, the machine dependences). However, at
- present this wordset is still incomplete. It is also pretty low-level;
- some day it will hopefully be made unnecessary by an internals wordset
- that abstracts implementation details away completely.
- doc-threading-method
+ @item
- doc->code-address
+ It allows using every selector on every object;
- doc->does-code
+ this eliminates the need for classes, but makes it harder to create
- doc-code-address!
+ efficient implementations.
- doc-does-code!
+ @end itemize
- doc-does-handler!
- doc-/does-handler
- The code addresses produced by various defining words are produced by
+ @cindex Pountain's object-oriented model
- the following words:
+ Another well-known publication is @cite{Object-Oriented Forth} (Academic
+ Press, London, 1987) by Dick Pountain. However, it is not really about
+ object-oriented programming, because it hardly deals with late
+ binding. Instead, it focuses on features like information hiding and
+ overloading that are characteristic of modular languages like Ada (83).
- doc-docol:
+ @cindex Zsoter's object-oriented model
- doc-docon:
+ In @cite{Does late binding have to be slow?} (Forth Dimensions 18(1) 1996, pages 31-35)
- doc-dovar:
+ Andras Zsoter describes a model that makes heavy use of an active object
- doc-douser:
+ (like @code{this} in @file{objects.fs}): The active object is not only
- doc-dodefer:
+ used for accessing all fields, but also specifies the receiving object
- doc-dofield:
+ of every selector invocation; you have to change the active object
+ explicitly with @code{@{ ... @}}, whereas in @file{objects.fs} it
+ changes more or less implicitly at @code{m: ... ;m}. Such a change at
+ the method entry point is unnecessary with Zsoter's model, because
+ the receiving object is the active object already. On the other hand, the explicit
+ change is absolutely necessary in that model, because otherwise no one
+ could ever change the active object. An ANS Forth implementation of this
+ model is available at @url{http://www.forth.org/fig/oopf.html}.
- You can recognize words defined by a @code{CREATE}...@code{DOES>} word
+ @cindex @file{oof.fs}, differences to other models
- with @code{>DOES-CODE}. If the word was defined in that way, the value
+ The @file{oof.fs} model combines information hiding and overloading
- returned is different from 0 and identifies the @code{DOES>} used by the
+ resolution (by keeping names in various word lists) with object-oriented
- defining word.
+ programming. It sets the active object implicitly on method entry, but
- @comment TODO should that be "identifies the xt of the DOES> ??
+ also allows explicit changing (with @code{>o...o>} or with
+ @code{with...endwith}). It uses parsing and state-smart objects and
+ classes for resolving overloading and for early binding: the object or
+ class parses the selector and determines the method from this. If the
+ selector is not parsed by an object or class, it performs a call to the
+ selector for the active object (late binding), like Zsoter's model.
+ Fields are always accessed through the active object. The big
+ disadvantage of this model is the parsing and the state-smartness, which
+ reduces extensibility and increases the opportunities for subtle bugs;
+ essentially, you are only safe if you never tick or @code{postpone} an
+ object or class (Bernd disagrees, but I (Anton) am not convinced).
+ @cindex @file{mini-oof.fs}, differences to other models
+ The @file{mini-oof.fs} model is quite similar to a very stripped-down version of
+ the @file{objects.fs} model, but syntactically it is a mixture of the @file{objects.fs} and
+ @file{oof.fs} models.
- @node Passing Commands to the OS, Miscellaneous Words, Threading Words, Words
+ @c -------------------------------------------------------------
+ @node Passing Commands to the OS, Miscellaneous Words, Object-oriented Forth, Words
  @section Passing Commands to the Operating System
  @cindex operating system - passing commands
  @cindex shell commands
- Line 7102  doc-system
+ Line 7275  doc-system
  doc-$?
  doc-getenv
+ @c -------------------------------------------------------------
  @node Miscellaneous Words,  , Passing Commands to the OS, Words
  @section Miscellaneous Words
  @cindex miscellaneous words
- These section lists the ANS Standard Forth words that are not documented
+ This section lists the ANS Forth words that are not documented
  elsewhere in this manual. Ultimately, they all need proper homes.
  doc-,
- Line 7126  doc-word
+ Line 7299  doc-word
  doc-[compile]
  doc-refill
- These ANS Standard Forth words are not currently implemented in Gforth
+ These ANS Forth words are not currently implemented in Gforth
  (see TODO section on dependencies)
- The following ANS Standard Forth words are not currently supported by Gforth
+ The following ANS Forth words are not currently supported by Gforth
  (@pxref{ANS conformance})
  @code{EDITOR}
- Line 7385  installation-dependent. Currently a char
+ Line 7558  installation-dependent. Currently a char
  @item character-set extensions and matching of names:
  @cindex character-set extensions and matching of names
- @cindex case sensitivity for name lookup
+ @cindex case-sensitivity for name lookup
- @cindex name lookup, case sensitivity
+ @cindex name lookup, case-sensitivity
- @cindex locale and case sensitivity
+ @cindex locale and case-sensitivity
  Any character except the ASCII NUL character can be used in a
  name. Matching is case-insensitive (except in @code{TABLE}s). The
  matching is performed using the C function @code{strncasecmp}, whose
- Line 7413  like @code{PARSE} otherwise. @code{(NAME
+ Line 7586  like @code{PARSE} otherwise. @code{(NAME
  interpreter (aka text interpreter) by default, treats all white-space
  characters as delimiters.
- @item format of the control flow stack:
+ @item format of the control-flow stack:
- @cindex control flow stack, format
+ @cindex control-flow stack, format
- The data stack is used as control flow stack. The size of a control flow
+ The data stack is used as control-flow stack. The size of a control-flow
  stack item in cells is given by the constant @code{cs-item-size}. At the
  time of this writing, an item consists of a (pointer to a) locals list
  (third), an address in the code (second), and a tag for identifying the
- Line 7443  The error string is stored into the vari
+ Line 7616  The error string is stored into the vari
  @item input line terminator:
  @cindex input line terminator
  @cindex line terminator on input
- @cindex newline charcter on input
+ @cindex newline character on input
  For interactive input, @kbd{C-m} (CR) and @kbd{C-j} (LF) terminate
  lines. One of these characters is typically produced when you type the
  @kbd{Enter} or @kbd{Return} key.
- Line 7548  The remainder of dictionary space. @code
+ Line 7721  The remainder of dictionary space. @code
  @item system case-sensitivity characteristics:
  @cindex case-sensitivity characteristics
- Dictionary searches are case insensitive (except in
+ Dictionary searches are case-insensitive (except in
  @code{TABLE}s). However, as explained above under @i{character-set
  extensions}, the matching for non-ASCII characters is determined by the
  locale you are using. In the default @code{C} locale all non-ASCII
- Line 7594  No.
+ Line 7767  No.
  @item a name is neither a word nor a number:
  @cindex name not found
- @cindex Undefined word
+ @cindex undefined word
  @code{-13 throw} (Undefined word). Actually, @code{-13 bounce}, which
  preserves the data and FP stack, so you don't lose more work than
  necessary.
  @item a definition name exceeds the maximum length allowed:
- @cindex Word name too long
+ @cindex word name too long
  @code{-19 throw} (Word name too long)
  @item addressing a region not inside the various data spaces of the forth system:
- Line 7611  the operating system. On decent systems:
+ Line 7784  the operating system. On decent systems:
  address).
  @item argument type incompatible with parameter:
- @cindex Argument type mismatch
+ @cindex argument type mismatch
  This is usually not caught. Some words perform checks, e.g., the control
  flow words, and issue a @code{ABORT"} or @code{-12 THROW} (Argument type
  mismatch).
- Line 7626  get an execution token for @code{compile
+ Line 7799  get an execution token for @code{compile
  @item dividing by zero:
  @cindex dividing by zero
  @cindex floating point unidentified fault, integer division
- @cindex divide by zero
  On better platforms, this produces a @code{-10 throw} (Division by
  zero); on other systems, this typically results in a @code{-55 throw}
  (Floating-point unidentified fault).
- Line 7634  zero); on other systems, this typically
+ Line 7806  zero); on other systems, this typically
  @item insufficient data stack or return stack space:
  @cindex insufficient data stack or return stack space
  @cindex stack overflow
- @cindex Address alignment exception, stack overflow
+ @cindex address alignment exception, stack overflow
  @cindex Invalid memory address, stack overflow
  Depending on the operating system, the installation, and the invocation
  of Gforth, this is either checked by the memory management hardware, or
- Line 7729  Compiles a recursive call to the definin
+ Line 7901  Compiles a recursive call to the definin
  @item argument input source different than current input source for @code{RESTORE-INPUT}:
  @cindex argument input source different than current input source for @code{RESTORE-INPUT}
- @cindex Argument type mismatch, @code{RESTORE-INPUT}
+ @cindex argument type mismatch, @code{RESTORE-INPUT}
  @cindex @code{RESTORE-INPUT}, Argument type mismatch
  @code{-12 THROW}. Note that, once an input file is closed (e.g., because
  the end of the file was reached), its source-id may be
- Line 7748  memory access faults or execution of ill
+ Line 7920  memory access faults or execution of ill
  @item data space read/write with incorrect alignment:
  @cindex data space read/write with incorrect alignment
  @cindex alignment faults
- @cindex Address alignment exception
+ @cindex address alignment exception
  Processor-dependent. Typically results in a @code{-23 throw} (Address
  alignment exception). Under Linux-Intel on a 486 or later processor with
  alignment turned on, incorrect alignment results in a @code{-9 throw}
- Line 7781  defined by @code{CONSTANT}; in the latte
+ Line 7953  defined by @code{CONSTANT}; in the latte
  @item name not found (@code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]}):
  @cindex name not found (@code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]})
- @cindex Undefined word, @code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]}
+ @cindex undefined word, @code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]}
  @code{-13 throw} (Undefined word)
  @item parameters are not of the same type (@code{DO}, @code{?DO}, @code{WITHIN}):
- Line 7795  Assume @code{: X POSTPONE TO ; IMMEDIATE
+ Line 7967  Assume @code{: X POSTPONE TO ; IMMEDIATE
  compilation semantics of @code{TO}.
  @item String longer than a counted string returned by @code{WORD}:
- @cindex String longer than a counted string returned by @code{WORD}
+ @cindex string longer than a counted string returned by @code{WORD}
  @cindex @code{WORD}, string overflow
  Not checked. The string will be ok, but the count will, of course,
  contain only the least significant bits of the length.
- Line 8507  as well as possible.
+ Line 8679  as well as possible.
  @cindex @code{FORGET}, deleting the compilation word list
  Not implemented (yet).
- @item fewer than @var{u}+1 items on the control flow stack (@code{CS-PICK}, @code{CS-ROLL}):
+ @item fewer than @var{u}+1 items on the control-flow stack (@code{CS-PICK}, @code{CS-ROLL}):
- @cindex @code{CS-PICK}, fewer than @var{u}+1 items on the control flow stack
+ @cindex @code{CS-PICK}, fewer than @var{u}+1 items on the control-flow stack
- @cindex @code{CS-ROLL}, fewer than @var{u}+1 items on the control flow stack
+ @cindex @code{CS-ROLL}, fewer than @var{u}+1 items on the control-flow stack
  @cindex control-flow stack underflow
  This typically results in an @code{abort"} with a descriptive error
  message (may change into a @code{-22 throw} (Control structure mismatch)
- Line 8596  are applied to the latest defined word (
+ Line 8768  are applied to the latest defined word (
  @item search order empty (@code{previous}):
  @cindex @code{previous}, search order empty
- @cindex Vocstack empty, @code{previous}
+ @cindex vocstack empty, @code{previous}
  @code{abort" Vocstack empty"}.
  @item too many word lists in search order (@code{also}):
  @cindex @code{also}, too many word lists in search order
- @cindex Vocstack full, @code{also}
+ @cindex vocstack full, @code{also}
  @code{abort" Vocstack full"}.
  @end table
- Line 8664  Signals?
+ Line 8836  Signals?
  Accessing the Stacks
+ @c ******************************************************************
  @node Emacs and Gforth, Image Files, Integrating Gforth, Top
  @chapter Emacs and Gforth
  @cindex Emacs and Gforth
- Line 8678  Accessing the Stacks
+ Line 8851  Accessing the Stacks
  @cindex Forth mode in Emacs
  Gforth comes with @file{gforth.el}, an improved version of
  @file{forth.el} by Goran Rydqvist (included in the TILE package). The
- improvements are a better (but still not perfect) handling of
+ improvements are:
- indentation. I have also added comment paragraph filling (@kbd{M-q}),
- commenting (@kbd{C-x \}) and uncommenting (@kbd{C-u C-x \}) regions and
+ @itemize @bullet
- removing debugging tracers (@kbd{C-x ~}, @pxref{Debugging}). I left the
+ @item
- stuff I do not use alone, even though some of it only makes sense for
+ A better (but still not perfect) handling of indentation.
- TILE. To get a description of these features, enter Forth mode and type
+ @item
- @kbd{C-h m}.
+ Comment paragraph filling (@kbd{M-q}).
+ @item
+ Commenting (@kbd{C-x \}) and uncommenting (@kbd{C-u C-x \}) of regions.
+ @item
+ Removal of debugging tracers (@kbd{C-x ~}, @pxref{Debugging}).
+ @end itemize
+ I left the stuff I do not use alone, even though some of it only makes
+ sense for TILE. To get a description of these features, enter Forth mode
+ and type @kbd{C-h m}.
  @cindex source location of error or debugging output in Emacs
  @cindex error output, finding the source location in Emacs
- Line 8700  message is only a few keystrokes away (@
+ Line 8882  message is only a few keystrokes away (@
  @cindex @file{TAGS} file
  @cindex @file{etags.fs}
  @cindex viewing the source of a word in Emacs
- Also, if you @code{include} @file{etags.fs}, a new @file{TAGS} file
+ Also, if you @code{include} @file{etags.fs}, a new @file{TAGS} file will
- (@pxref{Tags, , Tags Tables, emacs, Emacs Manual}) will be produced that
+ be produced (@pxref{Tags, , Tags Tables, emacs, Emacs Manual}) that
  contains the definitions of all words defined afterwards. You can then
  find the source for a word using @kbd{M-.}. Note that emacs can use
  several tags files at the same time (e.g., one for the Gforth sources
- Line 8719  file:
+ Line 8901  file:
  (setq auto-mode-alist (cons '("\\.fs\\'" . forth-mode) auto-mode-alist))
  @end example
+ @c ******************************************************************
  @node Image Files, Engine, Emacs and Gforth, Top
  @chapter Image Files
- @cindex image files
+ @cindex image file
- @cindex @code{.fi} files
+ @cindex @file{.fi} files
  @cindex precompiled Forth code
  @cindex dictionary in persistent form
  @cindex persistent form of dictionary
- Line 8774  Our Forth system consists not only of pr
+ Line 8957  Our Forth system consists not only of pr
  definitions written in Forth. Since the Forth compiler itself belongs to
  those definitions, it is not possible to start the system with the
  primitives and the Forth source alone. Therefore we provide the Forth
- code as an image file in nearly executable form. At the start of the
+ code as an image file in nearly executable form. When Gforth starts up,
- system a C routine loads the image file into memory, optionally
+ a C routine loads the image file into memory, optionally relocates the
- relocates the addresses, then sets up the memory (stacks etc.) according
+ addresses, then sets up the memory (stacks etc.) according to
- to information in the image file, and starts executing Forth code.
+ information in the image file, and (finally) starts executing Forth
+ code.
  The image file variants represent different compromises between the
  goals of making it easy to generate image files and making them
  portable.
  @cindex relocation at run-time
- Win32Forth 3.4 and Mitch Bradleys @code{cforth} use relocation at
+ Win32Forth 3.4 and Mitch Bradley's @code{cforth} use relocation at
  run-time. This avoids many of the complications discussed below (image
  files are data relocatable without further ado), but costs performance
  (one addition per memory access).
  @cindex relocation at load-time
- By contrast, our loader performs relocation at image load time. The
+ By contrast, the Gforth loader performs relocation at image load time. The
- loader also has to replace tokens standing for primitive calls with the
+ loader also has to replace tokens that represent primitive calls with the
  appropriate code-field addresses (or code addresses in the case of
  direct threading).
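To make load-time relocation concrete, here is a minimal sketch in C. It is not the Gforth loader: the function name `relocate_image`, the per-cell flag array `is_addr`, and the assumption that the image was saved as if loaded at address 0 are all hypothetical, chosen only to illustrate "adding the same offset" to every cell that holds an address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Gforth): relocate an image in place.
   image   - cells of the loaded image, saved relative to address 0
   ncells  - number of cells in the image
   is_addr - one flag byte per cell; nonzero marks a cell holding an address
   base    - address at which the image was actually loaded */
void relocate_image(uintptr_t *image, size_t ncells,
                    const unsigned char *is_addr, uintptr_t base)
{
    assert(image != NULL && is_addr != NULL);
    for (size_t i = 0; i < ncells; i++) {
        if (is_addr[i])
            image[i] += base;   /* the same offset is added to every address cell */
    }
}
```

The work is done once, when the image is loaded, so later memory accesses need no extra addition; this is the trade-off against run-time relocation described above.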
- Line 8809  caused by the design of the image file l
+ Line 8993  caused by the design of the image file l
  @item
  There is only one segment; in particular, this means that an image file
  cannot represent @code{ALLOCATE}d memory chunks (and pointers to
- them). And the contents of the stacks are not represented, either.
+ them). The contents of the stacks are not represented, either.
  @item
  The only kinds of relocation supported are: adding the same offset to
- Line 8855  a place where it is stored in a non-mang
+ Line 9039  a place where it is stored in a non-mang
  @node  Non-Relocatable Image Files, Data-Relocatable Image Files, Image File Background, Image Files
  @section Non-Relocatable Image Files
  @cindex non-relocatable image files
- @cindex image files, non-relocatable
+ @cindex image file, non-relocatable
  These files are simple memory dumps of the dictionary. They are specific
  to the executable (i.e., @file{gforth} file) they were created
- Line 8873  doc-savesystem
+ Line 9057  doc-savesystem
  @node Data-Relocatable Image Files, Fully Relocatable Image Files, Non-Relocatable Image Files, Image Files
  @section Data-Relocatable Image Files
  @cindex data-relocatable image files
- @cindex image files, data-relocatable
+ @cindex image file, data-relocatable
  These files contain relocatable data addresses, but fixed code addresses
  (instead of tokens). They are specific to the executable (i.e.,
- Line 8886  Relocatable Image Files}).
+ Line 9070  Relocatable Image Files}).
  @node Fully Relocatable Image Files, Stack and Dictionary Sizes, Data-Relocatable Image Files, Image Files
  @section Fully Relocatable Image Files
  @cindex fully relocatable image files
- @cindex image files, fully relocatable
+ @cindex image file, fully relocatable
  @cindex @file{kern*.fi}, relocatability
  @cindex @file{gforth.fi}, relocatability
- Line 9021  gforth -i @var{image}
+ Line 9205  gforth -i @var{image}
  @end example
  @cindex executable image file
- @cindex image files, executable
+ @cindex image file, executable
  If your operating system supports starting scripts with a line of the
  form @code{#! ...}, you just have to type the image file name to start
  Gforth with this image file (note that the file extension @code{.fi} is
  just a convention). I.e., to run Gforth with the image file @var{image},
  you can just type @var{image} instead of @code{gforth -i @var{image}}.
+ For example, if you place this text in a file:
+ @example
+ #! /usr/local/bin/gforth
+ ." Hello, world" CR
+ bye
+ @end example
+ @noindent
+ and then make the file executable (@code{chmod +x} in Unix), you can run it
+ directly from the command line. The sequence @code{#!} is used in two
+ ways: first, the operating system recognises it as a ``magic sequence'';
+ second, Gforth treats it as a comment character. Because of the second
+ usage, a space is required between @code{#!} and the path to the
+ executable.
+ @comment TODO describe the #! magic with reference to the Power Tools book.
  doc-#!
  @node Modifying the Startup Sequence,  , Running Image Files, Image Files
- Line 9037  doc-#!
+ Line 9240  doc-#!
  @cindex initialization sequence of image file
  You can add your own initialization to the startup sequence through the
- deferred word
+ deferred word @code{'cold}. @code{'cold} is invoked just before the
+ image-specific command line processing (by default, loading files and
- doc-'cold
+ evaluating (@code{-e}) strings) starts.
- @code{'cold} is invoked just before the image-specific command line
- processing (by default, loading files and evaluating (@code{-e}) strings)
- starts.
  A sequence for adding your initialization usually looks like this:
- Line 9055  A sequence for adding your initializatio
+ Line 9254  A sequence for adding your initializatio
  @end example
  @cindex turnkey image files
- @cindex image files, turnkey applications
+ @cindex image file, turnkey applications
  You can make a turnkey image by letting @code{'cold} execute a word
  (your turnkey application) that never returns; instead, it exits Gforth
  via @code{bye} or @code{throw}.
- Line 9063  via @code{bye} or @code{throw}.
+ Line 9262  via @code{bye} or @code{throw}.
  @cindex command-line arguments, access
  @cindex arguments on the command line, access
  You can access the (image-specific) command-line arguments through the
- variables @code{argc} and @code{argv}. @code{arg} provides conventient
+ variables @code{argc} and @code{argv}. @code{arg} provides convenient
  access to @code{argv}.
+ If @code{'cold} exits normally, Gforth processes the command-line
+ arguments as files to be loaded and strings to be evaluated.  Therefore,
+ @code{'cold} should remove the arguments it has used in this case.
+ doc-'cold
  doc-argc
  doc-argv
  doc-arg
- If @code{'cold} exits normally, Gforth processes the command-line
- arguments as files to be loaded and strings to be evaluated.  Therefore,
- @code{'cold} should remove the arguments it has used in this case.
  @c ******************************************************************
  @node Engine, Binding to System Library, Image Files, Top
- Line 9080  arguments as files to be loaded and stri
+ Line 9281  arguments as files to be loaded and stri
  @cindex engine
  @cindex virtual machine
- Reading this section is not necessary for programming with Gforth. It
+ Reading this chapter is not necessary for programming with Gforth. It
  may be helpful for finding your way in the Gforth sources.
  The ideas in this chapter have also been published in the papers
- Line 9100  Ertl, presented at EuroForth '93; the la
+ Line 9301  Ertl, presented at EuroForth '93; the la
  @section Portability
  @cindex engine portability
- One of the main goals of the effort is availability across a wide range
+ An important goal of the Gforth Project is availability across a wide
- of personal machines. fig-Forth, and, to a lesser extent, F83, achieved
+ range of personal machines. fig-Forth, and, to a lesser extent, F83,
- this goal by manually coding the engine in assembly language for several
+ achieved this goal by manually coding the engine in assembly language
- then-popular processors. This approach is very labor-intensive and the
+ for several then-popular processors. This approach is very
- results are short-lived due to progress in computer architecture.
+ labor-intensive and the results are short-lived due to progress in
+ computer architecture.
  @cindex C, using C for the engine
  Others have avoided this problem by coding in C, e.g., Mitch Bradley
- Line 9169  makes it possible to take the address of
+ Line 9371  makes it possible to take the address of
  @code{goto *@var{address}}. I.e., @code{goto *&&x} is the same as
  @code{goto x}.
- @cindex NEXT, indirect threaded
+ @cindex @code{NEXT}, indirect threaded
  @cindex indirect threaded inner interpreter
  @cindex inner interpreter, indirect threaded
- With this feature an indirect threaded NEXT looks like:
+ With this feature an indirect threaded @code{NEXT} looks like:
  @example
  cfa = *ip++;
  ca = *cfa;
- Line 9186  executed; The @code{ca} (code address) f
+ Line 9388  executed; The @code{ca} (code address) f
  executable code, e.g., a primitive or the colon definition handler
  @code{docol}.
- @cindex NEXT, direct threaded
+ @cindex @code{NEXT}, direct threaded
  @cindex direct threaded inner interpreter
  @cindex inner interpreter, direct threaded
  Direct threading is even simpler:
- Line 9196  goto *ca;
+ Line 9398  goto *ca;
  @end example
  Of course we have packaged the whole thing neatly in macros called
- @code{NEXT} and @code{NEXT1} (the part of NEXT after fetching the cfa).
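The indirect threaded dispatch above can be tried out in a small self-contained function. This is only a sketch under stated assumptions, not the Gforth engine: the function `run_add`, the three toy primitives, and the layout of the threaded code (inline literals stored as pointers to `long`) are hypothetical. It relies on the GNU C labels-as-values extension (`&&label` and `goto *address`), so it needs gcc or a compatible compiler.

```c
#include <assert.h>
#include <stddef.h>

typedef void *Label;    /* a code address, obtained with GNU C's && */

/* Hypothetical demo (not Gforth's engine): execute the threaded code
   "lit a  lit b  +  halt" and return the top of stack. */
long run_add(long a, long b)
{
    /* Code fields of three toy primitives; each holds a code address. */
    Label cf_lit = &&do_lit, cf_add = &&do_add, cf_halt = &&do_halt;
    /* Threaded code: cfas, with each literal following its "lit". */
    void *code[] = { &cf_lit, &a, &cf_lit, &b, &cf_add, &cf_halt };
    void **ip = code;       /* instruction pointer */
    long stack[8];
    long *sp = stack + 8;   /* stack grows downwards; sp[0] is the TOS */
    Label *cfa;             /* code-field address of the current word */
    Label ca;               /* code address fetched from the code field */

/* Indirect threaded NEXT: fetch the cfa, fetch the code address, jump. */
#define NEXT do { cfa = *ip++; ca = *cfa; goto *ca; } while (0)

    NEXT;
do_lit:                     /* push the literal that follows in the code */
    *--sp = *(long *)*ip++;
    NEXT;
do_add:                     /* same shape as the + primitive shown later */
    sp[1] = sp[0] + sp[1];
    sp++;
    NEXT;
do_halt:
    return sp[0];
#undef NEXT
}
```

Calling `run_add(2, 3)` walks the six-cell threaded code once and returns the sum. A direct threaded variant would store the code addresses themselves in `code[]` and drop the `ca = *cfa` step.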
+ @code{NEXT} and @code{NEXT1} (the part of @code{NEXT} after fetching the cfa).
  @menu
  * Scheduling::
- Line 9221  sp++;
+ Line 9423  sp++;
  sp[0]=n;
  NEXT;
  @end example
- the NEXT comes strictly after the other code, i.e., there is nearly no
+ the @code{NEXT} comes strictly after the other code, i.e., there is nearly no
  scheduling. After a little thought the problem becomes clear: The
  compiler cannot know that @code{sp} and @code{ip} point to different
  addresses (and the version of @code{gcc} we used would not know it even
- Line 9229  if it was possible), so it could not mov
+ Line 9431  if it was possible), so it could not mov
  store to the TOS. Indeed the pointers could be the same, if code on or
  very near the top of stack were executed. In the interest of speed we
  chose to forbid this probably unused ``feature'' and helped the compiler
- in scheduling: NEXT is divided into the loading part (@code{NEXT_P1})
+ in scheduling: @code{NEXT} is divided into the loading part (@code{NEXT_P1})
  and the goto part (@code{NEXT_P2}). @code{+} now looks like:
  @example
  n=sp[0]+sp[1];
- Line 9274  supported on all machines.
+ Line 9476  supported on all machines.
  @subsection DOES>
  @cindex @code{DOES>} implementation
- @cindex dodoes routine
+ @cindex @code{dodoes} routine
- @cindex DOES-code
+ @cindex @code{DOES>}-code
  One of the most complex parts of a Forth engine is @code{dodoes}, i.e.,
  the chunk of code executed by every word defined by a
  @code{CREATE}...@code{DOES>} pair. The main problem here is: How to find
  the Forth code to be executed, i.e. the code after the
- @code{DOES>} (the DOES-code)? There are two solutions:
+ @code{DOES>} (the @code{DOES>}-code)? There are two solutions:
  In fig-Forth the code field points directly to the @code{dodoes} and the
- DOES-code address is stored in the cell after the code address (i.e. at
+ @code{DOES>}-code address is stored in the cell after the code address (i.e. at
- @code{@var{cfa} cell+}). It may seem that this solution is illegal in
+ @code{@var{CFA} cell+}). It may seem that this solution is illegal in
  the Forth-79 and all later standards, because in fig-Forth this address
  lies in the body (which is illegal in these standards). However, by
  making the code field larger for all words this solution becomes legal
- Line 9296  to avoid having different image files fo
+ Line 9498  to avoid having different image files fo
  systems (direct threaded systems require two-cell code fields on many
  machines).
- @cindex DOES-handler
+ @cindex @code{DOES>}-handler
  The other approach is that the code field points or jumps to the cell
- after @code{DOES}. In this variant there is a jump to @code{dodoes} at
+ after @code{DOES>}. In this variant there is a jump to @code{dodoes} at
- this address (the DOES-handler). @code{dodoes} can then get the
+ this address (the @code{DOES>}-handler). @code{dodoes} can then get the
- DOES-code address by computing the code address, i.e., the address of
+ @code{DOES>}-code address by computing the code address, i.e., the address of
  the jump to @code{dodoes}, and adding the length of that jump field. A variant of
  this is to have a call to @code{dodoes} after the @code{DOES>}; then the
  return address (which can be found in the return register on RISCs) is
- the DOES-code address. Since the two cells available in the code field
+ the @code{DOES>}-code address. Since the two cells available in the code field
  are used up by the jump to the code address in direct threading on many
  architectures, we use this approach for direct threading on these
  architectures. We did not want to add another cell to the code field.
- Line 9388  well and produces optimal code for @code
+ Line 9590  well and produces optimal code for @code
  HP RISC machines: Defining the @code{n}s does not produce any code, and
  using them as intermediate storage also adds no cost.
- There are also other optimizations, that are not illustrated by this
+ There are also other optimizations that are not illustrated by this
- example: Assignments between simple variables are usually for free (copy
+ example: assignments between simple variables are usually for free (copy
  propagation). If one of the stack items is not used by the primitive
  (e.g.  in @code{drop}), the compiler eliminates the load from the stack
  (dead code elimination). On the other hand, there are some things that
- Line 9400  a stack item to the place where it just
+ Line 9602  a stack item to the place where it just
  While programming a primitive is usually easy, there are a few cases
  where the programmer has to take the actions of the generator into
  account, most notably @code{?dup}, but also words that do not (always)
- fall through to NEXT.
+ fall through to @code{NEXT}.
  @node TOS Optimization, Produced code, Automatic Generation, Primitives
  @subsection TOS Optimization
- Line 9530  matmul    1.00  1.47  1.35   1.46  0.74
+ Line 9732  matmul    1.00  1.47  1.35   1.46  0.74
  fib       1.00  1.52  1.34   1.22  0.86  1.74  2.99  4.30
  @end example
- You may find the good performance of Gforth compared with the systems
+ You may be quite surprised by the good performance of Gforth when
- written in assembly language quite surprising. One important reason for
+ compared with systems written in assembly language. One important reason
- the disappointing performance of these systems is probably that they are
+ for the disappointing performance of these other systems is probably
- not written optimally for the 486 (e.g., they use the @code{lods}
+ that they are not written optimally for the 486 (e.g., they use the
- instruction). In addition, Win32Forth uses a comfortable, but costly
+ @code{lods} instruction). In addition, Win32Forth uses a comfortable,
- method for relocating the Forth image: like @code{cforth}, it computes
+ but costly method for relocating the Forth image: like @code{cforth}, it
- the actual addresses at run time, resulting in two address computations
+ computes the actual addresses at run time, resulting in two address
- per NEXT (@pxref{Image File Background}).
+ computations per @code{NEXT} (@pxref{Image File Background}).
- Only Eforth with the peephole optimizer performs comparable to
+ Only Eforth with the peephole optimizer has a performance that is
- Gforth. The speedups achieved with peephole optimization of threaded
+ comparable to Gforth. The speedups achieved with peephole optimization
- code are quite remarkable. Adding a peephole optimizer to Gforth should
+ of threaded code are quite remarkable. Adding a peephole optimizer to
- cause similar speedups.
+ Gforth should cause similar speedups.
  The speedup of Gforth over PFE, ThisForth and TILE can be easily
  explained with the self-imposed restriction of the latter systems to
- Line 9552  Vars, , Defining Global Register Variabl
+ Line 9754  Vars, , Defining Global Register Variabl
  Moreover, current C compilers have a hard time optimizing other aspects
  of the ThisForth and the TILE source.
- Note that the performance of Gforth on 386 architecture processors
+ The performance of Gforth on 386 architecture processors varies widely
- varies widely with the version of @code{gcc} used. E.g., @code{gcc-2.5.8}
+ with the version of @code{gcc} used. E.g., @code{gcc-2.5.8} failed to
- failed to allocate any of the virtual machine registers into real
+ allocate any of the virtual machine registers into real machine
- machine registers by itself and would not work correctly with explicit
+ registers by itself and would not work correctly with explicit register
- register declarations, giving a 1.3 times slower engine (on a 486DX2/66
+ declarations, giving a 1.3 times slower engine (on a 486DX2/66 running
- running the Sieve) than the one measured above.
+ the Sieve) than the one measured above.
- Note also that there have been several releases of Win32Forth since the
+ Note that there have been several releases of Win32Forth since the
- release presented here, so the results presented here may have little
+ release presented here, so the results presented above may have little
  predictive value for the performance of Win32Forth today.
  @cindex @file{Benchres}
- Line 9575  newer version of these measurements at
+ Line 9777  newer version of these measurements at
  @url{http://www.complang.tuwien.ac.at/forth/performance.html}. You can
  find numbers for Gforth on various machines in @file{Benchres}.
+ @c ******************************************************************
  @node Binding to System Library, Cross Compiler, Engine, Top
  @chapter Binding to System Library
- Line 9652  was developed across the Internet, and i
+ Line 9855  was developed across the Internet, and i
  physically for the first 4 years of development.
  @section Pedigree
- @cindex Pedigree of Gforth
+ @cindex pedigree of Gforth
  Gforth descends from bigFORTH (1993) and fig-Forth. Gforth and PFE (by
  Dirk Zoller) will cross-fertilize each other. Of course, a significant
- Line 9702  information about Forth there.
+ Line 9905  information about Forth there.
  @node Internet resources, Books, Forth-related information, Forth-related information
  @section Internet resources
- @cindex Internet resources
+ @cindex internet resources
  @cindex comp.lang.forth
  @cindex frequently asked questions
- Line 9738  Research (JFAR) and a searchable Forth b
+ Line 9941  Research (JFAR) and a searchable Forth b
  @node Books, The Forth Interest Group, Internet resources, Forth-related information
  @section Books
- @cindex Books
+ @cindex books on Forth
  As the Standard is relatively new, there are not many books out yet. It
  is not recommended to learn Forth by using Gforth and a book that is not
- Line 9749  should be ok, because ANS Forth is prima
+ Line 9952  should be ok, because ANS Forth is prima
  @cindex standard document for ANS Forth
  @cindex ANS Forth document
  The definitive reference if you want to write ANS Forth programs is, of
- course, the ANS Forth Standard. It is available in printed form from the
+ course, the ANS Forth document. It is available in printed form from the
  National Standards Institute Sales Department (Tel.: USA (212) 642-4900;
  Fax.: USA (212) 302-1286) as document @cite{X3.215-1994} for about
  $200. You can also get it from Global Engineering Documents (Tel.: USA
- Line 9763  format); this HTML version also includes
+ Line 9966  format); this HTML version also includes
  Interpretation (RFIs). Some pointers to these versions can be found
  through @*@url{http://www.complang.tuwien.ac.at/projects/forth.html}.
- @cindex introductory book
+ @cindex introductory book on Forth
- @cindex book, introductory
+ @cindex book on Forth, introductory
  @cindex Woehr, Jack: @cite{Forth: The New Model}
  @cindex @cite{Forth: The new model} (book)
  @cite{Forth: The New Model} by Jack Woehr (Prentice-Hall, 1993) is an
- Line 9795  hardly more useful than a pre-ANS book.
+ Line 9998  hardly more useful than a pre-ANS book.
  @cindex Forth interest group (FIG)
  The Forth Interest Group (FIG) is a world-wide, non-profit,
- member-supported organisation. It publishes a regular magazine and
+ member-supported organisation. It publishes a regular magazine,
- offers other benefits of membership. You can contact the FIG through
+ @var{FORTH Dimensions}, and offers other benefits of membership. You can
- their office email address: @email{office@@forth.org} or by visiting
+ contact the FIG through their office email address:
- their web site at @url{http://www.forth.org/}. This web site also
+ @email{office@@forth.org} or by visiting their web site at
- includes links to FIG chapters in other countries and American cities
+ @url{http://www.forth.org/}. This web site also includes links to FIG
+ chapters in other countries and American cities
  (@url{http://www.forth.org/chapters.html}).
  @node Conferences, , The Forth Interest Group, Forth-related information
- Line 9807  includes links to FIG chapters in other
+ Line 10011  includes links to FIG chapters in other
  @cindex Conferences
  There are several regular conferences related to Forth. They are all
- well-publicised in FIG magazine and on the comp.lang.forth news group:
+ well-publicised in @var{FORTH Dimensions} and on the comp.lang.forth
+ news group:
  @itemize @bullet
  @item
- Line 9824  EuroForth -- this European conference ta
+ Line 10029  EuroForth -- this European conference ta
  @node Word Index, Concept Index, Forth-related information, Top
  @unnumbered Word Index
- This index is as incomplete as the manual. Each word is listed with
+ This index is a list of Forth words that have ``glossary'' entries
- stack effect and wordset.
+ within this manual. Each word is listed with its stack effect and
+ wordset.
  @printindex fn
  @node Concept Index,  , Word Index, Top
  @unnumbered Concept and Word Index
- This index is as incomplete as the manual. Not all entries listed are
+ Not all entries listed in this index are present verbatim in the
- present verbatim in the text. Only the names are listed for the words
+ text. This index also duplicates, in abbreviated form, all of the words
- here.
+ listed in the Word Index (only the names are listed for the words here).
  @printindex cp
