gforth/doc/gforth.ds - diff

Diff for /gforth/doc/gforth.ds between versions 1.77 and 1.78

-version 1.77, 2000/08/21 20:08:02
+version 1.78, 2000/08/22 18:15:38
  Line 170  personal machines. This manual correspon
  * Name Index::                  Forth words, only names listed
  * Concept Index::               A menu covering many topics
- @detailmenu
+ @detailmenu --- The Detailed Node Listing ---
-  --- The Detailed Node Listing ---
  Gforth Environment
- Line 250  Forth Words
+ Line 249  Forth Words
  * Files::
  * Blocks::
  * Other I/O::
- * Programming Tools::
- * Assembler and Code Words::
- * Threading Words::
  * Locals::
  * Structures::
  * Object-oriented Forth::
+ * Programming Tools::
+ * Assembler and Code Words::
+ * Threading Words::
  * Passing Commands to the OS::
  * Keeping track of Time::
  * Miscellaneous Words::
- Line 357  Other I/O
+ Line 356  Other I/O
  * Displaying characters and strings::  Other stuff
  * Input::                       Input
- Programming Tools
- * Examining::
- * Forgetting words::
- * Debugging::                   Simple and quick.
- * Assertions::                  Making your programs self-checking.
- * Singlestep Debugger::         Executing your program word by word.
- Assembler and Code Words
- * Code and ;code::
- * Common Assembler::            Assembler Syntax
- * Common Disassembler::
- * 386 Assembler::               Deviations and special cases
- * Alpha Assembler::             Deviations and special cases
- * MIPS assembler::              Deviations and special cases
- * Other assemblers::            How to write them
  Locals
  * Gforth locals::
- Line 384  Gforth locals
+ Line 365  Gforth locals
  * Where are locals visible by name?::
  * How long do locals live?::
- * Programming Style::
+ * Locals programming style::
- * Implementation::
+ * Locals implementation::
  Structures
- Line 433  The @file{mini-oof.fs} model
+ Line 414  The @file{mini-oof.fs} model
  * Mini-OOF Example::
  * Mini-OOF Implementation::
+ Programming Tools
+ * Examining::
+ * Forgetting words::
+ * Debugging::                   Simple and quick.
+ * Assertions::                  Making your programs self-checking.
+ * Singlestep Debugger::         Executing your program word by word.
+ Assembler and Code Words
+ * Code and ;code::
+ * Common Assembler::            Assembler Syntax
+ * Common Disassembler::
+ * 386 Assembler::               Deviations and special cases
+ * Alpha Assembler::             Deviations and special cases
+ * MIPS assembler::              Deviations and special cases
+ * Other assemblers::            How to write them
  Tools
  * ANS Report::                  Report the words used, sorted by wordset.
- Line 4350  the exercises in a .fs file in the distr
+ Line 4349  the exercises in a .fs file in the distr
  * Files::
  * Blocks::
  * Other I/O::
- * Programming Tools::
- * Assembler and Code Words::
- * Threading Words::
  * Locals::
  * Structures::
  * Object-oriented Forth::
+ * Programming Tools::
+ * Assembler and Code Words::
+ * Threading Words::
  * Passing Commands to the OS::
  * Keeping track of Time::
  * Miscellaneous Words::
- Line 4936  doc-2rdrop
+ Line 4935  doc-2rdrop
  @node Locals stack, Stack pointer manipulation, Return stack, Stack Manipulation
  @subsection Locals stack
- Gforth uses an extra locals stack. It is described, along with the
+ Gforth uses an extra locals stack.  It is described, along with the
- reasons for its existence, in @ref{Implementation,Implementation of locals}.
+ reasons for its existence, in @ref{Locals implementation}.
  @node Stack pointer manipulation,  , Locals stack, Stack Manipulation
  @subsection Stack pointer manipulation
- Line 7714  Forth.
+ Line 7713  Forth.
  @comment TODO: locals section refers to here, saying that every word list (aka
  @comment vocabulary) has its own methods for searching etc. Need to document that.
+ @c anton: but better in a separate subsection on wordlist internals
  @comment TODO: document markers, reveal, tables, mappedwordlist
  Line 8377  doc-block-included
  @c -------------------------------------------------------------
- @node Other I/O, Programming Tools, Blocks, Words
+ @node Other I/O, Locals, Blocks, Words
  @section Other I/O
  @cindex I/O - keyboard and display
  Line 8715  doc-expect
  doc-span
  @c -------------------------------------------------------------
- @node Programming Tools, Assembler and Code Words, Other I/O, Words
+ @node Locals, Structures, Other I/O, Words
- @section Programming Tools
+ @section Locals
- @cindex programming tools
+ @cindex locals
+ Local variables can make Forth programming more enjoyable and Forth
+ programs easier to read. Unfortunately, the locals of ANS Forth are
+ laden with restrictions. Therefore, we provide not only the ANS Forth
+ locals wordset, but also our own, more powerful locals wordset (we
+ implemented the ANS Forth locals wordset through our locals wordset).
+ The ideas in this section have also been published in M. Anton Ertl,
+ @cite{@uref{http://www.complang.tuwien.ac.at/papers/ertl94l.ps.gz,
+ Automatic Scoping of Local Variables}}, EuroForth '94.
  @menu
- * Examining::
+ * Gforth locals::
- * Forgetting words::
+ * ANS Forth locals::
- * Debugging::                   Simple and quick.
- * Assertions::                  Making your programs self-checking.
- * Singlestep Debugger::         Executing your program word by word.
  @end menu
- @node Examining, Forgetting words, Programming Tools, Programming Tools
+ @node Gforth locals, ANS Forth locals, Locals, Locals
- @subsection Examining data and code
+ @subsection Gforth locals
- @cindex examining data and code
+ @cindex Gforth locals
- @cindex data examination
+ @cindex locals, Gforth style
- @cindex code examination
- The following words inspect the stack non-destructively:
+ Locals can be defined with
- doc-.s
+ @example
- doc-f.s
+ @{ local1 local2 ... -- comment @}
+ @end example
+ or
+ @example
+ @{ local1 local2 ... @}
+ @end example
- There is a word @code{.r} but it does @i{not} display the return stack!
+ E.g.,
- It is used for formatted numeric output (@pxref{Simple numeric output}).
+ @example
+ : max @{ n1 n2 -- n3 @}
+  n1 n2 > if
+    n1
+  else
+    n2
+  endif ;
+ @end example
- doc-depth
+ The similarity of locals definitions with stack comments is intended. A
- doc-fdepth
+ locals definition often replaces the stack comment of a word. The order
- doc-clearstack
+ of the locals corresponds to the order in a stack comment and everything
+ after the @code{--} is really a comment.
- The following words inspect memory.
+ This similarity has one disadvantage: It is too easy to confuse locals
+ declarations with stack comments, causing bugs and making them hard to
+ find. However, this problem can be avoided by appropriate coding
+ conventions: Do not use both notations in the same program. If you do,
+ they should be distinguished using additional means, e.g. by position.
- doc-?
+ @cindex types of locals
- doc-dump
+ @cindex locals types
+ The name of the local may be preceded by a type specifier, e.g.,
+ @code{F:} for a floating point value:
- And finally, @code{see} allows you to inspect code:
+ @example
+ : CX* @{ F: Ar F: Ai F: Br F: Bi -- Cr Ci @}
+ \ complex multiplication
+  Ar Br f* Ai Bi f* f-
+  Ar Bi f* Ai Br f* f+ ;
+ @end example
- doc-see
+ @cindex flavours of locals
- doc-xt-see
+ @cindex locals flavours
+ @cindex value-flavoured locals
+ @cindex variable-flavoured locals
+ Gforth currently supports cells (@code{W:}, @code{W^}), doubles
+ (@code{D:}, @code{D^}), floats (@code{F:}, @code{F^}) and characters
+ (@code{C:}, @code{C^}) in two flavours: a value-flavoured local (defined
+ with @code{W:}, @code{D:} etc.) produces its value and can be changed
+ with @code{TO}. A variable-flavoured local (defined with @code{W^} etc.)
+ produces its address (which becomes invalid when the variable's scope is
+ left). E.g., the standard word @code{emit} can be defined in terms of
+ @code{type} like this:
- @node Forgetting words, Debugging, Examining, Programming Tools
+ @example
- @subsection Forgetting words
+ : emit @{ C^ char* -- @}
- @cindex words, forgetting
+     char* 1 type ;
- @cindex forgetting words
+ @end example
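+ As a minimal sketch of the variable flavour (the word name @code{bump}
+ is made up for illustration), the following definition takes a cell from
+ the stack, increments it in place through the local's address, and
+ returns the new value:
+ @example
+ : bump @{ W^ v -- n @}
+   1 v +!    \ v pushes the address of the local cell
+   v @@ ;     \ fetch the updated value
+ 5 bump .    \ prints 6
+ @end example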
- @c  anton: other, maybe better places for this subsection: Defining Words;
+ @cindex default type of locals
- @c  Dictionary allocation.  At least a reference should be there.
+ @cindex locals, default type
+ A local without type specifier is a @code{W:} local. Both flavours of
+ locals are initialized with values from the data or FP stack.
- Forth allows you to forget words (and everything that was allotted in the
+ Currently there is no way to define locals with user-defined data
- dictionary after them) in a LIFO manner.
+ structures, but we are working on it.
- doc-marker
+ Gforth allows defining locals everywhere in a colon definition. This
+ poses the following questions:
- The most common use of this feature is during program development: when
+ @menu
- you change a source file, forget all the words it defined and load it
+ * Where are locals visible by name?::
- again (since you also forget everything defined after the source file
+ * How long do locals live?::
- was loaded, you have to reload that, too).  Note that effects like
+ * Locals programming style::
- storing to variables and destroyed system words are not undone when you
+ * Locals implementation::
- forget words.  With a system like Gforth, that is fast enough at
+ @end menu
- starting up and compiling, I find it more convenient to exit and restart
- Gforth, as this gives me a clean slate.
- Here's an example of using @code{marker} at the start of a source file
+ @node Where are locals visible by name?, How long do locals live?, Gforth locals, Gforth locals
- that you are debugging; it ensures that you only ever have one copy of
+ @subsubsection Where are locals visible by name?
- the file's definitions compiled at any time:
+ @cindex locals visibility
+ @cindex visibility of locals
+ @cindex scope of locals
- @example
+ Basically, the answer is that locals are visible where you would expect
- [IFDEF] my-code
+ them in block-structured languages, and sometimes a little longer. If you
-     my-code
+ want to restrict the scope of a local, enclose its definition in
- [ENDIF]
+ @code{SCOPE}...@code{ENDSCOPE}.
- marker my-code
- init-included-files
- \ .. definitions start here
+ doc-scope
- \ .
+ doc-endscope
- \ .
- \ end
- @end example
- @node Debugging, Assertions, Forgetting words, Programming Tools
+ These words behave like control structure words, so you can use them
- @subsection Debugging
+ with @code{CS-PICK} and @code{CS-ROLL} to restrict the scope in
- @cindex debugging
+ arbitrary ways.
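+ A minimal sketch, assuming a hypothetical word @code{square-plus-one}:
+ the local @code{x} is visible only between @code{SCOPE} and
+ @code{ENDSCOPE}, while the result left on the data stack survives the
+ end of the scope.
+ @example
+ : square-plus-one ( n -- n' )
+   SCOPE
+     @{ x @}
+     x x *
+   ENDSCOPE
+   \ x can no longer be referenced by name here
+   1+ ;
+ @end example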
- Languages with a slow edit/compile/link/test development loop tend to
+ If you want a more exact answer to the visibility question, here's the
- require sophisticated tracing/stepping debuggers to facilitate debugging.
+ basic principle: A local is visible in all places that can only be
+ reached through the definition of the local@footnote{In compiler
+ construction terminology, all places dominated by the definition of the
+ local.}. In other words, it is not visible in places that can be reached
+ without going through the definition of the local. E.g., locals defined
+ in @code{IF}...@code{ENDIF} are visible until the @code{ENDIF}, locals
+ defined in @code{BEGIN}...@code{UNTIL} are visible after the
+ @code{UNTIL} (until, e.g., a subsequent @code{ENDSCOPE}).
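+ The @code{BEGIN}...@code{UNTIL} case can be sketched as follows (the
+ word @code{count-to-3} is hypothetical): @code{x} is defined inside the
+ loop, and every path to the code behind @code{UNTIL} passes through
+ that definition, so @code{x} may still be used there.
+ @example
+ : count-to-3 ( -- n )
+   0
+   BEGIN
+     1+ dup @{ x @}
+     x 3 >=
+   UNTIL
+   drop x ;  \ x holds the value from the last iteration
+ @end example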
- A much better (faster) way in fast-compiling languages is to add
+ The reasoning behind this solution is: We want to have the locals
- printing code at well-selected places, let the program run, look at
+ visible as long as it is meaningful. The user can always make the
- the output, see where things went wrong, add more printing code, etc.,
+ visibility shorter by using explicit scoping. In a place that can
- until the bug is found.
+ only be reached through the definition of a local, the meaning of a
+ local name is clear. In other places it is not: How is the local
+ initialized at the control flow path that does not contain the
+ definition? Which local is meant, if the same name is defined twice in
+ two independent control flow paths?
- The simple debugging aids provided in @file{debugs.fs}
+ This should be enough detail for nearly all users, so you can skip the
- are meant to support this style of debugging.
+ rest of this section. If you really must know all the gory details and
+ options, read on.
- The word @code{~~} prints debugging information (by default the source
+ In order to implement this rule, the compiler has to know which places
- location and the stack contents). It is easy to insert. If you use Emacs
+ are unreachable. It knows this automatically after @code{AHEAD},
- it is also easy to remove (@kbd{C-x ~} in the Emacs Forth mode to
+ @code{AGAIN}, @code{EXIT} and @code{LEAVE}; in other cases (e.g., after
- query-replace them with nothing). The deferred words
+ most @code{THROW}s), you can use the word @code{UNREACHABLE} to tell the
- @code{printdebugdata} and @code{printdebugline} control the output of
+ compiler that the control flow never reaches that place. If
- @code{~~}. The default source location output format works well with
+ @code{UNREACHABLE} is not used where it could be, the only consequence is
- Emacs' compilation mode, so you can step through the program at the
+ that the visibility of some locals is more limited than the rule above
- source level using @kbd{C-x `} (the advantage over a stepping debugger
+ says. If @code{UNREACHABLE} is used where it should not be (i.e., if you
- is that you can step in any direction and you know where the crash has
+ lie to the compiler), buggy code will be produced.
- happened or where the strange data has occurred).
- doc-~~
- doc-printdebugdata
- doc-printdebugline
- @node Assertions, Singlestep Debugger, Debugging, Programming Tools
+ doc-unreachable
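+ As a hedged sketch of the effect on visibility (the word
+ @code{succ-nonneg} is made up for illustration): because the
+ @code{THROW} branch is marked with @code{UNREACHABLE}, the code after
+ @code{THEN} can only be reached through the definition of @code{x} in
+ the @code{ELSE} branch, so @code{x} should remain visible there.
+ @example
+ : succ-nonneg ( n -- n+1 )
+   dup 0< IF
+     -24 throw UNREACHABLE  \ invalid numeric argument; never returns
+   ELSE
+     @{ x @}
+   THEN
+   x 1+ ;
+ @end example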
- @subsection Assertions
- @cindex assertions
- It is a good idea to make your programs self-checking, especially if you
- make an assumption that may become invalid during maintenance (for
- example, that a certain field of a data structure is never zero). Gforth
- supports @dfn{assertions} for this purpose. They are used like this:
+ Another problem with this rule is that at @code{BEGIN}, the compiler
+ does not know which locals will be visible on the incoming
+ back-edge. All problems discussed in the following are due to this
+ ignorance of the compiler (we discuss the problems using @code{BEGIN}
+ loops as examples; the discussion also applies to @code{?DO} and other
+ loops). Perhaps the most insidious example is:
  @example
- assert( @i{flag} )
+ AHEAD
+ BEGIN
+   x
+ [ 1 CS-ROLL ] THEN
+   @{ x @}
+   ...
+ UNTIL
  @end example
- The code between @code{assert(} and @code{)} should compute a flag, that
+ This should be legal according to the visibility rule. The use of
- should be true if everything is alright and false otherwise. It should
+ @code{x} can only be reached through the definition; but that appears
- not change anything else on the stack. The overall stack effect of the
+ textually below the use.
- assertion is @code{( -- )}. E.g.
+ From this example it is clear that the visibility rules cannot be fully
+ implemented without major headaches. Our implementation treats common
+ cases as advertised and the exceptions are treated in a safe way: The
+ compiler makes a reasonable guess about the locals visible after a
+ @code{BEGIN}; if it is too pessimistic, the
+ user will get a spurious error about the local not being defined; if the
+ compiler is too optimistic, it will notice this later and issue a
+ warning. In the case above the compiler would complain about @code{x}
+ being undefined at its use. You can see from the obscure examples in
+ this section that it takes quite unusual control structures to get the
+ compiler into trouble, and even then it will often do fine.
+ If the @code{BEGIN} is reachable from above, the most optimistic guess
+ is that all locals visible before the @code{BEGIN} will also be
+ visible after the @code{BEGIN}. This guess is valid for all loops that
+ are entered only through the @code{BEGIN}, in particular, for normal
+ @code{BEGIN}...@code{WHILE}...@code{REPEAT} and
+ @code{BEGIN}...@code{UNTIL} loops and it is implemented in our
+ compiler. When the branch to the @code{BEGIN} is finally generated by
+ @code{AGAIN} or @code{UNTIL}, the compiler checks the guess and
+ warns the user if it was too optimistic:
  @example
- assert( 1 1 + 2 = ) \ what we learn in school
+ IF
- assert( dup 0<> ) \ assert that the top of stack is not zero
+   @{ x @}
- assert( false ) \ this code should not be reached
+ BEGIN
+   \ x ?
+ [ 1 cs-roll ] THEN
+   ...
+ UNTIL
  @end example
- The need for assertions is different at different times. During
+ Here, @code{x} lives only until the @code{BEGIN}, but the compiler
- debugging, we want more checking, in production we sometimes care more
+ optimistically assumes that it lives until the @code{THEN}. It notices
- for speed. Therefore, assertions can be turned off, i.e., the assertion
+ this difference when it compiles the @code{UNTIL} and issues a
- becomes a comment. Depending on the importance of an assertion and the
+ warning. The user can avoid the warning, and make sure that @code{x}
- time it takes to check it, you may want to turn off some assertions and
+ is not used in the wrong area by using explicit scoping:
- keep others turned on. Gforth provides several levels of assertions for
+ @example
- this purpose:
+ IF
+   SCOPE
+   @{ x @}
+   ENDSCOPE
+ BEGIN
+ [ 1 cs-roll ] THEN
+   ...
+ UNTIL
+ @end example
+ Since the guess is optimistic, there will be no spurious error messages
+ about undefined locals.
- doc-assert0(
+ If the @code{BEGIN} is not reachable from above (e.g., after
- doc-assert1(
+ @code{AHEAD} or @code{EXIT}), the compiler cannot even make an
- doc-assert2(
+ optimistic guess, as the locals visible after the @code{BEGIN} may be
- doc-assert3(
+ defined later. Therefore, the compiler assumes that no locals are
- doc-assert(
+ visible after the @code{BEGIN}. However, the user can use
- doc-)
+ @code{ASSUME-LIVE} to make the compiler assume that the same locals are
+ visible at the @code{BEGIN} as at the point where the top control-flow stack
+ item was created.
- The variable @code{assert-level} specifies the highest assertions that
+ doc-assume-live
- are turned on. I.e., at the default @code{assert-level} of one,
- @code{assert0(} and @code{assert1(} assertions perform checking, while
- @code{assert2(} and @code{assert3(} assertions are treated as comments.
- The value of @code{assert-level} is evaluated at compile-time, not at
- run-time. Therefore you cannot turn assertions on or off at run-time;
- you have to set the @code{assert-level} appropriately before compiling a
- piece of code. You can compile different pieces of code at different
- @code{assert-level}s (e.g., a trusted library at level 1 and
- newly-written code at level 3).
+ @noindent
+ E.g.,
+ @example
+ @{ x @}
+ AHEAD
+ ASSUME-LIVE
+ BEGIN
+   x
+ [ 1 CS-ROLL ] THEN
+   ...
+ UNTIL
+ @end example
- doc-assert-level
+ Other cases where the locals are defined before the @code{BEGIN} can be
+ handled by inserting an appropriate @code{CS-ROLL} before the
+ @code{ASSUME-LIVE} (and changing the control-flow stack manipulation
+ behind the @code{ASSUME-LIVE}).
+ Cases where locals are defined after the @code{BEGIN} (but should be
+ visible immediately after the @code{BEGIN}) can only be handled by
+ rearranging the loop. E.g., the ``most insidious'' example above can be
+ arranged into:
+ @example
+ BEGIN
+   @{ x @}
+   ... 0=
+ WHILE
+   x
+ REPEAT
+ @end example
- If an assertion fails, a message compatible with Emacs' compilation mode
+ @node How long do locals live?, Locals programming style, Where are locals visible by name?, Gforth locals
- is produced and the execution is aborted (currently with @code{ABORT"}.
+ @subsubsection How long do locals live?
- If there is interest, we will introduce a special throw code. But if you
+ @cindex locals lifetime
- intend to @code{catch} a specific condition, using @code{throw} is
+ @cindex lifetime of locals
- probably more appropriate than an assertion).
- Definitions in ANS Forth for these assertion words are provided
+ The right answer for the lifetime question would be: A local lives at
- in @file{compat/assert.fs}.
+ least as long as it can be accessed. For a value-flavoured local this
+ means: until the end of its visibility. However, a variable-flavoured
+ local could be accessed through its address far beyond its visibility
+ scope. Ultimately, this would mean that such locals would have to be
+ garbage collected. Since this entails un-Forth-like implementation
+ complexities, I adopted the same cowardly solution as some other
+ languages (e.g., C): The local lives only as long as it is visible;
+ afterwards its address is invalid (and programs that access it
+ afterwards are erroneous).
+ @node Locals programming style, Locals implementation, How long do locals live?, Gforth locals
+ @subsubsection Locals programming style
+ @cindex locals programming style
+ @cindex programming style, locals
- @node Singlestep Debugger,  , Assertions, Programming Tools
+ The freedom to define locals anywhere has the potential to change
- @subsection Singlestep Debugger
+ programming styles dramatically. In particular, the need to use the
- @cindex singlestep Debugger
+ return stack for intermediate storage vanishes. Moreover, all stack
- @cindex debugging Singlestep
+ manipulations (except @code{PICK}s and @code{ROLL}s with run-time
+ determined arguments) can be eliminated: If the stack items are in the
+ wrong order, just write a locals definition for all of them; then
+ write the items in the order you want.
- When you create a new word there's often the need to check whether it
+ This seems a little far-fetched and eliminating stack manipulations is
- behaves correctly or not. You can do this by typing @code{dbg
+ unlikely to become a conscious programming objective. Still, the number
- badword}. A debug session might look like this:
+ of stack manipulations will be reduced dramatically if local variables
+ are used liberally (e.g., compare @code{max} (@pxref{Gforth locals}) with
+ a traditional implementation of @code{max}).
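+ For comparison, a traditional implementation without locals might look
+ like this (a sketch; the name @code{max-trad} is chosen here to avoid
+ redefining @code{max}):
+ @example
+ : max-trad ( n1 n2 -- n3 )
+   2dup < if swap then drop ;
+ @end example
+ The locals version states the comparison and both results by name; the
+ stack version needs @code{2dup}, @code{swap} and @code{drop} to keep
+ the operands in place.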
- @example
+ This shows one potential benefit of locals: making Forth programs more
- : badword 0 DO i . LOOP ;  ok
+ readable. Of course, this benefit will only be realized if the
-dbg badword
+ programmers continue to honour the principle of factoring instead of
- : badword
+ using the added latitude to make the words longer.
- Scanning code...
- Nesting debugger ready!
+ @cindex single-assignment style for locals
+ Using @code{TO} can and should be avoided.  Without @code{TO},
+ every value-flavoured local has only a single assignment and many
+ advantages of functional languages apply to Forth. I.e., programs are
+ easier to analyse, to optimize and to read: It is clear from the
+ definition what the local stands for, it does not turn into something
+ different later.
-D4738  8049BC4 0              -> [ 2 ] 00002 00000
+ E.g., a definition using @code{TO} might look like this:
-D4740  8049F68 DO             -> [ 0 ]
+ @example
-D4744  804A0C8 i              -> [ 1 ] 00000
+ : strcmp @{ addr1 u1 addr2 u2 -- n @}
-D4748 400C5E60 .              -> 0 [ 0 ]
+  u1 u2 min 0
-D474C  8049D0C LOOP           -> [ 0 ]
+  ?do
-D4744  804A0C8 i              -> [ 1 ] 00001
+    addr1 c@@ addr2 c@@ -
-D4748 400C5E60 .              -> 1 [ 0 ]
+    ?dup-if
-D474C  8049D0C LOOP           -> [ 0 ]
+      unloop exit
-D4758  804B384 ;              ->  ok
+    then
+    addr1 char+ TO addr1
+    addr2 char+ TO addr2
+  loop
+  u1 u2 - ;
  @end example
+ Here, @code{TO} is used to update @code{addr1} and @code{addr2} at
+ every loop iteration. @code{strcmp} is a typical example of the
+ readability problems of using @code{TO}. When you start reading
+ @code{strcmp}, you think that @code{addr1} refers to the start of the
+ string. Only near the end of the loop do you realize that it is something
+ else.
- Each line displayed is one step. You always have to hit return to
+ This can be avoided by defining two locals at the start of the loop that
- execute the next word that is displayed. If you don't want to execute
+ are initialized with the right value for the current iteration.
- the next word as a whole, you have to type @kbd{n} for @code{nest}. Here is
+ @example
- an overview of the available keys:
+ : strcmp @{ addr1 u1 addr2 u2 -- n @}
+  addr1 addr2
+  u1 u2 min 0
+  ?do @{ s1 s2 @}
+    s1 c@@ s2 c@@ -
+    ?dup-if
+      unloop exit
+    then
+    s1 char+ s2 char+
+  loop
+  2drop
+  u1 u2 - ;
+ @end example
+ Here it is clear from the start that @code{s1} has a different value
+ in every loop iteration.
- @table @i
+ @node Locals implementation,  , Locals programming style, Gforth locals
+ @subsubsection Locals implementation
+ @cindex locals implementation
+ @cindex implementation of locals
- @item @key{RET}
+ @cindex locals stack
- Next; Execute the next word.
+ Gforth uses an extra locals stack. The most compelling reason for
+ this is that the return stack is not float-aligned; using an extra stack
+ also eliminates the problems and restrictions of using the return stack
+ as locals stack. Like the other stacks, the locals stack grows toward
+ lower addresses. A few primitives allow an efficient implementation:
- @item n
- Nest; Single step through next word.
- @item u
+ doc-@local#
- Unnest; Stop debugging and execute rest of word. If we got to this word
+ doc-f@local#
- with nest, continue debugging with the calling word.
+ doc-laddr#
+ doc-lp+!#
+ doc-lp!
+ doc->l
+ doc-f>l
- @item d
- Done; Stop debugging and execute rest.
- @item s
+ In addition to these primitives, some specializations of these
- Stop; Abort immediately.
+ primitives for commonly occurring inline arguments are provided for
+ efficiency reasons, e.g., @code{@@local0} as specialization of
+ @code{@@local#} for the inline argument 0. The following compiling words
+ compile the right specialized version, or the general version, as
+ appropriate:
- @end table
- Debugging a large application with this mechanism is very difficult, because
+ doc-compile-@local
- you have to nest very deeply into the program before the interesting part
+ doc-compile-f@local
- begins. This takes a lot of time.
+ doc-compile-lp+!
- To do it more directly put a @code{BREAK:} command into your source code.
- When program execution reaches @code{BREAK:} the single step debugger is
- invoked and you have all the features described above.
- If you have more than one part to debug it is useful to know where the
+ Combinations of conditional branches and @code{lp+!#} like
- program has stopped at the moment. You can do this with the
+ @code{?branch-lp+!#} (the locals pointer is only changed if the branch
- @code{BREAK" string"} command. This behaves like @code{BREAK:} except that
+ is taken) are provided for efficiency and correctness in loops.
- string is typed out when the ``breakpoint'' is reached.
+ A special area in the dictionary space is reserved for keeping the
+ local variable names. @code{@{} switches the dictionary pointer to this
+ area and @code{@}} switches it back and generates the locals
+ initializing code. @code{W:} etc.@ are normal defining words. This
+ special area is cleared at the start of every colon definition.
- doc-dbg
+ @cindex word list for defining locals
- doc-break:
+ A special feature of Gforth's dictionary is used to implement the
- doc-break"
+ definition of locals without type specifiers: every word list (aka
+ vocabulary) has its own methods for searching
+ etc. (@pxref{Word Lists}). For the present purpose we defined a word list
+ with a special search method: When it is searched for a word, it
+ actually creates that word using @code{W:}. @code{@{} changes the search
+ order to first search the word list containing @code{@}}, @code{W:} etc.,
+ and then the word list for defining locals without type specifiers.
+ The lifetime rules support a stack discipline within a colon
+ definition: The lifetime of a local is either nested with other locals'
+ lifetimes or it does not overlap them.
+ At @code{BEGIN}, @code{IF}, and @code{AHEAD} no code for locals stack
+ pointer manipulation is generated. Between control structure words
+ locals definitions can push locals onto the locals stack. @code{AGAIN}
+ is the simplest of the other three control flow words. It has to
+ restore the locals stack depth of the corresponding @code{BEGIN}
+ before branching. The code looks like this:
+ @format
+ @code{lp+!#} current-locals-size @minus{} dest-locals-size
+ @code{branch} <begin>
+ @end format
- @c -------------------------------------------------------------
+ @code{UNTIL} is a little more complicated: If it branches back, it
- @node Assembler and Code Words, Threading Words, Programming Tools, Words
+ must adjust the stack just like @code{AGAIN}. But if it falls through,
- @section Assembler and Code Words
+ the locals stack must not be changed. The compiler generates the
- @cindex assembler
+ following code:
- @cindex code words
+ @format
+ @code{?branch-lp+!#} <begin> current-locals-size @minus{} dest-locals-size
+ @end format
+ The locals stack pointer is only adjusted if the branch is taken.
- @menu
+ @code{THEN} can produce somewhat inefficient code:
- * Code and ;code::
+ @format
- * Common Assembler::            Assembler Syntax
+ @code{lp+!#} current-locals-size @minus{} orig-locals-size
- * Common Disassembler::
+ <orig target>:
- * 386 Assembler::               Deviations and special cases
+ @code{lp+!#} orig-locals-size @minus{} new-locals-size
- * Alpha Assembler::             Deviations and special cases
+ @end format
- * MIPS assembler::              Deviations and special cases
+ The second @code{lp+!#} adjusts the locals stack pointer from the
- * Other assemblers::            How to write them
+ level at the @i{orig} point to the level after the @code{THEN}. The
- @end menu
+ first @code{lp+!#} adjusts the locals stack pointer from the current
+ level to the level at the orig point, so the complete effect is an
+ adjustment from the current level to the right level after the
+ @code{THEN}.
- @node Code and ;code, Common Assembler, Assembler and Code Words, Assembler and Code Words
+ @cindex locals information on the control-flow stack
- @subsection @code{Code} and @code{;code}
+ @cindex control-flow stack items, locals information
+ In a conventional Forth implementation a dest control-flow stack entry
+ is just the target address and an orig entry is just the address to be
+ patched. Our locals implementation adds a word list to every orig or dest
+ item. It is the list of locals visible (or assumed visible) at the point
+ described by the entry. Our implementation also adds a tag to identify
+ the kind of entry, in particular to differentiate between live and dead
+ (reachable and unreachable) orig entries.
- Gforth provides some words for defining primitives (words written in
+ A few unusual operations have to be performed on locals word lists:
- machine code), and for defining the machine-code equivalent of
- @code{DOES>}-based defining words. However, the machine-independent
- nature of Gforth poses a few problems: First of all, Gforth runs on
- several architectures, so it can provide no standard assembler. What's
- worse is that the register allocation not only depends on the processor,
- but also on the @code{gcc} version and options used.
- The words that Gforth offers encapsulate some system dependences (e.g.,
- the header structure), so a system-independent assembler may be used in
- Gforth. If you do not have an assembler, you can compile machine code
- directly with @code{,} and @code{c,}@footnote{This isn't portable,
- because these words emit stuff in @i{data} space; it works because
- Gforth has unified code/data spaces. Assembler isn't likely to be
- portable anyway.}.
+ doc-common-list
+ doc-sub-list?
+ doc-list-size
- doc-assembler
- doc-init-asm
- doc-code
- doc-end-code
- doc-;code
- doc-flush-icache
+ Several features of our locals word list implementation make these
+ operations easy to implement: The locals word lists are organised as
+ linked lists; the tails of these lists are shared, if the lists
+ contain some of the same locals; and the address of a name is greater
+ than the address of the names behind it in the list.
- If @code{flush-icache} does not work correctly, @code{code} words
+ Another important implementation detail is the variable
- etc. will not work (reliably), either.
+ @code{dead-code}. It is used by @code{BEGIN} and @code{THEN} to
+ determine if they can be reached directly or only through the branch
+ that they resolve. @code{dead-code} is set by @code{UNREACHABLE},
+ @code{AHEAD}, @code{EXIT} etc., and cleared at the start of a colon
+ definition, by @code{BEGIN} and usually by @code{THEN}.
- The typical usage of these @code{code} words can be shown most easily by
+ Counted loops are similar to other loops in most respects, but
- analogy to the equivalent high-level defining words:
+ @code{LEAVE} requires special attention: It performs basically the same
+ service as @code{AHEAD}, but it does not create a control-flow stack
+ entry. Therefore the information has to be stored elsewhere;
+ traditionally, the information was stored in the target fields of the
+ branches created by the @code{LEAVE}s, by organizing these fields into a
+ linked list. Unfortunately, this clever trick does not provide enough
+ space for storing our extended control-flow information. Therefore, we
+ introduce another stack, the leave stack. It contains the control-flow
+ stack entries for all unresolved @code{LEAVE}s.
+ Local names are kept until the end of the colon definition, even if
+ they are no longer visible in any control-flow path. In a few cases
+ this may lead to increased space needs for the locals name area, but
+ usually less than reclaiming this space would cost in code size.
+ @node ANS Forth locals,  , Gforth locals, Locals
+ @subsection ANS Forth locals
+ @cindex locals, ANS Forth style
+ The ANS Forth locals wordset does not define a syntax for locals, but
+ words that make it possible to define various syntaxes. One of the
+ possible syntaxes is a subset of the syntax we used in the Gforth locals
+ wordset, i.e.:
  @example
- : foo                              code foo
+ @{ local1 local2 ... -- comment @}
-    <high-level Forth words>              <assembler>
+ @end example
- ;                                  end-code
+ @noindent
+ or
- : bar                              : bar
+ @example
-    <high-level Forth words>           <high-level Forth words>
+ @{ local1 local2 ... @}
-    CREATE                             CREATE
-       <high-level Forth words>           <high-level Forth words>
-    DOES>                              ;code
-       <high-level Forth words>           <assembler>
- ;                                  end-code
  @end example
- @c anton: the following stuff is also in "Common Assembler", in less detail.
+ The order of the locals corresponds to the order in a stack comment. The
+ restrictions are:
- @cindex registers of the inner interpreter
+ @itemize @bullet
- In the assembly code you will want to refer to the inner interpreter's
+ @item
- registers (e.g., the data stack pointer) and you may want to use other
+ Locals can only be cell-sized values (no type specifiers are allowed).
- registers for temporary storage. Unfortunately, the register allocation
+ @item
- is installation-dependent.
+ Locals can be defined only outside control structures.
+ @item
+ Locals can interfere with explicit usage of the return stack. For the
+ exact (and long) rules, see the standard. If you don't use return stack
+ accessing words in a definition using locals, you will be all right. The
+ purpose of this rule is to make locals implementation on the return
+ stack easier.
+ @item
+ The whole definition must be in one line.
+ @end itemize
- In particular, @code{ip} (Forth instruction pointer) and @code{rp}
+ Locals defined in ANS Forth behave like @code{VALUE}s
- (return stack pointer) are in different places in @code{gforth} and
+ (@pxref{Values}). I.e., they are initialized from the stack. Using their
- @code{gforth-fast}.  This means that you cannot write a @code{NEXT}
+ name produces their value. Their value can be changed using @code{TO}.
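+ As a small illustration (the word @code{add-twice} is a made-up name,
+ not part of Gforth), the following definition declares two locals and
+ updates one of them with @code{TO}:
+ @example
+ : add-twice @{ a b -- n @}
+     a b + TO a    \ a now holds the sum of the initial a and b
+     a b + ;       \ n = initial a + 2*b
+ @end example
+ @noindent
+ With this definition, @code{3 4 add-twice} should leave 11 on the stack.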
- routine that works on both versions; so for doing @code{NEXT}, I
- recommend jumping to @code{' noop >code-address}, which contains nothing
- but a @code{NEXT}.
- For general accesses to the inner interpreter's registers, the easiest
+ Since the syntax above is supported by Gforth directly, you need not do
- solution is to use explicit register declarations (@pxref{Explicit Reg
+ anything to use it. If you want to port a program using this syntax to
- Vars, , Variables in Specified Registers, gcc.info, GNU C Manual}) for
+ another ANS Forth system, use @file{compat/anslocal.fs} to implement the
- all of the inner interpreter's registers: You have to compile Gforth
+ syntax on the other system.
- with @code{-DFORCE_REG} (configure option @code{--enable-force-reg}) and
- the appropriate declarations must be present in the @code{machine.h}
- file (see @code{mips.h} for an example; you can find a full list of all
- declarable register symbols with @code{grep register engine.c}). If you
- give explicit registers to all variables that are declared at the
- beginning of @code{engine()}, you should be able to use the other
- caller-saved registers for temporary storage. Alternatively, you can use
- the @code{gcc} option @code{-ffixed-REG} (@pxref{Code Gen Options, ,
- Options for Code Generation Conventions, gcc.info, GNU C Manual}) to
- reserve a register (however, this restriction on register allocation may
- slow Gforth significantly).
- If this solution is not viable (e.g., because @code{gcc} does not allow
+ Note that a syntax shown in the standard, section A.13 looks
- you to explicitly declare all the registers you need), you have to find
+ similar, but is quite different in having the order of locals
- out by looking at the code where the inner interpreter's registers
+ reversed. Beware!
- reside and which registers can be used for temporary storage. You can
- get an assembly listing of the engine's code with @code{make engine.s}.
- In any case, it is good practice to abstract your assembly code from the
+ The ANS Forth locals wordset itself consists of one word:
- actual register allocation. E.g., if the data stack pointer resides in
- register @code{$17}, create an alias for this register called @code{sp},
- and use that in your assembly code.
- @cindex code words, portable
+ doc-(local)
- Another option for implementing normal and defining words efficiently
- is to add the desired functionality to the source of Gforth. For normal
- words you just have to edit @file{primitives} (@pxref{Automatic
- Generation}). Defining words (equivalent to @code{;CODE} words, for fast
- defined words) may require changes in @file{engine.c}, @file{kernel.fs},
- @file{prims2x.fs}, and possibly @file{cross.fs}.
- @node Common Assembler, Common Disassembler, Code and ;code, Assembler and Code Words
+ The ANS Forth locals extension wordset defines a syntax using
- @subsection Common Assembler
+ @code{locals|}, but it is so awful that we strongly recommend against using
+ it. We have implemented this syntax to make porting to Gforth easy, but
+ do not document it here. The problem with this syntax is that the locals
+ are defined in an order reversed with respect to the standard stack
+ comment notation, making programs harder to read, and easier to misread
+ and miswrite. The only merit of this syntax is that it is easy to
+ implement using the ANS Forth locals wordset.
- The assemblers in Gforth generally use a postfix syntax, i.e., the
- instruction name follows the operands.
- The operands are passed in the usual order (the same that is used in the
+ @c ----------------------------------------------------------
- manual of the architecture).  Since they all are Forth words, they have
+ @node Structures, Object-oriented Forth, Locals, Words
- to be separated by spaces; you can also use Forth words to compute the
+ @section  Structures
- operands.
+ @cindex structures
+ @cindex records
- The instruction names usually end with a @code{,}.  This makes it easier
+ This section presents the structure package that comes with Gforth. A
- to visually separate instructions if you put several of them on one
+ version of the package implemented in ANS Forth is available in
- line; it also avoids shadowing other Forth words (e.g., @code{and}).
+ @file{compat/struct.fs}. This package was inspired by a posting on
+ comp.lang.forth in 1989 (unfortunately I don't remember by whom;
+ possibly John Hayes). A version of this section has been published in
+ M. Anton Ertl,
+ @uref{http://www.complang.tuwien.ac.at/forth/objects/structs.html, Yet
+ Another Forth Structures Package}, Forth Dimensions 19(3), pages
+ 13--16. Marcel Hendrix provided helpful comments.
- Registers are usually specified by number; e.g., (decimal) @code{11}
+ @menu
- specifies registers R11 and F11 on the Alpha architecture (which one,
+ * Why explicit structure support?::
- depends on the instruction).  The usual names are also available, e.g.,
+ * Structure Usage::
- @code{s2} for R11 on Alpha.
+ * Structure Naming Convention::
+ * Structure Implementation::
+ * Structure Glossary::
+ @end menu
- Control flow is specified similar to normal Forth code (@pxref{Arbitrary
+ @node Why explicit structure support?, Structure Usage, Structures, Structures
- control structures}), with @code{if,}, @code{ahead,}, @code{then,},
+ @subsection Why explicit structure support?
- @code{begin,}, @code{until,}, @code{again,}, @code{cs-roll},
- @code{cs-pick}, @code{else,}, @code{while,}, and @code{repeat,}.  The
- conditions are specified in a way specific to each assembler.
- Note that the register assignments of the Gforth engine can change
+ @cindex address arithmetic for structures
- between Gforth versions, or even between different compilations of the
+ @cindex structures using address arithmetic
- same Gforth version (e.g., if you use a different GCC version).  So if
+ If we want to use a structure containing several fields, we could simply
- you want to refer to Gforth's registers (e.g., the stack pointer or
+ reserve memory for it, and access the fields using address arithmetic
- TOS), I recommend defining your own words for referring to these
+ (@pxref{Address arithmetic}). As an example, consider a structure with
- registers, and using them later on; then you can easily adapt to a
+ the following fields
- changed register assignment.  The stability of the register assignment
- is usually better if you build Gforth with @code{--enable-force-reg}.
- In particular, the return stack pointer and the instruction pointer are
+ @table @code
- in memory in @code{gforth}, and usually in registers in
+ @item a
- @code{gforth-fast}.  The most common use of these registers is to
+ is a float
- dispatch to the next word (the @code{next} routine).  A portable way to
+ @item b
- do this is to jump to @code{' noop >code-address} (of course, this is
+ is a cell
- less efficient than integrating the @code{next} code and scheduling it
+ @item c
- well).
+ is a float
+ @end table
- @node  Common Disassembler, 386 Assembler, Common Assembler, Assembler and Code Words
+ Given the (float-aligned) base address of the structure we get the
- @subsection Common Disassembler
+ address of the field
- You can disassemble a @code{code} word with @code{see}
+ @table @code
- (@pxref{Debugging}).  You can disassemble a section of memory with
+ @item a
+ without doing anything further.
+ @item b
+ with @code{float+}
+ @item c
+ with @code{float+ cell+ faligned}
+ @end table
- doc-disasm
+ It is easy to see that this can become quite tiring.
- The disassembler generally produces output that can be fed into the
+ Moreover, it is not very readable, because seeing a
- assembler (i.e., same syntax, etc.).  It also includes additional
+ @code{cell+} tells us neither which kind of structure is
- information in comments.  In particular, the address of the instruction
+ accessed nor what field is accessed; we have to somehow infer the kind
- is given in a comment before the instruction.
+ of structure, and then look up in the documentation which field of
+ that structure corresponds to that offset.
- @code{See} may display more or less than the actual code of the word,
+ Finally, this kind of address arithmetic also causes maintenance
- because the recognition of the end of the code is unreliable.  You can
+ troubles: If you add or delete a field somewhere in the middle of the
- use @code{disasm} if it did not display enough.  It may display more, if
+ structure, you have to find and change all computations for the fields
- the code word is not immediately followed by a named word.  If you have
+ afterwards.
- something else there, you can follow the word with @code{align last @ ,}
- to ensure that the end is recognized.
- @node 386 Assembler, Alpha Assembler, Common Disassembler, Assembler and Code Words
+ So, instead of using @code{cell+} and friends directly, how
- @subsection 386 Assembler
+ about storing the offsets in constants:
- The 386 assembler included in Gforth was written by Bernd Paysan, it's
+ @example
- available under GPL, and originally part of bigFORTH.
+ 0 constant a-offset
+ 0 float+ constant b-offset
+ 0 float+ cell+ faligned constant c-offset
+ @end example
- The 386 disassembler included in Gforth was written by Andrew McKewan
+ Now we can get the address of field @code{x} with @code{x-offset
- and is in the public domain.
+ +}. This is much better in all respects. Of course, you still
+ have to change all later offset definitions if you add a field. You can
+ fix this by declaring the offsets in the following way:
- The disassembler displays code in prefix Intel syntax.
+ @example
+ 0 constant a-offset
+ a-offset float+ constant b-offset
+ b-offset cell+ faligned constant c-offset
+ @end example
- The assembler uses a postfix syntax with reversed parameters.
+ Since we always use the offsets with @code{+}, we could use a defining
+ word @code{cfield} that includes the @code{+} in the action of the
+ defined word:
- The assembler includes all instructions of the Athlon, i.e. 486 core
+ @example
- instructions, Pentium and PPro extensions, floating point, MMX, 3Dnow!,
+ : cfield ( n "name" -- )
- but not ISSE. It's an integrated 16- and 32-bit assembler. Default is 32
+     create ,
- bit, you can switch to 16 bit with .86 and back to 32 bit with .386.
+ does> ( name execution: addr1 -- addr2 )
+     @@ + ;
- There are several prefixes to switch between different operation sizes,
+ 0 cfield a
- @code{.b} for byte accesses, @code{.w} for word accesses, @code{.d} for
+ 0 a float+ cfield b
- double-word accesses. Addressing modes can be switched with @code{.wa}
+ 0 b cell+ faligned cfield c
- for 16 bit addresses, and @code{.da} for 32 bit addresses. You don't
+ @end example
- need a prefix for byte register names (@code{AL} et al).
- For floating point operations, the prefixes are @code{.fs} (IEEE
+ Instead of @code{x-offset +}, we now simply write @code{x}.
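+ For instance, assuming the @code{cfield} definitions above have been
+ loaded and that @i{addr} is the float-aligned base address of such a
+ structure, accessor words could be sketched like this (the names
+ @code{get-b} and @code{set-c} are invented for this example):
+ @example
+ : get-b ( addr -- n )              b @@ ;   \ fetch the cell field b
+ : set-c ( addr -- ) ( F: r -- )    c f! ;   \ store a float into field c
+ @end example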
- single), @code{.fl} (IEEE double), @code{.fx} (extended), @code{.fw}
- (word), @code{.fd} (double-word), and @code{.fq} (quad-word).
- The MMX opcodes don't have size prefixes, they are spelled out like in
+ The structure field words now can be used quite nicely. However,
- the Intel assembler. Instead of move from and to memory, there are
+ their definition is still a bit cumbersome: We have to repeat the
- PLDQ/PLDD and PSTQ/PSTD.
+ name, the information about size and alignment is distributed before
+ and after the field definitions etc.  The structure package presented
+ here addresses these problems.
- The registers lack the 'e' prefix; even in 32 bit mode, eax is called
+ @node Structure Usage, Structure Naming Convention, Why explicit structure support?, Structures
- ax.  Immediate values are indicated by postfixing them with @code{#},
+ @subsection Structure Usage
- e.g., @code{3 #}.  Here are some examples of addressing modes:
+ @cindex structure usage
+ @cindex @code{field} usage
+ @cindex @code{struct} usage
+ @cindex @code{end-struct} usage
+ You can define a structure for a (data-less) linked list with:
  @example
- 3 #          \ immediate
+ struct
- ax           \ register
+     cell% field list-next
- 100 di d)    \ 100[edi]
+ end-struct list%
- 4 bx cx di)  \ 4[ebx][ecx]
- di ax *4 i)  \ [edi][eax*4]
- 20 ax *4 i#) \ 20[eax*4]
  @end example
- Some examples of instructions are:
+ With the address of the list node on the stack, you can compute the
+ address of the field that contains the address of the next node with
+ @code{list-next}. E.g., you can determine the length of a list
+ with:
  @example
- ax bx mov             \ move ebx,eax
+ : list-length ( list -- n )
- 3 # ax mov            \ mov eax,3
+ \ "list" is a pointer to the first element of a linked list
- 100 di d) ax mov      \ mov eax,100[edi]
+ \ "n" is the length of the list
- 4 bx cx di) ax mov    \ mov eax,4[ebx][ecx]
+     0 BEGIN ( list1 n1 )
- .w ax bx mov          \ mov bx,ax
+         over
+     WHILE ( list1 n1 )
+         1+ swap list-next @@ swap
+     REPEAT
+     nip ;
  @end example
- The following forms are supported for binary instructions:
+ You can reserve memory for a list node in the dictionary with
+ @code{list% %allot}, which leaves the address of the list node on the
+ stack. For the equivalent allocation on the heap you can use @code{list%
+ %alloc} (or, for an @code{allocate}-like stack effect (i.e., with ior),
+ use @code{list% %allocate}). You can get the size of a list
+ node with @code{list% %size} and its alignment with @code{list%
+ %alignment}.
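+ As a sketch of how these words fit together (the names @code{node1}
+ and @code{node2} are only for illustration), a two-element list can be
+ built in the dictionary and measured with the @code{list-length}
+ defined above:
+ @example
+ list% %allot constant node2
+ 0 node2 list-next !        \ node2 is the last node
+ list% %allot constant node1
+ node2 node1 list-next !    \ node1 points to node2
+ node1 list-length .        \ prints 2
+ @end example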
+ Note that in ANS Forth the body of a @code{create}d word is
+ @code{aligned} but not necessarily @code{faligned};
+ therefore, if you do a:
  @example
- <reg> <reg> <inst>
+ create @emph{name} foo% %allot drop
- <n> # <reg> <inst>
- <mem> <reg> <inst>
- <reg> <mem> <inst>
  @end example
- Immediate to memory is not supported.  The shift/rotate syntax is:
+ @noindent
+ then the memory allotted for @code{foo%} is guaranteed to start at the
+ body of @code{@emph{name}} only if @code{foo%} contains only character,
+ cell and double fields.  Therefore, if your structure contains floats,
+ better use
  @example
- <reg/mem> 1 # shl \ shortens to shift without immediate
+ foo% %allot constant @emph{name}
- <reg/mem> 4 # shl
- <reg/mem> cl shl
  @end example
- Precede string instructions (@code{movs} etc.) with @code{.b} to get
+ @cindex structures containing structures
- the byte version.
+ You can include a structure @code{foo%} as a field of
+ another structure, like this:
- The control structure words @code{IF} @code{UNTIL} etc. must be preceded
- by one of these conditions: @code{vs vc u< u>= 0= 0<> u<= u> 0< 0>= ps
- pc < >= <= >}. (Note that most of these words shadow some Forth words
- when @code{assembler} is in front of @code{forth} in the search path,
- e.g., in @code{code} words).  Currently the control structure words use
- one stack item, so you have to use @code{roll} instead of @code{cs-roll}
- to shuffle them (you can also use @code{swap} etc.).
- Here is an example of a @code{code} word (assumes that the stack pointer
- is in esi and the TOS is in ebx):
  @example
- code my+ ( n1 n2 -- n )
+ struct
-     4 si D) bx add
+ ...
-     4 # si add
+     foo% field ...
-     Next
+ ...
- end-code
+ end-struct ...
  @end example
- @node Alpha Assembler, MIPS assembler, 386 Assembler, Assembler and Code Words
+ @cindex structure extension
- @subsection Alpha Assembler
+ @cindex extended records
+ Instead of starting with an empty structure, you can extend an
- The Alpha assembler and disassembler were originally written by Bernd
+ existing structure. E.g., a plain linked list without data, as defined
- Thallner.
+ above, is hardly useful; you can extend it to a linked list of integers,
+ like this:@footnote{This feature is also known as @emph{extended
- The register names @code{a0}--@code{a5} are not available to avoid
+ records}. It is the main innovation in the Oberon language; in other
- shadowing hex numbers.
+ words, adding this feature to Modula-2 led Wirth to create a new
+ language, write a new compiler etc.  Adding this feature to Forth just
+ required a few lines of code.}
- Immediate forms of arithmetic instructions are distinguished by a
+ @example
- @code{#} just before the @code{,}, e.g., @code{and#,} (note: @code{lda,}
+ list%
- does not count as arithmetic instruction).
+     cell% field intlist-int
+ end-struct intlist%
+ @end example
- You have to specify all operands to an instruction, even those that
+ @code{intlist%} is a structure with two fields:
- other assemblers consider optional, e.g., the destination register for
+ @code{list-next} and @code{intlist-int}.
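+ Both field words can be applied to the address of an
+ @code{intlist%} node. As a sketch that follows the pattern of
+ @code{list-length} (the word @code{intlist-sum} is a hypothetical
+ name), the integers in such a list could be added up like this:
+ @example
+ : intlist-sum ( list -- n )
+ \ "list" is a pointer to the first node of an integer list
+ \ "n" is the sum of the integers stored in the nodes
+     0 BEGIN ( list1 n1 )
+         over
+     WHILE ( list1 n1 )
+         over intlist-int @@ + swap list-next @@ swap
+     REPEAT
+     nip ;
+ @end example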
- @code{br,}, or the destination register and hint for @code{jmp,}.
- You can specify conditions for @code{if,} by removing the first @code{b}
+ @cindex structures containing arrays
- and the trailing @code{,} from a branch with a corresponding name; e.g.,
+ You can specify an array type containing @emph{n} elements of
+ type @code{foo%} like this:
  @example
- 11 fgt if, \ if F11>0e
+ foo% @emph{n} *
-   ...
- endif,
  @end example
- @code{fbgt,} gives @code{fgt}.
+ You can use this array type in any place where you can use a normal
+ type, e.g., when defining a @code{field}, or with
+ @code{%allot}.
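+ For example, a structure holding a count and room for 80 characters,
+ and a dictionary-allocated table of 10 cells, might be declared as
+ follows (the names @code{buffer-count}, @code{buffer-chars},
+ @code{buffer%} and @code{table} are made up for this sketch):
+ @example
+ struct
+     cell% field buffer-count
+     char% 80 * field buffer-chars
+ end-struct buffer%
+
+ cell% 10 * %allot constant table
+ @end example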
- @node MIPS assembler, Other assemblers, Alpha Assembler, Assembler and Code Words
+ @cindex first field optimization
- @subsection MIPS assembler
+ The first field is at the base address of a structure and the word for
+ this field (e.g., @code{list-next}) actually does not change the address
+ on the stack. You may be tempted to leave it out in the interest of
+ run-time and space efficiency. This is not necessary, because the
+ structure package optimizes this case: If you compile a first-field
+ word, no code is generated. So, in the interest of readability and
+ maintainability you should include the word for the field when accessing
+ the field.
- The MIPS assembler was originally written by Christian Pirker.
- Currently the assembler and disassembler only cover the MIPS-I
+ @node Structure Naming Convention, Structure Implementation, Structure Usage, Structures
- architecture (R3000), and don't support FP instructions.
+ @subsection Structure Naming Convention
+ @cindex structure naming convention
- The register names @code{$a0}--@code{$a3} are not available to avoid
+ The field names that come to (my) mind are often quite generic, and,
- shadowing hex numbers.
+ if used, would cause frequent name clashes. E.g., many structures
+ probably contain a @code{counter} field. The structure names
+ that come to (my) mind are often also the logical choice for the names
+ of words that create such a structure.
- Because there is no way to distinguish registers from immediate values,
+ Therefore, I have adopted the following naming conventions:
- you have to explicitly use the immediate forms of instructions, i.e.,
- @code{addiu,}, not just @code{addu,} (@command{as} does this
- implicitly).
- If the architecture manual specifies several formats for the instruction
+ @itemize @bullet
- (e.g., for @code{jalr,}), you usually have to use the one with more
+ @cindex field naming convention
- arguments (i.e., two for @code{jalr,}).  When in doubt, see
+ @item
- @code{arch/mips/testasm.fs} for an example of correct use.
+ The names of fields are of the form
+ @code{@emph{struct}-@emph{field}}, where
+ @code{@emph{struct}} is the basic name of the structure, and
+ @code{@emph{field}} is the basic name of the field. You can
+ think of field words as converting the (address of the)
+ structure into the (address of the) field.
- Branches and jumps in the MIPS architecture have a delay slot.  You have
+ @cindex structure naming convention
- to fill it yourself (the simplest way is to use @code{nop,}), the
+ @item
- assembler does not do it for you (unlike @command{as}).  Even
+ The names of structures are of the form
- @code{if,}, @code{ahead,}, @code{until,}, @code{again,}, @code{while,},
+ @code{@emph{struct}%}, where
- @code{else,} and @code{repeat,} need a delay slot.  Since @code{begin,}
+ @code{@emph{struct}} is the basic name of the structure.
- and @code{then,} just specify branch targets, they are not affected.
+ @end itemize
- Note that you must not put branches, jumps, or @code{li,} into the delay
+ This naming convention does not work that well for fields of extended
- slot: @code{li,} may expand to several instructions, and control flow
+ structures; e.g., the integer list structure has a field
- instructions may not be put into the branch delay slot in any case.
+ @code{intlist-int}, but has @code{list-next}, not
+ @code{intlist-next}.
- For branches the argument specifying the target is a relative address;
+ @node Structure Implementation, Structure Glossary, Structure Naming Convention, Structures
- You have to add the address of the delay slot to get the absolute
+ @subsection Structure Implementation
- address.
+ @cindex structure implementation
+ @cindex implementation of structures
- The MIPS architecture also has load delay slots and restrictions on
+ The central idea in the implementation is to pass the data about the
- using @code{mfhi,} and @code{mflo,}; you have to order the instructions
+ structure being built on the stack, not in some global
- yourself to satisfy these restrictions, the assembler does not do it for
+ variable. Everything else falls into place naturally once this design
- you.
+ decision is made.
- You can specify the conditions for @code{if,} etc. by taking a
+ The type description on the stack is of the form @emph{align
- conditional branch and leaving away the @code{b} at the start and the
+ size}. Keeping the size on the top-of-stack makes dealing with arrays
- @code{,} at the end.  E.g.,
+ very simple.
+ @code{field} is a defining word that uses @code{Create}
+ and @code{DOES>}. The body of the field contains the offset
+ of the field, and the normal @code{DOES>} action is simply:
  @example
- 4 5 eq if,
+ @@ +
-   ... \ do something if $4 equals $5
- then,
  @end example
- @node Other assemblers,  , MIPS assembler, Assembler and Code Words
+ @noindent
- @subsection Other assemblers
+ i.e., add the offset to the address, giving the stack effect
+ @i{addr1 -- addr2} for a field.
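+ A simplified defining word along these lines might look as follows.
+ This is only an illustrative sketch, not the code of the structure
+ package: @code{my-nalign} and @code{simple-field} are hypothetical
+ names, and alignments are assumed to be powers of two.
+ @example
+ : my-nalign ( u1 n -- u2 ) \ round u1 up to a multiple of n
+     1- tuck + swap invert and ;
+ : simple-field ( align1 offset1 align size "name" -- align2 offset2 )
+     create
+         >r tuck my-nalign  ( align1 align offset  R: size )
+         dup ,              \ store the aligned offset in the body
+         r> + >r            ( align1 align  R: offset2 )
+         max r>             ( align2 offset2 )
+     does> ( addr1 -- addr2 )
+         @@ + ;
+ @end example
+ @noindent
+ The type description stays on the stack between field definitions,
+ so the next field starts at the offset left by the previous one.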
- If you want to contribute another assembler/disassembler, please contact
+ @cindex first field optimization, implementation
- us (@email{bug-gforth@@gnu.org}) to check if we have such an assembler
+ This simple structure is slightly complicated by the optimization
- already.  If you are writing them from scratch, please use a similar
+ for fields with offset 0, which requires a different
- syntax style as the one we use (i.e., postfix, commas at the end of the
+ @code{DOES>}-part (because we cannot rely on there being
- instruction names, @pxref{Common Assembler}); make the output of the
+ something on the stack if such a field is invoked during
- disassembler be valid input for the assembler, and keep the style
+ compilation). Therefore, we put the different @code{DOES>}-parts
- similar to the style we used.
+ in separate words, and decide which one to invoke based on the
+ offset. For a zero offset, the field is basically a noop; it is
+ immediate, and therefore no code is generated when it is compiled.
- Hints on implementation: The most important part is to have a good test
+ @node Structure Glossary,  , Structure Implementation, Structures
- suite that contains all instructions.  Once you have that, the rest is
+ @subsection Structure Glossary
- easy.  For actual coding you can take a look at
+ @cindex structure glossary
- @file{arch/mips/disasm.fs} to get some ideas on how to use data for both
- the assembler and disassembler, avoiding redundancy and some potential
- bugs.  You can also look at that file (and @pxref{Advanced does> usage
- example}) to get ideas how to factor a disassembler.
- Start with the disassembler, because it's easier to reuse data from the
- disassembler for the assembler than the other way round.
- For the assembler, take a look at @file{arch/alpha/asm.fs}, which shows
+ doc-%align
- how simple it can be.
+ doc-%alignment
+ doc-%alloc
+ doc-%allocate
+ doc-%allot
+ doc-cell%
+ doc-char%
+ doc-dfloat%
+ doc-double%
+ doc-end-struct
+ doc-field
+ doc-float%
+ doc-naligned
+ doc-sfloat%
+ doc-%size
+ doc-struct
- @c -------------------------------------------------------------
- @node Threading Words, Locals, Assembler and Code Words, Words
- @section Threading Words
- @cindex threading words
- @cindex code address
+ @c -------------------------------------------------------------
- These words provide access to code addresses and other threading stuff
+ @node Object-oriented Forth, Programming Tools, Structures, Words
- in Gforth (and, possibly, other interpretive Forths). It more or less
+ @section Object-oriented Forth
- abstracts away the differences between direct and indirect threading
- (and, for direct threading, the machine dependences). However, at
- present this wordset is still incomplete. It is also pretty low-level;
- some day it will hopefully be made unnecessary by an internals wordset
- that abstracts implementation details away completely.
+ Gforth comes with three packages for object-oriented programming:
+ @file{objects.fs}, @file{oof.fs}, and @file{mini-oof.fs}; none of them
+ is preloaded, so you have to @code{include} them before use. The most
+ important differences between these packages (and others) are discussed
+ in @ref{Comparison with other object models}. All packages are written
+ in ANS Forth and can be used with any other ANS Forth.
- doc-threading-method
+ @menu
- doc->code-address
+ * Why object-oriented programming?::
- doc->does-code
+ * Object-Oriented Terminology::
- doc-code-address!
+ * Objects::
- doc-does-code!
+ * OOF::
- doc-does-handler!
+ * Mini-OOF::
- doc-/does-handler
+ * Comparison with other object models::
+ @end menu
+ @c ----------------------------------------------------------------
+ @node Why object-oriented programming?, Object-Oriented Terminology, Object-oriented Forth, Object-oriented Forth
+ @subsection Why object-oriented programming?
+ @cindex object-oriented programming motivation
+ @cindex motivation for object-oriented programming
- The code addresses produced by various defining words are produced by
+ Often we have to deal with several data structures (@emph{objects}),
- the following words:
+ that have to be treated similarly in some respects, but differently in
+ others. Graphical objects are the textbook example: circles, triangles,
+ dinosaurs, icons, and others, and we may want to add more during program
+ development. We want to apply some operations to any graphical object,
+ e.g., @code{draw} for displaying it on the screen. However, @code{draw}
+ has to do something different for every kind of object.
+ @comment TODO add some other operations eg perimeter, area
+ @comment and tie in to concrete examples later..
+ We could implement @code{draw} as a big @code{CASE}
+ control structure that executes the appropriate code depending on the
+ kind of object to be drawn. This would not be very elegant, and,
+ moreover, we would have to change @code{draw} every time we add
+ a new kind of graphical object (say, a spaceship).
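+ Such a @code{CASE}-based @code{draw} might be sketched as follows. The
+ tags @code{circle-kind} and @code{triangle-kind} are hypothetical, and
+ for brevity this version takes only a tag and prints a description
+ instead of drawing a real object:
+ @example
+ 0 constant circle-kind
+ 1 constant triangle-kind
+ : draw ( kind -- )
+     CASE
+         circle-kind   OF ." a circle"   ENDOF
+         triangle-kind OF ." a triangle" ENDOF
+         ." unknown kind of object"
+     ENDCASE ;
+ @end example
+ @noindent
+ Supporting a spaceship would mean adding another @code{OF} branch to
+ this definition.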
- doc-docol:
+ What we would rather do is: When defining spaceships, we would tell
- doc-docon:
+ the system: ``Here's how you @code{draw} a spaceship; you figure
- doc-dovar:
+ out the rest''.
- doc-douser:
- doc-dodefer:
- doc-dofield:
+ This is the problem solved by all systems that (rightfully) call
+ themselves object-oriented; the object-oriented packages presented here
+ solve this problem (and not much else).
+ @comment TODO ?list properties of oo systems.. oo vs o-based?
- You can recognize words defined by a @code{CREATE}...@code{DOES>} word
+ @c ------------------------------------------------------------------------
- with @code{>does-code}. If the word was defined in that way, the value
+ @node Object-Oriented Terminology, Objects, Why object-oriented programming?, Object-oriented Forth
- returned is non-zero and identifies the @code{DOES>} used by the
+ @subsection Object-Oriented Terminology
- defining word.
+ @cindex object-oriented terminology
- @comment TODO should that be ``identifies the xt of the DOES> ??''
+ @cindex terminology for object-oriented programming
- @c -------------------------------------------------------------
+ This section is mainly for reference, so you don't have to understand
- @node Locals, Structures, Threading Words, Words
+ all of it right away.  The terminology is mainly Smalltalk-inspired.  In
- @section Locals
+ short:
- @cindex locals
- Local variables can make Forth programming more enjoyable and Forth
+ @table @emph
- programs easier to read. Unfortunately, the locals of ANS Forth are
+ @cindex class
- laden with restrictions. Therefore, we provide not only the ANS Forth
+ @item class
- locals wordset, but also our own, more powerful locals wordset (we
+ a data structure definition with some extras.
- implemented the ANS Forth locals wordset through our locals wordset).
- The ideas in this section have also been published in M. Anton Ertl,
+ @cindex object
- @cite{@uref{http://www.complang.tuwien.ac.at/papers/ertl94l.ps.gz,
+ @item object
- Automatic Scoping of Local Variables}}, EuroForth '94.
+ an instance of the data structure described by the class definition.
- @menu
+ @cindex instance variables
- * Gforth locals::
+ @item instance variables
- * ANS Forth locals::
+ fields of the data structure.
- @end menu
- @node Gforth locals, ANS Forth locals, Locals, Locals
+ @cindex selector
- @subsection Gforth locals
+ @cindex method selector
- @cindex Gforth locals
+ @cindex virtual function
- @cindex locals, Gforth style
+ @item selector
+ (or @emph{method selector}) a word (e.g.,
+ @code{draw}) that performs an operation on a variety of data
+ structures (classes). A selector describes @emph{what} operation to
+ perform. In C++ terminology: a (pure) virtual function.
- Locals can be defined with
+ @cindex method
+ @item method
+ the concrete definition that performs the operation
+ described by the selector for a specific class. A method specifies
+ @emph{how} the operation is performed for a specific class.
- @example
+ @cindex selector invocation
- @{ local1 local2 ... -- comment @}
+ @cindex message send
- @end example
+ @cindex invoking a selector
- or
+ @item selector invocation
- @example
+ a call of a selector. One argument of the call (the TOS (top-of-stack))
- @{ local1 local2 ... @}
+ is used for determining which method is used. In Smalltalk terminology:
- @end example
+ a message (consisting of the selector and the other arguments) is sent
+ to the object.
- E.g.,
+ @cindex receiving object
- @example
+ @item receiving object
- : max @{ n1 n2 -- n3 @}
+ the object used for determining the method executed by a selector
-  n1 n2 > if
+ invocation. In the @file{objects.fs} model, it is the object that is on
-    n1
+ the TOS when the selector is invoked. (@emph{Receiving} comes from
-  else
+ the Smalltalk @emph{message} terminology.)
-    n2
-  endif ;
- @end example
- The similarity of locals definitions with stack comments is intended. A
+ @cindex child class
- locals definition often replaces the stack comment of a word. The order
+ @cindex parent class
- of the locals corresponds to the order in a stack comment and everything
+ @cindex inheritance
- after the @code{--} is really a comment.
+ @item child class
+ a class that has (@emph{inherits}) all properties (instance variables,
+ selectors, methods) from a @emph{parent class}. In Smalltalk
+ terminology: The subclass inherits from the superclass. In C++
+ terminology: The derived class inherits from the base class.
- This similarity has one disadvantage: It is too easy to confuse locals
+ @end table
- declarations with stack comments, causing bugs and making them hard to
- find. However, this problem can be avoided by appropriate coding
- conventions: Do not use both notations in the same program. If you do,
- they should be distinguished using additional means, e.g. by position.
- @cindex types of locals
+ @c If you wonder about the message sending terminology, it comes from
- @cindex locals types
+ @c a time when each object had its own task and objects communicated via
- The name of the local may be preceded by a type specifier, e.g.,
+ @c message passing; eventually the Smalltalk developers realized that
- @code{F:} for a floating point value:
+ @c they can do most things through simple (indirect) calls. They kept the
+ @c terminology.
- @example
+ @c --------------------------------------------------------------
- : CX* @{ F: Ar F: Ai F: Br F: Bi -- Cr Ci @}
+ @node Objects, OOF, Object-Oriented Terminology, Object-oriented Forth
- \ complex multiplication
+ @subsection The @file{objects.fs} model
-  Ar Br f* Ai Bi f* f-
+ @cindex objects
-  Ar Bi f* Ai Br f* f+ ;
+ @cindex object-oriented programming
- @end example
- @cindex flavours of locals
+ @cindex @file{objects.fs}
- @cindex locals flavours
+ @cindex @file{oof.fs}
- @cindex value-flavoured locals
- @cindex variable-flavoured locals
- Gforth currently supports cells (@code{W:}, @code{W^}), doubles
- (@code{D:}, @code{D^}), floats (@code{F:}, @code{F^}) and characters
- (@code{C:}, @code{C^}) in two flavours: a value-flavoured local (defined
- with @code{W:}, @code{D:} etc.) produces its value and can be changed
- with @code{TO}. A variable-flavoured local (defined with @code{W^} etc.)
- produces its address (which becomes invalid when the variable's scope is
- left). E.g., the standard word @code{emit} can be defined in terms of
- @code{type} like this:
- @example
+ This section describes the @file{objects.fs} package. This material also
- : emit @{ C^ char* -- @}
+ has been published in M. Anton Ertl,
-     char* 1 type ;
+ @cite{@uref{http://www.complang.tuwien.ac.at/forth/objects/objects.html,
- @end example
+ Yet Another Forth Objects Package}}, Forth Dimensions 19(2), pages
+--43.
+ @c McKewan's and Zsoter's packages
- @cindex default type of locals
+ This section assumes that you have read @ref{Structures}.
- @cindex locals, default type
- A local without type specifier is a @code{W:} local. Both flavours of
- locals are initialized with values from the data or FP stack.
- Currently there is no way to define locals with user-defined data
+ The techniques on which this model is based have been used to implement
- structures, but we are working on it.
+ the parser generator, Gray, and have also been used in Gforth for
+ implementing the various flavours of word lists (hashed or not,
+ case-sensitive or not, special-purpose word lists for locals etc.).
- Gforth allows defining locals everywhere in a colon definition. This
- poses the following questions:
  @menu
- * Where are locals visible by name?::
+ * Properties of the Objects model::
- * How long do locals live?::
+ * Basic Objects Usage::
- * Programming Style::
+ * The Objects base class::
- * Implementation::
+ * Creating objects::
+ * Object-Oriented Programming Style::
+ * Class Binding::
+ * Method conveniences::
+ * Classes and Scoping::
+ * Dividing classes::
+ * Object Interfaces::
+ * Objects Implementation::
+ * Objects Glossary::
  @end menu
- @node Where are locals visible by name?, How long do locals live?, Gforth locals, Gforth locals
+ Marcel Hendrix provided helpful comments on this section.
- @subsubsection Where are locals visible by name?
- @cindex locals visibility
- @cindex visibility of locals
- @cindex scope of locals
- Basically, the answer is that locals are visible where you would expect
+ @node Properties of the Objects model, Basic Objects Usage, Objects, Objects
- it in block-structured languages, and sometimes a little longer. If you
+ @subsubsection Properties of the @file{objects.fs} model
- want to restrict the scope of a local, enclose its definition in
+ @cindex @file{objects.fs} properties
- @code{SCOPE}...@code{ENDSCOPE}.
+ @itemize @bullet
+ @item
+ It is straightforward to pass objects on the stack. Passing
+ selectors on the stack is a little less convenient, but possible.
- doc-scope
+ @item
- doc-endscope
+ Objects are just data structures in memory, and are referenced by their
+ address. You can create words for objects with normal defining words
+ like @code{constant}. Likewise, there is no difference between instance
+ variables that contain objects and those that contain other data.
+ @item
+ Late binding is efficient and easy to use.
- These words behave like control structure words, so you can use them
+ @item
- with @code{CS-PICK} and @code{CS-ROLL} to restrict the scope in
+ It avoids parsing, and thus avoids problems with state-smartness
- arbitrary ways.
+ and reduced extensibility; for convenience there are a few parsing
+ words, but they have non-parsing counterparts. There are also a few
+ defining words that parse. This is hard to avoid, because all standard
+ defining words parse (except @code{:noname}); however, such
+ words are not as bad as many other parsing words, because they are not
+ state-smart.
- If you want a more exact answer to the visibility question, here's the
+ @item
- basic principle: A local is visible in all places that can only be
+ It does not try to incorporate everything. It does a few things and does
- reached through the definition of the local@footnote{In compiler
+ them well (IMO). In particular, this model was not designed to support
- construction terminology, all places dominated by the definition of the
+ information hiding (although it has features that may help); you can use
- local.}. In other words, it is not visible in places that can be reached
+ a separate package for achieving this.
- without going through the definition of the local. E.g., locals defined
- in @code{IF}...@code{ENDIF} are visible until the @code{ENDIF}, locals
- defined in @code{BEGIN}...@code{UNTIL} are visible after the
- @code{UNTIL} (until, e.g., a subsequent @code{ENDSCOPE}).
- The reasoning behind this solution is: We want to have the locals
+ @item
- visible as long as it is meaningful. The user can always make the
+ It is layered; you don't have to learn and use all features to use this
- visibility shorter by using explicit scoping. In a place that can
+ model. Only a few features are necessary (@pxref{Basic Objects Usage},
- only be reached through the definition of a local, the meaning of a
+ @pxref{The Objects base class}, @pxref{Creating objects}.), the others
- local name is clear. In other places it is not: How is the local
+ are optional and independent of each other.
- initialized at the control flow path that does not contain the
- definition? Which local is meant, if the same name is defined twice in
- two independent control flow paths?
- This should be enough detail for nearly all users, so you can skip the
+ @item
- rest of this section. If you really must know all the gory details and
+ An implementation in ANS Forth is available.
- options, read on.
- In order to implement this rule, the compiler has to know which places
+ @end itemize
- are unreachable. It knows this automatically after @code{AHEAD},
- @code{AGAIN}, @code{EXIT} and @code{LEAVE}; in other cases (e.g., after
- most @code{THROW}s), you can use the word @code{UNREACHABLE} to tell the
- compiler that the control flow never reaches that place. If
- @code{UNREACHABLE} is not used where it could, the only consequence is
- that the visibility of some locals is more limited than the rule above
- says. If @code{UNREACHABLE} is used where it should not (i.e., if you
- lie to the compiler), buggy code will be produced.
- doc-unreachable
+ @node Basic Objects Usage, The Objects base class, Properties of the Objects model, Objects
+ @subsubsection Basic @file{objects.fs} Usage
+ @cindex basic objects usage
+ @cindex objects, basic usage
+ You can define a class for graphical objects like this:
- Another problem with this rule is that at @code{BEGIN}, the compiler
+ @cindex @code{class} usage
- does not know which locals will be visible on the incoming
+ @cindex @code{end-class} usage
- back-edge. All problems discussed in the following are due to this
+ @cindex @code{selector} usage
- ignorance of the compiler (we discuss the problems using @code{BEGIN}
- loops as examples; the discussion also applies to @code{?DO} and other
- loops). Perhaps the most insidious example is:
  @example
- AHEAD
+ object class \ "object" is the parent class
- BEGIN
+   selector draw ( x y graphical -- )
-   x
+ end-class graphical
- [ 1 CS-ROLL ] THEN
-   @{ x @}
-   ...
- UNTIL
  @end example
- This should be legal according to the visibility rule. The use of
+ This code defines a class @code{graphical} with an
- @code{x} can only be reached through the definition; but that appears
+ operation @code{draw}.  We can perform the operation
- textually below the use.
+ @code{draw} on any @code{graphical} object, e.g.:
- From this example it is clear that the visibility rules cannot be fully
- implemented without major headaches. Our implementation treats common
- cases as advertised and the exceptions are treated in a safe way: The
- compiler makes a reasonable guess about the locals visible after a
- @code{BEGIN}; if it is too pessimistic, the
- user will get a spurious error about the local not being defined; if the
- compiler is too optimistic, it will notice this later and issue a
- warning. In the case above the compiler would complain about @code{x}
- being undefined at its use. You can see from the obscure examples in
- this section that it takes quite unusual control structures to get the
- compiler into trouble, and even then it will often do fine.
- If the @code{BEGIN} is reachable from above, the most optimistic guess
- is that all locals visible before the @code{BEGIN} will also be
- visible after the @code{BEGIN}. This guess is valid for all loops that
- are entered only through the @code{BEGIN}, in particular, for normal
- @code{BEGIN}...@code{WHILE}...@code{REPEAT} and
- @code{BEGIN}...@code{UNTIL} loops and it is implemented in our
- compiler. When the branch to the @code{BEGIN} is finally generated by
- @code{AGAIN} or @code{UNTIL}, the compiler checks the guess and
- warns the user if it was too optimistic:
  @example
- IF
+ 100 100 t-rex draw
-   @{ x @}
- BEGIN
-   \ x ?
- [ 1 cs-roll ] THEN
-   ...
- UNTIL
  @end example
- Here, @code{x} lives only until the @code{BEGIN}, but the compiler
+ @noindent
- optimistically assumes that it lives until the @code{THEN}. It notices
+ where @code{t-rex} is a word (say, a constant) that produces a
- this difference when it compiles the @code{UNTIL} and issues a
+ graphical object.
- warning. The user can avoid the warning, and make sure that @code{x}
- is not used in the wrong area by using explicit scoping:
- @example
- IF
-   SCOPE
-   @{ x @}
-   ENDSCOPE
- BEGIN
- [ 1 cs-roll ] THEN
-   ...
- UNTIL
- @end example
- Since the guess is optimistic, there will be no spurious error messages
+ @comment TODO add a 2nd operation eg perimeter.. and use for
- about undefined locals.
+ @comment a concrete example
- If the @code{BEGIN} is not reachable from above (e.g., after
+ @cindex abstract class
- @code{AHEAD} or @code{EXIT}), the compiler cannot even make an
+ How do we create a graphical object? With the present definitions,
- optimistic guess, as the locals visible after the @code{BEGIN} may be
+ we cannot create a useful graphical object. The class
- defined later. Therefore, the compiler assumes that no locals are
+ @code{graphical} describes graphical objects in general, but not
- visible after the @code{BEGIN}. However, the user can use
+ any concrete graphical object type (C++ users would call it an
- @code{ASSUME-LIVE} to make the compiler assume that the same locals are
+ @emph{abstract class}); e.g., there is no method for the selector
- visible at the BEGIN as at the point where the top control-flow stack
+ @code{draw} in the class @code{graphical}.
- item was created.
+ For concrete graphical objects, we define child classes of the
+ class @code{graphical}, e.g.:
- doc-assume-live
+ @cindex @code{overrides} usage
+ @cindex @code{field} usage in class definition
+ @example
+ graphical class \ "graphical" is the parent class
+   cell% field circle-radius
+ :noname ( x y circle -- )
+   circle-radius @@ draw-circle ;
+ overrides draw
- @noindent
+ :noname ( n-radius circle -- )
- E.g.,
+   circle-radius ! ;
- @example
+ overrides construct
- @{ x @}
- AHEAD
+ end-class circle
- ASSUME-LIVE
- BEGIN
-   x
- [ 1 CS-ROLL ] THEN
-   ...
- UNTIL
  @end example
- Other cases where the locals are defined before the @code{BEGIN} can be
+ Here we define a class @code{circle} as a child of @code{graphical},
- handled by inserting an appropriate @code{CS-ROLL} before the
+ with field @code{circle-radius} (which behaves just like a field
- @code{ASSUME-LIVE} (and changing the control-flow stack manipulation
+ (@pxref{Structures})); it defines (using @code{overrides}) new methods
- behind the @code{ASSUME-LIVE}).
+ for the selectors @code{draw} and @code{construct} (@code{construct} is
+ defined in @code{object}, the parent class of @code{graphical}).
- Cases where locals are defined after the @code{BEGIN} (but should be
+ Now we can create a circle on the heap (i.e.,
- visible immediately after the @code{BEGIN}) can only be handled by
+ @code{allocate}d memory) with:
- rearranging the loop. E.g., the ``most insidious'' example above can be
- arranged into:
+ @cindex @code{heap-new} usage
  @example
- BEGIN
+ 50 circle heap-new constant my-circle
-   @{ x @}
-   ... 0=
- WHILE
-   x
- REPEAT
  @end example
- @node How long do locals live?, Programming Style, Where are locals visible by name?, Gforth locals
+ @noindent
- @subsubsection How long do locals live?
+ @code{heap-new} invokes @code{construct}, thus
- @cindex locals lifetime
+ initializing the field @code{circle-radius} with 50. We can draw
- @cindex lifetime of locals
+ this new circle at (100,100) with:
- The right answer for the lifetime question would be: A local lives at
+ @example
- least as long as it can be accessed. For a value-flavoured local this
+ 100 100 my-circle draw
- means: until the end of its visibility. However, a variable-flavoured
+ @end example
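+ The fragments above can be collected into one file. The following is
+ only a sketch: @code{draw-circle} is not defined anywhere in this
+ section, so a hypothetical stand-in that just prints its arguments is
+ assumed here, and @code{require objects.fs} assumes that the file is
+ found on Gforth's search path.
+ @example
+ require objects.fs
+ \ hypothetical stand-in for a real drawing routine
+ : draw-circle ( x y n-radius -- )
+   ." circle r=" . ." at " swap . . cr ;
+ object class \ "object" is the parent class
+   selector draw ( x y graphical -- )
+ end-class graphical
+ graphical class \ "graphical" is the parent class
+   cell% field circle-radius
+ :noname ( x y circle -- )
+   circle-radius @@ draw-circle ;
+ overrides draw
+ :noname ( n-radius circle -- )
+   circle-radius ! ;
+ overrides construct
+ end-class circle
+ 50 circle heap-new constant my-circle \ construct stores the radius
+ 100 100 my-circle draw                \ late-bound call of circle's draw
+ @end example
+ Under these assumptions the last line should report a circle of radius
+ 50 at (100,100).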
- local could be accessed through its address far beyond its visibility
- scope. Ultimately, this would mean that such locals would have to be
- garbage collected. Since this entails un-Forth-like implementation
- complexities, I adopted the same cowardly solution as some other
- languages (e.g., C): The local lives only as long as it is visible;
- afterwards its address is invalid (and programs that access it
- afterwards are erroneous).
- @node Programming Style, Implementation, How long do locals live?, Gforth locals
+ @cindex selector invocation, restrictions
- @subsubsection Programming Style
+ @cindex class definition, restrictions
- @cindex locals programming style
+ Note: You can only invoke a selector if the object on the TOS
- @cindex programming style, locals
+ (the receiving object) belongs to the class where the selector was
+ defined or one of its descendants; e.g., you can invoke
+ @code{draw} only for objects belonging to @code{graphical}
+ or its descendants (e.g., @code{circle}).  Immediately before
+ @code{end-class}, the search order has to be the same as
+ immediately after @code{class}.
- The freedom to define locals anywhere has the potential to change
+ @node The Objects base class, Creating objects, Basic Objects Usage, Objects
- programming styles dramatically. In particular, the need to use the
+ @subsubsection The @file{objects.fs} base class
- return stack for intermediate storage vanishes. Moreover, all stack
+ @cindex @code{object} class
- manipulations (except @code{PICK}s and @code{ROLL}s with run-time
- determined arguments) can be eliminated: If the stack items are in the
- wrong order, just write a locals definition for all of them; then
- write the items in the order you want.
- This seems a little far-fetched and eliminating stack manipulations is
+ When you define a class, you have to specify a parent class.  So how do
- unlikely to become a conscious programming objective. Still, the number
+ you start defining classes? There is one class available from the start:
- of stack manipulations will be reduced dramatically if local variables
+ @code{object}. It is the ancestor of all classes and so is the
- are used liberally (e.g., compare @code{max} (@pxref{Gforth locals}) with
+ only class that has no parent. It has two selectors: @code{construct}
- a traditional implementation of @code{max}).
+ and @code{print}.
- This shows one potential benefit of locals: making Forth programs more
+ @node Creating objects, Object-Oriented Programming Style, The Objects base class, Objects
- readable. Of course, this benefit will only be realized if the
+ @subsubsection Creating objects
- programmers continue to honour the principle of factoring instead of
+ @cindex creating objects
- using the added latitude to make the words longer.
+ @cindex object creation
+ @cindex object allocation options
- @cindex single-assignment style for locals
+ @cindex @code{heap-new} discussion
- Using @code{TO} can and should be avoided.  Without @code{TO},
+ @cindex @code{dict-new} discussion
- every value-flavoured local has only a single assignment and many
+ @cindex @code{construct} discussion
- advantages of functional languages apply to Forth. I.e., programs are
+ You can create and initialize an object of a class on the heap with
- easier to analyse, to optimize and to read: It is clear from the
+ @code{heap-new} ( ... class -- object ) and in the dictionary
- definition what the local stands for, it does not turn into something
+ (allocation with @code{allot}) with @code{dict-new} (
- different later.
+ ... class -- object ). Both words invoke @code{construct}, which
+ consumes the stack items indicated by "..." above.
- E.g., a definition using @code{TO} might look like this:
+ @cindex @code{init-object} discussion
- @example
+ @cindex @code{class-inst-size} discussion
- : strcmp @{ addr1 u1 addr2 u2 -- n @}
+ If you want to allocate memory for an object yourself, you can get its
-  u1 u2 min 0
+ alignment and size with @code{class-inst-size 2@@} ( class --
-  ?do
+ align size ). Once you have memory for an object, you can initialize
-    addr1 c@@ addr2 c@@ -
+ it with @code{init-object} ( ... class object -- );
-    ?dup-if
+ @code{construct} does only a part of the necessary work.
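+ As an illustration, the three ways of creating an object might be used
+ as follows. This is a sketch: the class @code{counter}, its field and
+ the constant names are hypothetical, and the size is taken from
+ @code{class-inst-size 2@@} with the stack effect given above.
+ @example
+ require objects.fs
+ object class
+   cell% field counter-value
+ :noname ( n counter -- )
+   counter-value ! ;
+ overrides construct
+ end-class counter
+ 5 counter heap-new constant c-heap \ ALLOCATEd memory
+ 6 counter dict-new constant c-dict \ ALLOTted dictionary memory
+ \ memory provided by the program itself:
+ counter class-inst-size 2@@ nip     \ ( size ), alignment dropped
+ allocate throw constant c-own      \ cell-aligned, sufficient here
+ 7 counter c-own init-object        \ sets up the object, then constructs
+ @end example
+ In the last case the program is responsible for honouring the
+ alignment; the sketch relies on @code{allocate} returning cell-aligned
+ memory, which is enough for a class with only cell fields.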
-      unloop exit
-    then
-    addr1 char+ TO addr1
-    addr2 char+ TO addr2
-  loop
-  u1 u2 - ;
- @end example
- Here, @code{TO} is used to update @code{addr1} and @code{addr2} at
- every loop iteration. @code{strcmp} is a typical example of the
- readability problems of using @code{TO}. When you start reading
- @code{strcmp}, you think that @code{addr1} refers to the start of the
- string. Only near the end of the loop you realize that it is something
- else.
- This can be avoided by defining two locals at the start of the loop that
+ @node Object-Oriented Programming Style, Class Binding, Creating objects, Objects
- are initialized with the right value for the current iteration.
+ @subsubsection Object-Oriented Programming Style
- @example
+ @cindex object-oriented programming style
- : strcmp @{ addr1 u1 addr2 u2 -- n @}
+ @cindex programming style, object-oriented
-  addr1 addr2
-  u1 u2 min 0
-  ?do @{ s1 s2 @}
-    s1 c@@ s2 c@@ -
-    ?dup-if
-      unloop exit
-    then
-    s1 char+ s2 char+
-  loop
-drop
-  u1 u2 - ;
- @end example
- Here it is clear from the start that @code{s1} has a different value
- in every loop iteration.
- @node Implementation,  , Programming Style, Gforth locals
+ This section is not exhaustive.
- @subsubsection Implementation
- @cindex locals implementation
- @cindex implementation of locals
- @cindex locals stack
+ @cindex stack effects of selectors
- Gforth uses an extra locals stack. The most compelling reason for
+ @cindex selectors and stack effects
- this is that the return stack is not float-aligned; using an extra stack
+ In general, it is a good idea to ensure that all methods for the
- also eliminates the problems and restrictions of using the return stack
+ same selector have the same stack effect: when you invoke a selector,
- as locals stack. Like the other stacks, the locals stack grows toward
+ you often have no idea which method will be invoked, so, unless all
- lower addresses. A few primitives allow an efficient implementation:
+ methods have the same stack effect, you will not know the stack effect
+ of the selector invocation.
+ One exception to this rule is methods for the selector
+ @code{construct}. We know which method is invoked, because we
+ specify the class to be constructed at the same place. Actually, I
+ defined @code{construct} as a selector only to give the users a
+ convenient way to specify initialization. The way it is used, a
+ mechanism different from selector invocation would be more natural
+ (but probably would take more code and more space to explain).
- doc-@local#
+ @node Class Binding, Method conveniences, Object-Oriented Programming Style, Objects
- doc-f@local#
+ @subsubsection Class Binding
- doc-laddr#
+ @cindex class binding
- doc-lp+!#
+ @cindex early binding
- doc-lp!
- doc->l
- doc-f>l
+ @cindex late binding
+ Normal selector invocations determine the method at run-time depending
+ on the class of the receiving object. This run-time selection is called
+ @i{late binding}.
- In addition to these primitives, some specializations of these
+ Sometimes it's preferable to invoke a different method. For example,
- primitives for commonly occurring inline arguments are provided for
+ you might want to use the simple method for @code{print}ing
- efficiency reasons, e.g., @code{@@local0} as specialization of
+ @code{object}s instead of the possibly long-winded @code{print} method
- @code{@@local#} for the inline argument 0. The following compiling words
+ of the receiver class. You can achieve this by replacing the invocation
- compile the right specialized version, or the general version, as
+ of @code{print} with:
- appropriate:
+ @cindex @code{[bind]} usage
+ @example
+ [bind] object print
+ @end example
- doc-compile-@local
+ @noindent
- doc-compile-f@local
+ in compiled code or:
- doc-compile-lp+!
+ @cindex @code{bind} usage
+ @example
+ bind object print
+ @end example
- Combinations of conditional branches and @code{lp+!#} like
+ @cindex class binding, alternative to
- @code{?branch-lp+!#} (the locals pointer is only changed if the branch
+ @noindent
- is taken) are provided for efficiency and correctness in loops.
+ in interpreted code. Alternatively, you can define the method with a
+ name (e.g., @code{print-object}), and then invoke it through the
+ name. Class binding is just an (often more convenient) way to achieve
+ the same effect; it avoids name clutter and allows you to invoke
+ methods directly without naming them first.
- A special area in the dictionary space is reserved for keeping the
+ @cindex superclass binding
- local variable names. @code{@{} switches the dictionary pointer to this
+ @cindex parent class binding
- area and @code{@}} switches it back and generates the locals
+ A frequent use of class binding is this: When we define a method
- initializing code. @code{W:} etc.@ are normal defining words. This
+ for a selector, we often want the method to do what the selector does
- special area is cleared at the start of every colon definition.
+ in the parent class, and a little more. There is a special word for
+ this purpose: @code{[parent]}; @code{[parent]
+ @emph{selector}} is equivalent to @code{[bind] @emph{parent
+ selector}}, where @code{@emph{parent}} is the parent
+ class of the current class. E.g., a method definition might look like:
- @cindex word list for defining locals
+ @cindex @code{[parent]} usage
- A special feature of Gforth's dictionary is used to implement the
+ @example
- definition of locals without type specifiers: every word list (aka
+ :noname
- vocabulary) has its own methods for searching
+   dup [parent] foo \ do parent's foo on the receiving object
- etc. (@pxref{Word Lists}). For the present purpose we defined a word list
+   ... \ do some more
- with a special search method: When it is searched for a word, it
+ ; overrides foo
- actually creates that word using @code{W:}. @code{@{} changes the search
+ @end example
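+ A complete (if contrived) sketch of both kinds of class binding
+ follows. The class @code{point} and its field are hypothetical; the
+ only selectors used are @code{construct} and @code{print}, which
+ @code{object} provides.
+ @example
+ require objects.fs
+ object class
+   cell% field point-x
+ :noname ( n point -- )
+   point-x ! ;
+ overrides construct
+ :noname ( point -- )
+   dup [parent] print    \ first do what object's print does
+   ."  x=" point-x @@ . ; \ then a little more
+ overrides print
+ end-class point
+ 3 point heap-new constant p
+ p print             cr  \ late binding: point's method
+ p bind object print cr  \ class binding: object's method
+ @end example
+ The second call skips the part added by @code{point}, because the
+ method is selected by the named class and not by the class of the
+ receiving object.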
- order to first search the word list containing @code{@}}, @code{W:} etc.,
- and then the word list for defining locals without type specifiers.
- The lifetime rules support a stack discipline within a colon
+ @cindex class binding as optimization
- definition: The lifetime of a local is either nested with other locals
+ In @cite{Object-oriented programming in ANS Forth} (Forth Dimensions,
- lifetimes or it does not overlap them.
+ March 1997), Andrew McKewan presents class binding as an optimization
+ technique. I recommend not using it for this purpose unless you are in
+ an emergency. Late binding is pretty fast with this model anyway, so the
+ benefit of using class binding is small; the cost of using class binding
+ where it is not appropriate is reduced maintainability.
- At @code{BEGIN}, @code{IF}, and @code{AHEAD} no code for locals stack
+ While we are at programming style questions: You should bind
- pointer manipulation is generated. Between control structure words
+ selectors only to ancestor classes of the receiving object. E.g., say,
- locals definitions can push locals onto the locals stack. @code{AGAIN}
+ you know that the receiving object is of class @code{foo} or its
- is the simplest of the other three control flow words. It has to
+ descendants; then you should bind only to @code{foo} and its
- restore the locals stack depth of the corresponding @code{BEGIN}
+ ancestors.
- before branching. The code looks like this:
- @format
- @code{lp+!#} current-locals-size @minus{} dest-locals-size
- @code{branch} <begin>
- @end format
- @code{UNTIL} is a little more complicated: If it branches back, it
+ @node Method conveniences, Classes and Scoping, Class Binding, Objects
- must adjust the stack just like @code{AGAIN}. But if it falls through,
+ @subsubsection Method conveniences
- the locals stack must not be changed. The compiler generates the
+ @cindex method conveniences
- following code:
- @format
- @code{?branch-lp+!#} <begin> current-locals-size @minus{} dest-locals-size
- @end format
- The locals stack pointer is only adjusted if the branch is taken.
- @code{THEN} can produce somewhat inefficient code:
+ In a method you usually access the receiving object pretty often.  If
- @format
+ you define the method as a plain colon definition (e.g., with
- @code{lp+!#} current-locals-size @minus{} orig-locals-size
+ @code{:noname}), you may have to do a lot of stack
- <orig target>:
+ gymnastics. To avoid this, you can define the method with @code{m:
- @code{lp+!#} orig-locals-size @minus{} new-locals-size
+ ... ;m}. E.g., you could define the method for
- @end format
+ @code{draw}ing a @code{circle} with
- The second @code{lp+!#} adjusts the locals stack pointer from the
- level at the @i{orig} point to the level after the @code{THEN}. The
- first @code{lp+!#} adjusts the locals stack pointer from the current
- level to the level at the orig point, so the complete effect is an
- adjustment from the current level to the right level after the
- @code{THEN}.
- @cindex locals information on the control-flow stack
+ @cindex @code{this} usage
- @cindex control-flow stack items, locals information
+ @cindex @code{m:} usage
- In a conventional Forth implementation a dest control-flow stack entry
+ @cindex @code{;m} usage
- is just the target address and an orig entry is just the address to be
+ @example
- patched. Our locals implementation adds a word list to every orig or dest
+ m: ( x y circle -- )
- item. It is the list of locals visible (or assumed visible) at the point
+   ( x y ) this circle-radius @@ draw-circle ;m
- described by the entry. Our implementation also adds a tag to identify
+ @end example
- the kind of entry, in particular to differentiate between live and dead
- (reachable and unreachable) orig entries.
- A few unusual operations have to be performed on locals word lists:
+ @cindex @code{exit} in @code{m: ... ;m}
+ @cindex @code{exitm} discussion
+ @cindex @code{catch} in @code{m: ... ;m}
+ When this method is executed, the receiver object is removed from the
+ stack; you can access it with @code{this} (admittedly, in this
+ example the use of @code{m: ... ;m} offers no advantage). Note
+ that I specify the stack effect for the whole method (i.e. including
+ the receiver object), not just for the code between @code{m:}
+ and @code{;m}. You cannot use @code{exit} in
+ @code{m:...;m}; instead, use
+ @code{exitm}.@footnote{Moreover, for any word that calls
+ @code{catch} and was defined before loading
+ @code{objects.fs}, you have to redefine it like I redefined
+ @code{catch}: @code{: catch this >r catch r> to-this ;}}
+ @cindex @code{inst-var} usage
+ You will frequently use sequences of the form @code{this
+ @emph{field}} (in the example above: @code{this
+ circle-radius}). If you use the field only in this way, you can
+ define it with @code{inst-var} and eliminate the
+ @code{this} before the field name. E.g., the @code{circle}
+ class above could also be defined with:
- doc-common-list
+ @example
- doc-sub-list?
+ graphical class
- doc-list-size
+   cell% inst-var radius
+ m: ( x y circle -- )
+   radius @@ draw-circle ;m
+ overrides draw
- Several features of our locals word list implementation make these
+ m: ( n-radius circle -- )
- operations easy to implement: The locals word lists are organised as
+   radius ! ;m
- linked lists; the tails of these lists are shared, if the lists
+ overrides construct
- contain some of the same locals; and the address of a name is greater
- than the address of the names behind it in the list.
- Another important implementation detail is the variable
+ end-class circle
- @code{dead-code}. It is used by @code{BEGIN} and @code{THEN} to
+ @end example
- determine if they can be reached directly or only through the branch
- that they resolve. @code{dead-code} is set by @code{UNREACHABLE},
- @code{AHEAD}, @code{EXIT} etc., and cleared at the start of a colon
- definition, by @code{BEGIN} and usually by @code{THEN}.
- Counted loops are similar to other loops in most respects, but
+ @code{radius} can only be used in @code{circle} and its
- @code{LEAVE} requires special attention: It performs basically the same
+ descendent classes and inside @code{m:...;m}.
- service as @code{AHEAD}, but it does not create a control-flow stack
- entry. Therefore the information has to be stored elsewhere;
- traditionally, the information was stored in the target fields of the
- branches created by the @code{LEAVE}s, by organizing these fields into a
- linked list. Unfortunately, this clever trick does not provide enough
- space for storing our extended control flow information. Therefore, we
- introduce another stack, the leave stack. It contains the control-flow
- stack entries for all unresolved @code{LEAVE}s.
- Local names are kept until the end of the colon definition, even if
+ @cindex @code{inst-value} usage
- they are no longer visible in any control-flow path. In a few cases
+ You can also define fields with @code{inst-value}, which is
- this may lead to increased space needs for the locals name area, but
+ to @code{inst-var} what @code{value} is to
- usually less than reclaiming this space would cost in code size.
+ @code{variable}.  You can change the value of such a field with
+ @code{[to-inst]}.  E.g., we could also define the class
+ @code{circle} like this:
+ @example
+ graphical class
+   inst-value radius
- @node ANS Forth locals,  , Gforth locals, Locals
+ m: ( x y circle -- )
- @subsection ANS Forth locals
+   radius draw-circle ;m
- @cindex locals, ANS Forth style
+ overrides draw
- The ANS Forth locals wordset does not define a syntax for locals, but
+ m: ( n-radius circle -- )
- words that make it possible to define various syntaxes. One of the
+   [to-inst] radius ;m
- possible syntaxes is a subset of the syntax we used in the Gforth locals
+ overrides construct
- wordset, i.e.:
- @example
+ end-class circle
- @{ local1 local2 ... -- comment @}
- @end example
- @noindent
- or
- @example
- @{ local1 local2 ... @}
  @end example
- The order of the locals corresponds to the order in a stack comment. The
+ @c !! :m is easy to confuse with m:.  Another name would be better.
- restrictions are:
- @itemize @bullet
+ @c Finally, you can define named methods with @code{:m}.  One use of this
- @item
+ @c feature is the definition of words that occur only in one class and are
- Locals can only be cell-sized values (no type specifiers are allowed).
+ @c not intended to be overridden, but which still need method context
- @item
+ @c (e.g., for accessing @code{inst-var}s).  Another use is for methods that
- Locals can be defined only outside control structures.
+ @c would be bound frequently, if defined anonymously.
- @item
- Locals can interfere with explicit usage of the return stack. For the
- exact (and long) rules, see the standard. If you don't use return stack
- accessing words in a definition using locals, you will be all right. The
- purpose of this rule is to make locals implementation on the return
- stack easier.
- @item
- The whole definition must be in one line.
- @end itemize
- Locals defined in this way behave like @code{VALUE}s
- (@pxref{Values}). I.e., they are initialized from the stack. Using their
- name produces their value. Their value can be changed using @code{TO}.
- Since this syntax is supported by Gforth directly, you need not do
+ @node Classes and Scoping, Dividing classes, Method conveniences, Objects
- anything to use it. If you want to port a program using this syntax to
+ @subsubsection Classes and Scoping
- another ANS Forth system, use @file{compat/anslocal.fs} to implement the
+ @cindex classes and scoping
- syntax on the other system.
+ @cindex scoping and classes
- Note that a syntax shown in the standard, section A.13 looks
+ Inheritance is frequent, unlike structure extension. This exacerbates
- similar, but is quite different in having the order of locals
+ the problem with the field name convention (@pxref{Structure Naming
- reversed. Beware!
+ Convention}): One always has to remember in which class the field was
+ originally defined; changing a part of the class structure would require
+ changes for renaming in otherwise unaffected code.
- The ANS Forth locals wordset itself consists of a word:
+ @cindex @code{inst-var} visibility
+ @cindex @code{inst-value} visibility
+ To solve this problem, I added a scoping mechanism (which was not in my
+ original charter): A field defined with @code{inst-var} (or
+ @code{inst-value}) is visible only in the class where it is defined and in
+ the descendent classes of this class.  Using such fields only makes
+ sense in @code{m:}-defined methods in these classes anyway.
+ This scoping mechanism allows us to use the unadorned field name,
+ because name clashes with unrelated words become much less likely.
- doc-(local)
+ @cindex @code{protected} discussion
+ @cindex @code{private} discussion
+ Once we have this mechanism, we can also use it for controlling the
+ visibility of other words: All words defined after
+ @code{protected} are visible only in the current class and its
+ descendents. @code{public} restores the compilation
+ (i.e. @code{current}) word list that was in effect before. If you
+ have several @code{protected}s without an intervening
+ @code{public} or @code{set-current}, @code{public}
+ will restore the compilation word list in effect before the first of
+ these @code{protected}s.
+ @node Dividing classes, Object Interfaces, Classes and Scoping, Objects
+ @subsubsection Dividing classes
+ @cindex Dividing classes
+ @cindex @code{methods}...@code{end-methods}
- The ANS Forth locals extension wordset defines a syntax using @code{locals|}, but it is so
+ You may want to do the definition of methods separate from the
- awful that we strongly recommend not to use it. We have implemented this
+ definition of the class, its selectors, fields, and instance variables,
- syntax to make porting to Gforth easy, but do not document it here. The
+ i.e., separate the implementation from the definition.  You can do this
- problem with this syntax is that the locals are defined in an order
+ in the following way:
- reversed with respect to the standard stack comment notation, making
- programs harder to read, and easier to misread and miswrite. The only
- merit of this syntax is that it is easy to implement using the ANS Forth
- locals wordset.
+ @example
+ graphical class
+   inst-value radius
+ end-class circle
- @c ----------------------------------------------------------
+ ... \ do some other stuff
- @node Structures, Object-oriented Forth, Locals, Words
- @section  Structures
- @cindex structures
- @cindex records
- This section presents the structure package that comes with Gforth. A
+ circle methods \ now we are ready
- version of the package implemented in ANS Forth is available in
- @file{compat/struct.fs}. This package was inspired by a posting on
- comp.lang.forth in 1989 (unfortunately I don't remember, by whom;
- possibly John Hayes). A version of this section has been published in
- ???. Marcel Hendrix provided helpful comments.
- @menu
+ m: ( x y circle -- )
- * Why explicit structure support?::
+   radius draw-circle ;m
- * Structure Usage::
+ overrides draw
- * Structure Naming Convention::
- * Structure Implementation::
- * Structure Glossary::
- @end menu
- @node Why explicit structure support?, Structure Usage, Structures, Structures
+ m: ( n-radius circle -- )
- @subsection Why explicit structure support?
+   [to-inst] radius ;m
+ overrides construct
- @cindex address arithmetic for structures
+ end-methods
- @cindex structures using address arithmetic
+ @end example
- If we want to use a structure containing several fields, we could simply
- reserve memory for it, and access the fields using address arithmetic
- (@pxref{Address arithmetic}). As an example, consider a structure with
- the following fields
- @table @code
+ You can use several @code{methods}...@code{end-methods} sections.  The
- @item a
+ only things you can do to the class in these sections are: defining
- is a float
+ methods, and overriding the class's selectors.  You must not define new
- @item b
+ selectors or fields.
- is a cell
- @item c
- is a float
- @end table
- Given the (float-aligned) base address of the structure we get the
+ Note that you often have to override a selector before using it.  In
- address of the field
+ particular, you usually have to override @code{construct} with a new
+ method before you can invoke @code{heap-new} and friends.  E.g., you
+ must not create a circle before the @code{overrides construct} sequence
+ in the example above.
- @table @code
+ @node Object Interfaces, Objects Implementation, Dividing classes, Objects
- @item a
+ @subsubsection Object Interfaces
- without doing anything further.
+ @cindex object interfaces
- @item b
+ @cindex interfaces for objects
- with @code{float+}
- @item c
- with @code{float+ cell+ faligned}
- @end table
- It is easy to see that this can become quite tiring.
+ In this model you can only call selectors defined in the class of the
+ receiving objects or in one of its ancestors. If you call a selector
+ with a receiving object that is not in one of these classes, the
+ result is undefined; if you are lucky, the program crashes
+ immediately.
- Moreover, it is not very readable, because seeing a
+ @cindex selectors common to hardly-related classes
- @code{cell+} tells us neither which kind of structure is
+ Now consider the case when you want to have a selector (or several)
- accessed nor what field is accessed; we have to somehow infer the kind
+ available in two classes: You would have to add the selector to a
- of structure, and then look up in the documentation, which field of
+ common ancestor class, in the worst case to @code{object}. You
- that structure corresponds to that offset.
+ may not want to do this, e.g., because someone else is responsible for
+ this ancestor class.
- Finally, this kind of address arithmetic also causes maintenance
+ The solution for this problem is interfaces. An interface is a
- troubles: If you add or delete a field somewhere in the middle of the
+ collection of selectors. If a class implements an interface, the
- structure, you have to find and change all computations for the fields
+ selectors become available to the class and its descendents. A class
- afterwards.
+ can implement an unlimited number of interfaces. For the problem
+ discussed above, we would define an interface for the selector(s), and
+ both classes would implement the interface.
- So, instead of using @code{cell+} and friends directly, how
+ As an example, consider an interface @code{storage} for
- about storing the offsets in constants:
+ writing objects to disk and getting them back, and a class
+ @code{foo} that implements it. The code would look like this:
+ @cindex @code{interface} usage
+ @cindex @code{end-interface} usage
+ @cindex @code{implementation} usage
  @example
- 0 constant a-offset
+ interface
- 0 float+ constant b-offset
+   selector write ( file object -- )
- 0 float+ cell+ faligned constant c-offset
+   selector read1 ( file object -- )
- @end example
+ end-interface storage
- Now we can get the address of field @code{x} with @code{x-offset
+ bar class
- +}. This is much better in all respects. Of course, you still
+   storage implementation
- have to change all later offset definitions if you add a field. You can
- fix this by declaring the offsets in the following way:
- @example
+ ... overrides write
- 0 constant a-offset
+ ... overrides read1
- a-offset float+ constant b-offset
+ ...
- b-offset cell+ faligned constant c-offset
+ end-class foo
  @end example
- Since we always use the offsets with @code{+}, we could use a defining
+ @noindent
- word @code{cfield} that includes the @code{+} in the action of the
+ (I would add a word @code{read} @i{( file -- object )} that uses
- defined word:
+ @code{read1} internally, but that's beyond the point illustrated
+ here.)
- @example
+ Note that you cannot use @code{protected} in an interface; and
- : cfield ( n "name" -- )
+ of course you cannot define fields.
-     create ,
- does> ( name execution: addr1 -- addr2 )
-     @@ + ;
- 0 cfield a
+ In the Neon model, all selectors are available for all classes;
- 0 a float+ cfield b
+ therefore it does not need interfaces. The price you pay in this model
- 0 b cell+ faligned cfield c
+ is slower late binding, and therefore, added complexity to avoid late
- @end example
+ binding.
- Instead of @code{x-offset +}, we now simply write @code{x}.
+ @node Objects Implementation, Objects Glossary, Object Interfaces, Objects
+ @subsubsection @file{objects.fs} Implementation
+ @cindex @file{objects.fs} implementation
- The structure field words now can be used quite nicely. However,
+ @cindex @code{object-map} discussion
- their definition is still a bit cumbersome: We have to repeat the
+ An object is a piece of memory, like one of the data structures
- name, the information about size and alignment is distributed before
+ described with @code{struct...end-struct}. It has a field
- and after the field definitions etc.  The structure package presented
+ @code{object-map} that points to the method map for the object's
- here addresses these problems.
+ class.
- @node Structure Usage, Structure Naming Convention, Why explicit structure support?, Structures
+ @cindex method map
- @subsection Structure Usage
+ @cindex virtual function table
- @cindex structure usage
+ The @emph{method map}@footnote{This is Self terminology; in C++
+ terminology: virtual function table.} is an array that contains the
+ execution tokens (@i{xt}s) of the methods for the object's class. Each
+ selector contains an offset into a method map.
+ @cindex @code{selector} implementation, class
+ @code{selector} is a defining word that uses
+ @code{CREATE} and @code{DOES>}. The body of the
+ selector contains the offset; the @code{DOES>} action for a
+ class selector is, basically:
- @cindex @code{field} usage
- @cindex @code{struct} usage
- @cindex @code{end-struct} usage
- You can define a structure for a (data-less) linked list with:
  @example
- struct
+ ( object addr ) @@ over object-map @@ + @@ execute
-     cell% field list-next
- end-struct list%
  @end example
- With the address of the list node on the stack, you can compute the
+ Since @code{object-map} is the first field of the object, it
- address of the field that contains the address of the next node with
+ does not generate any code. As you can see, calling a selector has a
- @code{list-next}. E.g., you can determine the length of a list
+ small, constant cost.
- with:
- @example
- : list-length ( list -- n )
- \ "list" is a pointer to the first element of a linked list
- \ "n" is the length of the list
-     0 BEGIN ( list1 n1 )
-         over
-     WHILE ( list1 n1 )
-         1+ swap list-next @@ swap
-     REPEAT
-     nip ;
- @end example
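The class-selector dispatch described above (the selector body holds an offset into the method map, and the first cell of the object points to the map) can be tried out without loading @file{objects.fs}. The following is only a minimal sketch in plain ANS Forth, not the package's source; the names @code{sel}, @code{area}, @code{sides}, @code{sq-map}, @code{tri-map}, @code{sq1} and @code{tri1} are made up for illustration.

```forth
\ Sketch only: plain ANS Forth, not the objects.fs source.
\ All names defined here are hypothetical.

: sel ( offset "name" -- )          \ define a class selector
    create ,
  does> ( ... object body -- ... )
    @ over @ + @ execute ;          \ offset + map address gives the method's xt

0 cells sel area  ( object -- n )
1 cells sel sides ( object -- n )

\ Methods for two "classes"; the second cell of an object holds its size.
: sq-area   ( object -- n ) cell+ @ dup * ;
: sq-sides  ( object -- n ) drop 4 ;
: tri-area  ( object -- n ) cell+ @ dup * 2/ ;
: tri-sides ( object -- n ) drop 3 ;

\ Method maps: arrays of xts, indexed by the selector offsets.
create sq-map   ' sq-area ,   ' sq-sides ,
create tri-map  ' tri-area ,  ' tri-sides ,

\ Objects: the first cell points to the method map, the data follows.
create sq1   sq-map ,   3 ,
create tri1  tri-map ,  4 ,

sq1 area .  tri1 area .  tri1 sides .   \ prints 9 8 3
```

The same word @code{area} runs different code for @code{sq1} and @code{tri1}, at the cost of three fetches and one addition per call.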
- You can reserve memory for a list node in the dictionary with
- @code{list% %allot}, which leaves the address of the list node on the
- stack. For the equivalent allocation on the heap you can use @code{list%
- %alloc} (or, for an @code{allocate}-like stack effect (i.e., with ior),
- use @code{list% %allocate}). You can get the size of a list
- node with @code{list% %size} and its alignment with @code{list%
- %alignment}.
- Note that in ANS Forth the body of a @code{create}d word is
- @code{aligned} but not necessarily @code{faligned};
- therefore, if you do a:
- @example
- create @emph{name} foo% %allot
- @end example
- @noindent
- then the memory allotted for @code{foo%} is
- guaranteed to start at the body of @code{@emph{name}} only if
- @code{foo%} contains only character, cell and double fields.
- @cindex structures containing structures
- You can include a structure @code{foo%} as a field of
- another structure, like this:
- @example
- struct
- ...
-     foo% field ...
- ...
- end-struct ...
- @end example
- @cindex structure extension
- @cindex extended records
- Instead of starting with an empty structure, you can extend an
- existing structure. E.g., a plain linked list without data, as defined
- above, is hardly useful; you can extend it to a linked list of integers,
- like this:@footnote{This feature is also known as @emph{extended
- records}. It is the main innovation in the Oberon language; in other
- words, adding this feature to Modula-2 led Wirth to create a new
- language, write a new compiler etc.  Adding this feature to Forth just
- required a few lines of code.}
- @example
- list%
-     cell% field intlist-int
- end-struct intlist%
- @end example
- @code{intlist%} is a structure with two fields:
- @code{list-next} and @code{intlist-int}.
- @cindex structures containing arrays
- You can specify an array type containing @emph{n} elements of
- type @code{foo%} like this:
- @example
- foo% @emph{n} *
- @end example
- You can use this array type in any place where you can use a normal
- type, e.g., when defining a @code{field}, or with
- @code{%allot}.
- @cindex first field optimization
- The first field is at the base address of a structure and the word
- for this field (e.g., @code{list-next}) actually does not change
- the address on the stack. You may be tempted to leave it out in the
- interest of run-time and space efficiency. This is not necessary,
- because the structure package optimizes this case and compiling such
- words does not generate any code. So, in the interest of readability
- and maintainability you should include the word for the field when
- accessing the field.
- @node Structure Naming Convention, Structure Implementation, Structure Usage, Structures
- @subsection Structure Naming Convention
- @cindex structure naming convention
- The field names that come to (my) mind are often quite generic, and,
- if used, would cause frequent name clashes. E.g., many structures
- probably contain a @code{counter} field. The structure names
- that come to (my) mind are often also the logical choice for the names
- of words that create such a structure.
- Therefore, I have adopted the following naming conventions:
- @itemize @bullet
- @cindex field naming convention
- @item
- The names of fields are of the form
- @code{@emph{struct}-@emph{field}}, where
- @code{@emph{struct}} is the basic name of the structure, and
- @code{@emph{field}} is the basic name of the field. You can
- think of field words as converting the (address of the)
- structure into the (address of the) field.
- @cindex structure naming convention
- @item
- The names of structures are of the form
- @code{@emph{struct}%}, where
- @code{@emph{struct}} is the basic name of the structure.
- @end itemize
- This naming convention does not work that well for fields of extended
- structures; e.g., the integer list structure has a field
- @code{intlist-int}, but has @code{list-next}, not
- @code{intlist-next}.
- @node Structure Implementation, Structure Glossary, Structure Naming Convention, Structures
+ @cindex @code{current-interface} discussion
- @subsection Structure Implementation
+ @cindex class implementation and representation
- @cindex structure implementation
+ A class is basically a @code{struct} combined with a method
- @cindex implementation of structures
+ map. During the class definition the alignment and size of the class
+ are passed on the stack, just as with @code{struct}s, so
+ @code{field} can also be used for defining class
+ fields. However, passing more items on the stack would be
+ inconvenient, so @code{class} builds a data structure in memory,
+ which is accessed through the variable
+ @code{current-interface}. After its definition is complete, the
+ class is represented on the stack by a pointer (e.g., as parameter for
+ a child class definition).
- The central idea in the implementation is to pass the data about the
+ A new class starts off with the alignment and size of its parent,
- structure being built on the stack, not in some global
+ and a copy of the parent's method map. Defining new fields extends the
- variable. Everything else falls into place naturally once this design
+ size and alignment; likewise, defining new selectors extends the
- decision is made.
+ method map. @code{overrides} just stores a new @i{xt} in the method
+ map at the offset given by the selector.
- The type description on the stack is of the form @emph{align
+ @cindex class binding, implementation
- size}. Keeping the size on the top-of-stack makes dealing with arrays
+ Class binding just gets the @i{xt} at the offset given by the selector
- very simple.
+ from the class's method map and @code{compile,}s (in the case of
+ @code{[bind]}) it.
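The way a child class inherits and overrides methods, as described above, can be pictured with bare arrays of execution tokens. This is a hedged sketch in plain ANS Forth under the assumption of a two-entry method map; @code{animal-map}, @code{bird-map}, @code{map-size} and @code{bound} are hypothetical names, and the real @file{objects.fs} keeps more bookkeeping than shown here.

```forth
\ Sketch only: hypothetical names, plain ANS Forth.
2 cells constant map-size            \ room for two selectors

: animal-legs  ( -- n ) 4 ;
: animal-tails ( -- n ) 1 ;
create animal-map  ' animal-legs ,  ' animal-tails ,

\ A child map starts out as a copy of the parent's map ...
create bird-map  map-size allot
animal-map bird-map map-size move

\ ... and overriding stores a new xt at the selector's offset.
: bird-legs ( -- n ) 2 ;
' bird-legs  bird-map 0 cells +  !

\ Class binding: fetch the xt at a known offset from a known map.
: bound ( map offset -- xt ) + @ ;

bird-map   0 cells bound execute .   \ 2 (overridden)
bird-map   1 cells bound execute .   \ 1 (inherited)
animal-map 0 cells bound execute .   \ 4 (parent unchanged)
```

Because the child owns a copy of the map, overriding a method in the child leaves the parent's behaviour untouched.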
- @code{field} is a defining word that uses @code{Create}
+ @cindex @code{this} implementation
- and @code{DOES>}. The body of the field contains the offset
+ @cindex @code{catch} and @code{this}
- of the field, and the normal @code{DOES>} action is simply:
+ @cindex @code{this} and @code{catch}
+ I implemented @code{this} as a @code{value}. At the
+ start of an @code{m:...;m} method the old @code{this} is
+ stored to the return stack and restored at the end; and the object on
+ the TOS is stored @code{TO this}. This technique has one
+ disadvantage: If the user does not leave the method via
+ @code{;m}, but via @code{throw} or @code{exit},
+ @code{this} is not restored (and @code{exit} may
+ crash). To deal with the @code{throw} problem, I have redefined
+ @code{catch} to save and restore @code{this}; the same
+ should be done with any word that can catch an exception. As for
+ @code{exit}, I simply forbid it (as a replacement, there is
+ @code{exitm}).
+ @cindex @code{inst-var} implementation
+ @code{inst-var} is just the same as @code{field}, with
+ a different @code{DOES>} action:
  @example
- @@ +
+ @@ this +
  @end example
+ Similarly for @code{inst-value}.
- @noindent
+ @cindex class scoping implementation
- i.e., add the offset to the address, giving the stack effect
+ Each class also has a word list that contains the words defined with
- @i{addr1 -- addr2} for a field.
+ @code{inst-var} and @code{inst-value}, and its protected
+ words. It also has a pointer to its parent. @code{class} pushes
- @cindex first field optimization, implementation
+ the word lists of the class and all its ancestors onto the search order stack,
- This simple structure is slightly complicated by the optimization
+ and @code{end-class} drops them.
- for fields with offset 0, which requires a different
- @code{DOES>}-part (because we cannot rely on there being
- something on the stack if such a field is invoked during
- compilation). Therefore, we put the different @code{DOES>}-parts
- in separate words, and decide which one to invoke based on the
- offset. For a zero offset, the field is basically a noop; it is
- immediate, and therefore no code is generated when it is compiled.
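The handling of @code{this} and the @code{inst-var} action described earlier in this section can be imitated in a few lines. The sketch below is an assumption-laden stand-in, not the @file{objects.fs} code: @code{this} is defined locally as a @code{value}, @code{ivar} plays the role of @code{inst-var}, and the save and restore that @code{m: ... ;m} would do is written out by hand. It is meant to be run without @file{objects.fs} loaded.

```forth
\ Sketch only: hypothetical stand-ins, plain ANS Forth.
0 value this                     \ the current receiving object

: ivar ( offset "name" -- )      \ like a field, but relative to this
    create ,
  does> ( body -- addr )
    @ this + ;

0 cells ivar radius

: set-radius ( n object -- )
    this >r  to this             \ save the old receiver, install the new one
    radius !
    r> to this ;                 \ restore the old receiver

: get-radius ( object -- n )
    this >r  to this
    radius @
    r> to this ;

: copy-radius ( src dst -- )
    this >r  to this             \ this = dst
    get-radius                   \ switches to src and back again
    radius !                     \ stores into dst
    r> to this ;

create c1 0 ,
create c2 0 ,
5 c1 set-radius
c1 c2 copy-radius
c2 get-radius .                  \ prints 5
```

The nested call in @code{copy-radius} only works because every definition restores @code{this} before returning, which is why leaving such a definition early is a problem.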
- @node Structure Glossary,  , Structure Implementation, Structures
- @subsection Structure Glossary
- @cindex structure glossary
- doc-%align
- doc-%alignment
- doc-%alloc
- doc-%allocate
- doc-%allot
- doc-cell%
- doc-char%
- doc-dfloat%
- doc-double%
- doc-end-struct
- doc-field
- doc-float%
- doc-naligned
- doc-sfloat%
- doc-%size
- doc-struct
- @c -------------------------------------------------------------
- @node Object-oriented Forth, Passing Commands to the OS, Structures, Words
- @section Object-oriented Forth
- Gforth comes with three packages for object-oriented programming:
- @file{objects.fs}, @file{oof.fs}, and @file{mini-oof.fs}; none of them
- is preloaded, so you have to @code{include} them before use. The most
- important differences between these packages (and others) are discussed
- in @ref{Comparison with other object models}. All packages are written
- in ANS Forth and can be used with any other ANS Forth.
- @menu
- * Why object-oriented programming?::
- * Object-Oriented Terminology::
- * Objects::
- * OOF::
- * Mini-OOF::
- * Comparison with other object models::
- @end menu
- @c ----------------------------------------------------------------
- @node Why object-oriented programming?, Object-Oriented Terminology, Object-oriented Forth, Object-oriented Forth
- @subsection Why object-oriented programming?
- @cindex object-oriented programming motivation
- @cindex motivation for object-oriented programming
- Often we have to deal with several data structures (@emph{objects}),
- that have to be treated similarly in some respects, but differently in
- others. Graphical objects are the textbook example: circles, triangles,
- dinosaurs, icons, and others, and we may want to add more during program
- development. We want to apply some operations to any graphical object,
- e.g., @code{draw} for displaying it on the screen. However, @code{draw}
- has to do something different for every kind of object.
- @comment TODO add some other operations eg perimeter, area
- @comment and tie in to concrete examples later..
- We could implement @code{draw} as a big @code{CASE}
- control structure that executes the appropriate code depending on the
- kind of object to be drawn. This would not be very elegant, and,
- moreover, we would have to change @code{draw} every time we add
- a new kind of graphical object (say, a spaceship).
- What we would rather do is: When defining spaceships, we would tell
- the system: ``Here's how you @code{draw} a spaceship; you figure
- out the rest''.
- This is the problem that all systems solve that (rightfully) call
+ @cindex interface implementation
- themselves object-oriented; the object-oriented packages presented here
+ An interface is like a class without fields, parent and protected
- solve this problem (and not much else).
+ words; i.e., it just has a method map. If a class implements an
- @comment TODO ?list properties of oo systems.. oo vs o-based?
+ interface, its method map contains a pointer to the method map of the
+ interface. The positive offsets in the map are reserved for class
+ methods, therefore interface map pointers have negative
+ offsets. Interfaces have offsets that are unique throughout the
+ system, unlike class selectors, whose offsets are only unique for the
+ classes where the selector is available (invokable).
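One way to picture the layout just described is a class method map with an extra cell below it, at a negative offset, that points to an interface method map. A selector for such an interface then needs two offsets: one to find the interface map pointer and one to find the method within that map. The following is a rough sketch in plain ANS Forth; @code{if-sel}, @code{describe} and the map names are hypothetical, and giving each implementing class its own interface map is an assumption of this sketch.

```forth
\ Sketch only: hypothetical names, plain ANS Forth, not objects.fs code.

: if-sel ( if-offset method-offset "name" -- )  \ define an interface selector
    create swap , ,               \ body: if-offset, then method-offset
  does> ( ... object body -- ... )
    2dup @                        ( object body object if-offset )
    swap @ + @                    ( object body if-map )
    swap cell+ @ + @ execute ;    \ xt taken from the interface map

-1 cells  0 cells  if-sel describe ( object -- n )

: foo-describe ( object -- n ) drop 1 ;
: bar-describe ( object -- n ) drop 2 ;

\ One interface method map per implementing class (an assumption).
create foo-if-map  ' foo-describe ,
create bar-if-map  ' bar-describe ,

\ Class method maps: the cell just below each map (offset -1 cells)
\ points to the interface map; class methods would start at offset 0.
create foo-cells  foo-if-map ,  0 ,
create bar-cells  bar-if-map ,  0 ,
foo-cells cell+ constant foo-map
bar-cells cell+ constant bar-map

\ Objects: the first cell points to the class method map.
create foo1  foo-map ,
create bar1  bar-map ,

foo1 describe .  bar1 describe .  \ prints 1 2
```

Here @code{foo1} and @code{bar1} share no ancestor that defines @code{describe}, yet both respond to it because the interface offset is the same in both class maps.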
- @c ------------------------------------------------------------------------
+ This structure means that interface selectors have to perform one
- @node Object-Oriented Terminology, Objects, Why object-oriented programming?, Object-oriented Forth
+ indirection more than class selectors to find their method. Their body
- @subsection Object-Oriented Terminology
+ contains the interface map pointer offset in the class method map, and
- @cindex object-oriented terminology
+ the method offset in the interface method map. The
- @cindex terminology for object-oriented programming
+ @code{does>} action for an interface selector is, basically:
- This section is mainly for reference, so you don't have to understand
+ @example
- all of it right away.  The terminology is mainly Smalltalk-inspired.  In
+ ( object selector-body )
- short:
+ 2dup selector-interface @@ ( object selector-body object interface-offset )
+ swap object-map @@ + @@ ( object selector-body map )
+ swap selector-offset @@ + @@ execute
+ @end example
- @table @emph
+ where @code{object-map} and @code{selector-offset} are
- @cindex class
+ first fields and generate no code.
- @item class
- a data structure definition with some extras.
- @cindex object
+ As a concrete example, consider the following code:
- @item object
- an instance of the data structure described by the class definition.
- @cindex instance variables
+ @example
- @item instance variables
+ interface
- fields of the data structure.
+   selector if1sel1
+   selector if1sel2
+ end-interface if1
- @cindex selector
+ object class
- @cindex method selector
+   if1 implementation
- @cindex virtual function
+   selector cl1sel1
- @item selector
+   cell% inst-var cl1iv1
- (or @emph{method selector}) a word (e.g.,
- @code{draw}) that performs an operation on a variety of data
- structures (classes). A selector describes @emph{what} operation to
- perform. In C++ terminology: a (pure) virtual function.
- @cindex method
+ ' m1 overrides construct
- @item method
+ ' m2 overrides if1sel1
- the concrete definition that performs the operation
+ ' m3 overrides if1sel2
- described by the selector for a specific class. A method specifies
+ ' m4 overrides cl1sel1
- @emph{how} the operation is performed for a specific class.
+ end-class cl1
- @cindex selector invocation
+ create obj1 object dict-new drop
- @cindex message send
+ create obj2 cl1    dict-new drop
- @cindex invoking a selector
+ @end example
- @item selector invocation
- a call of a selector. One argument of the call (the TOS (top-of-stack))
- is used for determining which method is used. In Smalltalk terminology:
- a message (consisting of the selector and the other arguments) is sent
- to the object.
- @cindex receiving object
+ The data structure created by this code (including the data structure
- @item receiving object
+ for @code{object}) is shown in the
- the object used for determining the method executed by a selector
+ @uref{objects-implementation.eps,figure}, assuming a cell size of 4.
- invocation. In the @file{objects.fs} model, it is the object that is on
+ @comment TODO add this diagram..
- the TOS when the selector is invoked. (@emph{Receiving} comes from
- the Smalltalk @emph{message} terminology.)
- @cindex child class
+ @node Objects Glossary,  , Objects Implementation, Objects
- @cindex parent class
+ @subsubsection @file{objects.fs} Glossary
- @cindex inheritance
+ @cindex @file{objects.fs} Glossary
- @item child class
- a class that has (@emph{inherits}) all properties (instance variables,
- selectors, methods) from a @emph{parent class}. In Smalltalk
- terminology: The subclass inherits from the superclass. In C++
- terminology: The derived class inherits from the base class.
- @end table
- @c If you wonder about the message sending terminology, it comes from
+ doc---objects-bind
- @c a time when each object had it's own task and objects communicated via
+ doc---objects-<bind>
- @c message passing; eventually the Smalltalk developers realized that
+ doc---objects-bind'
- @c they can do most things through simple (indirect) calls. They kept the
+ doc---objects-[bind]
- @c terminology.
+ doc---objects-class
+ doc---objects-class->map
+ doc---objects-class-inst-size
+ doc---objects-class-override!
+ doc---objects-construct
+ doc---objects-current'
+ doc---objects-[current]
+ doc---objects-current-interface
+ doc---objects-dict-new
+ doc---objects-drop-order
+ doc---objects-end-class
+ doc---objects-end-class-noname
+ doc---objects-end-interface
+ doc---objects-end-interface-noname
+ doc---objects-end-methods
+ doc---objects-exitm
+ doc---objects-heap-new
+ doc---objects-implementation
+ doc---objects-init-object
+ doc---objects-inst-value
+ doc---objects-inst-var
+ doc---objects-interface
+ doc---objects-m:
+ doc---objects-:m
+ doc---objects-;m
+ doc---objects-method
+ doc---objects-methods
+ doc---objects-object
+ doc---objects-overrides
+ doc---objects-[parent]
+ doc---objects-print
+ doc---objects-protected
+ doc---objects-public
+ @c !! push-order conflicts
+ doc---objects-push-order
+ doc---objects-selector
+ doc---objects-this
+ doc---objects-<to-inst>
+ doc---objects-[to-inst]
+ doc---objects-to-this
+ doc---objects-xt-new
- @c --------------------------------------------------------------
- @node Objects, OOF, Object-Oriented Terminology, Object-oriented Forth
+ @c -------------------------------------------------------------
- @subsection The @file{objects.fs} model
+ @node OOF, Mini-OOF, Objects, Object-oriented Forth
- @cindex objects
+ @subsection The @file{oof.fs} model
+ @cindex oof
  @cindex object-oriented programming
  @cindex @file{objects.fs}
  @cindex @file{oof.fs}
- This section describes the @file{objects.fs} package. This material also
+ This section describes the @file{oof.fs} package.
- has been published in M. Anton Ertl,
- @cite{@uref{http://www.complang.tuwien.ac.at/forth/objects/objects.html,
- Yet Another Forth Objects Package}}, Forth Dimensions 19(2), pages
---43.
- @c McKewan's and Zsoter's packages
- This section assumes that you have read @ref{Structures}.
- The techniques on which this model is based have been used to implement
+ The package described in this section has been used in bigFORTH since 1991, and
- the parser generator, Gray, and have also been used in Gforth for
+ used for two large applications: a chromatographic system used to
- implementing the various flavours of word lists (hashed or not,
+ create new medicaments, and a graphic user interface library (MINOS).
- case-sensitive or not, special-purpose word lists for locals etc.).
+ You can find a description (in German) of @file{oof.fs} in @cite{Object
+ oriented bigFORTH} by Bernd Paysan, published in @cite{Vierte Dimension}
+(2), 1994.
  @menu
- * Properties of the Objects model::
+ * Properties of the OOF model::
- * Basic Objects Usage::
+ * Basic OOF Usage::
- * The Objects base class::
+ * The OOF base class::
- * Creating objects::
+ * Class Declaration::
- * Object-Oriented Programming Style::
+ * Class Implementation::
- * Class Binding::
- * Method conveniences::
- * Classes and Scoping::
- * Dividing classes::
- * Object Interfaces::
- * Objects Implementation::
- * Objects Glossary::
  @end menu
- Marcel Hendrix provided helpful comments on this section. Andras Zsoter
+ @node Properties of the OOF model, Basic OOF Usage, OOF, OOF
- and Bernd Paysan helped me with the related works section.
+ @subsubsection Properties of the @file{oof.fs} model
+ @cindex @file{oof.fs} properties
- @node Properties of the Objects model, Basic Objects Usage, Objects, Objects
- @subsubsection Properties of the @file{objects.fs} model
- @cindex @file{objects.fs} properties
  @itemize @bullet
  @item
- It is straightforward to pass objects on the stack. Passing
+ This model combines object oriented programming with information
- selectors on the stack is a little less convenient, but possible.
+ hiding. It helps you write large applications, where scoping is
+ necessary, because it provides class-oriented scoping.
- @item
- Objects are just data structures in memory, and are referenced by their
- address. You can create words for objects with normal defining words
- like @code{constant}. Likewise, there is no difference between instance
- variables that contain objects and those that contain other data.
  @item
- Late binding is efficient and easy to use.
+ Named objects, object pointers, and object arrays can be created;
+ selector invocation uses the ``object selector'' syntax. Selector invocation
+ to objects and/or selectors on the stack is a bit less convenient, but
+ possible.
  @item
- It avoids parsing, and thus avoids problems with state-smartness
+ Selector invocation and instance variable usage of the active object are
- and reduced extensibility; for convenience there are a few parsing
+ straightforward, since both make use of the active object.
- words, but they have non-parsing counterparts. There are also a few
- defining words that parse. This is hard to avoid, because all standard
- defining words parse (except @code{:noname}); however, such
- words are not as bad as many other parsing words, because they are not
- state-smart.
  @item
- It does not try to incorporate everything. It does a few things and does
+ Late binding is efficient and easy to use.
- them well (IMO). In particular, this model was not designed to support
- information hiding (although it has features that may help); you can use
- a separate package for achieving this.
  @item
- It is layered; you don't have to learn and use all features to use this
+ State-smart objects parse selectors. However, extensibility is provided
- model. Only a few features are necessary (@pxref{Basic Objects Usage},
+ using a (parsing) selector @code{postpone} and a selector @code{'}.
- @pxref{The Objects base class}, @pxref{Creating objects}.), the others
- are optional and independent of each other.
  @item
  An implementation in ANS Forth is available.
- Line 10467  An implementation in ANS Forth is availa
+ Line 10499  An implementation in ANS Forth is availa
  @end itemize
- @node Basic Objects Usage, The Objects base class, Properties of the Objects model, Objects
+ @node Basic OOF Usage, The OOF base class, Properties of the OOF model, OOF
- @subsubsection Basic @file{objects.fs} Usage
+ @subsubsection Basic @file{oof.fs} Usage
- @cindex basic objects usage
+ @cindex @file{oof.fs} usage
- @cindex objects, basic usage
+ This section uses the same example as for @code{objects} (@pxref{Basic Objects Usage}).
  You can define a class for graphical objects like this:
  @cindex @code{class} usage
- @cindex @code{end-class} usage
+ @cindex @code{class;} usage
- @cindex @code{selector} usage
+ @cindex @code{method} usage
  @example
- object class \ "object" is the parent class
+ object class graphical \ "object" is the parent class
-   selector draw ( x y graphical -- )
+   method draw ( x y graphical -- )
- end-class graphical
+ class;
  @end example
  This code defines a class @code{graphical} with an
- Line 10492  operation @code{draw}.  We can perform t
+ Line 10525  operation @code{draw}.  We can perform t
  @end example
  @noindent
- where @code{t-rex} is a word (say, a constant) that produces a
+ where @code{t-rex} is an object or object pointer, created with e.g.
- graphical object.
+ @code{graphical : t-rex}.
- @comment TODO add a 2nd operation eg perimeter.. and use for
- @comment a concrete example
  @cindex abstract class
  How do we create a graphical object? With the present definitions,
- Line 10509  any concrete graphical object type (C++
+ Line 10539  any concrete graphical object type (C++
  For concrete graphical objects, we define child classes of the
  class @code{graphical}, e.g.:
- @cindex @code{overrides} usage
- @cindex @code{field} usage in class definition
  @example
- graphical class \ "graphical" is the parent class
+ graphical class circle \ "graphical" is the parent class
-   cell% field circle-radius
+   cell var circle-radius
+ how:
- :noname ( x y circle -- )
+   : draw ( x y -- )
-   circle-radius @@ draw-circle ;
+     circle-radius @@ draw-circle ;
- overrides draw
- :noname ( n-radius circle -- )
-   circle-radius ! ;
- overrides construct
- end-class circle
+   : init ( n-radius -- )
+     circle-radius ! ;
+ class;
  @end example
  Here we define a class @code{circle} as a child of @code{graphical},
- with field @code{circle-radius} (which behaves just like a field
+ with a field @code{circle-radius}; it defines new methods for the
- (@pxref{Structures}); it defines (using @code{overrides}) new methods
+ selectors @code{draw} and @code{init} (@code{init} is defined in
- for the selectors @code{draw} and @code{construct} (@code{construct} is
+ @code{object}, the parent class of @code{graphical}).
- defined in @code{object}, the parent class of @code{graphical}).
- Now we can create a circle on the heap (i.e.,
+ Now we can create a circle in the dictionary with:
- @code{allocate}d memory) with:
- @cindex @code{heap-new} usage
  @example
- 50 circle heap-new constant my-circle
+ 50 circle : my-circle
  @end example
  @noindent
- @code{heap-new} invokes @code{construct}, thus
+ @code{:} invokes @code{init}, thus initializing the field
- initializing the field @code{circle-radius} with 50. We can draw
+ @code{circle-radius} with 50. We can draw this new circle at (100,100)
- this new circle at (100,100) with:
+ with:
  @example
  100 100 my-circle draw
- Line 10551  this new circle at (100,100) with:
+ Line 10573  this new circle at (100,100) with:
  @cindex selector invocation, restrictions
  @cindex class definition, restrictions
- Note: You can only invoke a selector if the object on the TOS
+ Note: You can only invoke a selector if the receiving object belongs to
- (the receiving object) belongs to the class where the selector was
+ the class where the selector was defined or one of its descendents;
- defined or one of its descendents; e.g., you can invoke
+ e.g., you can invoke @code{draw} only for objects belonging to
- @code{draw} only for objects belonging to @code{graphical}
+ @code{graphical} or its descendents (e.g., @code{circle}). The scoping
- or its descendents (e.g., @code{circle}).  Immediately before
+ mechanism will check if you try to invoke a selector that is not
- @code{end-class}, the search order has to be the same as
+ defined in this class hierarchy, so you'll get an error at compilation
- immediately after @code{class}.
+ time.
- @node The Objects base class, Creating objects, Basic Objects Usage, Objects
- @subsubsection The @file{object.fs} base class
+ @node The OOF base class, Class Declaration, Basic OOF Usage, OOF
- @cindex @code{object} class
+ @subsubsection The @file{oof.fs} base class
+ @cindex @file{oof.fs} base class
  When you define a class, you have to specify a parent class.  So how do
  you start defining classes? There is one class available from the start:
- @code{object}. It is ancestor for all classes and so is the
+ @code{object}. You have to use it as ancestor for all classes. It is the
- only class that has no parent. It has two selectors: @code{construct}
+ only class that has no parent. Classes are also objects, except that
- and @code{print}.
+ they don't have instance variables; class manipulation such as
+ inheritance or changing definitions of a class is handled through
+ selectors of the class @code{object}.
- @node Creating objects, Object-Oriented Programming Style, The Objects base class, Objects
+ @code{object} provides a number of selectors:
- @subsubsection Creating objects
- @cindex creating objects
- @cindex object creation
- @cindex object allocation options
- @cindex @code{heap-new} discussion
+ @itemize @bullet
- @cindex @code{dict-new} discussion
+ @item
- @cindex @code{construct} discussion
+ @code{class} for subclassing, @code{definitions} to add definitions
- You can create and initialize an object of a class on the heap with
+ later on, and @code{class?} to get type information (is the class a
- @code{heap-new} ( ... class -- object ) and in the dictionary
+ subclass of the class passed on the stack?).
- (allocation with @code{allot}) with @code{dict-new} (
- ... class -- object ). Both words invoke @code{construct}, which
- consumes the stack items indicated by "..." above.
- @cindex @code{init-object} discussion
+ doc---object-class
- @cindex @code{class-inst-size} discussion
+ doc---object-definitions
- If you want to allocate memory for an object yourself, you can get its
+ doc---object-class?
- alignment and size with @code{class-inst-size 2@@} ( class --
- align size ). Once you have memory for an object, you can initialize
- it with @code{init-object} ( ... class object -- );
- @code{construct} does only a part of the necessary work.
- @node Object-Oriented Programming Style, Class Binding, Creating objects, Objects
- @subsubsection Object-Oriented Programming Style
- @cindex object-oriented programming style
- @cindex programming style, object-oriented
- This section is not exhaustive.
+ @item
+ @code{init} and @code{dispose} as constructor and destructor of the
+ object. @code{init} is invoked after the object's memory is allocated,
+ while @code{dispose} also handles deallocation. Thus if you redefine
+ @code{dispose}, you have to call the parent's dispose with @code{super
+ dispose}, too.
- @cindex stack effects of selectors
+ doc---object-init
- @cindex selectors and stack effects
+ doc---object-dispose
- In general, it is a good idea to ensure that all methods for the
- same selector have the same stack effect: when you invoke a selector,
- you often have no idea which method will be invoked, so, unless all
- methods have the same stack effect, you will not know the stack effect
- of the selector invocation.
- One exception to this rule is methods for the selector
- @code{construct}. We know which method is invoked, because we
- specify the class to be constructed at the same place. Actually, I
- defined @code{construct} as a selector only to give the users a
- convenient way to specify initialization. The way it is used, a
- mechanism different from selector invocation would be more natural
- (but probably would take more code and more space to explain).
- @node Class Binding, Method conveniences, Object-Oriented Programming Style, Objects
+ @item
- @subsubsection Class Binding
+ @code{new}, @code{new[]}, @code{:}, @code{ptr}, @code{asptr}, and
- @cindex class binding
+ @code{[]} to create named and unnamed objects and object arrays or
- @cindex early binding
+ object pointers.
- @cindex late binding
+ doc---object-new
- Normal selector invocations determine the method at run-time depending
+ doc---object-new[]
- on the class of the receiving object. This run-time selection is called
+ doc---object-:
- @i{late binding}.
+ doc---object-ptr
+ doc---object-asptr
+ doc---object-[]
- Sometimes it's preferable to invoke a different method. For example,
- you might want to use the simple method for @code{print}ing
- @code{object}s instead of the possibly long-winded @code{print} method
- of the receiver class. You can achieve this by replacing the invocation
- of @code{print} with:
- @cindex @code{[bind]} usage
+ @item
- @example
+ @code{::} and @code{super} for explicit scoping. You should use explicit
- [bind] object print
+ scoping only for super classes or classes with the same set of instance
- @end example
+ variables. Explicitly-scoped selectors use early binding.
- @noindent
+ doc---object-::
- in compiled code or:
+ doc---object-super
- @cindex @code{bind} usage
- @example
- bind object print
- @end example
- @cindex class binding, alternative to
+ @item
- @noindent
+ @code{self} to get the address of the object.
- in interpreted code. Alternatively, you can define the method with a
- name (e.g., @code{print-object}), and then invoke it through the
- name. Class binding is just a (often more convenient) way to achieve
- the same effect; it avoids name clutter and allows you to invoke
- methods directly without naming them first.
- @cindex superclass binding
+ doc---object-self
- @cindex parent class binding
- A frequent use of class binding is this: When we define a method
- for a selector, we often want the method to do what the selector does
- in the parent class, and a little more. There is a special word for
- this purpose: @code{[parent]}; @code{[parent]
- @emph{selector}} is equivalent to @code{[bind] @emph{parent
- selector}}, where @code{@emph{parent}} is the parent
- class of the current class. E.g., a method definition might look like:
- @cindex @code{[parent]} usage
- @example
- :noname
-   dup [parent] foo \ do parent's foo on the receiving object
-   ... \ do some more
- ; overrides foo
- @end example
- @cindex class binding as optimization
+ @item
- In @cite{Object-oriented programming in ANS Forth} (Forth Dimensions,
+ @code{bind}, @code{bound}, @code{link}, and @code{is} to assign object
- March 1997), Andrew McKewan presents class binding as an optimization
+ pointers and instance defers.
- technique. I recommend not using it for this purpose unless you are in
- an emergency. Late binding is pretty fast with this model anyway, so the
- benefit of using class binding is small; the cost of using class binding
- where it is not appropriate is reduced maintainability.
- While we are at programming style questions: You should bind
+ doc---object-bind
- selectors only to ancestor classes of the receiving object. E.g., say,
+ doc---object-bound
- you know that the receiving object is of class @code{foo} or its
+ doc---object-link
- descendents; then you should bind only to @code{foo} and its
+ doc---object-is
- ancestors.
- @node Method conveniences, Classes and Scoping, Class Binding, Objects
- @subsubsection Method conveniences
- @cindex method conveniences
- In a method you usually access the receiving object pretty often.  If
+ @item
- you define the method as a plain colon definition (e.g., with
+ @code{'} to obtain selector tokens, @code{send} to invoke selectors
- @code{:noname}), you may have to do a lot of stack
+ from the stack, and @code{postpone} to generate selector invocation code.
- gymnastics. To avoid this, you can define the method with @code{m:
- ... ;m}. E.g., you could define the method for
+ doc---object-'
- @code{draw}ing a @code{circle} with
+ doc---object-postpone
+ @item
+ @code{with} and @code{endwith} to select the active object from the
+ stack, and enable its scope. Using @code{with} and @code{endwith}
+ also allows you to create code using selector @code{postpone} without being
+ trapped by the state-smart objects.
+ doc---object-with
+ doc---object-endwith
+ @end itemize
+ @node Class Declaration, Class Implementation, The OOF base class, OOF
+ @subsubsection Class Declaration
+ @cindex class declaration
+ @itemize @bullet
+ @item
+ Instance variables
+ doc---oof-var
+ @item
+ Object pointers
+ doc---oof-ptr
+ doc---oof-asptr
+ @item
+ Instance defers
+ doc---oof-defer
+ @item
+ Method selectors
+ doc---oof-early
+ doc---oof-method
- @cindex @code{this} usage
+ @item
- @cindex @code{m:} usage
+ Class-wide variables
- @cindex @code{;m} usage
- @example
- m: ( x y circle -- )
-   ( x y ) this circle-radius @@ draw-circle ;m
- @end example
- @cindex @code{exit} in @code{m: ... ;m}
+ doc---oof-static
- @cindex @code{exitm} discussion
- @cindex @code{catch} in @code{m: ... ;m}
- When this method is executed, the receiver object is removed from the
- stack; you can access it with @code{this} (admittedly, in this
- example the use of @code{m: ... ;m} offers no advantage). Note
- that I specify the stack effect for the whole method (i.e. including
- the receiver object), not just for the code between @code{m:}
- and @code{;m}. You cannot use @code{exit} in
- @code{m:...;m}; instead, use
- @code{exitm}.@footnote{Moreover, for any word that calls
- @code{catch} and was defined before loading
- @code{objects.fs}, you have to redefine it like I redefined
- @code{catch}: @code{: catch this >r catch r> to-this ;}}
- @cindex @code{inst-var} usage
- You will frequently use sequences of the form @code{this
- @emph{field}} (in the example above: @code{this
- circle-radius}). If you use the field only in this way, you can
- define it with @code{inst-var} and eliminate the
- @code{this} before the field name. E.g., the @code{circle}
- class above could also be defined with:
- @example
+ @item
- graphical class
+ End declaration
-   cell% inst-var radius
- m: ( x y circle -- )
+ doc---oof-how:
-   radius @@ draw-circle ;m
+ doc---oof-class;
- overrides draw
- m: ( n-radius circle -- )
-   radius ! ;m
- overrides construct
- end-class circle
+ @end itemize
- @end example
- @code{radius} can only be used in @code{circle} and its
+ @c -------------------------------------------------------------
- descendent classes and inside @code{m:...;m}.
+ @node Class Implementation,  , Class Declaration, OOF
+ @subsubsection Class Implementation
+ @cindex class implementation
- @cindex @code{inst-value} usage
+ @c -------------------------------------------------------------
- You can also define fields with @code{inst-value}, which is
+ @node Mini-OOF, Comparison with other object models, OOF, Object-oriented Forth
- to @code{inst-var} what @code{value} is to
+ @subsection The @file{mini-oof.fs} model
- @code{variable}.  You can change the value of such a field with
+ @cindex mini-oof
- @code{[to-inst]}.  E.g., we could also define the class
- @code{circle} like this:
- @example
+ Gforth's third object oriented Forth package is a 12-liner. It uses a
- graphical class
+ mixture of the @file{objects.fs} and the @file{oof.fs} syntax,
-   inst-value radius
+ and is reduced to the bare minimum of features. This is based on a posting
+ of Bernd Paysan in comp.lang.forth.
- m: ( x y circle -- )
+ @menu
-   radius draw-circle ;m
+ * Basic Mini-OOF Usage::
- overrides draw
+ * Mini-OOF Example::
+ * Mini-OOF Implementation::
+ @end menu
- m: ( n-radius circle -- )
+ @c -------------------------------------------------------------
-   [to-inst] radius ;m
+ @node Basic Mini-OOF Usage, Mini-OOF Example, Mini-OOF, Mini-OOF
- overrides construct
+ @subsubsection Basic @file{mini-oof.fs} Usage
+ @cindex mini-oof usage
- end-class circle
+ There is a base class (@code{object}, which allocates one cell for the
- @end example
+ object pointer) plus seven other words: to define a method, a variable,
+ a class; to end a class, to resolve binding, to allocate an object and
+ to compile a class method.
+ @comment TODO better description of the last one
- Finally, you can define named methods with @code{:m}.  One use of this
- feature is the definition of words that occur only in one class and are
- not intended to be overridden, but which still need method context
- (e.g., for accessing @code{inst-var}s).  Another use is for methods that
- would be bound frequently, if defined anonymously.
+ doc-object
+ doc-method
+ doc-var
+ doc-class
+ doc-end-class
+ doc-defines
+ doc-new
+ doc-::
- @node Classes and Scoping, Dividing classes, Method conveniences, Objects
- @subsubsection Classes and Scoping
- @cindex classes and scoping
- @cindex scoping and classes
- Inheritance is frequent, unlike structure extension. This exacerbates
- the problem with the field name convention (@pxref{Structure Naming
- Convention}): One always has to remember in which class the field was
- originally defined; changing a part of the class structure would require
- changes for renaming in otherwise unaffected code.
- @cindex @code{inst-var} visibility
+ @c -------------------------------------------------------------
- @cindex @code{inst-value} visibility
+ @node Mini-OOF Example, Mini-OOF Implementation, Basic Mini-OOF Usage, Mini-OOF
- To solve this problem, I added a scoping mechanism (which was not in my
+ @subsubsection Mini-OOF Example
- original charter): A field defined with @code{inst-var} (or
+ @cindex mini-oof example
- @code{inst-value}) is visible only in the class where it is defined and in
- the descendent classes of this class.  Using such fields only makes
- sense in @code{m:}-defined methods in these classes anyway.
- This scoping mechanism allows us to use the unadorned field name,
+ A short example shows how to use this package. This example, in slightly
- because name clashes with unrelated words become much less likely.
+ extended form, is supplied as @file{moof-exm.fs}.
+ @comment TODO could flesh this out with some comments from the Forthwrite article
- @cindex @code{protected} discussion
+ @example
- @cindex @code{private} discussion
+ object class
- Once we have this mechanism, we can also use it for controlling the
+   method init
- visibility of other words: All words defined after
+   method draw
- @code{protected} are visible only in the current class and its
+ end-class graphical
- descendents. @code{public} restores the compilation
+ @end example
- (i.e. @code{current}) word list that was in effect before. If you
- have several @code{protected}s without an intervening
- @code{public} or @code{set-current}, @code{public}
- will restore the compilation word list in effect before the first of
- these @code{protected}s.
- @node Dividing classes, Object Interfaces, Classes and Scoping, Objects
+ This code defines a class @code{graphical} with an
- @subsubsection Dividing classes
+ operation @code{draw}.  We can perform the operation
- @cindex Dividing classes
+ @code{draw} on any @code{graphical} object, e.g.:
- @cindex @code{methods}...@code{end-methods}
- You may want to do the definition of methods separate from the
+ @example
- definition of the class, its selectors, fields, and instance variables,
+ 100 100 t-rex draw
- i.e., separate the implementation from the definition.  You can do this
+ @end example
- in the following way:
+ where @code{t-rex} is an object or object pointer, created with e.g.
+ @code{graphical new Constant t-rex}.
+ For concrete graphical objects, we define child classes of the
+ class @code{graphical}, e.g.:
  @example
  graphical class
-   inst-value radius
+   cell var circle-radius
- end-class circle
+ end-class circle \ "graphical" is the parent class
- ... \ do some other stuff
+ :noname ( x y -- )
+   circle-radius @@ draw-circle ; circle defines draw
+ :noname ( r -- )
+   circle-radius ! ; circle defines init
+ @end example
- circle methods \ now we are ready
+ There is no implicit init method, so we have to define one. The creation
+ code of the object now has to call @code{init} explicitly.
- m: ( x y circle -- )
+ @example
-   radius draw-circle ;m
+ circle new Constant my-circle
- overrides draw
+ 50 my-circle init
+ @end example
- m: ( n-radius circle -- )
+ It is also possible to add a function to create named objects with
-   [to-inst] radius ;m
+ automatic call of @code{init}, given that all objects have @code{init}
- overrides construct
+ in the same place:
- end-methods
+ @example
+ : new: ( .. o "name" -- )
+     new dup Constant init ;
+circle new: large-circle
  @end example
- You can use several @code{methods}...@code{end-methods} sections.  The
+ We can draw this new circle at (100,100) with:
- only things you can do to the class in these sections are: defining
- methods, and overriding the class's selectors.  You must not define new
- selectors or fields.
- Note that you often have to override a selector before using it.  In
+ @example
- particular, you usually have to override @code{construct} with a new
+ 100 100 my-circle draw
- method before you can invoke @code{heap-new} and friends.  E.g., you
+ @end example
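+ The pieces of this example can be collected into one file and tried
+ out. The following is only a sketch: it assumes that
+ @file{mini-oof.fs} can be loaded with @code{require}, and it supplies
+ a made-up stand-in for @code{draw-circle} that merely prints its
+ arguments instead of drawing anything. The stack comments include the
+ object that every method receives on top of the stack.
+ @example
+ require mini-oof.fs
+ \ hypothetical stand-in for a real drawing routine
+ : draw-circle ( x y r -- ) ." circle r=" . ." at " swap . . ;
+ object class
+   method init ( r o -- )
+   method draw ( x y o -- )
+ end-class graphical
+ graphical class
+   cell var circle-radius
+ end-class circle
+ :noname ( x y o -- ) circle-radius @@ draw-circle ; circle defines draw
+ :noname ( r o -- )   circle-radius ! ;             circle defines init
+ circle new Constant my-circle
+ 50 my-circle init
+ 100 100 my-circle draw    \ prints: circle r=50 at 100 100
+ @end example
+ Both methods must be bound with @code{defines} before they are
+ invoked; until then the corresponding slots hold @code{noop}, which
+ would leave the object on the stack.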
- must not create a circle before the @code{overrides construct} sequence
- in the example above.
- @node Object Interfaces, Objects Implementation, Dividing classes, Objects
+ @node Mini-OOF Implementation,  , Mini-OOF Example, Mini-OOF
- @subsubsection Object Interfaces
+ @subsubsection @file{mini-oof.fs} Implementation
- @cindex object interfaces
- @cindex interfaces for objects
- In this model you can only call selectors defined in the class of the
+ Object-oriented systems with late binding typically use a
- receiving objects or in one of its ancestors. If you call a selector
+ ``vtable''-approach: the first variable in each object is a pointer to a
- with a receiving object that is not in one of these classes, the
+ table, which contains the methods as function pointers. The vtable
- result is undefined; if you are lucky, the program crashes
+ may also contain other information.
- immediately.
- @cindex selectors common to hardly-related classes
+ So first, let's declare methods:
- Now consider the case when you want to have a selector (or several)
- available in two classes: You would have to add the selector to a
- common ancestor class, in the worst case to @code{object}. You
- may not want to do this, e.g., because someone else is responsible for
- this ancestor class.
- The solution for this problem is interfaces. An interface is a
+ @example
- collection of selectors. If a class implements an interface, the
+ : method ( m v -- m' v ) Create  over , swap cell+ swap
- selectors become available to the class and its descendents. A class
+   DOES> ( ... o -- ... ) @@ over @@ + @@ execute ;
- can implement an unlimited number of interfaces. For the problem
+ @end example
- discussed above, we would define an interface for the selector(s), and
- both classes would implement the interface.
+ During method declaration, the number of methods and instance
+ variables is on the stack (in address units). @code{method} creates
+ one method and increments the method number. To execute a method, it
+ takes the object, fetches the vtable pointer, adds the offset, and
+ executes the @i{xt} stored there. Each method takes the object it is
+ invoked on as its top-of-stack parameter. The method itself should
+ consume that object.
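+ The dispatch performed by the @code{DOES>} part above can be tried in
+ isolation. The following is a minimal sketch, not part of
+ @file{mini-oof.fs}; the names @code{hello}, @code{demo-class},
+ @code{demo-obj} and @code{send} are made up for illustration, and the
+ class layout (instance size, method table size, then the xts) is
+ assumed to match the base object defined further down.
+ @example
+ : hello ( o -- ) drop ." hello" ;   \ a method; it consumes the object
+ \ class: instance size, method table size, one xt at offset 2 cells
+ Create demo-class  1 cells , 3 cells , ' hello ,
+ \ object: its first cell points to the class
+ Create demo-obj  demo-class ,
+ \ what a method word does after fetching its own offset
+ : send ( ... o offset -- ... ) over @@ + @@ execute ;
+ demo-obj 2 cells send    \ prints hello
+ @end example
+ @code{send} takes the offset from the stack, whereas a word created
+ by @code{method} fetches it from its own body; the remaining steps are
+ the same.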
+ Now, we also have to declare instance variables:
+ @example
+ : var ( m v size -- m v' ) Create  over , +
+   DOES> ( o -- addr ) @@ + ;
+ @end example
+ As before, a word is created with the current offset. Instance
+ variables can have different sizes (cells, floats, doubles, chars), so
+ all we do is take the size and add it to the offset. If your machine
+ has alignment restrictions, put the proper @code{aligned} or
+ @code{faligned} before the variable, to adjust the variable
+ offset. That's why it is on the top of stack.
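+ As a sketch of the alignment advice, assuming that
+ @file{mini-oof.fs} can be loaded with @code{require} and using the
+ made-up names @code{tally}, @code{tally-flag} and @code{tally-count}:
+ @example
+ require mini-oof.fs
+ object class
+   1 chars var tally-flag        \ offset is now one char past a cell boundary
+   aligned cell var tally-count  \ round the offset up before the cell
+ end-class tally
+ @end example
+ Here @code{aligned} is applied to the variable offset on top of the
+ stack, so @code{tally-count} starts at a cell boundary relative to the
+ start of the object.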
- As an example, consider an interface @code{storage} for
+ We need a starting point (the base object) and some syntactic sugar:
- writing objects to disk and getting them back, and a class
- @code{foo} that implements it. The code would look like this:
- @cindex @code{interface} usage
- @cindex @code{end-interface} usage
- @cindex @code{implementation} usage
  @example
- interface
+ Create object  1 cells , 2 cells ,
-   selector write ( file object -- )
+ : class ( class -- class methods vars ) dup 2@@ ;
-   selector read1 ( file object -- )
+ @end example
- end-interface storage
- bar class
+ For inheritance, the vtable of the parent object has to be
-   storage implementation
+ copied when a new, derived class is declared. This gives all the
+ methods of the parent class, which can be overridden, though.
- ... overrides write
+ @example
- ... overrides read1
+ : end-class  ( class methods vars -- )
- ...
+   Create  here >r , dup , 2 cells ?DO ['] noop , 1 cells +LOOP
- end-class foo
+   cell+ dup cell+ r> rot @@ 2 cells /string move ;
  @end example
- @noindent
+ The first line creates the vtable, initialized with
- (I would add a word @code{read} @i{( file -- object )} that uses
+ @code{noop}s. The second line is the inheritance mechanism; it
- @code{read1} internally, but that's beyond the point illustrated
+ copies the xts from the parent vtable.
- here.)
- Note that you cannot use @code{protected} in an interface; and
- of course you cannot define fields.
- In the Neon model, all selectors are available for all classes;
+ We still have no way to define new methods; let's do that now:
- therefore it does not need interfaces. The price you pay in this model
- is slower late binding, and therefore, added complexity to avoid late
- binding.
- @node Objects Implementation, Objects Glossary, Object Interfaces, Objects
+ @example
- @subsubsection @file{objects.fs} Implementation
+ : defines ( xt class -- ) ' >body @@ + ! ;
- @cindex @file{objects.fs} implementation
+ @end example
- @cindex @code{object-map} discussion
+ To allocate a new object, we need a word, too:
- An object is a piece of memory, like one of the data structures
- described with @code{struct...end-struct}. It has a field
- @code{object-map} that points to the method map for the object's
- class.
- @cindex method map
+ @example
- @cindex virtual function table
+ : new ( class -- o )  here over @@ allot swap over ! ;
- The @emph{method map}@footnote{This is Self terminology; in C++
+ @end example
- terminology: virtual function table.} is an array that contains the
- execution tokens (@i{xt}s) of the methods for the object's class. Each
- selector contains an offset into a method map.
- @cindex @code{selector} implementation, class
+ Sometimes derived classes want to access the method of the
- @code{selector} is a defining word that uses
+ parent object. There are two ways to achieve this with Mini-OOF:
- @code{CREATE} and @code{DOES>}. The body of the
+ first, you could use named words, and second, you could look up the
- selector contains the offset; the @code{DOES>} action for a
+ vtable of the parent object.
- class selector is, basically:
  @example
- ( object addr ) @@ over object-map @@ + @@ execute
+ : :: ( class "name" -- ) ' >body @@ + @@ compile, ;
  @end example
- Since @code{object-map} is the first field of the object, it
- does not generate any code. As you can see, calling a selector has a
- small, constant cost.
- @cindex @code{current-interface} discussion
+ Nothing can be more confusing than a good example, so here is
- @cindex class implementation and representation
+ one. First let's declare a text object (called
- A class is basically a @code{struct} combined with a method
+ @code{button}), that stores text and position:
- map. During the class definition the alignment and size of the class
- are passed on the stack, just as with @code{struct}s, so
- @code{field} can also be used for defining class
- fields. However, passing more items on the stack would be
- inconvenient, so @code{class} builds a data structure in memory,
- which is accessed through the variable
- @code{current-interface}. After its definition is complete, the
- class is represented on the stack by a pointer (e.g., as parameter for
- a child class definition).
- A new class starts off with the alignment and size of its parent,
+ @example
- and a copy of the parent's method map. Defining new fields extends the
+ object class
- size and alignment; likewise, defining new selectors extends the
+   cell var text
- method map. @code{overrides} just stores a new @i{xt} in the method
+   cell var len
- map at the offset given by the selector.
+   cell var x
+   cell var y
+   method init
+   method draw
+ end-class button
+ @end example
- @cindex class binding, implementation
+ @noindent
- Class binding just gets the @i{xt} at the offset given by the selector
+ Now, implement the two methods, @code{draw} and @code{init}:
- from the class's method map and @code{compile,}s (in the case of
- @code{[bind]}) it.
- @cindex @code{this} implementation
+ @example
- @cindex @code{catch} and @code{this}
+ :noname ( o -- )
- @cindex @code{this} and @code{catch}
+  >r r@@ x @@ r@@ y @@ at-xy  r@@ text @@ r> len @@ type ;
- I implemented @code{this} as a @code{value}. At the
+  button defines draw
- start of an @code{m:...;m} method the old @code{this} is
+ :noname ( addr u o -- )
- stored to the return stack and restored at the end; and the object on
+  >r 0 r@@ x ! 0 r@@ y ! r@@ len ! r> text ! ;
- the TOS is stored @code{TO this}. This technique has one
+  button defines init
- disadvantage: If the user does not leave the method via
+ @end example
- @code{;m}, but via @code{throw} or @code{exit},
- @code{this} is not restored (and @code{exit} may
+ @noindent
- crash). To deal with the @code{throw} problem, I have redefined
+ To demonstrate inheritance, we define a class @code{bold-button}, with no
- @code{catch} to save and restore @code{this}; the same
+ new data and no new methods:
- should be done with any word that can catch an exception. As for
- @code{exit}, I simply forbid it (as a replacement, there is
- @code{exitm}).
- @cindex @code{inst-var} implementation
- @code{inst-var} is just the same as @code{field}, with
- a different @code{DOES>} action:
  @example
- @@ this +
+ button class
+ end-class bold-button
+ : bold   27 emit ." [1m" ;
+ : normal 27 emit ." [0m" ;
  @end example
- Similar for @code{inst-value}.
- @cindex class scoping implementation
+ @noindent
- Each class also has a word list that contains the words defined with
+ The class @code{bold-button} has a different draw method to
- @code{inst-var} and @code{inst-value}, and its protected
+ @code{button}, but the new method is defined in terms of the draw method
- words. It also has a pointer to its parent. @code{class} pushes
+ for @code{button}:
- the word lists of the class and all its ancestors onto the search order stack,
- and @code{end-class} drops them.
- @cindex interface implementation
+ @example
- An interface is like a class without fields, parent and protected
+ :noname bold [ button :: draw ] normal ; bold-button defines draw
- words; i.e., it just has a method map. If a class implements an
+ @end example
- interface, its method map contains a pointer to the method map of the
- interface. The positive offsets in the map are reserved for class
- methods, therefore interface map pointers have negative
- offsets. Interfaces have offsets that are unique throughout the
- system, unlike class selectors, whose offsets are only unique for the
- classes where the selector is available (invokable).
- This structure means that interface selectors have to perform one
+ @noindent
- indirection more than class selectors to find their method. Their body
+ Finally, create two objects and apply methods:
- contains the interface map pointer offset in the class method map, and
- the method offset in the interface method map. The
- @code{does>} action for an interface selector is, basically:
  @example
- ( object selector-body )
+ button new Constant foo
-dup selector-interface @@ ( object selector-body object interface-offset )
+ s" thin foo" foo init
- swap object-map @@ + @@ ( object selector-body map )
+ page
- swap selector-offset @@ + @@ execute
+ foo draw
+ bold-button new Constant bar
+ s" fat bar" bar init
+ 1 bar y !
+ bar draw
  @end example
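+ @noindent
+ As a further, hedged sketch of the alignment advice given for @code{var}: the class below mixes a char-sized and a cell-sized instance variable. The names @code{counter}, @code{tag}, @code{hits} and @code{c1} are hypothetical, and the code assumes that the definitions shown above are loaded (here from @file{mini-oof.fs}).
+ @example
+ require mini-oof.fs
+ object class
+   1 chars var tag   \ one char; the variable offset is now unaligned
+   aligned           \ round the offset (top of stack) up to a cell boundary
+   cell var hits
+ end-class counter
+ counter new Constant c1
+ 0 c1 hits !         \ store to the aligned cell-sized variable
+ @end example
+ @noindent
+ Without the @code{aligned}, @code{hits} would start at an offset of one cell plus one char, which may fault on machines with alignment restrictions.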
- where @code{object-map} and @code{selector-offset} are
- first fields and generate no code.
- As a concrete example, consider the following code:
+ @node Comparison with other object models,  , Mini-OOF, Object-oriented Forth
+ @subsection Comparison with other object models
+ @cindex comparison of object models
+ @cindex object models, comparison
- @example
+ Many object-oriented Forth extensions have been proposed (@cite{A survey
- interface
+ of object-oriented Forths} (SIGPLAN Notices, April 1996) by Bradford
-   selector if1sel1
+ J. Rodriguez and W. F. S. Poehlman lists 17). This section discusses the
-   selector if1sel2
+ relation of the object models described here to two well-known and two
- end-interface if1
+ closely-related (by the use of method maps) models.  Andras Zsoter
+ helped us with this section.
- object class
+ @cindex Neon model
-   if1 implementation
+ The most popular model currently seems to be the Neon model (see
-   selector cl1sel1
+ @cite{Object-oriented programming in ANS Forth} (Forth Dimensions, March
-   cell% inst-var cl1iv1
+) by Andrew McKewan) but this model has a number of limitations
+ @footnote{A longer version of this critique can be
+ found in @cite{On Standardizing Object-Oriented Forth Extensions} (Forth
+ Dimensions, May 1997) by Anton Ertl.}:
- ' m1 overrides construct
+ @itemize @bullet
- ' m2 overrides if1sel1
+ @item
- ' m3 overrides if1sel2
+ It uses a @code{@emph{selector object}} syntax, which makes it unnatural
- ' m4 overrides cl1sel2
+ to pass objects on the stack.
- end-class cl1
- create obj1 object dict-new drop
+ @item
- create obj2 cl1    dict-new drop
+ It requires that the selector parses the input stream (at
- @end example
+ compile time); this leads to reduced extensibility and to bugs that are
+ hard to find.
- The data structure created by this code (including the data structure
+ @item
- for @code{object}) is shown in the <a
+ It allows using every selector on every object;
- href="objects-implementation.eps">figure</a>, assuming a cell size of 4.
+ this eliminates the need for classes, but makes it harder to create
- @comment TODO add this diagram..
+ efficient implementations.
+ @end itemize
- @node Objects Glossary,  , Objects Implementation, Objects
+ @cindex Pountain's object-oriented model
- @subsubsection @file{objects.fs} Glossary
+ Another well-known publication is @cite{Object-Oriented Forth} (Academic
- @cindex @file{objects.fs} Glossary
+ Press, London, 1987) by Dick Pountain. However, it is not really about
+ object-oriented programming, because it hardly deals with late
+ binding. Instead, it focuses on features like information hiding and
+ overloading that are characteristic of modular languages like Ada (83).
+ @cindex Zsoter's object-oriented model
+ In @cite{Does late binding have to be slow?} (Forth Dimensions 18(1)
+, pages 31-35) Andras Zsoter describes a model that makes heavy use
+ of an active object (like @code{this} in @file{objects.fs}): The active
+ object is not only used for accessing all fields, but also specifies the
+ receiving object of every selector invocation; you have to change the
+ active object explicitly with @code{@{ ... @}}, whereas in
+ @file{objects.fs} it changes more or less implicitly at @code{m:
+ ... ;m}. Such a change at the method entry point is unnecessary with
+ Zsoter's model, because the receiving object is the active object
+ already. On the other hand, the explicit change is absolutely necessary
+ in that model, because otherwise no one could ever change the active
+ object. An ANS Forth implementation of this model is available at
+ @uref{http://www.forth.org/fig/oopf.html}.
+ @cindex @file{oof.fs}, differences to other models
+ The @file{oof.fs} model combines information hiding and overloading
+ resolution (by keeping names in various word lists) with object-oriented
+ programming. It sets the active object implicitly on method entry, but
+ also allows explicit changing (with @code{>o...o>} or with
+ @code{with...endwith}). It uses parsing and state-smart objects and
+ classes for resolving overloading and for early binding: the object or
+ class parses the selector and determines the method from this. If the
+ selector is not parsed by an object or class, it performs a call to the
+ selector for the active object (late binding), like Zsoter's model.
+ Fields are always accessed through the active object. The big
+ disadvantage of this model is the parsing and the state-smartness, which
+ reduces extensibility and increases the opportunities for subtle bugs;
+ essentially, you are only safe if you never tick or @code{postpone} an
+ object or class (Bernd disagrees, but I (Anton) am not convinced).
- doc---objects-bind
+ @cindex @file{mini-oof.fs}, differences to other models
- doc---objects-<bind>
+ The @file{mini-oof.fs} model is quite similar to a very stripped-down
- doc---objects-bind'
+ version of the @file{objects.fs} model, but syntactically it is a
- doc---objects-[bind]
+ mixture of the @file{objects.fs} and @file{oof.fs} models.
- doc---objects-class
- doc---objects-class->map
- doc---objects-class-inst-size
- doc---objects-class-override!
- doc---objects-construct
- doc---objects-current'
- doc---objects-[current]
- doc---objects-current-interface
- doc---objects-dict-new
- doc---objects-drop-order
- doc---objects-end-class
- doc---objects-end-class-noname
- doc---objects-end-interface
- doc---objects-end-interface-noname
- doc---objects-end-methods
- doc---objects-exitm
- doc---objects-heap-new
- doc---objects-implementation
- doc---objects-init-object
- doc---objects-inst-value
- doc---objects-inst-var
- doc---objects-interface
- doc---objects-m:
- doc---objects-:m
- doc---objects-;m
- doc---objects-method
- doc---objects-methods
- doc---objects-object
- doc---objects-overrides
- doc---objects-[parent]
- doc---objects-print
- doc---objects-protected
- doc---objects-public
- doc---objects-push-order
- doc---objects-selector
- doc---objects-this
- doc---objects-<to-inst>
- doc---objects-[to-inst]
- doc---objects-to-this
- doc---objects-xt-new
  @c -------------------------------------------------------------
- @node OOF, Mini-OOF, Objects, Object-oriented Forth
+ @node Programming Tools, Assembler and Code Words, Object-oriented Forth, Words
- @subsection The @file{oof.fs} model
+ @section Programming Tools
- @cindex oof
+ @cindex programming tools
- @cindex object-oriented programming
- @cindex @file{objects.fs}
+ @c !! move this and assembler down below OO stuff.
- @cindex @file{oof.fs}
- This section describes the @file{oof.fs} package.
+ @menu
+ * Examining::
+ * Forgetting words::
+ * Debugging::                   Simple and quick.
+ * Assertions::                  Making your programs self-checking.
+ * Singlestep Debugger::         Executing your program word by word.
+ @end menu
- The package described in this section has been used in bigFORTH since 1991, and
+ @node Examining, Forgetting words, Programming Tools, Programming Tools
- used for two large applications: a chromatographic system used to
+ @subsection Examining data and code
- create new medicaments, and a graphic user interface library (MINOS).
+ @cindex examining data and code
+ @cindex data examination
+ @cindex code examination
- You can find a description (in German) of @file{oof.fs} in @cite{Object
+ The following words inspect the stack non-destructively:
- oriented bigFORTH} by Bernd Paysan, published in @cite{Vierte Dimension}
-(2), 1994.
- @menu
+ doc-.s
- * Properties of the OOF model::
+ doc-f.s
- * Basic OOF Usage::
- * The OOF base class::
- * Class Declaration::
- * Class Implementation::
- @end menu
- @node Properties of the OOF model, Basic OOF Usage, OOF, OOF
+ There is a word @code{.r} but it does @i{not} display the return stack!
- @subsubsection Properties of the @file{oof.fs} model
+ It is used for formatted numeric output (@pxref{Simple numeric output}).
- @cindex @file{oof.fs} properties
- @itemize @bullet
+ doc-depth
- @item
+ doc-fdepth
- This model combines object oriented programming with information
+ doc-clearstack
- hiding. It helps you write large applications, where scoping is
- necessary, because it provides class-oriented scoping.
- @item
+ The following words inspect memory.
- Named objects, object pointers, and object arrays can be created,
- selector invocation uses the ``object selector'' syntax. Selector invocation
- to objects and/or selectors on the stack is a bit less convenient, but
- possible.
- @item
+ doc-?
- Selector invocation and instance variable usage of the active object is
+ doc-dump
- straightforward, since both make use of the active object.
- @item
+ And finally, @code{see} allows you to inspect code:
- Late binding is efficient and easy to use.
- @item
+ doc-see
- State-smart objects parse selectors. However, extensibility is provided
+ doc-xt-see
- using a (parsing) selector @code{postpone} and a selector @code{'}.
- @item
+ @node Forgetting words, Debugging, Examining, Programming Tools
- An implementation in ANS Forth is available.
+ @subsection Forgetting words
+ @cindex words, forgetting
+ @cindex forgetting words
- @end itemize
+ @c  anton: other, maybe better places for this subsection: Defining Words;
+ @c  Dictionary allocation.  At least a reference should be there.
+ Forth allows you to forget words (and everything that was allotted in the
+ dictionary after them) in a LIFO manner.
- @node Basic OOF Usage, The OOF base class, Properties of the OOF model, OOF
+ doc-marker
- @subsubsection Basic @file{oof.fs} Usage
- @cindex @file{oof.fs} usage
- This section uses the same example as for @code{objects} (@pxref{Basic Objects Usage}).
+ The most common use of this feature is during program development: when
+ you change a source file, forget all the words it defined and load it
+ again (since you also forget everything defined after the source file
+ was loaded, you have to reload that, too).  Note that effects like
+ storing to variables and destroyed system words are not undone when you
+ forget words.  With a system like Gforth, that is fast enough at
+ starting up and compiling, I find it more convenient to exit and restart
+ Gforth, as this gives me a clean slate.
- You can define a class for graphical objects like this:
+ Here's an example of using @code{marker} at the start of a source file
+ that you are debugging; it ensures that you only ever have one copy of
+ the file's definitions compiled at any time:
- @cindex @code{class} usage
- @cindex @code{class;} usage
- @cindex @code{method} usage
  @example
- object class graphical \ "object" is the parent class
+ [IFDEF] my-code
-   method draw ( x y graphical -- )
+     my-code
- class;
+ [ENDIF]
- @end example
- This code defines a class @code{graphical} with an
+ marker my-code
- operation @code{draw}.  We can perform the operation
+ init-included-files
- @code{draw} on any @code{graphical} object, e.g.:
- @example
+ \ .. definitions start here
- 100 100 t-rex draw
+ \ .
+ \ .
+ \ end
  @end example
- @noindent
- where @code{t-rex} is an object or object pointer, created with e.g.
- @code{graphical : t-rex}.
- @cindex abstract class
+ @node Debugging, Assertions, Forgetting words, Programming Tools
- How do we create a graphical object? With the present definitions,
+ @subsection Debugging
- we cannot create a useful graphical object. The class
+ @cindex debugging
- @code{graphical} describes graphical objects in general, but not
- any concrete graphical object type (C++ users would call it an
- @emph{abstract class}); e.g., there is no method for the selector
- @code{draw} in the class @code{graphical}.
- For concrete graphical objects, we define child classes of the
+ Languages with a slow edit/compile/link/test development loop tend to
- class @code{graphical}, e.g.:
+ require sophisticated tracing/stepping debuggers to facilitate debugging.
- @example
+ A much better (faster) way in fast-compiling languages is to add
- graphical class circle \ "graphical" is the parent class
+ printing code at well-selected places, let the program run, look at
-   cell var circle-radius
+ the output, see where things went wrong, add more printing code, etc.,
- how:
+ until the bug is found.
-   : draw ( x y -- )
-     circle-radius @@ draw-circle ;
-   : init ( n-radius -- )
+ The simple debugging aids provided in @file{debugs.fs}
-     circle-radius ! ;
+ are meant to support this style of debugging.
- class;
- @end example
- Here we define a class @code{circle} as a child of @code{graphical},
+ The word @code{~~} prints debugging information (by default the source
- with a field @code{circle-radius}; it defines new methods for the
+ location and the stack contents). It is easy to insert. If you use Emacs
- selectors @code{draw} and @code{init} (@code{init} is defined in
+ it is also easy to remove (@kbd{C-x ~} in the Emacs Forth mode to
- @code{object}, the parent class of @code{graphical}).
+ query-replace them with nothing). The deferred words
+ @code{printdebugdata} and @code{printdebugline} control the output of
+ @code{~~}. The default source location output format works well with
+ Emacs' compilation mode, so you can step through the program at the
+ source level using @kbd{C-x `} (the advantage over a stepping debugger
+ is that you can step in any direction and you know where the crash has
+ happened or where the strange data has occurred).
- Now we can create a circle in the dictionary with:
+ doc-~~
+ doc-printdebugdata
+ doc-printdebugline
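+ As an illustration of this style of debugging, here is a minimal sketch with a hypothetical word @code{square}. Each @code{~~} reports its source location and the stack contents when it is executed; the exact output format depends on the installation and on the settings of the deferred words described above.
+ @example
+ : square ( n -- n*n )
+   ~~ dup * ~~ ;   \ one report before and one after the computation
+ 3 square .
+ @end example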
+ @node Assertions, Singlestep Debugger, Debugging, Programming Tools
+ @subsection Assertions
+ @cindex assertions
+ It is a good idea to make your programs self-checking, especially if you
+ make an assumption that may become invalid during maintenance (for
+ example, that a certain field of a data structure is never zero). Gforth
+ supports @dfn{assertions} for this purpose. They are used like this:
  @example
- 50 circle : my-circle
+ assert( @i{flag} )
  @end example
- @noindent
+ The code between @code{assert(} and @code{)} should compute a flag that
- @code{:} invokes @code{init}, thus initializing the field
+ should be true if everything is alright and false otherwise. It should
- @code{circle-radius} with 50. We can draw this new circle at (100,100)
+ not change anything else on the stack. The overall stack effect of the
- with:
+ assertion is @code{( -- )}. E.g.
  @example
- 100 100 my-circle draw
+ assert( 1 1 + 2 = ) \ what we learn in school
+ assert( dup 0<> ) \ assert that the top of stack is not zero
+ assert( false ) \ this code should not be reached
  @end example
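+ As a small sketch of an assertion inside a definition (the word @code{safe-div} is hypothetical): the assertion checks the divisor and leaves the stack exactly as it found it.
+ @example
+ : safe-div ( n1 n2 -- n3 )
+   assert( dup 0<> )   \ the divisor must not be zero
+   / ;
+ 10 2 safe-div .       \ passes the assertion
+ @end example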
- @cindex selector invocation, restrictions
+ The need for assertions is different at different times. During
- @cindex class definition, restrictions
+ debugging, we want more checking, in production we sometimes care more
- Note: You can only invoke a selector if the receiving object belongs to
+ for speed. Therefore, assertions can be turned off, i.e., the assertion
- the class where the selector was defined or one of its descendents;
+ becomes a comment. Depending on the importance of an assertion and the
- e.g., you can invoke @code{draw} only for objects belonging to
+ time it takes to check it, you may want to turn off some assertions and
- @code{graphical} or its descendents (e.g., @code{circle}). The scoping
+ keep others turned on. Gforth provides several levels of assertions for
- mechanism will check if you try to invoke a selector that is not
+ this purpose:
- defined in this class hierarchy, so you'll get an error at compilation
- time.
- @node The OOF base class, Class Declaration, Basic OOF Usage, OOF
+ doc-assert0(
- @subsubsection The @file{oof.fs} base class
+ doc-assert1(
- @cindex @file{oof.fs} base class
+ doc-assert2(
+ doc-assert3(
+ doc-assert(
+ doc-)
- When you define a class, you have to specify a parent class.  So how do
- you start defining classes? There is one class available from the start:
- @code{object}. You have to use it as ancestor for all classes. It is the
- only class that has no parent. Classes are also objects, except that
- they don't have instance variables; class manipulation such as
- inheritance or changing definitions of a class is handled through
- selectors of the class @code{object}.
- @code{object} provides a number of selectors:
+ The variable @code{assert-level} specifies the highest assertions that
+ are turned on. I.e., at the default @code{assert-level} of one,
+ @code{assert0(} and @code{assert1(} assertions perform checking, while
+ @code{assert2(} and @code{assert3(} assertions are treated as comments.
- @itemize @bullet
+ The value of @code{assert-level} is evaluated at compile-time, not at
- @item
+ run-time. Therefore you cannot turn assertions on or off at run-time;
- @code{class} for subclassing, @code{definitions} to add definitions
+ you have to set the @code{assert-level} appropriately before compiling a
- later on, and @code{class?} to get type information (is the class a
+ piece of code. You can compile different pieces of code at different
- subclass of the class passed on the stack?).
+ @code{assert-level}s (e.g., a trusted library at level 1 and
+ newly-written code at level 3).
- doc---object-class
- doc---object-definitions
+ doc-assert-level
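+ A minimal sketch of compiling code at different assertion levels, assuming the default level of one described above; the words @code{checked} and @code{unchecked} are hypothetical.
+ @example
+ 3 assert-level !   \ compile the following with all assertions active
+ : checked ( n -- n+1 )  assert3( dup 0>= )  1+ ;
+ 1 assert-level !   \ back to the default level
+ : unchecked ( n -- n+1 )  assert3( dup 0>= )  1+ ;  \ assert3( is now a comment
+ @end example
+ @noindent
+ With these definitions, @code{-1 unchecked} simply returns 0, whereas @code{-1 checked} fails its assertion, whatever the value of @code{assert-level} is at run-time.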
- doc---object-class?
- @item
+ If an assertion fails, a message compatible with Emacs' compilation mode
- @code{init} and @code{dispose} as constructor and destructor of the
+ is produced and the execution is aborted (currently with @code{ABORT"}.
- object. @code{init} is invoked after the object's memory is allocated,
+ If there is interest, we will introduce a special throw code. But if you
- while @code{dispose} also handles deallocation. Thus if you redefine
+ intend to @code{catch} a specific condition, using @code{throw} is
- @code{dispose}, you have to call the parent's dispose with @code{super
+ probably more appropriate than an assertion).
- dispose}, too.
- doc---object-init
+ Definitions in ANS Forth for these assertion words are provided
- doc---object-dispose
+ in @file{compat/assert.fs}.
- @item
+ @node Singlestep Debugger,  , Assertions, Programming Tools
- @code{new}, @code{new[]}, @code{:}, @code{ptr}, @code{asptr}, and
+ @subsection Singlestep Debugger
- @code{[]} to create named and unnamed objects and object arrays or
+ @cindex singlestep Debugger
- object pointers.
+ @cindex debugging Singlestep
- doc---object-new
+ When you create a new word there's often the need to check whether it
- doc---object-new[]
+ behaves correctly or not. You can do this by typing @code{dbg
- doc---object-:
+ badword}. A debug session might look like this:
- doc---object-ptr
- doc---object-asptr
- doc---object-[]
+ @example
+ : badword 0 DO i . LOOP ;  ok
+ 2 dbg badword
+ : badword
+ Scanning code...
- @item
+ Nesting debugger ready!
- @code{::} and @code{super} for explicit scoping. You should use explicit
- scoping only for super classes or classes with the same set of instance
- variables. Explicitly-scoped selectors use early binding.
- doc---object-::
+D4738  8049BC4 0              -> [ 2 ] 00002 00000
- doc---object-super
+D4740  8049F68 DO             -> [ 0 ]
+D4744  804A0C8 i              -> [ 1 ] 00000
+D4748 400C5E60 .              -> 0 [ 0 ]
+D474C  8049D0C LOOP           -> [ 0 ]
+D4744  804A0C8 i              -> [ 1 ] 00001
+D4748 400C5E60 .              -> 1 [ 0 ]
+D474C  8049D0C LOOP           -> [ 0 ]
+D4758  804B384 ;              ->  ok
+ @end example
+ Each line displayed is one step. You always have to hit return to
+ execute the next word that is displayed. If you don't want to execute
+ the next word as a whole, you have to type @kbd{n} for @code{nest}. Here is
+ an overview of the available keys:
- @item
+ @table @i
- @code{self} to get the address of the object
- doc---object-self
+ @item @key{RET}
+ Next; Execute the next word.
+ @item n
+ Nest; Single step through next word.
- @item
+ @item u
- @code{bind}, @code{bound}, @code{link}, and @code{is} to assign object
+ Unnest; Stop debugging and execute rest of word. If we got to this word
- pointers and instance defers.
+ with nest, continue debugging with the calling word.
- doc---object-bind
+ @item d
- doc---object-bound
+ Done; Stop debugging and execute rest.
- doc---object-link
- doc---object-is
+ @item s
+ Stop; Abort immediately.
- @item
+ @end table
- @code{'} to obtain selector tokens, @code{send} to invoke selectors
- from the stack, and @code{postpone} to generate selector invocation code.
- doc---object-'
+ Debugging large applications with this mechanism is very difficult, because
- doc---object-postpone
+ you have to nest very deeply into the program before the interesting part
+ begins. This takes a lot of time.
+ To do it more directly put a @code{BREAK:} command into your source code.
+ When program execution reaches @code{BREAK:} the single step debugger is
+ invoked and you have all the features described above.
- @item
+ If you have more than one part to debug it is useful to know where the
- @code{with} and @code{endwith} to select the active object from the
+ program has stopped at the moment. You can do this by the
- stack, and enable its scope. Using @code{with} and @code{endwith}
+ @code{BREAK" string"} command. This behaves like @code{BREAK:} except that
- also allows you to create code using selector @code{postpone} without being
+ string is typed out when the ``breakpoint'' is reached.
- trapped by the state-smart objects.
- doc---object-with
- doc---object-endwith
+ doc-dbg
+ doc-break:
+ doc-break"
- @end itemize
- @node Class Declaration, Class Implementation, The OOF base class, OOF
- @subsubsection Class Declaration
- @cindex class declaration
- @itemize @bullet
+ @c -------------------------------------------------------------
- @item
+ @node Assembler and Code Words, Threading Words, Programming Tools, Words
- Instance variables
+ @section Assembler and Code Words
+ @cindex assembler
+ @cindex code words
- doc---oof-var
+ @menu
+ * Code and ;code::
+ * Common Assembler::            Assembler Syntax
+ * Common Disassembler::
+ * 386 Assembler::               Deviations and special cases
+ * Alpha Assembler::             Deviations and special cases
+ * MIPS assembler::              Deviations and special cases
+ * Other assemblers::            How to write them
+ @end menu
+ @node Code and ;code, Common Assembler, Assembler and Code Words, Assembler and Code Words
+ @subsection @code{Code} and @code{;code}
- @item
+ Gforth provides some words for defining primitives (words written in
- Object pointers
+ machine code), and for defining the machine-code equivalent of
+ @code{DOES>}-based defining words. However, the machine-independent
+ nature of Gforth poses a few problems: First of all, Gforth runs on
+ several architectures, so it can provide no standard assembler. What's
+ worse is that the register allocation not only depends on the processor,
+ but also on the @code{gcc} version and options used.
- doc---oof-ptr
+ The words that Gforth offers encapsulate some system dependences (e.g.,
- doc---oof-asptr
+ the header structure), so a system-independent assembler may be used in
+ Gforth. If you do not have an assembler, you can compile machine code
+ directly with @code{,} and @code{c,}@footnote{This isn't portable,
+ because these words emit stuff in @i{data} space; it works because
+ Gforth has unified code/data spaces. Assembler isn't likely to be
+ portable anyway.}.
- @item
+ doc-assembler
- Instance defers
+ doc-init-asm
+ doc-code
+ doc-end-code
+ doc-;code
+ doc-flush-icache
- doc---oof-defer
+ If @code{flush-icache} does not work correctly, @code{code} words
+ etc. will not work (reliably), either.
+ The typical usage of these @code{code} words can be shown most easily by
+ analogy to the equivalent high-level defining words:
- @item
+ @example
- Method selectors
+ : foo                              code foo
+    <high-level Forth words>              <assembler>
+ ;                                  end-code
+ : bar                              : bar
+    <high-level Forth words>           <high-level Forth words>
+    CREATE                             CREATE
+       <high-level Forth words>           <high-level Forth words>
+    DOES>                              ;code
+       <high-level Forth words>           <assembler>
+ ;                                  end-code
+ @end example
- doc---oof-early
+ @c anton: the following stuff is also in "Common Assembler", in less detail.
- doc---oof-method
+ @cindex registers of the inner interpreter
+ In the assembly code you will want to refer to the inner interpreter's
+ registers (e.g., the data stack pointer) and you may want to use other
+ registers for temporary storage. Unfortunately, the register allocation
+ is installation-dependent.
+ In particular, @code{ip} (Forth instruction pointer) and @code{rp}
+ (return stack pointer) are in different places in @code{gforth} and
+ @code{gforth-fast}.  This means that you cannot write a @code{NEXT}
+ routine that works on both versions; so for doing @code{NEXT}, I
+ recommend jumping to @code{' noop >code-address}, which contains nothing
+ but a @code{NEXT}.
- @item
+ For general accesses to the inner interpreter's registers, the easiest
- Class-wide variables
+ solution is to use explicit register declarations (@pxref{Explicit Reg
+ Vars, , Variables in Specified Registers, gcc.info, GNU C Manual}) for
+ all of the inner interpreter's registers: You have to compile Gforth
+ with @code{-DFORCE_REG} (configure option @code{--enable-force-reg}) and
+ the appropriate declarations must be present in the @code{machine.h}
+ file (see @code{mips.h} for an example; you can find a full list of all
+ declarable register symbols with @code{grep register engine.c}). If you
+ give explicit registers to all variables that are declared at the
+ beginning of @code{engine()}, you should be able to use the other
+ caller-saved registers for temporary storage. Alternatively, you can use
+ the @code{gcc} option @code{-ffixed-REG} (@pxref{Code Gen Options, ,
+ Options for Code Generation Conventions, gcc.info, GNU C Manual}) to
+ reserve a register (however, this restriction on register allocation may
+ slow Gforth significantly).
- doc---oof-static
+ If this solution is not viable (e.g., because @code{gcc} does not allow
+ you to explicitly declare all the registers you need), you have to find
+ out by looking at the code where the inner interpreter's registers
+ reside and which registers can be used for temporary storage. You can
+ get an assembly listing of the engine's code with @code{make engine.s}.
+ In any case, it is good practice to abstract your assembly code from the
+ actual register allocation. E.g., if the data stack pointer resides in
+ register @code{$17}, create an alias for this register called @code{sp},
+ and use that in your assembly code.
- @item
+ @cindex code words, portable
- End declaration
+ Another option for implementing normal and defining words efficiently
+ is to add the desired functionality to the source of Gforth. For normal
+ words you just have to edit @file{primitives} (@pxref{Automatic
+ Generation}). Defining words (equivalent to @code{;CODE} words, for fast
+ defined words) may require changes in @file{engine.c}, @file{kernel.fs},
+ @file{prims2x.fs}, and possibly @file{cross.fs}.
- doc---oof-how:
+ @node Common Assembler, Common Disassembler, Code and ;code, Assembler and Code Words
- doc---oof-class;
+ @subsection Common Assembler
+ The assemblers in Gforth generally use a postfix syntax, i.e., the
+ instruction name follows the operands.
- @end itemize
+ The operands are passed in the usual order (the same that is used in the
+ manual of the architecture).  Since they all are Forth words, they have
+ to be separated by spaces; you can also use Forth words to compute the
+ operands.
- @c -------------------------------------------------------------
+ The instruction names usually end with a @code{,}.  This makes it easier
- @node Class Implementation,  , Class Declaration, OOF
+ to visually separate instructions if you put several of them on one
- @subsubsection Class Implementation
+ line; it also avoids shadowing other Forth words (e.g., @code{and}).
- @cindex class implementation
- @c -------------------------------------------------------------
+ Registers are usually specified by number; e.g., (decimal) @code{11}
- @node Mini-OOF, Comparison with other object models, OOF, Object-oriented Forth
+ specifies registers R11 and F11 on the Alpha architecture (which one,
- @subsection The @file{mini-oof.fs} model
+ depends on the instruction).  The usual names are also available, e.g.,
- @cindex mini-oof
+ @code{s2} for R11 on Alpha.
- Gforth's third object oriented Forth package is a 12-liner. It uses a
+ Control flow is specified similarly to normal Forth code (@pxref{Arbitrary
- mixture of the @file{object.fs} and the @file{oof.fs} syntax,
+ control structures}), with @code{if,}, @code{ahead,}, @code{then,},
- and reduces to the bare minimum of features. This is based on a posting
+ @code{begin,}, @code{until,}, @code{again,}, @code{cs-roll},
- of Bernd Paysan in comp.lang.forth.
+ @code{cs-pick}, @code{else,}, @code{while,}, and @code{repeat,}.  The
+ conditions are specified in a way specific to each assembler.
- @menu
+ Note that the register assignments of the Gforth engine can change
- * Basic Mini-OOF Usage::
+ between Gforth versions, or even between different compilations of the
- * Mini-OOF Example::
+ same Gforth version (e.g., if you use a different GCC version).  So if
- * Mini-OOF Implementation::
+ you want to refer to Gforth's registers (e.g., the stack pointer or
- @end menu
+ TOS), I recommend defining your own words for referring to these
+ registers, and using them later on; then you can easily adapt to a
+ changed register assignment.  The stability of the register assignment
+ is usually better if you build Gforth with @code{--enable-force-reg}.
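+ As a minimal sketch of such register words (the register numbers and the
+ names @code{forth-sp} and @code{forth-tos} are assumptions for
+ illustration only; you have to look up the real assignment of your
+ build, e.g., in the assembly listing of the engine):
+ @example
+ \ assumed: this build keeps the data stack pointer in register 17
+ \ and the top of stack in register 18
+ 17 constant forth-sp    \ hypothetical name for the stack pointer
+ 18 constant forth-tos   \ hypothetical name for the TOS register
+ @end example
+ If the register assignment changes, only these two definitions have to
+ be adapted, not the @code{code} words that use them.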
- @c -------------------------------------------------------------
+ In particular, the return stack pointer and the instruction pointer are
- @node Basic Mini-OOF Usage, Mini-OOF Example, Mini-OOF, Mini-OOF
+ in memory in @code{gforth}, and usually in registers in
- @subsubsection Basic @file{mini-oof.fs} Usage
+ @code{gforth-fast}.  The most common use of these registers is to
- @cindex mini-oof usage
+ dispatch to the next word (the @code{next} routine).  A portable way to
+ do this is to jump to @code{' noop >code-address} (of course, this is
+ less efficient than integrating the @code{next} code and scheduling it
+ well).
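+ For example, you could fetch that address once and give it a name (a
+ sketch; @code{next-code} is a hypothetical name, and how you jump to
+ that address depends on your assembler):
+ @example
+ \ address of a routine that contains nothing but NEXT
+ ' noop >code-address constant next-code
+ @end example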
- There is a base class (@code{class}, which allocates one cell for the
+ @node  Common Disassembler, 386 Assembler, Common Assembler, Assembler and Code Words
- object pointer) plus seven other words: to define a method, a variable,
+ @subsection Common Disassembler
- a class; to end a class, to resolve binding, to allocate an object and
- to compile a class method.
- @comment TODO better description of the last one
+ You can disassemble a @code{code} word with @code{see}
+ (@pxref{Debugging}).  You can disassemble a section of memory with
- doc-object
+ doc-disasm
- doc-method
- doc-var
- doc-class
- doc-end-class
- doc-defines
- doc-new
- doc-::
+ The disassembler generally produces output that can be fed into the
+ assembler (i.e., same syntax, etc.).  It also includes additional
+ information in comments.  In particular, the address of the instruction
+ is given in a comment before the instruction.
+ @code{See} may display more or less than the actual code of the word,
+ because the recognition of the end of the code is unreliable.  You can
+ use @code{disasm} if it did not display enough.  It may display more, if
+ the code word is not immediately followed by a named word.  If you have
+ something else there, you can follow the word with @code{align last @ ,}
+ to ensure that the end is recognized.
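+ A short sketch of both ways, using the primitive @code{noop} as a
+ stand-in for a @code{code} word (the length of 32 address units is an
+ arbitrary choice for illustration, and the output depends on your
+ machine):
+ @example
+ see noop                         \ disassemble a word by name
+ ' noop >code-address 32 disasm   \ disassemble 32 address units of its code
+ @end example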
- @c -------------------------------------------------------------
+ @node 386 Assembler, Alpha Assembler, Common Disassembler, Assembler and Code Words
- @node Mini-OOF Example, Mini-OOF Implementation, Basic Mini-OOF Usage, Mini-OOF
+ @subsection 386 Assembler
- @subsubsection Mini-OOF Example
- @cindex mini-oof example
- A short example shows how to use this package. This example, in slightly
+ The 386 assembler included in Gforth was written by Bernd Paysan; it is
- extended form, is supplied as @file{moof-exm.fs}
+ available under the GPL and was originally part of bigFORTH.
- @comment TODO could flesh this out with some comments from the Forthwrite article
- @example
+ The 386 disassembler included in Gforth was written by Andrew McKewan
- object class
+ and is in the public domain.
-   method init
-   method draw
- end-class graphical
- @end example
- This code defines a class @code{graphical} with an
+ The disassembler displays code in prefix Intel syntax.
- operation @code{draw}.  We can perform the operation
- @code{draw} on any @code{graphical} object, e.g.:
- @example
+ The assembler uses a postfix syntax with reversed parameters.
-100 t-rex draw
- @end example
- where @code{t-rex} is an object or object pointer, created with e.g.
+ The assembler includes all instructions of the Athlon, i.e., 486 core
- @code{graphical new Constant t-rex}.
+ instructions, Pentium and PPro extensions, floating point, MMX, 3DNow!,
+ but not ISSE. It's an integrated 16- and 32-bit assembler. The default is 32
+ bit; you can switch to 16 bit with @code{.86} and back to 32 bit with @code{.386}.
- For concrete graphical objects, we define child classes of the
+ There are several prefixes to switch between different operation sizes,
- class @code{graphical}, e.g.:
+ @code{.b} for byte accesses, @code{.w} for word accesses, @code{.d} for
+ double-word accesses. Addressing modes can be switched with @code{.wa}
+ for 16 bit addresses, and @code{.da} for 32 bit addresses. You don't
+ need a prefix for byte register names (@code{AL} et al).
- @example
+ For floating point operations, the prefixes are @code{.fs} (IEEE
- graphical class
+ single), @code{.fl} (IEEE double), @code{.fx} (extended), @code{.fw}
-   cell var circle-radius
+ (word), @code{.fd} (double-word), and @code{.fq} (quad-word).
- end-class circle \ "graphical" is the parent class
- :noname ( x y -- )
+ The MMX opcodes don't have size prefixes; they are spelled out as in
-   circle-radius @@ draw-circle ; circle defines draw
+ the Intel assembler. Instead of move from and to memory, there are
- :noname ( r -- )
+ PLDQ/PLDD and PSTQ/PSTD.
-   circle-radius ! ; circle defines init
- @end example
- There is no implicit init method, so we have to define one. The creation
+ The registers lack the 'e' prefix; even in 32 bit mode, eax is called
- code of the object now has to call init explicitly.
+ ax.  Immediate values are indicated by postfixing them with @code{#},
+ e.g., @code{3 #}.  Here are some examples of addressing modes:
  @example
- circle new Constant my-circle
+ 3 #          \ immediate
-my-circle init
+ ax           \ register
+ 100 di d)    \ 100[edi]
+ 4 bx cx di)  \ 4[ebx][ecx]
+ di ax *4 i)  \ [edi][eax*4]
+ 20 ax *4 i#) \ 20[eax*4]
  @end example
- It is also possible to add a function to create named objects with
+ Some examples of instructions are:
- automatic call of @code{init}, given that all objects have @code{init}
- on the same place:
  @example
- : new: ( .. o "name" -- )
+ ax bx mov             \ mov ebx,eax
-     new dup Constant init ;
+ 3 # ax mov            \ mov eax,3
-circle new: large-circle
+ 100 di d) ax mov      \ mov eax,100[edi]
+ 4 bx cx di) ax mov    \ mov eax,4[ebx][ecx]
+ .w ax bx mov          \ mov bx,ax
  @end example
- We can draw this new circle at (100,100) with:
+ The following forms are supported for binary instructions:
  @example
- 100 100 my-circle draw
+ <reg> <reg> <inst>
+ <n> # <reg> <inst>
+ <mem> <reg> <inst>
+ <reg> <mem> <inst>
  @end example
- @node Mini-OOF Implementation,  , Mini-OOF Example, Mini-OOF
+ Immediate to memory is not supported.  The shift/rotate syntax is:
- @subsubsection @file{mini-oof.fs} Implementation
- Object-oriented systems with late binding typically use a
- ``vtable''-approach: the first variable in each object is a pointer to a
- table, which contains the methods as function pointers. The vtable
- may also contain other information.
- So first, let's declare methods:
  @example
- : method ( m v -- m' v ) Create  over , swap cell+ swap
+ <reg/mem> 1 # shl \ shortens to shift without immediate
-   DOES> ( ... o -- ... ) @@ over @@ + @@ execute ;
+ <reg/mem> 4 # shl
+ <reg/mem> cl shl
  @end example
- During method declaration, the number of methods and instance
+ Precede string instructions (@code{movs} etc.) with @code{.b} to get
- variables is on the stack (in address units). @code{method} creates
+ the byte version.
- one method and increments the method number. To execute a method, it
- takes the object, fetches the vtable pointer, adds the offset, and
- executes the @i{xt} stored there. Each method takes the object it is
- invoked from as top of stack parameter. The method itself should
- consume that object.
- Now, we also have to declare instance variables
- @example
- : var ( m v size -- m v' ) Create  over , +
-   DOES> ( o -- addr ) @@ + ;
- @end example
- As before, a word is created with the current offset. Instance
+ The control structure words @code{IF} @code{UNTIL} etc. must be preceded
- variables can have different sizes (cells, floats, doubles, chars), so
+ by one of these conditions: @code{vs vc u< u>= 0= 0<> u<= u> 0< 0>= ps
- all we do is take the size and add it to the offset. If your machine
+ pc < >= <= >}. (Note that most of these words shadow some Forth words
- has alignment restrictions, put the proper @code{aligned} or
+ when @code{assembler} is in front of @code{forth} in the search path,
- @code{faligned} before the variable, to adjust the variable
+ e.g., in @code{code} words).  Currently the control structure words use
- offset. That's why it is on the top of stack.
+ one stack item, so you have to use @code{roll} instead of @code{cs-roll}
+ to shuffle them (you can also use @code{swap} etc.).
- We need a starting point (the base object) and some syntactic sugar:
+ Here is an example of a @code{code} word (assumes that the stack pointer
+ is in esi and the TOS is in ebx):
  @example
- Create object  1 cells , 2 cells ,
+ code my+ ( n1 n2 -- n )
- : class ( class -- class methods vars ) dup 2@@ ;
+     4 si D) bx add
+     4 # si add
+     Next
+ end-code
  @end example
- For inheritance, the vtable of the parent object has to be
+ @node Alpha Assembler, MIPS assembler, 386 Assembler, Assembler and Code Words
- copied when a new, derived class is declared. This gives all the
+ @subsection Alpha Assembler
- methods of the parent class, which can be overridden, though.
- @example
+ The Alpha assembler and disassembler were originally written by Bernd
- : end-class  ( class methods vars -- )
+ Thallner.
-   Create  here >r , dup , 2 cells ?DO ['] noop , 1 cells +LOOP
-   cell+ dup cell+ r> rot @@ 2 cells /string move ;
- @end example
- The first line creates the vtable, initialized with
+ The register names @code{a0}--@code{a5} are not available to avoid
- @code{noop}s. The second line is the inheritance mechanism, it
+ shadowing hex numbers.
- copies the xts from the parent vtable.
- We still have no way to define new methods, let's do that now:
+ Immediate forms of arithmetic instructions are distinguished by a
+ @code{#} just before the @code{,}, e.g., @code{and#,} (note: @code{lda,}
+ does not count as arithmetic instruction).
- @example
+ You have to specify all operands to an instruction, even those that
- : defines ( xt class -- ) ' >body @@ + ! ;
+ other assemblers consider optional, e.g., the destination register for
- @end example
+ @code{br,}, or the destination register and hint for @code{jmp,}.
- To allocate a new object, we need a word, too:
+ You can specify conditions for @code{if,} by removing the first @code{b}
+ and the trailing @code{,} from a branch with a corresponding name; e.g.,
  @example
- : new ( class -- o )  here over @@ allot swap over ! ;
+ 11 fgt if, \ if F11>0e
+   ...
+ endif,
  @end example
- Sometimes derived classes want to access the method of the
+ @code{fbgt,} gives @code{fgt}.
- parent object. There are two ways to achieve this with Mini-OOF:
- first, you could use named words, and second, you could look up the
- vtable of the parent object.
- @example
+ @node MIPS assembler, Other assemblers, Alpha Assembler, Assembler and Code Words
- : :: ( class "name" -- ) ' >body @@ + @@ compile, ;
+ @subsection MIPS assembler
- @end example
+ The MIPS assembler was originally written by Christian Pirker.
- Nothing can be more confusing than a good example, so here is
+ Currently the assembler and disassembler only cover the MIPS-I
- one. First let's declare a text object (called
+ architecture (R3000), and don't support FP instructions.
- @code{button}), that stores text and position:
- @example
+ The register names @code{$a0}--@code{$a3} are not available to avoid
- object class
+ shadowing hex numbers.
-   cell var text
-   cell var len
-   cell var x
-   cell var y
-   method init
-   method draw
- end-class button
- @end example
- @noindent
+ Because there is no way to distinguish registers from immediate values,
- Now, implement the two methods, @code{draw} and @code{init}:
+ you have to explicitly use the immediate forms of instructions, i.e.,
+ @code{addiu,}, not just @code{addu,} (@command{as} does this
+ implicitly).
- @example
+ If the architecture manual specifies several formats for the instruction
- :noname ( o -- )
+ (e.g., for @code{jalr,}), you usually have to use the one with more
-  >r r@@ x @@ r@@ y @@ at-xy  r@@ text @@ r> len @@ type ;
+ arguments (i.e., two for @code{jalr,}).  When in doubt, see
-  button defines draw
+ @code{arch/mips/testasm.fs} for an example of correct use.
- :noname ( addr u o -- )
-  >r 0 r@@ x ! 0 r@@ y ! r@@ len ! r> text ! ;
-  button defines init
- @end example
- @noindent
+ Branches and jumps in the MIPS architecture have a delay slot.  You have
- To demonstrate inheritance, we define a class @code{bold-button}, with no
+ to fill it yourself (the simplest way is to use @code{nop,}); the
- new data and no new methods:
+ assembler does not do it for you (unlike @command{as}).  Even
+ @code{if,}, @code{ahead,}, @code{until,}, @code{again,}, @code{while,},
+ @code{else,} and @code{repeat,} need a delay slot.  Since @code{begin,}
+ and @code{then,} just specify branch targets, they are not affected.
- @example
+ Note that you must not put branches, jumps, or @code{li,} into the delay
- button class
+ slot: @code{li,} may expand to several instructions, and control flow
- end-class bold-button
+ instructions may not be put into the branch delay slot in any case.
- : bold   27 emit ." [1m" ;
+ For branches the argument specifying the target is a relative address;
- : normal 27 emit ." [0m" ;
+ you have to add the address of the delay slot to get the absolute
- @end example
+ address.
- @noindent
+ The MIPS architecture also has load delay slots and restrictions on
- The class @code{bold-button} has a different draw method to
+ using @code{mfhi,} and @code{mflo,}; you have to order the instructions
- @code{button}, but the new method is defined in terms of the draw method
+ yourself to satisfy these restrictions, the assembler does not do it for
- for @code{button}:
+ you.
+ You can specify the conditions for @code{if,} etc. by taking a
+ conditional branch and leaving off the @code{b} at the start and the
+ @code{,} at the end.  E.g.,
  @example
- :noname bold [ button :: draw ] normal ; bold-button defines draw
+ 4 5 eq if,
+   ... \ do something if $4 equals $5
+ then,
  @end example
- @noindent
+ @node Other assemblers,  , MIPS assembler, Assembler and Code Words
- Finally, create two objects and apply methods:
+ @subsection Other assemblers
- @example
+ If you want to contribute another assembler/disassembler, please contact
- button new Constant foo
+ us (@email{bug-gforth@@gnu.org}) to check if we have such an assembler
- s" thin foo" foo init
+ already.  If you are writing them from scratch, please use a similar
- page
+ syntax style as the one we use (i.e., postfix, commas at the end of the
- foo draw
+ instruction names, @pxref{Common Assembler}); make the output of the
- bold-button new Constant bar
+ disassembler be valid input for the assembler, and keep the style
- s" fat bar" bar init
+ similar to the style we used.
-bar y !
- bar draw
- @end example
+ Hints on implementation: The most important part is to have a good test
+ suite that contains all instructions.  Once you have that, the rest is
+ easy.  For actual coding you can take a look at
+ @file{arch/mips/disasm.fs} to get some ideas on how to use data for both
+ the assembler and disassembler, avoiding redundancy and some potential
+ bugs.  You can also look at that file (and @pxref{Advanced does> usage
+ example}) to get ideas on how to factor a disassembler.
- @node Comparison with other object models,  , Mini-OOF, Object-oriented Forth
+ Start with the disassembler, because it's easier to reuse data from the
- @subsection Comparison with other object models
+ disassembler for the assembler than the other way round.
- @cindex comparison of object models
- @cindex object models, comparison
- Many object-oriented Forth extensions have been proposed (@cite{A survey
+ For the assembler, take a look at @file{arch/alpha/asm.fs}, which shows
- of object-oriented Forths} (SIGPLAN Notices, April 1996) by Bradford
+ how simple it can be.
- J. Rodriguez and W. F. S. Poehlman lists 17). This section discusses the
- relation of the object models described here to two well-known and two
- closely-related (by the use of method maps) models.
- @cindex Neon model
+ @c -------------------------------------------------------------
- The most popular model currently seems to be the Neon model (see
+ @node Threading Words, Passing Commands to the OS, Assembler and Code Words, Words
- @cite{Object-oriented programming in ANS Forth} (Forth Dimensions, March
+ @section Threading Words
-) by Andrew McKewan) but this model has a number of limitations
+ @cindex threading words
- @footnote{A longer version of this critique can be
- found in @cite{On Standardizing Object-Oriented Forth Extensions} (Forth
- Dimensions, May 1997) by Anton Ertl.}:
- @itemize @bullet
+ @cindex code address
- @item
+ These words provide access to code addresses and other threading stuff
- It uses a @code{@emph{selector object}} syntax, which makes it unnatural
+ in Gforth (and, possibly, other interpretive Forths). They more or less
- to pass objects on the stack.
+ abstract away the differences between direct and indirect threading
+ (and, for direct threading, the machine dependences). However, at
+ present this wordset is still incomplete. It is also pretty low-level;
+ some day it will hopefully be made unnecessary by an internals wordset
+ that abstracts implementation details away completely.
- @item
+ The terminology used here stems from indirect threaded Forth systems; in
- It requires that the selector parses the input stream (at
+ such a system, the XT of a word is represented by the CFA (code field
- compile time); this leads to reduced extensibility and to bugs that are
+ address) of a word; the CFA points to a cell that contains the code
- hard to find.
+ address.  The code address is the address of some machine code that
+ performs the run-time action of invoking the word (e.g., the
+ @code{dovar:} routine pushes the address of the body of the word (a
+ variable) on the stack).
- @item
+ @cindex code address
- It allows using every selector to every object;
+ @cindex code field address
- this eliminates the need for classes, but makes it harder to create
+ In an indirect threaded Forth, you can get the code address of @i{name}
- efficient implementations.
+ with @code{' @i{name} @@}; in Gforth you can get it with @code{' @i{name}
- @end itemize
+ >code-address}, independent of the threading method.
- @cindex Pountain's object-oriented model
+ doc-threading-method
- Another well-known publication is @cite{Object-Oriented Forth} (Academic
+ doc->code-address
- Press, London, 1987) by Dick Pountain. However, it is not really about
+ doc-code-address!
- object-oriented programming, because it hardly deals with late
- binding. Instead, it focuses on features like information hiding and
- overloading that are characteristic of modular languages like Ada (83).
- @cindex Zsoter's object-oriented model
+ @cindex @code{does>}-handler
- In @cite{Does late binding have to be slow?} (Forth Dimensions 18(1)
+ @cindex @code{does>}-code
-, pages 31-35) Andras Zsoter describes a model that makes heavy use
+ For a word defined with @code{DOES>}, the code address usually points to
- of an active object (like @code{this} in @file{objects.fs}): The active
+ a jump instruction (the @dfn{does-handler}) that jumps to the dodoes
- object is not only used for accessing all fields, but also specifies the
+ routine (in Gforth on some platforms, it can also point to the dodoes
- receiving object of every selector invocation; you have to change the
+ routine itself).  What you are typically interested in, though, is
- active object explicitly with @code{@{ ... @}}, whereas in
+ whether a word is a @code{DOES>}-defined word, and what Forth code it
- @file{objects.fs} it changes more or less implicitly at @code{m:
+ executes; @code{>does-code} tells you that.
- ... ;m}. Such a change at the method entry point is unnecessary with the
- Zsoter's model, because the receiving object is the active object
- already. On the other hand, the explicit change is absolutely necessary
- in that model, because otherwise no one could ever change the active
- object. An ANS Forth implementation of this model is available at
- @uref{http://www.forth.org/fig/oopf.html}.
- @cindex @file{oof.fs}, differences to other models
+ doc->does-code
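+ For illustration, a small sketch of how @code{>does-code} can be used
+ to tell the two cases apart (@code{const1} and @code{five} are made-up
+ names):
+ @example
+ : const1 ( n "name" -- )  create , does> ( -- n ) @@ ;
+ 5 const1 five
+ ' five >does-code .   \ non-zero: start of the Forth code after DOES>
+ ' dup  >does-code .   \ 0: dup is not a DOES>-defined word
+ @end example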
- The @file{oof.fs} model combines information hiding and overloading
- resolution (by keeping names in various word lists) with object-oriented
- programming. It sets the active object implicitly on method entry, but
- also allows explicit changing (with @code{>o...o>} or with
- @code{with...endwith}). It uses parsing and state-smart objects and
- classes for resolving overloading and for early binding: the object or
- class parses the selector and determines the method from this. If the
- selector is not parsed by an object or class, it performs a call to the
- selector for the active object (late binding), like Zsoter's model.
- Fields are always accessed through the active object. The big
- disadvantage of this model is the parsing and the state-smartness, which
- reduces extensibility and increases the opportunities for subtle bugs;
- essentially, you are only safe if you never tick or @code{postpone} an
- object or class (Bernd disagrees, but I (Anton) am not convinced).
- @cindex @file{mini-oof.fs}, differences to other models
+ To create a @code{DOES>}-defined word with the following basic words,
- The @file{mini-oof.fs} model is quite similar to a very stripped-down
+ you have to set up a @code{DOES>}-handler with @code{does-handler!};
- version of the @file{objects.fs} model, but syntactically it is a
+ @code{/does-handler} address units behind it, you have to place your executable Forth
- mixture of the @file{objects.fs} and @file{oof.fs} models.
+ code.  Finally you have to create a word and modify its behaviour with
+ @code{does-handler!}.
+ doc-does-code!
+ doc-does-handler!
+ doc-/does-handler
+ The code addresses produced by various defining words are produced by
+ the following words:
+ doc-docol:
+ doc-docon:
+ doc-dovar:
+ doc-douser:
+ doc-dodefer:
+ doc-dofield:
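+ As a sketch of how these words can be used, you can compare the code
+ address of a word with them to find out how the word was defined
+ (@code{foo} and @code{bar} are made-up names):
+ @example
+ : foo ;
+ variable bar
+ ' foo >code-address docol: = .   \ true flag if foo is a colon definition
+ ' bar >code-address dovar: = .   \ true flag if bar is a variable
+ @end example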
  @c -------------------------------------------------------------
- @node Passing Commands to the OS, Keeping track of Time, Object-oriented Forth, Words
+ @node Passing Commands to the OS, Keeping track of Time, Threading Words, Words
  @section Passing Commands to the Operating System
  @cindex operating system - passing commands
  @cindex shell commands
- Line 11827  from primitives (e.g., invalid memory ad
+ Line 11862  from primitives (e.g., invalid memory ad
  @code{gforth-fast} is only able to do a return stack dump from a
  directly called @code{throw} (including @code{abort} etc.).  This is the
  only difference (apart from a speed factor of between 1.15 (K6-2) and
-.6 (21164A)) between @code{gforth} and @code{gforth-fast}.  Given an
+(21264)) between @code{gforth} and @code{gforth-fast}.  Given an
  exception caused by a primitive in @code{gforth-fast}, you will
  typically see no return stack dump at all; however, if the exception is
  caught by @code{catch} (e.g., for restoring some state), and then
