version 1.25, 1999/03/11 22:52:11
|
version 1.26, 1999/03/23 20:24:21
|
Line 32 Programming style note:
|
Line 32 Programming style note:
|
@ifinfo |
@ifinfo |
This file documents Gforth @value{VERSION} |
This file documents Gforth @value{VERSION} |
|
|
Copyright @copyright{} 1995-1998 Free Software Foundation, Inc. |
Copyright @copyright{} 1995-1999 Free Software Foundation, Inc. |
|
|
Permission is granted to make and distribute verbatim copies of |
Permission is granted to make and distribute verbatim copies of |
this manual provided the copyright notice and this permission notice |
this manual provided the copyright notice and this permission notice |
Line 71 Copyright @copyright{} 1995-1998 Free So
|
Line 71 Copyright @copyright{} 1995-1998 Free So
|
@center Jens Wilke |
@center Jens Wilke |
@center Neal Crook |
@center Neal Crook |
@sp 3 |
@sp 3 |
@center This manual is permanently under construction and was last updated on 16-Feb-1999 |
@center This manual is permanently under construction and was last updated on 23-Mar-1999 |
|
|
@comment The following two commands start the copyright page. |
@comment The following two commands start the copyright page. |
@page |
@page |
Line 107 personal machines. This manual correspon
|
Line 107 personal machines. This manual correspon
|
|
|
@menu |
@menu |
* License:: The GPL |
* License:: The GPL |
* Introduction:: An introduction to ANS Forth |
|
* Goals:: About the Gforth Project |
* Goals:: About the Gforth Project |
|
* Introduction:: An introduction to ANS Forth |
* Invoking Gforth:: Starting (and exiting) Gforth |
* Invoking Gforth:: Starting (and exiting) Gforth |
* Words:: Forth words available in Gforth |
* Words:: Forth words available in Gforth |
* Error messages:: How to interpret them |
* Error messages:: How to interpret them |
Line 129 personal machines. This manual correspon
|
Line 129 personal machines. This manual correspon
|
|
|
@detailmenu --- The Detailed Node Listing --- |
@detailmenu --- The Detailed Node Listing --- |
|
|
|
Goals of Gforth |
|
|
|
* Gforth Extensions Sinful?:: |
|
|
An Introduction to ANS Forth |
An Introduction to ANS Forth |
|
|
* Introducing the Text Interpreter:: |
* Introducing the Text Interpreter:: |
Line 139 An Introduction to ANS Forth
|
Line 143 An Introduction to ANS Forth
|
* Review - elements of a Forth system:: |
* Review - elements of a Forth system:: |
* Exercises:: |
* Exercises:: |
|
|
Goals of Gforth |
|
|
|
* Gforth Extensions Sinful?:: |
|
|
|
Forth Words |
Forth Words |
|
|
* Notation:: |
* Notation:: |
Line 152 Forth Words
|
Line 152 Forth Words
|
* Stack Manipulation:: |
* Stack Manipulation:: |
* Memory:: |
* Memory:: |
* Control Structures:: |
* Control Structures:: |
* Locals:: |
|
* Defining Words:: |
* Defining Words:: |
* The Text Interpreter:: |
* The Text Interpreter:: |
* Structures:: |
|
* Object-oriented Forth:: |
|
* Tokens for Words:: |
* Tokens for Words:: |
* Word Lists:: |
* Word Lists:: |
* Environmental Queries:: |
* Environmental Queries:: |
* Files:: |
* Files:: |
* Including Files:: |
|
* Blocks:: |
* Blocks:: |
* Other I/O:: |
* Other I/O:: |
* Programming Tools:: |
* Programming Tools:: |
* Assembler and Code Words:: |
* Assembler and Code Words:: |
* Threading Words:: |
* Threading Words:: |
|
* Locals:: |
|
* Structures:: |
|
* Object-oriented Forth:: |
* Passing Commands to the OS:: |
* Passing Commands to the OS:: |
* Miscellaneous Words:: |
* Miscellaneous Words:: |
|
|
Line 202 Control Structures
|
Line 201 Control Structures
|
* Calls and returns:: |
* Calls and returns:: |
* Exception Handling:: |
* Exception Handling:: |
|
|
Locals |
|
|
|
* Gforth locals:: |
|
* ANS Forth locals:: |
|
|
|
Gforth locals |
|
|
|
* Where are locals visible by name?:: |
|
* How long do locals live?:: |
|
* Programming Style:: |
|
* Implementation:: |
|
|
|
Defining Words |
Defining Words |
|
|
* Simple Defining Words:: |
* Simple Defining Words:: |
Line 229 The Text Interpreter
|
Line 216 The Text Interpreter
|
* Literals:: |
* Literals:: |
* Interpreter Directives:: |
* Interpreter Directives:: |
|
|
|
Word Lists |
|
|
|
* Why use word lists?:: |
|
* Word list examples:: |
|
|
|
Files |
|
|
|
* Forth source files:: |
|
* General files:: |
|
* Search Paths:: |
|
* Forth Search Paths:: |
|
* General Search Paths:: |
|
|
|
Other I/O |
|
|
|
* Simple numeric output:: |
|
* Formatted numeric output:: |
|
* String Formats:: |
|
* Displaying characters and strings:: |
|
* Input:: |
|
|
|
Programming Tools |
|
|
|
* Debugging:: Simple and quick. |
|
* Assertions:: Making your programs self-checking. |
|
* Singlestep Debugger:: Executing your program word by word. |
|
|
|
Locals |
|
|
|
* Gforth locals:: |
|
* ANS Forth locals:: |
|
|
|
Gforth locals |
|
|
|
* Where are locals visible by name?:: |
|
* How long do locals live?:: |
|
* Programming Style:: |
|
* Implementation:: |
|
|
Structures |
Structures |
|
|
* Why explicit structure support?:: |
* Why explicit structure support?:: |
Line 274 The @file{mini-oof.fs} model
|
Line 300 The @file{mini-oof.fs} model
|
* Mini-OOF Example:: |
* Mini-OOF Example:: |
* Mini-OOF Implementation:: |
* Mini-OOF Implementation:: |
|
|
Word Lists |
|
|
|
* Why use word lists?:: |
|
* Word list examples:: |
|
|
|
Including Files |
|
|
|
* Words for Including:: |
|
* Search Path:: |
|
* Forth Search Paths:: |
|
* General Search Paths:: |
|
|
|
Other I/O |
|
|
|
* Simple numeric output:: Predefined formats |
|
* Formatted numeric output:: Formatted (pictured) output |
|
* String Formats:: How Forth stores strings in memory |
|
* Displaying characters and strings:: Other stuff |
|
* Input:: Input |
|
|
|
Programming Tools |
|
|
|
* Debugging:: Simple and quick. |
|
* Assertions:: Making your programs self-checking. |
|
* Singlestep Debugger:: Executing your program word by word. |
|
|
|
Tools |
Tools |
|
|
* ANS Report:: Report the words used, sorted by wordset. |
* ANS Report:: Report the words used, sorted by wordset. |
Line 422 Other Forth-related information
|
Line 422 Other Forth-related information
|
@end detailmenu |
@end detailmenu |
@end menu |
@end menu |
|
|
@node License, Introduction, Top, Top |
@node License, Goals, Top, Top |
@unnumbered GNU GENERAL PUBLIC LICENSE |
@unnumbered GNU GENERAL PUBLIC LICENSE |
@center Version 2, June 1991 |
@center Version 2, June 1991 |
|
|
Line 823 from other Forth compilers. However, thi
|
Line 823 from other Forth compilers. However, thi
|
reference manual. |
reference manual. |
@end iftex |
@end iftex |
|
|
@c ---------------------------------------------------------- |
|
@node Introduction, Goals, License, Top |
@c ****************************************************************** |
|
@node Goals, Introduction, License, Top |
|
@comment node-name, next, previous, up |
|
@chapter Goals of Gforth |
|
@cindex goals of the Gforth project |
|
The goal of the Gforth Project is to develop a standard model for |
|
ANS Forth. This can be split into several subgoals: |
|
|
|
@itemize @bullet |
|
@item |
|
Gforth should conform to the ANS Forth Standard. |
|
@item |
|
It should be a model, i.e. it should define all the |
|
implementation-dependent things. |
|
@item |
|
It should become standard, i.e. widely accepted and used. This goal |
|
is the most difficult one. |
|
@end itemize |
|
|
|
To achieve these goals Gforth should be |
|
@itemize @bullet |
|
@item |
|
Similar to previous models (fig-Forth, F83) |
|
@item |
|
Powerful. It should provide for all the things that are considered |
|
necessary today and even some that are not yet considered necessary. |
|
@item |
|
Efficient. It should not get the reputation of being exceptionally |
|
slow. |
|
@item |
|
Free. |
|
@item |
|
Available on many machines/easy to port. |
|
@end itemize |
|
|
|
Have we achieved these goals? Gforth conforms to the ANS Forth |
|
standard. It may be considered a model, but we have not yet documented |
|
which parts of the model are stable and which parts we are likely to |
|
change. It certainly has not yet become a de facto standard, but it |
|
appears to be quite popular. It has some similarities to and some |
|
differences from previous models. It has some powerful features, but not |
|
yet everything that we envisioned. We certainly have achieved our |
|
execution speed goals (@pxref{Performance}). It is free and available |
|
on many machines. |
|
|
|
@menu |
|
* Gforth Extensions Sinful?:: |
|
@end menu |
|
|
|
@node Gforth Extensions Sinful?, , Goals, Goals |
|
@comment node-name, next, previous, up |
|
@section Is it a Sin to use Gforth Extensions? |
|
@cindex Gforth extensions |
|
|
|
If you've been paying attention, you will have realised that there is an |
|
ANS (American National Standard) for Forth. As you read through the rest |
|
of this manual, you will see documentation for @var{Standard} words, and |
|
documentation for some appealing Gforth @var{extensions}. You might ask |
|
yourself the question: @var{``Given that there is a standard, would I be |
|
committing a sin to use (non-Standard) Gforth extensions?''} |
|
|
|
The answer to that question is somewhat pragmatic and somewhat |
|
philosophical. Consider these points: |
|
|
|
@itemize @bullet |
|
@item |
|
A number of the Gforth extensions can be implemented in ANS Forth using |
|
files provided in the @file{compat/} directory. These are mentioned in |
|
the text in passing. |
|
@item |
|
Forth has a rich historical precedent for programmers taking advantage |
|
of implementation-dependent features of their tools (for example, |
|
relying on a knowledge of the dictionary structure). Sometimes these |
|
techniques are necessary to extract every last bit of performance from |
|
the hardware, sometimes they are just a programming shorthand. |
|
@item |
|
The best way to break the rules is to know what the rules are. To learn |
|
the rules, there is no substitute for studying the text of the Standard |
|
itself. In particular, Appendix A of the Standard (@var{Rationale}) |
|
provides a valuable insight into the thought processes of the technical |
|
committee. |
|
@item |
|
The best reason to break a rule is because you have to; because it's |
|
more productive to do that, because it makes your code run fast enough |
|
or because you can see no Standard way to achieve what you want to |
|
achieve. |
|
@end itemize |
|
|
|
The tool @file{ans-report.fs} (@pxref{ANS Report}) makes it easy to |
|
analyse your program and determine what non-Standard definitions it |
|
relies upon. |
|
|
|
@c ****************************************************************** |
|
@node Introduction, Invoking Gforth, Goals, Top |
@comment node-name, next, previous, up |
@comment node-name, next, previous, up |
@chapter An Introduction to ANS Forth |
@chapter An Introduction to ANS Forth |
@cindex Forth - an introduction |
@cindex Forth - an introduction |
Line 835 teaching material, it seems worthwhile t
|
Line 928 teaching material, it seems worthwhile t
|
material. @xref{Forth-related information} for other sources of Forth-related |
material. @xref{Forth-related information} for other sources of Forth-related |
information. |
information. |
|
|
The examples in this section should work on any ANS Standard Forth, the |
The examples in this section should work on any ANS Forth; the |
output shown was produced using Gforth. In each example, I have tried to |
output shown was produced using Gforth. Each example attempts to |
reproduce the exact output that Gforth produces. If you try out the |
reproduce the exact output that Gforth produces. If you try out the |
examples (and you should), what you should type is shown @kbd{like this} |
examples (and you should), what you should type is shown @kbd{like this} |
and Gforth's response is shown @code{like this}. The single exception is |
and Gforth's response is shown @code{like this}. The single exception is |
that, where the example shows @kbd{<return>} it means that you should |
that, where the example shows @kbd{<return>} it means that you should |
press the "carriage return" key. Unfortunatley, some output formats for |
press the ``carriage return'' key. Unfortunately, some output formats for |
this manual cannot show the difference between @kbd{this} and |
this manual cannot show the difference between @kbd{this} and |
@code{this} which will make trying out the examples harder (but not |
@code{this} which will make trying out the examples harder (but not |
impossible). |
impossible). |
Line 864 lead to great productivity improvements.
|
Line 957 lead to great productivity improvements.
|
* Review - elements of a Forth system:: |
* Review - elements of a Forth system:: |
* Exercises:: |
* Exercises:: |
@end menu |
@end menu |
@comment TODO add these sections to the top xref lists |
|
|
|
@comment ---------------------------------------------- |
@comment ---------------------------------------------- |
@node Introducing the Text Interpreter, Stacks and Postfix notation, Introduction, Introduction |
@node Introducing the Text Interpreter, Stacks and Postfix notation, Introduction, Introduction |
Line 876 When you invoke the Forth image, you wil
|
Line 968 When you invoke the Forth image, you wil
|
and nothing else (if you have Gforth installed on your system, try |
and nothing else (if you have Gforth installed on your system, try |
invoking it now, by typing @kbd{gforth<return>}). Forth is now running |
invoking it now, by typing @kbd{gforth<return>}). Forth is now running |
its command line interpreter, which is called the @var{Text Interpreter} |
its command line interpreter, which is called the @var{Text Interpreter} |
(also known as the @var{Outer Interpreter}). (@pxref{The Text |
(also known as the @var{Outer Interpreter}). (You will learn a lot |
Interpreter} describes it in more detail, but we will learn more about |
about the text interpreter as you read through this chapter, |
its behaviour as we go through this chapter). |
but @pxref{The Text Interpreter}, for more detail). |
|
|
Although it may not be obvious, Forth is actually waiting for your |
Although it's not obvious, Forth is actually waiting for your |
input. Type a number and press the <return> key: |
input. Type a number and press the <return> key: |
|
|
@example |
@example |
Line 889 input. Type a number and press the <retu
|
Line 981 input. Type a number and press the <retu
|
|
|
Rather than give you a prompt to invite you to input something, the text |
Rather than give you a prompt to invite you to input something, the text |
interpreter prints a status message @var{after} it has processed a line |
interpreter prints a status message @var{after} it has processed a line |
of input. The status message in this case (" ok" followed by |
of input. The status message in this case (``@code{ ok}'' followed by |
carriage-return) indicates that the text interpreter was able to process |
carriage-return) indicates that the text interpreter was able to process |
all of your input successfully. Now type something illegal: |
all of your input successfully. Now type something illegal: |
|
|
@example |
@example |
@kbd{qwer341<return>} |
@kbd{qwer341<return>} |
|
:1: Undefined word |
|
qwer341 |
^^^^^^^ |
^^^^^^^ |
Error: Undefined word |
$400D2BA8 Bounce |
|
$400DBDA8 no.extensions |
@end example |
@end example |
|
|
When the text interpreter detects an error, it discards any remaining |
The exact text, other than the ``Undefined word'' message, may differ slightly on |
text on a line, resets certain internal state and prints an error |
your system, but the effect is the same; when the text interpreter |
message. |
detects an error, it discards any remaining text on a line, resets |
|
certain internal state and prints an error message. |
The text interpreter works on input one line at a time. Starting at |
|
the beginning of the line, it breaks the line into groups of characters |
The text interpreter waits for you to press carriage-return, and then |
separated by spaces. For each group of characters in turn, it makes two |
processes your input line. Starting at the beginning of the line, it |
attempts to do something: |
breaks the line into groups of characters separated by spaces. For each |
|
group of characters in turn, it makes two attempts to do something: |
|
|
@itemize @bullet |
@itemize @bullet |
@item |
@item |
Line 933 in the next section).
|
Line 1029 in the next section).
|
@end itemize |
@end itemize |
|
|
If the text interpreter is unable to do either of these things with any |
If the text interpreter is unable to do either of these things with any |
group of characters, it discards the rest of the line and print an error |
group of characters, it discards the group of characters and the rest of |
message. If the text interpreter reaches the end of the line without |
the line, then prints an error message. If the text interpreter reaches |
error, it prints the status message " ok" followed by carriage-return. |
the end of the line without error, it prints the status message ``@code{ ok}'' |
|
followed by carriage-return. |
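
The procedure just described can be sketched in C. This is only an illustration of the two attempts (dictionary lookup first, then number conversion), not Gforth's implementation: the three-entry @code{dictionary}, the function names and the fixed-size buffer are all assumptions made for the example, and the actions of executing a word or pushing a number are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name dictionary with three entries; a real Forth system
   holds many more definitions. */
static const char *dictionary[] = { "dup", "+", ".s" };

/* First attempt: is this group of characters a name in the dictionary? */
static int is_word(const char *group)
{
    size_t i;
    for (i = 0; i < sizeof dictionary / sizeof dictionary[0]; i++)
        if (strcmp(group, dictionary[i]) == 0)
            return 1;
    return 0;
}

/* Second attempt: does the whole group convert to a decimal integer? */
static int is_number(const char *group)
{
    char *end;
    (void)strtol(group, &end, 10);
    return end != group && *end == '\0';
}

/* Process one input line. Returns 1 if every group was accepted (the
   " ok" case). Returns 0 at the first group that is neither a word nor
   a number, copying that group into bad and discarding the rest of the
   line. Lines longer than 255 characters are truncated in this sketch. */
int interpret_line(const char *line, char *bad, size_t bad_size)
{
    char buf[256];
    char *group;

    assert(bad != NULL && bad_size > 0);
    strncpy(buf, line, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    for (group = strtok(buf, " "); group != NULL; group = strtok(NULL, " ")) {
        if (is_word(group))
            continue;   /* a real system would execute the word here */
        if (is_number(group))
            continue;   /* a real system would push the number here */
        strncpy(bad, group, bad_size - 1);
        bad[bad_size - 1] = '\0';
        return 0;
    }
    return 1;
}
```

With this sketch, an empty line and the line @code{12 dup} are both accepted, whereas @code{12 dup fred dup} stops at @code{fred} and the final @code{dup} is never examined.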
|
|
This is the simplest command we can give to the text interpreter: |
This is the simplest command we can give to the text interpreter: |
|
|
Line 944 This is the simplest command we can give
|
Line 1041 This is the simplest command we can give
|
@end example |
@end example |
|
|
The text interpreter did everything we asked it to do (nothing) without |
The text interpreter did everything we asked it to do (nothing) without |
an error, so it said that everything is "ok". Try a slightly longer |
an error, so it said that everything is ``@code{ ok}''. Try a slightly longer |
command: |
command: |
|
|
@example |
@example |
@kbd{12 dup fred dup<return>} |
@kbd{12 dup fred dup<return>} |
|
:1: Undefined word |
|
12 dup fred dup |
^^^^ |
^^^^ |
Error: Undefined word |
$400D2BA8 Bounce |
|
$400DBDA8 no.extensions |
@end example |
@end example |
|
|
When you pres the <return> key, the text interpreter starts to work its |
When you press the carriage-return key, the text interpreter starts to |
way along the line. |
work its way along the line: |
|
|
@itemize @bullet |
@itemize @bullet |
@item |
@item |
Line 963 characters @code{12} and looks them up i
|
Line 1063 characters @code{12} and looks them up i
|
dictionary@footnote{We can't tell if it found them or not, but assume |
dictionary@footnote{We can't tell if it found them or not, but assume |
for now that it did not}. There is no match for this group of characters |
for now that it did not}. There is no match for this group of characters |
in the name dictionary, so it tries to treat them as a number. It is |
in the name dictionary, so it tries to treat them as a number. It is |
able to do this successfully, so it puts the number, 12, "on the stack" |
able to do this successfully, so it puts the number, 12, ``on the stack'' |
(whatever that means). |
(whatever that means). |
@item |
@item |
The text interpreter resumes scanning the line and gets the next group |
The text interpreter resumes scanning the line and gets the next group |
of characters, @code{dup}. It looks them up in the name dictionary and |
of characters, @code{dup}. It looks it up in the name dictionary and |
(you'll have to take my word for this) finds them, and executes the word |
(you'll have to take my word for this) finds it, and executes the word |
@code{dup} (whatever that means). |
@code{dup} (whatever that means). |
@item |
@item |
Once again, the text interpreter resumes scanning the line and gets the |
Once again, the text interpreter resumes scanning the line and gets the |
Line 993 and executing it a second time.
|
Line 1093 and executing it a second time.
|
@cindex outer interpreter |
@cindex outer interpreter |
|
|
In procedural programming languages (like C and Pascal), the |
In procedural programming languages (like C and Pascal), the |
building-block of programs is the function or procedure. These |
building-block of programs is the @var{function} or @var{procedure}. These |
functions or procedures are called with explicit parameters. For |
functions or procedures are called with @var{explicit parameters}. For |
example, in C we might write: |
example, in C we might write: |
|
|
@example |
@example |
total = total + new_volume(length,height,depth); |
total = total + new_volume(length,height,depth); |
@end example |
@end example |
|
|
where total, length, height, depth are all variables and new_volume is |
@noindent |
a function-call to another piece of code. |
where new_volume is a function-call to another piece of code, and total, |
|
length, height and depth are all variables. length, height and depth are |
|
parameters to the function-call. |
|
|
In Forth, the equivalent to the function or procedure is the |
In Forth, the equivalent of the function or procedure is the |
@var{definition} and parameters are implicitly passed between |
@var{definition} and parameters are implicitly passed between |
definitions using a shared stack that is visible to the |
definitions using a shared stack that is visible to the |
programmer. Although Forth does support variables, the existence of the |
programmer. Although Forth does support variables, the existence of the |
Line 1015 actual number is implementation-dependen
|
Line 1117 actual number is implementation-dependen
|
used for any operation is implied unambiguously by the operation being |
used for any operation is implied unambiguously by the operation being |
performed. The stack used for all integer operations is called the @var{data |
performed. The stack used for all integer operations is called the @var{data |
stack} and, since this is the stack used most commonly, references to |
stack} and, since this is the stack used most commonly, references to |
"the data stack" are often abbreviated to "the stack". |
``the data stack'' are often abbreviated to ``the stack''. |
|
|
The stacks have a last-in, first-out (LIFO) organisation. If you type: |
The stacks have a last-in, first-out (LIFO) organisation. If you type: |
|
|
Line 1023 The stacks have a last-in, first-out (LI
|
Line 1125 The stacks have a last-in, first-out (LI
|
@kbd{1 2 3<return>} ok |
@kbd{1 2 3<return>} ok |
@end example |
@end example |
|
|
Then you (well, the text interpreter, really) have placed three numbers |
Then this instructs the text interpreter to place three numbers on the |
on the (data) stack. An analogy for the behaviour of the stack is to |
(data) stack. An analogy for the behaviour of the stack is to take a |
take a pack of playing cards and deal out the ace (1), 2 and 3 into a |
pack of playing cards and deal out the ace (1), 2 and 3 into a pile on |
pile on the table. The 3 was the last card onto the pile ("last-in") and |
the table. The 3 was the last card onto the pile (``last-in'') and if |
if you take a card off the pile then, unless you're prepared to fiddle a |
you take a card off the pile then, unless you're prepared to fiddle a |
bit, the card that you take off will be the 3 ("first-out"). The number |
bit, the card that you take off will be the 3 (``first-out''). The |
that will be first-out of the stack is called the "top of stack", which |
number that will be first-out of the stack is called the @var{top of |
|
stack}, which |
|
@cindex TOS definition |
is often abbreviated to @var{TOS}. |
is often abbreviated to @var{TOS}. |
|
|
To see how parameters are passed in Forth, we will consider the |
To understand how parameters are passed in Forth, consider the |
behaviour of the definition @code{+} (pronounced "plus"). You will not be |
behaviour of the definition @code{+} (pronounced ``plus''). You will not |
surprised to learn that this definition performs addition. More |
be surprised to learn that this definition performs addition. More |
precisely, it adds two numbers together and produces a result. Where does |
precisely, it adds two numbers together and produces a result. Where does |
it get the two numbers from? It takes the first two numbers off the |
it get the two numbers from? It takes the top two numbers off the |
stack. Where does it place the result? On the stack. You can act-out the |
stack. Where does it place the result? On the stack. You can act-out the |
behaviour of @code{+} with your playing cards like this: |
behaviour of @code{+} with your playing cards like this: |
|
|
@itemize @bullet |
@itemize @bullet |
@item |
@item |
Pick up two cards from the stack |
Pick up two cards from the stack on the table |
@item |
@item |
Stare at them intently and ask yourself "what *is* the sum of these two |
Stare at them intently and ask yourself ``what @var{is} the sum of these two |
numbers" |
numbers'' |
@item |
@item |
Decide that the answer is 5 |
Decide that the answer is 5 |
@item |
@item |
Line 1055 Put a 5 on the remaining ace that's on t
|
Line 1159 Put a 5 on the remaining ace that's on t
|
@end itemize |
@end itemize |
|
|
If you don't have a pack of cards handy but you do have Forth running, |
If you don't have a pack of cards handy but you do have Forth running, |
you can use the definition .s to show the current state of the stack, |
you can use the definition @code{.s} to show the current state of the stack, |
without affecting the stack. Type: |
without affecting the stack. Type: |
|
|
@example |
@example |
@kbd{clearstack 1 2 3<return>} ok |
@kbd{clearstack 1 2 3<return>} ok |
@kbd{.s<return> <3> 1 2 3 } ok |
@kbd{.s<return>} <3> 1 2 3 ok |
@end example |
@end example |
|
|
The text interpreter looks up the word @code{clearstack} and executes |
The text interpreter looks up the word @code{clearstack} and executes |
Line 1068 it; it tidies up the stack and removes a
|
Line 1172 it; it tidies up the stack and removes a
|
left on it by earlier examples. The text interpreter pushes each of the |
left on it by earlier examples. The text interpreter pushes each of the |
three numbers in turn onto the stack. Finally, the text interpreter |
three numbers in turn onto the stack. Finally, the text interpreter |
looks up the word @code{.s} and executes it. The effect of executing |
looks up the word @code{.s} and executes it. The effect of executing |
@code{.s} is to print the "<3>" (the total number of items on the stack) |
@code{.s} is to print the ``<3>'' (the total number of items on the stack) |
followed by a list of all the items and the item on the far right-hand |
followed by a list of all the items on the stack; the item on the far |
side is the TOS. |
right-hand side is the TOS. |
|
|
You can now type: |
You can now type: |
|
|
+ .s<return> <2> 1 5 ok |
@example |
|
@kbd{+ .s<return>} <2> 1 5 ok |
|
@end example |
|
|
|
@noindent |
which is correct; there are now 2 items on the stack and the result of |
which is correct; there are now 2 items on the stack and the result of |
the addition is 5. |
the addition is 5. |
|
|
If you're playing with cards, try doing a second addition; pick up the |
If you're playing with cards, try doing a second addition: pick up the |
two cards, work out that their sum is 6, shuffle them into the pack, |
two cards, work out that their sum is 6, shuffle them into the pack, |
look for a 6 and place that on the table. You now have just one item |
look for a 6 and place that on the table. You now have just one item on |
on the stack. What happens if you try to do a third addition? Pick up |
the stack. What happens if you try to do a third addition? Pick up the |
the first card, pick up the second card - ah. There is no second |
first card, pick up the second card -- ah! There is no second card. This |
card. This is called a "stack underflow" and constitutes an error. If |
is called a @var{stack underflow} and constitutes an error. If you try to |
you try to do the same thing with Forth it will report an error |
do the same thing with Forth it will report an error (probably a Stack |
(probably a Stack Underflow or an Invalid Memory Address error). |
Underflow or an Invalid Memory Address error). |
|
|
The opposite situation to a stack underflow is a stack overflow, which |
The opposite situation to a stack underflow is a @var{stack overflow}, |
simply accepts that there is a finite amount of storage space reserved |
which simply accepts that there is a finite amount of storage space |
for the stack. To stretch the playing card analogy, if you had enough |
reserved for the stack. To stretch the playing card analogy, if you had |
packs of cards and you piled the cards up on the table, you would |
enough packs of cards and you piled the cards up on the table, you would |
eventually be unable to add another card; you'd hit the |
eventually be unable to add another card; you'd hit the ceiling. Gforth |
ceiling. Gforth allows you to set the maximum size of the stacks. In |
allows you to set the maximum size of the stacks. In general, the only |
general, the only time that you will get a stack overflow is because a |
time that you will get a stack overflow is because a definition has a |
definition has a bug in it and is generating data on the stack |
bug in it and is generating data on the stack uncontrollably. |
uncontrollably. |
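
For readers who prefer code to playing cards, the same behaviour can be modelled in C. This is a sketch under stated assumptions, not a description of how Gforth stores its stacks: the eight-cell limit, the @code{long} cell type and all of the names are invented for the example, and @code{plus} checks the depth before removing anything, which a real system need not do.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CELLS 8   /* deliberately tiny; an assumption, not Gforth's size */

typedef long cell;

enum { STACK_OK = 0, STACK_UNDERFLOW = -1, STACK_OVERFLOW = -2 };

static cell stack[STACK_CELLS];
static int depth = 0;   /* number of items currently on the stack */

/* Place a value on top of the stack ("last-in"). */
int push(cell value)
{
    if (depth == STACK_CELLS)
        return STACK_OVERFLOW;      /* we have hit the ceiling */
    stack[depth++] = value;
    return STACK_OK;
}

/* Remove the top of stack ("first-out") and hand it back. */
int pop(cell *value)
{
    assert(value != NULL);
    if (depth == 0)
        return STACK_UNDERFLOW;     /* there is no card to pick up */
    *value = stack[--depth];
    return STACK_OK;
}

/* Model of "+": take the top two items off the stack, leave their sum. */
int plus(void)
{
    cell a, b;
    if (depth < 2)
        return STACK_UNDERFLOW;
    pop(&b);
    pop(&a);
    return push(a + b);
}

/* Model of "clearstack": discard everything. */
void clearstack(void) { depth = 0; }

/* Helpers for inspecting the model. */
int stack_depth(void) { return depth; }
cell top_of_stack(void) { assert(depth > 0); return stack[depth - 1]; }
```

Pushing 1, 2 and 3 and calling @code{plus} twice leaves first 1 5 and then a single 6 on the stack; a third call reports an underflow, and a ninth @code{push} onto a full stack reports an overflow.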
|
|
|
There's one final use for the playing card analogy. If you model your |
There's one final use for the playing card analogy. If you model your |
stack using a pack of playing cards, the maximum number of items on |
stack using a pack of playing cards, the maximum number of items on |
your stack will be 52 (I assume you didn't use the Joker). The maximum |
your stack will be 52 (I assume you didn't use the Joker). The maximum |
*value* of any item on the stack is 13 (the King). In fact, the only |
@var{value} of any item on the stack is 13 (the King). In fact, the only |
possible numbers are positive integer numbers 1 through 13; you can't |
possible numbers are positive integer numbers 1 through 13; you can't |
have (for example) 0 or 27 or 3.52 or -2. If you change the way you |
have (for example) 0 or 27 or 3.52 or -2. If you change the way you |
think about some of the cards, you can accommodate different |
think about some of the cards, you can accommodate different |
Line 1112 numbers) but the numbers that you can re
|
Line 1218 numbers) but the numbers that you can re
|
|
|
In that analogy, the limit was the amount of information that a single |
In that analogy, the limit was the amount of information that a single |
stack entry could hold, and Forth has a similar limit. In Forth, the |
stack entry could hold, and Forth has a similar limit. In Forth, the |
size of a stack entry is called a "cell". The actual size of a cell is |
size of a stack entry is called a @var{cell}. The actual size of a cell is |
implementation dependent and affects the maximum value that a stack |
implementation dependent and affects the maximum value that a stack |
entry can hold. A Standard Forth provides a cell size of at least |
entry can hold. A Standard Forth provides a cell size of at least |
16 bits, and most desktop systems use a cell size of 32 bits. |
16 bits, and most desktop systems use a cell size of 32 bits. |
Line 1120 entry can hold. A Standard Forth provide
|
Line 1226 entry can hold. A Standard Forth provide
|
Forth does not do any type checking for you, so you are free to |
Forth does not do any type checking for you, so you are free to |
manipulate and combine stack items in any way you wish. A convenient |
manipulate and combine stack items in any way you wish. A convenient |
way of treating stack items is as 2's complement signed integers, and |
way of treating stack items is as 2's complement signed integers, and |
that is what Standard words like "+" do. Therefore you can type: |
that is what Standard words like ``+'' do. Therefore you can type: |
|
|
-5 12 + .s<return> <1> 7 ok |
@example |
|
@kbd{-5 12 + .s<return>} <1> 7 ok |
|
@end example |
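
The absence of type checking can be illustrated in C. The sketch below assumes a 32-bit cell (the Standard only guarantees at least 16 bits) and uses invented names throughout. It shows that a cell is just a pattern of bits: the addition itself does not care about signs, and only the interpretation applied afterwards makes the result a signed integer.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t cell;   /* assumption: a 32-bit cell */

/* Store a signed integer in a cell; the conversion is modulo 2^32. */
cell from_signed(int32_t n) { return (cell)n; }

/* Interpret the bits of a cell as a 2's complement signed integer. */
int32_t as_signed(cell c)
{
    if (c & 0x80000000u)
        return -(int32_t)((cell)~c & 0x7FFFFFFFu) - 1;
    return (int32_t)c;
}

/* Model of "+": plain addition on the bits, wrapping modulo 2^32. */
cell cell_plus(cell a, cell b) { return (cell)(a + b); }

/* Self-check mirroring the example above: -5 12 + leaves 7. */
void check_cell_model(void)
{
    cell sum = cell_plus(from_signed(-5), from_signed(12));
    assert(as_signed(sum) == 7);
}
```

Read as an unsigned number, the cell holding -5 is 4294967291; adding 12 wraps round to 7 either way.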
|
|
If you use numbers and definitions like "+" in order to turn Forth |
If you use numbers and definitions like ``+'' in order to turn Forth |
into a great big pocket calculator, you will realise that it's rather |
into a great big pocket calculator, you will realise that it's rather |
different from a normal calculator. Rather than typing 2 + 3 = you had |
different from a normal calculator. Rather than typing 2 + 3 = you had |
to type 2 3 + (ignore the fact that you had to use .s to see the |
to type 2 3 + (ignore the fact that you had to use @code{.s} to see the |
result). The terminology used to describe this difference is to say |
result). The terminology used to describe this difference is to say |
that your calculator uses "Infix Notation" (parameters and operators |
that your calculator uses @var{Infix Notation} (parameters and operators |
are mixed) whilst Forth uses "Postfix Notation" (parameters and |
are mixed) whilst Forth uses @var{Postfix Notation} (parameters and |
operators are separate), also called "Reverse Polish Notation". |
operators are separate), also called @var{Reverse Polish Notation}. |
|
|
Whilst postfix notation might look confusing to begin with, it has |
Whilst postfix notation might look confusing to begin with, it has |
several important advantages: |
several important advantages: |
|
|
- it is unambiguous |
@itemize @bullet |
- it is more concise |
@item |
- it fits naturally with a stack-based system |
it is unambiguous |
|
@item |
|
it is more concise |
|
@item |
|
it fits naturally with a stack-based system |
|
@end itemize |
|
|
To examine these claims in more detail, consider these sums: |
To examine these claims in more detail, consider these sums: |
|
|
|
@example |
6 + 5 * 4 = |
6 + 5 * 4 = |
4 * 5 + 6 = |
4 * 5 + 6 = |
|
@end example |
|
|
If you're just learning maths or your maths is very rusty, you will |
If you're just learning maths or your maths is very rusty, you will |
probably come up with the answer 44 for the first and 26 for the |
probably come up with the answer 44 for the first and 26 for the |
second. If you are a bit of a whizz at maths you will remember the |
second. If you are a bit of a whizz at maths you will remember the |
*convention* that multiplication takes precedence over addition, and |
@var{convention} that multiplication takes precedence over addition, and |
you'd come up with the answer 26 both times. To explain the answer 26 |
you'd come up with the answer 26 both times. To explain the answer 26 |
to someone who got the answer 44, you'd probably rewrite the first sum |
to someone who got the answer 44, you'd probably rewrite the first sum |
like this: |
like this: |
|
|
|
@example |
6 + (5 * 4) = |
6 + (5 * 4) = |
|
@end example |
|
|
If what you really wanted was to perform the addition before the |
If what you really wanted was to perform the addition before the |
multiplication, you would have to use parentheses to force it. |
multiplication, you would have to use parentheses to force it. |
Line 1167 these keystroke sequences:
|
Line 1284 these keystroke sequences:
|
|
|
Postfix notation is unambiguous because the order that the operators |
Postfix notation is unambiguous because the order that the operators |
are applied is always explicit; that also means that parentheses are |
are applied is always explicit; that also means that parentheses are |
never required. The operators are *active* (the act of quoting the |
never required. The operators are @var{active} (the act of quoting the |
operator makes the operation occur) which removes the need for "=". |
operator makes the operation occur) which removes the need for ``=''. |
|
|
The sum 6 + 5 * 4 can be written (in postfix notation) in two |
The sum 6 + 5 * 4 can be written (in postfix notation) in two |
equivalent ways: |
equivalent ways: |
|
|
|
@example |
6 5 4 * + or: |
6 5 4 * + or: |
5 4 * 6 + |
5 4 * 6 + |
|
@end example |
|
|
An important thing that you should notice about this notation is that |
An important thing that you should notice about this notation is that |
the @var{order} of the numbers does not change; if you want to subtract |
the @var{order} of the numbers does not change; if you want to subtract |
2 from 10 you type @code{10 2 -}. |
2 from 10 you type @code{10 2 -}. |
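
The two postfix forms above can be checked at the Gforth prompt. This is
a minimal sketch that uses only the standard words @code{+}, @code{*},
@code{-} and @code{.}; the third line shows how to force the addition to
happen first, which gives the 44 mentioned earlier, without any
parentheses:

@example
6 5 4 * + .    \ prints 26
5 4 * 6 + .    \ prints 26
6 5 + 4 * .    \ prints 44
10 2 - .       \ prints 8
@end example

In each case the operator acts on whatever the preceding words have left
on the stack, so the order of evaluation is fixed by the order in which
you type the words.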
|
|
The reason why Forth uses postfix notation is very simple to explain: it |
The reason that Forth uses postfix notation is very simple to explain: it |
makes the implementation extremely simple, and it follows naturally from |
makes the implementation extremely simple, and it follows naturally from |
using the stack as a mechanism for passing parameters. Another way of |
using the stack as a mechanism for passing parameters. Another way of |
thinking about this is to realise that all Forth definitions are |
thinking about this is to realise that all Forth definitions are |
@var{active}; they execute as they are encountered by the text |
@var{active}; they execute as they are encountered by the text |
interpreter. The result of this is that the syntax of Forth is almost |
interpreter. The result of this is that the syntax of Forth is trivially |
trivially simple. |
simple. |
|
|
|
|
|
|
Line 1197 trivially simple.
|
Line 1316 trivially simple.
|
|
|
Until now, the examples we've seen have been trivial; we've just been |
Until now, the examples we've seen have been trivial; we've just been |
using Forth as a bigger-than-pocket calculator. Also, each calculation |
using Forth as a bigger-than-pocket calculator. Also, each calculation |
we've shown has been a "one-off" -- to repeat it we'd need to type it in |
we've shown has been a ``one-off'' -- to repeat it we'd need to type it in |
again@footnote{That's not quite true. If you press the up-arrow key on |
again@footnote{That's not quite true. If you press the up-arrow key on |
your keyboard you should be able to scroll back to any earlier command, |
your keyboard you should be able to scroll back to any earlier command, |
edit it and re-enter it.} In this section we'll see how to add new |
edit it and re-enter it.} In this section we'll see how to add new |
words to Forth's vocabulary. |
words to Forth's vocabulary. |
|
|
The easiest way to create a new word is to use a "colon |
The easiest way to create a new word is to use a @var{colon |
definition". We'll define a few and try them out before we worry too |
definition}. We'll define a few and try them out before we worry too |
much about how they work. Try typing in these examples; be careful to |
much about how they work. Try typing in these examples; be careful to |
copy the spaces accurately: |
copy the spaces accurately: |
|
|
Line 1341 magic to make that xt or number get exec
|
Line 1460 magic to make that xt or number get exec
|
at the time that @code{add-two} is @var{executed}. Therefore, when you |
at the time that @code{add-two} is @var{executed}. Therefore, when you |
execute @code{add-two} its @var{run-time effect} is exactly the same as |
execute @code{add-two} its @var{run-time effect} is exactly the same as |
if you had typed @code{2 + .} outside of a definition, and pressed |
if you had typed @code{2 + .} outside of a definition, and pressed |
<return>. |
carriage-return. |
|
|
In Forth, every word or number can be described in terms of three |
In Forth, every word or number can be described in terms of three |
properties: |
properties: |
Line 1468 example). The effect of executing it are
|
Line 1587 example). The effect of executing it are
|
compilation state at this time. If you execute @code{word2} it does |
compilation state at this time. If you execute @code{word2} it does |
nothing at all. |
nothing at all. |
|
|
@cindex ." -- how it works |
@cindex @code{."}, how it works |
Before leaving the subject of immediate words, consider the behaviour of |
Before leaving the subject of immediate words, consider the behaviour of |
@code{."} in the definition of @code{greet}, in the previous |
@code{."} in the definition of @code{greet}, in the previous |
section. This word is both a parsing word and an immediate word. Notice |
section. This word is both a parsing word and an immediate word. Notice |
Line 1480 the text interpreter can identify it. Th
|
Line 1599 the text interpreter can identify it. Th
|
it is a @var{delimiter}. The examples earlier show that, when the string |
it is a @var{delimiter}. The examples earlier show that, when the string |
is displayed, there is neither a space before the @code{H} nor after the |
is displayed, there is neither a space before the @code{H} nor after the |
@code{e}. Since @code{."} is an immediate word, it executes at the time |
@code{e}. Since @code{."} is an immediate word, it executes at the time |
that @code{greet is defined}. When it executes, it searches forward in |
that @code{greet} is defined. When it executes, it searches forward in |
the input line looking for the delimiter. When it finds the delimiter, |
the input line looking for the delimiter. When it finds the delimiter, |
it updates @code{>in} to point past the delimiter. It also compiles some |
it updates @code{>in} to point past the delimiter. It also compiles some |
magic code into the definition of @code{greet}; the xt of a run-time |
magic code into the definition of @code{greet}; the xt of a run-time |
Line 1506 If you have tried out the examples in th
|
Line 1625 If you have tried out the examples in th
|
have typed them in by hand; when you leave Gforth, your definitions will |
have typed them in by hand; when you leave Gforth, your definitions will |
be deleted. You can avoid this by using a text editor to enter Forth |
be deleted. You can avoid this by using a text editor to enter Forth |
source code into a file, and then load all of the code from the file |
source code into a file, and then load all of the code from the file |
using @code{include} (@xref{Including Files}). A Forth source |
using @code{include} (@xref{Forth source files}). A Forth source |
file is processed by the text interpreter, just as though you had typed |
file is processed by the text interpreter, just as though you had typed |
it in by hand@footnote{Actually, there are some subtle differences, like |
it in by hand@footnote{Actually, there are some subtle differences, like |
the fact that it doesn't print @code{ ok} at the end of each line}. |
the fact that it doesn't print @code{ ok} at the end of each line}. |
Line 1526 long definitions by hand, you can use a
|
Line 1645 long definitions by hand, you can use a
|
the history file into a Forth source file for reuse at a later time. |
the history file into a Forth source file for reuse at a later time. |
|
|
@cindex history file |
@cindex history file |
@cindex .gforth-history |
@cindex @file{.gforth-history} |
@cindex GFORTHHIST |
@cindex @code{GFORTHHIST} environment variable |
|
@cindex environment variables |
You can find out the name of your history file using @code{history-file |
You can find out the name of your history file using @code{history-file |
type }. On non-Unix systems you can find the location of the file using |
type }. On non-Unix systems you can find the location of the file using |
@code{history-dir type }@footnote{The environment variable GFORTHHIST |
@code{history-dir type }@footnote{The environment variable @code{GFORTHHIST} |
determines the location of the file.} |
determines the location of the file.} |
|
|
|
|
Line 1552 Forth program development is an interact
|
Line 1672 Forth program development is an interact
|
@item |
@item |
The main command loop that accepts input, and controls both |
The main command loop that accepts input, and controls both |
interpretation and compilation, is called the @var{text interpreter} |
interpretation and compilation, is called the @var{text interpreter} |
(also known as the @var{outer interpreter}. |
(also known as the @var{outer interpreter}). |
@item |
@item |
Forth has a very simple syntax, consisting of words and numbers |
Forth has a very simple syntax, consisting of words and numbers |
separated by spaces or carriage-return characters. Any additional syntax |
separated by spaces or carriage-return characters. Any additional syntax |
Line 1573 semantics} of a word that it encounters.
|
Line 1693 semantics} of a word that it encounters.
|
@item |
@item |
The relationship between the @var{interpretation semantics}, @var{compilation semantics} |
The relationship between the @var{interpretation semantics}, @var{compilation semantics} |
and @var{execution semantics} for a word depends upon the way in which |
and @var{execution semantics} for a word depends upon the way in which |
the word was defined (for example, whether it is an @var{immediate} word. |
the word was defined (for example, whether it is an @var{immediate} word). |
@item |
@item |
Forth definitions can be implemented in Forth (called @var{high-level |
Forth definitions can be implemented in Forth (called @var{high-level |
definitions}) or in some other way (usually a lower-level language and |
definitions}) or in some other way (usually a lower-level language and |
Line 1583 definitions} or @var{primitives}).
|
Line 1703 definitions} or @var{primitives}).
|
Many Forth systems are implemented mainly in Forth. |
Many Forth systems are implemented mainly in Forth. |
@item |
@item |
You now know enough to read and understand the rest of this manual and |
You now know enough to read and understand the rest of this manual and |
the ANS Forth Standard. |
the ANS Forth document. |
@end itemize |
@end itemize |
|
|
|
|
Line 1609 provides. Even scarier, you know almost
|
Line 1729 provides. Even scarier, you know almost
|
system. However, that's not a good idea just yet; better to try writing |
system. However, that's not a good idea just yet; better to try writing |
some programs in Gforth. |
some programs in Gforth. |
|
|
The large number of Forth words available in ANS Standard Forth and |
The large number of Forth words available in ANS Forth and |
Gforth makes learning Forth somewhat daunting. To make the problem |
Gforth makes learning Forth somewhat daunting. To make the problem |
easier, use the index of this manual to learn more about these words: |
easier, use the index of this manual to learn more about these words: |
|
|
Line 1622 all the exercises in a .fs file in the d
|
Line 1742 all the exercises in a .fs file in the d
|
inspiration from Starting Forth and Kelly&Spies. |
inspiration from Starting Forth and Kelly&Spies. |
|
|
|
|
|
@c ****************************************************************** |
|
@node Invoking Gforth, Words, Introduction, Top |
|
@chapter Invoking Gforth |
|
@cindex Gforth - invoking |
|
@cindex invoking Gforth |
|
@cindex running Gforth |
|
@cindex command-line options |
|
@cindex options on the command line |
|
@cindex flags on the command line |
|
|
@c ---------------------------------------------------------- |
You will usually just say @code{gforth}. In many other cases the default |
@node Goals, Invoking Gforth, Introduction, Top |
Gforth image will be invoked like this: |
@comment node-name, next, previous, up |
@example |
@chapter Goals of Gforth |
gforth [files] [-e forth-code] |
@cindex Goals |
@end example |
The goal of the Gforth Project is to develop a standard model for |
This interprets the contents of the files and the Forth code in the order they |
ANS Forth. This can be split into several subgoals: |
are given. |
|
|
@itemize @bullet |
|
@item |
|
Gforth should conform to the ANS Forth Standard. |
|
@item |
|
It should be a model, i.e. it should define all the |
|
implementation-dependent things. |
|
@item |
|
It should become standard, i.e. widely accepted and used. This goal |
|
is the most difficult one. |
|
@end itemize |
|
|
|
To achieve these goals Gforth should be |
|
@itemize @bullet |
|
@item |
|
Similar to previous models (fig-Forth, F83) |
|
@item |
|
Powerful. It should provide for all the things that are considered |
|
necessary today and even some that are not yet considered necessary. |
|
@item |
|
Efficient. It should not get the reputation of being exceptionally |
|
slow. |
|
@item |
|
Free. |
|
@item |
|
Available on many machines/easy to port. |
|
@end itemize |
|
|
|
Have we achieved these goals? Gforth conforms to the ANS Forth |
In general, the command line looks like this: |
standard. It may be considered a model, but we have not yet documented |
|
which parts of the model are stable and which parts we are likely to |
|
change. It certainly has not yet become a de facto standard, but it |
|
appears to be quite popular. It has some similarities to and some |
|
differences from previous models. It has some powerful features, but not |
|
yet everything that we envisioned. We certainly have achieved our |
|
execution speed goals (@pxref{Performance}). It is free and available |
|
on many machines. |
|
|
|
@menu |
@example |
* Gforth Extensions Sinful?:: |
gforth [initialization options] [image-specific options] |
@end menu |
@end example |
|
|
@node Gforth Extensions Sinful?, , Goals, Goals |
The initialization options must come before the rest of the command |
@comment node-name, next, previous, up |
line. They are: |
@section Is it a Sin to use Gforth Extensions? |
|
@cindex Gforth extensions |
|
|
|
If you've been paying attention, you will have realised that there is an |
|
ANS Standard for Forth. As you read through the rest of this manual, you |
|
will see documentation for @var{Standard} words, and documentation for |
|
some appealing Gforth @var{extensions}. You might ask yourself the |
|
question: @var{"Given that there is a standard, would I be committing a |
|
sin to use (non-Standard) Gforth extensions?"} |
|
|
|
The answer to that question is somewhat pragmatic and somewhat |
|
philosophical. Consider these points: |
|
|
|
@itemize @bullet |
|
@item |
|
A number of the Gforth extensions can be implemented in ANS Standard |
|
Forth using files provided in the @file{compat/} directory. These are |
|
mentioned in the text in passing. |
|
@item |
|
Forth has a rich historical precedent for programmers taking advantage |
|
of implementation-dependent features of their tools (for example, |
|
relying on a knowledge of the dictionary structure). Sometimes these |
|
techniques are necessary to extract every last bit of performance from |
|
the hardware, sometimes they are just a programming shorthand. |
|
@item |
|
The best way to break the rules is to know what the rules are. To learn |
|
the rules, there is no substitute for studying the text of the Standard |
|
itself. In particular, Appendix A of the Standard (@var{Rationale}) |
|
provides a valuable insight into the thought processes of the technical |
|
committee. |
|
@item |
|
The best reason to break a rule is because you have to; because it's |
|
more productive to do that, because it makes your code run fast enough |
|
or because you can see no Standard way to achieve what you want to |
|
achieve. |
|
@end itemize |
|
|
|
The tool @file{ans-report.fs} (@pxref{ANS Report}) makes it easy to |
|
analyse your program and determine what non-Standard definitions it |
|
relies upon. |
|
|
|
|
|
|
|
@c ---------------------------------------------------------- |
|
@node Invoking Gforth, Words, Goals, Top |
|
@chapter Invoking Gforth |
|
@cindex Gforth - invoking |
|
@cindex invoking Gforth |
|
@cindex running Gforth |
|
@cindex command-line options |
|
@cindex options on the command line |
|
@cindex flags on the command line |
|
|
|
You will usually just say @code{gforth}. In many other cases the default |
|
Gforth image will be invoked like this: |
|
@example |
|
gforth [files] [-e forth-code] |
|
@end example |
|
This interprets the contents of the files and the Forth code in the order they |
|
are given. |
|
|
|
In general, the command line looks like this: |
|
|
|
@example |
|
gforth [initialization options] [image-specific options] |
|
@end example |
|
|
|
The initialization options must come before the rest of the command |
|
line. They are: |
|
|
|
@table @code |
@table @code |
@cindex -i, command-line option |
@cindex -i, command-line option |
Line 1856 default image @file{gforth.fi} consist o
|
Line 1881 default image @file{gforth.fi} consist o
|
in which they are given. The @code{-e @var{forth-code}} or |
in which they are given. The @code{-e @var{forth-code}} or |
@code{--evaluate @var{forth-code}} option evaluates the Forth |
@code{--evaluate @var{forth-code}} option evaluates the Forth |
code. This option takes only one argument; if you want to evaluate more |
code. This option takes only one argument; if you want to evaluate more |
Forth words, you have to quote them or use several @code{-e}s. To exit |
Forth words, you have to quote them or use @code{-e} several times. To exit |
after processing the command line (instead of entering interactive mode) |
after processing the command line (instead of entering interactive mode) |
append @code{-e bye} to the command line. |
append @code{-e bye} to the command line. |
|
|
Line 1900 doc-bye
|
Line 1925 doc-bye
|
@comment some are in .c files. |
@comment some are in .c files. |
|
|
|
|
|
@c ****************************************************************** |
@node Words, Error messages, Invoking Gforth, Top |
@node Words, Error messages, Invoking Gforth, Top |
@chapter Forth Words |
@chapter Forth Words |
@cindex Words |
@cindex words |
|
|
@menu |
@menu |
* Notation:: |
* Notation:: |
Line 1912 doc-bye
|
Line 1938 doc-bye
|
* Stack Manipulation:: |
* Stack Manipulation:: |
* Memory:: |
* Memory:: |
* Control Structures:: |
* Control Structures:: |
* Locals:: |
|
* Defining Words:: |
* Defining Words:: |
* The Text Interpreter:: |
* The Text Interpreter:: |
* Structures:: |
|
* Object-oriented Forth:: |
|
* Tokens for Words:: |
* Tokens for Words:: |
* Word Lists:: |
* Word Lists:: |
* Environmental Queries:: |
* Environmental Queries:: |
* Files:: |
* Files:: |
* Including Files:: |
|
* Blocks:: |
* Blocks:: |
* Other I/O:: |
* Other I/O:: |
* Programming Tools:: |
* Programming Tools:: |
* Assembler and Code Words:: |
* Assembler and Code Words:: |
* Threading Words:: |
* Threading Words:: |
|
* Locals:: |
|
* Structures:: |
|
* Object-oriented Forth:: |
* Passing Commands to the OS:: |
* Passing Commands to the OS:: |
* Miscellaneous Words:: |
* Miscellaneous Words:: |
@end menu |
@end menu |
Line 1948 that has become a de-facto standard for
|
Line 1973 that has become a de-facto standard for
|
|
|
@table @var |
@table @var |
@item word |
@item word |
@cindex case insensitivity |
@cindex case-sensitivity |
The name of the word. BTW, Gforth is case insensitive, so you can |
The name of the word. Gforth is case-insensitive, so you can type the |
type the words in lower case (However, @pxref{core-idef}). |
words in lower case (However, @pxref{core-idef, |
|
Implementation-defined options, Implementation-defined options}). |
|
|
@item Stack effect |
@item Stack effect |
@cindex stack effect |
@cindex stack effect |
Line 1982 The ANS Forth standard is divided into s
|
Line 2008 The ANS Forth standard is divided into s
|
system need not support all of them. Therefore, in theory, the fewer |
system need not support all of them. Therefore, in theory, the fewer |
word sets your program uses the more portable it will be. However, we |
word sets your program uses the more portable it will be. However, we |
suspect that most ANS Forth systems on personal machines will feature |
suspect that most ANS Forth systems on personal machines will feature |
all word sets. Words that are not defined in the ANS standard have |
all word sets. Words that are not defined in ANS Forth have |
@code{gforth} or @code{gforth-internal} as word set. @code{gforth} |
@code{gforth} or @code{gforth-internal} as word set. @code{gforth} |
describes words that will work in future releases of Gforth; |
describes words that will work in future releases of Gforth; |
@code{gforth-internal} words are more volatile. Environmental query |
@code{gforth-internal} words are more volatile. Environmental query |
Line 2056 quotes.
|
Line 2082 quotes.
|
|
|
@node Comments, Boolean Flags, Notation, Words |
@node Comments, Boolean Flags, Notation, Words |
@section Comments |
@section Comments |
@cindex Comments |
@cindex comments |
|
|
Forth supports two styles of comment; the traditional "in-line" comment, |
Forth supports two styles of comment; the traditional @var{in-line} comment, |
@code{(} and its modern cousin, the "comment to end of line"; @code{\}. |
@code{(} and its modern cousin, the @var{comment to end of line}; @code{\}. |
|
|
doc-( |
doc-( |
doc-\ |
doc-\ |
Line 2067 doc-\G
|
Line 2093 doc-\G
|
|
|
@node Boolean Flags, Arithmetic, Comments, Words |
@node Boolean Flags, Arithmetic, Comments, Words |
@section Boolean Flags |
@section Boolean Flags |
@cindex Boolean Flags |
@cindex Boolean flags |
|
|
A Boolean flag is cell-sized. A cell with all bits clear represents the |
A Boolean flag is cell-sized. A cell with all bits clear represents the |
flag @code{false} and a flag with all bits set represents the flag |
flag @code{false} and a flag with all bits set represents the flag |
@code{true}. Words that check a flag (for example, @var{IF}) will treat |
@code{true}. Words that check a flag (for example, @code{IF}) will treat |
a cell that has @var{any} bit set as @code{true}. |
a cell that has @var{any} bit set as @code{true}. |
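
The flag conventions described above can be tried out interactively. The
following is a small sketch; the word @code{flag?} is a hypothetical
helper defined only for this illustration and is not part of Gforth or
ANS Forth:

@example
true .         \ prints -1 (all bits set)
false .        \ prints 0
5 3 > .        \ prints -1
: flag? ( x -- )  if ." true" else ." false" then ;
4 flag?        \ prints true, because 4 has a bit set
0 flag?        \ prints false
@end example

Comparison words such as @code{>} return a well-formed flag, whereas
@code{IF} accepts any non-zero cell as true.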
|
|
doc-true |
doc-true |
Line 2092 operators. If you perform division with
|
Line 2118 operators. If you perform division with
|
you do not want to use @code{/} or @code{/mod} with its undefined |
you do not want to use @code{/} or @code{/mod} with its undefined |
behaviour, but rather @code{fm/mod} or @code{sm/mod} (probably the |
behaviour, but rather @code{fm/mod} or @code{sm/mod} (probably the |
former, @pxref{Mixed precision}). |
former, @pxref{Mixed precision}). |
|
@comment TODO discuss the different division forms and the std approach |
|
|
@menu |
@menu |
* Single precision:: |
* Single precision:: |
Line 2107 former, @pxref{Mixed precision}).
|
Line 2134 former, @pxref{Mixed precision}).
|
@cindex single precision arithmetic words |
@cindex single precision arithmetic words |
|
|
By default, numbers in Forth are single-precision integers that are 1 |
By default, numbers in Forth are single-precision integers that are 1 |
CELL in size. They can be signed or unsigned, depending upon how you |
cell in size. They can be signed or unsigned, depending upon how you |
treat them. @xref{Number Conversion} for the rules used by the text |
treat them. @xref{Number Conversion} for the rules used by the text |
interpreter for recognising single-precision integers. |
interpreter for recognising single-precision integers. |
|
|
Line 2148 doc-d2/
|
Line 2175 doc-d2/
|
recognising double-precision integers. |
recognising double-precision integers. |
|
|
A double precision number is represented by a cell pair, with the most |
A double precision number is represented by a cell pair, with the most |
significant cell at the TOS. It is trivial to convert an unsigned single |
significant cell at the TOS. It is trivial to convert an unsigned |
to an (unsigned) double; simply push a @code{0} onto the TOS. Since numbers |
single to an (unsigned) double; simply push a @code{0} onto the |
are represented by Gforth using 2's complement arithmetic, converting |
TOS. Since numbers are represented by Gforth using 2's complement |
a signed single to a (signed) double requires sign-extension across the |
arithmetic, converting a signed single to a (signed) double requires |
most significant cell. This can be achieved using @code{s>d}. The moral |
sign-extension across the most significant cell. This can be achieved |
of the story is that you cannot convert a number without knowing what that |
using @code{s>d}. The moral of the story is that you cannot convert a |
number represents. |
number without knowing whether it represents an unsigned or a |
|
signed number. |
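
As a brief sketch of the difference, the lines below convert singles to
doubles and print them with @code{d.}. The exact value printed by the
last line depends on the cell size, so it is not shown:

@example
5 0 d.         \ unsigned single to double: prints 5
-5 s>d d.      \ signed single to double: prints -5
-5 0 d.        \ wrong for a signed value: prints a large positive number
@end example

Pushing @code{0} is only correct when the single is meant as an unsigned
number; for a signed number, @code{s>d} fills the upper cell with copies
of the sign bit.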
|
|
doc-s>d |
doc-s>d |
doc-d+ |
doc-d+ |
Line 2228 recognising floating-point numbers.
|
Line 2256 recognising floating-point numbers.
|
@cindex angles in trigonometric operations |
@cindex angles in trigonometric operations |
@cindex trigonometric operations |
@cindex trigonometric operations |
Angles in floating point operations are given in radians (a full circle |
Angles in floating point operations are given in radians (a full circle |
has 2 pi radians). Note, that Gforth has a separate floating point |
has 2 pi radians). Gforth has a separate floating point |
stack, but we use the unified notation. |
stack, but the documentation uses the unified notation. |
|
|
@cindex floating-point arithmetic, pitfalls |
@cindex floating-point arithmetic, pitfalls |
Floating point numbers have a number of unpleasant surprises for the |
Floating point numbers have a number of unpleasant surprises for the |
Line 2398 doc-2rdrop
|
Line 2426 doc-2rdrop
|
@node Locals stack, Stack pointer manipulation, Return stack, Stack Manipulation |
@node Locals stack, Stack pointer manipulation, Return stack, Stack Manipulation |
@subsection Locals stack |
@subsection Locals stack |
|
|
|
@comment TODO |
|
|
@node Stack pointer manipulation, , Locals stack, Stack Manipulation |
@node Stack pointer manipulation, , Locals stack, Stack Manipulation |
@subsection Stack pointer manipulation |
@subsection Stack pointer manipulation |
Line 2421 doc-lp!
|
Line 2450 doc-lp!
|
|
|
@node Memory, Control Structures, Stack Manipulation, Words |
@node Memory, Control Structures, Stack Manipulation, Words |
@section Memory |
@section Memory |
@cindex Memory words |
@cindex memory words |
|
|
@menu |
@menu |
* Memory Access:: |
* Memory Access:: |
Line 2475 char-aligned have no use in the standard
|
Line 2504 char-aligned have no use in the standard
|
created. |
created. |
|
|
@cindex @code{CREATE} and alignment |
@cindex @code{CREATE} and alignment |
The standard guarantees that addresses returned by @code{CREATE}d words |
ANS Forth guarantees that addresses returned by @code{CREATE}d words |
are cell-aligned; in addition, Gforth guarantees that these addresses |
are cell-aligned; in addition, Gforth guarantees that these addresses |
are aligned for all purposes. |
are aligned for all purposes. |
|
|
Note that the standard defines a word @code{char}, which has nothing to |
Note that the ANS Forth word @code{char} has nothing to do with address |
do with address arithmetic. |
arithmetic. |
|
|
doc-chars |
doc-chars |
doc-char+ |
doc-char+ |
Line 2542 doc-blank
|
Line 2571 doc-blank
|
doc-compare |
doc-compare |
doc-search |
doc-search |
|
|
@node Control Structures, Locals, Memory, Words |
@node Control Structures, Defining Words, Memory, Words |
@section Control Structures |
@section Control Structures |
@cindex control structures |
@cindex control structures |
|
|
Line 2605 and many other programming languages has
|
Line 2634 and many other programming languages has
|
|
|
Gforth also provides the words @code{?DUP-IF} and @code{?DUP-0=-IF}, so |
Gforth also provides the words @code{?DUP-IF} and @code{?DUP-0=-IF}, so |
you can avoid using @code{?dup}. Using these alternatives is also more |
you can avoid using @code{?dup}. Using these alternatives is also more |
efficient than using @code{?dup}. Definitions in ANS Standard Forth |
efficient than using @code{?dup}. Definitions in ANS Forth |
for @code{ENDIF}, @code{?DUP-IF} and @code{?DUP-0=-IF} are provided in |
for @code{ENDIF}, @code{?DUP-IF} and @code{?DUP-0=-IF} are provided in |
@file{compat/control.fs}. |
@file{compat/control.fs}. |
|
|
Line 2804 prints nothing.
|
Line 2833 prints nothing.
|
@end itemize |
@end itemize |
|
|
Unfortunately, @code{+DO}, @code{U+DO}, @code{-DO}, @code{U-DO} and |
Unfortunately, @code{+DO}, @code{U+DO}, @code{-DO}, @code{U-DO} and |
@code{-LOOP} are not in the ANS Forth standard. However, an |
@code{-LOOP} are not defined in ANS Forth. However, an implementation |
implementation for these words that uses only standard words is provided |
for these words that uses only standard words is provided in |
in @file{compat/loops.fs}. |
@file{compat/loops.fs}. |
|
|
|
|
|
|
@cindex @code{FOR} loops |
@cindex @code{FOR} loops |
Another counted loop is |
Another counted loop is: |
@example |
@example |
@var{n} |
@var{n} |
FOR |
FOR |
Line 2819 FOR
|
Line 2847 FOR
|
NEXT |
NEXT |
@end example |
@end example |
This is the preferred loop of native code compiler writers who are too |
This is the preferred loop of native code compiler writers who are too |
lazy to optimize @code{?DO} loops properly. In Gforth, this loop |
lazy to optimize @code{?DO} loops properly. This loop structure is not |
iterates @var{n+1} times; @code{i} produces values starting with @var{n} |
defined in ANS Forth. In Gforth, this loop iterates @var{n+1} times; |
and ending with 0. Other Forth systems may behave differently, even if |
@code{i} produces values starting with @var{n} and ending with 0. Other |
they support @code{FOR} loops. To avoid problems, don't use @code{FOR} |
Forth systems may behave differently, even if they support @code{FOR} |
loops. |
loops. To avoid problems, don't use @code{FOR} loops. |
|
|
@node Arbitrary control structures, Calls and returns, Counted Loops, Control Structures |
@node Arbitrary control structures, Calls and returns, Counted Loops, Control Structures |
@subsection Arbitrary control structures |
@subsection Arbitrary control structures |
Line 2857 would need to know how many stack items
|
Line 2885 would need to know how many stack items
|
entry (many systems use one cell. In Gforth they currently take three, |
entry (many systems use one cell. In Gforth they currently take three, |
but this may change in the future). |
but this may change in the future). |
|
|
|
|
Some standard control structure words are built from these words: |
Some standard control structure words are built from these words: |
|
|
doc-else |
doc-else |
Line 2895 through the definition (@code{LOOP} etc.
|
Line 2922 through the definition (@code{LOOP} etc.
|
fall-through path). Also, you have to ensure that all @code{LEAVE}s are |
fall-through path). Also, you have to ensure that all @code{LEAVE}s are |
resolved (by using one of the loop-ending words or @code{DONE}). |
resolved (by using one of the loop-ending words or @code{DONE}). |
|
|
Another group of control structure words are |
Another group of control structure words are: |
|
|
doc-case |
doc-case |
doc-endcase |
doc-endcase |
Line 2910 doc-endof
|
Line 2937 doc-endof
|
In order to ensure readability we recommend that you do not create |
In order to ensure readability we recommend that you do not create |
arbitrary control structures directly, but define new control structure |
arbitrary control structures directly, but define new control structure |
words for the control structure you want and use these words in your |
words for the control structure you want and use these words in your |
program. |
program. For example, instead of writing: |
|
|
E.g., instead of writing: |
|
|
|
@example |
@example |
begin |
BEGIN |
... |
... |
if [ 1 cs-roll ] |
IF [ 1 CS-ROLL ] |
... |
... |
again then |
AGAIN THEN |
@end example |
@end example |
|
|
@noindent |
@noindent |
we recommend defining control structure words, e.g., |
we recommend defining control structure words, e.g., |
|
|
@example |
@example |
: while ( dest -- orig dest ) |
: WHILE ( DEST -- ORIG DEST ) |
POSTPONE if |
POSTPONE IF |
1 cs-roll ; immediate |
1 CS-ROLL ; immediate |
|
|
: repeat ( orig dest -- ) |
: REPEAT ( orig dest -- ) |
POSTPONE again |
POSTPONE AGAIN |
POSTPONE then ; immediate |
POSTPONE THEN ; immediate |
@end example |
@end example |
|
|
@noindent |
@noindent |
and then using these to create the control structure: |
and then using these to create the control structure: |
|
|
@example |
@example |
begin |
BEGIN |
... |
... |
while |
WHILE |
... |
... |
repeat |
REPEAT |
@end example |
@end example |
|
|
That's much easier to read, isn't it? Of course, @code{REPEAT} and |
That's much easier to read, isn't it? Of course, @code{REPEAT} and |
Line 2957 necessary to define them.
|
Line 2982 necessary to define them.
|
|
|
@cindex recursive definitions |
@cindex recursive definitions |
A definition can be called simply by writing the name of the definition |
A definition can be called simply by writing the name of the definition |
to be called. Note that normally a definition is invisible during its |
to be called. Normally a definition is invisible during its own |
definition. If you want to write a directly recursive definition, you |
definition. If you want to write a directly recursive definition, you |
can use @code{recursive} to make the current definition visible. |
can use @code{recursive} to make the current definition visible, or |
|
@code{recurse} to call the current definition directly. |
|
|
doc-recursive |
doc-recursive |
|
|
Another way to perform a recursive call is |
|
|
|
doc-recurse |
doc-recurse |
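
As a minimal sketch of the two recursion methods, a factorial can be
written either way. The names @code{fac1} and @code{fac2} are made up
for this illustration and are not part of Gforth:

@example
: fac1 ( n -- n! ) recursive  \ make fac1 visible inside its own body
    dup 1 > if
        dup 1- fac1 *
    else
        drop 1
    then ;

: fac2 ( n -- n! )            \ recurse calls the current definition
    dup 1 > if
        dup 1- recurse *
    else
        drop 1
    then ;
@end example

Both @code{5 fac1 .} and @code{5 fac2 .} print 120; the only difference
is whether the recursive call is written by name or with @code{recurse}.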
|
|
@comment TODO add example of the two recursion methods |
@comment TODO add example of the two recursion methods |
Line 2993 defer foo
|
Line 3016 defer foo
|
IS foo |
IS foo |
@end example |
@end example |
|
|
When the end of the definition is reached, it returns. An earlier return |
The current definition returns control to the calling definition when |
can be forced using |
the end of the definition is reached or @code{EXIT} is encountered. |
|
|
doc-exit |
doc-exit |
|
|
Don't forget to clean up the return stack and @code{UNLOOP} any |
|
outstanding @code{?DO}...@code{LOOP}s before @code{EXIT}ing. |
|
|
|
doc-;s |
doc-;s |
|
|
@node Exception Handling, , Calls and returns, Control Structures |
@node Exception Handling, , Calls and returns, Control Structures |
@subsection Exception Handling |
@subsection Exception Handling |
@cindex Exceptions |
@cindex exceptions |
|
|
@comment TODO examples and blurb |
|
doc-catch |
|
doc-throw |
|
@comment TODO -- think this will allocate you a new THROW code? |
|
@comment for reserving new exception numbers. Note the existence of compat/exception.fs |
|
doc---exception-exception |
|
doc-quit |
|
doc-abort |
|
doc-abort" |
|
|
|
|
If your program detects a fatal error condition, the simplest action |
|
that it can take is to @code{quit}. This resets the return stack and |
|
restarts the text interpreter, but does not print any error message. |
|
|
@node Locals, Defining Words, Control Structures, Words |
The next stage in severity is to execute @code{abort}, which has the |
@section Locals |
same effect as @code{quit}, with the addition that it resets the data |
@cindex locals |
stack. |
|
|
Local variables can make Forth programming more enjoyable and Forth |
|
programs easier to read. Unfortunately, the locals of ANS Forth are |
|
laden with restrictions. Therefore, we provide not only the ANS Forth |
|
locals wordset, but also our own, more powerful locals wordset (we |
|
implemented the ANS Forth locals wordset through our locals wordset). |
|
|
|
The ideas in this section have also been published in the paper |
A slightly more sophisticated approach is to use @code{abort"}, which |
@cite{Automatic Scoping of Local Variables} by M. Anton Ertl, presented |
compiles a string to be used as an error message and does a conditional |
at EuroForth '94; it is available at |
@code{abort} at run-time. For example: |
@*@url{http://www.complang.tuwien.ac.at/papers/ertl94l.ps.gz}. |
|
|
|
@menu |
@example |
* Gforth locals:: |
@kbd{: checker abort" That flag was true" ." A false flag" ;<return>} ok |
* ANS Forth locals:: |
@kbd{0 checker<return>} A false flag ok |
@end menu |
@kbd{1 checker<return>} |
|
:1: That flag was true |
|
1 checker |
|
^^^^^^^ |
|
$400D1648 throw |
|
$400E4660 |
|
@end example |
|
|
@node Gforth locals, ANS Forth locals, Locals, Locals |
These simple techniques allow a program to react to a fatal error |
@subsection Gforth locals |
condition, but they are not exactly user-friendly. The ANS Forth |
@cindex Gforth locals |
Exception word set provides the pair of words @code{throw} and |
@cindex locals, Gforth style |
@code{catch}, which can be used to provide sophisticated error-handling. |
|
|
Locals can be defined with |
@code{catch} has a similar behaviour to @code{execute}, in that it takes |
|
an @var{xt} as a parameter and starts execution of the xt. However, |
|
before passing control to the xt, @code{catch} pushes an |
|
@var{exception frame} onto the @var{exception stack}. This exception |
|
frame is used to restore the system to a known state if a detected error |
|
occurs during the execution of the xt. A typical way to use @code{catch} |
|
would be: |
|
|
@example |
@example |
@{ local1 local2 ... -- comment @} |
... ['] foo catch IF ... |
@end example |
@end example |
or |
|
|
Whilst @code{foo} executes, it can call other words to any level of |
|
nesting, as usual. If @code{foo} (and all the words that it calls) |
|
execute successfully, control will ultimately pass to the word following |
|
the @code{catch}, and there will be a 0 (a @code{false} flag) at |
|
TOS. However, if any word detects an error, it can terminate the |
|
execution of @code{foo} by pushing an error code onto the stack and then |
|
performing a @code{throw}. The execution of @code{throw} will pass |
|
control to the word following the @code{catch}, but this time the TOS |
|
will hold the error code. Therefore, the @code{IF} in the example |
|
can be used to determine whether @code{foo} executed successfully. |
|
|
|
This simple example shows how you can use @code{throw} and @code{catch} |
|
to ``take over'' exception handling from the system: |
@example |
@example |
@{ local1 local2 ... @} |
: my-div ['] / catch if 2drop ." DIVIDE ERROR" else ." OK.. " . then ; |
@end example |
@end example |
|
|
E.g., |
The next example is more sophisticated and shows a multi-level |
|
@code{throw} and @code{catch}. To understand this example, start at the |
|
definition of @code{top-level} and work backwards: |
|
|
@example |
@example |
: max @{ n1 n2 -- n3 @} |
: lowest-level ( -- c ) |
n1 n2 > if |
key dup 27 = if |
n1 |
1 throw \ ESCAPE key pressed |
else |
else |
n2 |
." lowest-level successful" CR |
endif ; |
then |
|
; |
|
|
|
: lower-level ( -- c ) |
|
lowest-level |
|
\ at this level consider a CTRL-U to be a fatal error |
|
dup 21 = if \ CTRL-U |
|
2 throw |
|
else |
|
." lower-level successful" CR |
|
then |
|
; |
|
|
|
: low-level ( -- c ) |
|
['] lower-level catch |
|
?dup if |
|
\ error occurred - do we recognise it? |
|
dup 1 = if |
|
\ ESCAPE key pressed.. pretend it was an E |
|
drop [char] E |
|
else throw \ propagate the error upwards |
|
then |
|
then |
|
." low-level successful" CR |
|
; |
|
|
|
: top-level ( -- ) |
|
CR ['] low-level catch \ CATCH is used like EXECUTE |
|
?dup if \ error occurred.. |
|
." Error " . ." occurred - contact your supplier" |
|
else |
|
." The '" emit ." ' key was pressed" CR |
|
then |
|
; |
@end example |
@end example |
|
|
The similarity of locals definitions with stack comments is intended. A |
The ANS Forth document assigns @code{throw} codes thus: |
locals definition often replaces the stack comment of a word. The order |
|
of the locals corresponds to the order in a stack comment and everything |
|
after the @code{--} is really a comment. |
|
|
|
This similarity has one disadvantage: It is too easy to confuse locals |
@itemize @bullet |
declarations with stack comments, causing bugs and making them hard to |
@item |
find. However, this problem can be avoided by appropriate coding |
codes in the range -1 -- -255 are reserved to be assigned by the |
conventions: Do not use both notations in the same program. If you do, |
Standard. Assignments for codes in the range -1 -- -58 are currently |
they should be distinguished using additional means, e.g. by position. |
documented in the Standard. In particular, @code{-1 throw} is equivalent |
|
to @code{abort} and @code{-2 throw} is equivalent to @code{abort"}. |
|
@item |
|
codes in the range -256 -- -4095 are reserved to be assigned by the system. |
|
@item |
|
all other codes may be assigned by programs. |
|
@end itemize |
|
|
@cindex types of locals |
Gforth provides the word @code{exception} as a mechanism for assigning |
@cindex locals types |
system throw codes to applications. This allows multiple applications to |
The name of the local may be preceded by a type specifier, e.g., |
co-exist in memory without any clash of @code{throw} codes. A definition |
@code{F:} for a floating point value: |
of @code{exception} in ANS Forth is provided in |
|
@file{compat/exception.fs}. |
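
A minimal sketch of how an application might reserve its own
@code{throw} code, assuming that @code{exception} has the stack effect
@code{( addr u -- n )} and that the string describes the error. The
names @code{buffer-full}, @code{check-room} and @code{try-room} are
hypothetical:

@example
s" buffer full" exception constant buffer-full

: check-room ( u-free -- )
    0= if buffer-full throw then ;

: try-room ( u-free -- flag )  \ flag is true if an exception was caught
    ['] check-room catch
    dup if nip then            \ on error, discard the restored stack item
    0<> ;
@end example

Because the code is obtained at load time rather than hard-wired, two
applications written this way should not collide on the same number.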
|
|
@example |
|
: CX* @{ F: Ar F: Ai F: Br F: Bi -- Cr Ci @} |
|
\ complex multiplication |
|
Ar Br f* Ai Bi f* f- |
|
Ar Bi f* Ai Br f* f+ ; |
|
@end example |
|
|
|
@cindex flavours of locals |
doc-quit |
@cindex locals flavours |
doc-abort |
@cindex value-flavoured locals |
doc-abort" |
@cindex variable-flavoured locals |
|
Gforth currently supports cells (@code{W:}, @code{W^}), doubles |
|
(@code{D:}, @code{D^}), floats (@code{F:}, @code{F^}) and characters |
|
(@code{C:}, @code{C^}) in two flavours: a value-flavoured local (defined |
|
with @code{W:}, @code{D:} etc.) produces its value and can be changed |
|
with @code{TO}. A variable-flavoured local (defined with @code{W^} etc.) |
|
produces its address (which becomes invalid when the variable's scope is |
|
left). E.g., the standard word @code{emit} can be defined in terms of |
|
@code{type} like this: |
|
|
|
@example |
doc-catch |
: emit @{ C^ char* -- @} |
doc-throw |
char* 1 type ; |
doc---exception-exception |
@end example |
|
|
|
@cindex default type of locals |
|
@cindex locals, default type |
|
A local without type specifier is a @code{W:} local. Both flavours of |
|
locals are initialized with values from the data or FP stack. |
|
|
|
Currently there is no way to define locals with user-defined data |
@c ------------------------------------------------------------- |
structures, but we are working on it. |
@node Defining Words, The Text Interpreter, Control Structures, Words |
|
@section Defining Words |
|
@cindex defining words |
|
|
Gforth allows defining locals everywhere in a colon definition. This |
@comment TODO much more intro material here. 3 classes: colon defn, variables/constants |
poses the following questions: |
@comment values, user-defined defining words. |
|
|
@menu |
@menu |
* Where are locals visible by name?:: |
* Simple Defining Words:: |
* How long do locals live?:: |
* Colon Definitions:: |
* Programming Style:: |
* User-defined Defining Words:: |
* Implementation:: |
* Supplying names:: |
|
* Interpretation and Compilation Semantics:: |
@end menu |
@end menu |
|
|
@node Where are locals visible by name?, How long do locals live?, Gforth locals, Gforth locals |
@node Simple Defining Words, Colon Definitions, Defining Words, Defining Words |
@subsubsection Where are locals visible by name? |
@subsection Simple Defining Words |
@cindex locals visibility |
@cindex simple defining words |
@cindex visibility of locals |
@cindex defining words, simple |
@cindex scope of locals |
|
|
|
Basically, the answer is that locals are visible where you would expect |
|
it in block-structured languages, and sometimes a little longer. If you |
|
want to restrict the scope of a local, enclose its definition in |
|
@code{SCOPE}...@code{ENDSCOPE}. |
|
|
|
doc-scope |
doc-constant |
doc-endscope |
doc-2constant |
|
doc-fconstant |
|
doc-variable |
|
doc-2variable |
|
doc-fvariable |
|
doc-create |
|
doc-user |
|
doc-value |
|
doc-to |
|
doc-defer |
|
doc-is |
|
|
These words behave like control structure words, so you can use them |
Definitions in ANS Forth for @code{defer}, @code{<is>} and |
with @code{CS-PICK} and @code{CS-ROLL} to restrict the scope in |
@code{[is]} are provided in @file{compat/defer.fs}. |
arbitrary ways. |
@comment TODO - what do the two "is" words do? |
|
|
If you want a more exact answer to the visibility question, here's the |
@node Colon Definitions, User-defined Defining Words, Simple Defining Words, Defining Words |
basic principle: A local is visible in all places that can only be |
@subsection Colon Definitions |
reached through the definition of the local@footnote{In compiler |
@cindex colon definitions |
construction terminology, all places dominated by the definition of the |
|
local.}. In other words, it is not visible in places that can be reached |
|
without going through the definition of the local. E.g., locals defined |
|
in @code{IF}...@code{ENDIF} are visible until the @code{ENDIF}, locals |
|
defined in @code{BEGIN}...@code{UNTIL} are visible after the |
|
@code{UNTIL} (until, e.g., a subsequent @code{ENDSCOPE}). |
|
|
|
The reasoning behind this solution is: We want to have the locals |
@example |
visible as long as it is meaningful. The user can always make the |
: name ( ... -- ... ) |
visibility shorter by using explicit scoping. In a place that can |
word1 word2 word3 ; |
only be reached through the definition of a local, the meaning of a |
@end example |
local name is clear. In other places it is not: How is the local |
|
initialized at the control flow path that does not contain the |
|
definition? Which local is meant, if the same name is defined twice in |
|
two independent control flow paths? |
|
|
|
This should be enough detail for nearly all users, so you can skip the |
creates a word called @code{name} that, upon execution, executes |
rest of this section. If you really must know all the gory details and |
@code{word1 word2 word3}. @code{name} is a @dfn{(colon) definition}. |
options, read on. |
|
|
|
In order to implement this rule, the compiler has to know which places |
The explanation above is somewhat superficial. @xref{Interpretation and |
are unreachable. It knows this automatically after @code{AHEAD}, |
Compilation Semantics}, for an in-depth discussion of some of the issues |
@code{AGAIN}, @code{EXIT} and @code{LEAVE}; in other cases (e.g., after |
involved. |
most @code{THROW}s), you can use the word @code{UNREACHABLE} to tell the |
|
compiler that the control flow never reaches that place. If |
|
@code{UNREACHABLE} is not used where it could, the only consequence is |
|
that the visibility of some locals is more limited than the rule above |
|
says. If @code{UNREACHABLE} is used where it should not (i.e., if you |
|
lie to the compiler), buggy code will be produced. |
|
|
|
doc-unreachable |
doc-: |
|
doc-; |
|
|
Another problem with this rule is that at @code{BEGIN}, the compiler |
@node User-defined Defining Words, Supplying names, Colon Definitions, Defining Words |
does not know which locals will be visible on the incoming |
@subsection User-defined Defining Words |
back-edge. All problems discussed in the following are due to this |
@cindex user-defined defining words |
ignorance of the compiler (we discuss the problems using @code{BEGIN} |
@cindex defining words, user-defined |
loops as examples; the discussion also applies to @code{?DO} and other |
|
loops). Perhaps the most insidious example is: |
|
@example |
|
AHEAD |
|
BEGIN |
|
x |
|
[ 1 CS-ROLL ] THEN |
|
@{ x @} |
|
... |
|
UNTIL |
|
@end example |
|
|
|
This should be legal according to the visibility rule. The use of |
You can create new defining words simply by wrapping defining-time code |
@code{x} can only be reached through the definition; but that appears |
around existing defining words and putting the sequence in a colon |
textually below the use. |
definition. |
|
|
From this example it is clear that the visibility rules cannot be fully |
@comment TODO example |
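
As a sketch of this idea, a word that defines zero-initialized cells can
be built around @code{create}. The defining word @code{counter} and its
child @code{hits} are invented names for illustration:

@example
: counter ( "name" -- )
    create 0 , ;   \ define name and lay down an initial value of 0

counter hits       \ hits now behaves like a variable holding 0
1 hits +!
hits @@ .          \ prints 1
@end example

The defining-time code here is just @code{0 ,}; any other
initialization could be placed after @code{create} in the same way.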
implemented without major headaches. Our implementation treats common |
|
cases as advertised and the exceptions are treated in a safe way: The |
|
compiler makes a reasonable guess about the locals visible after a |
|
@code{BEGIN}; if it is too pessimistic, the |
|
user will get a spurious error about the local not being defined; if the |
|
compiler is too optimistic, it will notice this later and issue a |
|
warning. In the case above the compiler would complain about @code{x} |
|
being undefined at its use. You can see from the obscure examples in |
|
this section that it takes quite unusual control structures to get the |
|
compiler into trouble, and even then it will often do fine. |
|
|
|
If the @code{BEGIN} is reachable from above, the most optimistic guess |
@cindex @code{CREATE} ... @code{DOES>} |
is that all locals visible before the @code{BEGIN} will also be |
If you want the words defined with your defining words to behave |
visible after the @code{BEGIN}. This guess is valid for all loops that |
differently from words defined with standard defining words, you can |
are entered only through the @code{BEGIN}, in particular, for normal |
write your defining word like this: |
@code{BEGIN}...@code{WHILE}...@code{REPEAT} and |
|
@code{BEGIN}...@code{UNTIL} loops and it is implemented in our |
|
compiler. When the branch to the @code{BEGIN} is finally generated by |
|
@code{AGAIN} or @code{UNTIL}, the compiler checks the guess and |
|
warns the user if it was too optimistic: |
|
@example |
|
IF |
|
@{ x @} |
|
BEGIN |
|
\ x ? |
|
[ 1 cs-roll ] THEN |
|
... |
|
UNTIL |
|
@end example |
|
|
|
Here, @code{x} lives only until the @code{BEGIN}, but the compiler |
|
optimistically assumes that it lives until the @code{THEN}. It notices |
|
this difference when it compiles the @code{UNTIL} and issues a |
|
warning. The user can avoid the warning, and make sure that @code{x} |
|
is not used in the wrong area by using explicit scoping: |
|
@example |
@example |
IF |
: def-word ( "name" -- ) |
SCOPE |
Create @var{code1} |
@{ x @} |
DOES> ( ... -- ... ) |
ENDSCOPE |
@var{code2} ; |
BEGIN |
|
[ 1 cs-roll ] THEN |
|
... |
|
UNTIL |
|
@end example |
|
|
|
Since the guess is optimistic, there will be no spurious error messages |
def-word name |
about undefined locals. |
@end example |
|
|
If the @code{BEGIN} is not reachable from above (e.g., after |
Technically, this fragment defines a defining word @code{def-word}, and |
@code{AHEAD} or @code{EXIT}), the compiler cannot even make an |
a word @code{name}; when you execute @code{name}, the address of the |
optimistic guess, as the locals visible after the @code{BEGIN} may be |
body of @code{name} is put on the data stack and @var{code2} is executed |
defined later. Therefore, the compiler assumes that no locals are |
(the address of the body of @code{name} is the address @code{HERE} |
visible after the @code{BEGIN}. However, the user can use |
returns immediately after the @code{CREATE}). The word @code{name} is |
@code{ASSUME-LIVE} to make the compiler assume that the same locals are |
sometimes called a @var{child} of @code{def-word}. |
visible at the BEGIN as at the point where the top control-flow stack |
|
item was created. |
|
|
|
doc-assume-live |
In other words, if you make the following definitions: |
|
|
E.g., |
|
@example |
@example |
@{ x @} |
: def-word1 ( "name" -- ) |
AHEAD |
Create @var{code1} ; |
ASSUME-LIVE |
|
BEGIN |
: action1 ( ... -- ... ) |
x |
@var{code2} ; |
[ 1 CS-ROLL ] THEN |
|
... |
def-word1 name1 |
UNTIL |
|
@end example |
@end example |
|
|
Other cases where the locals are defined before the @code{BEGIN} can be |
Using @code{name1 action1} is equivalent to using @code{name}. |
handled by inserting an appropriate @code{CS-ROLL} before the |
|
@code{ASSUME-LIVE} (and changing the control-flow stack manipulation |
The classic example is that you can define @code{Constant} in this way: |
behind the @code{ASSUME-LIVE}). |
|
|
|
Cases where locals are defined after the @code{BEGIN} (but should be |
|
visible immediately after the @code{BEGIN}) can only be handled by |
|
rearranging the loop. E.g., the ``most insidious'' example above can be |
|
arranged into: |
|
@example |
@example |
BEGIN |
: constant ( w "name" -- ) |
@{ x @} |
create , |
... 0= |
DOES> ( -- w ) |
WHILE |
@@ ; |
x |
|
REPEAT |
|
@end example |
@end example |
|
|
@node How long do locals live?, Programming Style, Where are locals visible by name?, Gforth locals |
@comment that is the classic example.. maybe it should be earlier. There |
@subsubsection How long do locals live? |
@comment is a beautiful description of how this works and what it does in |
@cindex locals lifetime |
@comment the Forthwrite 100th edition. |
@cindex lifetime of locals |
|
|
|
The right answer for the lifetime question would be: A local lives at |
|
least as long as it can be accessed. For a value-flavoured local this |
|
means: until the end of its visibility. However, a variable-flavoured |
|
local could be accessed through its address far beyond its visibility |
|
scope. Ultimately, this would mean that such locals would have to be |
|
garbage collected. Since this entails un-Forth-like implementation |
|
complexities, I adopted the same cowardly solution as some other |
|
languages (e.g., C): The local lives only as long as it is visible; |
|
afterwards its address is invalid (and programs that access it |
|
afterwards are erroneous). |
|
|
|
@node Programming Style, Implementation, How long do locals live?, Gforth locals |
When you create a constant with @code{5 constant five}, first a new word |
@subsubsection Programming Style |
@code{five} is created, then the value 5 is laid down in the body of |
@cindex locals programming style |
@code{five} with @code{,}. When @code{five} is invoked, the address of |
@cindex programming style, locals |
the body is put on the stack, and @code{@@} retrieves the value 5. |
|
|
The freedom to define locals anywhere has the potential to change |
@cindex stack effect of @code{DOES>}-parts |
programming styles dramatically. In particular, the need to use the |
@cindex @code{DOES>}-parts, stack effect |
return stack for intermediate storage vanishes. Moreover, all stack |
In the example above the stack comment after the @code{DOES>} specifies |
manipulations (except @code{PICK}s and @code{ROLL}s with run-time |
the stack effect of the defined words, not the stack effect of the |
determined arguments) can be eliminated: If the stack items are in the |
following code (the following code expects the address of the body on |
wrong order, just write a locals definition for all of them; then |
the top of stack, which is not reflected in the stack comment). This is |
write the items in the order you want. |
the convention that I use and recommend (it clashes a bit with using |
|
locals declarations for stack effect specification, though). |
|
|
This seems a little far-fetched and eliminating stack manipulations is |
@subsubsection Applications of @code{CREATE..DOES>} |
unlikely to become a conscious programming objective. Still, the number |
@cindex @code{CREATE} ... @code{DOES>}, applications |
of stack manipulations will be reduced dramatically if local variables |
|
are used liberally (e.g., compare @code{max} in @ref{Gforth locals} with |
|
a traditional implementation of @code{max}). |
|
|
|
This shows one potential benefit of locals: making Forth programs more |
You may wonder how to use this feature. Here are some usage patterns: |
readable. Of course, this benefit will only be realized if the |
|
programmers continue to honour the principle of factoring instead of |
|
using the added latitude to make the words longer. |
|
|
|
@cindex single-assignment style for locals |
@cindex factoring similar colon definitions |
Using @code{TO} can and should be avoided. Without @code{TO}, |
When you see a sequence of code occurring several times, and you can |
every value-flavoured local has only a single assignment and many |
identify a meaning, you will factor it out as a colon definition. When |
advantages of functional languages apply to Forth. I.e., programs are |
you see similar colon definitions, you can factor them using |
easier to analyse, to optimize and to read: It is clear from the |
@code{CREATE..DOES>}. E.g., an assembler usually defines several words |
definition what the local stands for, it does not turn into something |
that look very similar: |
different later. |
@example |
|
: ori, ( reg-target reg-source n -- ) |
|
0 asm-reg-reg-imm ; |
|
: andi, ( reg-target reg-source n -- ) |
|
1 asm-reg-reg-imm ; |
|
@end example |
|
|
E.g., a definition using @code{TO} might look like this: |
@noindent |
|
This could be factored with: |
@example |
@example |
: strcmp @{ addr1 u1 addr2 u2 -- n @} |
: reg-reg-imm ( op-code -- ) |
u1 u2 min 0 |
CREATE , |
?do |
DOES> ( reg-target reg-source n -- ) |
addr1 c@@ addr2 c@@ - |
@@ asm-reg-reg-imm ; |
?dup-if |
|
unloop exit |
0 reg-reg-imm ori, |
then |
1 reg-reg-imm andi, |
addr1 char+ TO addr1 |
|
addr2 char+ TO addr2 |
|
loop |
|
u1 u2 - ; |
|
@end example |
@end example |
Here, @code{TO} is used to update @code{addr1} and @code{addr2} at |
|
every loop iteration. @code{strcmp} is a typical example of the |
|
readability problems of using @code{TO}. When you start reading |
|
@code{strcmp}, you think that @code{addr1} refers to the start of the |
|
string. Only near the end of the loop you realize that it is something |
|
else. |
|
|
|
This can be avoided by defining two locals at the start of the loop that |
@cindex currying |
are initialized with the right value for the current iteration. |
Another view of @code{CREATE..DOES>} is to consider it as a crude way to |
|
supply a part of the parameters for a word (known as @dfn{currying} in |
|
the functional language community). E.g., @code{+} needs two |
|
parameters. Creating versions of @code{+} with one parameter fixed can |
|
be done like this: |
@example |
@example |
: strcmp @{ addr1 u1 addr2 u2 -- n @} |
: curry+ ( n1 -- ) |
addr1 addr2 |
CREATE , |
u1 u2 min 0 |
DOES> ( n2 -- n1+n2 ) |
?do @{ s1 s2 @} |
@@ + ; |
s1 c@@ s2 c@@ - |
|
?dup-if |
|
unloop exit |
|
then |
|
s1 char+ s2 char+ |
|
loop |
|
2drop |
|
u1 u2 - ; |
|
@end example |
|
Here it is clear from the start that @code{s1} has a different value |
|
in every loop iteration. |
|
|
|
@node Implementation, , Programming Style, Gforth locals |
3 curry+ 3+ |
@subsubsection Implementation |
-2 curry+ 2- |
@cindex locals implementation |
@end example |
@cindex implementation of locals |
|
|
|
@cindex locals stack |
@subsubsection The gory details of @code{CREATE..DOES>} |
Gforth uses an extra locals stack. The most compelling reason for |
@cindex @code{CREATE} ... @code{DOES>}, details |
this is that the return stack is not float-aligned; using an extra stack |
|
also eliminates the problems and restrictions of using the return stack |
|
as locals stack. Like the other stacks, the locals stack grows toward |
|
lower addresses. A few primitives allow an efficient implementation: |
|
|
|
doc-@local# |
doc-does> |
doc-f@local# |
|
doc-laddr# |
|
doc-lp+!# |
|
doc-lp! |
|
doc->l |
|
doc-f>l |
|
|
|
In addition to these primitives, some specializations of these |
@cindex @code{DOES>} in a separate definition |
primitives for commonly occurring inline arguments are provided for |
This means that you need not use @code{CREATE} and @code{DOES>} in the |
efficiency reasons, e.g., @code{@@local0} as specialization of |
same definition; you can put the @code{DOES>}-part in a separate |
@code{@@local#} for the inline argument 0. The following compiling words |
definition. This allows us to, e.g., select among different @code{DOES>}-parts: |
compile the right specialized version, or the general version, as |
@example |
appropriate: |
: does1 |
|
DOES> ( ... -- ... ) |
|
... ; |
|
|
doc-compile-@local |
: does2 |
doc-compile-f@local |
DOES> ( ... -- ... ) |
doc-compile-lp+! |
... ; |
|
|
Combinations of conditional branches and @code{lp+!#} like |
: def-word ( ... -- ... ) |
@code{?branch-lp+!#} (the locals pointer is only changed if the branch |
create ... |
is taken) are provided for efficiency and correctness in loops. |
IF |
|
does1 |
|
ELSE |
|
does2 |
|
ENDIF ; |
|
@end example |
|
|
A special area in the dictionary space is reserved for keeping the |
In this example, the selection of whether to use @code{does1} or |
local variable names. @code{@{} switches the dictionary pointer to this |
@code{does2} is made at compile-time; at the time that the child word is |
area and @code{@}} switches it back and generates the locals |
@code{Create}d. |
initializing code. @code{W:} etc.@ are normal defining words. This |
|
special area is cleared at the start of every colon definition. |
|
|
|
@cindex word list for defining locals |
@cindex @code{DOES>} in interpretation state |
A special feature of Gforth's dictionary is used to implement the |
In a standard program you can apply a @code{DOES>}-part only if the last |
definition of locals without type specifiers: every word list (aka |
word was defined with @code{CREATE}. In Gforth, the @code{DOES>}-part |
vocabulary) has its own methods for searching |
will override the behaviour of the last word defined in any case. In a |
etc. (@pxref{Word Lists}). For the present purpose we defined a word list |
standard program, you can use @code{DOES>} only in a colon |
with a special search method: When it is searched for a word, it |
definition. In Gforth, you can also use it in interpretation state, in a |
actually creates that word using @code{W:}. @code{@{} changes the search |
kind of one-shot mode; for example: |
order to first search the word list containing @code{@}}, @code{W:} etc., |
@example |
and then the word list for defining locals without type specifiers. |
CREATE name ( ... -- ... ) |
|
@var{initialization} |
|
DOES> |
|
@var{code} ; |
|
@end example |
|
|
The lifetime rules support a stack discipline within a colon |
@noindent |
definition: The lifetime of a local is either nested with other locals |
is equivalent to the standard: |
lifetimes or it does not overlap them. |
@example |
|
:noname |
|
DOES> |
|
@var{code} ; |
|
CREATE name EXECUTE ( ... -- ... ) |
|
@var{initialization} |
|
@end example |
|
|
At @code{BEGIN}, @code{IF}, and @code{AHEAD} no code for locals stack |
You can get the address of the body of a word with: |
pointer manipulation is generated. Between control structure words |
|
locals definitions can push locals onto the locals stack. @code{AGAIN} |
|
is the simplest of the other three control flow words. It has to |
|
restore the locals stack depth of the corresponding @code{BEGIN} |
|
before branching. The code looks like this: |
|
@format |
|
@code{lp+!#} current-locals-size @minus{} dest-locals-size |
|
@code{branch} <begin> |
|
@end format |
|
|
|
@code{UNTIL} is a little more complicated: If it branches back, it |
doc->body |
must adjust the stack just like @code{AGAIN}. But if it falls through, |
|
the locals stack must not be changed. The compiler generates the |
|
following code: |
|
@format |
|
@code{?branch-lp+!#} <begin> current-locals-size @minus{} dest-locals-size |
|
@end format |
|
The locals stack pointer is only adjusted if the branch is taken. |
|
|
|
@code{THEN} can produce somewhat inefficient code: |
@node Supplying names, Interpretation and Compilation Semantics, User-defined Defining Words, Defining Words |
@format |
@subsection Supplying names for the defined words |
@code{lp+!#} current-locals-size @minus{} orig-locals-size |
@cindex names for defined words |
<orig target>: |
@cindex defining words, name parameter |
@code{lp+!#} orig-locals-size @minus{} new-locals-size |
|
@end format |
|
The second @code{lp+!#} adjusts the locals stack pointer from the |
|
level at the @var{orig} point to the level after the @code{THEN}. The |
|
first @code{lp+!#} adjusts the locals stack pointer from the current |
|
level to the level at the orig point, so the complete effect is an |
|
adjustment from the current level to the right level after the |
|
@code{THEN}. |
|
|
|
@cindex locals information on the control-flow stack |
@cindex defining words, name given in a string |
@cindex control-flow stack items, locals information |
By default, defining words take the names for the defined words from the |
In a conventional Forth implementation a dest control-flow stack entry |
input stream. Sometimes you want to supply the name from a string. You |
is just the target address and an orig entry is just the address to be |
can do this with: |
patched. Our locals implementation adds a word list to every orig or dest |
|
item. It is the list of locals visible (or assumed visible) at the point |
|
described by the entry. Our implementation also adds a tag to identify |
|
the kind of entry, in particular to differentiate between live and dead |
|
(reachable and unreachable) orig entries. |
|
|
|
A few unusual operations have to be performed on locals word lists: |
doc-nextname |
|
|
doc-common-list |
For example: |
doc-sub-list? |
|
doc-list-size |
|
|
|
Several features of our locals word list implementation make these |
@example |
operations easy to implement: The locals word lists are organised as |
s" foo" nextname create |
linked lists; the tails of these lists are shared, if the lists |
@end example |
contain some of the same locals; and the address of a name is greater |
@noindent |
than the address of the names behind it in the list. |
is equivalent to: |
|
@example |
|
create foo |
|
@end example |
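
As a slightly larger sketch of the same idea, the word
@code{string-variable} below passes a name held in a string to
@code{Variable}. The names @code{string-variable} and @code{counter} are
hypothetical and are not part of Gforth; the sketch assumes that
@code{nextname} supplies the name to the next defining word that is
executed, as described above.

@example
: string-variable ( c-addr u -- )
    \ hypothetical helper: define a variable named by the string
    nextname Variable ;

s" counter" string-variable
5 counter !
counter @@ .    \ prints 5
@end example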
|
|
Another important implementation detail is the variable |
@cindex defining words without name |
@code{dead-code}. It is used by @code{BEGIN} and @code{THEN} to |
Sometimes you want to define an @var{anonymous word}; a word without a |
determine if they can be reached directly or only through the branch |
name. You can do this with: |
that they resolve. @code{dead-code} is set by @code{UNREACHABLE}, |
|
@code{AHEAD}, @code{EXIT} etc., and cleared at the start of a colon |
|
definition, by @code{BEGIN} and usually by @code{THEN}. |
|
|
|
Counted loops are similar to other loops in most respects, but |
doc-:noname |
@code{LEAVE} requires special attention: It performs basically the same |
|
service as @code{AHEAD}, but it does not create a control-flow stack |
|
entry. Therefore the information has to be stored elsewhere; |
|
traditionally, the information was stored in the target fields of the |
|
branches created by the @code{LEAVE}s, by organizing these fields into a |
|
linked list. Unfortunately, this clever trick does not provide enough |
|
space for storing our extended control flow information. Therefore, we |
|
introduce another stack, the leave stack. It contains the control-flow |
|
stack entries for all unresolved @code{LEAVE}s. |
|
|
|
Local names are kept until the end of the colon definition, even if |
This leaves the execution token for the word on the stack after the |
they are no longer visible in any control-flow path. In a few cases |
closing @code{;}. Here's an example in which a deferred word is |
this may lead to increased space needs for the locals name area, but |
initialised with an @code{xt} from an anonymous colon definition: |
usually less than reclaiming this space would cost in code size. |
@example |
|
Defer deferred |
|
:noname ( ... -- ... ) |
|
... ; |
|
IS deferred |
|
@end example |
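
A concrete instance of this pattern might look as follows. This is only
a sketch; the deferred word @code{greet} is a hypothetical name chosen
for illustration.

@example
Defer greet
:noname ( -- )
    ." hello" ;
IS greet

greet    \ prints hello
@end example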
|
|
|
Gforth provides an alternative way of doing this, using two separate |
|
words: |
|
|
@node ANS Forth locals, , Gforth locals, Locals |
doc-noname |
@subsection ANS Forth locals |
@cindex execution token of last defined word |
@cindex locals, ANS Forth style |
doc-lastxt |
|
|
The ANS Forth locals wordset does not define a syntax for locals, but |
The previous example can be rewritten using @code{noname} and |
words that make it possible to define various syntaxes. One of the |
@code{lastxt}: |
possible syntaxes is a subset of the syntax we used in the Gforth locals |
|
wordset, i.e.: |
|
|
|
@example |
@example |
@{ local1 local2 ... -- comment @} |
Defer deferred |
@end example |
noname : ( ... -- ... ) |
@noindent |
... ; |
or |
lastxt IS deferred |
@example |
|
@{ local1 local2 ... @} |
|
@end example |
@end example |
|
|
The order of the locals corresponds to the order in a stack comment. The |
@code{lastxt} also works when the last word was not defined as |
restrictions are: |
@code{noname}. |
|
|
@itemize @bullet |
|
@item |
|
Locals can only be cell-sized values (no type specifiers are allowed). |
|
@item |
|
Locals can be defined only outside control structures. |
|
@item |
|
Locals can interfere with explicit usage of the return stack. For the |
|
exact (and long) rules, see the standard. If you don't use return stack |
|
accessing words in a definition using locals, you will be all right. The |
|
purpose of this rule is to make locals implementation on the return |
|
stack easier. |
|
@item |
|
The whole definition must be in one line. |
|
@end itemize |
|
|
|
Locals defined in this way behave like @code{VALUE}s (@pxref{Simple |
@node Interpretation and Compilation Semantics, , Supplying names, Defining Words |
Defining Words}). I.e., they are initialized from the stack. Using their |
@subsection Interpretation and Compilation Semantics |
name produces their value. Their value can be changed using @code{TO}. |
@cindex semantics, interpretation and compilation |
|
|
Since this syntax is supported by Gforth directly, you need not do |
|
anything to use it. If you want to port a program using this syntax to |
|
another ANS Forth system, use @file{compat/anslocal.fs} to implement the |
|
syntax on the other system. |
|
|
|
Note that a syntax shown in the standard, section A.13, looks |
@cindex interpretation semantics |
similar, but is quite different in having the order of locals |
The @dfn{interpretation semantics} of a word are what the text |
reversed. Beware! |
interpreter does when it encounters the word in interpret state. It also |
|
appears in some other contexts, e.g., the execution token returned by |
|
@code{' @var{word}} identifies the interpretation semantics of |
|
@var{word} (in other words, @code{' @var{word} execute} is equivalent to |
|
interpret-state text interpretation of @code{@var{word}}). |
|
|
The ANS Forth locals wordset itself consists of a word: |
@cindex compilation semantics |
|
The @dfn{compilation semantics} of a word are what the text interpreter |
|
does when it encounters the word in compile state. It also appears in |
|
other contexts, e.g., @code{POSTPONE @var{word}} compiles@footnote{In |
|
standard terminology, ``appends to the current definition''.} the |
|
compilation semantics of @var{word}. |
|
|
doc-(local) |
@cindex execution semantics |
|
The standard also talks about @dfn{execution semantics}. They are used |
|
only for defining the interpretation and compilation semantics of many |
|
words. By default, the interpretation semantics of a word are to |
|
@code{execute} its execution semantics, and the compilation semantics of |
|
a word are to @code{compile,} its execution semantics.@footnote{In |
|
standard terminology: The default interpretation semantics are its |
|
execution semantics; the default compilation semantics are to append its |
|
execution semantics to the execution semantics of the current |
|
definition.} |
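
A small sketch may make the distinction concrete. The names @code{dbl},
@code{[dbl]} and @code{quad} are hypothetical and chosen only for
illustration; @code{dbl} is an ordinary colon definition with default
interpretation and compilation semantics.

@example
: dbl ( n -- 2n )
    2 * ;

5 ' dbl execute .    \ interpretation semantics of dbl; prints 10

: [dbl] ( -- )
    \ when [dbl] runs, it performs the compilation semantics of dbl
    POSTPONE dbl ; immediate

: quad ( n -- 4n )
    [dbl] [dbl] ;    \ compiles dbl twice into quad

5 quad .             \ prints 20
@end example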
|
|
The ANS Forth locals extension wordset defines a syntax using @code{locals|}, but it is so |
@comment TODO expand, make it co-operate with new sections on text interpreter. |
awful that we strongly recommend against using it. We have implemented this |
|
syntax to make porting to Gforth easy, but do not document it here. The |
|
problem with this syntax is that the locals are defined in an order |
|
reversed with respect to the standard stack comment notation, making |
|
programs harder to read, and easier to misread and miswrite. The only |
|
merit of this syntax is that it is easy to implement using the ANS Forth |
|
locals wordset. |
|
|
|
@node Defining Words, The Text Interpreter, Locals, Words |
@cindex immediate words |
@section Defining Words |
@cindex compile-only words |
@cindex defining words |
You can change the semantics of the most-recently defined word: |
|
|
@menu |
doc-immediate |
* Simple Defining Words:: |
doc-compile-only |
* Colon Definitions:: |
doc-restrict |
* User-defined Defining Words:: |
|
* Supplying names:: |
|
* Interpretation and Compilation Semantics:: |
|
@end menu |
|
|
|
@node Simple Defining Words, Colon Definitions, Defining Words, Defining Words |
Note that ticking (@code{'}) a compile-only word gives an error |
@subsection Simple Defining Words |
(``Interpreting a compile-only word''). |
@cindex simple defining words |
|
@cindex defining words, simple |
|
|
|
doc-constant |
Gforth also allows you to define words with arbitrary combinations of |
doc-2constant |
interpretation and compilation semantics. |
doc-fconstant |
|
doc-variable |
|
doc-2variable |
|
doc-fvariable |
|
doc-create |
|
doc-user |
|
doc-value |
|
doc-to |
|
doc-defer |
|
doc-is |
|
|
|
Definitions in ANS Standard Forth for @code{defer}, @code{<is>} and |
doc-interpret/compile: |
@code{[is]} are provided in @file{compat/defer.fs}. TODO - what do |
|
the two is words do? |
|
|
|
@node Colon Definitions, User-defined Defining Words, Simple Defining Words, Defining Words |
This feature was introduced for implementing @code{TO} and @code{S"}. I |
@subsection Colon Definitions |
recommend that you do not define such words, as cute as they may be: |
@cindex colon definitions |
they make it hard to get at both parts of the word in some contexts. |
|
E.g., assume you want to get an execution token for the compilation |
|
part. Instead, define two words, one that embodies the interpretation |
|
part, and one that embodies the compilation part. Once you have done |
|
that, you can define a combined word with @code{interpret/compile:} for |
|
the convenience of your users. |
|
|
|
You might try to use this feature to provide an optimizing |
|
implementation of the default compilation semantics of a word. For |
|
example, by defining: |
@example |
@example |
: name ( ... -- ... ) |
:noname |
word1 word2 word3 ; |
foo bar ; |
|
:noname |
|
POSTPONE foo POSTPONE bar ; |
|
interpret/compile: foobar |
@end example |
@end example |
|
|
creates a word called @code{name}, that, upon execution, executes |
@noindent |
@code{word1 word2 word3}. @code{name} is a @dfn{(colon) definition}. |
as an optimizing version of: |
|
|
The explanation above is somewhat superficial. @xref{Interpretation and |
|
Compilation Semantics}, for an in-depth discussion of some of the issues |
|
involved. |
|
|
|
doc-: |
|
doc-; |
|
|
|
@node User-defined Defining Words, Supplying names, Colon Definitions, Defining Words |
|
@subsection User-defined Defining Words |
|
@cindex user-defined defining words |
|
@cindex defining words, user-defined |
|
|
|
You can create new defining words simply by wrapping defining-time code |
@example |
around existing defining words and putting the sequence in a colon |
: foobar |
definition. |
foo bar ; |
|
@end example |
|
|
@comment TODO example |
Unfortunately, this does not work correctly with @code{[compile]}, |
|
because @code{[compile]} assumes that the compilation semantics of all |
|
@code{interpret/compile:} words are non-default. I.e., @code{[compile] |
|
foobar} would compile the compilation semantics for the optimizing |
|
@code{foobar}, whereas it would compile the interpretation semantics for |
|
the non-optimizing @code{foobar}. |
|
|
@cindex @code{CREATE} ... @code{DOES>} |
@cindex state-smart words (are a bad idea) |
If you want the words defined with your defining words to behave |
Some people try to use @var{state-smart} words to emulate the feature provided |
differently from words defined with standard defining words, you can |
by @code{interpret/compile:} (words are state-smart if they check |
write your defining word like this: |
@code{STATE} during execution). E.g., they would try to code |
|
@code{foobar} like this: |
|
|
@example |
@example |
: def-word ( "name" -- ) |
: foobar |
Create @var{code1} |
STATE @@ |
DOES> ( ... -- ... ) |
IF ( compilation state ) |
@var{code2} ; |
POSTPONE foo POSTPONE bar |
|
ELSE |
def-word name |
foo bar |
|
ENDIF ; immediate |
@end example |
@end example |
|
|
Technically, this fragment defines a defining word @code{def-word}, and |
Although this works if @code{foobar} is only processed by the text |
a word @code{name}; when you execute @code{name}, the address of the |
interpreter, it does not work in other contexts (like @code{'} or |
body of @code{name} is put on the data stack and @var{code2} is executed |
@code{POSTPONE}). E.g., @code{' foobar} will produce an execution token |
(the address of the body of @code{name} is the address @code{HERE} |
for a state-smart word, not for the interpretation semantics of the |
returns immediately after the @code{CREATE}). The word @code{name} is |
original @code{foobar}; when you execute this execution token (directly |
sometimes called a @var{child} of @code{def-word}. |
with @code{EXECUTE} or indirectly through @code{COMPILE,}) in compile |
|
state, the result will not be what you expected (i.e., it will not |
|
perform @code{foo bar}). State-smart words are a bad idea. Simply don't |
|
write them@footnote{For a more detailed discussion of this topic, see |
|
@cite{@code{State}-smartness -- Why it is Evil and How to Exorcise it} by Anton |
|
Ertl; presented at EuroForth '98 and available from |
|
@url{http://www.complang.tuwien.ac.at/papers/}}! |
|
|
In other words, if you make the following definitions: |
@cindex defining words with arbitrary semantics combinations |
|
It is also possible to write defining words that define words with |
|
arbitrary combinations of interpretation and compilation semantics. In |
|
general, they look like this: |
|
|
@example |
@example |
: def-word1 ( "name" -- ) |
: def-word |
Create @var{code1} ; |
create-interpret/compile |
|
@var{code1} |
: action1 ( ... -- ... ) |
interpretation> |
@var{code2} ; |
@var{code2} |
|
<interpretation |
def-word name1 |
compilation> |
|
@var{code3} |
|
<compilation ; |
@end example |
@end example |
|
|
Using @code{name1 action1} is equivalent to using @code{name}. |
For a @var{word} defined with @code{def-word}, the interpretation |
|
semantics are to push the address of the body of @var{word} and perform |
E.g., you can implement @code{Constant} in this way: |
@var{code2}, and the compilation semantics are to push the address of |
|
the body of @var{word} and perform @var{code3}. E.g., @code{constant} |
|
can also be defined like this (except that the defined constants don't |
|
behave correctly when @code{[compile]}d): |
|
|
@example |
@example |
: constant ( w "name" -- ) |
: constant ( n "name" -- ) |
create , |
create-interpret/compile |
DOES> ( -- w ) |
, |
@@ ; |
interpretation> ( -- n ) |
|
@@ |
|
<interpretation |
|
compilation> ( compilation. -- ; run-time. -- n ) |
|
@@ postpone literal |
|
<compilation ; |
@end example |
@end example |
|
|
@comment that is the classic example.. maybe it should be earlier. There |
doc-create-interpret/compile |
@comment is a beautiful description of how this works and what it does in |
doc-interpretation> |
@comment the Forthwrite 100th edition. |
doc-<interpretation |
|
doc-compilation> |
|
doc-<compilation |
|
|
When you create a constant with @code{5 constant five}, first a new word |
Note that words defined with @code{interpret/compile:} and |
@code{five} is created, then the value 5 is laid down in the body of |
@code{create-interpret/compile} have an extended header structure that |
@code{five} with @code{,}. When @code{five} is invoked, the address of |
differs from other words; however, unless you try to access them with |
the body is put on the stack, and @code{@@} retrieves the value 5. |
plain address arithmetic, you should not notice this. Words for |
|
accessing the header structure usually know how to deal with this; e.g., |
|
@code{' word >body} also gives you the body of a word created with |
|
@code{create-interpret/compile}. |
|
|
@cindex stack effect of @code{DOES>}-parts |
@c ---------------------------------------------------------- |
@cindex @code{DOES>}-parts, stack effect |
@node The Text Interpreter, Tokens for Words, Defining Words, Words |
In the example above the stack comment after the @code{DOES>} specifies |
@section The Text Interpreter |
the stack effect of the defined words, not the stack effect of the |
@cindex interpreter - outer |
following code (the following code expects the address of the body on |
@cindex text interpreter |
the top of stack, which is not reflected in the stack comment). This is |
@cindex outer interpreter |
the convention that I use and recommend (it clashes a bit with using |
|
locals declarations for stack effect specification, though). |
|
|
|
@subsubsection Applications of @code{CREATE..DOES>} |
Intro blah. |
@cindex @code{CREATE} ... @code{DOES>}, applications |
|
|
|
You may wonder how to use this feature. Here are some usage patterns: |
@comment TODO |
|
|
@cindex factoring similar colon definitions |
doc->in |
When you see a sequence of code occurring several times, and you can |
doc-tib |
identify a meaning, you will factor it out as a colon definition. When |
doc-#tib |
you see similar colon definitions, you can factor them using |
doc-span |
@code{CREATE..DOES>}. E.g., an assembler usually defines several words |
doc-restore-input |
that look very similar: |
doc-save-input |
@example |
doc-source |
: ori, ( reg-target reg-source n -- ) |
doc-source-id |
0 asm-reg-reg-imm ; |
|
: andi, ( reg-target reg-source n -- ) |
|
1 asm-reg-reg-imm ; |
|
@end example |
|
|
|
@noindent |
|
This could be factored with: |
|
@example |
|
: reg-reg-imm ( op-code -- ) |
|
CREATE , |
|
DOES> ( reg-target reg-source n -- ) |
|
@@ asm-reg-reg-imm ; |
|
|
|
0 reg-reg-imm ori, |
|
1 reg-reg-imm andi, |
|
@end example |
|
|
|
@cindex currying |
|
Another view of @code{CREATE..DOES>} is to consider it as a crude way to |
|
supply a part of the parameters for a word (known as @dfn{currying} in |
|
the functional language community). E.g., @code{+} needs two |
|
parameters. Creating versions of @code{+} with one parameter fixed can |
|
be done like this: |
|
@example |
|
: curry+ ( n1 -- ) |
|
CREATE , |
|
DOES> ( n2 -- n1+n2 ) |
|
@@ + ; |
|
|
|
3 curry+ 3+ |
|
-2 curry+ 2- |
|
@end example |
|
|
|
@subsubsection The gory details of @code{CREATE..DOES>} |
@menu |
@cindex @code{CREATE} ... @code{DOES>}, details |
* Number Conversion:: |
|
* Interpret/Compile states:: |
|
* Literals:: |
|
* Interpreter Directives:: |
|
@end menu |
|
|
doc-does> |
@comment TODO |
|
|
@cindex @code{DOES>} in a separate definition |
The text interpreter works on input one line at a time. Starting at |
This means that you need not use @code{CREATE} and @code{DOES>} in the |
the beginning of the line, it skips leading spaces (called |
same definition; you can put the @code{DOES>}-part in a separate |
@var{delimiters}) then parses a string (a sequence of non-space |
definition. This allows us to, e.g., select among different DOES>-parts: |
characters) until it either reaches a space character or it |
@example |
reaches the end of the line. Having parsed a string, it then makes two |
: does1 |
attempts to do something with it: |
DOES> ( ... -- ... ) |
|
... ; |
|
|
|
: does2 |
* It looks the string up in a dictionary of definitions. If the string |
DOES> ( ... -- ... ) |
is found in the dictionary, the string names a @var{definition} (also |
... ; |
known as a @var{word}) and the dictionary search will return an |
|
@var{Execution token} (xt) for the definition and some flags that show |
|
when the definition can be used legally. If the definition can be |
|
legally executed in @var{Interpret} mode then the text interpreter will |
|
use the xt to execute it, otherwise it will issue an error |
|
message. The dictionary is described in more detail in <blah>. |
|
|
: def-word ( ... -- ... ) |
* If the string is not found in the dictionary, the text interpreter |
create ... |
attempts to treat it as a number in the current radix (base 10 after |
IF |
initial startup). If the string represents a legal number in the |
does1 |
current radix, the number is pushed onto the appropriate parameter |
ELSE |
stack. Stacks are discussed in more detail in <blah>. Number |
does2 |
conversion is described in more detail in <section about +, - |
ENDIF ; |
numbers and different number formats>. |
@end example |
|
|
|
In this example, the selection of whether to use @code{does1} or |
If both of these attempts fail, the remainder of the input line is |
@code{does2} is made at compile-time; at the time that the child word is |
discarded and the text interpreter issues an error message. If one of |
@code{Create}d. |
these attempts succeeds, the text interpreter repeats the parsing |
|
process until the end of the line has been reached. At this point, |
|
it prints the status message `` ok'' and waits for more input. |
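
The following sketch of an interactive session illustrates the parsing
loop described above. It assumes that @code{undefined-word} is neither
in the dictionary nor a legal number.

@example
2 3 + .
\ 2 and 3 are not found in the dictionary and are converted as
\ numbers; + and . are found and executed, so 5 is printed,
\ followed by the ok status message.

2 3 undefined-word + .
\ both attempts fail for undefined-word: an error message is issued
\ and the rest of the line (+ and .) is discarded.
@end example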
|
|
@cindex @code{DOES>} in interpretation state |
There are two important things to note about the behaviour of the text |
In a standard program you can apply a @code{DOES>}-part only if the last |
interpreter: |
word was defined with @code{CREATE}. In Gforth, the @code{DOES>}-part |
|
will override the behaviour of the last word defined in any case. In a |
|
standard program, you can use @code{DOES>} only in a colon |
|
definition. In Gforth, you can also use it in interpretation state, in a |
|
kind of one-shot mode; for example: |
|
@example |
|
CREATE name ( ... -- ... ) |
|
@var{initialization} |
|
DOES> |
|
@var{code} ; |
|
@end example |
|
|
|
@noindent |
* it processes each input string to completion before parsing |
is equivalent to the standard: |
additional characters from the input line. |
@example |
|
:noname |
|
DOES> |
|
@var{code} ; |
|
CREATE name EXECUTE ( ... -- ... ) |
|
@var{initialization} |
|
@end example |
|
|
|
You can get the address of the body of a word with: |
* it keeps track of its position in the input line using a variable |
|
(called >IN, pronounced ``to-in''). The value of >IN can be modified |
|
by the execution of definitions in the input line. This means that |
|
definitions can ``trick'' the text interpreter either into skipping |
|
sections of the input line or into parsing a section of the |
|
input line more than once. |
|
|
doc->body |
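
The way a definition can change the text interpreter's position in the
input line through @code{>IN} can be sketched as follows. The word
@code{skip-line} is a hypothetical name, not part of Gforth; it sets
@code{>IN} to the length of the current input line, so that nothing
remains to be parsed.

@example
: skip-line ( -- )
    \ hypothetical word: move >IN to the end of the current line
    source nip >in ! ;

1 . skip-line 2 .
@end example

Here only 1 is printed; the text after @code{skip-line} is never parsed.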
|
|
|
@node Supplying names, Interpretation and Compilation Semantics, User-defined Defining Words, Defining Words |
@node Number Conversion, Interpret/Compile states, The Text Interpreter, The Text Interpreter |
@subsection Supplying names for the defined words |
@subsection Number Conversion |
@cindex names for defined words |
@cindex number conversion |
@cindex defining words, name parameter |
@cindex double-cell numbers, input format |
|
@cindex input format for double-cell numbers |
|
@cindex single-cell numbers, input format |
|
@cindex input format for single-cell numbers |
|
@cindex floating-point numbers, input format |
|
@cindex input format for floating-point numbers |
|
|
@cindex defining words, name given in a string |
If the text interpreter fails to find a particular string in the name |
By default, defining words take the names for the defined words from the |
dictionary, it attempts to convert it to a number using a set of rules. |
input stream. Sometimes you want to supply the name from a string. You |
|
can do this with: |
|
|
|
doc-nextname |
Let <digit> represent any character that is a legal digit in the current |
|
number base (for example, 0-9 when the number base is decimal or 0-9, A-F |
|
when the number base is hexadecimal). |
|
|
For example: |
Let <decimal digit> represent any character in the range 0-9. |
|
|
@example |
@comment TODO need to extend the next defn to support fp format |
s" foo" nextname create |
Let @{+ | -@} represent the optional presence of either a @code{+} or |
@end example |
@code{-} character. |
@noindent |
|
is equivalent to: |
|
@example |
|
create foo |
|
@end example |
|
|
|
@cindex defining words without name |
Let * represent any number of instances of the previous character |
Sometimes you want to define an @var{anonymous word}; a word without a |
(including none). |
name. You can do this with: |
|
|
|
doc-:noname |
Let any other character represent itself. |
|
|
This leaves the execution token for the word on the stack after the |
Now, the conversion rules are: |
closing @code{;}. Here's an example in which a deferred word is |
|
initialised with an @code{xt} from an anonymous colon definition: |
|
@example |
|
Defer deferred |
|
:noname ( ... -- ... ) |
|
... ; |
|
IS deferred |
|
@end example |
|
|
|
Gforth provides an alternative way of doing this, using two separate |
@itemize @bullet |
words: |
@item |
|
A string of the form <digit><digit>* is treated as a single-precision |
|
(CELL-sized) positive integer. Examples are 0 123 6784532 32343212343456 42 |
|
@item |
|
A string of the form -<digit><digit>* is treated as a single-precision |
|
(CELL-sized) negative integer, and is represented using 2's-complement |
|
arithmetic. Examples are -45 -5681 -0 |
|
@item |
|
A string of the form <digit><digit>*.<digit>* is treated as a double-precision |
|
(double-CELL-sized) positive integer. Examples are 3465. 3.465 34.65 |
|
(and note that these all represent the same number). |
|
@item |
|
A string of the form -<digit><digit>*.<digit>* is treated as a |
|
double-precision (double-CELL-sized) negative integer, and is |
|
represented using 2's-complement arithmetic. Examples are -3465. -3.465 |
|
-34.65 (and note that these all represent the same number). |
|
@item |
|
A string of the form @{+ | -@}<decimal digit>@{.@}<decimal digit>*@{e | E@}@{+ |
|
| -@}<decimal digit><decimal digit>* is treated as a floating-point |
|
number. Examples are 1e0 1.e 1.e0 +1e+0 (which all represent the same |
|
number) +12.E-4 |
|
@end itemize |
|
|
doc-noname |
By default, the number base used for integer number conversion is given |
@cindex execution token of last defined word |
by the contents of a variable named @code{BASE}. Base 10 (decimal) is |
doc-lastxt |
always used for floating-point number conversion. |
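
As a brief sketch of how these rules interact with @code{BASE} (assuming
the session starts in decimal):

@example
123 .               \ single-cell integer
-45 .               \ negative single-cell integer
3465. d.            \ double-cell integer; prints 3465
1e0 f.              \ floating-point number, always converted in base 10
hex 1F decimal .    \ 1F is converted in base 16; prints 31
@end example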
|
|
The previous example can be rewritten using @code{noname} and |
doc-base |
@code{lastxt}: |
doc-hex |
|
doc-decimal |
|
|
@example |
@cindex '-prefix for character strings |
Defer deferred |
@cindex &-prefix for decimal numbers |
noname : ( ... -- ... ) |
@cindex %-prefix for binary numbers |
... ; |
@cindex $-prefix for hexadecimal numbers |
lastxt IS deferred |
Gforth allows you to override the value of @code{BASE} by using a prefix |
@end example |
before the first digit of an (integer) number. Four prefixes are |
|
supported: |
|
|
@code{lastxt} also works when the last word was not defined as |
@itemize @bullet |
@code{noname}. |
@item |
|
@code{&} -- decimal number |
|
@item |
|
@code{%} -- binary number |
|
@item |
|
@code{$} -- hexadecimal number |
|
@item |
|
@code{'} -- base 256 number |
|
@end itemize |
|
|
|
Here are some examples, with the equivalent decimal number shown after |
|
in parentheses: |
|
|
@node Interpretation and Compilation Semantics, , Supplying names, Defining Words |
-$41 (-65), %1001101 (77), %1001.0001 (145 - a double-precision number), |
@subsection Interpretation and Compilation Semantics |
'AB (16706; ascii A is 65, ascii B is 66, number is 65*256 + 66), |
@cindex semantics, interpretation and compilation |
'ab (24930; ascii a is 97, ascii b is 98, number is 97*256 + 98), |
|
&905 (905), $abc (2748), $ABC (2748). |
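
The prefixes can be tried out interactively. The following is a minimal
sketch, assuming that the prefixes behave as described above; the
decimal values in the comments follow from the arithmetic (for example,
@code{$abc} is 10*256 + 11*16 + 12).

@example
$abc .               \ prints 2748
%1001101 .           \ prints 77
-$41 .               \ prints -65
hex &10 decimal .    \ & overrides the hex base; prints 10
@end example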
|
|
@cindex interpretation semantics |
@cindex number conversion - traps for the unwary |
The @dfn{interpretation semantics} of a word are what the text |
Number conversion has a number of traps for the unwary: |
interpreter does when it encounters the word in interpret state. It also |
|
appears in some other contexts, e.g., the execution token returned by |
|
@code{' @var{word}} identifies the interpretation semantics of |
|
@var{word} (in other words, @code{' @var{word} execute} is equivalent to |
|
interpret-state text interpretation of @code{@var{word}}). |
|
|
|
@cindex compilation semantics |
@itemize @bullet |
The @dfn{compilation semantics} of a word are what the text interpreter |
@item |
does when it encounters the word in compile state. It also appears in |
You cannot determine the current number base using the code sequence |
other contexts, e.g., @code{POSTPONE @var{word}} compiles@footnote{In |
@code{BASE @@ .} -- the number base is always 10 in the current number |
standard terminology, ``appends to the current definition''.} the |
base. Instead, use something like @code{BASE @@ DECIMAL DUP . BASE !} |
compilation semantics of @var{word}. |
@item |
|
If the number base is set to a value greater than 14 (for example, |
@cindex execution semantics |
hexadecimal), the number 123E4 is ambiguous; the conversion rules allow |
The standard also talks about @dfn{execution semantics}. They are used |
it to be interpreted as either a single-precision integer or a |
only for defining the interpretation and compilation semantics of many |
floating-point number (Gforth treats it as an integer). The ambiguity |
words. By default, the interpretation semantics of a word are to |
can be resolved by explicitly stating the sign of the mantissa and/or |
@code{execute} its execution semantics, and the compilation semantics of |
exponent: 123E+4 or +123E4 -- if the number base is decimal, no |
a word are to @code{compile,} its execution semantics.@footnote{In |
ambiguity arises; either representation will be treated as a |
standard terminology: The default interpretation semantics are its |
floating-point number. |
execution semantics; the default compilation semantics are to append its |
@item |
execution semantics to the execution semantics of the current |
There is a word @code{bin} but it does @var{not} set the number base! |
definition.} |
It is used to specify file types. |
|
@item |
|
ANS Forth requires the @code{.} of a double-precision number to |
|
be the final character in the string. Allowing the @code{.} to be |
|
anywhere after the first digit is a Gforth extension. |
|
@item |
|
The number conversion process does not check for overflow. |
|
@item |
|
In Gforth, number conversion to floating-point numbers always uses base |
|
10, irrespective of the value of @code{BASE}. In ANS Forth, |
|
conversion to floating-point numbers whilst the value of |
|
@code{BASE} is not 10 is an ambiguous condition. |
|
@end itemize |
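|
The first trap above can be seen in a short session. This is a minimal sketch, assuming an interactive Gforth prompt; the comments describe what each line displays: |
|
@example |
hex                            \ switch the number base to sixteen |
base @@ .                      \ displays 10, i.e. sixteen written in base sixteen |
base @@ decimal dup . base !   \ displays 16 in decimal, then restores the saved base |
decimal                        \ return to base ten for the rest of the session |
@end example |
|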
|
|
@comment TODO expand, make it co-operate with new sections on text interpreter. |
|
|
|
@cindex immediate words |
@node Interpret/Compile states, Literals, Number Conversion, The Text Interpreter |
You can change the compilation semantics into @code{execute}ing the |
@subsection Interpret/Compile states |
execution semantics with |
@cindex Interpret/Compile states |
|
|
doc-immediate |
@comment TODO Intro blah. |
|
|
@cindex compile-only words |
doc-state |
You can remove the interpretation semantics of a word with |
doc-[ |
|
doc-] |
|
|
doc-compile-only |
|
doc-restrict |
|
|
|
Note that ticking (@code{'}) compile-only words gives an error |
@node Literals, Interpreter Directives, Interpret/Compile states, The Text Interpreter |
(``Interpreting a compile-only word''). |
@subsection Literals |
|
@cindex Literals |
|
|
Gforth also allows you to define words with arbitrary combinations of |
@comment TODO Intro blah. |
interpretation and compilation semantics. |
|
|
|
doc-interpret/compile: |
doc-literal |
|
doc-]L |
|
doc-2literal |
|
doc-fliteral |
|
|
This feature was introduced for implementing @code{TO} and @code{S"}. I |
@node Interpreter Directives, ,Literals, The Text Interpreter |
recommend that you do not define such words, as cute as they may be: |
@subsection Interpreter Directives |
they make it hard to get at both parts of the word in some contexts. |
@cindex interpreter directives |
E.g., assume you want to get an execution token for the compilation |
|
part. Instead, define two words, one that embodies the interpretation |
|
part, and one that embodies the compilation part. Once you have done |
|
that, you can define a combined word with @code{interpret/compile:} for |
|
the convenience of your users. |
|
|
|
You also might try to use this feature, like this: |
These words are usually used outside of definitions; for example, to |
|
control which parts of a source file are processed by the text |
|
interpreter. There are only a few ANS Forth Standard words, but Gforth |
|
supplements these with a rich set of immediate control structure words |
|
to compensate for the fact that the non-immediate versions can only be |
|
used in compile state (@pxref{Control Structures}). |
|
|
You might try to use this feature to provide an optimizing |
doc-[IF] |
implementation of the default compilation semantics of a word. For |
doc-[ELSE] |
example, by defining: |
doc-[THEN] |
@example |
doc-[ENDIF] |
:noname |
|
foo bar ; |
|
:noname |
|
POSTPONE foo POSTPONE bar ; |
|
interpret/compile: foobar |
|
@end example |
|
|
|
@noindent |
doc-[IFDEF] |
as an optimizing version of: |
doc-[IFUNDEF] |
|
|
@example |
doc-[?DO] |
: foobar |
doc-[DO] |
foo bar ; |
doc-[FOR] |
@end example |
doc-[LOOP] |
|
doc-[+LOOP] |
|
doc-[NEXT] |
|
|
Unfortunately, this does not work correctly with @code{[compile]}, |
doc-[BEGIN] |
because @code{[compile]} assumes that the compilation semantics of all |
doc-[UNTIL] |
@code{interpret/compile:} words are non-default. I.e., @code{[compile] |
doc-[AGAIN] |
foobar} would compile the compilation semantics for the optimizing |
doc-[WHILE] |
@code{foobar}, whereas it would compile the interpretation semantics for |
doc-[REPEAT] |
the non-optimizing @code{foobar}. |
|
|
|
@cindex state-smart words (are a bad idea) |
@c ------------------------------------------------------------- |
Some people try to use @var{state-smart} words to emulate the feature provided |
@node Tokens for Words, Word Lists, The Text Interpreter, Words |
by @code{interpret/compile:} (words are state-smart if they check |
@section Tokens for Words |
@code{STATE} during execution). E.g., they would try to code |
@cindex tokens for words |
@code{foobar} like this: |
|
|
|
@example |
This chapter describes the creation and use of tokens that represent |
: foobar |
words on the stack (and in data space). |
STATE @@ |
|
IF ( compilation state ) |
|
POSTPONE foo POSTPONE bar |
|
ELSE |
|
foo bar |
|
ENDIF ; immediate |
|
@end example |
|
|
|
Although this works if @code{foobar} is only processed by the text |
Named words have interpretation and compilation semantics. Unnamed words |
interpreter, it does not work in other contexts (like @code{'} or |
just have execution semantics. |
@code{POSTPONE}). E.g., @code{' foobar} will produce an execution token |
|
for a state-smart word, not for the interpretation semantics of the |
|
original @code{foobar}; when you execute this execution token (directly |
|
with @code{EXECUTE} or indirectly through @code{COMPILE,}) in compile |
|
state, the result will not be what you expected (i.e., it will not |
|
perform @code{foo bar}). State-smart words are a bad idea. Simply don't |
|
write them@footnote{For a more detailed discussion of this topic, see |
|
@cite{@code{State}-smartness -- Why it is Evil and How to Exorcise it} by Anton |
|
Ertl; presented at EuroForth '98 and available from |
|
@url{http://www.complang.tuwien.ac.at/papers/}}! |
|
|
|
@cindex defining words with arbitrary semantics combinations |
@comment TODO ?normally interpretation semantics are the execution semantics. |
It is also possible to write defining words that define words with |
@comment this should all be covered in earlier ss |
arbitrary combinations of interpretation and compilation semantics. In |
|
general, they look like this: |
|
|
|
@example |
@cindex execution token |
: def-word |
An @dfn{execution token} represents the execution semantics of an |
create-interpret/compile |
unnamed word. An execution token occupies one cell. As explained in |
@var{code1} |
@ref{Supplying names}, the execution token of the last word |
interpretation> |
defined can be produced with @code{lastxt}. |
@var{code2} |
|
<interpretation |
|
compilation> |
|
@var{code3} |
|
<compilation ; |
|
@end example |
|
|
|
For a @var{word} defined with @code{def-word}, the interpretation |
doc-execute |
semantics are to push the address of the body of @var{word} and perform |
doc-compile, |
@var{code2}, and the compilation semantics are to push the address of |
|
the body of @var{word} and perform @var{code3}. E.g., @code{constant} |
|
can also be defined like this (except that the defined constants don't |
|
behave correctly when @code{[compile]}d): |
|
|
|
@example |
@cindex code field address |
: constant ( n "name" -- ) |
@cindex CFA |
create-interpret/compile |
In Gforth, the abstract data type @emph{execution token} is implemented |
, |
as a code field address (CFA). |
interpretation> ( -- n ) |
@comment TODO note that the standard does not say what it represents.. |
@@ |
@comment and you cannot necessarily compile it in all Forths (eg native |
<interpretation |
@comment compilers?). |
compilation> ( compilation. -- ; run-time. -- n ) |
|
@@ postpone literal |
|
<compilation ; |
|
@end example |
|
|
|
doc-create-interpret/compile |
The interpretation semantics of a named word are also represented by an |
doc-interpretation> |
execution token. You can get it with: |
doc-<interpretation |
|
doc-compilation> |
|
doc-<compilation |
|
|
|
Note that words defined with @code{interpret/compile:} and |
doc-['] |
@code{create-interpret/compile} have an extended header structure that |
doc-' |
differs from other words; however, unless you try to access them with |
|
plain address arithmetic, you should not notice this. Words for |
|
accessing the header structure usually know how to deal with this; e.g., |
|
@code{' word >body} also gives you the body of a word created with |
|
@code{create-interpret/compile}. |
|
|
|
@c ---------------------------------------------------------- |
For literals, you use @code{'} in interpreted code and @code{[']} in |
@node The Text Interpreter, Structures, Defining Words, Words |
compiled code. Gforth's @code{'} and @code{[']} behave somewhat unusually |
@section The Text Interpreter |
by complaining about compile-only words. To get an execution token for a |
@cindex interpreter - outer |
compiling word @var{X}, use @code{COMP' @var{X} drop} or @code{[COMP'] |
@cindex text interpreter |
@var{X} drop}. |
@cindex outer interpreter |
|
|
|
Intro blah. |
@cindex compilation token |
|
The compilation semantics are represented by a @dfn{compilation token} |
|
consisting of two cells: @var{w xt}. The top cell @var{xt} is an |
|
execution token. The compilation semantics represented by the |
|
compilation token can be performed with @code{execute}, which consumes |
|
the whole compilation token, with an additional stack effect determined |
|
by the represented compilation semantics. |
|
|
@comment TODO |
doc-[comp'] |
|
doc-comp' |
|
|
doc->in |
You can compile the compilation semantics with @code{postpone,}. I.e., |
doc-tib |
@code{COMP' @var{word} POSTPONE,} is equivalent to @code{POSTPONE |
doc-#tib |
@var{word}}. |
doc-span |
|
doc-restore-input |
|
doc-save-input |
|
doc-source |
|
doc-source-id |
|
|
|
|
doc-postpone, |
|
|
@menu |
At present, the @var{w} part of a compilation token is an execution |
* Number Conversion:: |
token, and the @var{xt} part represents either @code{execute} or |
* Interpret/Compile states:: |
@code{compile,}. However, don't rely on that knowledge, unless necessary; |
* Literals:: |
we may introduce unusual compilation tokens in the future (e.g., |
* Interpreter Directives:: |
compilation tokens representing the compilation semantics of literals). |
@end menu |
|
|
|
@comment TODO |
@cindex name token |
|
@cindex name field address |
|
@cindex NFA |
|
Named words are also represented by the @dfn{name token} (@var{nt}). The abstract |
|
data type @emph{name token} is implemented as a name field address (NFA). |
|
|
The text interpreter works on input one line at a time. Starting at |
doc-find-name |
the beginning of the line, it skips leading spaces (called |
doc-name>int |
"delimiters") then parses a string (a sequence of non-space |
doc-name?int |
characters) until it either reaches a space character or it |
doc-name>comp |
reaches the end of the line. Having parsed a string, it then makes two |
doc-name>string |
attempts to do something with it: |
|
|
|
* It looks the string up in a dictionary of definitions. If the string |
@c ------------------------------------------------------------- |
is found in the dictionary, the string names a "definition" (also |
@node Word Lists, Environmental Queries, Tokens for Words, Words |
known as a "word") and the dictionary search will return an |
@section Word Lists |
"Execution token" (xt) for the definition and some flags that show |
@cindex word lists |
when the definition can be used legally. If the definition can be |
@cindex name dictionary |
legally executed in "Interpret" mode then the text interpreter will |
|
use the xt to execute it, otherwise it will issue an error |
|
message. The dictionary is described in more detail in <blah>. |
|
|
|
* If the string is not found in the dictionary, the text interpreter |
@cindex wid |
attempts to treat it as a number in the current radix (base 10 after |
All definitions other than those created by @code{:noname} have an entry |
initial startup). If the string represents a legal number in the |
in the name dictionary. The name dictionary is fragmented into a number |
current radix, the number is pushed onto the appropriate parameter |
of parts, called @var{word lists}. A word list is identified by a |
stack. Stacks are discussed in more detail in <blah>. Number |
cell-sized word list identifier (@var{wid}) in much the same way as a |
conversion is described in more detail in <section about +, - |
file is identified by a file handle. The numerical value of the wid has |
numbers and different number formats>. |
no (portable) meaning, and might change from session to session. |
|
|
If both of these attempts fail, the remainder of the input line is |
@cindex compilation word list |
discarded and the text interpreter issues an error message. If one of |
At any one time, a single word list is defined as the word list to which |
these attempts succeeds, the text interpreter repeats the parsing |
all new definitions will be added -- this is called the @var{compilation |
process until the end of the line has been reached. At this point, |
word list}. When Gforth is started, the compilation word list is the |
it prints the status message " ok" and waits for more input. |
word list called @code{FORTH-WORDLIST}. |
|
|
There are two important things to note about the behaviour of the text |
@cindex search order stack |
interpreter: |
Forth maintains a stack of word lists, representing the @var{search |
|
order}. When the name dictionary is searched (for example, when |
|
attempting to find a word's execution token during compilation), only |
|
those word lists that are currently in the search order are |
|
searched. The most recently-defined word in the word list at the top of |
|
the word list stack is searched first, and the search proceeds until |
|
either the word is located or the oldest definition in the word list at |
|
the bottom of the stack is reached. Definitions of the word may exist in |
|
more than one word list; the search order determines which version will |
|
be found. |
|
|
* it processes each input string to completion before parsing |
The ANS Forth Standard ``Search order'' word set is intended to provide a |
additional characters from the input line. |
set of low-level tools that allow various different schemes to be |
|
implemented. Gforth provides @code{vocabulary}, a traditional Forth |
|
word. @file{compat/vocabulary.fs} provides an implementation in ANS |
|
Standard Forth. |
|
|
* it keeps track of its position in the input line using a variable |
TODO: locals section refers to here, saying that every word list (aka |
(called >IN, pronounced "to-in"). The value of >IN can be modified |
vocabulary) has its own methods for searching etc. Need to document that. |
by the execution of definitions in the input line. This means that |
|
definitions can "trick" the text interpreter either into skipping |
|
sections of the input line or into parsing a section of the |
|
input line more than once. |
|
|
|
|
doc-forth-wordlist |
|
doc-definitions |
|
doc-get-current |
|
doc-set-current |
|
|
@node Number Conversion, Interpret/Compile states, The Text Interpreter, The Text Interpreter |
@comment TODO when a defn (like set-order) is instanced twice, the second instance gets documented. |
@subsection Number Conversion |
@comment In general that might be fine, but in this example (search.fs) the second instance is an |
@cindex Number conversion |
@comment alias, so it would not naturally have documentation |
@cindex double-cell numbers, input format |
@comment .. the fix to that is to add a specific prefix, like the object-orientation stuff does. |
@cindex input format for double-cell numbers |
|
@cindex single-cell numbers, input format |
|
@cindex input format for single-cell numbers |
|
@cindex floating-point numbers, input format |
|
@cindex input format for floating-point numbers |
|
|
|
If the text interpreter fails to find a particular string in the name |
doc-get-order |
dictionary, it attempts to convert it to a number using a set of rules. |
doc-set-order |
|
doc-wordlist |
|
doc-also |
|
doc-forth |
|
doc-only |
|
doc-order |
|
doc-previous |
|
|
Let <digit> represent any character that is a legal digit in the current |
doc-find |
number base (for example, 0-9 when the number base is decimal or 0-9, A-F |
doc-search-wordlist |
when the number base is hexadecimal). |
|
|
|
Let <decimal digit> represent any character in the range 0-9. |
doc-words |
|
doc-vlist |
|
|
@comment TODO need to extend the next defn to support fp format |
doc-mappedwordlist |
Let @{+ | -@} represent the optional presence of either a @code{+} or |
doc-root |
@code{-} character. |
doc-vocabulary |
|
doc-seal |
|
doc-vocs |
|
doc-current |
|
doc-context |
|
|
Let * represent any number of instances of the previous character |
@menu |
(including none). |
* Why use word lists?:: |
|
* Word list examples:: |
|
@end menu |
|
|
Let any other character represent itself. |
@node Why use word lists?, Word list examples, Word Lists, Word Lists |
|
@subsection Why use word lists? |
|
@cindex word lists - why use them? |
|
|
Now, the conversion rules are: |
There are several reasons for using multiple word lists: |
|
|
@itemize @bullet |
@itemize @bullet |
@item |
@item |
A string of the form <digit><digit>* is treated as a single-precision |
To improve compilation speed by reducing the number of name dictionary |
(CELL-sized) positive integer. Examples are 0 123 6784532 32343212343456 42 |
entries that must be searched. This is achieved by creating a new |
@item |
word list that contains all of the definitions that are used in the |
A string of the form -<digit><digit>* is treated as a single-precision |
definition of a Forth system but which would not usually be used by |
(CELL-sized) negative integer, and is represented using 2's-complement |
programs running on that system. That word list would be on the search |
arithmetic. Examples are -45 -5681 -0 |
list when the Forth system was compiled but would be removed from the |
@item |
search list for normal operation. This can be a useful technique for |
A string of the form <digit><digit>*.<digit>* is treated as a double-precision |
low-performance systems (for example, 8-bit processors in embedded |
(double-CELL-sized) positive integer. Examples are 3465. 3.465 34.65 |
systems) but is unlikely to be necessary in high-performance desktop |
(and note that these all represent the same number). |
systems. |
@item |
@item |
A string of the form -<digit><digit>*.<digit>* is treated as a |
To prevent a set of words from being used outside the context in which |
double-precision (double-CELL-sized) negative integer, and is |
they are valid. Two classic examples of this are an integrated editor |
represented using 2's-complement arithmetic. Examples are -3465. -3.465 |
(all of the edit commands are defined in a separate word list; the |
-34.65 (and note that these all represent the same number). |
search order is set to the editor word list when the editor is invoked; |
|
the old search order is restored when the editor is terminated) and an |
|
integrated assembler (the op-codes for the machine are defined in a |
|
separate word list which is used when a @code{CODE} word is defined). |
@item |
@item |
A string of the form @{+ | -@}<decimal digit>@{.@}<decimal digit>*@{e | E@}@{+ |
To prevent a name-space clash between multiple definitions with the same |
| -@}<decimal digit><decimal digit>* is treated as a floating-point |
name. For example, when building a cross-compiler you might have a word |
number. Examples are 1e0 1.e 1.e0 +1e+0 (which all represent the same |
@code{IF} that generates conditional code for your target system. By |
number) +12.E-4 |
placing this definition in a different word list you can control whether |
|
the host system's @code{IF} or the target system's @code{IF} gets used in |
|
any particular context by controlling the order of the word lists on the |
|
search order stack. |
@end itemize |
@end itemize |
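|
The conversion rules above can be exercised at the prompt. This is a sketch only, assuming that @code{BASE} is decimal and that the floating-point words are available; @code{.}, @code{d.} and @code{f.} display a single-cell, a double-cell and a floating-point number respectively: |
|
@example |
decimal        \ make sure integers are read in base ten |
123 .          \ plain digit string: single-cell integer |
-45 .          \ leading minus sign: negative single-cell integer |
3465. d.       \ trailing dot: double-cell integer |
34.65 d.       \ embedded dot (Gforth extension): the same double-cell integer |
1e0 f.         \ exponent marker: floating-point number |
@end example |
|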
|
|
By default, the number base used for integer number conversion is given |
@node Word list examples, ,Why use word lists?, Word Lists |
by the contents of a variable named @code{BASE}. Base 10 (decimal) is |
@subsection Word list examples |
always used for floating-point number conversion. |
@cindex word lists - examples |
|
|
doc-base |
|
doc-hex |
|
doc-decimal |
|
|
|
@cindex '-prefix for character strings |
Here is an example of creating and using a new wordlist using ANS |
@cindex &-prefix for decimal numbers |
Forth Standard words: |
@cindex %-prefix for binary numbers |
|
@cindex $-prefix for hexadecimal numbers |
|
Gforth allows you to override the value of @code{BASE} by using a prefix |
|
before the first digit of an (integer) number. Four prefixes are |
|
supported: |
|
|
|
@itemize @bullet |
@example |
@item |
wordlist constant my-new-words-wordlist |
@code{&} -- decimal number |
: my-new-words get-order nip my-new-words-wordlist swap set-order ; |
@item |
|
@code{%} -- binary number |
|
@item |
|
@code{$} -- hexadecimal number |
|
@item |
|
@code{'} -- base 256 number |
|
@end itemize |
|
|
|
Here are some examples, with the equivalent decimal number shown after |
\ add it to the search order |
in braces: |
also my-new-words |
|
|
-$41 (-65) %1001101 (77) %1001.0001 (145 - a double-precision number) |
\ alternatively, add it to the search order and make it |
'AB (16706; ascii A is 65, ascii B is 66, number is 65*256 + 66) |
\ the compilation word list |
'ab (24930; ascii a is 97, ascii b is 98, number is 97*256 + 98) |
also my-new-words definitions |
&905 (905) $abc (2748) $ABC (2748) |
\ type "order" to see the problem |
|
@end example |
|
|
@cindex Number conversion - traps for the unwary |
The problem with this example is that @code{order} has no way to |
Number conversion has a number of traps for the unwary: |
associate the name @code{my-new-words} with the wid of the word list (in |
|
Gforth, @code{order} and @code{vocs} will display @code{???} for a wid |
|
that has no associated name). There is no Standard way of associating a |
|
name with a wid. |
|
|
@itemize @bullet |
In Gforth, this example can be re-coded using @code{vocabulary}, which |
@item |
associates a name with a wid: |
You cannot determine the current number base using the code sequence |
|
@code{BASE @@ .} -- the number base is always 10 in the current number |
|
base. Instead, use something like @code{BASE @@ DECIMAL DUP . BASE !} |
|
@item |
|
If the number base is set to a value greater than 14 (for example, |
|
hexadecimal), the number 123E4 is ambiguous; the conversion rules allow |
|
it to be interpreted as either a single-precision integer or a |
|
floating-point number (Gforth treats it as an integer). The ambiguity |
|
can be resolved by explicitly stating the sign of the mantissa and/or |
|
exponent: 123E+4 or +123E4 -- if the number base is decimal, no |
|
ambiguity arises; either representation will be treated as a |
|
floating-point number. |
|
@item |
|
There is a word @code{bin} but it does @var{not} set the number base! |
|
It is used to specify file types. |
|
@item |
|
ANS Forth Standard requires the @code{.} of a double-precision number to |
|
be the final character in the string. Allowing the @code{.} to be |
|
anywhere after the first digit is a Gforth extension. |
|
@item |
|
The number conversion process does not check for overflow. |
|
@item |
|
In Gforth, number conversion to floating-point numbers always uses base |
|
10, irrespective of the value of @code{BASE}. For the ANS Forth |
|
Standard, conversion to floating-point numbers whilst the value of |
|
@code{BASE} is not 10 is an ambiguous condition. |
|
@end itemize |
|
|
|
|
@example |
|
vocabulary my-new-words |
|
|
@node Interpret/Compile states, Literals, Number Conversion, The Text Interpreter |
\ add it to the search order |
@subsection Interpret/Compile states |
my-new-words |
@cindex Interpret/Compile states |
|
|
|
@comment TODO |
\ alternatively, add it to the search order and make it |
Intro blah. |
\ the compilation word list |
|
my-new-words definitions |
|
\ type "order" to see that the problem is solved |
|
@end example |
|
|
doc-state |
@c ------------------------------------------------------------- |
doc-[ |
@node Environmental Queries, Files, Word Lists, Words |
doc-] |
@section Environmental Queries |
|
@cindex environmental queries |
|
@comment TODO more index entries |
@node Literals, Interpreter Directives, Interpret/Compile states, The Text Interpreter |
|
@subsection Literals |
|
@cindex Literals |
|
|
|
@comment TODO |
|
Intro blah. |
|
|
|
doc-literal |
|
doc-]L |
|
doc-2literal |
|
doc-fliteral |
|
|
|
@node Interpreter Directives, ,Literals, The Text Interpreter |
|
@subsection Interpreter Directives |
|
@cindex Interpreter Directives |
|
|
|
These words are usually used outside of definitions; for example, to |
|
control which parts of a source file are processed by the text |
|
interpreter. There are only a few ANS Forth Standard words, but Gforth |
|
supplements these with a rich set of immediate control structure words |
|
to compensate for the fact that the non-immediate versions can only be |
|
used in compile state (@pxref{Control Structures}). |
|
|
|
doc-[IF] |
|
doc-[ELSE] |
|
doc-[THEN] |
|
doc-[ENDIF] |
|
|
|
doc-[IFDEF] |
|
doc-[IFUNDEF] |
|
|
|
doc-[?DO] |
|
doc-[DO] |
|
doc-[FOR] |
|
doc-[LOOP] |
|
doc-[+LOOP] |
|
doc-[NEXT] |
|
|
|
doc-[BEGIN] |
ANS Forth introduced the idea of ``environmental queries'' as a way |
doc-[UNTIL] |
for a program running on a system to determine certain characteristics of the system. |
doc-[AGAIN] |
The Standard specifies a number of strings that might be recognised by a system. |
doc-[WHILE] |
|
doc-[REPEAT] |
|
|
|
|
The Standard requires that the name space used for environmental queries |
|
be distinct from the name space used for definitions. |
|
|
@c ---------------------------------------------------------- |
Typically, environmental queries are supported by creating a set of |
@node Structures, Object-oriented Forth, The Text Interpreter, Words |
definitions in a word list that is @var{only} used during environmental |
@section Structures |
queries; that is what Gforth does. There is no Standard way of adding |
@cindex structures |
definitions to the set of recognised environmental queries, but any |
@cindex records |
implementation that supports the loading of optional word sets must have |
|
some mechanism for doing this (after loading the word set, the |
|
associated environmental query string must return @code{true}). In |
|
Gforth, the word list used to honour environmental queries can be |
|
manipulated just like any other word list. |
|
|
This section presents the structure package that comes with Gforth. A |
doc-environment? |
version of the package implemented in ANS Standard Forth is available in |
doc-environment-wordlist |
@file{compat/struct.fs}. This package was inspired by a posting on |
|
comp.lang.forth in 1989 (unfortunately I don't remember by whom; |
|
possibly John Hayes). A version of this section has been published in |
|
???. Marcel Hendrix provided helpful comments. |
|
|
|
@menu |
doc-gforth |
* Why explicit structure support?:: |
doc-os-class |
* Structure Usage:: |
|
* Structure Naming Convention:: |
|
* Structure Implementation:: |
|
* Structure Glossary:: |
|
@end menu |
|
|
|
@node Why explicit structure support?, Structure Usage, Structures, Structures |
Note that, whilst the documentation for (e.g.) @code{gforth} shows it |
@subsection Why explicit structure support? |
returning two items on the stack, querying it using @code{environment?} |
|
will return an additional item; the @code{true} flag that shows that the |
|
string was recognised. |
|
|
@cindex address arithmetic for structures |
@comment TODO Document the standard strings or note where they are documented herein |
@cindex structures using address arithmetic |
|
If we want to use a structure containing several fields, we could simply |
|
reserve memory for it, and access the fields using address arithmetic |
|
(@pxref{Address arithmetic}). As an example, consider a structure with |
|
the following fields |
|
|
|
@table @code |
Here are some examples of using environmental queries: |
@item a |
|
is a float |
|
@item b |
|
is a cell |
|
@item c |
|
is a float |
|
@end table |
|
|
|
Given the (float-aligned) base address of the structure we get the |
@example |
address of the field |
s" address-unit-bits" environment? 0= |
|
[IF] |
|
cr .( environmental attribute address-unit-bits unknown... ) cr |
|
[THEN] |
|
|
@table @code |
s" block" environment? [IF] DROP include block.fs [THEN] |
@item a |
|
without doing anything further. |
|
@item b |
|
with @code{float+} |
|
@item c |
|
with @code{float+ cell+ faligned} |
|
@end table |
|
|
|
It is easy to see that this can become quite tiring. |
s" gforth" environment? [IF] 2DROP include compat/vocabulary.fs [THEN] |
|
|
Moreover, it is not very readable, because seeing a |
s" gforth" environment? [IF] .( Gforth version ) TYPE |
@code{cell+} tells us neither which kind of structure is |
[ELSE] .( Not Gforth..) [THEN] |
accessed nor what field is accessed; we have to somehow infer the kind |
@end example |
of structure, and then look up in the documentation, which field of |
|
that structure corresponds to that offset. |
|
|
|
Finally, this kind of address arithmetic also causes maintenance |
|
troubles: If you add or delete a field somewhere in the middle of the |
|
structure, you have to find and change all computations for the fields |
|
afterwards. |
|
|
|
So, instead of using @code{cell+} and friends directly, how |
Here is an example of adding a definition to the environment word list: |
about storing the offsets in constants: |
|
|
|
@example |
@example |
0 constant a-offset |
get-current environment-wordlist set-current |
0 float+ constant b-offset |
true constant block |
0 float+ cell+ faligned constant c-offset |
true constant block-ext |
|
set-current |
@end example |
@end example |
|
|
Now we can get the address of field @code{x} with @code{x-offset |
You can see what definitions are in the environment word list like this: |
+}. This is much better in all respects. Of course, you still |
|
have to change all later offset definitions if you add a field. You can |
|
fix this by declaring the offsets in the following way: |
|
|
|
@example |
@example |
0 constant a-offset |
get-order 1+ environment-wordlist swap set-order words previous |
a-offset float+ constant b-offset |
|
b-offset cell+ faligned constant c-offset |
|
@end example |
@end example |
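|
A query can also make use of the value that @code{environment?} returns. The following sketch assumes that the standard query string @code{address-unit-bits} yields a single cell when it is recognised; the message texts are illustrative only: |
|
@example |
s" address-unit-bits" environment?      \ leaves ( n true ) or ( false ) |
[IF]                                    \ recognised: n is still on the stack |
  cr .( bits per address unit: ) .      \ display the returned value |
[ELSE]                                  \ not recognised: the stack holds nothing more |
  cr .( address-unit-bits is unknown )  \ report the missing attribute |
[THEN]                                  \ end of the conditional |
@end example |
|
Consuming the returned value in the recognised branch keeps the stack balanced whichever branch is taken. |
|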
|
|
Since we always use the offsets with @code{+}, we could use a defining |
|
word @code{cfield} that includes the @code{+} in the action of the |
|
defined word: |
|
|
|
@example |
@c ------------------------------------------------------------- |
: cfield ( n "name" -- ) |
@node Files, Blocks, Environmental Queries, Words |
create , |
@section Files |
does> ( name execution: addr1 -- addr2 ) |
|
@@ + ; |
|
|
|
0 cfield a |
Gforth provides facilities for accessing files that are stored in the |
0 a float+ cfield b |
host operating system's file-system. Files that are processed by Gforth |
0 b cell+ faligned cfield c |
can be divided into two categories: |
@end example |
|
|
|
Instead of @code{x-offset +}, we now simply write @code{x}. |
@itemize @bullet |
|
@item |
|
Files that are processed by the Text Interpreter (@var{Forth source files}). |
|
@item |
|
Files that are processed by some other program (@var{general files}). |
|
@end itemize |
|
|
The structure field words now can be used quite nicely. However, |
@menu |
their definition is still a bit cumbersome: We have to repeat the |
* Forth source files:: |
name, the information about size and alignment is distributed before |
* General files:: |
and after the field definitions etc. The structure package presented |
* Search Paths:: |
here addresses these problems. |
* Forth Search Paths:: |
|
* General Search Paths:: |
|
@end menu |
|
|
@node Structure Usage, Structure Naming Convention, Why explicit structure support?, Structures |
|
@subsection Structure Usage |
|
@cindex structure usage |
|
|
|
@cindex @code{field} usage |
@c ------------------------------------------------------------- |
@cindex @code{struct} usage |
@node Forth source files, General files, Files, Files |
@cindex @code{end-struct} usage |
@subsection Forth source files |
You can define a structure for a (data-less) linked list with: |
@cindex including files |
@example |
@cindex Forth source files |
struct |
|
cell% field list-next |
|
end-struct list% |
|
@end example |
|
|
|
With the address of the list node on the stack, you can compute the |
The simplest way to interpret the contents of a file is to use one of |
address of the field that contains the address of the next node with |
these two formats: |
@code{list-next}. E.g., you can determine the length of a list |
|
with: |
|
|
|
@example |
@example |
: list-length ( list -- n ) |
include mysource.fs |
\ "list" is a pointer to the first element of a linked list |
s" mysource.fs" included |
\ "n" is the length of the list |
|
0 begin ( list1 n1 ) |
|
over |
|
while ( list1 n1 ) |
|
1+ swap list-next @@ swap |
|
repeat |
|
nip ; |
|
@end example |
@end example |
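The following sketch builds a two-node list by hand and measures it. It
repeats the @code{list%} and @code{list-length} definitions so that it
can be loaded on its own, and it uses @code{%allot} from the structure
package to reserve dictionary memory for each node. The names
@code{node-a} and @code{node-b} are hypothetical:

@example
struct
    cell% field list-next
end-struct list%

: list-length ( list -- n )
    0 begin over while 1+ swap list-next @@ swap repeat nip ;

list% %allot constant node-b   0      node-b list-next !  \ last node
list% %allot constant node-a   node-b node-a list-next !  \ first node

node-a list-length .   \ prints 2
0 list-length .        \ prints 0 (empty list)
@end example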
|
|
You can reserve memory for a list node in the dictionary with |
Sometimes you want to include a file only if it is not included already |
@code{list% %allot}, which leaves the address of the list node on the |
(by, say, another source file). In that case, you can use one of these |
stack. For the equivalent allocation on the heap you can use @code{list% |
formats: |
%alloc} (or, for an @code{allocate}-like stack effect (i.e., with ior), |
|
use @code{list% %allocate}). You can get the size of a list |
|
node with @code{list% %size} and its alignment with @code{list% |
|
%alignment}. |
|
|
|
Note that in ANS Forth the body of a @code{create}d word is |
|
@code{aligned} but not necessarily @code{faligned}; |
|
therefore, if you do a: |
|
@example |
@example |
create @emph{name} foo% %allot |
require mysource.fs |
|
needs mysource.fs |
|
s" mysource.fs" required |
@end example |
@end example |
|
|
@noindent |
@cindex stack effect of included files |
then the memory allotted for @code{foo%} is |
@cindex including files, stack effect |
guaranteed to start at the body of @code{@emph{name}} only if |
I recommend that you write your source files such that interpreting them |
@code{foo%} contains only character, cell and double fields. |
does not change the stack. This allows using these files with |
|
@code{required} and friends without complications. For example: |
|
|
@cindex structures containing structures |
|
You can include a structure @code{foo%} as a field of |
|
another structure, like this: |
|
@example |
@example |
struct |
1 require foo.fs drop |
... |
|
foo% field ... |
|
... |
|
end-struct ... |
|
@end example |
@end example |
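As a concrete illustration of one structure nested inside another, here
is a small sketch. All of the names (@code{point%}, @code{rect%} and
their fields) are hypothetical and not part of Gforth:

@example
struct
    cell% field point-x
    cell% field point-y
end-struct point%

struct
    point% field rect-origin   \ a whole point% as a field
    point% field rect-corner
end-struct rect%

rect% %allot constant my-rect
10 my-rect rect-corner point-y !   \ field words simply chain
my-rect rect-corner point-y @@ .    \ prints 10
@end example

Because each field word only adds an offset to the address, the words
for the outer and the inner field can be applied one after the other.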
|
|
@cindex structure extension |
|
@cindex extended records |
|
Instead of starting with an empty structure, you can extend an |
|
existing structure. E.g., a plain linked list without data, as defined |
|
above, is hardly useful; you can extend it to a linked list of integers, |
|
like this:@footnote{This feature is also known as @emph{extended |
|
records}. It is the main innovation in the Oberon language; in other |
|
words, adding this feature to Modula-2 led Wirth to create a new |
|
language, write a new compiler etc. Adding this feature to Forth just |
|
required a few lines of code.} |
|
|
|
@example |
doc-include-file |
list% |
doc-included |
cell% field intlist-int |
doc-include |
end-struct intlist% |
@comment TODO describe what happens on error. Describes how the require |
@end example |
@comment stuff works and describe how to clear/reset the history (eg |
|
@comment for debug). Might want to include that in the MARKER example. |
|
doc-required |
|
doc-require |
|
doc-needs |
|
|
@code{intlist%} is a structure with two fields: |
A definition in ANS Forth for @code{required} is provided in |
@code{list-next} and @code{intlist-int}. |
@file{compat/required.fs}. |
|
|
@cindex structures containing arrays |
@c ------------------------------------------------------------- |
You can specify an array type containing @emph{n} elements of |
@node General files, Search Paths, Forth source files, Files |
type @code{foo%} like this: |
@subsection General files |
|
@cindex general files |
|
@cindex file-handling |
|
|
@example |
Files are opened/created by name and type. The following types are |
foo% @emph{n} * |
recognised: |
@end example |
|
|
|
You can use this array type in any place where you can use a normal |
doc-r/o |
type, e.g., when defining a @code{field}, or with |
doc-r/w |
@code{%allot}. |
doc-w/o |
|
doc-bin |
|
|
@cindex first field optimization |
When a file is opened or created, the operation returns a file |
The first field is at the base address of a structure and the word |
identifier, @var{wfileid}, that is used for all other file commands. All file |
for this field (e.g., @code{list-next}) actually does not change |
commands also return a status value, @var{wior}, that is 0 for a |
the address on the stack. You may be tempted to leave it out in the |
successful operation and an implementation-defined non-zero value in the |
interest of run-time and space efficiency. This is not necessary, |
case of an error. |
because the structure package optimizes this case and compiling such |
|
words does not generate any code. So, in the interest of readability |
|
and maintainability you should include the word for the field when |
|
accessing the field. |
|
|
|
@node Structure Naming Convention, Structure Implementation, Structure Usage, Structures |
doc-open-file |
@subsection Structure Naming Convention |
doc-create-file |
@cindex structure naming conventions |
|
|
|
The field names that come to (my) mind are often quite generic, and, |
doc-close-file |
if used, would cause frequent name clashes. E.g., many structures |
doc-delete-file |
probably contain a @code{counter} field. The structure names |
doc-rename-file |
that come to (my) mind are often also the logical choice for the names |
doc-read-file |
of words that create such a structure. |
doc-read-line |
|
doc-write-file |
|
doc-write-line |
|
doc-emit-file |
|
doc-flush-file |
|
|
Therefore, I have adopted the following naming conventions: |
doc-file-status |
|
doc-file-position |
|
doc-reposition-file |
|
doc-file-size |
|
doc-resize-file |
|
|
@itemize @bullet |
@c --------------------------------------------------------- |
@cindex field naming convention |
@node Search Paths, Forth Search Paths, General files, Files |
@item |
@subsection Search Paths |
The names of fields are of the form |
@cindex path for @code{included} |
@code{@emph{struct}-@emph{field}}, where |
@cindex file search path |
@code{@emph{struct}} is the basic name of the structure, and |
@cindex @code{include} search path |
@code{@emph{field}} is the basic name of the field. You can |
@cindex search path for files |
think of field words as converting the (address of the) |
|
structure into the (address of the) field. |
|
|
|
@cindex structure naming convention |
@comment what uses these search paths.. just include and friends? |
@item |
If you specify an absolute filename (i.e., a filename starting with |
The names of structures are of the form |
@file{/} or @file{~}, or with @file{:} in the second position (as in |
@code{@emph{struct}%}, where |
@samp{C:...})) for @code{included} and friends, that file is included |
@code{@emph{struct}} is the basic name of the structure. |
just as you would expect. |
@end itemize |
|
|
|
This naming convention does not work that well for fields of extended |
For relative filenames, Gforth uses a search path similar to Forth's |
structures; e.g., the integer list structure has a field |
search order (@pxref{Word Lists}). It tries to find the given filename |
@code{intlist-int}, but has @code{list-next}, not |
in the directories present in the path, and includes the first one it |
@code{intlist-next}. |
finds. There are separate search paths for Forth source files and |
|
general files. |
|
|
@node Structure Implementation, Structure Glossary, Structure Naming Convention, Structures |
If the search path contains the directory @file{.} (as it should), this |
@subsection Structure Implementation |
refers to the directory that the present file was @code{included} |
@cindex structure implementation |
from. This allows files to include other files relative to their own |
@cindex implementation of structures |
position (irrespective of the current working directory or the absolute |
|
position). This feature is essential for libraries consisting of |
|
several files, where a file may include other files from the library. |
|
It corresponds to @code{#include "..."} in C. If the current input |
|
source is not a file, @file{.} refers to the directory of the innermost |
|
file being included, or, if there is no file being included, to the |
|
current working directory. |
|
|
The central idea in the implementation is to pass the data about the |
Use @file{~+} to refer to the current working directory (as in |
structure being built on the stack, not in some global |
@code{bash}). |
variable. Everything else falls into place naturally once this design |
|
decision is made. |
|
|
|
The type description on the stack is of the form @emph{align |
If the filename starts with @file{./}, the search path is not searched |
size}. Keeping the size on the top-of-stack makes dealing with arrays |
(just as with absolute filenames), and the @file{.} has the same meaning |
very simple. |
as described above. |
|
|
@code{field} is a defining word that uses @code{Create} |
@c --------------------------------------------------------- |
and @code{DOES>}. The body of the field contains the offset |
@node Forth Search Paths, General Search Paths, Search Paths, Files |
of the field, and the normal @code{DOES>} action is simply: |
@subsubsection Forth Search Paths |
|
@cindex search path control - forth |
|
|
|
The search path is initialized when you start Gforth (@pxref{Invoking |
|
Gforth}). You can display it and change it using these words: |
|
|
|
doc-.fpath |
|
doc-fpath+ |
|
doc-fpath= |
|
doc-open-fpath-file |
|
|
|
Here is an example of using @code{fpath} and @code{require}: |
|
|
@example |
@example |
@@ + |
fpath= /usr/lib/forth/|./ |
|
require timer.fs |
@end example |
@end example |
|
|
@noindent |
@c --------------------------------------------------------- |
i.e., add the offset to the address, giving the stack effect |
@node General Search Paths, , Forth Search Paths, Files |
@var{addr1 -- addr2} for a field. |
@subsubsection General Search Paths |
|
@cindex search path control - for user applications |
|
|
@cindex first field optimization, implementation |
Your application may need to search for files in several directories, like |
This simple structure is slightly complicated by the optimization |
@code{included} does. To facilitate this, Gforth allows you to define |
for fields with offset 0, which requires a different |
and use your own search paths, by providing generic equivalents of the |
@code{DOES>}-part (because we cannot rely on there being |
Forth search path words: |
something on the stack if such a field is invoked during |
|
compilation). Therefore, we put the different @code{DOES>}-parts |
|
in separate words, and decide which one to invoke based on the |
|
offset. For a zero offset, the field is basically a noop; it is |
|
immediate, and therefore no code is generated when it is compiled. |
|
|
|
@node Structure Glossary, , Structure Implementation, Structures |
doc-.path |
@subsection Structure Glossary |
doc-path+ |
@cindex structure glossary |
doc-path= |
|
doc-open-path-file |
|
|
doc-%align |
Here's an example of creating a search path: |
doc-%alignment |
|
doc-%alloc |
@example |
doc-%allocate |
\ Make a buffer for the path: |
doc-%allot |
create mypath 100 chars , \ maximum length (is checked) |
doc-cell% |
0 , \ real len |
doc-char% |
100 chars allot \ space for path |
doc-dfloat% |
@end example |
doc-double% |
|
doc-end-struct |
|
doc-field |
|
doc-float% |
|
doc-nalign |
|
doc-sfloat% |
|
doc-%size |
|
doc-struct |
|
|
|
@c ------------------------------------------------------------- |
@c ------------------------------------------------------------- |
@node Object-oriented Forth, Tokens for Words, Structures, Words |
@node Blocks, Other I/O, Files, Words |
@section Object-oriented Forth |
@section Blocks |
|
|
Gforth comes with three packages for object-oriented programming: |
This section describes how to use block files within Gforth. |
@file{objects.fs}, @file{oof.fs}, and @file{mini-oof.fs}; none of them |
|
is preloaded, so you have to @code{include} them before use. The most |
Block files are a traditional means of data and source storage in |
important differences between these packages (and others) are discussed |
Forth. In the past they were very important on resource-starved computers |
in @ref{Comparison with other object models}. All packages are written |
without an operating system. Gforth does not encourage the use of blocks for |
in ANS Forth and can be used with any other ANS Forth. |
source, and provides blocks only for backward compatibility. The ANS |
|
standard requires blocks to be available when files are. |
|
|
|
@comment TODO what about errors on open-blocks? |
|
doc-open-blocks |
|
doc-use |
|
doc-scr |
|
doc-blk |
|
doc-get-block-fid |
|
doc-block-position |
|
doc-update |
|
doc-save-buffers |
|
doc-save-buffer |
|
doc-empty-buffers |
|
doc-empty-buffer |
|
doc-flush |
|
doc-get-buffer |
|
doc---block-block |
|
doc-buffer |
|
doc-updated? |
|
doc-list |
|
doc-load |
|
doc-thru |
|
doc-+load |
|
doc-+thru |
|
doc---block---> |
|
doc-block-included |
|
|
|
@c ------------------------------------------------------------- |
|
@node Other I/O, Programming Tools, Blocks, Words |
|
@section Other I/O |
|
@comment TODO more index entries |
|
|
@menu |
@menu |
* Why object-oriented programming?:: |
* Simple numeric output:: Predefined formats |
* Object-Oriented Terminology:: |
* Formatted numeric output:: Formatted (pictured) output |
* Objects:: |
* String Formats:: How Forth stores strings in memory |
* OOF:: |
* Displaying characters and strings:: Other stuff |
* Mini-OOF:: |
* Input:: Input |
* Comparison with other object models:: |
|
@end menu |
@end menu |
|
|
|
@node Simple numeric output, Formatted numeric output, Other I/O, Other I/O |
|
@subsection Simple numeric output |
|
@cindex simple numeric output |
|
@comment TODO more index entries |
|
|
@node Why object-oriented programming?, Object-Oriented Terminology, , Object-oriented Forth |
The simplest output functions are those that display numbers from the |
@subsubsection Why object-oriented programming? |
data or floating-point stacks. Floating-point output is always displayed |
@cindex object-oriented programming motivation |
using base 10. Numbers displayed from the data stack use the value stored |
@cindex motivation for object-oriented programming |
in @code{base}. |
|
|
Often we have to deal with several data structures (@emph{objects}), |
doc-. |
that have to be treated similarly in some respects, but differently in |
doc-dec. |
others. Graphical objects are the textbook example: circles, triangles, |
doc-hex. |
dinosaurs, icons, and others, and we may want to add more during program |
doc-u. |
development. We want to apply some operations to any graphical object, |
doc-.r |
e.g., @code{draw} for displaying it on the screen. However, @code{draw} |
doc-u.r |
has to do something different for every kind of object. |
doc-d. |
@comment TODO add some other operations eg perimeter, area |
doc-ud. |
@comment and tie in to concrete examples later.. |
doc-d.r |
|
doc-ud.r |
|
doc-f. |
|
doc-fe. |
|
doc-fs. |
|
|
We could implement @code{draw} as a big @code{CASE} |
Examples of printing the number 1234.5678E23 in the different floating-point output |
control structure that executes the appropriate code depending on the |
formats are shown below: |
kind of object to be drawn. This would not be very elegant, and, |
|
moreover, we would have to change @code{draw} every time we add |
|
a new kind of graphical object (say, a spaceship). |
|
|
|
What we would rather do is: When defining spaceships, we would tell |
@example |
the system: "Here's how you @code{draw} a spaceship; you figure |
f. 123456779999999000000000000. |
out the rest." |
fe. 123.456779999999E24 |
|
fs. 1.23456779999999E26 |
This is the problem that all systems solve that (rightfully) call |
@end example |
themselves object-oriented; the object-oriented packages presented here |
|
solve this problem (and not much else). |
|
@comment TODO ?list properties of oo systems.. oo vs o-based? |
|
|
|
@node Object-Oriented Terminology, Objects, Why object-oriented programming?, Object-oriented Forth |
|
@subsubsection Object-Oriented Terminology |
|
@cindex object-oriented terminology |
|
@cindex terminology for object-oriented programming |
|
|
|
This section is mainly for reference, so you don't have to understand |
|
all of it right away. The terminology is mainly Smalltalk-inspired. In |
|
short: |
|
|
|
@table @emph |
|
@cindex class |
|
@item class |
|
a data structure definition with some extras. |
|
|
|
@cindex object |
@node Formatted numeric output, String Formats, Simple numeric output, Other I/O |
@item object |
@subsection Formatted numeric output |
an instance of the data structure described by the class definition. |
@cindex Formatted numeric output |
|
@cindex pictured numeric output |
|
@comment TODO more index entries |
|
|
@cindex instance variables |
Forth traditionally uses a technique called @var{pictured numeric |
@item instance variables |
output} for formatted printing of integers. In this technique, digits |
fields of the data structure. |
are extracted from the number (using the current output radix defined by |
|
@code{base}), converted to ASCII codes and appended to a string that is |
|
built in a scratch-pad area of memory (@pxref{core-idef, |
|
Implementation-defined options, Implementation-defined |
|
options}). Arbitrary characters can be appended to the string during the |
|
extraction process. The completed string is specified by an address |
|
and length and can be manipulated (@code{TYPE}ed, copied, modified) |
|
under program control. |
|
|
@cindex selector |
All of the words described in the previous section for simple numeric |
@cindex method selector |
output are implemented in Gforth using pictured numeric output. |
@cindex virtual function |
|
@item selector |
|
(or @emph{method selector}) a word (e.g., |
|
@code{draw}) that performs an operation on a variety of data |
|
structures (classes). A selector describes @emph{what} operation to |
|
perform. In C++ terminology: a (pure) virtual function. |
|
|
|
@cindex method |
Three important things to remember about Pictured Numeric Output: |
@item method |
|
the concrete definition that performs the operation |
|
described by the selector for a specific class. A method specifies |
|
@emph{how} the operation is performed for a specific class. |
|
|
|
@cindex selector invocation |
@itemize @bullet |
@cindex message send |
@item |
@cindex invoking a selector |
It always operates on double-precision numbers; to display a single-precision number, |
@item selector invocation |
convert it first (@pxref{Double precision} for ways of doing this). |
a call of a selector. One argument of the call (the TOS (top-of-stack)) |
@item |
is used for determining which method is used. In Smalltalk terminology: |
It always treats the double-precision number as though it were unsigned. Refer to |
a message (consisting of the selector and the other arguments) is sent |
the examples below for ways of printing signed numbers. |
to the object. |
@item |
|
The string is built up from right to left; least significant digit first. |
|
@end itemize |
|
|
@cindex receiving object |
doc-<# |
@item receiving object |
doc-# |
the object used for determining the method executed by a selector |
doc-#s |
invocation. In the @file{objects.fs} model, it is the object that is on |
doc-hold |
the TOS when the selector is invoked. (@emph{Receiving} comes from |
doc-sign |
the Smalltalk @emph{message} terminology.) |
doc-#> |
|
|
@cindex child class |
doc-represent |
@cindex parent class |
|
@cindex inheritance |
|
@item child class |
|
a class that has (@emph{inherits}) all properties (instance variables, |
|
selectors, methods) from a @emph{parent class}. In Smalltalk |
|
terminology: The subclass inherits from the superclass. In C++ |
|
terminology: The derived class inherits from the base class. |
|
|
|
@end table |
Here are some examples of using pictured numeric output: |
|
|
@c If you wonder about the message sending terminology, it comes from |
@example |
@c a time when each object had its own task and objects communicated via |
: my-u. ( u -- ) |
@c message passing; eventually the Smalltalk developers realized that |
\ Simplest use of pictured numeric output; behaves like Standard u. |
@c they can do most things through simple (indirect) calls. They kept the |
0 \ convert to unsigned double |
@c terminology. |
<# \ start conversion |
|
#s \ convert all digits |
|
#> \ complete conversion |
|
TYPE SPACE ; \ display, with trailing space |
|
|
|
: cents-only ( u -- ) |
|
0 \ convert to unsigned double |
|
<# \ start conversion |
|
# # \ convert two least-significant digits |
|
#> \ complete conversion, discard other digits |
|
TYPE SPACE ; \ display, with trailing space |
|
|
@node Objects, OOF, Object-Oriented Terminology, Object-oriented Forth |
: dollars-and-cents ( u -- ) |
@subsection The @file{objects.fs} model |
0 \ convert to unsigned double |
@cindex objects |
<# \ start conversion |
@cindex object-oriented programming |
# # \ convert two least-significant digits |
|
[char] . hold \ insert decimal point |
|
#s \ convert remaining digits |
|
[char] $ hold \ append currency symbol |
|
#> \ complete conversion |
|
TYPE SPACE ; \ display, with trailing space |
|
|
@cindex @file{objects.fs} |
: my-. ( n -- ) |
@cindex @file{oof.fs} |
\ handling negatives.. behaves like Standard . |
|
s>d \ convert to signed double |
|
swap over dabs \ leave sign byte followed by unsigned double |
|
<# \ start conversion |
|
#s \ convert all digits |
|
rot sign \ get at sign byte, append "-" if needed |
|
#> \ complete conversion |
|
TYPE SPACE ; \ display, with trailing space |
|
|
This section describes the @file{objects.fs} package. This material has also been published as @cite{Yet Another Forth Objects Package} by Anton Ertl, which appeared in Forth Dimensions 19(2), pages 37--43 (@url{http://www.complang.tuwien.ac.at/forth/objects/objects.html}). |
: account. ( n -- ) |
@c McKewan's and Zsoter's packages |
\ accountants don't like minus signs, they use parentheses |
|
\ for negative numbers |
|
s>d \ convert to signed double |
|
swap over dabs \ leave sign byte followed by unsigned double |
|
<# \ start conversion |
|
2 pick \ get copy of sign byte |
|
0< IF [char] ) hold THEN \ right-most character of output |
|
#s \ convert all digits |
|
rot \ get at sign byte |
|
0< IF [char] ( hold THEN |
|
#> \ complete conversion |
|
TYPE SPACE ; \ display, with trailing space |
|
@end example |
|
|
This section assumes that you have read @ref{Structures}. |
Here are some examples of using these words: |
|
|
The techniques on which this model is based have been used to implement |
@example |
the parser generator, Gray, and have also been used in Gforth for |
1 my-u. 1 |
implementing the various flavours of word lists (hashed or not, |
hex -1 my-u. decimal FFFFFFFF |
case-sensitive or not, special-purpose word lists for locals etc.). |
1 cents-only 01 |
|
1234 cents-only 34 |
|
2 dollars-and-cents $0.02 |
|
1234 dollars-and-cents $12.34 |
|
123 my-. 123 |
|
-123 my-. -123 |
|
123 account. 123 |
|
-456 account. (456) |
|
@end example |
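As a further sketch of pictured numeric output, the word below displays
a fixed number of digits, padded with leading zeros. The name
@code{u.0r} is hypothetical (it is not a Gforth word), and the outputs
shown in the comments assume that @code{base} is decimal:

@example
: u.0r ( u n -- )
    \ display the n least-significant digits of u, zero-padded
    >r 0              \ save n; convert u to unsigned double
    <#                \ start conversion
    r> 0 ?DO # LOOP   \ convert exactly n digits
    #>                \ complete conversion, discard other digits
    TYPE SPACE ;      \ display, with trailing space

7 3 u.0r       \ displays 007
1234 3 u.0r    \ displays 234
@end example

Since @code{#} always produces a digit, even when the remaining number
is zero, repeating it @var{n} times yields the zero padding without any
extra work.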
|
|
|
|
@menu |
@node String Formats, Displaying characters and strings, Formatted numeric output, Other I/O |
* Properties of the Objects model:: |
@subsection String Formats |
* Basic Objects Usage:: |
@cindex string formats |
* The Objects base class:: |
|
* Creating objects:: |
|
* Object-Oriented Programming Style:: |
|
* Class Binding:: |
|
* Method conveniences:: |
|
* Classes and Scoping:: |
|
* Object Interfaces:: |
|
* Objects Implementation:: |
|
* Objects Glossary:: |
|
@end menu |
|
|
|
Marcel Hendrix provided helpful comments on this section. Andras Zsoter |
@comment TODO more index entries |
and Bernd Paysan helped me with the related works section. |
|
|
|
@node Properties of the Objects model, Basic Objects Usage, Objects, Objects |
Forth commonly uses two different methods for representing a string: |
@subsubsection Properties of the @file{objects.fs} model |
|
@cindex @file{objects.fs} properties |
|
|
|
@itemize @bullet |
@itemize @bullet |
@item |
@item |
It is straightforward to pass objects on the stack. Passing |
@cindex address of counted string |
selectors on the stack is a little less convenient, but possible. |
As a @var{counted string}, represented by a @var{c-addr}. The char |
|
addressed by @var{c-addr} contains a character-count, @var{n}, of the |
|
string and the string occupies the subsequent @var{n} char addresses in |
|
memory. |
|
@item |
|
As a cell pair on the stack, @var{c-addr u}, where @var{u} is the length |
|
of the string in characters, and @var{c-addr} is the address of the |
|
first byte of the string. |
|
@end itemize |
|
|
@item |
ANS Forth encourages the use of the second format when representing |
Objects are just data structures in memory, and are referenced by their |
strings on the stack, whilst conceeding that the counted string format |
address. You can create words for objects with normal defining words |
remains useful as a way of storing strings in memory. |
like @code{constant}. Likewise, there is no difference between instance |
|
variables that contain objects and those that contain other data. |
|
|
|
@item |
doc-count |
Late binding is efficient and easy to use. |
|
|
|
@item |
@xref{Memory Blocks}, for words that move, copy and search |
It avoids parsing, and thus avoids problems with state-smartness |
for strings. @xref{Displaying characters and strings,} for words that |
and reduced extensibility; for convenience there are a few parsing |
display characters and strings. |
words, but they have non-parsing counterparts. There are also a few |
|
defining words that parse. This is hard to avoid, because all standard |
|
defining words parse (except @code{:noname}); however, such |
|
words are not as bad as many other parsing words, because they are not |
|
state-smart. |
|
|
|
@item |
|
It does not try to incorporate everything. It does a few things and does |
|
them well (IMO). In particular, this model was not designed to support |
|
information hiding (although it has features that may help); you can use |
|
a separate package for achieving this. |
|
|
|
@item |
@node Displaying characters and strings, Input, String Formats, Other I/O |
It is layered; you don't have to learn and use all features to use this |
@subsection Displaying characters and strings |
model. Only a few features are necessary (@xref{Basic Objects Usage}, |
@cindex displaying characters and strings |
@xref{The Objects base class}, @xref{Creating objects}.), the others |
@cindex compiling characters and strings |
are optional and independent of each other. |
@cindex cursor control |
|
|
@item |
@comment TODO more index entries |
An implementation in ANS Forth is available. |
|
|
|
@end itemize |
This section starts with a glossary of Forth words and ends with a set |
|
of examples. |
|
|
|
doc-bl |
|
doc-space |
|
doc-spaces |
|
doc-emit |
|
doc-toupper |
|
doc-." |
|
doc-.( |
|
doc-type |
|
doc-cr |
|
doc-at-xy |
|
doc-page |
|
doc-s" |
|
doc-c" |
|
doc-char |
|
doc-[char] |
|
doc-sliteral |
|
|
@node Basic Objects Usage, The Objects base class, Properties of the Objects model, Objects |
As an example, consider the following text, stored in a file @file{test.fs}: |
@subsubsection Basic @file{objects.fs} Usage |
|
@cindex basic objects usage |
|
@cindex objects, basic usage |
|
|
|
You can define a class for graphical objects like this: |
|
|
|
@cindex @code{class} usage |
|
@cindex @code{end-class} usage |
|
@cindex @code{selector} usage |
|
@example |
@example |
object class \ "object" is the parent class |
.( text-1) |
selector draw ( x y graphical -- ) |
: my-word |
end-class graphical |
." text-2" cr |
|
.( text-3) |
|
; |
|
|
|
." text-4" |
|
|
|
: my-char |
|
[char] ALPHABET emit |
|
char emit |
|
; |
@end example |
@end example |
|
|
This code defines a class @code{graphical} with an |
When you load this code into Gforth, the following output is generated: |
operation @code{draw}. We can perform the operation |
|
@code{draw} on any @code{graphical} object, e.g.: |
|
|
|
@example |
@example |
100 100 t-rex draw |
@kbd{include test.fs <return>} text-1text-3text-4 ok |
@end example |
@end example |
|
|
@noindent |
@itemize @bullet |
where @code{t-rex} is a word (say, a constant) that produces a |
@item |
graphical object. |
Messages @code{text-1} and @code{text-3} are displayed because @code{.(} |
|
is an immediate word; it behaves in the same way whether it is used inside |
|
or outside a colon definition. |
|
@item |
|
Message @code{text-4} is displayed because of Gforth's added interpretation |
|
semantics for @code{."}. |
|
@item |
|
Message @code{text-2} is @var{not} displayed, because the text interpreter |
|
performs the compilation semantics for @code{."} within the definition of |
|
@code{my-word}. |
|
@end itemize |
|
|
@comment nac TODO add a 2nd operation eg perimeter.. and use for |
Here are some examples of executing @code{my-word} and @code{my-char}: |
@comment a concrete example |
|
|
|
@cindex abstract class |
@example |
How do we create a graphical object? With the present definitions, |
@kbd{my-word <return>} text-2 |
we cannot create a useful graphical object. The class |
ok |
@code{graphical} describes graphical objects in general, but not |
@kbd{my-char fred <return>} Af ok |
any concrete graphical object type (C++ users would call it an |
@kbd{my-char jim <return>} Aj ok |
@emph{abstract class}); e.g., there is no method for the selector |
@end example |
@code{draw} in the class @code{graphical}. |
|
|
|
For concrete graphical objects, we define child classes of the |
@itemize @bullet |
class @code{graphical}, e.g.: |
@item |
|
Message @code{text-2} is displayed because of the run-time behaviour of |
|
@code{."}. |
|
@item |
|
@code{[char]} compiles the ``A'' from ``ALPHABET'' and puts its display code |
|
on the stack at run-time. @code{emit} always displays the character |
|
when @code{my-char} is executed. |
|
@item |
|
@code{char} parses a string at run-time and the second @code{emit} displays |
|
the first character of the string. |
|
@item |
|
If you type @code{see my-char} you can see that @code{[char]} discarded |
|
the text ``LPHABET'' and only compiled the display code for ``A'' into the |
|
definition of @code{my-char}. |
|
@end itemize |
|
|
@cindex @code{overrides} usage |
|
@cindex @code{field} usage in class definition |
|
@example |
|
graphical class \ "graphical" is the parent class |
|
cell% field circle-radius |
|
|
|
:noname ( x y circle -- ) |
|
circle-radius @@ draw-circle ; |
|
overrides draw |
|
|
|
:noname ( n-radius circle -- ) |
@node Input, , Displaying characters and strings, Other I/O |
circle-radius ! ; |
@subsection Input |
overrides construct |
@cindex input |
|
@comment TODO more index entries |
|
|
end-class circle |
Blah on traditional and recommended string formats. |
@end example |
|
|
|
Here we define a class @code{circle} as a child of @code{graphical}, |
doc--trailing |
with field @code{circle-radius} (which behaves just like a field |
doc-/string |
(@pxref{Structures})); it defines (using @code{overrides}) new methods |
doc-convert |
for the selectors @code{draw} and @code{construct} (@code{construct} is |
doc->number |
defined in @code{object}, the parent class of @code{graphical}). |
doc->float |
|
doc-accept |
|
doc-query |
|
doc-expect |
|
doc-evaluate |
|
doc-key |
|
doc-key? |
|
|
Now we can create a circle on the heap (i.e., |
TODO reference the block move stuff elsewhere |
@code{allocate}d memory) with: |
|
|
|
@cindex @code{heap-new} usage |
TODO convert and >number might be better in the numeric input section. |
@example |
|
50 circle heap-new constant my-circle |
|
@end example |
|
|
|
@noindent |
TODO maybe some of these shouldn't be here but should be in a ``parsing'' section |
@code{heap-new} invokes @code{construct}, thus |
|
initializing the field @code{circle-radius} with 50. We can draw |
|
this new circle at (100,100) with: |
|
|
|
@example |
|
100 100 my-circle draw |
|
@end example |
|
|
|
@cindex selector invocation, restrictions |
@c ------------------------------------------------------------- |
@cindex class definition, restrictions |
@node Programming Tools, Assembler and Code Words, Other I/O, Words |
Note: You can only invoke a selector if the object on the TOS |
@section Programming Tools |
(the receiving object) belongs to the class where the selector was |
@cindex programming tools |
defined or one of its descendents; e.g., you can invoke |
|
@code{draw} only for objects belonging to @code{graphical} |
|
or its descendents (e.g., @code{circle}). Immediately before |
|
@code{end-class}, the search order has to be the same as |
|
immediately after @code{class}. |
|
|
|
@node The Objects base class, Creating objects, Basic Objects Usage, Objects |
@menu |
@subsubsection The @file{object.fs} base class |
* Debugging:: Simple and quick. |
@cindex @code{object} class |
* Assertions:: Making your programs self-checking. |
|
* Singlestep Debugger:: Executing your program word by word. |
|
@end menu |
|
|
When you define a class, you have to specify a parent class. So how do |
@node Debugging, Assertions, Programming Tools, Programming Tools |
you start defining classes? There is one class available from the start: |
@subsection Debugging |
@code{object}. It is the ancestor of all classes and so is the |
@cindex debugging |
only class that has no parent. It has two selectors: @code{construct} |
|
and @code{print}. |
|
|
|
@node Creating objects, Object-Oriented Programming Style, The Objects base class, Objects |
Languages with a slow edit/compile/link/test development loop tend to |
@subsubsection Creating objects |
require sophisticated tracing/stepping debuggers to facilitate |
@cindex creating objects |
productive debugging. |
@cindex object creation |
|
@cindex object allocation options |
|
|
|
@cindex @code{heap-new} discussion |
A much better (faster) way in fast-compiling languages is to add |
@cindex @code{dict-new} discussion |
printing code at well-selected places, let the program run, look at |
@cindex @code{construct} discussion |
the output, see where things went wrong, add more printing code, etc., |
You can create and initialize an object of a class on the heap with |
until the bug is found. |
@code{heap-new} ( ... class -- object ) and in the dictionary |
|
(allocation with @code{allot}) with @code{dict-new} ( |
|
... class -- object ). Both words invoke @code{construct}, which |
|
consumes the stack items indicated by "..." above. |
|
|
|
@cindex @code{init-object} discussion |
The simple debugging aids provided in @file{debugs.fs} |
@cindex @code{class-inst-size} discussion |
are meant to support this style of debugging. In addition, there are |
If you want to allocate memory for an object yourself, you can get its |
words for non-destructively inspecting the stack and memory: |
alignment and size with @code{class-inst-size 2@@} ( class -- |
|
align size ). Once you have memory for an object, you can initialize |
|
it with @code{init-object} ( ... class object -- ); |
|
@code{construct} does only a part of the necessary work. |
|
|
|
@node Object-Oriented Programming Style, Class Binding, Creating objects, Objects |
doc-.s |
@subsubsection Object-Oriented Programming Style |
doc-f.s |
@cindex object-oriented programming style |
|
|
|
This section is not exhaustive. |
There is a word @code{.r} but it does @var{not} display the return |
|
stack! It is used for formatted numeric output. |
|
|
@cindex stack effects of selectors |
doc-depth |
@cindex selectors and stack effects |
doc-fdepth |
In general, it is a good idea to ensure that all methods for the |
doc-clearstack |
same selector have the same stack effect: when you invoke a selector, |
doc-? |
you often have no idea which method will be invoked, so, unless all |
doc-dump |
methods have the same stack effect, you will not know the stack effect |
|
of the selector invocation. |
|
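As a minimal sketch of how these inspection words might be used
interactively (assuming the data stack is empty beforehand; the variable
name @code{v} is only illustrative):

@example
1 2 3 .s        \ show the stack contents without consuming them
depth .         \ number of items on the data stack: prints 3
clearstack      \ empty the data stack
depth .         \ prints 0
variable v  42 v !
v ?             \ display the cell stored at v: prints 42
v 1 cells dump  \ hex dump of the memory occupied by v
@end example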
|
|
One exception to this rule is methods for the selector |
The word @code{~~} prints debugging information (by default the source |
@code{construct}. We know which method is invoked, because we |
location and the stack contents). It is easy to insert. If you use Emacs |
specify the class to be constructed at the same place. Actually, I |
it is also easy to remove (@kbd{C-x ~} in the Emacs Forth mode to |
defined @code{construct} as a selector only to give the users a |
query-replace them with nothing). The deferred words |
convenient way to specify initialization. The way it is used, a |
@code{printdebugdata} and @code{printdebugline} control the output of |
mechanism different from selector invocation would be more natural |
@code{~~}. The default source location output format works well with |
(but probably would take more code and more space to explain). |
Emacs' compilation mode, so you can step through the program at the |
|
source level using @kbd{C-x `} (the advantage over a stepping debugger |
|
is that you can step in any direction and you know where the crash has |
|
happened or where the strange data has occurred). |
|
|
@node Class Binding, Method conveniences, Object-Oriented Programming Style, Objects |
The default actions of @code{~~} clobber the contents of the pictured |
@subsubsection Class Binding |
numeric output string, so you should not use @code{~~}, e.g., between |
@cindex class binding |
@code{<#} and @code{#>}. |
@cindex early binding |
|
|
|
@cindex late binding |
doc-~~ |
Normal selector invocations determine the method at run-time depending |
doc-printdebugdata |
on the class of the receiving object. This run-time selection is called |
doc-printdebugline |
@var{late binding}. |
|
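For illustration, here is a sketch of @code{~~} placed inside a loop;
the word @code{sum-to} is a made-up example, not part of Gforth:

@example
: sum-to ( n -- sum )
    0 swap 1+ 0 ?do
        ~~      \ report source location and stack contents on each pass
        i +
    loop ;
3 sum-to .      \ prints the trace lines, then 6
@end example

Once the bug is found, the @code{~~} can simply be deleted again.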
|
|
Sometimes it's preferable to invoke a different method. For example, |
doc-see |
you might want to use the simple method for @code{print}ing |
doc-marker |
@code{object}s instead of the possibly long-winded @code{print} method |
|
of the receiver class. You can achieve this by replacing the invocation |
Here's an example of using @code{marker} at the start of a source file |
of @code{print} with: |
that you are debugging; it ensures that you only ever have one copy of |
|
the file's definitions compiled at any time: |
|
|
@cindex @code{[bind]} usage |
|
@example |
@example |
[bind] object print |
[IFDEF] my-code |
@end example |
my-code |
|
[ENDIF] |
|
|
@noindent |
marker my-code |
in compiled code or: |
|
|
|
@cindex @code{bind} usage |
\ .. definitions start here |
@example |
\ . |
bind object print |
\ . |
|
\ end |
@end example |
@end example |
|
|
@cindex class binding, alternative to |
|
@noindent |
|
in interpreted code. Alternatively, you can define the method with a |
|
name (e.g., @code{print-object}), and then invoke it through the |
|
name. Class binding is just a (often more convenient) way to achieve |
|
the same effect; it avoids name clutter and allows you to invoke |
|
methods directly without naming them first. |
|
|
|
@cindex superclass binding |
|
@cindex parent class binding |
|
A frequent use of class binding is this: When we define a method |
|
for a selector, we often want the method to do what the selector does |
|
in the parent class, and a little more. There is a special word for |
|
this purpose: @code{[parent]}; @code{[parent] |
|
@emph{selector}} is equivalent to @code{[bind] @emph{parent |
|
selector}}, where @code{@emph{parent}} is the parent |
|
class of the current class. E.g., a method definition might look like: |
|
|
|
@cindex @code{[parent]} usage |
|
@example |
|
:noname |
|
dup [parent] foo \ do parent's foo on the receiving object |
|
... \ do some more |
|
; overrides foo |
|
@end example |
|
|
|
@cindex class binding as optimization |
@node Assertions, Singlestep Debugger, Debugging, Programming Tools |
In @cite{Object-oriented programming in ANS Forth} (Forth Dimensions, |
@subsection Assertions |
March 1997), Andrew McKewan presents class binding as an optimization |
@cindex assertions |
technique. I recommend not using it for this purpose unless you are in |
|
an emergency. Late binding is pretty fast with this model anyway, so the |
|
benefit of using class binding is small; the cost of using class binding |
|
where it is not appropriate is reduced maintainability. |
|
|
|
While we are at programming style questions: You should bind |
It is a good idea to make your programs self-checking, especially if you |
selectors only to ancestor classes of the receiving object. E.g., say, |
make an assumption that may become invalid during maintenance (for |
you know that the receiving object is of class @code{foo} or its |
example, that a certain field of a data structure is never zero). Gforth |
descendents; then you should bind only to @code{foo} and its |
supports @var{assertions} for this purpose. They are used like this: |
ancestors. |
|
|
|
@node Method conveniences, Classes and Scoping, Class Binding, Objects |
@example |
@subsubsection Method conveniences |
assert( @var{flag} ) |
@cindex method conveniences |
@end example |
|
|
In a method you usually access the receiving object pretty often. If |
The code between @code{assert(} and @code{)} should compute a flag that |
you define the method as a plain colon definition (e.g., with |
should be true if everything is all right and false otherwise. It should |
@code{:noname}), you may have to do a lot of stack |
not change anything else on the stack. The overall stack effect of the |
gymnastics. To avoid this, you can define the method with @code{m: |
assertion is @code{( -- )}. E.g. |
... ;m}. E.g., you could define the method for |
|
@code{draw}ing a @code{circle} with |
|
|
|
@cindex @code{this} usage |
|
@cindex @code{m:} usage |
|
@cindex @code{;m} usage |
|
@example |
@example |
m: ( x y circle -- ) |
assert( 1 1 + 2 = ) \ what we learn in school |
( x y ) this circle-radius @@ draw-circle ;m |
assert( dup 0<> ) \ assert that the top of stack is not zero |
|
assert( false ) \ this code should not be reached |
@end example |
@end example |
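Assertions are typically placed inside definitions. A minimal sketch,
using a hypothetical word @code{safe/} that guards a division:

@example
: safe/ ( n1 n2 -- n3 )
    assert( dup 0<> )   \ the divisor must not be zero
    / ;
10 2 safe/ .            \ prints 5
10 0 safe/              \ assertion fails and execution is aborted
@end example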
|
|
@cindex @code{exit} in @code{m: ... ;m} |
The need for assertions is different at different times. During |
@cindex @code{exitm} discussion |
debugging, we want more checking, in production we sometimes care more |
@cindex @code{catch} in @code{m: ... ;m} |
for speed. Therefore, assertions can be turned off, i.e., the assertion |
When this method is executed, the receiver object is removed from the |
becomes a comment. Depending on the importance of an assertion and the |
stack; you can access it with @code{this} (admittedly, in this |
time it takes to check it, you may want to turn off some assertions and |
example the use of @code{m: ... ;m} offers no advantage). Note |
keep others turned on. Gforth provides several levels of assertions for |
that I specify the stack effect for the whole method (i.e. including |
this purpose: |
the receiver object), not just for the code between @code{m:} |
|
and @code{;m}. You cannot use @code{exit} in |
|
@code{m:...;m}; instead, use |
|
@code{exitm}.@footnote{Moreover, for any word that calls |
|
@code{catch} and was defined before loading |
|
@code{objects.fs}, you have to redefine it like I redefined |
|
@code{catch}: @code{: catch this >r catch r> to-this ;}} |
|
|
|
@cindex @code{inst-var} usage |
doc-assert0( |
You will frequently use sequences of the form @code{this |
doc-assert1( |
@emph{field}} (in the example above: @code{this |
doc-assert2( |
circle-radius}). If you use the field only in this way, you can |
doc-assert3( |
define it with @code{inst-var} and eliminate the |
doc-assert( |
@code{this} before the field name. E.g., the @code{circle} |
doc-) |
class above could also be defined with: |
|
|
|
@example |
The variable @code{assert-level} specifies the highest assertions that |
graphical class |
are turned on. I.e., at the default @code{assert-level} of one, |
cell% inst-var radius |
@code{assert0(} and @code{assert1(} assertions perform checking, while |
|
@code{assert2(} and @code{assert3(} assertions are treated as comments. |
|
|
|
The value of @code{assert-level} is evaluated at compile-time, not at |
|
run-time. Therefore you cannot turn assertions on or off at run-time; |
|
you have to set the @code{assert-level} appropriately before compiling a |
|
piece of code. You can compile different pieces of code at different |
|
@code{assert-level}s (e.g., a trusted library at level 1 and |
|
newly-written code at level 3). |
|
|
m: ( x y circle -- ) |
doc-assert-level |
radius @@ draw-circle ;m |
|
overrides draw |
|
|
|
m: ( n-radius circle -- ) |
If an assertion fails, a message compatible with Emacs' compilation mode |
radius ! ;m |
is produced and the execution is aborted (currently with @code{ABORT"}. |
overrides construct |
If there is interest, we will introduce a special throw code. But if you |
|
intend to @code{catch} a specific condition, using @code{throw} is |
|
probably more appropriate than an assertion). |
|
|
end-class circle |
Definitions in ANS Forth for these assertion words are provided |
@end example |
in @file{compat/assert.fs}. |
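Because @code{assert-level} is consulted at compile time, the same
assertion can be active in one definition and inert in another. The
following sketch uses the hypothetical words @code{checked} and
@code{unchecked} to show this:

@example
3 assert-level !            \ enable all assertion levels from here on
: checked ( n -- n )
    assert3( dup 0>= ) ;    \ compiled as a real check
1 assert-level !            \ back to the default level
: unchecked ( n -- n )
    assert3( dup 0>= ) ;    \ skipped like a comment
-1 unchecked drop           \ no check is performed
-1 checked drop             \ assertion fails and execution is aborted
@end example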
|
|
@code{radius} can only be used in @code{circle} and its |
|
descendent classes and inside @code{m:...;m}. |
|
|
|
@cindex @code{inst-value} usage |
@node Singlestep Debugger, , Assertions, Programming Tools |
You can also define fields with @code{inst-value}, which is |
@subsection Singlestep Debugger |
to @code{inst-var} what @code{value} is to |
@cindex singlestep Debugger |
@code{variable}. You can change the value of such a field with |
@cindex debugging Singlestep |
@code{[to-inst]}. E.g., we could also define the class |
@cindex @code{dbg} |
@code{circle} like this: |
@cindex @code{BREAK:} |
|
@cindex @code{BREAK"} |
|
|
@example |
When you create a new word there's often the need to check whether it |
graphical class |
behaves correctly or not. You can do this by typing @code{dbg |
inst-value radius |
badword}. A debug session might look like this: |
|
|
m: ( x y circle -- ) |
@example |
radius draw-circle ;m |
: badword 0 DO i . LOOP ; ok |
overrides draw |
2 dbg badword |
|
: badword |
|
Scanning code... |
|
|
m: ( n-radius circle -- ) |
Nesting debugger ready! |
[to-inst] radius ;m |
|
overrides construct |
|
|
|
end-class circle |
400D4738 8049BC4 0 -> [ 2 ] 00002 00000 |
|
400D4740 8049F68 DO -> [ 0 ] |
|
400D4744 804A0C8 i -> [ 1 ] 00000 |
|
400D4748 400C5E60 . -> 0 [ 0 ] |
|
400D474C 8049D0C LOOP -> [ 0 ] |
|
400D4744 804A0C8 i -> [ 1 ] 00001 |
|
400D4748 400C5E60 . -> 1 [ 0 ] |
|
400D474C 8049D0C LOOP -> [ 0 ] |
|
400D4758 804B384 ; -> ok |
@end example |
@end example |
|
|
|
Each line displayed is one step. You always have to hit return to |
|
execute the next word that is displayed. If you don't want to execute |
|
the next word as a whole, you have to type @kbd{n} for @code{nest}. Here is |
|
an overview of the available keys: |
|
|
@node Classes and Scoping, Object Interfaces, Method conveniences, Objects |
@table @i |
@subsubsection Classes and Scoping |
|
@cindex classes and scoping |
|
@cindex scoping and classes |
|
|
|
Inheritance is frequent, unlike structure extension. This exacerbates |
@item <return> |
the problem with the field name convention (@pxref{Structure Naming |
Next; Execute the next word. |
Convention}): One always has to remember in which class the field was |
|
originally defined; changing a part of the class structure would require |
|
changes for renaming in otherwise unaffected code. |
|
|
|
@cindex @code{inst-var} visibility |
@item n |
@cindex @code{inst-value} visibility |
Nest; Single step through next word. |
To solve this problem, I added a scoping mechanism (which was not in my |
|
original charter): A field defined with @code{inst-var} (or |
|
@code{inst-value}) is visible only in the class where it is defined and in |
|
the descendent classes of this class. Using such fields only makes |
|
sense in @code{m:}-defined methods in these classes anyway. |
|
|
|
This scoping mechanism allows us to use the unadorned field name, |
@item u |
because name clashes with unrelated words become much less likely. |
Unnest; Stop debugging and execute rest of word. If we got to this word |
|
with nest, continue debugging with the calling word. |
|
|
@cindex @code{protected} discussion |
@item d |
@cindex @code{private} discussion |
Done; Stop debugging and execute rest. |
Once we have this mechanism, we can also use it for controlling the |
|
visibility of other words: All words defined after |
|
@code{protected} are visible only in the current class and its |
|
descendents. @code{public} restores the compilation |
|
(i.e. @code{current}) word list that was in effect before. If you |
|
have several @code{protected}s without an intervening |
|
@code{public} or @code{set-current}, @code{public} |
|
will restore the compilation word list in effect before the first of |
|
these @code{protected}s. |
|
|
|
@node Object Interfaces, Objects Implementation, Classes and Scoping, Objects |
@item s |
@subsubsection Object Interfaces |
Stop; Abort immediately. |
@cindex object interfaces |
|
@cindex interfaces for objects |
|
|
|
In this model you can only call selectors defined in the class of the |
@end table |
receiving objects or in one of its ancestors. If you call a selector |
|
with a receiving object that is not in one of these classes, the |
|
result is undefined; if you are lucky, the program crashes |
|
immediately. |
|
|
|
@cindex selectors common to hardly-related classes |
Debugging a large application with this mechanism is very difficult, because |
Now consider the case when you want to have a selector (or several) |
you have to nest very deeply into the program before the interesting part |
available in two classes: You would have to add the selector to a |
begins. This takes a lot of time. |
common ancestor class, in the worst case to @code{object}. You |
|
may not want to do this, e.g., because someone else is responsible for |
|
this ancestor class. |
|
|
|
The solution for this problem is interfaces. An interface is a |
To do it more directly put a @code{BREAK:} command into your source code. |
collection of selectors. If a class implements an interface, the |
When program execution reaches @code{BREAK:} the single step debugger is |
selectors become available to the class and its descendents. A class |
invoked and you have all the features described above. |
can implement an unlimited number of interfaces. For the problem |
|
discussed above, we would define an interface for the selector(s), and |
|
both classes would implement the interface. |
|
|
|
As an example, consider an interface @code{storage} for |
If you have more than one part to debug it is useful to know where the |
writing objects to disk and getting them back, and a class |
program has stopped at the moment. You can do this with the |
@code{foo} that implements it. The code would look like this: |
@code{BREAK" string"} command. This behaves like @code{BREAK:} except that |
|
string is typed out when the ``breakpoint'' is reached. |
|
|
@cindex @code{interface} usage |
doc-dbg |
@cindex @code{end-interface} usage |
doc-BREAK: |
@cindex @code{implementation} usage |
doc-BREAK" |
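A sketch of how a breakpoint might be placed in a definition. The words
@code{process} and @code{process2} and the message text are invented for
illustration, and the sketch assumes that @code{BREAK:} and
@code{BREAK"} are used inside colon definitions:

@example
: process ( n -- )
    dup .
    BREAK:                      \ the single-step debugger takes over here
    2* . ;

: process2 ( n -- )
    BREAK" entering process2"   \ as above, but types the string first
    1+ . ;
@end example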
@example |
|
interface |
|
selector write ( file object -- ) |
|
selector read1 ( file object -- ) |
|
end-interface storage |
|
|
|
bar class |
|
storage implementation |
|
|
|
... overrides write |
@c ------------------------------------------------------------- |
... overrides read1 |
@node Assembler and Code Words, Threading Words, Programming Tools, Words |
... |
@section Assembler and Code Words |
end-class foo |
@cindex assembler |
@end example |
@cindex code words |
|
|
@noindent |
Gforth provides some words for defining primitives (words written in |
(I would add a word @code{read} @var{( file -- object )} that uses |
machine code), and for defining the machine-code equivalent of |
@code{read1} internally, but that's beyond the point illustrated |
@code{DOES>}-based defining words. However, the machine-independent |
here.) |
nature of Gforth poses a few problems: First of all, Gforth runs on |
|
several architectures, so it can provide no standard assembler. What's |
|
worse is that the register allocation not only depends on the processor, |
|
but also on the @code{gcc} version and options used. |
|
|
Note that you cannot use @code{protected} in an interface; and |
The words that Gforth offers encapsulate some system dependences (e.g., the |
of course you cannot define fields. |
header structure), so a system-independent assembler may be used in |
|
Gforth. If you do not have an assembler, you can compile machine code |
|
directly with @code{,} and @code{c,}. |
|
|
In the Neon model, all selectors are available for all classes; |
doc-assembler |
therefore it does not need interfaces. The price you pay in this model |
doc-code |
is slower late binding, and therefore, added complexity to avoid late |
doc-end-code |
binding. |
doc-;code |
|
doc-flush-icache |
|
|
@node Objects Implementation, Objects Glossary, Object Interfaces, Objects |
If @code{flush-icache} does not work correctly, @code{code} words |
@subsubsection @file{objects.fs} Implementation |
etc. will not work (reliably), either. |
@cindex @file{objects.fs} implementation |
|
|
|
@cindex @code{object-map} discussion |
@code{flush-icache} is always present. The other words are rarely used |
An object is a piece of memory, like one of the data structures |
and reside in @code{code.fs}, which is usually not loaded. You can load |
described with @code{struct...end-struct}. It has a field |
it with @code{require code.fs}. |
@code{object-map} that points to the method map for the object's |
|
class. |
|
|
|
@cindex method map |
@cindex registers of the inner interpreter |
@cindex virtual function table |
In the assembly code you will want to refer to the inner interpreter's |
The @emph{method map}@footnote{This is Self terminology; in C++ |
registers (e.g., the data stack pointer) and you may want to use other |
terminology: virtual function table.} is an array that contains the |
registers for temporary storage. Unfortunately, the register allocation |
execution tokens (@var{xt}s) of the methods for the object's class. Each |
is installation-dependent. |
selector contains an offset into a method map. |
|
|
|
@cindex @code{selector} implementation, class |
The easiest solution is to use explicit register declarations |
@code{selector} is a defining word that uses |
(@pxref{Explicit Reg Vars, , Variables in Specified Registers, gcc.info, |
@code{CREATE} and @code{DOES>}. The body of the |
GNU C Manual}) for all of the inner interpreter's registers: You have to |
selector contains the offset; the @code{does>} action for a |
compile Gforth with @code{-DFORCE_REG} (configure option |
class selector is, basically: |
@code{--enable-force-reg}) and the appropriate declarations must be |
|
present in the @code{machine.h} file (see @code{mips.h} for an example; |
|
you can find a full list of all declarable register symbols with |
|
@code{grep register engine.c}). If you give explicit registers to all |
|
variables that are declared at the beginning of @code{engine()}, you |
|
should be able to use the other caller-saved registers for temporary |
|
storage. Alternatively, you can use the @code{gcc} option |
|
@code{-ffixed-REG} (@pxref{Code Gen Options, , Options for Code |
|
Generation Conventions, gcc.info, GNU C Manual}) to reserve a register |
|
(however, this restriction on register allocation may slow Gforth |
|
significantly). |
|
|
@example |
If this solution is not viable (e.g., because @code{gcc} does not allow |
( object addr ) @@ over object-map @@ + @@ execute |
you to explicitly declare all the registers you need), you have to find |
@end example |
out by looking at the code where the inner interpreter's registers |
|
reside and which registers can be used for temporary storage. You can |
|
get an assembly listing of the engine's code with @code{make engine.s}. |
|
|
Since @code{object-map} is the first field of the object, it |
In any case, it is good practice to abstract your assembly code from the |
does not generate any code. As you can see, calling a selector has a |
actual register allocation. E.g., if the data stack pointer resides in |
small, constant cost. |
register @code{$17}, create an alias for this register called @code{sp}, |
|
and use that in your assembly code. |
|
|
@cindex @code{current-interface} discussion |
@cindex code words, portable |
@cindex class implementation and representation |
Another option for implementing normal and defining words efficiently |
A class is basically a @code{struct} combined with a method |
is to add the desired functionality to the source of Gforth. For normal |
map. During the class definition the alignment and size of the class |
words you just have to edit @file{primitives} (@pxref{Automatic |
are passed on the stack, just as with @code{struct}s, so |
Generation}). Defining words (equivalent to @code{;CODE} words, for fast |
@code{field} can also be used for defining class |
defined words) may require changes in @file{engine.c}, @file{kernel.fs}, |
fields. However, passing more items on the stack would be |
@file{prims2x.fs}, and possibly @file{cross.fs}. |
inconvenient, so @code{class} builds a data structure in memory, |
|
which is accessed through the variable |
|
@code{current-interface}. After its definition is complete, the |
|
class is represented on the stack by a pointer (e.g., as parameter for |
|
a child class definition). |
|
|
|
A new class starts off with the alignment and size of its parent, |
|
and a copy of the parent's method map. Defining new fields extends the |
|
size and alignment; likewise, defining new selectors extends the |
|
method map. @code{overrides} just stores a new @var{xt} in the method |
|
map at the offset given by the selector. |
|
|
|
@cindex class binding, implementation |
@c ------------------------------------------------------------- |
Class binding just gets the @var{xt} at the offset given by the selector |
@node Threading Words, Locals, Assembler and Code Words, Words |
from the class's method map and @code{compile,}s (in the case of |
@section Threading Words |
@code{[bind]}) it. |
@cindex threading words |
|
|
@cindex @code{this} implementation |
@cindex code address |
@cindex @code{catch} and @code{this} |
These words provide access to code addresses and other threading stuff |
@cindex @code{this} and @code{catch} |
in Gforth (and, possibly, other interpretive Forths). It more or less |
I implemented @code{this} as a @code{value}. At the |
abstracts away the differences between direct and indirect threading |
start of an @code{m:...;m} method the old @code{this} is |
(and, for direct threading, the machine dependences). However, at |
stored to the return stack and restored at the end; and the object on |
present this wordset is still incomplete. It is also pretty low-level; |
the TOS is stored @code{TO this}. This technique has one |
some day it will hopefully be made unnecessary by an internals wordset |
disadvantage: If the user does not leave the method via |
that abstracts implementation details away completely. |
@code{;m}, but via @code{throw} or @code{exit}, |
|
@code{this} is not restored (and @code{exit} may |
|
crash). To deal with the @code{throw} problem, I have redefined |
|
@code{catch} to save and restore @code{this}; the same |
|
should be done with any word that can catch an exception. As for |
|
@code{exit}, I simply forbid it (as a replacement, there is |
|
@code{exitm}). |
|
|
|
@cindex @code{inst-var} implementation |
doc-threading-method |
@code{inst-var} is just the same as @code{field}, with |
doc->code-address |
a different @code{does>} action: |
doc->does-code |
@example |
doc-code-address! |
@@ this + |
doc-does-code! |
@end example |
doc-does-handler! |
Similarly for @code{inst-value}. |
doc-/does-handler |
|
|
@cindex class scoping implementation |
The code addresses produced by various defining words are produced by |
Each class also has a word list that contains the words defined with |
the following words: |
@code{inst-var} and @code{inst-value}, and its protected |
|
words. It also has a pointer to its parent. @code{class} pushes |
|
the word lists of the class and all its ancestors onto the search order stack, |
|
and @code{end-class} drops them. |
|
|
|
@cindex interface implementation |
doc-docol: |
An interface is like a class without fields, parent and protected |
doc-docon: |
words; i.e., it just has a method map. If a class implements an |
doc-dovar: |
interface, its method map contains a pointer to the method map of the |
doc-douser: |
interface. The positive offsets in the map are reserved for class |
doc-dodefer: |
methods, therefore interface map pointers have negative |
doc-dofield: |
offsets. Interfaces have offsets that are unique throughout the |
|
system, unlike class selectors, whose offsets are only unique for the |
|
classes where the selector is available (invokable). |
|
|
|
This structure means that interface selectors have to perform one |
You can recognize words defined by a @code{CREATE}...@code{DOES>} word |
indirection more than class selectors to find their method. Their body |
with @code{>does-code}. If the word was defined in that way, the value |
contains the interface map pointer offset in the class method map, and |
returned is non-zero and identifies the @code{DOES>} used by the |
the method offset in the interface method map. The |
defining word. |
@code{does>} action for an interface selector is, basically: |
@comment TODO should that be ``identifies the xt of the DOES> ??'' |
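As a rough sketch of the test described above (the defining word
@code{const} and its child @code{five} are hypothetical names; the exact
non-zero value printed depends on the installation):

@example
: const ( n "name" -- )  create , does> ( -- n ) @@ ;
5 const five
' five >does-code .     \ non-zero: five was defined by a CREATE...DOES> word
' dup  >does-code .     \ 0: dup was not defined that way
@end example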
|
|
@example |
@c ------------------------------------------------------------- |
( object selector-body ) |
@node Locals, Structures, Threading Words, Words |
2dup selector-interface @@ ( object selector-body object interface-offset ) |
@section Locals |
swap object-map @@ + @@ ( object selector-body map ) |
@cindex locals |
swap selector-offset @@ + @@ execute |
|
@end example |
|
|
|
where @code{object-map} and @code{selector-offset} are |
Local variables can make Forth programming more enjoyable and Forth |
first fields and generate no code. |
programs easier to read. Unfortunately, the locals of ANS Forth are |
|
laden with restrictions. Therefore, we provide not only the ANS Forth |
|
locals wordset, but also our own, more powerful locals wordset (we |
|
implemented the ANS Forth locals wordset through our locals wordset). |
|
|
As a concrete example, consider the following code: |
The ideas in this section have also been published in the paper |
|
@cite{Automatic Scoping of Local Variables} by M. Anton Ertl, presented |
|
at EuroForth '94; it is available at |
|
@*@url{http://www.complang.tuwien.ac.at/papers/ertl94l.ps.gz}. |
|
|
@example |
@menu |
interface |
* Gforth locals:: |
selector if1sel1 |
* ANS Forth locals:: |
selector if1sel2 |
@end menu |
end-interface if1 |
|
|
|
object class |
@node Gforth locals, ANS Forth locals, Locals, Locals |
if1 implementation |
@subsection Gforth locals |
selector cl1sel1 |
@cindex Gforth locals |
cell% inst-var cl1iv1 |
@cindex locals, Gforth style |
|
|
' m1 overrides construct |
Locals can be defined with |
' m2 overrides if1sel1 |
|
' m3 overrides if1sel2 |
|
' m4 overrides cl1sel2 |
|
end-class cl1 |
|
|
|
create obj1 object dict-new drop |
@example |
create obj2 cl1 dict-new drop |
@{ local1 local2 ... -- comment @} |
|
@end example |
|
or |
|
@example |
|
@{ local1 local2 ... @} |
@end example |
@end example |
|
|
The data structure created by this code (including the data structure |
E.g., |
for @code{object}) is shown in the figure |
@example |
(@file{objects-implementation.eps}), assuming a cell size of 4. |
: max @{ n1 n2 -- n3 @} |
@comment nac TODO add this diagram.. |
n1 n2 > if |
|
n1 |
|
else |
|
n2 |
|
endif ; |
|
@end example |
|
|
@node Objects Glossary, , Objects Implementation, Objects |
The similarity of locals definitions with stack comments is intended. A |
@subsubsection @file{objects.fs} Glossary |
locals definition often replaces the stack comment of a word. The order |
@cindex @file{objects.fs} Glossary |
of the locals corresponds to the order in a stack comment and everything |
|
after the @code{--} is really a comment. |
|
|
doc---objects-bind |
This similarity has one disadvantage: It is too easy to confuse locals |
doc---objects-<bind> |
declarations with stack comments, causing bugs and making them hard to |
doc---objects-bind' |
find. However, this problem can be avoided by appropriate coding |
doc---objects-[bind] |
conventions: Do not use both notations in the same program. If you do, |
doc---objects-class |
they should be distinguished using additional means, e.g. by position. |
doc---objects-class->map |
|
doc---objects-class-inst-size |
|
doc---objects-class-override! |
|
doc---objects-construct |
|
doc---objects-current' |
|
doc---objects-[current] |
|
doc---objects-current-interface |
|
doc---objects-dict-new |
|
doc---objects-drop-order |
|
doc---objects-end-class |
|
doc---objects-end-class-noname |
|
doc---objects-end-interface |
|
doc---objects-end-interface-noname |
|
doc---objects-exitm |
|
doc---objects-heap-new |
|
doc---objects-implementation |
|
doc---objects-init-object |
|
doc---objects-inst-value |
|
doc---objects-inst-var |
|
doc---objects-interface |
|
doc---objects-;m |
|
doc---objects-m: |
|
doc---objects-method |
|
doc---objects-object |
|
doc---objects-overrides |
|
doc---objects-[parent] |
|
doc---objects-print |
|
doc---objects-protected |
|
doc---objects-public |
|
doc---objects-push-order |
|
doc---objects-selector |
|
doc---objects-this |
|
doc---objects-<to-inst> |
|
doc---objects-[to-inst] |
|
doc---objects-to-this |
|
doc---objects-xt-new |
|
|
|
@c ------------------------------------------------------------- |
@cindex types of locals |
@node OOF, Mini-OOF, Objects, Object-oriented Forth |
@cindex locals types |
@subsection The @file{oof.fs} model |
The name of the local may be preceded by a type specifier, e.g., |
@cindex oof |
@code{F:} for a floating point value: |
@cindex object-oriented programming |
|
|
|
@cindex @file{objects.fs} |
@example |
@cindex @file{oof.fs} |
: CX* @{ F: Ar F: Ai F: Br F: Bi -- Cr Ci @} |
|
\ complex multiplication |
|
Ar Br f* Ai Bi f* f- |
|
Ar Bi f* Ai Br f* f+ ; |
|
@end example |
|
|
|
@cindex flavours of locals |
|
@cindex locals flavours |
|
@cindex value-flavoured locals |
|
@cindex variable-flavoured locals |
|
Gforth currently supports cells (@code{W:}, @code{W^}), doubles |
|
(@code{D:}, @code{D^}), floats (@code{F:}, @code{F^}) and characters |
|
(@code{C:}, @code{C^}) in two flavours: a value-flavoured local (defined |
|
with @code{W:}, @code{D:} etc.) produces its value and can be changed |
|
with @code{TO}. A variable-flavoured local (defined with @code{W^} etc.) |
|
produces its address (which becomes invalid when the variable's scope is |
|
left). E.g., the standard word @code{emit} can be defined in terms of |
|
@code{type} like this: |
|
|
This section describes the @file{oof.fs} package. |
@example |
|
: emit @{ C^ char* -- @} |
|
char* 1 type ; |
|
@end example |
|
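The two flavours can be contrasted in a minimal sketch; the word name |
@code{flavours-demo} and the local names are hypothetical and chosen only |
for illustration: |
|
@example |
: flavours-demo @{ W: n W^ v -- n' v' @} |
  n 1+ TO n   \ value-flavoured: the name yields the value, TO changes it |
  1 v +!      \ variable-flavoured: the name yields an address |
  n v @@ ;    \ fetch through the address before the scope is left |
@end example |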
|
The package described in this section has been used in bigFORTH since 1991, and |
@cindex default type of locals |
used for two large applications: a chromatographic system used to |
@cindex locals, default type |
create new medicaments, and a graphic user interface library (MINOS). |
A local without type specifier is a @code{W:} local. Both flavours of |
|
locals are initialized with values from the data or FP stack. |
|
|
You can find a description (in German) of @file{oof.fs} in @cite{Object |
Currently there is no way to define locals with user-defined data |
oriented bigFORTH} by Bernd Paysan, published in @cite{Vierte Dimension} |
structures, but we are working on it. |
10(2), 1994. |
|
|
Gforth allows defining locals everywhere in a colon definition. This |
|
poses the following questions: |
|
|
@menu |
@menu |
* Properties of the OOF model:: |
* Where are locals visible by name?:: |
* Basic OOF Usage:: |
* How long do locals live?:: |
* The OOF base class:: |
* Programming Style:: |
* Class Declaration:: |
* Implementation:: |
* Class Implementation:: |
|
@end menu |
@end menu |
|
|
@node Properties of the OOF model, Basic OOF Usage, OOF, OOF |
@node Where are locals visible by name?, How long do locals live?, Gforth locals, Gforth locals |
@subsubsection Properties of the @file{oof.fs} model |
@subsubsection Where are locals visible by name? |
@cindex @file{oof.fs} properties |
@cindex locals visibility |
|
@cindex visibility of locals |
|
@cindex scope of locals |
|
|
@itemize @bullet |
Basically, the answer is that locals are visible where you would expect |
@item |
them in block-structured languages, and sometimes a little longer. If you |
This model combines object oriented programming with information |
want to restrict the scope of a local, enclose its definition in |
hiding. It helps you write large applications, where scoping is |
@code{SCOPE}...@code{ENDSCOPE}. |
necessary, because it provides class-oriented scoping. |
|
|
|
@item |
doc-scope |
Named objects, object pointers, and object arrays can be created; |
doc-endscope |
selector invocation uses the "object selector" syntax. Selector invocation |
|
to objects and/or selectors on the stack is a bit less convenient, but |
|
possible. |
|
|
|
@item |
These words behave like control structure words, so you can use them |
Selector invocation and instance variable usage of the active object is |
with @code{CS-PICK} and @code{CS-ROLL} to restrict the scope in |
straightforward, since both make use of the active object. |
arbitrary ways. |
|
|
@item |
If you want a more exact answer to the visibility question, here's the |
Late binding is efficient and easy to use. |
basic principle: A local is visible in all places that can only be |
|
reached through the definition of the local@footnote{In compiler |
|
construction terminology, all places dominated by the definition of the |
|
local.}. In other words, it is not visible in places that can be reached |
|
without going through the definition of the local. E.g., locals defined |
|
in @code{IF}...@code{ENDIF} are visible until the @code{ENDIF}, locals |
|
defined in @code{BEGIN}...@code{UNTIL} are visible after the |
|
@code{UNTIL} (until, e.g., a subsequent @code{ENDSCOPE}). |
|
|
@item |
The reasoning behind this solution is: We want to have the locals |
State-smart objects parse selectors. However, extensibility is provided |
visible as long as it is meaningful. The user can always make the |
using a (parsing) selector @code{postpone} and a selector @code{'}. |
visibility shorter by using explicit scoping. In a place that can |
|
only be reached through the definition of a local, the meaning of a |
|
local name is clear. In other places it is not: How is the local |
|
initialized at the control flow path that does not contain the |
|
definition? Which local is meant, if the same name is defined twice in |
|
two independent control flow paths? |
|
|
@item |
This should be enough detail for nearly all users, so you can skip the |
An implementation in ANS Forth is available. |
rest of this section. If you really must know all the gory details and |
|
options, read on. |
|
|
@end itemize |
In order to implement this rule, the compiler has to know which places |
|
are unreachable. It knows this automatically after @code{AHEAD}, |
|
@code{AGAIN}, @code{EXIT} and @code{LEAVE}; in other cases (e.g., after |
|
most @code{THROW}s), you can use the word @code{UNREACHABLE} to tell the |
|
compiler that the control flow never reaches that place. If |
|
@code{UNREACHABLE} is not used where it could, the only consequence is |
|
that the visibility of some locals is more limited than the rule above |
|
says. If @code{UNREACHABLE} is used where it should not (i.e., if you |
|
lie to the compiler), buggy code will be produced. |
|
|
|
doc-unreachable |
|
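As a hedged sketch of the rule above (the word name |
@code{double-nonneg} is hypothetical), placing @code{UNREACHABLE} after a |
@code{THROW} that always throws marks that branch as dead, so a local |
defined in the other branch should remain visible after the @code{THEN}: |
|
@example |
: double-nonneg ( n -- 2n ) |
  dup 0< IF |
    -24 throw UNREACHABLE  \ control flow never continues from here |
  ELSE |
    @{ x @}                  \ defined on the only path that reaches THEN |
  THEN |
  x 2* ; |
@end example |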
|
@node Basic OOF Usage, The OOF base class, Properties of the OOF model, OOF |
Another problem with this rule is that at @code{BEGIN}, the compiler |
@subsubsection Basic @file{oof.fs} Usage |
does not know which locals will be visible on the incoming |
@cindex @file{oof.fs} usage |
back-edge. All problems discussed in the following are due to this |
|
ignorance of the compiler (we discuss the problems using @code{BEGIN} |
|
loops as examples; the discussion also applies to @code{?DO} and other |
|
loops). Perhaps the most insidious example is: |
|
@example |
|
AHEAD |
|
BEGIN |
|
x |
|
[ 1 CS-ROLL ] THEN |
|
@{ x @} |
|
... |
|
UNTIL |
|
@end example |
|
|
This section uses the same example as for @code{objects} (@pxref{Basic Objects Usage}). |
This should be legal according to the visibility rule. The use of |
|
@code{x} can only be reached through the definition; but that appears |
|
textually below the use. |
|
|
You can define a class for graphical objects like this: |
From this example it is clear that the visibility rules cannot be fully |
|
implemented without major headaches. Our implementation treats common |
|
cases as advertised and the exceptions are treated in a safe way: The |
|
compiler makes a reasonable guess about the locals visible after a |
|
@code{BEGIN}; if it is too pessimistic, the |
|
user will get a spurious error about the local not being defined; if the |
|
compiler is too optimistic, it will notice this later and issue a |
|
warning. In the case above the compiler would complain about @code{x} |
|
being undefined at its use. You can see from the obscure examples in |
|
this section that it takes quite unusual control structures to get the |
|
compiler into trouble, and even then it will often do fine. |
|
|
@cindex @code{class} usage |
If the @code{BEGIN} is reachable from above, the most optimistic guess |
@cindex @code{class;} usage |
is that all locals visible before the @code{BEGIN} will also be |
@cindex @code{method} usage |
visible after the @code{BEGIN}. This guess is valid for all loops that |
|
are entered only through the @code{BEGIN}, in particular, for normal |
|
@code{BEGIN}...@code{WHILE}...@code{REPEAT} and |
|
@code{BEGIN}...@code{UNTIL} loops and it is implemented in our |
|
compiler. When the branch to the @code{BEGIN} is finally generated by |
|
@code{AGAIN} or @code{UNTIL}, the compiler checks the guess and |
|
warns the user if it was too optimistic: |
@example |
@example |
object class graphical \ "object" is the parent class |
IF |
method draw ( x y graphical -- ) |
@{ x @} |
class; |
BEGIN |
|
\ x ? |
|
[ 1 cs-roll ] THEN |
|
... |
|
UNTIL |
@end example |
@end example |
|
|
This code defines a class @code{graphical} with an |
Here, @code{x} lives only until the @code{BEGIN}, but the compiler |
operation @code{draw}. We can perform the operation |
optimistically assumes that it lives until the @code{THEN}. It notices |
@code{draw} on any @code{graphical} object, e.g.: |
this difference when it compiles the @code{UNTIL} and issues a |
|
warning. The user can avoid the warning, and make sure that @code{x} |
|
is not used in the wrong area by using explicit scoping: |
@example |
@example |
100 100 t-rex draw |
IF |
|
SCOPE |
|
@{ x @} |
|
ENDSCOPE |
|
BEGIN |
|
[ 1 cs-roll ] THEN |
|
... |
|
UNTIL |
@end example |
@end example |
|
|
@noindent |
Since the guess is optimistic, there will be no spurious error messages |
where @code{t-rex} is an object or object pointer, created with e.g. |
about undefined locals. |
@code{graphical : t-rex}. |
|
|
|
@cindex abstract class |
If the @code{BEGIN} is not reachable from above (e.g., after |
How do we create a graphical object? With the present definitions, |
@code{AHEAD} or @code{EXIT}), the compiler cannot even make an |
we cannot create a useful graphical object. The class |
optimistic guess, as the locals visible after the @code{BEGIN} may be |
@code{graphical} describes graphical objects in general, but not |
defined later. Therefore, the compiler assumes that no locals are |
any concrete graphical object type (C++ users would call it an |
visible after the @code{BEGIN}. However, the user can use |
@emph{abstract class}); e.g., there is no method for the selector |
@code{ASSUME-LIVE} to make the compiler assume that the same locals are |
@code{draw} in the class @code{graphical}. |
visible at the @code{BEGIN} as at the point where the top control-flow stack |
|
item was created. |
|
|
For concrete graphical objects, we define child classes of the |
doc-assume-live |
class @code{graphical}, e.g.: |
|
|
|
|
E.g., |
@example |
@example |
graphical class circle \ "graphical" is the parent class |
@{ x @} |
cell var circle-radius |
AHEAD |
how: |
ASSUME-LIVE |
: draw ( x y -- ) |
BEGIN |
circle-radius @@ draw-circle ; |
x |
|
[ 1 CS-ROLL ] THEN |
: init ( n-radius -- ) |
... |
circle-radius ! ; |
UNTIL |
class; |
|
@end example |
@end example |
|
|
Here we define a class @code{circle} as a child of @code{graphical}, |
Other cases where the locals are defined before the @code{BEGIN} can be |
with a field @code{circle-radius}; it defines new methods for the |
handled by inserting an appropriate @code{CS-ROLL} before the |
selectors @code{draw} and @code{init} (@code{init} is defined in |
@code{ASSUME-LIVE} (and changing the control-flow stack manipulation |
@code{object}, the parent class of @code{graphical}). |
behind the @code{ASSUME-LIVE}). |
|
|
Now we can create a circle in the dictionary with |
|
|
|
|
Cases where locals are defined after the @code{BEGIN} (but should be |
|
visible immediately after the @code{BEGIN}) can only be handled by |
|
rearranging the loop. E.g., the ``most insidious'' example above can be |
|
arranged into: |
@example |
@example |
50 circle : my-circle |
BEGIN |
|
@{ x @} |
|
... 0= |
|
WHILE |
|
x |
|
REPEAT |
@end example |
@end example |
|
|
@noindent |
@node How long do locals live?, Programming Style, Where are locals visible by name?, Gforth locals |
@code{:} invokes @code{init}, thus initializing the field |
@subsubsection How long do locals live? |
@code{circle-radius} with 50. We can draw this new circle at (100,100) |
@cindex locals lifetime |
with: |
@cindex lifetime of locals |
|
|
@example |
The right answer for the lifetime question would be: A local lives at |
100 100 my-circle draw |
least as long as it can be accessed. For a value-flavoured local this |
@end example |
means: until the end of its visibility. However, a variable-flavoured |
|
local could be accessed through its address far beyond its visibility |
|
scope. Ultimately, this would mean that such locals would have to be |
|
garbage collected. Since this entails un-Forth-like implementation |
|
complexities, I adopted the same cowardly solution as some other |
|
languages (e.g., C): The local lives only as long as it is visible; |
|
afterwards its address is invalid (and programs that access it |
|
afterwards are erroneous). |
|
|
@cindex selector invocation, restrictions |
@node Programming Style, Implementation, How long do locals live?, Gforth locals |
@cindex class definition, restrictions |
@subsubsection Programming Style |
Note: You can only invoke a selector if the receiving object belongs to |
@cindex locals programming style |
the class where the selector was defined or one of its descendants; |
@cindex programming style, locals |
e.g., you can invoke @code{draw} only for objects belonging to |
|
@code{graphical} or its descendants (e.g., @code{circle}). The scoping |
|
mechanism will check if you try to invoke a selector that is not |
|
defined in this class hierarchy, so you'll get an error at compilation |
|
time. |
|
|
|
|
The freedom to define locals anywhere has the potential to change |
|
programming styles dramatically. In particular, the need to use the |
|
return stack for intermediate storage vanishes. Moreover, all stack |
|
manipulations (except @code{PICK}s and @code{ROLL}s with run-time |
|
determined arguments) can be eliminated: If the stack items are in the |
|
wrong order, just write a locals definition for all of them; then |
|
write the items in the order you want. |
|
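For instance, a rotation of three stack items can be sketched without any |
stack manipulation words (the name @code{rot-locals} is hypothetical): |
|
@example |
: rot-locals @{ a b c -- b c a @} |
  b c a ; |
@end example |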
|
@node The OOF base class, Class Declaration, Basic OOF Usage, OOF |
This seems a little far-fetched and eliminating stack manipulations is |
@subsubsection The @file{oof.fs} base class |
unlikely to become a conscious programming objective. Still, the number |
@cindex @file{oof.fs} base class |
of stack manipulations will be reduced dramatically if local variables |
|
are used liberally (e.g., compare @code{max} in @ref{Gforth locals} with |
|
a traditional implementation of @code{max}). |
|
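For comparison, a traditional stack-based definition might look like the |
following sketch; the name @code{max-stack} is hypothetical and is used |
only to avoid redefining @code{max}: |
|
@example |
: max-stack ( n1 n2 -- n3 ) |
  2dup < if swap then drop ; |
@end example |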
|
When you define a class, you have to specify a parent class. So how do |
This shows one potential benefit of locals: making Forth programs more |
you start defining classes? There is one class available from the start: |
readable. Of course, this benefit will only be realized if the |
@code{object}. You have to use it as the ancestor of all classes. It is the |
programmers continue to honour the principle of factoring instead of |
only class that has no parent. Classes are also objects, except that |
using the added latitude to make the words longer. |
they don't have instance variables; class manipulation such as |
|
inheritance or changing definitions of a class is handled through |
|
selectors of the class @code{object}. |
|
|
|
@code{object} provides a number of selectors: |
@cindex single-assignment style for locals |
|
Using @code{TO} can and should be avoided. Without @code{TO}, |
|
every value-flavoured local has only a single assignment and many |
|
advantages of functional languages apply to Forth. I.e., programs are |
|
easier to analyse, to optimize and to read: It is clear from the |
|
definition what the local stands for, it does not turn into something |
|
different later. |
|
|
@itemize @bullet |
E.g., a definition using @code{TO} might look like this: |
@item |
@example |
@code{class} for subclassing, @code{definitions} to add definitions |
: strcmp @{ addr1 u1 addr2 u2 -- n @} |
later on, and @code{class?} to get type information (is the class a |
u1 u2 min 0 |
subclass of the class passed on the stack?). |
?do |
doc---object-class |
addr1 c@@ addr2 c@@ - |
doc---object-definitions |
?dup-if |
doc---object-class? |
unloop exit |
|
then |
|
addr1 char+ TO addr1 |
|
addr2 char+ TO addr2 |
|
loop |
|
u1 u2 - ; |
|
@end example |
|
Here, @code{TO} is used to update @code{addr1} and @code{addr2} at |
|
every loop iteration. @code{strcmp} is a typical example of the |
|
readability problems of using @code{TO}. When you start reading |
|
@code{strcmp}, you think that @code{addr1} refers to the start of the |
|
string. Only near the end of the loop do you realize that it is something |
|
else. |
|
|
@item |
This can be avoided by defining two locals at the start of the loop that |
@code{init} and @code{dispose} as constructor and destructor of the |
are initialized with the right value for the current iteration. |
object. @code{init} is invoked after the object's memory is allocated, |
@example |
while @code{dispose} also handles deallocation. Thus if you redefine |
: strcmp @{ addr1 u1 addr2 u2 -- n @} |
@code{dispose}, you have to call the parent's dispose with @code{super |
addr1 addr2 |
dispose}, too. |
u1 u2 min 0 |
doc---object-init |
?do @{ s1 s2 @} |
doc---object-dispose |
s1 c@@ s2 c@@ - |
|
?dup-if |
|
unloop exit |
|
then |
|
s1 char+ s2 char+ |
|
loop |
|
2drop |
|
u1 u2 - ; |
|
@end example |
|
Here it is clear from the start that @code{s1} has a different value |
|
in every loop iteration. |
|
|
@item |
@node Implementation, , Programming Style, Gforth locals |
@code{new}, @code{new[]}, @code{:}, @code{ptr}, @code{asptr}, and |
@subsubsection Implementation |
@code{[]} to create named and unnamed objects and object arrays or |
@cindex locals implementation |
object pointers. |
@cindex implementation of locals |
doc---object-new |
|
doc---object-new[] |
|
doc---object-: |
|
doc---object-ptr |
|
doc---object-asptr |
|
doc---object-[] |
|
|
|
@item |
@cindex locals stack |
@code{::} and @code{super} for explicit scoping. You should use explicit |
Gforth uses an extra locals stack. The most compelling reason for |
scoping only for super classes or classes with the same set of instance |
this is that the return stack is not float-aligned; using an extra stack |
variables. Explicitly-scoped selectors use early binding. |
also eliminates the problems and restrictions of using the return stack |
doc---object-:: |
as locals stack. Like the other stacks, the locals stack grows toward |
doc---object-super |
lower addresses. A few primitives allow an efficient implementation: |
|
|
@item |
doc-@local# |
@code{self} to get the address of the object |
doc-f@local# |
doc---object-self |
doc-laddr# |
|
doc-lp+!# |
|
doc-lp! |
|
doc->l |
|
doc-f>l |
|
|
@item |
In addition to these primitives, some specializations of these |
@code{bind}, @code{bound}, @code{link}, and @code{is} to assign object |
primitives for commonly occurring inline arguments are provided for |
pointers and instance defers. |
efficiency reasons, e.g., @code{@@local0} as specialization of |
doc---object-bind |
@code{@@local#} for the inline argument 0. The following compiling words |
doc---object-bound |
compile the right specialized version, or the general version, as |
doc---object-link |
appropriate: |
doc---object-is |
|
|
|
@item |
doc-compile-@local |
@code{'} to obtain selector tokens, @code{send} to invoke selectors |
doc-compile-f@local |
from the stack, and @code{postpone} to generate selector invocation code. |
doc-compile-lp+! |
doc---object-' |
|
doc---object-postpone |
|
|
|
@item |
Combinations of conditional branches and @code{lp+!#} like |
@code{with} and @code{endwith} to select the active object from the |
@code{?branch-lp+!#} (the locals pointer is only changed if the branch |
stack, and enable its scope. Using @code{with} and @code{endwith} |
is taken) are provided for efficiency and correctness in loops. |
also allows you to create code using selector @code{postpone} without being |
|
trapped by the state-smart objects. |
|
doc---object-with |
|
doc---object-endwith |
|
|
|
@end itemize |
A special area in the dictionary space is reserved for keeping the |
|
local variable names. @code{@{} switches the dictionary pointer to this |
|
area and @code{@}} switches it back and generates the locals |
|
initializing code. @code{W:} etc.@ are normal defining words. This |
|
special area is cleared at the start of every colon definition. |
|
|
@node Class Declaration, Class Implementation, The OOF base class, OOF |
@cindex word list for defining locals |
@subsubsection Class Declaration |
A special feature of Gforth's dictionary is used to implement the |
@cindex class declaration |
definition of locals without type specifiers: every word list (aka |
|
vocabulary) has its own methods for searching |
|
etc. (@pxref{Word Lists}). For the present purpose we defined a word list |
|
with a special search method: When it is searched for a word, it |
|
actually creates that word using @code{W:}. @code{@{} changes the search |
|
order to first search the word list containing @code{@}}, @code{W:} etc., |
|
and then the word list for defining locals without type specifiers. |
|
|
@itemize @bullet |
The lifetime rules support a stack discipline within a colon |
@item |
definition: The lifetime of a local is either nested with other locals' |
Instance variables |
lifetimes or it does not overlap them. |
doc---oof-var |
|
|
|
@item |
At @code{BEGIN}, @code{IF}, and @code{AHEAD} no code for locals stack |
Object pointers |
pointer manipulation is generated. Between control structure words |
doc---oof-ptr |
locals definitions can push locals onto the locals stack. @code{AGAIN} |
doc---oof-asptr |
is the simplest of the other three control flow words. It has to |
|
restore the locals stack depth of the corresponding @code{BEGIN} |
@item |
before branching. The code looks like this: |
Instance defers |
@format |
doc---oof-defer |
@code{lp+!#} current-locals-size @minus{} dest-locals-size |
|
@code{branch} <begin> |
@item |
@end format |
Method selectors |
|
doc---oof-early |
|
doc---oof-method |
|
|
|
@item |
|
Class-wide variables |
|
doc---oof-static |
|
|
|
@item |
@code{UNTIL} is a little more complicated: If it branches back, it |
End declaration |
must adjust the stack just like @code{AGAIN}. But if it falls through, |
doc---oof-how: |
the locals stack must not be changed. The compiler generates the |
doc---oof-class; |
following code: |
|
@format |
|
@code{?branch-lp+!#} <begin> current-locals-size @minus{} dest-locals-size |
|
@end format |
|
The locals stack pointer is only adjusted if the branch is taken. |
|
|
@end itemize |
@code{THEN} can produce somewhat inefficient code: |
|
@format |
|
@code{lp+!#} current-locals-size @minus{} orig-locals-size |
|
<orig target>: |
|
@code{lp+!#} orig-locals-size @minus{} new-locals-size |
|
@end format |
|
The second @code{lp+!#} adjusts the locals stack pointer from the |
|
level at the @var{orig} point to the level after the @code{THEN}. The |
|
first @code{lp+!#} adjusts the locals stack pointer from the current |
|
level to the level at the orig point, so the complete effect is an |
|
adjustment from the current level to the right level after the |
|
@code{THEN}. |
|
|
@c ------------------------------------------------------------- |
@cindex locals information on the control-flow stack |
@node Class Implementation, , Class Declaration, OOF |
@cindex control-flow stack items, locals information |
@subsubsection Class Implementation |
In a conventional Forth implementation a dest control-flow stack entry |
@cindex class implementation |
is just the target address and an orig entry is just the address to be |
|
patched. Our locals implementation adds a word list to every orig or dest |
|
item. It is the list of locals visible (or assumed visible) at the point |
|
described by the entry. Our implementation also adds a tag to identify |
|
the kind of entry, in particular to differentiate between live and dead |
|
(reachable and unreachable) orig entries. |
|
|
@c ------------------------------------------------------------- |
A few unusual operations have to be performed on locals word lists: |
@node Mini-OOF, Comparison with other object models, OOF, Object-oriented Forth |
|
@subsection The @file{mini-oof.fs} model |
|
@cindex mini-oof |
|
|
|
Gforth's third object oriented Forth package is a 12-liner. It uses a |
doc-common-list |
mixture of the @file{objects.fs} and the @file{oof.fs} syntax, |
doc-sub-list? |
and reduces to the bare minimum of features. This is based on a posting |
doc-list-size |
of Bernd Paysan in comp.arch. |
|
|
|
@menu |
Several features of our locals word list implementation make these |
* Basic Mini-OOF Usage:: |
operations easy to implement: The locals word lists are organised as |
* Mini-OOF Example:: |
linked lists; the tails of these lists are shared, if the lists |
* Mini-OOF Implementation:: |
contain some of the same locals; and the address of a name is greater |
@end menu |
than the address of the names behind it in the list. |
|
|
@c ------------------------------------------------------------- |
Another important implementation detail is the variable |
@node Basic Mini-OOF Usage, Mini-OOF Example, , Mini-OOF |
@code{dead-code}. It is used by @code{BEGIN} and @code{THEN} to |
@subsubsection Basic @file{mini-oof.fs} Usage |
determine if they can be reached directly or only through the branch |
@cindex mini-oof usage |
that they resolve. @code{dead-code} is set by @code{UNREACHABLE}, |
|
@code{AHEAD}, @code{EXIT} etc., and cleared at the start of a colon |
|
definition, by @code{BEGIN} and usually by @code{THEN}. |
|
|
There is a base class (@code{class}, which allocates one cell |
Counted loops are similar to other loops in most respects, but |
for the object pointer) plus seven other words: to define a method, a |
@code{LEAVE} requires special attention: It performs basically the same |
variable, a class; to end a class, to resolve binding, to allocate an |
service as @code{AHEAD}, but it does not create a control-flow stack |
object and to compile a class method. |
entry. Therefore the information has to be stored elsewhere; |
@comment TODO better description of the last one |
traditionally, the information was stored in the target fields of the |
|
branches created by the @code{LEAVE}s, by organizing these fields into a |
|
linked list. Unfortunately, this clever trick does not provide enough |
|
space for storing our extended control flow information. Therefore, we |
|
introduce another stack, the leave stack. It contains the control-flow |
|
stack entries for all unresolved @code{LEAVE}s. |
|
|
doc-object |
Local names are kept until the end of the colon definition, even if |
doc-method |
they are no longer visible in any control-flow path. In a few cases |
doc-var |
this may lead to increased space needs for the locals name area, but |
doc-class |
usually less than reclaiming this space would cost in code size. |
doc-end-class |
|
doc-defines |
|
doc-new |
|
doc-:: |
|
|
|
|
|
@c ------------------------------------------------------------- |
@node ANS Forth locals, , Gforth locals, Locals |
@node Mini-OOF Example, Mini-OOF Implementation, Basic Mini-OOF Usage, Mini-OOF |
@subsection ANS Forth locals |
@subsubsection Mini-OOF Example |
@cindex locals, ANS Forth style |
@cindex mini-oof example |
|
|
|
A short example shows how to use this package. |
The ANS Forth locals wordset does not define a syntax for locals, but |
@comment nac TODO could flesh this out with some comments from the Forthwrite article |
words that make it possible to define various syntaxes. One of the |
|
possible syntaxes is a subset of the syntax we used in the Gforth locals |
|
wordset, i.e.: |
|
|
@example |
@example |
object class |
@{ local1 local2 ... -- comment @} |
method init |
|
method draw |
|
end-class graphical |
|
@end example |
@end example |
|
@noindent |
This code defines a class @code{graphical} with an |
or |
operation @code{draw}. We can perform the operation |
|
@code{draw} on any @code{graphical} object, e.g.: |
|
|
|
@example |
@example |
100 100 t-rex draw |
@{ local1 local2 ... @} |
@end example |
@end example |
|
|
where @code{t-rex} is an object or object pointer, created with e.g. |
The order of the locals corresponds to the order in a stack comment. The |
@code{graphical new Constant t-rex}. |
restrictions are: |
|
|
For concrete graphical objects, we define child classes of the |
|
class @code{graphical}, e.g.: |
|
|
|
@example |
@itemize @bullet |
graphical class |
@item |
cell var circle-radius |
Locals can only be cell-sized values (no type specifiers are allowed). |
end-class circle \ "graphical" is the parent class |
@item |
|
Locals can be defined only outside control structures. |
|
@item |
|
Locals can interfere with explicit usage of the return stack. For the |
|
exact (and long) rules, see the standard. If you don't use return stack |
|
accessing words in a definition using locals, you will be all right. The |
|
purpose of this rule is to make locals implementation on the return |
|
stack easier. |
|
@item |
|
The whole definition must be in one line. |
|
@end itemize |
|
|
:noname ( x y -- ) |
Locals defined in this way behave like @code{VALUE}s (@pxref{Simple |
circle-radius @@ draw-circle ; circle defines draw |
Defining Words}). I.e., they are initialized from the stack. Using their |
:noname ( r -- ) |
name produces their value. Their value can be changed using @code{TO}. |
circle-radius ! ; circle defines init |
|
@end example |
|
|
|
There is no implicit init method, so we have to define one. The creation |
Since this syntax is supported by Gforth directly, you need not do |
code of the object now has to call init explicitly. |
anything to use it. If you want to port a program using this syntax to |
|
another ANS Forth system, use @file{compat/anslocal.fs} to implement the |
|
syntax on the other system. |
|
|
@example |
Note that a syntax shown in the standard, section A.13 looks |
circle new Constant my-circle |
similar, but is quite different in having the order of locals |
50 my-circle init |
reversed. Beware! |
@end example |
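
As a rough illustration of the ANS Forth locals syntax discussed in this
section (the word @code{add-twice} is a made-up name, not part of
Gforth), the following sketch shows that locals are initialized from the
stack in stack-comment order and that their value can be changed with
@code{TO}:

@example
\ x is initialized with 3 and y with 4, as in a stack comment
: add-twice @{ x y -- n @} x y + TO x  x y + ;
3 4 add-twice .   \ prints 11
@end example
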
|
|
|
It is also possible to add a function to create named objects with |
The ANS Forth locals wordset itself consists of a word: |
automatic call of @code{init}, given that all objects have @code{init} |
|
in the same place: |
|
|
|
@example |
doc-(local) |
: new: ( .. o "name" -- ) |
|
new dup Constant init ; |
|
80 circle new: large-circle |
|
@end example |
|
|
|
We can draw this new circle at (100,100) with: |
The ANS Forth locals extension wordset defines a syntax using @code{locals|}, but it is so |
|
awful that we strongly recommend against using it. We have implemented this |
|
syntax to make porting to Gforth easy, but do not document it here. The |
|
problem with this syntax is that the locals are defined in an order |
|
reversed with respect to the standard stack comment notation, making |
|
programs harder to read, and easier to misread and miswrite. The only |
|
merit of this syntax is that it is easy to implement using the ANS Forth |
|
locals wordset. |
|
|
@example |
|
100 100 my-circle draw |
|
@end example |
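
The same steps (declaring a class, defining its methods, creating an
object and initializing it explicitly) can be sketched in a
self-contained form. This assumes that @file{mini-oof.fs} has been
loaded; the names @code{counter}, @code{ticks}, @code{bump} and
@code{c1} are hypothetical and chosen only for this illustration:

@example
object class
  cell var ticks        \ one cell-sized instance variable
  method init
  method bump
end-class counter

:noname ( o -- ) 0 swap ticks ! ;  counter defines init
:noname ( o -- ) 1 swap ticks +! ; counter defines bump

counter new Constant c1
c1 init  c1 bump  c1 bump
c1 ticks @@ .           \ prints 2
@end example

Each method consumes the object it is invoked on, which is why both
@code{:noname} definitions use up the @code{o} on the top of the stack.
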
|
|
|
@node Mini-OOF Implementation, , Mini-OOF Example, Mini-OOF |
@c ---------------------------------------------------------- |
@subsubsection @file{mini-oof.fs} Implementation |
@node Structures, Object-oriented Forth, Locals, Words |
|
@section Structures |
|
@cindex structures |
|
@cindex records |
|
|
Object-oriented systems with late binding typically use a |
This section presents the structure package that comes with Gforth. A |
"vtable"-approach: the first variable in each object is a pointer to a |
version of the package implemented in ANS Forth is available in |
table, which contains the methods as function pointers. The vtable |
@file{compat/struct.fs}. This package was inspired by a posting on |
may also contain other information. |
comp.lang.forth in 1989 (unfortunately I don't remember by whom; |
|
possibly John Hayes). A version of this section has been published in |
|
???. Marcel Hendrix provided helpful comments. |
|
|
So first, let's declare methods: |
@menu |
|
* Why explicit structure support?:: |
|
* Structure Usage:: |
|
* Structure Naming Convention:: |
|
* Structure Implementation:: |
|
* Structure Glossary:: |
|
@end menu |
|
|
@example |
@node Why explicit structure support?, Structure Usage, Structures, Structures |
: method ( m v -- m' v ) Create over , swap cell+ swap |
@subsection Why explicit structure support? |
DOES> ( ... o -- ... ) @@ over @@ + @@ execute ; |
|
@end example |
|
|
|
During method declaration, the number of methods and instance |
@cindex address arithmetic for structures |
variables is on the stack (in address units). @code{method} creates |
@cindex structures using address arithmetic |
one method and increments the method number. To execute a method, it |
If we want to use a structure containing several fields, we could simply |
takes the object, fetches the vtable pointer, adds the offset, and |
reserve memory for it, and access the fields using address arithmetic |
executes the @var{xt} stored there. Each method takes the object it is |
(@pxref{Address arithmetic}). As an example, consider a structure with |
invoked from as top of stack parameter. The method itself should |
the following fields |
consume that object. |
|
|
|
Now, we also have to declare instance variables |
@table @code |
|
@item a |
|
is a float |
|
@item b |
|
is a cell |
|
@item c |
|
is a float |
|
@end table |
|
|
@example |
Given the (float-aligned) base address of the structure we get the |
: var ( m v size -- m v' ) Create over , + |
address of the field |
DOES> ( o -- addr ) @@ + ; |
|
@end example |
|
|
|
As before, a word is created with the current offset. Instance |
@table @code |
variables can have different sizes (cells, floats, doubles, chars), so |
@item a |
all we do is take the size and add it to the offset. If your machine |
without doing anything further. |
has alignment restrictions, put the proper @code{aligned} or |
@item b |
@code{faligned} before the variable, to adjust the variable |
with @code{float+} |
offset. That's why it is on the top of stack. |
@item c |
|
with @code{float+ cell+ faligned} |
|
@end table |
|
|
We need a starting point (the base object) and some syntactic sugar: |
It is easy to see that this can become quite tiring. |
|
|
@example |
Moreover, it is not very readable, because seeing a |
Create object 1 cells , 2 cells , |
@code{cell+} tells us neither which kind of structure is |
: class ( class -- class methods vars ) dup 2@@ ; |
accessed nor what field is accessed; we have to somehow infer the kind |
@end example |
of structure, and then look up in the documentation, which field of |
|
that structure corresponds to that offset. |
|
|
For inheritance, the vtable of the parent object has to be |
Finally, this kind of address arithmetic also causes maintenance |
copied when a new, derived class is declared. This gives all the |
troubles: If you add or delete a field somewhere in the middle of the |
methods of the parent class, which can be overridden, though. |
structure, you have to find and change all computations for the fields |
|
afterwards. |
|
|
|
So, instead of using @code{cell+} and friends directly, how |
|
about storing the offsets in constants: |
|
|
@example |
@example |
: end-class ( class methods vars -- ) |
0 constant a-offset |
Create here >r , dup , 2 cells ?DO ['] noop , 1 cells +LOOP |
0 float+ constant b-offset |
cell+ dup cell+ r> rot @@ 2 cells /string move ; |
0 float+ cell+ faligned constant c-offset |
@end example |
@end example |
|
|
The first line creates the vtable, initialized with |
Now we can get the address of field @code{x} with @code{x-offset |
@code{noop}s. The second line is the inheritance mechanism; it |
+}. This is much better in all respects. Of course, you still |
copies the xts from the parent vtable. |
have to change all later offset definitions if you add a field. You can |
|
fix this by declaring the offsets in the following way: |
We still have no way to define new methods, let's do that now: |
|
|
|
@example |
@example |
: defines ( xt class -- ) ' >body @@ + ! ; |
0 constant a-offset |
|
a-offset float+ constant b-offset |
|
b-offset cell+ faligned constant c-offset |
@end example |
@end example |
|
|
To allocate a new object, we need a word, too: |
Since we always use the offsets with @code{+}, we could use a defining |
|
word @code{cfield} that includes the @code{+} in the action of the |
|
defined word: |
|
|
@example |
@example |
: new ( class -- o ) here over @@ allot swap over ! ; |
: cfield ( n "name" -- ) |
|
create , |
|
does> ( name execution: addr1 -- addr2 ) |
|
@@ + ; |
|
|
|
0 cfield a |
|
0 a float+ cfield b |
|
0 b cell+ faligned cfield c |
@end example |
@end example |
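
As a minimal sketch of how the @code{cfield} words might be used, the
following assumes the definitions of @code{cfield}, @code{a}, @code{b}
and @code{c} given above; the name @code{s1} is hypothetical. Since the
structure starts with a float, the space reserved for it is
float-aligned first:

@example
\ reserve float-aligned space for one structure in the dictionary
falign here  0 c float+ allot  Constant s1

1.5e0 s1 a f!     \ field a: a float at offset 0
42    s1 b !      \ field b: a cell after the float
2.5e0 s1 c f!     \ field c: a float at the next float-aligned offset
s1 b @@ .         \ prints 42
@end example

Here @code{0 c float+} computes the total size of the structure: the
offset of the last field plus its size.
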
|
|
Sometimes derived classes want to access the method of the |
Instead of @code{x-offset +}, we now simply write @code{x}. |
parent object. There are two ways to achieve this with Mini-OOF: |
|
first, you could use named words, and second, you could look up the |
The structure field words now can be used quite nicely. However, |
vtable of the parent object. |
their definition is still a bit cumbersome: We have to repeat the |
|
name, the information about size and alignment is distributed before |
|
and after the field definitions etc. The structure package presented |
|
here addresses these problems. |
|
|
|
@node Structure Usage, Structure Naming Convention, Why explicit structure support?, Structures |
|
@subsection Structure Usage |
|
@cindex structure usage |
|
|
|
@cindex @code{field} usage |
|
@cindex @code{struct} usage |
|
@cindex @code{end-struct} usage |
|
You can define a structure for a (data-less) linked list with: |
@example |
@example |
: :: ( class "name" -- ) ' >body @@ + @@ compile, ; |
struct |
|
cell% field list-next |
|
end-struct list% |
@end example |
@end example |
|
|
|
With the address of the list node on the stack, you can compute the |
Nothing can be more confusing than a good example, so here is |
address of the field that contains the address of the next node with |
one. First let's declare a text object (called |
@code{list-next}. E.g., you can determine the length of a list |
@code{button}), that stores text and position: |
with: |
|
|
@example |
@example |
object class |
: list-length ( list -- n ) |
cell var text |
\ "list" is a pointer to the first element of a linked list |
cell var len |
\ "n" is the length of the list |
cell var x |
0 BEGIN ( list1 n1 ) |
cell var y |
over |
method init |
WHILE ( list1 n1 ) |
method draw |
1+ swap list-next @@ swap |
end-class button |
REPEAT |
|
nip ; |
@end example |
@end example |
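
A short usage sketch for the linked-list words, assuming the
definitions of @code{list%} and @code{list-length} given above. The
nodes are reserved in the dictionary with @code{list% %allot}; the
names @code{node1} and @code{node2} are hypothetical:

@example
list% %allot Constant node1
list% %allot Constant node2

node2 node1 list-next !     \ node1 points to node2
0     node2 list-next !     \ node2 ends the list

node1 list-length .         \ prints 2
@end example
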
|
|
@noindent |
You can reserve memory for a list node in the dictionary with |
Now, implement the two methods, @code{draw} and @code{init}: |
@code{list% %allot}, which leaves the address of the list node on the |
|
stack. For the equivalent allocation on the heap you can use @code{list% |
|
%alloc} (or, for an @code{allocate}-like stack effect (i.e., with ior), |
|
use @code{list% %allocate}). You can get the size of a list |
|
node with @code{list% %size} and its alignment with @code{list% |
|
%alignment}. |
|
|
|
Note that in ANS Forth the body of a @code{create}d word is |
|
@code{aligned} but not necessarily @code{faligned}; |
|
therefore, if you do a: |
@example |
@example |
:noname ( o -- ) |
create @emph{name} foo% %allot |
>r r@@ x @@ r@@ y @@ at-xy r@@ text @@ r> len @@ type ; |
|
button defines draw |
|
:noname ( addr u o -- ) |
|
>r 0 r@@ x ! 0 r@@ y ! r@@ len ! r> text ! ; |
|
button defines init |
|
@end example |
@end example |
|
|
@noindent |
@noindent |
To demonstrate inheritance, we define a class @code{bold-button}, with no |
then the memory allotted for @code{foo%} is |
new data and no new methods. |
guaranteed to start at the body of @code{@emph{name}} only if |
|
@code{foo%} contains only character, cell and double fields. |
|
|
|
@cindex structures containing structures |
|
You can include a structure @code{foo%} as a field of |
|
another structure, like this: |
@example |
@example |
button class |
struct |
end-class bold-button |
... |
|
foo% field ... |
: bold 27 emit ." [1m" ; |
... |
: normal 27 emit ." [0m" ; |
end-struct ... |
|
@end example |
|
|
@noindent |
@cindex structure extension |
The class @code{bold-button} has a different draw method to |
@cindex extended records |
@code{button}, but the new method is defined in terms of the draw method |
Instead of starting with an empty structure, you can extend an |
for @code{button}: |
existing structure. E.g., a plain linked list without data, as defined |
|
above, is hardly useful; you can extend it to a linked list of integers, |
|
like this:@footnote{This feature is also known as @emph{extended |
|
records}. It is the main innovation in the Oberon language; in other |
|
words, adding this feature to Modula-2 led Wirth to create a new |
|
language, write a new compiler etc. Adding this feature to Forth just |
|
required a few lines of code.} |
|
|
:noname bold [ button :: draw ] normal ; bold-button defines draw |
@example |
|
list% |
|
cell% field intlist-int |
|
end-struct intlist% |
@end example |
@end example |
|
|
@noindent |
@code{intlist%} is a structure with two fields: |
Finally, create two objects and apply methods: |
@code{list-next} and @code{intlist-int}. |
|
|
|
@cindex structures containing arrays |
|
You can specify an array type containing @emph{n} elements of |
|
type @code{foo%} like this: |
|
|
@example |
@example |
button new Constant foo |
foo% @emph{n} * |
s" thin foo" foo init |
|
page |
|
foo draw |
|
bold-button new Constant bar |
|
s" fat bar" bar init |
|
1 bar y ! |
|
bar draw |
|
@end example |
@end example |
|
|
|
You can use this array type in any place where you can use a normal |
|
type, e.g., when defining a @code{field}, or with |
|
@code{%allot}. |
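
As an illustration of an array type used inside a field definition,
here is a hedged sketch using the structure package; the names
@code{buf%}, @code{buf-count}, @code{buf-items} and @code{my-buf} are
hypothetical:

@example
struct
  cell%     field buf-count
  cell% 8 * field buf-items     \ an array of 8 cells
end-struct buf%

buf% %allot Constant my-buf
0 my-buf buf-count !
7 my-buf buf-items 3 cells + !  \ store 7 into element 3
my-buf buf-items 3 cells + @@ . \ prints 7
@end example

Because the size is on the top of the stack in a type description,
multiplying it by the element count is all that is needed to describe
the array.
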
|
|
@node Comparison with other object models, , Mini-OOF, Object-oriented Forth |
@cindex first field optimization |
@subsubsection Comparison with other object models |
The first field is at the base address of a structure and the word |
@cindex comparison of object models |
for this field (e.g., @code{list-next}) actually does not change |
@cindex object models, comparison |
the address on the stack. You may be tempted to leave it out in the |
|
interest of run-time and space efficiency. This is not necessary, |
|
because the structure package optimizes this case and compiling such |
|
words does not generate any code. So, in the interest of readability |
|
and maintainability you should include the word for the field when |
|
accessing the field. |
|
|
Many object-oriented Forth extensions have been proposed (@cite{A survey |
@node Structure Naming Convention, Structure Implementation, Structure Usage, Structures |
of object-oriented Forths} (SIGPLAN Notices, April 1996) by Bradford |
@subsection Structure Naming Convention |
J. Rodriguez and W. F. S. Poehlman lists 17). This section discusses the |
@cindex structure naming convention |
relation of the object models described here to two well-known and two |
|
closely-related (by the use of method maps) models. |
|
|
|
@cindex Neon model |
The field names that come to (my) mind are often quite generic, and, |
The most popular model currently seems to be the Neon model (see |
if used, would cause frequent name clashes. E.g., many structures |
@cite{Object-oriented programming in ANS Forth} (Forth Dimensions, March |
probably contain a @code{counter} field. The structure names |
1997) by Andrew McKewan) but this model has a number of limitations |
that come to (my) mind are often also the logical choice for the names |
@footnote{A longer version of this critique can be |
of words that create such a structure. |
found in @cite{On Standardizing Object-Oriented Forth Extensions} (Forth |
|
Dimensions, May 1997) by Anton Ertl.}: |
Therefore, I have adopted the following naming conventions: |
|
|
@itemize @bullet |
@itemize @bullet |
|
@cindex field naming convention |
@item |
@item |
It uses a @code{@emph{selector |
The names of fields are of the form |
object}} syntax, which makes it unnatural to pass objects on the |
@code{@emph{struct}-@emph{field}}, where |
stack. |
@code{@emph{struct}} is the basic name of the structure, and |
|
@code{@emph{field}} is the basic name of the field. You can |
|
think of field words as converting the (address of the) |
|
structure into the (address of the) field. |
|
|
|
@cindex structure naming convention |
@item |
@item |
It requires that the selector parses the input stream (at |
The names of structures are of the form |
compile time); this leads to reduced extensibility and to bugs that are |
@code{@emph{struct}%}, where |
hard to find. |
@code{@emph{struct}} is the basic name of the structure. |
|
@end itemize |
|
|
@item |
This naming convention does not work that well for fields of extended |
It allows using every selector to every object; |
structures; e.g., the integer list structure has a field |
this eliminates the need for classes, but makes it harder to create |
@code{intlist-int}, but has @code{list-next}, not |
efficient implementations. |
@code{intlist-next}. |
@end itemize |
|
|
|
@cindex Pountain's object-oriented model |
@node Structure Implementation, Structure Glossary, Structure Naming Convention, Structures |
Another well-known publication is @cite{Object-Oriented Forth} (Academic |
@subsection Structure Implementation |
Press, London, 1987) by Dick Pountain. However, it is not really about |
@cindex structure implementation |
object-oriented programming, because it hardly deals with late |
@cindex implementation of structures |
binding. Instead, it focuses on features like information hiding and |
|
overloading that are characteristic of modular languages like Ada (83). |
|
|
|
@cindex Zsoter's object-oriented model |
The central idea in the implementation is to pass the data about the |
In @cite{Does late binding have to be slow?} (Forth Dimensions 18(1) 1996, pages 31-35) |
structure being built on the stack, not in some global |
Andras Zsoter describes a model that makes heavy use of an active object |
variable. Everything else falls into place naturally once this design |
(like @code{this} in @file{objects.fs}): The active object is not only |
decision is made. |
used for accessing all fields, but also specifies the receiving object |
|
of every selector invocation; you have to change the active object |
|
explicitly with @code{@{ ... @}}, whereas in @file{objects.fs} it |
|
changes more or less implicitly at @code{m: ... ;m}. Such a change at |
|
the method entry point is unnecessary with Zsoter's model, because |
|
the receiving object is the active object already. On the other hand, the explicit |
|
change is absolutely necessary in that model, because otherwise no one |
|
could ever change the active object. An ANS Forth implementation of this |
|
model is available at @url{http://www.forth.org/fig/oopf.html}. |
|
|
|
@cindex @file{oof.fs}, differences to other models |
The type description on the stack is of the form @emph{align |
The @file{oof.fs} model combines information hiding and overloading |
size}. Keeping the size on the top-of-stack makes dealing with arrays |
resolution (by keeping names in various word lists) with object-oriented |
very simple. |
programming. It sets the active object implicitly on method entry, but |
|
also allows explicit changing (with @code{>o...o>} or with |
|
@code{with...endwith}). It uses parsing and state-smart objects and |
|
classes for resolving overloading and for early binding: the object or |
|
class parses the selector and determines the method from this. If the |
|
selector is not parsed by an object or class, it performs a call to the |
|
selector for the active object (late binding), like Zsoter's model. |
|
Fields are always accessed through the active object. The big |
|
disadvantage of this model is the parsing and the state-smartness, which |
|
reduces extensibility and increases the opportunities for subtle bugs; |
|
essentially, you are only safe if you never tick or @code{postpone} an |
|
object or class (Bernd disagrees, but I (Anton) am not convinced). |
|
|
|
@cindex @file{mini-oof.fs}, differences to other models |
@code{field} is a defining word that uses @code{Create} |
The @file{mini-oof.fs} model is quite similar to a very stripped-down version of |
and @code{DOES>}. The body of the field contains the offset |
the @file{objects.fs} model, but syntactically it is a mixture of the @file{objects.fs} and |
of the field, and the normal @code{DOES>} action is simply: |
@file{oof.fs} models. |
|
|
|
|
@example |
|
@@ + |
|
@end example |
|
|
|
@noindent |
|
i.e., add the offset to the address, giving the stack effect |
|
@var{addr1 -- addr2} for a field. |
|
|
@c ------------------------------------------------------------- |
@cindex first field optimization, implementation |
@node Tokens for Words, Word Lists, Object-oriented Forth, Words |
This simple structure is slightly complicated by the optimization |
@section Tokens for Words |
for fields with offset 0, which requires a different |
@cindex tokens for words |
@code{DOES>}-part (because we cannot rely on there being |
|
something on the stack if such a field is invoked during |
|
compilation). Therefore, we put the different @code{DOES>}-parts |
|
in separate words, and decide which one to invoke based on the |
|
offset. For a zero offset, the field is basically a noop; it is |
|
immediate, and therefore no code is generated when it is compiled. |
|
|
This chapter describes the creation and use of tokens that represent |
@node Structure Glossary, , Structure Implementation, Structures |
words on the stack (and in data space). |
@subsection Structure Glossary |
|
@cindex structure glossary |
|
|
Named words have interpretation and compilation semantics. Unnamed words |
doc-%align |
just have execution semantics. |
doc-%alignment |
|
doc-%alloc |
|
doc-%allocate |
|
doc-%allot |
|
doc-cell% |
|
doc-char% |
|
doc-dfloat% |
|
doc-double% |
|
doc-end-struct |
|
doc-field |
|
doc-float% |
|
doc-naligned |
|
doc-sfloat% |
|
doc-%size |
|
doc-struct |
|
|
@comment TODO ?normally interpretation semantics are the execution semantics. |
@c ------------------------------------------------------------- |
@comment this should all be covered in earlier ss |
@node Object-oriented Forth, Passing Commands to the OS, Structures, Words |
|
@section Object-oriented Forth |
|
|
@cindex execution token |
Gforth comes with three packages for object-oriented programming: |
An @dfn{execution token} represents the execution semantics of an |
@file{objects.fs}, @file{oof.fs}, and @file{mini-oof.fs}; none of them |
unnamed word. An execution token occupies one cell. As explained in |
is preloaded, so you have to @code{include} them before use. The most |
@ref{Supplying names}, the execution token of the last word |
important differences between these packages (and others) are discussed |
defined can be produced with @code{lastxt}. |
in @ref{Comparison with other object models}. All packages are written |
|
in ANS Forth and can be used with any other ANS Forth. |
|
|
You can perform the semantics represented by an execution token with: |
@menu |
doc-execute |
* Why object-oriented programming?:: |
You can compile the word with: |
* Object-Oriented Terminology:: |
doc-compile, |
* Objects:: |
|
* OOF:: |
|
* Mini-OOF:: |
|
* Comparison with other object models:: |
|
@end menu |
|
|
@cindex code field address |
|
@cindex CFA |
|
In Gforth, the abstract data type @emph{execution token} is implemented |
|
as CFA (code field address). |
|
@comment TODO note that the standard does not say what it represents.. |
|
@comment and you cannot necessarily compile it in all Forths (eg native |
|
@comment compilers?). |
|
|
|
The interpretation semantics of a named word are also represented by an |
@node Why object-oriented programming?, Object-Oriented Terminology, , Object-oriented Forth |
execution token. You can get it with |
@subsubsection Why object-oriented programming? |
|
@cindex object-oriented programming motivation |
|
@cindex motivation for object-oriented programming |
|
|
doc-['] |
Often we have to deal with several data structures (@emph{objects}), |
doc-' |
that have to be treated similarly in some respects, but differently in |
|
others. Graphical objects are the textbook example: circles, triangles, |
|
dinosaurs, icons, and others, and we may want to add more during program |
|
development. We want to apply some operations to any graphical object, |
|
e.g., @code{draw} for displaying it on the screen. However, @code{draw} |
|
has to do something different for every kind of object. |
|
@comment TODO add some other operations eg perimeter, area |
|
@comment and tie in to concrete examples later.. |
|
|
For literals, you use @code{'} in interpreted code and @code{[']} in |
We could implement @code{draw} as a big @code{CASE} |
compiled code. Gforth's @code{'} and @code{[']} behave somewhat unusually |
control structure that executes the appropriate code depending on the |
by complaining about compile-only words. To get an execution token for a |
kind of object to be drawn. This would not be very elegant, and, |
compiling word @var{X}, use @code{COMP' @var{X} drop} or @code{[COMP'] |
moreover, we would have to change @code{draw} every time we add |
@var{X} drop}. |
a new kind of graphical object (say, a spaceship). |
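
To illustrate the use of execution tokens with @code{'} in interpreted
code and @code{[']} in compiled code, here is a small sketch; the words
@code{square} and @code{apply-square} are hypothetical names used only
for this example:

@example
: square ( n -- n*n ) dup * ;

7 ' square execute .        \ interpreted code: prints 49

: apply-square ( n -- n*n ) ['] square execute ;
3 apply-square .            \ compiled code: prints 9
@end example
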
|
|
@cindex compilation token |
What we would rather do is: When defining spaceships, we would tell |
The compilation semantics are represented by a @dfn{compilation token} |
the system: ``Here's how you @code{draw} a spaceship; you figure |
consisting of two cells: @var{w xt}. The top cell @var{xt} is an |
out the rest''. |
execution token. The compilation semantics represented by the |
|
compilation token can be performed with @code{execute}, which consumes |
|
the whole compilation token, with an additional stack effect determined |
|
by the represented compilation semantics. |
|
|
|
doc-[comp'] |
This is the problem that all systems solve that (rightfully) call |
doc-comp' |
themselves object-oriented; the object-oriented packages presented here |
|
solve this problem (and not much else). |
|
@comment TODO ?list properties of oo systems.. oo vs o-based? |
|
|
You can compile the compilation semantics with @code{postpone,}. I.e., |
@node Object-Oriented Terminology, Objects, Why object-oriented programming?, Object-oriented Forth |
@code{COMP' @var{word} POSTPONE,} is equivalent to @code{POSTPONE |
@subsubsection Object-Oriented Terminology |
@var{word}}. |
@cindex object-oriented terminology |
|
@cindex terminology for object-oriented programming |
|
|
doc-postpone, |
This section is mainly for reference, so you don't have to understand |
|
all of it right away. The terminology is mainly Smalltalk-inspired. In |
|
short: |
|
|
At present, the @var{w} part of a compilation token is an execution |
@table @emph |
token, and the @var{xt} part represents either @code{execute} or |
@cindex class |
@code{compile,}. However, don't rely on that knowledge, unless necessary; |
@item class |
we may introduce unusual compilation tokens in the future (e.g., |
a data structure definition with some extras. |
compilation tokens representing the compilation semantics of literals). |
|
|
|
@cindex name token |
@cindex object |
@cindex name field address |
@item object |
@cindex NFA |
an instance of the data structure described by the class definition. |
Named words are also represented by the @dfn{name token}. The abstract |
|
data type @emph{name token} is implemented as NFA (name field address). |
|
|
|
doc-find-name |
@cindex instance variables |
doc-name>int |
@item instance variables |
doc-name?int |
fields of the data structure. |
doc-name>comp |
|
doc-name>string |
|
|
|
@node Word Lists, Environmental Queries, Tokens for Words, Words |
@cindex selector |
@section Word Lists |
@cindex method selector |
@cindex word lists |
@cindex virtual function |
@cindex name dictionary |
@item selector |
|
(or @emph{method selector}) a word (e.g., |
|
@code{draw}) that performs an operation on a variety of data |
|
structures (classes). A selector describes @emph{what} operation to |
|
perform. In C++ terminology: a (pure) virtual function. |
|
|
@cindex wid |
@cindex method |
All definitions other than those created by @code{:noname} have an entry |
@item method |
in the name dictionary. The name dictionary is fragmented into a number |
the concrete definition that performs the operation |
of parts, called @var{word lists}. A word list is identified by a |
described by the selector for a specific class. A method specifies |
cell-sized word list identifier (@var{wid}) in much the same way as a |
@emph{how} the operation is performed for a specific class. |
file is identified by a file handle. The numerical value of the wid has |
|
no (portable) meaning, and might change from session to session. |
|
|
|
@cindex compilation word list |
@cindex selector invocation |
At any one time, a single word list is defined as the word list to which |
@cindex message send |
all new definitions will be added -- this is called the @var{compilation |
@cindex invoking a selector |
word list}. When Gforth is started, the compilation word list is the |
@item selector invocation |
word list called @code{FORTH-WORDLIST}. |
a call of a selector. One argument of the call (the TOS (top-of-stack)) |
|
is used for determining which method is used. In Smalltalk terminology: |
|
a message (consisting of the selector and the other arguments) is sent |
|
to the object. |
|
|
@cindex search order stack |
@cindex receiving object |
Forth maintains a stack of word lists, representing the @var{search |
@item receiving object |
order}. When the name dictionary is searched (for example, when |
the object used for determining the method executed by a selector |
attempting to find a word's execution token during compilation), only |
invocation. In the @file{objects.fs} model, it is the object that is on |
those word lists that are currently in the search order are |
the TOS when the selector is invoked. (@emph{Receiving} comes from |
searched. The most recently-defined word in the word list at the top of |
the Smalltalk @emph{message} terminology.) |
the word list stack is searched first, and the search proceeds until |
|
either the word is located or the oldest definition in the word list at |
|
the bottom of the stack is reached. Definitions of the word may exist in |
|
more than one word list; the search order determines which version will |
|
be found. |
|
|
|
The ANS Forth Standard "Search order" word set is intended to provide a |
@cindex child class |
set of low-level tools that allow various different schemes to be |
@cindex parent class |
implemented. Gforth provides @code{vocabulary}, a traditional Forth |
@cindex inheritance |
word. @file{compat/vocabulary.fs} provides an implementation in ANS |
@item child class |
Standard Forth. |
a class that has (@emph{inherits}) all properties (instance variables, |
|
selectors, methods) from a @emph{parent class}. In Smalltalk |
|
terminology: The subclass inherits from the superclass. In C++ |
|
terminology: The derived class inherits from the base class. |
|
|
TODO: locals section refers to here, saying that every word list (aka |
@end table |
vocabulary) has its own methods for searching etc. Need to document that. |
|
|
|
doc-forth-wordlist |
@c If you wonder about the message sending terminology, it comes from |
doc-definitions |
@c a time when each object had its own task and objects communicated via |
doc-get-current |
@c message passing; eventually the Smalltalk developers realized that |
doc-set-current |
@c they can do most things through simple (indirect) calls. They kept the |
|
@c terminology. |
|
|
@comment TODO when a defn (like set-order) is instanced twice, the second instance gets documented. |
|
@comment In general that might be fine, but in this example (search.fs) the second instance is an |
|
@comment alias, so it would not naturally have documentation |
|
|
|
doc-get-order |
@node Objects, OOF, Object-Oriented Terminology, Object-oriented Forth |
doc-set-order |
@subsection The @file{objects.fs} model |
doc-wordlist |
@cindex objects |
doc-also |
@cindex object-oriented programming |
doc-forth |
|
doc-only |
|
doc-order |
|
doc-previous |
|
|
|
doc-find |
@cindex @file{objects.fs} |
doc-search-wordlist |
@cindex @file{oof.fs} |
|
|
doc-words |
This section describes the @file{objects.fs} package. This material has also been published as @cite{Yet Another Forth Objects Package} by Anton Ertl in Forth Dimensions 19(2), pages 37--43 (@url{http://www.complang.tuwien.ac.at/forth/objects/objects.html}). |
doc-vlist |
@c McKewan's and Zsoter's packages |
|
|
|
This section assumes that you have read @ref{Structures}. |
|
|
|
The techniques on which this model is based have been used to implement |
|
the parser generator, Gray, and have also been used in Gforth for |
|
implementing the various flavours of word lists (hashed or not, |
|
case-sensitive or not, special-purpose word lists for locals etc.). |
|
|
doc-mappedwordlist |
|
doc-root |
|
doc-vocabulary |
|
doc-seal |
|
doc-vocs |
|
doc-current |
|
doc-context |
|
|
|
@menu |
@menu |
* Why use word lists?:: |
* Properties of the Objects model:: |
* Word list examples:: |
* Basic Objects Usage:: |
|
* The Objects base class:: |
|
* Creating objects:: |
|
* Object-Oriented Programming Style:: |
|
* Class Binding:: |
|
* Method conveniences:: |
|
* Classes and Scoping:: |
|
* Object Interfaces:: |
|
* Objects Implementation:: |
|
* Objects Glossary:: |
@end menu |
@end menu |
|
|
@node Why use word lists?, Word list examples, Word Lists, Word Lists |
Marcel Hendrix provided helpful comments on this section. Andras Zsoter |
@subsection Why use word lists? |
and Bernd Paysan helped me with the related works section. |
@cindex word lists - why use them? |
|
|
|
There are several reasons for using multiple word lists: |
@node Properties of the Objects model, Basic Objects Usage, Objects, Objects |
|
@subsubsection Properties of the @file{objects.fs} model |
|
@cindex @file{objects.fs} properties |
|
|
@itemize @bullet |
@itemize @bullet |
@item |
@item |
To improve compilation speed by reducing the number of name dictionary |
It is straightforward to pass objects on the stack. Passing |
entries that must be searched. This is achieved by creating a new |
selectors on the stack is a little less convenient, but possible. |
word list that contains all of the definitions that are used in the |
|
definition of a Forth system but which would not usually be used by |
|
programs running on that system. That word list would be on the search |
|
list when the Forth system was compiled but would be removed from the |
|
search list for normal operation. This can be a useful technique for |
|
low-performance systems (for example, 8-bit processors in embedded |
|
systems) but is unlikely to be necessary in high-performance desktop |
|
systems. |
|
@item |
@item |
To prevent a set of words from being used outside the context in which |
Objects are just data structures in memory, and are referenced by their |
they are valid. Two classic examples of this are an integrated editor |
address. You can create words for objects with normal defining words |
(all of the edit commands are defined in a separate word list; the |
like @code{constant}. Likewise, there is no difference between instance |
search order is set to the editor word list when the editor is invoked; |
variables that contain objects and those that contain other data. |
the old search order is restored when the editor is terminated) and an |
|
integrated assembler (the op-codes for the machine are defined in a |
|
separate word list which is used when a @code{CODE} word is defined). |
|
@item |
@item |
To prevent a name-space clash between multiple definitions with the same |
Late binding is efficient and easy to use. |
name. For example, when building a cross-compiler you might have a word |
|
@code{IF} that generates conditional code for your target system. By |
|
placing this definition in a different word list you can control whether |
|
the host system's @code{IF} or the target system's @code{IF} get used in |
|
any particular context by controlling the order of the word lists on the |
|
search order stack. |
|
@end itemize |
|
|
|
@node Word list examples, ,Why use word lists?, Word Lists |
@item |
@subsection Word list examples |
It avoids parsing, and thus avoids problems with state-smartness |
@cindex word lists - examples |
and reduced extensibility; for convenience there are a few parsing |
|
words, but they have non-parsing counterparts. There are also a few |
|
defining words that parse. This is hard to avoid, because all standard |
|
defining words parse (except @code{:noname}); however, such |
|
words are not as bad as many other parsing words, because they are not |
|
state-smart. |
|
|
Here is an example of creating and using a new wordlist using ANS |
@item |
Standard words: |
It does not try to incorporate everything. It does a few things and does |
|
them well (IMO). In particular, this model was not designed to support |
|
information hiding (although it has features that may help); you can use |
|
a separate package for achieving this. |
|
|
@example |
@item |
wordlist constant my-new-words-wordlist |
It is layered; you don't have to learn and use all features to use this |
: my-new-words get-order nip my-new-words-wordlist swap set-order ; |
model. Only a few features are necessary (@pxref{Basic Objects Usage}, |
|
@pxref{The Objects base class}, @pxref{Creating objects}.), the others |
|
are optional and independent of each other. |
|
|
\ add it to the search order |
@item |
also my-new-words |
An implementation in ANS Forth is available. |
|
|
\ alternatively, add it to the search order and make it |
@end itemize |
\ the compilation word list |
|
also my-new-words definitions |
|
\ type "order" to see the problem |
|
@end example |
|
|
|
The problem with this example is that @code{order} has no way to |
|
associate the name @code{my-new-words} with the wid of the word list (in |
|
Gforth, @code{order} and @code{vocs} will display @code{???} for a wid |
|
that has no associated name). There is no Standard way of associating a |
|
name with a wid. |
|
|
|
In Gforth, this example can be re-coded using @code{vocabulary}, which |
@node Basic Objects Usage, The Objects base class, Properties of the Objects model, Objects |
associates a name with a wid: |
@subsubsection Basic @file{objects.fs} Usage |
|
@cindex basic objects usage |
|
@cindex objects, basic usage |
|
|
|
You can define a class for graphical objects like this: |
|
|
|
@cindex @code{class} usage |
|
@cindex @code{end-class} usage |
|
@cindex @code{selector} usage |
@example |
@example |
vocabulary my-new-words |
object class \ "object" is the parent class |
|
selector draw ( x y graphical -- ) |
|
end-class graphical |
|
@end example |
|
|
\ add it to the search order |
This code defines a class @code{graphical} with an |
my-new-words |
operation @code{draw}. We can perform the operation |
|
@code{draw} on any @code{graphical} object, e.g.: |
|
|
\ alternatively, add it to the search order and make it |
@example |
\ the compilation word list |
100 100 t-rex draw |
my-new-words definitions |
|
\ type "order" to see that the problem is solved |
|
@end example |
@end example |
|
|
|
@noindent |
|
where @code{t-rex} is a word (say, a constant) that produces a |
|
graphical object. |
|
|
@node Environmental Queries, Files, Word Lists, Words |
@comment nac TODO add a 2nd operation eg perimeter.. and use for |
@section Environmental Queries |
@comment a concrete example |
@cindex environmental queries |
|
@comment TODO more index entries |
|
|
|
The ANS Standard introduced the idea of "environmental queries" as a way |
@cindex abstract class |
for a program running on a system to determine certain characteristics of the system. |
How do we create a graphical object? With the present definitions, |
The Standard specifies a number of strings that might be recognised by a system. |
we cannot create a useful graphical object. The class |
|
@code{graphical} describes graphical objects in general, but not |
|
any concrete graphical object type (C++ users would call it an |
|
@emph{abstract class}); e.g., there is no method for the selector |
|
@code{draw} in the class @code{graphical}. |
|
|
The Standard requires that the name space used for environmental queries |
For concrete graphical objects, we define child classes of the |
be distinct from the name space used for definitions. |
class @code{graphical}, e.g.: |
|
|
Typically, environmental queries are supported by creating a set of |
@cindex @code{overrides} usage |
definitions in a word set that is @var{only} used during environmental |
@cindex @code{field} usage in class definition |
queries; that is what Gforth does. There is no Standard way of adding |
@example |
definitions to the set of recognised environmental queries, but any |
graphical class \ "graphical" is the parent class |
implementation that supports the loading of optional word sets must have |
cell% field circle-radius |
some mechanism for doing this (after loading the word set, the |
|
associated environmental query string must return @code{true}). In |
|
Gforth, the word set used to honour environmental queries can be |
|
manipulated just like any other word set. |
|
|
|
doc-environment? |
:noname ( x y circle -- ) |
doc-environment-wordlist |
circle-radius @@ draw-circle ; |
|
overrides draw |
|
|
doc-gforth |
:noname ( n-radius circle -- ) |
doc-os-class |
circle-radius ! ; |
|
overrides construct |
|
|
Note that, whilst the documentation for (e.g.) @code{gforth} shows it |
end-class circle |
returning two items on the stack, querying it using @code{environment?} |
@end example |
will return an additional item; the @code{true} flag that shows that the |
|
string was recognised. |
|
|
|
TODO Document the standard strings or note where they are documented herein |
Here we define a class @code{circle} as a child of @code{graphical}, |
|
with field @code{circle-radius} (which behaves just like a field |
|
(@pxref{Structures})); it defines (using @code{overrides}) new methods |
|
for the selectors @code{draw} and @code{construct} (@code{construct} is |
|
defined in @code{object}, the parent class of @code{graphical}). |
|
|
Here are some examples of using environmental queries: |
Now we can create a circle on the heap (i.e., |
|
@code{allocate}d memory) with: |
|
|
|
@cindex @code{heap-new} usage |
@example |
@example |
s" address-unit-bits" environment? 0= |
50 circle heap-new constant my-circle |
[IF] |
@end example |
cr .( environmental attribute address-unit-bits unknown... ) cr |
|
[THEN] |
|
|
|
s" block" environment? [IF] DROP include block.fs [THEN] |
|
|
|
s" gforth" environment? [IF] 2DROP include compat/vocabulary.fs [THEN] |
|
|
|
s" gforth" environment? [IF] .( Gforth version ) TYPE [ELSE] .( Not Gforth..) [THEN] |
@noindent |
|
@code{heap-new} invokes @code{construct}, thus |
|
initializing the field @code{circle-radius} with 50. We can draw |
|
this new circle at (100,100) with: |
|
|
|
@example |
|
100 100 my-circle draw |
@end example |
@end example |
|
|
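The fragments of the @code{graphical} and @code{circle} example can be
put together into one loadable sketch. It assumes that
@file{objects.fs} can be found on the Gforth search path, and it uses a
hypothetical stand-in for @code{draw-circle}, which just prints its
arguments instead of drawing anything.

@example
require objects.fs     \ assumption: objects.fs is on the search path

\ hypothetical stand-in for a real drawing primitive
: draw-circle ( x y n-radius -- )
    ." circle r=" .  ." y=" .  ." x=" .  cr ;

object class           \ "object" is the parent class
    selector draw ( x y graphical -- )
end-class graphical

graphical class        \ "graphical" is the parent class
    cell% field circle-radius

    :noname ( x y circle -- )
        circle-radius @@ draw-circle ;
    overrides draw

    :noname ( n-radius circle -- )
        circle-radius ! ;
    overrides construct
end-class circle

50 circle heap-new constant my-circle
100 100 my-circle draw \ prints the radius and the position
@end example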
|
@cindex selector invocation, restrictions |
|
@cindex class definition, restrictions |
|
Note: You can only invoke a selector if the object on the TOS |
|
(the receiving object) belongs to the class where the selector was |
|
defined or one of its descendents; e.g., you can invoke |
|
@code{draw} only for objects belonging to @code{graphical} |
|
or its descendents (e.g., @code{circle}). Immediately before |
|
@code{end-class}, the search order has to be the same as |
|
immediately after @code{class}. |
|
|
Here is an example of adding a definition to the environment word list: |
@node The Objects base class, Creating objects, Basic Objects Usage, Objects |
|
@subsubsection The @file{objects.fs} base class |
|
@cindex @code{object} class |
|
|
@example |
When you define a class, you have to specify a parent class. So how do |
get-current environment-wordlist set-current |
you start defining classes? There is one class available from the start: |
true constant block |
@code{object}. It is the ancestor of all classes and so is the |
true constant block-ext |
only class that has no parent. It has two selectors: @code{construct} |
set-current |
and @code{print}. |
@end example |
|
|
|
You can see what definitions are in the environment word list like this: |
@node Creating objects, Object-Oriented Programming Style, The Objects base class, Objects |
|
@subsubsection Creating objects |
|
@cindex creating objects |
|
@cindex object creation |
|
@cindex object allocation options |
|
|
@example |
@cindex @code{heap-new} discussion |
get-order 1+ environment-wordlist swap set-order words previous |
@cindex @code{dict-new} discussion |
@end example |
@cindex @code{construct} discussion |
|
You can create and initialize an object of a class on the heap with |
|
@code{heap-new} ( ... class -- object ) and in the dictionary |
|
(allocation with @code{allot}) with @code{dict-new} ( |
|
... class -- object ). Both words invoke @code{construct}, which |
|
consumes the stack items indicated by "..." above. |
|
|
|
@cindex @code{init-object} discussion |
|
@cindex @code{class-inst-size} discussion |
|
If you want to allocate memory for an object yourself, you can get its |
|
alignment and size with @code{class-inst-size 2@@} ( class -- |
|
align size ). Once you have memory for an object, you can initialize |
|
it with @code{init-object} ( ... class object -- ); |
|
@code{construct} does only a part of the necessary work. |
|
|
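The following sketch contrasts @code{dict-new} with allocating the
memory yourself. It assumes that @file{objects.fs} can be found on the
search path; the class @code{counter} and the word @code{heap-counter}
are invented for the example, and @code{heap-counter} only imitates
what @code{heap-new} already does for you.

@example
require objects.fs     \ assumption: objects.fs is on the search path

object class
    cell% field counter-value
    :noname ( n counter -- ) counter-value ! ;
    overrides construct
end-class counter      \ hypothetical example class

\ allocation in the dictionary
5 counter dict-new constant dict-counter

\ manual allocation: get the size, allocate, then initialize
: heap-counter ( n -- object )
    counter class-inst-size 2@@ nip  ( n size )
    allocate throw                   ( n addr )
    dup >r counter swap init-object  ( )
    r> ;

7 heap-counter constant my-counter

dict-counter counter-value @@ .   \ prints 5
my-counter counter-value @@ .     \ prints 7
@end example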
|
@node Object-Oriented Programming Style, Class Binding, Creating objects, Objects |
|
@subsubsection Object-Oriented Programming Style |
|
@cindex object-oriented programming style |
|
|
@node Files, Including Files, Environmental Queries, Words |
This section is not exhaustive. |
@section Files |
|
|
|
This chapter describes how to operate on files from Forth. |
@cindex stack effects of selectors |
|
@cindex selectors and stack effects |
|
In general, it is a good idea to ensure that all methods for the |
|
same selector have the same stack effect: when you invoke a selector, |
|
you often have no idea which method will be invoked, so, unless all |
|
methods have the same stack effect, you will not know the stack effect |
|
of the selector invocation. |
|
|
Files are opened/created by name and type. The following types are |
One exception to this rule is methods for the selector |
recognised: |
@code{construct}. We know which method is invoked, because we |
|
specify the class to be constructed at the same place. Actually, I |
|
defined @code{construct} as a selector only to give the users a |
|
convenient way to specify initialization. The way it is used, a |
|
mechanism different from selector invocation would be more natural |
|
(but probably would take more code and more space to explain). |
|
|
doc-r/o |
@node Class Binding, Method conveniences, Object-Oriented Programming Style, Objects |
doc-r/w |
@subsubsection Class Binding |
doc-w/o |
@cindex class binding |
doc-bin |
@cindex early binding |
|
|
When a file is opened/created, it returns a file identifier, |
@cindex late binding |
@var{wfileid} that is used for all other file commands. All file |
Normal selector invocations determine the method at run-time depending |
commands also return a status value, @var{wior}, that is 0 for a |
on the class of the receiving object. This run-time selection is called |
successful operation and an implementation-defined non-zero value in the |
@var{late binding}. |
case of an error. |
|
|
|
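A common way to deal with the @var{wior} is to pass it to
@code{throw}, which does nothing for 0 and raises an exception
otherwise. The following sketch uses only standard file words; the word
@code{write-greeting} and the file name @file{greeting.tmp} are
invented for the example.

@example
: write-greeting ( c-addr u -- )   \ c-addr u is the file name
    w/o create-file throw >r       \ keep the file id on the return stack
    s" hello" r@@ write-line throw
    r> close-file throw ;

s" greeting.tmp" write-greeting    \ hypothetical file name
@end example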
doc-open-file |
Sometimes it's preferable to invoke a different method. For example, |
doc-create-file |
you might want to use the simple method for @code{print}ing |
|
@code{object}s instead of the possibly long-winded @code{print} method |
|
of the receiver class. You can achieve this by replacing the invocation |
|
of @code{print} with: |
|
|
doc-close-file |
@cindex @code{[bind]} usage |
doc-delete-file |
@example |
doc-rename-file |
[bind] object print |
doc-read-file |
@end example |
doc-read-line |
|
doc-write-file |
|
doc-write-line |
|
doc-emit-file |
|
doc-flush-file |
|
|
|
doc-file-status |
@noindent |
doc-file-position |
in compiled code or: |
doc-reposition-file |
|
doc-file-size |
|
doc-resize-file |
|
|
|
@node Including Files, Blocks, Files, Words |
@cindex @code{bind} usage |
@section Including Files |
@example |
@cindex including files |
bind object print |
|
@end example |
|
|
@menu |
@cindex class binding, alternative to |
* Words for Including:: |
@noindent |
* Search Path:: |
in interpreted code. Alternatively, you can define the method with a |
* Forth Search Paths:: |
name (e.g., @code{print-object}), and then invoke it through the |
* General Search Paths:: |
name. Class binding is just an (often more convenient) way to achieve |
@end menu |
the same effect; it avoids name clutter and allows you to invoke |
|
methods directly without naming them first. |
|
|
@node Words for Including, Search Path, Including Files, Including Files |
@cindex superclass binding |
@subsection Words for Including |
@cindex parent class binding |
|
A frequent use of class binding is this: When we define a method |
|
for a selector, we often want the method to do what the selector does |
|
in the parent class, and a little more. There is a special word for |
|
this purpose: @code{[parent]}; @code{[parent] |
|
@emph{selector}} is equivalent to @code{[bind] @emph{parent |
|
selector}}, where @code{@emph{parent}} is the parent |
|
class of the current class. E.g., a method definition might look like: |
|
|
doc-include-file |
@cindex @code{[parent]} usage |
doc-included |
@example |
doc-include |
:noname |
|
dup [parent] foo \ do parent's foo on the receiving object |
|
... \ do some more |
|
; overrides foo |
|
@end example |
|
|
Usually you want to include a file only if it is not included already |
@cindex class binding as optimization |
(by, say, another source file): |
In @cite{Object-oriented programming in ANS Forth} (Forth Dimensions, |
@comment TODO describe what happens on error. Describes how the require |
March 1997), Andrew McKewan presents class binding as an optimization |
@comment stuff works and describe how to clear/reset the history (eg |
technique. I recommend not using it for this purpose unless you are in |
@comment for debug). Might want to include that in the MARKER example. |
an emergency. Late binding is pretty fast with this model anyway, so the |
|
benefit of using class binding is small; the cost of using class binding |
|
where it is not appropriate is reduced maintainability. |
|
|
doc-required |
While we are at programming style questions: You should bind |
doc-require |
selectors only to ancestor classes of the receiving object. E.g., say, |
doc-needs |
you know that the receiving object is of class @code{foo} or its |
|
descendents; then you should bind only to @code{foo} and its |
|
ancestors. |
|
|
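As a sketch of binding to the parent class, the following child class
extends @code{construct}: it lets the parent's method store its field
and then stores one more. It assumes that @file{objects.fs} can be
found on the search path; the classes @code{shape} and
@code{colored-shape} and their fields are invented for the example.

@example
require objects.fs     \ assumption: objects.fs is on the search path

object class
    cell% field shape-size
    :noname ( n-size shape -- ) shape-size ! ;
    overrides construct
end-class shape        \ hypothetical parent class

shape class
    cell% field shape-color
    :noname ( n-color n-size shape -- )
        dup >r [parent] construct  \ the parent's method stores the size
        r> shape-color ! ;         \ and we store the color
    overrides construct
end-class colored-shape

3 50 colored-shape heap-new constant my-shape
my-shape shape-size @@ .    \ prints 50
my-shape shape-color @@ .   \ prints 3
@end example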
A definition in ANS Standard Forth for @code{required} is provided in |
@node Method conveniences, Classes and Scoping, Class Binding, Objects |
@file{compat/required.fs}. |
@subsubsection Method conveniences |
|
@cindex method conveniences |
|
|
@cindex stack effect of included files |
In a method you usually access the receiving object pretty often. If |
@cindex including files, stack effect |
you define the method as a plain colon definition (e.g., with |
I recommend that you write your source files such that interpreting them |
@code{:noname}), you may have to do a lot of stack |
does not change the stack. This allows using these files with |
gymnastics. To avoid this, you can define the method with @code{m: |
@code{required} and friends without complications. E.g., |
... ;m}. E.g., you could define the method for |
|
@code{draw}ing a @code{circle} with |
|
|
|
@cindex @code{this} usage |
|
@cindex @code{m:} usage |
|
@cindex @code{;m} usage |
@example |
@example |
1 require foo.fs drop |
m: ( x y circle -- ) |
|
( x y ) this circle-radius @@ draw-circle ;m |
@end example |
@end example |
|
|
@node Search Path, Forth Search Paths, Words for Including, Including Files |
@cindex @code{exit} in @code{m: ... ;m} |
@subsection Search Path |
@cindex @code{exitm} discussion |
@cindex path for @code{included} |
@cindex @code{catch} in @code{m: ... ;m} |
@cindex file search path |
When this method is executed, the receiver object is removed from the |
@cindex include search path |
stack; you can access it with @code{this} (admittedly, in this |
@cindex search path for files |
example the use of @code{m: ... ;m} offers no advantage). Note |
|
that I specify the stack effect for the whole method (i.e. including |
@comment what uses these search paths.. just include and friends? |
the receiver object), not just for the code between @code{m:} |
If you specify an absolute filename (i.e., a filename starting with |
and @code{;m}. You cannot use @code{exit} in |
@file{/} or @file{~}, or with @file{:} in the second position (as in |
@code{m:...;m}; instead, use |
@samp{C:...})) for @code{included} and friends, that file is included |
@code{exitm}.@footnote{Moreover, for any word that calls |
just as you would expect. |
@code{catch} and was defined before loading |
|
@code{objects.fs}, you have to redefine it like I redefined |
|
@code{catch}: @code{: catch this >r catch r> to-this ;}} |
|
|
For relative filenames, Gforth uses a search path similar to Forth's |
@cindex @code{inst-var} usage |
search order (@pxref{Word Lists}). It tries to find the given filename in |
You will frequently use sequences of the form @code{this |
the directories present in the path, and includes the first one it |
@emph{field}} (in the example above: @code{this |
finds. |
circle-radius}). If you use the field only in this way, you can |
|
define it with @code{inst-var} and eliminate the |
|
@code{this} before the field name. E.g., the @code{circle} |
|
class above could also be defined with: |
|
|
If the search path contains the directory @file{.} (as it should), this |
@example |
refers to the directory that the present file was @code{included} |
graphical class |
from. This allows files to include other files relative to their own |
cell% inst-var radius |
position (irrespective of the current working directory or the absolute |
|
position). This feature is essential for libraries consisting of |
|
several files, where a file may include other files from the library. |
|
It corresponds to @code{#include "..."} in C. If the current input |
|
source is not a file, @file{.} refers to the directory of the innermost |
|
file being included, or, if there is no file being included, to the |
|
current working directory. |
|
|
|
Use @file{~+} to refer to the current working directory (as in |
m: ( x y circle -- ) |
@code{bash}). |
radius @@ draw-circle ;m |
|
overrides draw |
|
|
If the filename starts with @file{./}, the search path is not searched |
m: ( n-radius circle -- ) |
(just as with absolute filenames), and the @file{.} has the same meaning |
radius ! ;m |
as described above. |
overrides construct |
|
|
@node Forth Search Paths, General Search Paths, Search Path, Including Files |
end-class circle |
@subsection Forth Search Paths |
@end example |
@cindex search path control - forth |
|
|
|
The search path is initialized when you start Gforth (@pxref{Invoking |
@code{radius} can only be used in @code{circle} and its |
Gforth}). You can display it with |
descendent classes and inside @code{m:...;m}. |
|
|
doc-.fpath |
@cindex @code{inst-value} usage |
|
You can also define fields with @code{inst-value}, which is |
|
to @code{inst-var} what @code{value} is to |
|
@code{variable}. You can change the value of such a field with |
|
@code{[to-inst]}. E.g., we could also define the class |
|
@code{circle} like this: |
|
|
You can change it later with the following words: |
@example |
|
graphical class |
|
inst-value radius |
|
|
doc-fpath+ |
m: ( x y circle -- ) |
doc-fpath= |
radius draw-circle ;m |
|
overrides draw |
Using fpath and require would look like: |
|
|
|
@example |
m: ( n-radius circle -- ) |
fpath= /usr/lib/forth/|./ |
[to-inst] radius ;m |
|
overrides construct |
|
|
require timer.fs |
end-class circle |
@end example |
@end example |
|
|
If you have the need to look for a file in the Forth search path, you could |
|
use this Gforth feature in your application: |
|
|
|
doc-open-fpath-file |
@node Classes and Scoping, Object Interfaces, Method conveniences, Objects |
|
@subsubsection Classes and Scoping |
|
@cindex classes and scoping |
|
@cindex scoping and classes |
|
|
@node General Search Paths, , Forth Search Paths, Including Files |
Inheritance is frequent, unlike structure extension. This exacerbates |
@subsection General Search Paths |
the problem with the field name convention (@pxref{Structure Naming |
@cindex search path control - for user applications |
Convention}): One always has to remember in which class the field was |
|
originally defined; changing a part of the class structure would require |
|
changes for renaming in otherwise unaffected code. |
|
|
Your application may need to search files in several directories, like |
@cindex @code{inst-var} visibility |
@code{included} does. For this purpose you can define and use your own |
@cindex @code{inst-value} visibility |
search paths. Create a search path like this: |
To solve this problem, I added a scoping mechanism (which was not in my |
|
original charter): A field defined with @code{inst-var} (or |
|
@code{inst-value}) is visible only in the class where it is defined and in |
|
the descendent classes of this class. Using such fields only makes |
|
sense in @code{m:}-defined methods in these classes anyway. |
|
|
@example |
This scoping mechanism allows us to use the unadorned field name, |
\ Make a buffer for the path: |
because name clashes with unrelated words become much less likely. |
create mypath 100 chars , \ maximum length (is checked) |
|
0 , \ real len |
|
100 chars allot \ space for path |
|
@end example |
|
|
|
You have the same functions for the forth search path in a generic version |
@cindex @code{protected} discussion |
for different paths. |
@cindex @code{private} discussion |
|
Once we have this mechanism, we can also use it for controlling the |
|
visibility of other words: All words defined after |
|
@code{protected} are visible only in the current class and its |
|
descendents. @code{public} restores the compilation |
|
(i.e. @code{current}) word list that was in effect before. If you |
|
have several @code{protected}s without an intervening |
|
@code{public} or @code{set-current}, @code{public} |
|
will restore the compilation word list in effect before the first of |
|
these @code{protected}s. |
|
|
Gforth also provides generic equivalents of the Forth search path words: |
@node Object Interfaces, Objects Implementation, Classes and Scoping, Objects |
|
@subsubsection Object Interfaces |
|
@cindex object interfaces |
|
@cindex interfaces for objects |
|
|
doc-.path |
In this model you can only call selectors defined in the class of the |
doc-path+ |
receiving objects or in one of its ancestors. If you call a selector |
doc-path= |
with a receiving object that is not in one of these classes, the |
doc-open-path-file |
result is undefined; if you are lucky, the program crashes |
|
immediately. |
|
|
|
@cindex selectors common to hardly-related classes |
|
Now consider the case when you want to have a selector (or several) |
|
available in two classes: You would have to add the selector to a |
|
common ancestor class, in the worst case to @code{object}. You |
|
may not want to do this, e.g., because someone else is responsible for |
|
this ancestor class. |
|
|
@node Blocks, Other I/O, Including Files, Words |
The solution for this problem is interfaces. An interface is a |
@section Blocks |
collection of selectors. If a class implements an interface, the |
|
selectors become available to the class and its descendents. A class |
|
can implement an unlimited number of interfaces. For the problem |
|
discussed above, we would define an interface for the selector(s), and |
|
both classes would implement the interface. |
|
|
This chapter describes how to use block files within Gforth. |
As an example, consider an interface @code{storage} for |
|
writing objects to disk and getting them back, and a class |
|
@code{foo} that implements it. The code would look like this: |
|
|
Block files are the traditional means of data and source storage in |
@cindex @code{interface} usage |
Forth. They have been very important in resource-starved computers |
@cindex @code{end-interface} usage |
without an OS in the past. Gforth does not encourage the use of blocks as |
@cindex @code{implementation} usage |
source, and provides blocks only for backward compatibility. The ANS |
@example |
standard requires blocks to be available when files are. |
interface |
|
selector write ( file object -- ) |
|
selector read1 ( file object -- ) |
|
end-interface storage |
|
|
@comment TODO what about errors on open-blocks? |
bar class |
doc-open-blocks |
storage implementation |
doc-use |
|
doc-scr |
|
doc-blk |
|
doc-get-block-fid |
|
doc-block-position |
|
doc-update |
|
doc-save-buffers |
|
doc-save-buffer |
|
doc-empty-buffers |
|
doc-empty-buffer |
|
doc-flush |
|
doc-get-buffer |
|
doc---block-block |
|
doc-buffer |
|
doc-updated? |
|
doc-list |
|
doc-load |
|
doc-thru |
|
doc-+load |
|
doc-+thru |
|
doc---block---> |
|
doc-block-included |
|
|
|
@node Other I/O, Programming Tools, Blocks, Words |
... overrides write |
@section Other I/O |
... overrides read1 |
@comment TODO more index entries |
... |
|
end-class foo |
|
@end example |
|
|
@menu |
@noindent |
* Simple numeric output:: Predefined formats |
(I would add a word @code{read} @var{( file -- object )} that uses |
* Formatted numeric output:: Formatted (pictured) output |
@code{read1} internally, but that's beyond the point illustrated |
* String Formats:: How Forth stores strings in memory |
here.) |
* Displaying characters and strings:: Other stuff |
|
* Input:: Input |
|
@end menu |
|
|
|
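A smaller, complete interface can be sketched as follows. It assumes
that @file{objects.fs} can be found on the search path; the interface
@code{describable}, its selector @code{describe} and the class
@code{thing} are invented for the example.

@example
require objects.fs     \ assumption: objects.fs is on the search path

interface
    selector describe ( object -- )
end-interface describable      \ hypothetical interface

object class
    describable implementation
    :noname ( object -- ) drop ." a thing" cr ;
    overrides describe
end-class thing                \ hypothetical class

thing heap-new describe        \ prints "a thing"
@end example

Any other class can implement @code{describable} in the same way,
without the two classes needing a common ancestor that defines
@code{describe}.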
@node Simple numeric output, Formatted numeric output, Other I/O, Other I/O |
Note that you cannot use @code{protected} in an interface; and |
@subsection Simple numeric output |
of course you cannot define fields. |
@cindex Simple numeric output |
|
@comment TODO more index entries |
|
|
|
The simplest output functions are those that display numbers from the |
In the Neon model, all selectors are available for all classes; |
data or floating-point stacks. Floating-point output is always displayed |
therefore it does not need interfaces. The price you pay in this model |
using base 10. Numbers displayed from the data stack use the value stored |
is slower late binding, and therefore, added complexity to avoid late |
in @code{base}. |
binding. |
|
|
doc-. |
@node Objects Implementation, Objects Glossary, Object Interfaces, Objects |
doc-dec. |
@subsubsection @file{objects.fs} Implementation |
doc-hex. |
@cindex @file{objects.fs} implementation |
doc-u. |
|
doc-.r |
|
doc-u.r |
|
doc-d. |
|
doc-ud. |
|
doc-d.r |
|
doc-ud.r |
|
doc-f. |
|
doc-fe. |
|
doc-fs. |
|
|
|
Examples of printing the number 1234.5678E23 in the different floating-point output |
@cindex @code{object-map} discussion |
formats are shown below: |
An object is a piece of memory, like one of the data structures |
|
described with @code{struct...end-struct}. It has a field |
|
@code{object-map} that points to the method map for the object's |
|
class. |
|
|
|
@cindex method map |
|
@cindex virtual function table |
|
The @emph{method map}@footnote{This is Self terminology; in C++ |
|
terminology: virtual function table.} is an array that contains the |
|
execution tokens (@var{xt}s) of the methods for the object's class. Each |
|
selector contains an offset into a method map. |
|
|
|
@cindex @code{selector} implementation, class |
|
@code{selector} is a defining word that uses |
|
@code{CREATE} and @code{DOES>}. The body of the |
|
selector contains the offset; the @code{does>} action for a |
|
class selector is, basically: |
|
|
@example |
@example |
f. 123456779999999000000000000. |
( object addr ) @@ over object-map @@ + @@ execute |
fe. 123.456779999999E24 |
|
fs. 1.23456779999999E26 |
|
@end example |
@end example |
|
|
|
Since @code{object-map} is the first field of the object, it |
|
does not generate any code. As you can see, calling a selector has a |
|
small, constant cost. |
|
|
@node Formatted numeric output, String Formats, Simple numeric output, Other I/O |
@cindex @code{current-interface} discussion |
@subsection Formatted numeric output |
@cindex class implementation and representation |
@cindex Formatted numeric output |
A class is basically a @code{struct} combined with a method |
@cindex pictured numeric output |
map. During the class definition the alignment and size of the class |
@comment TODO more index entries |
are passed on the stack, just as with @code{struct}s, so |
|
@code{field} can also be used for defining class |
|
fields. However, passing more items on the stack would be |
|
inconvenient, so @code{class} builds a data structure in memory, |
|
which is accessed through the variable |
|
@code{current-interface}. After its definition is complete, the |
|
class is represented on the stack by a pointer (e.g., as parameter for |
|
a child class definition). |
|
|
Forth traditionally uses a technique called @var{pictured numeric |
A new class starts off with the alignment and size of its parent, |
output} for formatted printing of integers. In this technique, |
and a copy of the parent's method map. Defining new fields extends the |
digits are extracted from the number (using the current output radix |
size and alignment; likewise, defining new selectors extends the |
defined by @code{base}), converted to ASCII codes and appended to a |
method map. @code{overrides} just stores a new @var{xt} in the method |
string that is built in a scratch-pad area of memory |
map at the offset given by the selector. |
(@pxref{core-idef,Implementation-defined options}). During the extraction |
|
sequence, other arbitrary characters can be appended to the string. The |
|
completed string is specified by an address and length and can |
|
be manipulated (@code{TYPE}ed, copied, modified) under program control. |
|
|
|
All of the words described in the previous section for simple numeric |
@cindex class binding, implementation |
output are implemented in Gforth using pictured numeric output. |
Class binding just gets the @var{xt} at the offset given by the selector |
|
from the class's method map and @code{compile,}s (in the case of |
|
@code{[bind]}) it. |
|
|
Three important things to remember about Pictured Numeric Output: |
@cindex @code{this} implementation |
|
@cindex @code{catch} and @code{this} |
|
@cindex @code{this} and @code{catch} |
|
I implemented @code{this} as a @code{value}. At the |
|
start of an @code{m:...;m} method the old @code{this} is |
|
stored to the return stack and restored at the end; and the object on |
|
the TOS is stored @code{TO this}. This technique has one |
|
disadvantage: If the user does not leave the method via |
|
@code{;m}, but via @code{throw} or @code{exit}, |
|
@code{this} is not restored (and @code{exit} may |
|
crash). To deal with the @code{throw} problem, I have redefined |
|
@code{catch} to save and restore @code{this}; the same |
|
should be done with any word that can catch an exception. As for |
|
@code{exit}, I simply forbid it (as a replacement, there is |
|
@code{exitm}). |
|
|
@itemize @bullet |
@cindex @code{inst-var} implementation |
@item |
@code{inst-var} is just the same as @code{field}, with |
It always operates on double-precision numbers; to display a single-precision number, |
a different @code{DOES>} action: |
convert it first (@pxref{Double precision} for ways of doing this). |
@example |
@item |
@@ this + |
It always treats the double-precision number as though it were unsigned. Refer to |
@end example |
the examples below for ways of printing signed numbers. |
Similar for @code{inst-value}. |
@item |
|
The string is built up from right to left; least significant digit first. |
|
@end itemize |
|
|
|
doc-<# |
@cindex class scoping implementation |
doc-# |
Each class also has a word list that contains the words defined with |
doc-#s |
@code{inst-var} and @code{inst-value}, and its protected |
doc-hold |
words. It also has a pointer to its parent. @code{class} pushes |
doc-sign |
the word lists of the class and all its ancestors onto the search order stack, |
doc-#> |
and @code{end-class} drops them. |
|
|
doc-represent |
@cindex interface implementation |
|
An interface is like a class without fields, parent and protected |
|
words; i.e., it just has a method map. If a class implements an |
|
interface, its method map contains a pointer to the method map of the |
|
interface. The positive offsets in the map are reserved for class |
|
methods, therefore interface map pointers have negative |
|
offsets. Interfaces have offsets that are unique throughout the |
|
system, unlike class selectors, whose offsets are only unique for the |
|
classes where the selector is available (invokable). |
|
|
Here are some examples of using pictured numeric output: |
This structure means that interface selectors have to perform one |
|
indirection more than class selectors to find their method. Their body |
|
contains the interface map pointer offset in the class method map, and |
|
the method offset in the interface method map. The |
|
@code{does>} action for an interface selector is, basically: |
|
|
@example |
@example |
: my-u. ( u -- ) |
( object selector-body ) |
\ Simplest use of pns.. behaves like Standard u. |
2dup selector-interface @@ ( object selector-body object interface-offset ) |
0 \ convert to unsigned double |
swap object-map @@ + @@ ( object selector-body map ) |
<# \ start conversion |
swap selector-offset @@ + @@ execute |
#s \ convert all digits |
@end example |
#> \ complete conversion |
|
TYPE SPACE ; \ display, with trailing space |
|
|
|
: cents-only ( u -- ) |
where @code{object-map} and @code{selector-offset} are |
0 \ convert to unsigned double |
first fields and generate no code. |
<# \ start conversion |
|
# # \ convert two least-significant digits |
|
#> \ complete conversion, discard other digits |
|
TYPE SPACE ; \ display, with trailing space |
|
|
|
: dollars-and-cents ( u -- ) |
As a concrete example, consider the following code: |
0 \ convert to unsigned double |
|
<# \ start conversion |
|
# # \ convert two least-significant digits |
|
[char] . hold \ insert decimal point |
|
#s \ convert remaining digits |
|
[char] $ hold \ append currency symbol |
|
#> \ complete conversion |
|
TYPE SPACE ; \ display, with trailing space |
|
|
|
: my-. ( n -- ) |
@example |
\ handling negatives.. behaves like Standard . |
interface |
s>d \ convert to signed double |
selector if1sel1 |
swap over dabs \ leave sign byte followed by unsigned double |
selector if1sel2 |
<# \ start conversion |
end-interface if1 |
#s \ convert all digits |
|
rot sign \ get at sign byte, append "-" if needed |
|
#> \ complete conversion |
|
TYPE SPACE ; \ display, with trailing space |
|
|
|
: account. ( n -- ) |
object class |
\ accountants don't like minus signs, they use braces |
if1 implementation |
\ for negative numbers |
selector cl1sel1 |
s>d \ convert to signed double |
cell% inst-var cl1iv1 |
swap over dabs \ leave sign byte followed by unsigned double |
|
<# \ start conversion |
|
2 pick \ get copy of sign byte |
|
0< IF [char] ) hold THEN \ right-most character of output |
|
#s \ convert all digits |
|
rot \ get at sign byte |
|
0< IF [char] ( hold THEN |
|
#> \ complete conversion |
|
TYPE SPACE ; \ display, with trailing space |
|
@end example |
|
|
|
Here are some examples of using these words: |
' m1 overrides construct |
|
' m2 overrides if1sel1 |
|
' m3 overrides if1sel2 |
|
' m4 overrides cl1sel1
|
end-class cl1 |
|
|
@example |
create obj1 object dict-new drop |
1 my-u. 1 |
create obj2 cl1 dict-new drop |
hex -1 my-u. decimal FFFFFFFF |
|
1 cents-only 01 |
|
1234 cents-only 34 |
|
2 dollars-and-cents $0.02 |
|
1234 dollars-and-cents $12.34 |
|
123 my-. 123 |
|
-123 my-. -123
|
123 account. 123 |
|
-456 account. (456) |
|
@end example |
@end example |
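
As a further sketch of pictured numeric output (the words @code{sextal},
@code{:00} and @code{.time} are hypothetical names, not part of Gforth),
the following formats a number of seconds as hours, minutes and seconds.
It assumes that @code{base} is decimal on entry:

@example
: sextal ( -- )  6 base ! ;      \ switch to base 6
: :00 ( ud1 -- ud2 )
  #                              \ units digit, in base 10
  sextal #                       \ tens digit, in base 6 (0..5)
  decimal                        \ back to base 10
  [char] : hold ;                \ separator to the left of both digits
: .time ( u -- )
  0                              \ convert to unsigned double
  <# :00 :00 #s #>               \ seconds, minutes, then all hour digits
  TYPE SPACE ;

3661 .time 1:01:01
@end example

Because the string is built from right to left, the seconds are
converted first and the hours last.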
|
|
|
The data structure created by this code (including the data structure |
|
for @code{object}) is shown in the figure @file{objects-implementation.eps}, assuming a cell size of 4.
|
@comment nac TODO add this diagram.. |
|
|
|
@node Objects Glossary, , Objects Implementation, Objects |
|
@subsubsection @file{objects.fs} Glossary |
|
@cindex @file{objects.fs} Glossary |
|
|
|
doc---objects-bind |
|
doc---objects-<bind> |
|
doc---objects-bind' |
|
doc---objects-[bind] |
|
doc---objects-class |
|
doc---objects-class->map |
|
doc---objects-class-inst-size |
|
doc---objects-class-override! |
|
doc---objects-construct |
|
doc---objects-current' |
|
doc---objects-[current] |
|
doc---objects-current-interface |
|
doc---objects-dict-new |
|
doc---objects-drop-order |
|
doc---objects-end-class |
|
doc---objects-end-class-noname |
|
doc---objects-end-interface |
|
doc---objects-end-interface-noname |
|
doc---objects-exitm |
|
doc---objects-heap-new |
|
doc---objects-implementation |
|
doc---objects-init-object |
|
doc---objects-inst-value |
|
doc---objects-inst-var |
|
doc---objects-interface |
|
doc---objects-;m |
|
doc---objects-m: |
|
doc---objects-method |
|
doc---objects-object |
|
doc---objects-overrides |
|
doc---objects-[parent] |
|
doc---objects-print |
|
doc---objects-protected |
|
doc---objects-public |
|
doc---objects-push-order |
|
doc---objects-selector |
|
doc---objects-this |
|
doc---objects-<to-inst> |
|
doc---objects-[to-inst] |
|
doc---objects-to-this |
|
doc---objects-xt-new |
|
|
|
@c ------------------------------------------------------------- |
|
@node OOF, Mini-OOF, Objects, Object-oriented Forth |
|
@subsection The @file{oof.fs} model |
|
@cindex oof |
|
@cindex object-oriented programming |
|
|
|
@cindex @file{objects.fs} |
|
@cindex @file{oof.fs} |
|
|
|
This section describes the @file{oof.fs} package. |
|
|
|
The package described in this section has been used in bigFORTH since 1991, and |
|
used for two large applications: a chromatographic system used to |
|
create new medicaments, and a graphic user interface library (MINOS). |
|
|
|
You can find a description (in German) of @file{oof.fs} in @cite{Object |
|
oriented bigFORTH} by Bernd Paysan, published in @cite{Vierte Dimension} |
|
10(2), 1994. |
|
|
|
@menu |
|
* Properties of the OOF model:: |
|
* Basic OOF Usage:: |
|
* The OOF base class:: |
|
* Class Declaration:: |
|
* Class Implementation:: |
|
@end menu |
|
|
|
@node Properties of the OOF model, Basic OOF Usage, OOF, OOF |
|
@subsubsection Properties of the @file{oof.fs} model |
|
@cindex @file{oof.fs} properties |
|
|
|
@itemize @bullet |
|
@item |
|
This model combines object oriented programming with information |
|
hiding. It helps you write large applications, where scoping is
|
necessary, because it provides class-oriented scoping. |
|
|
@node String Formats, Displaying characters and strings, Formatted numeric output, Other I/O |
@item |
@subsection String Formats |
Named objects, object pointers, and object arrays can be created;
@cindex string formats |
selector invocation uses the ``object selector'' syntax. Selector invocation |
|
to objects and/or selectors on the stack is a bit less convenient, but |
|
possible. |
|
|
@comment TODO more index entries |
@item |
|
Selector invocation and instance variable usage of the active object is |
|
straightforward, since both make use of the active object. |
|
|
Forth commonly uses two different methods for representing a string: |
@item |
|
Late binding is efficient and easy to use. |
|
|
@itemize @bullet |
|
@item |
@item |
@cindex address of counted string |
State-smart objects parse selectors. However, extensibility is provided |
As a @var{counted string}, represented by a c-addr. The char addressed |
using a (parsing) selector @code{postpone} and a selector @code{'}. |
by c-addr contains a character-count, n, of the string and the string |
|
occupies the subsequent n char addresses in memory. |
|
@item |
@item |
As a cell pair on the stack; c-addr u, where u is the length of the string
An implementation in ANS Forth is available. |
in characters, and c-addr is the address of the first byte of the string. |
|
@end itemize |
@end itemize |
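
The two representations can be compared with a small sketch (the word
@code{greeting} is a hypothetical name); @code{count} converts the
address of a counted string into the c-addr u form:

@example
: greeting ( -- c-addr )  c" hello" ;   \ address of a counted string
greeting count type                     \ displays hello
s" hello" type                          \ c-addr u form; also displays hello
@end example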
|
|
The ANS Forth Standard encourages the use of the second format when |
|
representing strings on the stack, whilst conceding that the counted
|
string format remains useful as a way of storing strings in memory. |
|
|
|
doc-count |
@node Basic OOF Usage, The OOF base class, Properties of the OOF model, OOF |
|
@subsubsection Basic @file{oof.fs} Usage |
|
@cindex @file{oof.fs} usage |
|
|
@xref{Memory Blocks}, for words that move, copy and search
This section uses the same example as for @code{objects} (@pxref{Basic Objects Usage}). |
for strings. @xref{Displaying characters and strings,} for words that |
|
display characters and strings. |
|
|
|
|
You can define a class for graphical objects like this: |
|
|
@node Displaying characters and strings, Input, String Formats, Other I/O |
@cindex @code{class} usage |
@subsection Displaying characters and strings |
@cindex @code{class;} usage |
@cindex displaying characters and strings |
@cindex @code{method} usage |
@cindex compiling characters and strings |
@example |
@cindex cursor control |
object class graphical \ "object" is the parent class |
|
method draw ( x y graphical -- ) |
|
class; |
|
@end example |
|
|
@comment TODO more index entries |
This code defines a class @code{graphical} with an |
|
operation @code{draw}. We can perform the operation |
|
@code{draw} on any @code{graphical} object, e.g.: |
|
|
This section starts with a glossary of Forth words and ends with a set |
@example |
of examples. |
100 100 t-rex draw |
|
@end example |
|
|
doc-bl |
@noindent |
doc-space |
where @code{t-rex} is an object or object pointer, created with e.g. |
doc-spaces |
@code{graphical : t-rex}. |
doc-emit |
|
doc-toupper |
|
doc-." |
|
doc-.( |
|
doc-type |
|
doc-cr |
|
doc-at-xy |
|
doc-page |
|
doc-s" |
|
doc-c" |
|
doc-char |
|
doc-[char] |
|
doc-sliteral |
|
|
|
As an example, consider the following text, stored in a file @file{test.fs}: |
@cindex abstract class |
|
How do we create a graphical object? With the present definitions, |
|
we cannot create a useful graphical object. The class |
|
@code{graphical} describes graphical objects in general, but not |
|
any concrete graphical object type (C++ users would call it an |
|
@emph{abstract class}); e.g., there is no method for the selector |
|
@code{draw} in the class @code{graphical}. |
|
|
|
For concrete graphical objects, we define child classes of the |
|
class @code{graphical}, e.g.: |
|
|
@example |
@example |
.( text-1) |
graphical class circle \ "graphical" is the parent class |
: my-word |
cell var circle-radius |
." text-2" cr |
how: |
.( text-3) |
: draw ( x y -- ) |
; |
circle-radius @@ draw-circle ; |
|
|
." text-4" |
: init ( n-radius -- )
|
circle-radius ! ; |
|
class; |
|
@end example |
|
|
: my-char |
Here we define a class @code{circle} as a child of @code{graphical}, |
[char] ALPHABET emit |
with a field @code{circle-radius}; it defines new methods for the |
char emit |
selectors @code{draw} and @code{init} (@code{init} is defined in |
; |
@code{object}, the parent class of @code{graphical}). |
|
|
|
Now we can create a circle in the dictionary with: |
|
|
|
@example |
|
50 circle : my-circle |
@end example |
@end example |
|
|
When you load this code into Gforth, the following output is generated: |
@noindent |
|
@code{:} invokes @code{init}, thus initializing the field |
|
@code{circle-radius} with 50. We can draw this new circle at (100,100) |
|
with: |
|
|
@example |
@example |
@kbd{include test.fs} text-1text-3text-4 ok |
100 100 my-circle draw |
@end example |
@end example |
|
|
|
@cindex selector invocation, restrictions |
|
@cindex class definition, restrictions |
|
Note: You can only invoke a selector if the receiving object belongs to |
|
the class where the selector was defined or one of its descendents; |
|
e.g., you can invoke @code{draw} only for objects belonging to |
|
@code{graphical} or its descendents (e.g., @code{circle}). The scoping |
|
mechanism will check if you try to invoke a selector that is not |
|
defined in this class hierarchy, so you'll get an error at compilation |
|
time. |
|
|
|
|
|
@node The OOF base class, Class Declaration, Basic OOF Usage, OOF |
|
@subsubsection The @file{oof.fs} base class |
|
@cindex @file{oof.fs} base class |
|
|
|
When you define a class, you have to specify a parent class. So how do |
|
you start defining classes? There is one class available from the start: |
|
@code{object}. You have to use it as ancestor for all classes. It is the |
|
only class that has no parent. Classes are also objects, except that |
|
they don't have instance variables; class manipulation such as |
|
inheritance or changing definitions of a class is handled through |
|
selectors of the class @code{object}. |
|
|
|
@code{object} provides a number of selectors: |
|
|
@itemize @bullet |
@itemize @bullet |
@item |
@item |
Messages @code{text-1} and @code{text-3} are displayed because @code{.(} |
@code{class} for subclassing, @code{definitions} to add definitions |
is an immediate word; it behaves in the same way whether it is used inside |
later on, and @code{class?} to get type information (is the class a
or outside a colon definition. |
subclass of the class passed on the stack?). |
@item |
doc---object-class |
Message @code{text-4} is displayed because of Gforth's added interpretation |
doc---object-definitions |
semantics for @code{."}. |
doc---object-class? |
|
|
@item |
@item |
Message @code{text-2} is @var{not} displayed, because the text interpreter |
@code{init} and @code{dispose} as constructor and destructor of the |
performs the compilation semantics for @code{."} within the definition of |
object. @code{init} is invoked after the object's memory is allocated,
@code{my-word}. |
while @code{dispose} also handles deallocation. Thus if you redefine |
@end itemize |
@code{dispose}, you have to call the parent's dispose with @code{super |
|
dispose}, too. |
|
doc---object-init |
|
doc---object-dispose |
|
|
Here are some examples of executing @code{my-word} and @code{my-char}: |
@item |
|
@code{new}, @code{new[]}, @code{:}, @code{ptr}, @code{asptr}, and |
|
@code{[]} to create named and unnamed objects and object arrays or |
|
object pointers. |
|
doc---object-new |
|
doc---object-new[] |
|
doc---object-: |
|
doc---object-ptr |
|
doc---object-asptr |
|
doc---object-[] |
|
|
@example |
@item |
my-word text-2 |
@code{::} and @code{super} for explicit scoping. You should use explicit |
ok |
scoping only for super classes or classes with the same set of instance |
@kbd{my-char fred} Af ok |
variables. Explicitly-scoped selectors use early binding. |
@kbd{my-char jim} Aj ok |
doc---object-:: |
@end example |
doc---object-super |
|
|
@itemize @bullet |
|
@item |
@item |
Message @code{text-2} is displayed because of the run-time behaviour of |
@code{self} to get the address of the object |
@code{."}. |
doc---object-self |
|
|
@item |
@item |
@code{[char]} compiles the "A" from "ALPHABET" and puts its display code |
@code{bind}, @code{bound}, @code{link}, and @code{is} to assign object |
on the stack at run-time. @code{emit} always displays the character |
pointers and instance defers. |
when @code{my-char} is executed. |
doc---object-bind |
|
doc---object-bound |
|
doc---object-link |
|
doc---object-is |
|
|
@item |
@item |
@code{char} parses a string at run-time and the second @code{emit} displays |
@code{'} to obtain selector tokens, @code{send} to invoke selectors
the first character of the string. |
from the stack, and @code{postpone} to generate selector invocation code.
|
doc---object-' |
|
doc---object-postpone |
|
|
@item |
@item |
If you type @code{see my-char} you can see that @code{[char]} discarded |
@code{with} and @code{endwith} to select the active object from the |
the text "LPHABET" and only compiled the display code for "A" into the |
stack, and enable its scope. Using @code{with} and @code{endwith} |
definition of @code{my-char}. |
also allows you to create code using selector @code{postpone} without being |
|
trapped by the state-smart objects. |
|
doc---object-with |
|
doc---object-endwith |
|
|
@end itemize |
@end itemize |
|
|
|
@node Class Declaration, Class Implementation, The OOF base class, OOF |
|
@subsubsection Class Declaration |
|
@cindex class declaration |
|
|
|
@itemize @bullet |
|
@item |
|
Instance variables |
|
doc---oof-var |
|
|
|
@item |
|
Object pointers |
|
doc---oof-ptr |
|
doc---oof-asptr |
|
|
@node Input, , Displaying characters and strings, Other I/O |
@item |
@subsection Input |
Instance defers |
@cindex Input |
doc---oof-defer |
@comment TODO more index entries |
|
|
|
Blah on traditional and recommended string formats. |
@item |
|
Method selectors |
|
doc---oof-early |
|
doc---oof-method |
|
|
doc--trailing |
@item |
doc-/string |
Class-wide variables |
doc-convert |
doc---oof-static |
doc->number |
|
doc->float |
|
doc-accept |
|
doc-query |
|
doc-expect |
|
doc-evaluate |
|
doc-key |
|
doc-key? |
|
|
|
TODO reference the block move stuff elsewhere |
@item |
|
End declaration |
|
doc---oof-how: |
|
doc---oof-class; |
|
|
TODO convert and >number might be better in the numeric input section. |
@end itemize |
|
|
TODO maybe some of these shouldn't be here but should be in a "parsing" section |
@c ------------------------------------------------------------- |
|
@node Class Implementation, , Class Declaration, OOF |
|
@subsubsection Class Implementation |
|
@cindex class implementation |
|
|
|
@c ------------------------------------------------------------- |
|
@node Mini-OOF, Comparison with other object models, OOF, Object-oriented Forth |
|
@subsection The @file{mini-oof.fs} model |
|
@cindex mini-oof |
|
|
@node Programming Tools, Assembler and Code Words, Other I/O, Words |
Gforth's third object oriented Forth package is a 12-liner. It uses a |
@section Programming Tools |
mixture of the @file{objects.fs} and the @file{oof.fs} syntax,
@cindex programming tools |
and reduces to the bare minimum of features. This is based on a posting |
|
of Bernd Paysan in comp.arch. |
|
|
@menu |
@menu |
* Debugging:: Simple and quick. |
* Basic Mini-OOF Usage:: |
* Assertions:: Making your programs self-checking. |
* Mini-OOF Example:: |
* Singlestep Debugger:: Executing your program word by word. |
* Mini-OOF Implementation:: |
@end menu |
@end menu |
|
|
@node Debugging, Assertions, Programming Tools, Programming Tools |
@c ------------------------------------------------------------- |
@subsection Debugging |
@node Basic Mini-OOF Usage, Mini-OOF Example, , Mini-OOF |
@cindex debugging |
@subsubsection Basic @file{mini-oof.fs} Usage |
|
@cindex mini-oof usage |
Languages with a slow edit/compile/link/test development loop tend to |
|
require sophisticated tracing/stepping debuggers to facilitate
|
productive debugging. |
|
|
|
A much better (faster) way in fast-compiling languages is to add |
There is a base class (@code{class}, which allocates one cell |
printing code at well-selected places, let the program run, look at |
for the object pointer) plus seven other words: to define a method, a |
the output, see where things went wrong, add more printing code, etc., |
variable, a class; to end a class, to resolve binding, to allocate an |
until the bug is found. |
object and to compile a class method. |
|
@comment TODO better description of the last one |
|
|
The simple debugging aids provided in @file{debugs.fs} |
doc-object |
are meant to support this style of debugging. In addition, there are |
doc-method |
words for non-destructively inspecting the stack and memory: |
doc-var |
|
doc-class |
|
doc-end-class |
|
doc-defines |
|
doc-new |
|
doc-:: |
|
|
doc-.s |
|
doc-f.s |
|
|
|
There is a word @code{.r} but it does @var{not} display the return |
@c ------------------------------------------------------------- |
stack! It is used for formatted numeric output. |
@node Mini-OOF Example, Mini-OOF Implementation, Basic Mini-OOF Usage, Mini-OOF |
|
@subsubsection Mini-OOF Example |
|
@cindex mini-oof example |
|
|
doc-depth |
A short example shows how to use this package. This example, in slightly |
doc-fdepth |
extended form, is supplied as @file{moof-exm.fs}.
doc-clearstack |
@comment nac TODO could flesh this out with some comments from the Forthwrite article |
doc-? |
|
doc-dump |
|
|
|
The word @code{~~} prints debugging information (by default the source |
@example |
location and the stack contents). It is easy to insert. If you use Emacs |
object class |
it is also easy to remove (@kbd{C-x ~} in the Emacs Forth mode to |
method init |
query-replace them with nothing). The deferred words |
method draw |
@code{printdebugdata} and @code{printdebugline} control the output of |
end-class graphical |
@code{~~}. The default source location output format works well with |
@end example |
Emacs' compilation mode, so you can step through the program at the |
|
source level using @kbd{C-x `} (the advantage over a stepping debugger |
|
is that you can step in any direction and you know where the crash has |
|
happened or where the strange data has occurred). |
|
|
|
Note that the default actions clobber the contents of the pictured |
This code defines a class @code{graphical} with an |
numeric output string, so you should not use @code{~~}, e.g., between |
operation @code{draw}. We can perform the operation |
@code{<#} and @code{#>}. |
@code{draw} on any @code{graphical} object, e.g.: |
|
|
doc-~~ |
@example |
doc-printdebugdata |
100 100 t-rex draw |
doc-printdebugline |
@end example |
|
|
doc-see |
where @code{t-rex} is an object or object pointer, created with e.g. |
doc-marker |
@code{graphical new Constant t-rex}. |
|
|
Here's an example of using @code{marker} at the start of a source file |
For concrete graphical objects, we define child classes of the |
that you are debugging; it ensures that you only ever have one copy of |
class @code{graphical}, e.g.: |
the file's definitions compiled at any time: |
|
|
|
@example |
@example |
[IFDEF] my-code |
graphical class |
my-code |
cell var circle-radius |
[ENDIF] |
end-class circle \ "graphical" is the parent class |
|
|
marker my-code |
|
|
|
\ .. definitions start here |
:noname ( x y -- ) |
\ . |
circle-radius @@ draw-circle ; circle defines draw |
\ . |
:noname ( r -- ) |
\ end |
circle-radius ! ; circle defines init |
@end example |
@end example |
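
Similarly, a minimal sketch of using @code{~~} (the word @code{avg} is a
hypothetical example); each @code{~~} reports the source location and
the stack contents at that point without changing the stack:

@example
: avg ( n1 n2 -- n3 )
  ~~ +      \ report the two operands
  ~~ 2/ ;   \ report the sum before halving
@end example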
|
|
|
There is no implicit init method, so we have to define one. The creation |
|
code of the object now has to call @code{init} explicitly.
@node Assertions, Singlestep Debugger, Debugging, Programming Tools |
|
@subsection Assertions |
|
@cindex assertions |
|
|
|
It is a good idea to make your programs self-checking, in particular, if |
|
you use an assumption (e.g., that a certain field of a data structure is |
|
never zero) that may become wrong during maintenance. Gforth supports |
|
assertions for this purpose. They are used like this: |
|
|
|
@example |
@example |
assert( @var{flag} ) |
circle new Constant my-circle |
|
50 my-circle init |
@end example |
@end example |
|
|
The code between @code{assert(} and @code{)} should compute a flag, that |
It is also possible to add a function to create named objects with |
should be true if everything is alright and false otherwise. It should |
automatic call of @code{init}, given that all objects have @code{init} |
not change anything else on the stack. The overall stack effect of the |
on the same place: |
assertion is @code{( -- )}. E.g. |
|
|
|
@example |
@example |
assert( 1 1 + 2 = ) \ what we learn in school |
: new: ( .. o "name" -- ) |
assert( dup 0<> ) \ assert that the top of stack is not zero |
new dup Constant init ; |
assert( false ) \ this code should not be reached |
80 circle new: large-circle |
@end example |
@end example |
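
Inside a definition, an assertion might be used like this (a sketch; the
word @code{safe/} is a hypothetical name):

@example
: safe/ ( n1 n2 -- n3 )
  assert( dup 0<> )   \ the divisor must not be zero
  / ;
@end example

The assertion duplicates the divisor before testing it, so its overall
stack effect remains @code{( -- )}.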
|
|
The need for assertions is different at different times. During |
We can draw this new circle at (100,100) with: |
debugging, we want more checking, in production we sometimes care more |
|
for speed. Therefore, assertions can be turned off, i.e., the assertion |
|
becomes a comment. Depending on the importance of an assertion and the |
|
time it takes to check it, you may want to turn off some assertions and |
|
keep others turned on. Gforth provides several levels of assertions for |
|
this purpose: |
|
|
|
doc-assert0( |
@example |
doc-assert1( |
100 100 my-circle draw |
doc-assert2( |
@end example |
doc-assert3( |
|
doc-assert( |
|
doc-) |
|
|
|
@code{Assert(} is the same as @code{assert1(}. The variable |
@node Mini-OOF Implementation, , Mini-OOF Example, Mini-OOF |
@code{assert-level} specifies the highest assertions that are turned |
@subsubsection @file{mini-oof.fs} Implementation |
on. I.e., at the default @code{assert-level} of one, @code{assert0(} and |
|
@code{assert1(} assertions perform checking, while @code{assert2(} and |
|
@code{assert3(} assertions are treated as comments. |
|
|
|
Note that the @code{assert-level} is evaluated at compile-time, not at |
|
run-time. I.e., you cannot turn assertions on or off at run-time, you |
|
have to set the @code{assert-level} appropriately before compiling a |
|
piece of code. You can compile several pieces of code at several |
|
@code{assert-level}s (e.g., a trusted library at level 1 and newly |
|
written code at level 3). |
|
|
|
doc-assert-level |
Object-oriented systems with late binding typically use a |
|
``vtable''-approach: the first variable in each object is a pointer to a |
|
table, which contains the methods as function pointers. The vtable |
|
may also contain other information. |
|
|
If an assertion fails, a message compatible with Emacs' compilation mode |
So first, let's declare methods: |
is produced and the execution is aborted (currently with @code{ABORT"}. |
|
If there is interest, we will introduce a special throw code. But if you |
|
intend to @code{catch} a specific condition, using @code{throw} is |
|
probably more appropriate than an assertion). |
|
|
|
Definitions in ANS Standard Forth for these assertion words are provided |
@example |
in @file{compat/assert.fs}. |
: method ( m v -- m' v ) Create over , swap cell+ swap |
|
DOES> ( ... o -- ... ) @@ over @@ + @@ execute ;
|
@end example |
|
|
|
During method declaration, the number of methods and instance |
|
variables is on the stack (in address units). @code{method} creates |
|
one method and increments the method number. To execute a method, it |
|
takes the object, fetches the vtable pointer, adds the offset, and |
|
executes the @var{xt} stored there. Each method takes the object it is |
|
invoked from as top of stack parameter. The method itself should |
|
consume that object. |
|
|
@node Singlestep Debugger, , Assertions, Programming Tools |
Now, we also have to declare instance variables |
@subsection Singlestep Debugger |
|
@cindex singlestep Debugger |
|
@cindex debugging Singlestep |
|
@cindex @code{dbg} |
|
@cindex @code{BREAK:} |
|
@cindex @code{BREAK"} |
|
|
|
When a new word is created there's often the need to check whether it behaves |
@example |
correctly or not. You can do this by typing @code{dbg badword}. |
: var ( m v size -- m v' ) Create over , + |
|
DOES> ( o -- addr ) @ + ; |
|
@end example |
|
|
doc-dbg |
As before, a word is created with the current offset. Instance |
|
variables can have different sizes (cells, floats, doubles, chars), so |
|
all we do is take the size and add it to the offset. If your machine |
|
has alignment restrictions, put the proper @code{aligned} or |
|
@code{faligned} before the variable, to adjust the variable |
|
offset. That's why it is on the top of stack. |
|
|
This might look like: |
We need a starting point (the base object) and some syntactic sugar: |
|
|
@example |
@example |
: badword 0 DO i . LOOP ; ok |
Create object 1 cells , 2 cells , |
2 dbg badword |
: class ( class -- class methods vars ) dup 2@ ; |
: badword |
@end example |
Scanning code... |
|
|
|
Nesting debugger ready! |
For inheritance, the vtable of the parent object has to be |
|
copied when a new, derived class is declared. This gives all the |
|
methods of the parent class, which can be overridden, though. |
|
|
400D4738 8049BC4 0 -> [ 2 ] 00002 00000 |
@example |
400D4740 8049F68 DO -> [ 0 ] |
: end-class ( class methods vars -- ) |
400D4744 804A0C8 i -> [ 1 ] 00000 |
Create here >r , dup , 2 cells ?DO ['] noop , 1 cells +LOOP |
400D4748 400C5E60 . -> 0 [ 0 ] |
cell+ dup cell+ r> rot @ 2 cells /string move ; |
400D474C 8049D0C LOOP -> [ 0 ] |
|
400D4744 804A0C8 i -> [ 1 ] 00001 |
|
400D4748 400C5E60 . -> 1 [ 0 ] |
|
400D474C 8049D0C LOOP -> [ 0 ] |
|
400D4758 804B384 ; -> ok |
|
@end example |
@end example |
|
|
Each line displayed is one step. You always have to hit return to |
The first line creates the vtable, initialized with |
execute the next word that is displayed. If you don't want to execute |
@code{noop}s. The second line is the inheritance mechanism, it |
the next word as a whole, you have to type @kbd{n} for @code{nest}. Here is |
copies the xts from the parent vtable. |
an overview of which keys are available: |
|
|
|
@table @i |
We still have no way to define new methods, let's do that now: |
|
|
@item <return> |
@example |
Next; Execute the next word. |
: defines ( xt class -- ) ' >body @ + ! ; |
|
@end example |
|
|
@item n |
To allocate a new object, we need a word, too: |
Nest; Single step through next word. |
|
|
|
@item u |
@example |
Unnest; Stop debugging and execute rest of word. If we got to this word |
: new ( class -- o ) here over @ allot swap over ! ; |
with nest, continue debugging with the calling word. |
@end example |
|
|
@item d |
Sometimes derived classes want to access the method of the |
Done; Stop debugging and execute rest. |
parent object. There are two ways to achieve this with Mini-OOF: |
|
first, you could use named words, and second, you could look up the |
|
vtable of the parent object. |
|
|
@item s |
@example |
Stop; Abort immediately. |
: :: ( class "name" -- ) ' >body @ + @ compile, ; |
|
@end example |
|
|
@end table |
|
|
|
Debugging a large application with this mechanism is very difficult, because |
Nothing can be more confusing than a good example, so here is |
you have to nest very deep into the program before the interesting part |
one. First let's declare a text object (called |
begins. This takes a lot of time. |
@code{button}), that stores text and position: |
|
|
To do it more directly put a @code{BREAK:} command into your source code. |
@example |
When program execution reaches @code{BREAK:} the single step debugger is |
object class |
invoked and you have all the features described above. |
cell var text |
|
cell var len |
|
cell var x |
|
cell var y |
|
method init |
|
method draw |
|
end-class button |
|
@end example |
|
|
If you have more than one part to debug it is useful to know where the |
@noindent |
program has stopped at the moment. You can do this by the |
Now, implement the two methods, @code{draw} and @code{init}: |
@code{BREAK" string"} command. This behaves like @code{BREAK:} except that |
|
string is typed out when the ``breakpoint'' is reached. |
|
|
|
@node Assembler and Code Words, Threading Words, Programming Tools, Words |
@example |
@section Assembler and Code Words |
:noname ( o -- ) |
@cindex assembler |
>r r@ x @ r@ y @ at-xy r@ text @ r> len @ type ; |
@cindex code words |
button defines draw |
|
:noname ( addr u o -- ) |
|
>r 0 r@ x ! 0 r@ y ! r@ len ! r> text ! ; |
|
button defines init |
|
@end example |
|
|
Gforth provides some words for defining primitives (words written in |
@noindent |
machine code), and for defining the machine-code equivalent of |
To demonstrate inheritance, we define a class @code{bold-button}, with no |
@code{DOES>}-based defining words. However, the machine-independent |
new data and no new methods: |
nature of Gforth poses a few problems: First of all, Gforth runs on |
|
several architectures, so it can provide no standard assembler. What's |
|
worse is that the register allocation not only depends on the processor, |
|
but also on the @code{gcc} version and options used. |
|
|
|
The words that Gforth offers encapsulate some system dependences (e.g., the |
@example |
header structure), so a system-independent assembler may be used in |
button class |
Gforth. If you do not have an assembler, you can compile machine code |
end-class bold-button |
directly with @code{,} and @code{c,}. |
|
|
|
doc-assembler |
: bold 27 emit ." [1m" ; |
doc-code |
: normal 27 emit ." [0m" ; |
doc-end-code |
@end example |
doc-;code |
|
doc-flush-icache |
|
|
|
If @code{flush-icache} does not work correctly, @code{code} words |
@noindent |
etc. will not work (reliably), either. |
The class @code{bold-button} has a different draw method to |
|
@code{button}, but the new method is defined in terms of the draw method |
|
for @code{button}: |
|
|
These words are rarely used. Therefore they reside in @code{code.fs}, |
@example |
which is usually not loaded (except @code{flush-icache}, which is always |
:noname bold [ button :: draw ] normal ; bold-button defines draw |
present). You can load them with @code{require code.fs}. |
@end example |
|
|
@cindex registers of the inner interpreter |
@noindent |
In the assembly code you will want to refer to the inner interpreter's |
Finally, create two objects and apply methods: |
registers (e.g., the data stack pointer) and you may want to use other |
|
registers for temporary storage. Unfortunately, the register allocation |
|
is installation-dependent. |
|
|
|
The easiest solution is to use explicit register declarations |
@example |
(@pxref{Explicit Reg Vars, , Variables in Specified Registers, gcc.info, |
button new Constant foo |
GNU C Manual}) for all of the inner interpreter's registers: You have to |
s" thin foo" foo init |
compile Gforth with @code{-DFORCE_REG} (configure option |
page |
@code{--enable-force-reg}) and the appropriate declarations must be |
foo draw |
present in the @code{machine.h} file (see @code{mips.h} for an example; |
bold-button new Constant bar |
you can find a full list of all declarable register symbols with |
s" fat bar" bar init |
@code{grep register engine.c}). If you give explicit registers to all |
1 bar y ! |
variables that are declared at the beginning of @code{engine()}, you |
bar draw |
should be able to use the other caller-saved registers for temporary |
@end example |
storage. Alternatively, you can use the @code{gcc} option |
|
@code{-ffixed-REG} (@pxref{Code Gen Options, , Options for Code |
|
Generation Conventions, gcc.info, GNU C Manual}) to reserve a register |
|
(however, this restriction on register allocation may slow Gforth |
|
significantly). |
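
Returning to Mini-OOF: the sketch below is one possible way to see late binding at work through a common caller. It assumes that @file{mini-oof.fs} can be loaded with @code{require} and provides the words shown in the implementation section (@code{class}, @code{var}, @code{method}, @code{end-class}, @code{defines}, @code{new}). The names @code{counter}, @code{double-counter}, @code{total}, @code{get}, @code{bump} and @code{bump-twice} are invented for this example.

```forth
\ Sketch only: all class, method and variable names here are hypothetical.
require mini-oof.fs

object class
  cell var total              \ one instance variable
  method get                  \ ( o -- n )
  method bump                 \ ( o -- )
end-class counter

:noname ( o -- n ) total @ ;        counter defines get
:noname ( o -- )   1 swap total +! ; counter defines bump

\ A derived class: inherits get, overrides bump.
counter class
end-class double-counter
:noname ( o -- ) 2 swap total +! ; double-counter defines bump

\ bump-twice does not know which class it is given; the method
\ actually executed is looked up in the object's vtable at run time.
: bump-twice ( o -- ) dup bump bump ;

counter new Constant c1
double-counter new Constant c2
0 c1 total !  0 c2 total !    \ new does not initialize instance variables
c1 bump-twice  c2 bump-twice
c1 get .  c2 get .
```

Because @code{new} only allots memory, the sketch stores 0 into @code{total} explicitly before using the objects.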
|
|
|
If this solution is not viable (e.g., because @code{gcc} does not allow |
|
you to explicitly declare all the registers you need), you have to find |
|
out by looking at the code where the inner interpreter's registers |
|
reside and which registers can be used for temporary storage. You can |
|
get an assembly listing of the engine's code with @code{make engine.s}. |
|
|
|
In any case, it is good practice to abstract your assembly code from the |
@node Comparison with other object models, , Mini-OOF, Object-oriented Forth |
actual register allocation. E.g., if the data stack pointer resides in |
@subsubsection Comparison with other object models |
register @code{$17}, create an alias for this register called @code{sp}, |
@cindex comparison of object models |
and use that in your assembly code. |
@cindex object models, comparison |
|
|
@cindex code words, portable |
Many object-oriented Forth extensions have been proposed (@cite{A survey |
Another option for implementing normal and defining words efficiently |
of object-oriented Forths} (SIGPLAN Notices, April 1996) by Bradford |
is: adding the wanted functionality to the source of Gforth. For normal |
J. Rodriguez and W. F. S. Poehlman lists 17). This section discusses the |
words you just have to edit @file{primitives} (@pxref{Automatic |
relation of the object models described here to two well-known and two |
Generation}), defining words (equivalent to @code{;CODE} words, for fast |
closely-related (by the use of method maps) models. |
defined words) may require changes in @file{engine.c}, @file{kernel.fs}, |
|
@file{prims2x.fs}, and possibly @file{cross.fs}. |
|
|
|
|
@cindex Neon model |
|
The most popular model currently seems to be the Neon model (see |
|
@cite{Object-oriented programming in ANS Forth} (Forth Dimensions, March |
|
1997) by Andrew McKewan) but this model has a number of limitations |
|
@footnote{A longer version of this critique can be |
|
found in @cite{On Standardizing Object-Oriented Forth Extensions} (Forth |
|
Dimensions, May 1997) by Anton Ertl.}: |
|
|
@node Threading Words, Passing Commands to the OS, Assembler and Code Words, Words |
@itemize @bullet |
@section Threading Words |
@item |
@cindex threading words |
It uses a @code{@emph{selector |
|
object}} syntax, which makes it unnatural to pass objects on the |
|
stack. |
|
|
@cindex code address |
@item |
These words provide access to code addresses and other threading stuff |
It requires that the selector parses the input stream (at |
in Gforth (and, possibly, other interpretive Forths). They more or less |
compile time); this leads to reduced extensibility and to bugs that are |
abstract away the differences between direct and indirect threading |
hard to find. |
(and, for direct threading, the machine dependences). However, at |
|
present this wordset is still incomplete. It is also pretty low-level; |
|
some day it will hopefully be made unnecessary by an internals wordset |
|
that abstracts implementation details away completely. |
|
|
|
doc-threading-method |
@item |
doc->code-address |
It allows using every selector to every object; |
doc->does-code |
this eliminates the need for classes, but makes it harder to create |
doc-code-address! |
efficient implementations. |
doc-does-code! |
@end itemize |
doc-does-handler! |
|
doc-/does-handler |
|
|
|
The code addresses used by various defining words are produced by |
@cindex Pountain's object-oriented model |
the following words: |
Another well-known publication is @cite{Object-Oriented Forth} (Academic |
|
Press, London, 1987) by Dick Pountain. However, it is not really about |
|
object-oriented programming, because it hardly deals with late |
|
binding. Instead, it focuses on features like information hiding and |
|
overloading that are characteristic of modular languages like Ada (83). |
|
|
doc-docol: |
@cindex Zsoter's object-oriented model |
doc-docon: |
In @cite{Does late binding have to be slow?} (Forth Dimensions 18(1) 1996, pages 31-35) |
doc-dovar: |
Andras Zsoter describes a model that makes heavy use of an active object |
doc-douser: |
(like @code{this} in @file{objects.fs}): The active object is not only |
doc-dodefer: |
used for accessing all fields, but also specifies the receiving object |
doc-dofield: |
of every selector invocation; you have to change the active object |
|
explicitly with @code{@{ ... @}}, whereas in @file{objects.fs} it |
|
changes more or less implicitly at @code{m: ... ;m}. Such a change at |
|
the method entry point is unnecessary with Zsoter's model, because |
|
the receiving object is the active object already. On the other hand, the explicit |
|
change is absolutely necessary in that model, because otherwise no one |
|
could ever change the active object. An ANS Forth implementation of this |
|
model is available at @url{http://www.forth.org/fig/oopf.html}. |
|
|
You can recognize words defined by a @code{CREATE}...@code{DOES>} word |
@cindex @file{oof.fs}, differences to other models |
with @code{>DOES-CODE}. If the word was defined in that way, the value |
The @file{oof.fs} model combines information hiding and overloading |
returned is different from 0 and identifies the @code{DOES>} used by the |
resolution (by keeping names in various word lists) with object-oriented |
defining word. |
programming. It sets the active object implicitly on method entry, but |
@comment TODO should that be "identifies the xt of the DOES> ?? |
also allows explicit changing (with @code{>o...o>} or with |
|
@code{with...endwith}). It uses parsing and state-smart objects and |
|
classes for resolving overloading and for early binding: the object or |
|
class parses the selector and determines the method from this. If the |
|
selector is not parsed by an object or class, it performs a call to the |
|
selector for the active object (late binding), like Zsoter's model. |
|
Fields are always accessed through the active object. The big |
|
disadvantage of this model is the parsing and the state-smartness, which |
|
reduces extensibility and increases the opportunities for subtle bugs; |
|
essentially, you are only safe if you never tick or @code{postpone} an |
|
object or class (Bernd disagrees, but I (Anton) am not convinced). |
|
|
|
@cindex @file{mini-oof.fs}, differences to other models |
|
The @file{mini-oof.fs} model is quite similar to a very stripped-down version of |
|
the @file{objects.fs} model, but syntactically it is a mixture of the @file{objects.fs} and |
|
@file{oof.fs} models. |
|
|
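
The use of @code{>DOES-CODE} described in the Threading Words section might be tried out along the following lines. This is a minimal sketch: @code{my-constant} and @code{five} are hypothetical names, and the only property relied upon is the one stated in the text, namely that the result is different from 0 for a word defined by a @code{CREATE}...@code{DOES>} word.

```forth
\ Sketch only: my-constant and five are hypothetical names.
: my-constant ( n "name" -- )
  Create ,                    \ store n in the body of the new word
  DOES> ( -- n ) @ ;          \ children fetch the stored value

5 my-constant five

' five >does-code 0<> .       \ a true flag is expected: five is a DOES> child
```
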
@node Passing Commands to the OS, Miscellaneous Words, Threading Words, Words |
@c ------------------------------------------------------------- |
|
@node Passing Commands to the OS, Miscellaneous Words, Object-oriented Forth, Words |
@section Passing Commands to the Operating System |
@section Passing Commands to the Operating System |
@cindex operating system - passing commands |
@cindex operating system - passing commands |
@cindex shell commands |
@cindex shell commands |
Line 7102 doc-system
|
Line 7275 doc-system
|
doc-$? |
doc-$? |
doc-getenv |
doc-getenv |
|
|
|
@c ------------------------------------------------------------- |
@node Miscellaneous Words, , Passing Commands to the OS, Words |
@node Miscellaneous Words, , Passing Commands to the OS, Words |
@section Miscellaneous Words |
@section Miscellaneous Words |
@cindex miscellaneous words |
@cindex miscellaneous words |
|
|
This section lists the ANS Standard Forth words that are not documented |
This section lists the ANS Forth words that are not documented |
elsewhere in this manual. Ultimately, they all need proper homes. |
elsewhere in this manual. Ultimately, they all need proper homes. |
|
|
doc-, |
doc-, |
Line 7126 doc-word
|
Line 7299 doc-word
|
doc-[compile] |
doc-[compile] |
doc-refill |
doc-refill |
|
|
These ANS Standard Forth words are not currently implemented in Gforth |
These ANS Forth words are not currently implemented in Gforth |
(see TODO section on dependencies) |
(see TODO section on dependencies) |
|
|
The following ANS Standard Forth words are not currently supported by Gforth |
The following ANS Forth words are not currently supported by Gforth |
(@pxref{ANS conformance}) |
(@pxref{ANS conformance}) |
|
|
@code{EDITOR} |
@code{EDITOR} |
Line 7385 installation-dependent. Currently a char
|
Line 7558 installation-dependent. Currently a char
|
|
|
@item character-set extensions and matching of names: |
@item character-set extensions and matching of names: |
@cindex character-set extensions and matching of names |
@cindex character-set extensions and matching of names |
@cindex case sensitivity for name lookup |
@cindex case-sensitivity for name lookup |
@cindex name lookup, case sensitivity |
@cindex name lookup, case-sensitivity |
@cindex locale and case sensitivity |
@cindex locale and case-sensitivity |
Any character except the ASCII NUL character can be used in a |
Any character except the ASCII NUL character can be used in a |
name. Matching is case-insensitive (except in @code{TABLE}s). The |
name. Matching is case-insensitive (except in @code{TABLE}s). The |
matching is performed using the C function @code{strncasecmp}, whose |
matching is performed using the C function @code{strncasecmp}, whose |
Line 7413 like @code{PARSE} otherwise. @code{(NAME
|
Line 7586 like @code{PARSE} otherwise. @code{(NAME
|
interpreter (aka text interpreter) by default, treats all white-space |
interpreter (aka text interpreter) by default, treats all white-space |
characters as delimiters. |
characters as delimiters. |
|
|
@item format of the control flow stack: |
@item format of the control-flow stack: |
@cindex control flow stack, format |
@cindex control-flow stack, format |
The data stack is used as control flow stack. The size of a control flow |
The data stack is used as control-flow stack. The size of a control-flow |
stack item in cells is given by the constant @code{cs-item-size}. At the |
stack item in cells is given by the constant @code{cs-item-size}. At the |
time of this writing, an item consists of a (pointer to a) locals list |
time of this writing, an item consists of a (pointer to a) locals list |
(third), an address in the code (second), and a tag for identifying the |
(third), an address in the code (second), and a tag for identifying the |
Line 7443 The error string is stored into the vari
|
Line 7616 The error string is stored into the vari
|
@item input line terminator: |
@item input line terminator: |
@cindex input line terminator |
@cindex input line terminator |
@cindex line terminator on input |
@cindex line terminator on input |
@cindex newline charcter on input |
@cindex newline character on input |
For interactive input, @kbd{C-m} (CR) and @kbd{C-j} (LF) terminate |
For interactive input, @kbd{C-m} (CR) and @kbd{C-j} (LF) terminate |
lines. One of these characters is typically produced when you type the |
lines. One of these characters is typically produced when you type the |
@kbd{Enter} or @kbd{Return} key. |
@kbd{Enter} or @kbd{Return} key. |
Line 7548 The remainder of dictionary space. @code
|
Line 7721 The remainder of dictionary space. @code
|
|
|
@item system case-sensitivity characteristics: |
@item system case-sensitivity characteristics: |
@cindex case-sensitivity characteristics |
@cindex case-sensitivity characteristics |
Dictionary searches are case insensitive (except in |
Dictionary searches are case-insensitive (except in |
@code{TABLE}s). However, as explained above under @i{character-set |
@code{TABLE}s). However, as explained above under @i{character-set |
extensions}, the matching for non-ASCII characters is determined by the |
extensions}, the matching for non-ASCII characters is determined by the |
locale you are using. In the default @code{C} locale all non-ASCII |
locale you are using. In the default @code{C} locale all non-ASCII |
Line 7594 No.
|
Line 7767 No.
|
|
|
@item a name is neither a word nor a number: |
@item a name is neither a word nor a number: |
@cindex name not found |
@cindex name not found |
@cindex Undefined word |
@cindex undefined word |
@code{-13 throw} (Undefined word). Actually, @code{-13 bounce}, which |
@code{-13 throw} (Undefined word). Actually, @code{-13 bounce}, which |
preserves the data and FP stack, so you don't lose more work than |
preserves the data and FP stack, so you don't lose more work than |
necessary. |
necessary. |
|
|
@item a definition name exceeds the maximum length allowed: |
@item a definition name exceeds the maximum length allowed: |
@cindex Word name too long |
@cindex word name too long |
@code{-19 throw} (Word name too long) |
@code{-19 throw} (Word name too long) |
|
|
@item addressing a region not inside the various data spaces of the forth system: |
@item addressing a region not inside the various data spaces of the forth system: |
Line 7611 the operating system. On decent systems:
|
Line 7784 the operating system. On decent systems:
|
address). |
address). |
|
|
@item argument type incompatible with parameter: |
@item argument type incompatible with parameter: |
@cindex Argument type mismatch |
@cindex argument type mismatch |
This is usually not caught. Some words perform checks, e.g., the control |
This is usually not caught. Some words perform checks, e.g., the control |
flow words, and issue a @code{ABORT"} or @code{-12 THROW} (Argument type |
flow words, and issue a @code{ABORT"} or @code{-12 THROW} (Argument type |
mismatch). |
mismatch). |
Line 7626 get an execution token for @code{compile
|
Line 7799 get an execution token for @code{compile
|
@item dividing by zero: |
@item dividing by zero: |
@cindex dividing by zero |
@cindex dividing by zero |
@cindex floating point unidentified fault, integer division |
@cindex floating point unidentified fault, integer division |
@cindex divide by zero |
|
On better platforms, this produces a @code{-10 throw} (Division by |
On better platforms, this produces a @code{-10 throw} (Division by |
zero); on other systems, this typically results in a @code{-55 throw} |
zero); on other systems, this typically results in a @code{-55 throw} |
(Floating-point unidentified fault). |
(Floating-point unidentified fault). |
Line 7634 zero); on other systems, this typically
|
Line 7806 zero); on other systems, this typically
|
@item insufficient data stack or return stack space: |
@item insufficient data stack or return stack space: |
@cindex insufficient data stack or return stack space |
@cindex insufficient data stack or return stack space |
@cindex stack overflow |
@cindex stack overflow |
@cindex Address alignment exception, stack overflow |
@cindex address alignment exception, stack overflow |
@cindex Invalid memory address, stack overflow |
@cindex Invalid memory address, stack overflow |
Depending on the operating system, the installation, and the invocation |
Depending on the operating system, the installation, and the invocation |
of Gforth, this is either checked by the memory management hardware, or |
of Gforth, this is either checked by the memory management hardware, or |
Line 7729 Compiles a recursive call to the definin
|
Line 7901 Compiles a recursive call to the definin
|
|
|
@item argument input source different than current input source for @code{RESTORE-INPUT}: |
@item argument input source different than current input source for @code{RESTORE-INPUT}: |
@cindex argument input source different than current input source for @code{RESTORE-INPUT} |
@cindex argument input source different than current input source for @code{RESTORE-INPUT} |
@cindex Argument type mismatch, @code{RESTORE-INPUT} |
@cindex argument type mismatch, @code{RESTORE-INPUT} |
@cindex @code{RESTORE-INPUT}, Argument type mismatch |
@cindex @code{RESTORE-INPUT}, Argument type mismatch |
@code{-12 THROW}. Note that, once an input file is closed (e.g., because |
@code{-12 THROW}. Note that, once an input file is closed (e.g., because |
the end of the file was reached), its source-id may be |
the end of the file was reached), its source-id may be |
Line 7748 memory access faults or execution of ill
|
Line 7920 memory access faults or execution of ill
|
@item data space read/write with incorrect alignment: |
@item data space read/write with incorrect alignment: |
@cindex data space read/write with incorrect alignment |
@cindex data space read/write with incorrect alignment |
@cindex alignment faults |
@cindex alignment faults |
@cindex Address alignment exception |
@cindex address alignment exception |
Processor-dependent. Typically results in a @code{-23 throw} (Address |
Processor-dependent. Typically results in a @code{-23 throw} (Address |
alignment exception). Under Linux-Intel on a 486 or later processor with |
alignment exception). Under Linux-Intel on a 486 or later processor with |
alignment turned on, incorrect alignment results in a @code{-9 throw} |
alignment turned on, incorrect alignment results in a @code{-9 throw} |
Line 7781 defined by @code{CONSTANT}; in the latte
|
Line 7953 defined by @code{CONSTANT}; in the latte
|
|
|
@item name not found (@code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]}): |
@item name not found (@code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]}): |
@cindex name not found (@code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]}) |
@cindex name not found (@code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]}) |
@cindex Undefined word, @code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]} |
@cindex undefined word, @code{'}, @code{POSTPONE}, @code{[']}, @code{[COMPILE]} |
@code{-13 throw} (Undefined word) |
@code{-13 throw} (Undefined word) |
|
|
@item parameters are not of the same type (@code{DO}, @code{?DO}, @code{WITHIN}): |
@item parameters are not of the same type (@code{DO}, @code{?DO}, @code{WITHIN}): |
Line 7795 Assume @code{: X POSTPONE TO ; IMMEDIATE
|
Line 7967 Assume @code{: X POSTPONE TO ; IMMEDIATE
|
compilation semantics of @code{TO}. |
compilation semantics of @code{TO}. |
|
|
@item String longer than a counted string returned by @code{WORD}: |
@item String longer than a counted string returned by @code{WORD}: |
@cindex String longer than a counted string returned by @code{WORD} |
@cindex string longer than a counted string returned by @code{WORD} |
@cindex @code{WORD}, string overflow |
@cindex @code{WORD}, string overflow |
Not checked. The string will be ok, but the count will, of course, |
Not checked. The string will be ok, but the count will, of course, |
contain only the least significant bits of the length. |
contain only the least significant bits of the length. |
Line 8507 as well as possible.
|
Line 8679 as well as possible.
|
@cindex @code{FORGET}, deleting the compilation word list |
@cindex @code{FORGET}, deleting the compilation word list |
Not implemented (yet). |
Not implemented (yet). |
|
|
@item fewer than @var{u}+1 items on the control flow stack (@code{CS-PICK}, @code{CS-ROLL}): |
@item fewer than @var{u}+1 items on the control-flow stack (@code{CS-PICK}, @code{CS-ROLL}): |
@cindex @code{CS-PICK}, fewer than @var{u}+1 items on the control flow stack |
@cindex @code{CS-PICK}, fewer than @var{u}+1 items on the control-flow stack |
@cindex @code{CS-ROLL}, fewer than @var{u}+1 items on the control flow stack |
@cindex @code{CS-ROLL}, fewer than @var{u}+1 items on the control-flow stack |
@cindex control-flow stack underflow |
@cindex control-flow stack underflow |
This typically results in an @code{abort"} with a descriptive error |
This typically results in an @code{abort"} with a descriptive error |
message (may change into a @code{-22 throw} (Control structure mismatch) |
message (may change into a @code{-22 throw} (Control structure mismatch) |
Line 8596 are applied to the latest defined word (
|
Line 8768 are applied to the latest defined word (
|
|
|
@item search order empty (@code{previous}): |
@item search order empty (@code{previous}): |
@cindex @code{previous}, search order empty |
@cindex @code{previous}, search order empty |
@cindex Vocstack empty, @code{previous} |
@cindex vocstack empty, @code{previous} |
@code{abort" Vocstack empty"}. |
@code{abort" Vocstack empty"}. |
|
|
@item too many word lists in search order (@code{also}): |
@item too many word lists in search order (@code{also}): |
@cindex @code{also}, too many word lists in search order |
@cindex @code{also}, too many word lists in search order |
@cindex Vocstack full, @code{also} |
@cindex vocstack full, @code{also} |
@code{abort" Vocstack full"}. |
@code{abort" Vocstack full"}. |
|
|
@end table |
@end table |
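
Several of the conditions listed above are reported as throw codes (for example, @code{-13} for an undefined word, and @code{-10} or @code{-55} for division by zero, depending on the platform). A program can intercept such codes with @code{catch}. The following is a minimal sketch; @code{try-div} and @code{try-eval} are hypothetical helper names, not Gforth words.

```forth
\ Sketch only: try-div and try-eval are hypothetical helper names.
: try-div ( n1 n2 -- n3 0 | x1 x2 ior )
  ['] / catch ;               \ ior is nonzero if / threw

: try-eval ( c-addr u -- ... 0 | x1 x2 ior )
  ['] evaluate catch ;        \ ior is nonzero if interpreting the text threw

6 3 try-div . .               \ prints the ior (0), then the quotient
1 0 try-div . 2drop           \ prints the throw code; discard the two cells
s" no-such-word-xyz" try-eval . 2drop
```

After a caught throw the stack depth is restored, but the cells below the throw code are not meaningful, which is why the sketch drops them.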
Line 8664 Signals?
|
Line 8836 Signals?
|
|
|
Accessing the Stacks |
Accessing the Stacks |
|
|
|
@c ****************************************************************** |
@node Emacs and Gforth, Image Files, Integrating Gforth, Top |
@node Emacs and Gforth, Image Files, Integrating Gforth, Top |
@chapter Emacs and Gforth |
@chapter Emacs and Gforth |
@cindex Emacs and Gforth |
@cindex Emacs and Gforth |
Line 8678 Accessing the Stacks
|
Line 8851 Accessing the Stacks
|
@cindex Forth mode in Emacs |
@cindex Forth mode in Emacs |
Gforth comes with @file{gforth.el}, an improved version of |
Gforth comes with @file{gforth.el}, an improved version of |
@file{forth.el} by Goran Rydqvist (included in the TILE package). The |
@file{forth.el} by Goran Rydqvist (included in the TILE package). The |
improvements are a better (but still not perfect) handling of |
improvements are: |
indentation. I have also added comment paragraph filling (@kbd{M-q}), |
|
commenting (@kbd{C-x \}) and uncommenting (@kbd{C-u C-x \}) regions and |
@itemize @bullet |
removing debugging tracers (@kbd{C-x ~}, @pxref{Debugging}). I left the |
@item |
stuff I do not use alone, even though some of it only makes sense for |
A better (but still not perfect) handling of indentation. |
TILE. To get a description of these features, enter Forth mode and type |
@item |
@kbd{C-h m}. |
Comment paragraph filling (@kbd{M-q}) |
|
@item |
|
Commenting (@kbd{C-x \}) and uncommenting (@kbd{C-u C-x \}) of regions |
|
@item |
|
Removal of debugging tracers (@kbd{C-x ~}, @pxref{Debugging}). |
|
@end itemize |
|
|
|
I left the stuff I do not use alone, even though some of it only makes |
|
sense for TILE. To get a description of these features, enter Forth mode |
|
and type @kbd{C-h m}. |
|
|
@cindex source location of error or debugging output in Emacs |
@cindex source location of error or debugging output in Emacs |
@cindex error output, finding the source location in Emacs |
@cindex error output, finding the source location in Emacs |
Line 8700 message is only a few keystrokes away (@
|
Line 8882 message is only a few keystrokes away (@
|
@cindex @file{TAGS} file |
@cindex @file{TAGS} file |
@cindex @file{etags.fs} |
@cindex @file{etags.fs} |
@cindex viewing the source of a word in Emacs |
@cindex viewing the source of a word in Emacs |
Also, if you @code{include} @file{etags.fs}, a new @file{TAGS} file |
Also, if you @code{include} @file{etags.fs}, a new @file{TAGS} file will |
(@pxref{Tags, , Tags Tables, emacs, Emacs Manual}) will be produced that |
be produced (@pxref{Tags, , Tags Tables, emacs, Emacs Manual}) that |
contains the definitions of all words defined afterwards. You can then |
contains the definitions of all words defined afterwards. You can then |
find the source for a word using @kbd{M-.}. Note that emacs can use |
find the source for a word using @kbd{M-.}. Note that emacs can use |
several tags files at the same time (e.g., one for the Gforth sources |
several tags files at the same time (e.g., one for the Gforth sources |
Line 8719 file:
|
Line 8901 file:
|
(setq auto-mode-alist (cons '("\\.fs\\'" . forth-mode) auto-mode-alist)) |
(setq auto-mode-alist (cons '("\\.fs\\'" . forth-mode) auto-mode-alist)) |
@end example |
@end example |
|
|
|
@c ****************************************************************** |
@node Image Files, Engine, Emacs and Gforth, Top |
@node Image Files, Engine, Emacs and Gforth, Top |
@chapter Image Files |
@chapter Image Files |
@cindex image files |
@cindex image file |
@cindex @code{.fi} files |
@cindex @file{.fi} files |
@cindex precompiled Forth code |
@cindex precompiled Forth code |
@cindex dictionary in persistent form |
@cindex dictionary in persistent form |
@cindex persistent form of dictionary |
@cindex persistent form of dictionary |
Line 8774 Our Forth system consists not only of pr
|
Line 8957 Our Forth system consists not only of pr
|
definitions written in Forth. Since the Forth compiler itself belongs to |
definitions written in Forth. Since the Forth compiler itself belongs to |
those definitions, it is not possible to start the system with the |
those definitions, it is not possible to start the system with the |
primitives and the Forth source alone. Therefore we provide the Forth |
primitives and the Forth source alone. Therefore we provide the Forth |
code as an image file in nearly executable form. At the start of the |
code as an image file in nearly executable form. When Gforth starts up, |
system a C routine loads the image file into memory, optionally |
a C routine loads the image file into memory, optionally relocates the |
relocates the addresses, then sets up the memory (stacks etc.) according |
addresses, then sets up the memory (stacks etc.) according to |
to information in the image file, and starts executing Forth code. |
information in the image file, and (finally) starts executing Forth |
|
code. |
|
|
The image file variants represent different compromises between the |
The image file variants represent different compromises between the |
goals of making it easy to generate image files and making them |
goals of making it easy to generate image files and making them |
portable. |
portable. |
|
|
@cindex relocation at run-time |
@cindex relocation at run-time |
Win32Forth 3.4 and Mitch Bradleys @code{cforth} use relocation at |
Win32Forth 3.4 and Mitch Bradley's @code{cforth} use relocation at |
run-time. This avoids many of the complications discussed below (image |
run-time. This avoids many of the complications discussed below (image |
files are data relocatable without further ado), but costs performance |
files are data relocatable without further ado), but costs performance |
(one addition per memory access). |
(one addition per memory access). |
|
|
@cindex relocation at load-time |
@cindex relocation at load-time |
By contrast, our loader performs relocation at image load time. The |
By contrast, the Gforth loader performs relocation at image load time. The |
loader also has to replace tokens standing for primitive calls with the |
loader also has to replace tokens that represent primitive calls with the |
appropriate code-field addresses (or code addresses in the case of |
appropriate code-field addresses (or code addresses in the case of |
direct threading). |
direct threading). |
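
The load-time relocation described above can be sketched in C. This is a minimal illustration under stated assumptions, not the Gforth loader: the function name @code{relocate}, the @code{Cell} type, and the one-bit-per-cell bitmap (most significant bit first) are all hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cell type: wide enough to hold an address. */
typedef uintptr_t Cell;

/*
 * Relocate an image in place at load time.
 * Cells whose bit is set in `bitmap` hold addresses stored as offsets
 * from the start of the image; adding `base` turns them into real
 * addresses.  All other cells are plain data and stay untouched.
 * Bitmap layout (an assumption): one bit per cell, MSB first.
 */
void relocate(Cell *image, size_t ncells,
              const unsigned char *bitmap, Cell base)
{
    assert(image != NULL && bitmap != NULL);
    for (size_t i = 0; i < ncells; i++) {
        if (bitmap[i / 8] & (0x80u >> (i % 8)))
            image[i] += base;
    }
}
```

Because the addition happens once per address cell when the image is loaded, no extra addition is needed on each memory access afterwards, which is the trade-off against run-time relocation mentioned above.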
|
|
Line 8809 caused by the design of the image file l
|
Line 8993 caused by the design of the image file l
|
@item |
@item |
There is only one segment; in particular, this means that an image file |
There is only one segment; in particular, this means that an image file |
cannot represent @code{ALLOCATE}d memory chunks (and pointers to |
cannot represent @code{ALLOCATE}d memory chunks (and pointers to |
them). And the contents of the stacks are not represented, either. |
them). The contents of the stacks are not represented, either. |
|
|
@item |
@item |
The only kinds of relocation supported are: adding the same offset to |
The only kinds of relocation supported are: adding the same offset to |
Line 8855 a place where it is stored in a non-mang
|
Line 9039 a place where it is stored in a non-mang
|
@node Non-Relocatable Image Files, Data-Relocatable Image Files, Image File Background, Image Files |
@node Non-Relocatable Image Files, Data-Relocatable Image Files, Image File Background, Image Files |
@section Non-Relocatable Image Files |
@section Non-Relocatable Image Files |
@cindex non-relocatable image files |
@cindex non-relocatable image files |
@cindex image files, non-relocatable |
@cindex image file, non-relocatable |
|
|
These files are simple memory dumps of the dictionary. They are specific |
These files are simple memory dumps of the dictionary. They are specific |
to the executable (i.e., @file{gforth} file) they were created |
to the executable (i.e., @file{gforth} file) they were created |
Line 8873 doc-savesystem
|
Line 9057 doc-savesystem
|
@node Data-Relocatable Image Files, Fully Relocatable Image Files, Non-Relocatable Image Files, Image Files |
@node Data-Relocatable Image Files, Fully Relocatable Image Files, Non-Relocatable Image Files, Image Files |
@section Data-Relocatable Image Files |
@section Data-Relocatable Image Files |
@cindex data-relocatable image files |
@cindex data-relocatable image files |
@cindex image files, data-relocatable |
@cindex image file, data-relocatable |
|
|
These files contain relocatable data addresses, but fixed code addresses |
These files contain relocatable data addresses, but fixed code addresses |
(instead of tokens). They are specific to the executable (i.e., |
(instead of tokens). They are specific to the executable (i.e., |
Line 8886 Relocatable Image Files}).
|
Line 9070 Relocatable Image Files}).
|
@node Fully Relocatable Image Files, Stack and Dictionary Sizes, Data-Relocatable Image Files, Image Files |
@node Fully Relocatable Image Files, Stack and Dictionary Sizes, Data-Relocatable Image Files, Image Files |
@section Fully Relocatable Image Files |
@section Fully Relocatable Image Files |
@cindex fully relocatable image files |
@cindex fully relocatable image files |
@cindex image files, fully relocatable |
@cindex image file, fully relocatable |
|
|
@cindex @file{kern*.fi}, relocatability |
@cindex @file{kern*.fi}, relocatability |
@cindex @file{gforth.fi}, relocatability |
@cindex @file{gforth.fi}, relocatability |
Line 9021 gforth -i @var{image}
|
Line 9205 gforth -i @var{image}
|
@end example |
@end example |
|
|
@cindex executable image file |
@cindex executable image file |
@cindex image files, executable |
@cindex image file, executable |
If your operating system supports starting scripts with a line of the |
If your operating system supports starting scripts with a line of the |
form @code{#! ...}, you just have to type the image file name to start |
form @code{#! ...}, you just have to type the image file name to start |
Gforth with this image file (note that the file extension @code{.fi} is |
Gforth with this image file (note that the file extension @code{.fi} is |
just a convention). I.e., to run Gforth with the image file @var{image}, |
just a convention). I.e., to run Gforth with the image file @var{image}, |
you can just type @var{image} instead of @code{gforth -i @var{image}}. |
you can just type @var{image} instead of @code{gforth -i @var{image}}. |
|
|
|
For example, if you place this text in a file: |
|
|
|
@example |
|
#! /usr/local/bin/gforth |
|
|
|
." Hello, world" CR |
|
bye |
|
|
|
@end example |
|
|
|
@noindent |
|
and then make the file executable (@code{chmod +x} in Unix), you can run it |
|
directly from the command line. The sequence @code{#!} is used in two |
|
ways: first, it is recognised as a ``magic sequence'' by the operating |
|
system; second, it is treated as a comment word by Gforth. Because |
|
Forth words are delimited by white space, a space is required between |
|
@code{#!} and the path to the executable. |
|
@comment TODO describe the #! magic with reference to the Power Tools book. |
|
|
doc-#! |
doc-#! |
|
|
@node Modifying the Startup Sequence, , Running Image Files, Image Files |
@node Modifying the Startup Sequence, , Running Image Files, Image Files |
Line 9037 doc-#!
|
Line 9240 doc-#!
|
@cindex initialization sequence of image file |
@cindex initialization sequence of image file |
|
|
You can add your own initialization to the startup sequence through the |
You can add your own initialization to the startup sequence through the |
deferred word |
deferred word @code{'cold}. @code{'cold} is invoked just before the |
|
image-specific command line processing (by default, loading files and |
doc-'cold |
evaluating (@code{-e}) strings) starts. |
|
|
@code{'cold} is invoked just before the image-specific command line |
|
processing (by default, loading files and evaluating (@code{-e}) strings) |
|
starts. |
|
|
|
A sequence for adding your initialization usually looks like this: |
A sequence for adding your initialization usually looks like this: |
|
|
Line 9055 A sequence for adding your initializatio
|
Line 9254 A sequence for adding your initializatio
|
@end example |
@end example |
|
|
@cindex turnkey image files |
@cindex turnkey image files |
@cindex image files, turnkey applications |
@cindex image file, turnkey applications |
You can make a turnkey image by letting @code{'cold} execute a word |
You can make a turnkey image by letting @code{'cold} execute a word |
(your turnkey application) that never returns; instead, it exits Gforth |
(your turnkey application) that never returns; instead, it exits Gforth |
via @code{bye} or @code{throw}. |
via @code{bye} or @code{throw}. |
Line 9063 via @code{bye} or @code{throw}.
|
Line 9262 via @code{bye} or @code{throw}.
|
@cindex command-line arguments, access |
@cindex command-line arguments, access |
@cindex arguments on the command line, access |
@cindex arguments on the command line, access |
You can access the (image-specific) command-line arguments through the |
You can access the (image-specific) command-line arguments through the |
variables @code{argc} and @code{argv}. @code{arg} provides conventient |
variables @code{argc} and @code{argv}. @code{arg} provides convenient |
access to @code{argv}. |
access to @code{argv}. |
|
|
|
If @code{'cold} exits normally, Gforth processes the command-line |
|
arguments as files to be loaded and strings to be evaluated. Therefore, |
|
@code{'cold} should remove the arguments it has used in this case. |
|
|
|
doc-'cold |
doc-argc |
doc-argc |
doc-argv |
doc-argv |
doc-arg |
doc-arg |
|
|
If @code{'cold} exits normally, Gforth processes the command-line |
|
arguments as files to be loaded and strings to be evaluated. Therefore, |
|
@code{'cold} should remove the arguments it has used in this case. |
|
|
|
@c ****************************************************************** |
@c ****************************************************************** |
@node Engine, Binding to System Library, Image Files, Top |
@node Engine, Binding to System Library, Image Files, Top |
Line 9080 arguments as files to be loaded and stri
|
Line 9281 arguments as files to be loaded and stri
|
@cindex engine |
@cindex engine |
@cindex virtual machine |
@cindex virtual machine |
|
|
Reading this section is not necessary for programming with Gforth. It |
Reading this chapter is not necessary for programming with Gforth. It |
may be helpful for finding your way in the Gforth sources. |
may be helpful for finding your way in the Gforth sources. |
|
|
The ideas in this chapter have also been published in the papers |
The ideas in this chapter have also been published in the papers |
Line 9100 Ertl, presented at EuroForth '93; the la
|
Line 9301 Ertl, presented at EuroForth '93; the la
|
@section Portability |
@section Portability |
@cindex engine portability |
@cindex engine portability |
|
|
One of the main goals of the effort is availability across a wide range |
An important goal of the Gforth Project is availability across a wide |
of personal machines. fig-Forth, and, to a lesser extent, F83, achieved |
range of personal machines. fig-Forth, and, to a lesser extent, F83, |
this goal by manually coding the engine in assembly language for several |
achieved this goal by manually coding the engine in assembly language |
then-popular processors. This approach is very labor-intensive and the |
for several then-popular processors. This approach is very |
results are short-lived due to progress in computer architecture. |
labor-intensive and the results are short-lived due to progress in |
|
computer architecture. |
|
|
@cindex C, using C for the engine |
@cindex C, using C for the engine |
Others have avoided this problem by coding in C, e.g., Mitch Bradley |
Others have avoided this problem by coding in C, e.g., Mitch Bradley |
Line 9169 makes it possible to take the address of
|
Line 9371 makes it possible to take the address of
|
@code{goto *@var{address}}. I.e., @code{goto *&&x} is the same as |
@code{goto *@var{address}}. I.e., @code{goto *&&x} is the same as |
@code{goto x}. |
@code{goto x}. |
|
|
@cindex NEXT, indirect threaded |
@cindex @code{NEXT}, indirect threaded |
@cindex indirect threaded inner interpreter |
@cindex indirect threaded inner interpreter |
@cindex inner interpreter, indirect threaded |
@cindex inner interpreter, indirect threaded |
With this feature an indirect threaded NEXT looks like: |
With this feature an indirect threaded @code{NEXT} looks like: |
@example |
@example |
cfa = *ip++; |
cfa = *ip++; |
ca = *cfa; |
ca = *cfa; |
Line 9186 executed; The @code{ca} (code address) f
|
Line 9388 executed; The @code{ca} (code address) f
|
executable code, e.g., a primitive or the colon definition handler |
executable code, e.g., a primitive or the colon definition handler |
@code{docol}. |
@code{docol}. |
|
|
@cindex NEXT, direct threaded |
@cindex @code{NEXT}, direct threaded |
@cindex direct threaded inner interpreter |
@cindex direct threaded inner interpreter |
@cindex inner interpreter, direct threaded |
@cindex inner interpreter, direct threaded |
Direct threading is even simpler: |
Direct threading is even simpler: |
Line 9196 goto *ca;
|
Line 9398 goto *ca;
|
@end example |
@end example |
|
|
Of course we have packaged the whole thing neatly in macros called |
Of course we have packaged the whole thing neatly in macros called |
@code{NEXT} and @code{NEXT1} (the part of NEXT after fetching the cfa). |
@code{NEXT} and @code{NEXT1} (the part of @code{NEXT} after fetching the cfa). |
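
As an illustration of the indirect threaded scheme shown above, here is a small self-contained sketch that packages the three statements in a @code{NEXT} macro and uses it to run a tiny threaded program. It relies on the GNU C labels-as-values extension, so it needs GCC or a compatible compiler. The function @code{add_threaded}, the primitives @code{lit}, @code{plus} and @code{halt}, and the layout of the code fields are assumptions made for this example; they are not taken from the Gforth sources.

```c
#include <assert.h>
#include <stdint.h>

typedef void *Label;      /* a code address, obtained with GNU C's && */
typedef intptr_t Cell;    /* one cell of threaded code or stack */

/* Indirect threaded NEXT: fetch the cfa, fetch the code address, jump. */
#define NEXT do { cfa = (Label *)*ip++; ca = *cfa; goto *ca; } while (0)

/* Runs the threaded program "lit a lit b plus halt" and returns a+b. */
Cell add_threaded(Cell a, Cell b)
{
    /* Code fields: each one holds the code address of a primitive. */
    Label cf_lit = &&lit, cf_plus = &&plus, cf_halt = &&halt;

    /* Threaded code: a sequence of cfas with inline literals. */
    Cell code[] = {
        (Cell)&cf_lit, a,
        (Cell)&cf_lit, b,
        (Cell)&cf_plus,
        (Cell)&cf_halt
    };
    Cell stack[4];
    Cell *sp = stack + 4;  /* data stack grows downward, starts empty */
    Cell *ip = code;       /* instruction pointer */
    Label *cfa;            /* code field address of the current word */
    Label ca;              /* code address found in the code field */

    assert(sizeof(Cell) == sizeof(void *));
    NEXT;

lit:                       /* push the inline literal that follows */
    *--sp = *ip++;
    NEXT;
plus:                      /* replace the two top items by their sum */
    sp[1] = sp[0] + sp[1];
    sp++;
    NEXT;
halt:                      /* leave the engine with the top of stack */
    return sp[0];
}
```

Calling @code{add_threaded(2, 3)} walks through the six cells of @code{code} and returns the sum. A direct threaded variant would store the code addresses in @code{code} itself and drop the @code{ca = *cfa} step.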
|
|
@menu |
@menu |
* Scheduling:: |
* Scheduling:: |
Line 9221 sp++;
|
Line 9423 sp++;
|
sp[0]=n; |
sp[0]=n; |
NEXT; |
NEXT; |
@end example |
@end example |
the NEXT comes strictly after the other code, i.e., there is nearly no |
the @code{NEXT} comes strictly after the other code, i.e., there is nearly no |
scheduling. After a little thought the problem becomes clear: The |
scheduling. After a little thought the problem becomes clear: The |
compiler cannot know that @code{sp} and @code{ip} point to different |
compiler cannot know that @code{sp} and @code{ip} point to different |
addresses (and the version of @code{gcc} we used would not know it even |
addresses (and the version of @code{gcc} we used would not know it even |
Line 9229 if it was possible), so it could not mov
|
Line 9431 if it was possible), so it could not mov
|
store to the TOS. Indeed the pointers could be the same, if code on or |
store to the TOS. Indeed the pointers could be the same, if code on or |
very near the top of stack were executed. In the interest of speed we |
very near the top of stack were executed. In the interest of speed we |
chose to forbid this probably unused ``feature'' and helped the compiler |
chose to forbid this probably unused ``feature'' and helped the compiler |
in scheduling: NEXT is divided into the loading part (@code{NEXT_P1}) |
in scheduling: @code{NEXT} is divided into the loading part (@code{NEXT_P1}) |
and the goto part (@code{NEXT_P2}). @code{+} now looks like: |
and the goto part (@code{NEXT_P2}). @code{+} now looks like: |
@example |
@example |
n=sp[0]+sp[1]; |
n=sp[0]+sp[1]; |
Line 9274 supported on all machines.
|
Line 9476 supported on all machines.
|
@subsection DOES> |
@subsection DOES> |
@cindex @code{DOES>} implementation |
@cindex @code{DOES>} implementation |
|
|
@cindex dodoes routine |
@cindex @code{dodoes} routine |
@cindex DOES-code |
@cindex @code{DOES>}-code |
One of the most complex parts of a Forth engine is @code{dodoes}, i.e., |
One of the most complex parts of a Forth engine is @code{dodoes}, i.e., |
the chunk of code executed by every word defined by a |
the chunk of code executed by every word defined by a |
@code{CREATE}...@code{DOES>} pair. The main problem here is: How to find |
@code{CREATE}...@code{DOES>} pair. The main problem here is: How to find |
the Forth code to be executed, i.e. the code after the |
the Forth code to be executed, i.e. the code after the |
@code{DOES>} (the DOES-code)? There are two solutions: |
@code{DOES>} (the @code{DOES>}-code)? There are two solutions: |
|
|
In fig-Forth the code field points directly to the @code{dodoes} and the |
In fig-Forth the code field points directly to the @code{dodoes} and the |
DOES-code address is stored in the cell after the code address (i.e. at |
@code{DOES>}-code address is stored in the cell after the code address (i.e. at |
@code{@var{cfa} cell+}). It may seem that this solution is illegal in |
@code{@var{CFA} cell+}). It may seem that this solution is illegal in |
the Forth-79 and all later standards, because in fig-Forth this address |
the Forth-79 and all later standards, because in fig-Forth this address |
lies in the body (which is illegal in these standards). However, by |
lies in the body (which is illegal in these standards). However, by |
making the code field larger for all words this solution becomes legal |
making the code field larger for all words this solution becomes legal |
Line 9296 to avoid having different image files fo
|
Line 9498 to avoid having different image files fo
|
systems (direct threaded systems require two-cell code fields on many |
systems (direct threaded systems require two-cell code fields on many |
machines). |
machines). |
|
|
@cindex DOES-handler |
@cindex @code{DOES>}-handler |
The other approach is that the code field points or jumps to the cell |
The other approach is that the code field points or jumps to the cell |
after @code{DOES}. In this variant there is a jump to @code{dodoes} at |
after @code{DOES>}. In this variant there is a jump to @code{dodoes} at |
this address (the DOES-handler). @code{dodoes} can then get the |
this address (the @code{DOES>}-handler). @code{dodoes} can then get the |
DOES-code address by computing the code address, i.e., the address of |
@code{DOES>}-code address by computing the code address, i.e., the address of |
the jump to @code{dodoes}, and adding the length of that jump field. A variant of |
the jump to @code{dodoes}, and adding the length of that jump field. A variant of |
this is to have a call to @code{dodoes} after the @code{DOES>}; then the |
this is to have a call to @code{dodoes} after the @code{DOES>}; then the |
return address (which can be found in the return register on RISCs) is |
return address (which can be found in the return register on RISCs) is |
the DOES-code address. Since the two cells available in the code field |
the @code{DOES>}-code address. Since the two cells available in the code field |
are used up by the jump to the code address in direct threading on many |
are used up by the jump to the code address in direct threading on many |
architectures, we use this approach for direct threading on these |
architectures, we use this approach for direct threading on these |
architectures. We did not want to add another cell to the code field. |
architectures. We did not want to add another cell to the code field. |
Line 9388 well and produces optimal code for @code
|
Line 9590 well and produces optimal code for @code
|
HP RISC machines: Defining the @code{n}s does not produce any code, and |
HP RISC machines: Defining the @code{n}s does not produce any code, and |
using them as intermediate storage also adds no cost. |
using them as intermediate storage also adds no cost. |
|
|
There are also other optimizations, that are not illustrated by this |
There are also other optimizations that are not illustrated by this |
example: Assignments between simple variables are usually for free (copy |
example: assignments between simple variables are usually for free (copy |
propagation). If one of the stack items is not used by the primitive |
propagation). If one of the stack items is not used by the primitive |
(e.g. in @code{drop}), the compiler eliminates the load from the stack |
(e.g. in @code{drop}), the compiler eliminates the load from the stack |
(dead code elimination). On the other hand, there are some things that |
(dead code elimination). On the other hand, there are some things that |
Line 9400 a stack item to the place where it just
|
Line 9602 a stack item to the place where it just
|
While programming a primitive is usually easy, there are a few cases |
While programming a primitive is usually easy, there are a few cases |
where the programmer has to take the actions of the generator into |
where the programmer has to take the actions of the generator into |
account, most notably @code{?dup}, but also words that do not (always) |
account, most notably @code{?dup}, but also words that do not (always) |
fall through to NEXT. |
fall through to @code{NEXT}. |
|
|
@node TOS Optimization, Produced code, Automatic Generation, Primitives |
@node TOS Optimization, Produced code, Automatic Generation, Primitives |
@subsection TOS Optimization |
@subsection TOS Optimization |
Line 9530 matmul 1.00 1.47 1.35 1.46 0.74
|
Line 9732 matmul 1.00 1.47 1.35 1.46 0.74
|
fib 1.00 1.52 1.34 1.22 0.86 1.74 2.99 4.30 |
fib 1.00 1.52 1.34 1.22 0.86 1.74 2.99 4.30 |
@end example |
@end example |
|
|
You may find the good performance of Gforth compared with the systems |
You may be quite surprised by the good performance of Gforth when |
written in assembly language quite surprising. One important reason for |
compared with systems written in assembly language. One important reason |
the disappointing performance of these systems is probably that they are |
for the disappointing performance of these other systems is probably |
not written optimally for the 486 (e.g., they use the @code{lods} |
that they are not written optimally for the 486 (e.g., they use the |
instruction). In addition, Win32Forth uses a comfortable, but costly |
@code{lods} instruction). In addition, Win32Forth uses a comfortable, |
method for relocating the Forth image: like @code{cforth}, it computes |
but costly method for relocating the Forth image: like @code{cforth}, it |
the actual addresses at run time, resulting in two address computations |
computes the actual addresses at run time, resulting in two address |
per NEXT (@pxref{Image File Background}). |
computations per @code{NEXT} (@pxref{Image File Background}). |
|
|
Only Eforth with the peephole optimizer performs comparable to |
Only Eforth with the peephole optimizer has a performance that is |
Gforth. The speedups achieved with peephole optimization of threaded |
comparable to that of Gforth. The speedups achieved with peephole optimization |
code are quite remarkable. Adding a peephole optimizer to Gforth should |
of threaded code are quite remarkable. Adding a peephole optimizer to |
cause similar speedups. |
Gforth should cause similar speedups. |
|
|
The speedup of Gforth over PFE, ThisForth and TILE can be easily |
The speedup of Gforth over PFE, ThisForth and TILE can be easily |
explained with the self-imposed restriction of the latter systems to |
explained with the self-imposed restriction of the latter systems to |
Line 9552 Vars, , Defining Global Register Variabl
|
Line 9754 Vars, , Defining Global Register Variabl
|
Moreover, current C compilers have a hard time optimizing other aspects |
Moreover, current C compilers have a hard time optimizing other aspects |
of the ThisForth and the TILE source. |
of the ThisForth and the TILE source. |
|
|
Note that the performance of Gforth on 386 architecture processors |
The performance of Gforth on 386 architecture processors varies widely |
varies widely with the version of @code{gcc} used. E.g., @code{gcc-2.5.8} |
with the version of @code{gcc} used. E.g., @code{gcc-2.5.8} failed to |
failed to allocate any of the virtual machine registers into real |
allocate any of the virtual machine registers into real machine |
machine registers by itself and would not work correctly with explicit |
registers by itself and would not work correctly with explicit register |
register declarations, giving a 1.3 times slower engine (on a 486DX2/66 |
declarations, giving a 1.3 times slower engine (on a 486DX2/66 running |
running the Sieve) than the one measured above. |
the Sieve) than the one measured above. |
|
|
Note also that there have been several releases of Win32Forth since the |
Note that there have been several releases of Win32Forth since the |
release presented here, so the results presented here may have little |
release presented here, so the results presented above may have little |
predictive value for the performance of Win32Forth today. |
predictive value for the performance of Win32Forth today. |
|
|
@cindex @file{Benchres} |
@cindex @file{Benchres} |
Line 9575 newer version of these measurements at
|
Line 9777 newer version of these measurements at
|
@url{http://www.complang.tuwien.ac.at/forth/performance.html}. You can |
@url{http://www.complang.tuwien.ac.at/forth/performance.html}. You can |
find numbers for Gforth on various machines in @file{Benchres}. |
find numbers for Gforth on various machines in @file{Benchres}. |
|
|
|
@c ****************************************************************** |
@node Binding to System Library, Cross Compiler, Engine, Top |
@node Binding to System Library, Cross Compiler, Engine, Top |
@chapter Binding to System Library |
@chapter Binding to System Library |
|
|
Line 9652 was developed across the Internet, and i
|
Line 9855 was developed across the Internet, and i
|
physically for the first 4 years of development. |
physically for the first 4 years of development. |
|
|
@section Pedigree |
@section Pedigree |
@cindex Pedigree of Gforth |
@cindex pedigree of Gforth |
|
|
Gforth descends from bigFORTH (1993) and fig-Forth. Gforth and PFE (by |
Gforth descends from bigFORTH (1993) and fig-Forth. Gforth and PFE (by |
Dirk Zoller) will cross-fertilize each other. Of course, a significant |
Dirk Zoller) will cross-fertilize each other. Of course, a significant |
Line 9702 information about Forth there.
|
Line 9905 information about Forth there.
|
|
|
@node Internet resources, Books, Forth-related information, Forth-related information |
@node Internet resources, Books, Forth-related information, Forth-related information |
@section Internet resources |
@section Internet resources |
@cindex Internet resources |
@cindex internet resources |
|
|
@cindex comp.lang.forth |
@cindex comp.lang.forth |
@cindex frequently asked questions |
@cindex frequently asked questions |
Line 9738 Research (JFAR) and a searchable Forth b
|
Line 9941 Research (JFAR) and a searchable Forth b
|
|
|
@node Books, The Forth Interest Group, Internet resources, Forth-related information |
@node Books, The Forth Interest Group, Internet resources, Forth-related information |
@section Books |
@section Books |
@cindex Books |
@cindex books on Forth |
|
|
As the Standard is relatively new, there are not many books out yet. It |
As the Standard is relatively new, there are not many books out yet. It |
is not recommended to learn Forth by using Gforth and a book that is not |
is not recommended to learn Forth by using Gforth and a book that is not |
Line 9749 should be ok, because ANS Forth is prima
|
Line 9952 should be ok, because ANS Forth is prima
|
@cindex standard document for ANS Forth |
@cindex standard document for ANS Forth |
@cindex ANS Forth document |
@cindex ANS Forth document |
The definitive reference if you want to write ANS Forth programs is, of |
The definitive reference if you want to write ANS Forth programs is, of |
course, the ANS Forth Standard. It is available in printed form from the |
course, the ANS Forth document. It is available in printed form from the |
National Standards Institute Sales Department (Tel.: USA (212) 642-4900; |
National Standards Institute Sales Department (Tel.: USA (212) 642-4900; |
Fax.: USA (212) 302-1286) as document @cite{X3.215-1994} for about |
Fax.: USA (212) 302-1286) as document @cite{X3.215-1994} for about |
$200. You can also get it from Global Engineering Documents (Tel.: USA |
$200. You can also get it from Global Engineering Documents (Tel.: USA |
Line 9763 format); this HTML version also includes
|
Line 9966 format); this HTML version also includes
|
Interpretation (RFIs). Some pointers to these versions can be found |
Interpretation (RFIs). Some pointers to these versions can be found |
through @*@url{http://www.complang.tuwien.ac.at/projects/forth.html}. |
through @*@url{http://www.complang.tuwien.ac.at/projects/forth.html}. |
|
|
@cindex introductory book |
@cindex introductory book on Forth |
@cindex book, introductory |
@cindex book on Forth, introductory |
@cindex Woehr, Jack: @cite{Forth: The New Model} |
@cindex Woehr, Jack: @cite{Forth: The New Model} |
@cindex @cite{Forth: The new model} (book) |
@cindex @cite{Forth: The new model} (book) |
@cite{Forth: The New Model} by Jack Woehr (Prentice-Hall, 1993) is an |
@cite{Forth: The New Model} by Jack Woehr (Prentice-Hall, 1993) is an |
Line 9795 hardly more useful than a pre-ANS book.
|
Line 9998 hardly more useful than a pre-ANS book.
|
@cindex Forth interest group (FIG) |
@cindex Forth interest group (FIG) |
|
|
The Forth Interest Group (FIG) is a world-wide, non-profit, |
The Forth Interest Group (FIG) is a world-wide, non-profit, |
member-supported organisation. It publishes a regular magazine and |
member-supported organisation. It publishes a regular magazine, |
offers other benefits of membership. You can contact the FIG through |
@cite{FORTH Dimensions}, and offers other benefits of membership. You can |
their office email address: @email{office@@forth.org} or by visiting |
contact the FIG through their office email address: |
their web site at @url{http://www.forth.org/}. This web site also |
@email{office@@forth.org} or by visiting their web site at |
includes links to FIG chapters in other countries and American cities |
@url{http://www.forth.org/}. This web site also includes links to FIG |
|
chapters in other countries and American cities |
(@url{http://www.forth.org/chapters.html}). |
(@url{http://www.forth.org/chapters.html}). |
|
|
@node Conferences, , The Forth Interest Group, Forth-related information |
@node Conferences, , The Forth Interest Group, Forth-related information |
Line 9807 includes links to FIG chapters in other
|
Line 10011 includes links to FIG chapters in other
|
@cindex Conferences |
@cindex Conferences |
|
|
There are several regular conferences related to Forth. They are all |
There are several regular conferences related to Forth. They are all |
well-publicised in FIG magazine and on the comp.lang.forth news group: |
well-publicised in @cite{FORTH Dimensions} and on the comp.lang.forth |
|
news group: |
|
|
@itemize @bullet |
@itemize @bullet |
@item |
@item |
Line 9824 EuroForth -- this European conference ta
|
Line 10029 EuroForth -- this European conference ta
|
@node Word Index, Concept Index, Forth-related information, Top |
@node Word Index, Concept Index, Forth-related information, Top |
@unnumbered Word Index |
@unnumbered Word Index |
|
|
This index is as incomplete as the manual. Each word is listed with |
This index is a list of Forth words that have ``glossary'' entries |
stack effect and wordset. |
within this manual. Each word is listed with its stack effect and |
|
wordset. |
|
|
@printindex fn |
@printindex fn |
|
|
@node Concept Index, , Word Index, Top |
@node Concept Index, , Word Index, Top |
@unnumbered Concept and Word Index |
@unnumbered Concept and Word Index |
|
|
This index is as incomplete as the manual. Not all entries listed are |
Not all entries listed in this index are present verbatim in the |
present verbatim in the text. Only the names are listed for the words |
text. This index also duplicates, in abbreviated form, all of the words |
here. |
listed in the Word Index (only the names are listed for the words here). |
|
|
@printindex cp |
@printindex cp |
|
|