version 1.24, 1995/11/15 17:29:07
|
version 1.27, 1995/12/04 16:38:53
|
Line 44 Copyright @copyright{} 1995 Free Softwar
|
Line 44 Copyright @copyright{} 1995 Free Softwar
|
@center for version 0.1 |
@center for version 0.1 |
@sp 2 |
@sp 2 |
@center Anton Ertl |
@center Anton Ertl |
|
@center Bernd Paysan |
@sp 3 |
@sp 3 |
@center This manual is under construction |
@center This manual is under construction |
|
|
Line 2088 machine code), and for defining the the
|
Line 2089 machine code), and for defining the the
|
nature of Gforth poses a few problems: First of all, Gforth runs on |
nature of Gforth poses a few problems: First of all, Gforth runs on |
several architectures, so it can provide no standard assembler. What's |
several architectures, so it can provide no standard assembler. What's |
worse is that the register allocation not only depends on the processor, |
worse is that the register allocation not only depends on the processor, |
but also on the gcc version and options used. |
but also on the @code{gcc} version and options used. |
|
|
The words Gforth offers encapsulate some system dependences (e.g., the |
The words that Gforth offers encapsulate some system dependences (e.g., the |
header structure), so a system-independent assembler may be used in |
header structure), so a system-independent assembler may be used in |
Gforth. If you do not have an assembler, you can compile machine code |
Gforth. If you do not have an assembler, you can compile machine code |
directly with @code{,} and @code{c,}. |
directly with @code{,} and @code{c,}. |
Line 2108 These words are rarely used. Therefore t
|
Line 2109 These words are rarely used. Therefore t
|
which is usually not loaded (except @code{flush-icache}, which is always |
which is usually not loaded (except @code{flush-icache}, which is always |
present). You can load them with @code{require code.fs}. |
present). You can load them with @code{require code.fs}. |
|
|
|
In the assembly code you will want to refer to the inner interpreter's |
|
registers (e.g., the data stack pointer) and you may want to use other |
|
registers for temporary storage. Unfortunately, the register allocation |
|
is installation-dependent. |
|
|
|
The easiest solution is to use explicit register declarations |
|
(@pxref{Explicit Reg Vars, , Variables in Specified Registers, gcc.info, |
|
GNU C Manual}) for all of the inner interpreter's registers: You have to |
|
compile Gforth with @code{-DFORCE_REG} (configure option |
|
@code{--enable-force-reg}) and the appropriate declarations must be |
|
present in the @code{machine.h} file (see @code{mips.h} for an example; |
|
you can find a full list of all declarable register symbols with |
|
@code{grep register engine.c}). If you give explicit registers to all |
|
variables that are declared at the beginning of @code{engine()}, you |
|
should be able to use the other caller-saved registers for temporary |
|
storage. Alternatively, you can use the @code{gcc} option |
|
@code{-ffixed-REG} (@pxref{Code Gen Options, , Options for Code |
|
Generation Conventions, gcc.info, GNU C Manual}) to reserve a register |
|
(however, this restriction on register allocation may slow Gforth |
|
significantly). |
|
|
|
If this solution is not viable (e.g., because @code{gcc} does not allow |
|
you to explicitly declare all the registers you need), you have to find |
|
out by looking at the code where the inner interpreter's registers |
|
reside and which registers can be used for temporary storage. You can |
|
get an assembly listing of the engine's code with @code{make engine.s}. |
|
|
|
In any case, it is good practice to abstract your assembly code from the |
|
actual register allocation. E.g., if the data stack pointer resides in |
|
register @code{$17}, create an alias for this register called @code{sp}, |
|
and use that in your assembly code. |
|
|
Another option for implementing normal and defining words efficiently |
Another option for implementing normal and defining words efficiently |
is to add the wanted functionality to the source of Gforth. For normal |
is to add the wanted functionality to the source of Gforth. For normal |
words you just have to edit @file{primitives}; defining words (for fast |
words you just have to edit @file{primitives}; defining words (for fast |
defined words) probably require changes in @file{engine.c}, |
defined words) may require changes in @file{engine.c}, |
@file{kernal.fs}, @file{prims2x.fs}, and possibly @file{cross.fs}. |
@file{kernal.fs}, @file{prims2x.fs}, and possibly @file{cross.fs}. |
|
|
|
|
Line 2384 characters is determined by the locale y
|
Line 2417 characters is determined by the locale y
|
|
|
@item division rounding: |
@item division rounding: |
installation dependent. @code{s" floored" environment? drop .}. We leave |
installation dependent. @code{s" floored" environment? drop .}. We leave |
the choice to gcc (what to use for @code{/}) and to you (whether to use |
the choice to @code{gcc} (what to use for @code{/}) and to you (whether to use |
@code{fm/mod}, @code{sm/rem} or simply @code{/}). |
@code{fm/mod}, @code{sm/rem} or simply @code{/}). |
|
|
@item values of @code{STATE} when true: |
@item values of @code{STATE} when true: |
Line 2495 The next invocation of a parsing word re
|
Line 2528 The next invocation of a parsing word re
|
Compiles a recursive call to the defining word, not to the defined word. |
Compiles a recursive call to the defining word, not to the defined word. |
|
|
@item argument input source different than current input source for @code{RESTORE-INPUT}: |
@item argument input source different than current input source for @code{RESTORE-INPUT}: |
!!???If the argument input source is a valid input source then it gets |
@code{-12 THROW}. Note that, once an input file is closed (e.g., because |
restored. Otherwise causes @code{-12 THROW}, which, unless caught, issues |
the end of the file was reached), its source-id may be |
the message "argument type mismatch" and aborts. |
reused. Therefore, restoring an input source specification referencing a |
|
closed file may lead to unpredictable results instead of a @code{-12 |
|
THROW}. |
|
|
|
In the future, Gforth may be able to restore input source specifications |
|
from other than the current input source. |
|
|
@item data space containing definitions gets de-allocated: |
@item data space containing definitions gets de-allocated: |
Deallocation with @code{allot} is not checked. This typically results in |
Deallocation with @code{allot} is not checked. This typically results in |
Line 2569 Not checked. As usual, you can expect me
|
Line 2607 Not checked. As usual, you can expect me
|
None. |
None. |
|
|
@item operator's terminal facilities available: |
@item operator's terminal facilities available: |
!!?? |
After processing the command line, Gforth goes into interactive mode, |
|
and you can give commands to Gforth interactively. The actual facilities |
|
available depend on how you invoke Gforth. |
|
|
@item program data space available: |
@item program data space available: |
@code{sp@ here - .} gives the space remaining for dictionary and data |
@code{sp@ here - .} gives the space remaining for dictionary and data |
stack together. |
stack together. |
|
|
@item return stack space available: |
@item return stack space available: |
!!?? |
By default 16 KBytes. The default can be overridden with the @code{-r} |
|
switch (@pxref{Invocation}) when Gforth starts up. |
|
|
@item stack space available: |
@item stack space available: |
@code{sp@ here - .} gives the space remaining for dictionary and data |
@code{sp@ here - .} gives the space remaining for dictionary and data |
Line 2897 System dependent; @code{REPRESENT} is im
|
Line 2938 System dependent; @code{REPRESENT} is im
|
function @code{ecvt()} and inherits its behaviour in this respect. |
function @code{ecvt()} and inherits its behaviour in this respect. |
|
|
@item rounding or truncation of floating-point numbers: |
@item rounding or truncation of floating-point numbers: |
What's the question?!! |
System dependent; the rounding behaviour is inherited from the hosting C |
|
compiler. IEEE-FP-based (i.e., most) systems by default round to |
|
nearest, and break ties by rounding to even (i.e., such that the last |
|
bit of the mantissa is 0). |
|
|
@item size of floating-point stack: |
@item size of floating-point stack: |
@code{s" FLOATING-STACK" environment? drop .}. Can be changed at startup |
@code{s" FLOATING-STACK" environment? drop .}. Can be changed at startup |
Line 3608 Sieve benchmark on a 486DX2/66 than Gfor
|
Line 3652 Sieve benchmark on a 486DX2/66 than Gfor
|
|
|
However, this potential advantage of assembly language implementations |
However, this potential advantage of assembly language implementations |
is not necessarily realized in complete Forth systems: We compared |
is not necessarily realized in complete Forth systems: We compared |
Gforth (compiled with @code{gcc-2.6.3} and @code{-DFORCE_REG}) with |
Gforth (direct threaded, compiled with @code{gcc-2.6.3} and |
Win32Forth 1.2093 and LMI's NT Forth (Beta, May 1994), two systems |
@code{-DFORCE_REG}) with Win32Forth 1.2093, LMI's NT Forth (Beta, May |
written in assembly, and with two systems written in C: PFE-0.9.11 |
1994) and Eforth (with and without peephole (aka pinhole) optimization |
(compiled with @code{gcc-2.6.3} with the default configuration for |
of the threaded code); all these systems were written in assembly |
Linux: @code{-O2 -fomit-frame-pointer -DUSE_REGS}) and ThisForth Beta |
language. We also compared Gforth with two systems written in C: |
(compiled with gcc-2.6.3 -O3 -fomit-frame-pointer). We benchmarked |
PFE-0.9.11 (compiled with @code{gcc-2.6.3} with the default |
Gforth, PFE and ThisForth on a 486DX2/66 under Linux. Kenneth O'Heskin |
configuration for Linux: @code{-O2 -fomit-frame-pointer -DUSE_REGS}) and |
kindly provided the results for Win32Forth and NT Forth on a 486DX2/66 |
ThisForth Beta (compiled with @code{gcc-2.6.3 -O3 -fomit-frame-pointer}; |
with similar memory performance under Windows NT. |
ThisForth employs peephole optimization of the threaded code). We |
|
benchmarked Gforth, PFE and ThisForth on a 486DX2/66 under |
|
Linux. Kenneth O'Heskin kindly provided the results for Win32Forth and |
|
NT Forth on a 486DX2/66 with similar memory performance under Windows |
|
NT. Marcel Hendrix ported Eforth to Linux, then extended it to run the |
|
benchmarks, added the peephole optimizer, ran the benchmarks and |
|
reported the results. |
|
|
We used four small benchmarks: the ubiquitous Sieve; bubble-sorting and |
We used four small benchmarks: the ubiquitous Sieve; bubble-sorting and |
matrix multiplication come from the Stanford integer benchmarks and have |
matrix multiplication come from the Stanford integer benchmarks and have |
Line 3628 other words, it shows the speedup factor
|
Line 3678 other words, it shows the speedup factor
|
other systems). |
other systems). |
|
|
@example |
@example |
relative Win32- NT This- |
relative Win32- NT eforth This- |
time Gforth Forth Forth PFE Forth |
time Gforth Forth Forth eforth +opt PFE Forth |
sieve 1.00 1.30 1.07 1.67 2.98 |
sieve 1.00 1.39 1.14 1.39 0.85 1.78 3.18 |
bubble 1.00 1.30 1.40 1.66 |
bubble 1.00 1.33 1.43 1.51 0.89 1.70 |
matmul 1.00 1.40 1.29 2.24 |
matmul 1.00 1.43 1.31 1.42 1.12 2.28 |
fib 1.00 1.44 1.26 1.82 2.82 |
fib 1.00 1.55 1.36 1.24 1.15 1.97 3.04 |
@end example |
@end example |
|
|
You may find the good performance of Gforth compared with the systems |
You may find the good performance of Gforth compared with the systems |
Line 3645 method for relocating the Forth image: l
|
Line 3695 method for relocating the Forth image: l
|
the actual addresses at run time, resulting in two address computations |
the actual addresses at run time, resulting in two address computations |
per NEXT (@pxref{System Architecture}). |
per NEXT (@pxref{System Architecture}). |
|
|
|
Only Eforth with the peephole optimizer performs comparably to |
|
Gforth. The speedups achieved with peephole optimization of threaded |
|
code are quite remarkable. Adding a peephole optimizer to Gforth should |
|
cause similar speedups. |
|
|
The speedup of Gforth over PFE and ThisForth can be easily explained |
The speedup of Gforth over PFE and ThisForth can be easily explained |
with the self-imposed restriction to standard C (although the measured |
with the self-imposed restriction to standard C (although the measured |
implementation of PFE uses a GNU C extension: global register |
implementation of PFE uses a GNU C extension: global register |
Line 3659 machine registers by itself and would no
|
Line 3714 machine registers by itself and would no
|
register declarations, giving a 1.3 times slower engine (on a 486DX2/66 |
register declarations, giving a 1.3 times slower engine (on a 486DX2/66 |
running the Sieve) than the one measured above. |
running the Sieve) than the one measured above. |
|
|
The numbers in this section have also been published in the paper |
In @cite{Translating Forth to Efficient C} by M. Anton Ertl and Martin |
@cite{Translating Forth to Efficient C} by M. Anton Ertl and Martin |
Maierhofer (presented at EuroForth '95), an indirect threaded version of |
Maierhofer, presented at EuroForth '95. It is available at |
Gforth is compared with Win32Forth, NT Forth, PFE, and ThisForth; that |
|
version of Gforth is 2\%@minus{}8\% slower on a 486 than the version |
|
used here. The paper is available at |
@*@file{http://www.complang.tuwien.ac.at/papers/ertl&maierhofer95.ps.gz}; |
@*@file{http://www.complang.tuwien.ac.at/papers/ertl&maierhofer95.ps.gz}; |
it also contains numbers for some native code systems. You can find |
it also contains numbers for some native code systems. You can find |
numbers for Gforth on various machines in @file{Benchres}. |
numbers for Gforth on various machines in @file{Benchres}. |
Line 3701 VolksForth descends from F83. It was wri
|
Line 3758 VolksForth descends from F83. It was wri
|
Pennemann, Georg Rehfeld and Dietrich Weineck for the C64 (called |
Pennemann, Georg Rehfeld and Dietrich Weineck for the C64 (called |
UltraForth there) in the mid-80s and ported to the Atari ST in 1986. |
UltraForth there) in the mid-80s and ported to the Atari ST in 1986. |
|
|
Laxen and Perry wrote F83 as a model implementation of the |
Henry Laxen and Mike Perry wrote F83 as a model implementation of the |
Forth-83 standard. !! Pedigree? When? |
Forth-83 standard. !! Pedigree? When? |
|
|
A team led by Bill Ragsdale implemented fig-Forth on many processors in |
A team led by Bill Ragsdale implemented fig-Forth on many processors in |