A.7 The optional Block word set

Early Forth systems ran stand-alone, with no host OS. Blocks of 1024 bytes were designed as a convenient unit of disk, and most native Forth systems still use them. It is relatively easy to write a native disk driver that maps head/track/sector addresses to block numbers. Such disk drivers are extremely fast in comparison with conventional file-oriented operating systems, and security is high because there is no reliance on a disk map.

Today many Forth implementations run under host operating systems, because the compatibility they offer the user outweighs the performance overhead. Many people who use such systems prefer using host OS files only; however, people who use both native and non-native Forths need a compatible way of accessing disk. The Block word set includes the most common words for accessing program source and data on disk.

In order to guarantee that Standard Programs that need access to mass storage have a mechanism appropriate for both native and non-native implementations, ANS Forth requires that the Block word set be available if any mass storage facilities are provided. On non-native implementations, blocks normally reside in host OS files.
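As an illustration of block-based data access, the following minimal sketch stores fixed-size records in consecutive blocks. The names /RECORD, RECS/BLOCK, FIRST-BLOCK, RECORD>BLOCK, RECORD and RECORD! and the values 64 and 100 are assumptions made for this example only; BLOCK and UPDATE are the Block word set words.

```forth
64 CONSTANT /RECORD                   \ hypothetical record size in bytes
1024 /RECORD / CONSTANT RECS/BLOCK    \ 16 such records fit in one block
100 CONSTANT FIRST-BLOCK              \ hypothetical first data block

: RECORD>BLOCK  ( rec# -- offset blk# )  \ locate a record
    RECS/BLOCK /MOD  FIRST-BLOCK +  SWAP /RECORD *  SWAP ;

: RECORD  ( rec# -- c-addr )          \ address of record in a block buffer
    RECORD>BLOCK BLOCK + ;

: RECORD!  ( c-addr u rec# -- )       \ copy at most /RECORD bytes, mark dirty
    RECORD SWAP /RECORD MIN MOVE  UPDATE ;
```

The same source should work whether blocks map directly onto disk sectors or reside in a host OS file, which is the portability the requirement above is meant to secure.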

A.7.2 Additional terms

block
Many Forth systems use blocks to contain program source. Conventionally such blocks are formatted for editing as 16 lines of 64 characters. Source blocks are often referred to as screens.
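A minimal sketch of how the 16-by-64 screen layout might be displayed follows. C/L and L/S are traditional but non-standard names, LINE-OFFSET, .LINE and SHOW are hypothetical, and -TRAILING comes from the String word set; a system that provides LIST already offers equivalent functionality.

```forth
64 CONSTANT C/L    \ characters per line (conventional, not standard)
16 CONSTANT L/S    \ lines per screen   (conventional, not standard)

: LINE-OFFSET  ( line# -- offset )  C/L * ;

: .LINE  ( line# blk# -- )          \ display one line without trailing blanks
    BLOCK SWAP LINE-OFFSET +  C/L -TRAILING TYPE ;

: SHOW  ( blk# -- )                 \ display all 16 lines, numbered
    L/S 0 DO  CR I 2 .R SPACE  I OVER .LINE  LOOP  DROP ;
```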

A.7.6 Glossary

A.7.6.2.2190 SCR

SCR is short for screen.

A.8 The optional Double-Number word set

Forth systems on 8-bit and 16-bit processors often find it necessary to deal with double-length numbers. But many Forths on small embedded systems do not, and many users of Forth on systems with a cell size of 32 bits or more find that the necessity for double-length numbers is much diminished. Therefore, we have factored the words that manipulate double-length entities into this optional word set.

Please note that the naming convention used in this word set conveys some important information:

1. Words whose names are of the form 2xxx deal with cell pairs, where the relationship between the cells is unspecified. They may be two-vectors, double-length numbers, or any pair of cells that it is convenient to manipulate together.

2. Words with names of the form Dxxx deal specifically with double-length integers.

3. Words with names of the form Mxxx deal with some combination of single and double integers. The order in which these appear on the stack is determined by long-standing common practice.

Refer to A.3.1 for a discussion of data types in Forth.
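The three naming families can be illustrated with a short sketch. ORIGIN, D-SUM and M-SUM are hypothetical names, and the numeric values are arbitrary.

```forth
\ 2xxx: an arbitrary pair of cells, here an x,y coordinate
10 20 2CONSTANT ORIGIN        ( -- x y )

\ Dxxx: both operands and the result are double-length integers
: D-SUM  ( -- d )  100000. 250000. D+ ;

\ Mxxx: mixed; a double-length integer plus a single-length integer
: M-SUM  ( -- d )  100000. 7 M+ ;
```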

A.8.6 Glossary

A.8.6.1.0360 2CONSTANT

Typical use: x1 x2 2CONSTANT name

A.8.6.1.0390 2LITERAL

Typical use: : X ... [ x1 x2 ] 2LITERAL ... ;

A.8.6.1.0440 2VARIABLE

Typical use: 2VARIABLE name
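The typical uses of 2CONSTANT, 2LITERAL and 2VARIABLE might be combined as in the following sketch. MILLION, TOTAL, RESET-TOTAL and ADD-MILLION are hypothetical names chosen for illustration.

```forth
1000000. 2CONSTANT MILLION          \ a double-length constant
2VARIABLE TOTAL                     \ storage for one cell pair

: RESET-TOTAL  ( -- )  [ 0. ] 2LITERAL  TOTAL 2! ;
: ADD-MILLION  ( -- )  TOTAL 2@  MILLION D+  TOTAL 2! ;
```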

A.8.6.1.1070 D.R

In D.R, the R is short for RIGHT.

A.8.6.1.1090 D2*

See: A.6.1.0320 2* for applicable discussion.

A.8.6.1.1100 D2/

See: A.6.1.0330 2/ for applicable discussion.

A.8.6.1.1140 D>S

There exist number representations, e.g., the sign-magnitude representation, where reduction from double- to single-precision cannot simply be done with DROP. This word, equivalent to DROP on two's complement systems, desensitizes application code to number representation and facilitates portability.

A.8.6.1.1820 M*/

M*/ was once described by Chuck Moore as the most useful arithmetic operator in Forth. It is the main workhorse in most computations involving double-cell numbers. Note that some systems allow signed divisors. This can cost a lot in performance on some CPUs. The requirement for a positive divisor has not proven to be a problem.
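A sketch of two common scaling operations built on M*/ follows. PERCENT and *PI are hypothetical names; 355/113 is a well-known rational approximation of pi. Because M*/ forms a triple-cell intermediate product, the multiplication should not overflow before the division is applied.

```forth
: PERCENT  ( d1 n -- d2 )  100 M*/ ;       \ n percent of d1
: *PI      ( d1 -- d2 )    355 113 M*/ ;   \ d1 times approximately pi
```

For example, 1000000. 15 PERCENT leaves the double-length value 150000 on the stack.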

A.8.6.1.1830 M+

M+ is the classical method for integrating.
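One way this might look in practice is a double-length accumulator that sums single-length samples. ACC and ACCUMULATE are hypothetical names used only in this sketch.

```forth
2VARIABLE ACC   0. ACC 2!            \ double-length running sum

: ACCUMULATE  ( n -- )               \ add one signed sample to the sum
    ACC 2@  ROT M+  ACC 2! ;
```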

A.9 The optional Exception word set

CATCH and THROW provide a reliable mechanism for handling exceptions, without having to propagate exception flags through multiple levels of word nesting. The mechanism is similar in spirit to the non-local return mechanisms of many other languages, such as C's setjmp() and longjmp(), and LISP's CATCH and THROW. In the Forth context, THROW may be described as a multi-level EXIT, with CATCH marking a location to which a THROW may return.

Several similar Forth multi-level EXIT exception-handling schemes have been described and used in past years. It is not possible to implement such a scheme using only standard words (other than CATCH and THROW), because there is no portable way to unwind the return stack to a predetermined place.

THROW also provides a convenient implementation technique for the standard words ABORT and ABORT", allowing an application to define, through the use of CATCH, the behavior in the event of a system ABORT.

This sample implementation of CATCH and THROW uses the non-standard words described below. They or their equivalents are available in many systems. Other implementation strategies, including directly saving the value of DEPTH, are possible if such words are not available.

SP@ ( -- addr ) returns the address corresponding to the top of data stack.

SP! ( addr -- ) sets the stack pointer to addr, thus restoring the stack depth to the same depth that existed just before addr was acquired by executing SP@.

RP@ ( -- addr ) returns the address corresponding to the top of return stack.

RP! ( addr -- ) sets the return stack pointer to addr, thus restoring the return stack depth to the same depth that existed just before addr was acquired by executing RP@.

VARIABLE HANDLER   0 HANDLER !  \ last exception handler

: CATCH  ( xt -- exception# | 0 ) \ return addr on stack
    SP@ >R         ( xt ) \ save data stack pointer
    HANDLER @ >R   ( xt ) \ and previous handler
    RP@ HANDLER !  ( xt ) \ set current handler
    EXECUTE        ( )    \ execute returns if no THROW
    R> HANDLER !   ( )    \ restore previous handler
    R> DROP        ( )    \ discard saved stack ptr
    0              ( 0 )  \ normal completion
;

: THROW  ( ??? exception# -- ??? exception# )
    ?DUP IF          ( exc# ) \ 0 THROW is no-op
      HANDLER @ RP!  ( exc# ) \ restore prev return stack
      R> HANDLER !   ( exc# ) \ restore prev handler
      R> SWAP >R     ( saved-sp ) \ exc# on return stack
      SP! DROP R>    ( exc# ) \ restore stack
      \  Return to the caller of CATCH because return
      \  stack is restored to the state that existed
      \  when CATCH began execution
    THEN
;

In a multi-tasking system, the HANDLER variable should be in the per-task variable area (i.e., a user variable).

This sample implementation does not explicitly handle the case in which CATCH has never been called (i.e., the ABORT behavior). One solution is to add the following code after the IF in THROW:

	HANDLER @ 0= IF ( empty the stack ) QUIT THEN

Another solution is to execute CATCH within QUIT, so that there is always an exception handler of last resort present. For example:

: QUIT      ( empty the return stack and )
            ( set the input source to the user input device )
    POSTPONE [
    BEGIN
      REFILL
    WHILE
      ['] INTERPRET CATCH
      CASE
      0 OF STATE @ 0= IF ." OK" THEN CR  ENDOF
     -1 OF ( Aborted) ENDOF
     -2 OF ( display  message from ABORT" ) ENDOF
      ( default ) DUP ." Exception # "  .
      ENDCASE
    REPEAT BYE
;

This example assumes the existence of a system-implementation word INTERPRET that embodies the text interpreter semantics described in 3.4 The Forth text interpreter. Note that this implementation of QUIT automatically handles the emptying of the stack and return stack, due to THROW's inherent restoration of the data and return stacks. Given this definition of QUIT, it's easy to define:

	: ABORT  -1 THROW ;

In systems with other stacks in addition to the data and return stacks, the implementation of CATCH and THROW must save and restore those stack pointers as well. Such an extended version can be built on top of this basic implementation. For example, with another stack pointer accessed with FP@ and FP! only CATCH needs to be redefined:

: CATCH  ( xt -- exception# | 0 )
    FP@ >R  CATCH  R> OVER IF FP! ELSE DROP THEN ;

No change to THROW is necessary in this case. Note that, as with all redefinitions, the redefined version of CATCH will only be available to definitions compiled after the redefinition of CATCH.

CATCH and THROW provide a convenient way for an implementation to clean up the state of open files if an exception occurs during the text interpretation of a file with INCLUDE-FILE. The implementation of INCLUDE-FILE may guard (with CATCH) the word that performs the text interpretation, and if CATCH returns an exception code, the file may be closed and the exception reTHROWn so that the files being included at an outer nesting level may be closed also. Note that the Standard allows, but does not require, INCLUDE-FILE to close its open files if an exception occurs. However, it does require INCLUDE-FILE to unnest the input source specification if an exception is THROWn.
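The guard-and-reTHROW pattern described above can be sketched in general form. This is not the implementation of INCLUDE-FILE; OPEN-COUNT, OPEN-RES, CLOSE-RES, WITH-RES and FAILING are hypothetical names, with a counter standing in for an open file.

```forth
VARIABLE OPEN-COUNT   0 OPEN-COUNT !     \ hypothetical count of open resources
: OPEN-RES   ( -- )   1 OPEN-COUNT +! ;  \ stands in for opening a file
: CLOSE-RES  ( -- )  -1 OPEN-COUNT +! ;  \ stands in for closing it

: WITH-RES  ( i*x xt -- j*x )
    OPEN-RES
    CATCH         ( j*x 0 | i*x exc# )
    CLOSE-RES     \ clean up on both the normal and the exception path
    THROW ;       \ 0 THROW is a no-op; otherwise reTHROW to the outer level

: FAILING  ( -- )  1 THROW ;             \ a word that always raises exception 1
```

Because each nesting level closes its own resource before reTHROWing, an exception raised at the innermost level unwinds through every outer guard in turn.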

A.9.3 Additional usage requirements

One important use of an exception handler is to maintain program control under many conditions which ABORT. This is practicable only if a range of codes is reserved. Note that an application may overload many standard words in such a way as to THROW ambiguous conditions not normally THROWn by a particular system.
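As a sketch of such overloading, an application might redefine / so that division by zero always produces a THROW, whatever the underlying system does. The code -10 is the value the Standard assigns to division by zero; SAFE/ is a hypothetical name, and it deliberately treats every exception as a failure.

```forth
: /  ( n1 n2 -- n3 )                 \ overloads the system's /
    DUP 0= IF  -10 THROW  THEN  / ;  \ the inner / is the previous definition

: SAFE/  ( n1 n2 -- n3 true | false )
    ['] / CATCH IF  2DROP FALSE  ELSE  TRUE  THEN ;
```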

A.9.3.6 Exception handling

The method of accomplishing this coupling is implementation dependent. For example, LOAD could know about CATCH and THROW (by using CATCH itself, for example), or CATCH and THROW could know about LOAD (by maintaining input source nesting information in a data structure known to THROW, for example). Under these circumstances it is not possible for a Standard Program to define words such as LOAD in a completely portable way.

A.9.6 Glossary

A.9.6.1.2275 THROW

If THROW is executed with a non-zero argument, the effect is as if the corresponding CATCH had returned it. In that case, the stack depth is the same as it was just before CATCH began execution. The values of the i*x stack arguments could have been modified arbitrarily during the execution of xt. In general, nothing useful may be done with those stack items, but since their number is known (because the stack depth is deterministic), the application may DROP them to return to a predictable stack state.

Typical use:

: could-fail ( -- char )
    KEY DUP [CHAR] Q =  IF  1 THROW THEN ;

: do-it ( a b -- c)   2DROP could-fail ;

: try-it ( --)
    1 2 ['] do-it  CATCH  IF ( x1 x2 )
        2DROP ." There was an exception" CR
    ELSE ." The character was " EMIT CR
    THEN
;

: retry-it ( -- )
    BEGIN  1 2 ['] do-it CATCH  WHILE
       ( x1 x2) 2DROP  ." Exception, keep trying" CR
    REPEAT ( char )
    ." The character was " EMIT CR
;

A.10 The optional Facility word set

A.10.6 Glossary

A.10.6.1.0742 AT-XY

Most implementors supply a method of positioning a cursor on a CRT screen, but there is great variance in names and stack arguments. This version is supported by at least one major vendor.

A.10.6.1.1755 KEY?

The Technical Committee has gone around several times on the stack effects. Whatever is decided will violate somebody's practice and penalize some machine. This way doesn't interfere with type-ahead on some systems, while requiring the implementation of a single-character buffer on machines where polling the keyboard inevitably results in the destruction of the character.

Use of KEY or KEY? indicates that the application does not wish to bother with non-character events, so they are discarded, in anticipation of eventually receiving a valid character. Applications wishing to handle non-character events must use EKEY and EKEY?. It is possible to mix uses of KEY? / KEY and EKEY? / EKEY within a single application, but the application must use KEY? and KEY only when it wishes to discard non-character events until a valid character is received.

A.10.6.2.1305 EKEY

EKEY provides a standard word to access a system-dependent set of raw keyboard events, including events corresponding to members of the standard character set, events corresponding to other members of the implementation-defined character set, and keystrokes that do not correspond to members of the character set.

EKEY assumes no particular numerical correspondence between particular event code values and the values representing standard characters. On some systems, this may allow two separate keys that correspond to the same standard character to be distinguished from one another.

In systems that combine both keyboard and mouse events into a single event stream, the single number returned by EKEY may be inadequate to represent the full range of input possibilities. In such systems, a single event record may include a time stamp, the x,y coordinates of the mouse position, the keyboard state, and the state of the mouse buttons. In such systems, it might be appropriate for EKEY to return the address of an event record from which the other information could be extracted.

Also, consider a hypothetical Forth system running under MS-DOS on a PC-compatible computer. Assume that the implementation-defined character set is the normal 8-bit PC character set. In that character set, the codes from 0 to 127 correspond to ASCII characters. The codes from 128 to 255 represent characters from various non-English languages, mathematical symbols, and some graphical symbols used for line drawing. In addition to those characters, the keyboard can generate various other scan codes, representing such non-character events as arrow keys and function keys.

There may be multiple keys, with different scan codes, corresponding to the same standard character. For example, the character representing the number 1 often appears both in the row of number keys above the alphabetic keys, and also in the separate numeric keypad.

When a program asks the MS-DOS operating system for a keyboard event, it receives either a single non-zero byte, representing a character, or a zero byte followed by a scan code byte, representing a non-character keyboard event (e.g., a function key).

EKEY represents each keyboard event as a single number, rather than as a sequence of numbers. For the system described above, the following would be a reasonable implementation of EKEY and related words:

The MAX-CHAR environmental query would return 255.

Assume the existence of a word DOS-KEY ( -- char ) which executes the MS-DOS Direct STDIN Input system call (Interrupt 21h, Function 07h) and a word DOS-KEY? ( -- flag) which executes the MS-DOS Check STDIN Status system call (Interrupt 21h, Function 0Bh).

: EKEY?  ( -- flag )  DOS-KEY?  0<>  ;

: EKEY  ( -- u )  DOS-KEY  ?DUP 0= IF  DOS-KEY 256 +  THEN ;

: EKEY>CHAR  ( u -- u false | char true )
    DUP 255 > IF             ( u )
      DUP 259 = IF           \ 259 is Ctrl-@ (ASCII NUL)
        DROP 0 TRUE EXIT     \ so replace with character
      THEN  FALSE EXIT       \ otherwise extended character
    THEN  TRUE               \ normal extended ASCII char.
;

VARIABLE PENDING-CHAR   -1 PENDING-CHAR !

: KEY?  ( -- flag )
    PENDING-CHAR @ 0< IF
      BEGIN  EKEY? WHILE
        EKEY EKEY>CHAR IF
          PENDING-CHAR !  TRUE EXIT
        THEN DROP
      REPEAT  FALSE EXIT
    THEN  TRUE
;

: KEY  ( -- char )
    PENDING-CHAR @ 0< IF
      BEGIN  EKEY  EKEY>CHAR 0=
      WHILE
        DROP
      REPEAT  EXIT
    THEN  PENDING-CHAR @  -1 PENDING-CHAR !
;

This is a full-featured implementation, providing the application program with an easy way to either handle non-character events (with EKEY), or to ignore them and to only consider real characters (with KEY).

Note that EKEY maps scan codes from 0 to 255 into numbers from 256 to 511. EKEY>CHAR maps the number 259, representing the keyboard combination Ctrl-Shift-@, to the character whose numerical value is 0 (ASCII NUL). Many ASCII keyboards generate ASCII NUL for Ctrl-Shift-@, so we use that key combination for ASCII NUL (which is otherwise unavailable from MS-DOS, because the zero byte signifies that another scan-code byte follows).

One consequence of using the Direct STDIN Input system call (function 7) instead of the STDIN Input system call (function 8) is that the normal DOS Ctrl-C interrupt behavior is disabled when the system is waiting for input (Ctrl-C would still cause an interrupt while characters are being output). On the other hand, if the STDIN Input system call (function 8) were used to implement EKEY, Ctrl-C interrupts would be enabled, but Ctrl-Shift-@ would also cause an interrupt, because the operating system would treat the second byte of the 0,3 sequence as a Ctrl-C, even though the 3 is really a scan code and not a character. One best-of-both-worlds solution is to use function 8 for the first byte received by EKEY, and function 7 for the scan code byte. For example:

: EKEY  ( -- u )
    DOS-KEY-FUNCTION-8  ?DUP  0=  IF
      DOS-KEY-FUNCTION-7 DUP 3  =  IF
        DROP 0  ELSE  256 +
      THEN
    THEN
;

Of course, if the Forth implementor chooses to pass Ctrl-C through to the program, without using it for its usual interrupt function, then DOS function 7 is appropriate in both cases (and some additional care must be taken to prevent a typed-ahead Ctrl-C from interrupting the Forth system during output operations).

A Forth system might also choose a simpler implementation of KEY, without implementing EKEY, as follows:

: KEY   ( -- char )  DOS-KEY  ;

: KEY?  ( -- flag )  DOS-KEY? 0<>  ;

The disadvantages of the simpler version are:

a) An application program that uses KEY, expecting to receive only valid characters, might receive a sequence of bytes (e.g., a zero byte followed by a byte with the same numerical value as the letter A) that appears to contain a valid character, even though the user pressed a key (e.g., function key 4) that does not correspond to any valid character.

b) An application program that wishes to handle non-character events will have to execute KEY twice if it returns zero the first time. This might appear to be a reasonable and easy thing to do. However, such code is not portable to other systems that do not use a zero byte as an escape code. Using the EKEY approach, the algorithm for handling keyboard events can be the same for all systems; the system dependencies can be reduced to a table or set of constants listing the system-dependent key codes used to access particular application functions. Without EKEY, the algorithm, not just the table, is likely to be system dependent.
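The table-of-constants approach mentioned in (b) might be sketched as follows. The constants assume the sample EKEY above, which returns 256 plus the scan code, and assume PC scan codes 72 and 80 for the up and down arrow keys; KEY-UP, KEY-DOWN, ROW and HANDLE-EVENT are hypothetical names. Only the two constants would need to change on another system.

```forth
\ System-dependent part: event codes as returned by this system's EKEY
256 72 + CONSTANT KEY-UP      \ assumed: up-arrow scan code 72
256 80 + CONSTANT KEY-DOWN    \ assumed: down-arrow scan code 80

\ System-independent part: the event-handling algorithm
VARIABLE ROW   0 ROW !
: HANDLE-EVENT  ( u -- )
    CASE
      KEY-UP    OF  -1 ROW +!  ENDOF
      KEY-DOWN  OF   1 ROW +!  ENDOF
    ENDCASE ;                 \ any other event is ignored
```

An application loop would pass each value returned by EKEY to HANDLE-EVENT, using EKEY>CHAR for any event it wants to treat as an ordinary character.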

Another approach to EKEY on MS-DOS is to use the BIOS Read Keyboard Status function (Interrupt 16h, Function 01h) or the related Check Keyboard function (Interrupt 16h, Function 11h). The advantage of this function is that it allows the program to distinguish between different keys that correspond to the same character (e.g. the two 1 keys). The disadvantage is that the BIOS keyboard functions read only the keyboard. They cannot be redirected to another standard input source, as can the DOS STDIN Input functions.

A.10.6.2.1306 EKEY>CHAR

EKEY>CHAR translates a keyboard event into the corresponding member of the character set, if such a correspondence exists for that event.

It is possible that several different keyboard events may correspond to the same character, and other keyboard events may correspond to no character.

A.10.6.2.1325 EMIT?

An indefinite delay is a device related condition, such as printer off-line, that requires operator intervention before the device will accept new data.

A.10.6.2.1905 MS

Although their frequencies vary, every system has a clock. Since many programs need to time intervals, this word is offered. The millisecond is a practical least-common-denominator external unit of time. It is assumed implementors will use clock ticks (whatever size they are) as an internal unit and convert as appropriate.
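A longer delay might be built on MS as in the sketch below. SECONDS is a hypothetical name; looping over one-second waits, rather than multiplying by 1000, is a design choice that avoids overflowing a 16-bit cell for large arguments.

```forth
: SECONDS  ( u -- )                  \ wait at least u seconds
    0 ?DO  1000 MS  LOOP ;
```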

A.10.6.2.2292 TIME&DATE

Most systems have a real-time clock/calendar. This word gives portable access to it.
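As a sketch of typical use, the following displays the current date in year-month-day form. 2DIGITS and .DATE are hypothetical names; the stack comment assumes the order in which TIME&DATE returns its six results, with the year on top.

```forth
: 2DIGITS  ( u -- c-addr u2 )        \ u as exactly two decimal digits
    0 <# # # #> ;

: .DATE  ( -- )
    TIME&DATE                        ( sec min hour day month year )
    0 <# # # # # #> TYPE             \ year, four digits
    [CHAR] - EMIT  2DIGITS TYPE      \ month
    [CHAR] - EMIT  2DIGITS TYPE      \ day
    DROP 2DROP ;                     \ discard hour, minute, second
```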
