Numbers shall be represented externally by using characters from the standard character set.
Conversion between the internal and external forms of a digit shall behave as follows:
The value in BASE is the radix for number conversion. A digit has a value ranging from zero to one less than the contents of BASE. The digit with the value zero corresponds to the character 0. This representation of digits proceeds through the character set to the decimal value nine corresponding to the character 9. For digits beginning with the decimal value ten the graphic characters beginning with the character A are used. This correspondence continues up to and including the digit with the decimal value thirty-five which is represented by the character Z. The conversion of digits outside this range is implementation defined.
Free-field number display uses the characters described in digit conversion, without leading zeros, in a field the exact size of the converted string plus a trailing space. If a number is zero, the least significant digit is not considered a leading zero. If the number is negative, a leading minus sign is displayed.
Number display may use the pictured numeric output string buffer to hold partially converted strings (see 3.3.3.6 Other transient regions).
Division produces a quotient q and a remainder r by dividing operand a by operand b. Division operations return q, r, or both. The identity b*q + r = a shall hold for all a and b.
When unsigned integers are divided and the remainder is not zero, q is the largest integer less than the true quotient.
When signed integers are divided, the remainder is not zero, and a and b have the same sign, q is the largest integer less than the true quotient. If only one operand is negative, whether q is rounded toward negative infinity (floored division) or rounded towards zero (symmetric division) is implementation defined.
Floored division is integer division in which the remainder carries the sign of the divisor or is zero, and the quotient is rounded to its arithmetic floor. Symmetric division is integer division in which the remainder carries the sign of the dividend or is zero and the quotient is the mathematical quotient rounded towards zero or truncated. Examples of each are shown in tables 3.3 and 3.4.
In cases where the operands differ in sign and the rounding direction matters, a program shall either include code generating the desired form of division, not relying on the implementation-defined default result, or have an environmental dependency on the desired rounding direction.
Table 3.3 - Floored Division Example
Dividend Divisor Remainder Quotient -------- ------- --------- -------- 10 7 3 1 -10 7 4 -2 10 -7 -4 -2 -10 -7 -3 1
Table 3.4 - Symmetric Division Example
Dividend Divisor Remainder Quotient -------- ------- --------- -------- 10 7 3 1 -10 7 -3 -1 10 -7 3 -1 -10 -7 -3 1
In all integer arithmetic operations, both overflow and underflow shall be ignored. The value returned when either overflow or underflow occurs is implementation defined.
Objects on the data stack shall be one cell wide.
The control-flow stack is a last-in, first out list whose elements define the permissible matchings of control-flow words and the restrictions imposed on data-stack usage during the compilation of control structures.
The elements of the control-flow stack are system-compilation data types.
The control-flow stack may, but need not, physically exist in an implementation. If it does exist, it may be, but need not be, implemented using the data stack. The format of the control-flow stack is implementation defined. Since the control-flow stack may be implemented using the data stack, items placed on the data stack are unavailable to a program after items are placed on the control-flow stack and remain unavailable until the control-flow stack items are removed.
Items on the return stack shall consist of one or more cells. A system may use the return stack in an implementation-dependent manner during the compilation of definitions, during the execution of do-loops, and for storing run-time nesting information.
A program may use the return stack for temporary storage during the execution of a definition subject to the following restrictions:
See 1.2.2 Exclusions.
The method of selecting the user input device is implementation defined.
The method of indicating the end of an input line of text is implementation defined.
The method of selecting the user output device is implementation defined.
A system need not provide any standard words for accessing mass storage. If a system provides any standard word for accessing mass storage, it shall also implement the Block word set.
The name spaces for ENVIRONMENT? and definitions are disjoint. Names of definitions that are the same as ENVIRONMENT? strings shall not impair the operation of ENVIRONMENT?. Table 3.5 contains the valid input strings and corresponding returned value for inquiring about the programming environment with ENVIRONMENT?.
Table 3.5 - Environmental Query Strings
String Value Constant? Meaning data type /COUNTED-STRING n yes maximum size of a counted string, in characters /HOLD n yes size of the pictured numeric output string buffer, in characters /PAD n yes size of the scratch area pointed to by PAD, in characters ADDRESS-UNIT-BITS n yes size of one address unit, in bits CORE flag no true if complete core word set present (i.e., not a subset as defined in 5.1.1) CORE-EXT flag no true if core extensions word set present FLOORED flag yes true if floored division is the default MAX-CHAR u yes maximum value of any character in the implementation-defined character set MAX-D d yes largest usable signed double number MAX-N n yes largest usable signed integer MAX-U u yes largest usable unsigned integer MAX-UD ud yes largest usable unsigned double number RETURN-STACK-CELLS n yes maximum size of the return stack, in cells STACK-CELLS n yes maximum size of the data stack, in cells
If an environmental query (using ENVIRONMENT?) returns false (i.e., unknown) in response to a string, subsequent queries using the same string may return true. If a query returns true (i.e., known) in response to a string, subsequent queries with the same string shall also return true. If a query designated as constant in the above table returns true and a value in response to a string, subsequent queries with the same string shall return true and the same value.
Forth words are organized into a structure called the dictionary. While the form of this structure is not specified by the Standard, it can be described as consisting of three logical parts: a name space, a code space, and a data space. The logical separation of these parts does not require their physical separation.
A program shall not fetch from or store into locations outside data space. An ambiguous condition exists if a program addresses name space or code space.
The relationship between name space and data space is implementation dependent.
The structure of a word list is implementation dependent. When duplicate names exist in a word list, the latest-defined duplicate shall be the one found during a search for the name.
Definition names shall contain {1..31} characters. A system may allow or prohibit the creation of definition names containing non-standard characters.
Programs that use lower case for standard definition names or depend on the case-sensitivity properties of a system have an environmental dependency.
A program shall not create definition names containing non-graphic characters.
The relationship between code space and data space is implementation dependent.
Data space is the only logical area of the dictionary for which standard words are provided to allocate and access regions of memory. These regions are: contiguous regions, variables, text-literal regions, input buffers, and other transient regions, each of which is described in the following sections. A program may read from or write into these regions unless otherwise specified.
Most addresses used in ANS Forth are aligned addresses (indicated by a-addr) or character-aligned (indicated by c-addr). ALIGNED, CHAR+, and arithmetic operations can alter the alignment state of an address on the stack. CHAR+ applied to an aligned address returns a character-aligned address that can only be used to access characters. Applying CHAR+ to a character-aligned address produces the succeeding character-aligned address. Adding or subtracting an arbitrary number to an address can produce an unaligned address that shall not be used to fetch or store anything. The only way to find the next aligned address is with ALIGNED. An ambiguous condition exists when @, !, , (comma), +!, 2@, or 2! is used with an address that is not aligned, or when C@, C!, or C, is used with an address that is not character-aligned.
The definitions of 6.1.1000 CREATE and 6.1.2410 VARIABLE require that the definitions created by them return aligned addresses.
After definitions are compiled or the word ALIGN is executed the data-space pointer is guaranteed to be aligned.
A system guarantees that a region of data space allocated using ALLOT, , (comma), C, (c-comma), and ALIGN shall be contiguous with the last region allocated with one of the above words, unless the restrictions in the following paragraphs apply. The data-space pointer HERE always identifies the beginning of the next data-space region to be allocated. As successive allocations are made, the data-space pointer increases. A program may perform address arithmetic within contiguously allocated regions. The last region of data space allocated using the above operators may be released by allocating a corresponding negatively-sized region using ALLOT, subject to the restrictions of the following paragraphs.
CREATE establishes the beginning of a contiguous region of data space, whose starting address is returned by the CREATEd definition. This region is terminated by compiling the next definition.
Since an implementation is free to allocate data space for use by code, the above operators need not produce contiguous regions of data space if definitions are added to or removed from the dictionary between allocations. An ambiguous condition exists if deallocated memory contains definitions.
The region allocated for a variable may be non-contiguous with regions subsequently allocated with , (comma) or ALLOT. For example, in:
VARIABLE X 1 CELLS ALLOT
the region X and the region ALLOTted could be non-contiguous.
Some system-provided variables, such as STATE, are restricted to read-only access.
The text-literal regions, specified by strings compiled with S" and C", may be read-only.
A program shall not store into the text-literal regions created by S" and C" nor into any read-only system variable or read-only transient regions. An ambiguous condition exists when a program attempts to store into read-only regions.
The address, length, and content of the input buffer may be transient. A program shall not write into the input buffer. In the absence of any optional word sets providing alternative input sources, the input buffer is either the terminal-input buffer, used by QUIT to hold one line from the user input device, or a buffer specified by EVALUATE. In all cases, SOURCE returns the beginning address and length in characters of the current input buffer.
The minimum size of the terminal-input buffer shall be 80 characters.
The address and length returned by SOURCE, the string returned by PARSE, and directly computed input-buffer addresses are valid only until the text interpreter does I/O to refill the input buffer or the input source is changed.
A program may modify the size of the parse area by changing the contents of >IN within the limits imposed by this Standard. For example, if the contents of >IN are saved before a parsing operation and restored afterwards, the text that was parsed will be available again for subsequent parsing operations. The extent of permissible repositioning using this method depends on the input source (see 7.3.3 Block buffer regions and 11.3.4 Input source).
A program may directly examine the input buffer using its address and length as returned by SOURCE; the beginning of the parse area within the input buffer is indexed by the number in >IN. The values are valid for a limited time. An ambiguous condition exists if a program modifies the contents of the input buffer.
The data space regions identified by PAD, WORD, and #> (the pictured numeric output string buffer) may be transient. Their addresses and contents may become invalid after:
The previous contents of the regions identified by WORD and #> may be invalid after each use of these words. Further, the regions returned by WORD and #> may overlap in memory. Consequently, use of one of these words can corrupt a region returned earlier by a different word. The other words that construct pictured numeric output strings (<#, #, #S, and HOLD) may also modify the contents of these regions. Words that display numbers may be implemented using pictured numeric output words. Consequently, . (dot), .R, .S, ?, D., D.R, U., and U.R could also corrupt the regions.
The size of the scratch area whose address is returned by PAD shall be at least 84 characters. The contents of the region addressed by PAD are intended to be under the complete control of the user: no words defined in this Standard place anything in the region, although changing data-space allocations as described in 3.3.3.2 Contiguous regions may change the address returned by PAD. Non-standard words provided by an implementation may use PAD, but such use shall be documented.
The size of the region identified by WORD shall be at least 33 characters.
The size of the pictured numeric output string buffer shall be at least (2*n) + 2 characters, where n is the number of bits in a cell. Programs that consider it a fixed area with unchanging access parameters have an environmental dependency.
Table of Contents
Next Section