PARSE-WORD

Problem

How do we parse a word from the input stream?

PARSE does not skip leading delimiters, and it offers no way to specify white space as the delimiter.

WORD skips leading delimiters, but it has several drawbacks. You cannot specify white space as the delimiter. It returns a counted string (not the preferred representation), so the length of the string is limited by the count as well as by the buffer length. It requires a separate buffer, and copying to that buffer takes time. WORD also requires passing a delimiter, although skipping leading delimiters only makes sense for white-space delimiters. Finally, ANS Forth says little about the lifetime of the resulting string.

Proposal

PARSE-WORD  ( "name" -- c-addr u ) CORE-EXT
Skip leading white space and parse name delimited by a white space character.

c-addr is the address within the input buffer and u is the length of the selected string. If the parse area is empty or contains only white space, the resulting string has a zero length.
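
One way to define such a word in standard Forth is sketched below, using the familiar skip-then-scan approach. The names MY-PARSE-WORD, WHITE?, NON-WHITE? and SKIP-WHILE are hypothetical and were chosen only to avoid clashing with a system's own PARSE-WORD. Treating every character code up to and including BL as white space is an assumption, and /STRING comes from the String word set.

```forth
\ Sketch only: MY-PARSE-WORD, WHITE?, NON-WHITE? and SKIP-WHILE are
\ illustrative names, not part of the proposal.

: white? ( c -- flag )       \ assumption: codes up to BL count as white space
   bl 1+ u< ;

: non-white? ( c -- flag )
   white? 0= ;

: skip-while ( c-addr1 u1 xt -- c-addr2 u2 )
   \ drop leading characters for which xt ( c -- flag ) returns true
   >r
   BEGIN  dup  WHILE              \ any characters left?
      over c@ r@ execute  WHILE   \ and does the next one satisfy xt?
      1 /string
   REPEAT THEN
   r> drop ;

: my-parse-word ( "name" -- c-addr u )
   source >in @ /string               \ unparsed rest of the input buffer
   ['] white? skip-while  over >r     \ skip leading white space, save start
   ['] non-white? skip-while          ( c-addr-end u-rest ) ( R: c-addr-start )
   2dup 1 min + source drop - >in !   \ move >IN past the name and one delimiter
   drop r> tuck - ;                   \ leave c-addr-start and the length
```

With this definition, MY-PARSE-WORD some-name TYPE displays some-name, and an empty or all-blank parse area yields a string of length zero that still lies within the input buffer.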

Typical Use

PARSE-WORD some-name TYPE
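
Inside a colon definition, the zero-length result gives a natural loop exit. The following sketch assumes a system that already provides PARSE-WORD as proposed here; COUNT-WORDS is a hypothetical name used only for illustration.

```forth
\ Hypothetical example: count the blank-delimited words left in the parse area.
\ Assumes PARSE-WORD behaves as specified in this proposal.
: count-words ( "name1 ... nameN" -- n )
   0                          \ running count
   BEGIN
      parse-word nip          \ keep only the length of the next word
   WHILE                      \ length 0 means nothing but white space is left
      1+
   REPEAT ;

count-words  alpha beta   gamma
.   \ displays 3
```

The count is displayed on a separate line because COUNT-WORDS consumes everything remaining in the parse area, including any comment that would follow it on the same line.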

Remarks

Lifetime
The lifetime of the resulting string is specified implicitly through "within the input buffer", as is done in PARSE; i.e., the string will be usable until the next input buffer is read, for whatever reason (REFILL, INCLUDED, etc.). Should the lifetime be made more explicit?
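
A program that needs the string beyond that point has to copy it. The sketch below shows one way to do so, assuming a system that already provides PARSE-WORD as proposed; SAVE-STRING and KEEP-NAME are hypothetical names, and copying into ALLOTted dictionary space is just one possible choice of storage.

```forth
\ Hypothetical helpers, not part of the proposal.

: save-string ( c-addr1 u -- c-addr2 u )
   \ copy the string into freshly ALLOTted dictionary space
   here over chars allot      ( c-addr1 u c-addr2 )
   swap 2dup 2>r              ( c-addr1 c-addr2 u ) ( R: c-addr2 u )
   chars move  2r> ;

: keep-name ( "name" -- c-addr u )
   \ parse a name and return a copy that survives later refills
   parse-word save-string ;
```

The copy returned by KEEP-NAME no longer depends on the input buffer, so it stays valid after REFILL or INCLUDED has replaced the buffer contents.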
Existing practice
ANS Forth mentions a PARSE-WORD with essentially the same definition in A.6.2.2008. Open Firmware also defines PARSE-WORD with the same definition. The only difference between these definitions and the current definition is that the current definition makes it explicit what happens when there is only white space in the input buffer.

Several systems have implemented a PARSE-WORD compatible with this specification, e.g., Gforth.

A number of systems have been named that define a PARSE-WORD incompatible with this specification (e.g., they often pass a delimiter on the stack). The systems mentioned are MinForth, CHForth, JForth, and 4th. Of these systems, MinForth and CHForth are ANS Forth implementations; 4th and JForth are not (although 4th partially stays close to ANS Forth). Coos Haak (CHForth) indicated that the next version of CHForth will have a PARSE-WORD compatible with this specification.

Implementation and Tests

Experience

DEFER and IS have been used in many systems and programs for a long time. ACTION-OF also has been used, often under a different name (WHAT'S, WHATIS, IS?). DEFER@ and DEFER! also have been present in some systems (under different names); on many systems >BODY @/! were used instead.

Change history

Comments


Anton Ertl