5.9.8 User-defined Defining Words

You can create a new defining word by wrapping defining-time code around an existing defining word and putting the sequence in a colon definition.

For example, suppose that you have a word stats that gathers statistics about colon definitions given the xt of the definition, and you want every colon definition in your application to make a call to stats. You can define and use a new version of : like this:

     : stats ( xt -- ) DUP ." (Gathering statistics for " . ." )"
       ... ;  \ other code
     
     : my: : latestxt postpone literal ['] stats compile, ;
     
     my: foo + - ;

When foo is defined using my: these steps occur:

1. my: is executed.
2. The : within the definition (the one between my: and latestxt) is executed and does just what it always does: it parses the input stream for a name, builds a dictionary header for the name foo, and switches state from interpret to compile.
3. The word latestxt is executed. It puts the xt of the word that is being defined, foo, onto the stack.
4. The code that was produced by postpone literal is executed; this causes the value on the stack to be compiled as a literal in the code area of foo.
5. The code ['] stats compiled a literal into the definition of my: when my: was defined; that literal now pushes the execution token of stats. When compile, is executed, this execution token is laid down in the code area of foo, following the literal compiled in the previous step¹.
6. At this point, the execution of my: is complete, and control returns to the text interpreter. The text interpreter is in compile state, so the subsequent text + - is compiled into the definition of foo and the ; terminates the definition as always.

You can use see to decompile a word that was defined using my: and see how it is different from a normal : definition. For example:

     : bar + - ;  \ like foo but using : rather than my:
     see bar
     : bar
       + - ;
     see foo
     : foo
       107645672 stats + - ;
     
     \ use ' foo . to show that 107645672 is the xt for foo
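
As a runnable variation on the example above, the following sketch leaves out the elided part of stats, so that stats merely prints the xt and consumes it; that simplification is an assumption made for illustration. Note that latestxt is a Gforth word, not a standard one.

     \ simplified stats: print the xt and consume it (no further statistics)
     : stats ( xt -- )  ." (Gathering statistics for " . ." )" ;
     : my:  : latestxt postpone literal ['] stats compile, ;
     
     my: foo + - ;
     10 3 2 foo .   \ displays the statistics message, then 5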

You can use techniques like this to make new defining words in terms of any existing defining word.
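
For instance, the same pattern can be applied to CONSTANT. The sketch below is only an illustration; the names announced-constant and answer are hypothetical. The define-time code runs first, and CONSTANT then parses the name from the input stream just as it would if used directly.

     : announced-constant ( n "name" -- )
         dup ." defining a constant with value " . cr   \ define-time code
         constant ;                     \ existing defining word; parses "name"
     
     42 announced-constant answer       \ displays the message, defines answer
     answer .                           \ displays 42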

If you want the words defined with your defining words to behave differently from words defined with standard defining words, you can write your defining word like this:

     : def-word ( "name" -- )
         CREATE code1
     DOES> ( ... -- ... )
         code2 ;
     
     def-word name

This fragment defines a defining word def-word and then executes it. When def-word executes, it CREATEs a new word, name, and executes the code code1. The code code2 is not executed at this time. The word name is sometimes called a child of def-word.

When you execute name, the address of the body of name is put on the data stack and code2 is executed. The address of the body is the address that HERE returns immediately after the CREATE, i.e., the address that a CREATEd word returns by default.

You can use def-word to define a set of child words that behave similarly; they all have a common run-time behaviour determined by code2. Typically, the code1 sequence builds a data area in the body of the child word. The structure of the data is common to all children of def-word, but the data values are specific – and private – to each child word. When a child word is executed, the address of its private data area is passed as a parameter on TOS to be used and manipulated² by code2.

The two fragments of code that make up the defining word are executed at two completely separate times:

At define time, the defining word executes code1 to generate a child word.
At child execution time, when a child word is invoked, code2 is executed, using parameters (data) that are private and specific to the child word.
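
The two times can be seen in a small sketch of a defining word for cell arrays. The names array and scores are hypothetical and chosen only for illustration; no bounds checking is attempted.

     : array ( n "name" -- )
         create cells allot       \ code1: define time, reserve n cells
     does> ( i -- addr )
         swap cells + ;           \ code2: child execution time, index into body
     
     10 array scores              \ code1 runs now; code2 does not
     42 3 scores !                \ code2 runs: address of element 3
     3 scores @ .                 \ displays 42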

Another way of understanding the behaviour of def-word and name is to say that, if you make the following definitions:

     : def-word1 ( "name" -- )
         CREATE code1 ;
     
     : action1 ( ... -- ... )
         code2 ;
     
     def-word1 name1

then executing name1 action1 is equivalent to executing name.
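
To make the equivalence concrete, here is a sketch in which code1 is assumed to be , and code2 is assumed to be @ 1+ (both choices are arbitrary and only for illustration):

     : def-word ( n "name" -- )
         create ,
     does> ( -- n+1 )
         @ 1+ ;
     
     : def-word1 ( n "name" -- )
         create , ;
     
     : action1 ( addr -- n+1 )
         @ 1+ ;
     
     10 def-word name
     10 def-word1 name1
     
     name .             \ displays 11
     name1 action1 .    \ displays 11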

The classic example is that you can define CONSTANT in this way:

     : CONSTANT ( w "name" -- )
         CREATE ,
     DOES> ( -- w )
         @ ;

When you create a constant with 5 CONSTANT five, a set of define-time actions takes place: first a new word five is created, then the value 5 is laid down in the body of five with ,. When five is executed, the address of the body is put on the stack, and @ retrieves the value 5. The word five has no code of its own; it simply contains a data field and a pointer to the code that follows DOES> in its defining word. That makes words created in this way very compact.

The final example in this section is intended to remind you that space reserved in CREATEd words is data space and therefore can be both read and written by a Standard program³:

     : foo ( "name" -- )
         CREATE -1 ,
     DOES> ( -- )
         @ . ;
     
     foo first-word
     foo second-word
     
     123 ' first-word >BODY !

If first-word had been a plain CREATEd word (one without DOES> actions), we could simply have executed it to get the address of its data field. However, since it was defined to have DOES> actions, its execution semantics are to perform those DOES> actions. To get the address of its data field, it is necessary to use ' to get its xt, then >BODY to translate the xt into the address of the data field. When you execute first-word, it will display 123. When you execute second-word, it will display -1.
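
For contrast, here is a sketch of the plain CREATEd case; the name plain is hypothetical. Executing the word yields the address of its data field directly, and >BODY applied to its xt yields the same address.

     create plain -1 ,          \ no DOES> part
     123 plain !                \ executing plain gives the data field address
     plain @ .                  \ displays 123
     ' plain >body plain = .    \ displays -1 (true): both are the same address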

In the examples above the stack comment after the DOES> specifies the stack effect of the defined words, not the stack effect of the following code (the following code expects the address of the body on the top of stack, which is not reflected in the stack comment). This is the convention that I use and recommend (it clashes a bit with using locals declarations for stack effect specification, though).
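
A sketch of this convention, with the hidden body address shown in a second comment for clarity; the names counter and hits are hypothetical. The example also writes to the data field of the child from within the DOES> code.

     : counter ( "name" -- )
         create 0 ,
     does> ( -- n )             \ stack effect of the defined word
         ( addr )               \ what the code after DOES> actually receives
         dup @ 1+ dup rot ! ;   \ increment the stored count and return it
     
     counter hits
     hits .                     \ displays 1
     hits .                     \ displays 2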

Footnotes

[1] Strictly speaking, the mechanism that compile, uses to convert an xt into something in the code area is implementation-dependent. A threaded implementation might spit out the execution token directly whilst another implementation might spit out a native code sequence.

[2] It is legitimate both to read and write to this data area.

[3] Exercise: use this example as a starting point for your own implementation of Value and TO – if you get stuck, investigate the behaviour of ' and ['].