THREADING TECHNIQUES IN FORTHS FORTH has been implemented on almost every possible computer processor and this has lead to a wide variety of implementation styles. The discussion of 32 vs 16 bit implementations and of environments and operating systems is a subject for some other time. The purpose of this article is to describe the differences and implementations of various threading techniques: ITC, DTC, STC, Token, Compiled, Segment threaded, Status threaded, NOVIX, and others. Definitions of terms to be used: ITC - Indirect Threaded Code DTC - Direct Threaded Code STC - Subroutine Threaded Code CFA - Code Field Address PFA - Parameter Field Address Jcode - Jump/Call/Jump-to-subroutine code Ptr - Pointer Header - size and location of header depends on implementation. Could be inline, separated, or even non-existent Inner Interpreter - the 'routine' that controls the execution of FORTH words after they have been compiled Representation of ITC : /header/ /CFA/ /PFA /header/ /Ptr/ /list......./ /header/ /Ptr/ /code......./ /header/ /Ptr/ /data......./ CFA = address of machine language routine that will process the data in the PFA area. Points to PFA if definition is a code word. PFA = [ ':' definition ] contains list of pointers that point to the CFA's of other definitions. [ 'CODE' word ] contains executable machine language routine that usually ends with a transfer of control to the inner interpreter. [ data structures] contains data to be used ( any format needed) Representation of DTC : /header/ /CFA/ /PFA /header/ /Jcode /list......./ /header/ /Jcode /code......./ [ /header/ /code.............../ ] ( a variation ) /header/ /Jcode /data......./ CFA = a jump/call to machine language routine that will process the data in the PFA area. Points to PFA if definition is a code word. PFA = [ ':' definition ] contains list of pointers that point to the CFA's of other definitions. [ 'CODE' word ] contains executable machine language routine that usually ends with a transfer of control to the inner interpreter. [ data structures] contains data to be used ( any format needed) Representation of STC : /header/ /CFA/ /PFA A. /header/ /Jcode.Jcode.Jcode...../ ( list of Jcodes) B. /header/ /anycode.............../ ( not just calls) /header/ /Jcode /list........../ CFA = a jump/call to machine language routine that will process the data in the PFA area. ':' and 'code' words do not really have a CFA separate from the PFA unless so desired by the implementer of the system. PFA = [ ':' definition ] contains executable calls to other words (A.) or executable code of any kind (B.). Version B. is basically a true compiler and allows a certain amount of optimizing. In both cases, the inner interpreter is the machines program sequencer/counter and not a routine. [ 'CODE' word ] see [ ':' ] [ data structures] contains data to be used ( any format needed) Representation of Token threading : ----- ITC : ----- /header/ /CFA/ /PFA /header/ /Ptr/ /token.list.../ /header/ /Ptr/ /code........./ /header/ /Ptr/ /data........./ CFA = address of machine language routine that will process the data in the PFA area. Points to PFA if definition is a code word. PFA = [ ':' definition ] contains pointers to the token table that contains references to other words (token tables described below). [ 'CODE' word ] SEE ITC. [ data structures] SEE ITC. ----- DTC : ----- /header/ /CFA/ /PFA /header/ /Jcode /token.list.../ /header/ /Jcode /code........./ [ /header/ /code................./ ] ( a variation ) /header/ /Jcode /data........./ CFA = a jump/call to machine language routine that will process the data in the PFA area. Points to PFA if definition is a code word. PFA = [ ':' definition ] contains pointers to the token table that contains references to other words (token tables described below). [ 'CODE' word ] SEE DTC. [ data structures] SEE DTC. ----- STC : ----- /header/ /CFA/ /PFA A. /header/ /Jcode.Jcode.Jcode...../ ( list of Jcodes) B. /header/ /anycode.............../ ( not just calls) /header/ /Jcode /list........../ CFA = a jump/call to machine language routine that will process the data in the PFA area. ':' and 'code' words do not really have a CFA separate from the PFA unless so desired by the implementer of the system. PFA = [ ':' definition ] contains executable calls to other words (A.) or executable code of any kind (B.). Version B. is basically a true compiler and allows a certain amount of optimizing. In both cases, the inner interpreter is the machines program sequencer/counter and not a routine. A token reference would be a call to a table of jumps. This scheme has many variations. [ 'CODE' word ] see [ ':' ] [ data structures] contains data to be used ( any format needed) ----------- TOKEN TABLE ----------- The token table is little more than a list of references to definitions in the vocabularies. The way references are made varies according to the needs and whims of the designer. In a simple ITC/DTC system, the table simply contains the addresses of the words. /table_start addr_word_0 addr_word_1 ...... addr_word_n In a more elaborate version, /table_start modifier_0 / addr_word_0 modifier_1 / addr_word_1 ...... modifier_n / addr_word_n where modifier_x is additional information about the routine. Most commonly, the modifier is just a segment or bank-selection number. It can contain such information as 'available/not- avail/virtual' , relocation info, user #, security, network node, ad infinitum. A variation on this scheme has been proposed that would include a CFA either as modifier_x or in front of it. This method would allow a mix of methods in the table if desired. The CFA could be ITC, DTC, STC, or any other weirdness. The CFA can process the following addr_word_x in any manner before passing control. Token threading provides flexibility at the cost of extra indirection and slower execution. In large systems, it provides a way to use 8- and 16-bit references to address spaces larger than 64k without causing the word definitions to contain 32-bit PFA entries. Representation of other threading schemes : I will now attempt to present some other schemes, both used and proposed. This list is incomplete since the number of schemes is limited only by the imagination, memory size, and the trade-off of speed vs. design-needs. 1. Segment threading: for 8086, with limited usefulnesses in later 80x86 chips. PFA contains segment references (16-bit) and the inner interpreter always starts interpreting at a fixed offset in the word's segment. This scheme allows a 16-bit Forth to use the full 1-megabyte range of the 8086. The author of the scheme warns that later 80x86 chips have high overhead on segment register overrides. 2. Bit threading: for NOVIX NC4000. A single bit in a 16-bit opcode determines if the opcode is a call to routine or an executable instruction. This scheme allows a 1-cycle call to subroutine. Another bit in the opcode indicates a return from subroutine at end of execution of the opcode. In most cases, this allows a return with no cycle time or at most 1-cycle. Neat. Because of the NC4000's design, a single 16-bit opcode has the effect of several Forth instructions in many cases. Useful references : Journal of Forth Application and Research, Vol.2, no.4, 1984. (discusses extended addressing schemes) FORML Conference Proceedings, 1980-present. BYTE magazine, Sept. 1980, p.206, "Varieties of Threaded Code for Language Implementation". 'Threaded Interpretive Languages: Their Design and Implementation,' by R.G. Loeliger. Peterborough, NH: Byte Books, c1981. various articles in FORTH DIMENSIONS