Can locals be replaced?

There are regular discussions on comp.lang.forth where one side claims that locals are unnecessary and un-Forth-like. According to this position, programmers should just factor the code better, and the stack would prove sufficient.

While this is possible in many cases without resorting to deep stacks and without using pick and roll, in other cases I failed to find such a factoring, and I posted some of these cases as a challenge to those who hold the anti-locals position. I finally collected the challenges and the solutions posted up to now on this page.

Note that Chuck Moore uses variables to reduce the stack depth. An advantage of this approach over using locals is that the word can be split into factors without ado. It also has disadvantages: The words involved now have parameters and/or results that are not passed on the stack, which makes testing of individual words harder; there may be non-obvious data flow between words, which makes the program harder to understand; and the resulting code is not reentrant.

In my EuroForth 2011 paper I give an overview of techniques programmers can use to reduce the stack depth.

The challenges are:

The challenge is to implement the words without using locals, using the stacks instead, or maybe temporarily allocated memory. I.e., no USERs or VARIABLEs at least for F>BUF-RDP-TRY; for comp-i.fs VARIABLEs are less of a problem, but to me still a sign that stacks are not the answer. And the result should be at least as understandable and maintainable as the original; using factoring to achieve the latter is probably a good idea.

If you can make it more readable by factoring while still using locals, this would show the merits of factoring, but not the demerits of using locals.

A problem with these challenges is that they are taken from real code, call non-standard words (you have to look up the stack effects of these), and are unwieldy in other ways. This is in some way unavoidable: A designed (rather than real-world) example would appear contrived.


f>buf-rdp-try ( rf c-addr ur nd up um1 -- um2 ) is a factor of f.rdp, a floating-point output word. It tries to convert the FP number rf into a string in the buffer c-addr ur. The result is in either fixed-point or exponential notation, depending on the FP number and the other parameters (see f.rdp documentation). ur corresponds to the nr parameter of f.rdp, nd to the nd parameter, and up to the np parameter. um1 is the mantissa length to try, um2 is the proper mantissa length as determined by this try. Make another try with that mantissa length, and you get the proper conversion. The source code with its surroundings are here.
\ uses the following non-standard non-obvious words:
\ push-right ( c-addr u1 u2 cfill -- )
    \ move string at c-addr u1 right by u2 chars (without exceeding
    \ the original bound); fill the gap with cfill
\ <<# ( -- )
    \ Start a hold area that ends with '#>>'.
\ #>> ( -- )
    \ Release the hold area started with '<<#'.

: f>buf-rdp-try { f: rf c-addr ur nd up um1 -- um2 }
    \ um1 is the mantissa length to try, um2 is the actual mantissa length
    c-addr ur um1 /string '0 fill
    rf c-addr um1 represent if { nexp fsign }
        nd nexp + up >= up 0= or
        ur nd - 1- dup { beforep } fsign + nexp 0 max >= and if
            \ fixed-point notation
            c-addr ur beforep nexp - dup { befored } '0 push-right
            befored 1+ ur >= if \ <=1 digit left, will be pushed out by '.'
                rf fabs f2* 0.1e nd s>d d>f f** f> if \ round last digit
                    '1 c-addr befored + 1- c!
            c-addr beforep 1- befored min dup { beforez } 0 max bl fill
            fsign if
                '- c-addr beforez 1- 0 max + c!
            c-addr ur beforep /string 1 '. push-right
            nexp nd +
        else \ exponential notation
            c-addr ur 1 /string 1 '. push-right
            fsign if
                c-addr ur 1 '- push-right
            nexp 1- s>d tuck dabs <<# #s rot sign 'E hold #> { explen }
            ur explen - 1- fsign + { mantlen }
            mantlen 0< if \ exponent too large
                drop c-addr ur '* fill
                c-addr ur + 0 explen negate /string move
            #>> mantlen
    else \ inf or nan
        \ don't rely on REPRESENT result
        rf f0< if s" -Inf" else rf f0>= if s" Inf" else s" NaN" endif endif
        c-addr ur rot umin dup >r move c-addr ur r> /string blank
    1 max ur min ;
The problem is the big number of values around. One could factor out the three cases (fixed-point, exponential, and inf/nan), but the number of values around would stay the same, and we would still need locals.


This program is used when building Gforth to generate relocatable images: Two instances of Gforth load the same code at different addresses, then store the resulting non-relocatable images, and this program compares the images and finds out which cells are addresses and need relocation.

The code uses knowledge about the image file format, which is not really documented; this makes the code hard to grok. There are two long words with locals in this code: compare-images and comp-image.

\ non-standard words used:
\ image-data ( i-field expected-offset -- base offset )
\ image1 ( -- addr )
\ image2 ( -- addr )
\ write-cell ( x file-id -- ior )
\ reloc-bits ( -- addr )
\ set-bit ( u addr -- )
\ write-symbol ( acell mask file-id u -- )
: compare-images { size file-id -- }
    \G compares image1 and image2 (of size cells) and sets reloc-bits.
    \G offset is the difference for relocated addresses
    \ this definition is certainly to long and too complex, but is
    \ hard to factor.
    cr ."  code" cell  26 cells image-data { cbase coffset }
    ."    xt" 13 cells 22 cells image-data { xbase xoffset }
    ." label" 14 cells 18 cells image-data { lbase loffset }
    size 0 u+do
	image1 i th @ image2 i th @ { cell1 cell2 }
	    cell1 cell2 = ?of
		cell1 file-id write-cell throw endof
	    cell1 im-sects1 sect-reloc cell2 im-sects2 sect-reloc over = ?of
		file-id write-cell throw
		i reloc-bits set-bit endof
	    cell1 coffset + cell2 = ?of
		cell1 cbase - $4000 file-id i write-symbol endof
	    cell1 xoffset + cell2 = ?of
		cell1 xbase -     0 file-id i write-symbol endof
	    cell1 loffset + cell2 = ?of
		cell1 lbase - $8000 file-id i write-symbol endof
	    cell1 file-id write-cell throw
	    cell1 cell2 <> if
		0 i th 9 u.r cell1 17 u.r cell2 17 u.r cr
	0 endcase
    loop ;
Again, there is a large number of values in play here.
\ additional non-standard words used:
\ slurp-file ( c-addr1 u1 -- c-addr2 u2 )
\ size1 ( -- u )
\ size2 ( -- u )
\ prepare-sections ( -- )
\ reloc-size ( -- u )
\ reloc-bits ( -- addr )
: comp-image ( "image-file1" "image-file2" "new-image" -- )
    parse-name slurp-file { file1 fsize1 }
    file1 fsize1 s" Gforth5" search 0= abort" not a Gforth image"
    drop 8 + file1 - { header-offset }
    file1 fsize1 header-offset /string to size1 to image1
    size1 aligned size1 <> abort" unaligned image size"
    parse-name slurp-file header-offset /string to size2 to image2
    size1 size2 <> abort" image sizes differ"
    parse-name ( "new-image" ) w/o bin create-file throw { outfile }
    size1 1- cell/ bits/au / 1+ to reloc-size
    reloc-size allocate throw to reloc-bits
    reloc-bits reloc-size erase
    file1 header-offset outfile write-file throw
    base @ hex
    size1 aligned cell/  outfile  compare-images
    base !
    reloc-bits reloc-size outfile write-file throw
    outfile close-file throw ;
This word is just a straight-line word, which makes it relatively easy to understand despite its length (if you understand the components).

map-array (mostly solved)

This example is shorter and easier to understand:
: map-array ( ... addr u xt -- ... )
\ executes xt ( ... x -- ... ) for every element of the array starting
\ at addr and containing u elements
  { xt }
  cells over + swap ?do
    i @ xt execute
  1 cells +loop ;
Here the standard restrictions of ?do...+loop on return stack usage do not allow us to keep xt on the return stack the whole time, while we can keep it in a local the whole time. It is possible to push the xt on the return stack only during the EXECUTE:
: map-array ( ... addr u xt -- ... )
  rot rot cells over + swap ?do ( xt )
    i @ swap dup >r execute r>
  1 cells +loop
  drop ;
But that would mean quite a bit of stack busy work; not really an example that shows the superiority of not using locals. Andrew Haley suggests using macros instead:
: (map) ( a n - a a')   cells over + swap ;
: map[   postpone (map)  postpone ?do ; immediate
: ]map   postpone cell  postpone +loop ; immediate
This is used like map[ i @ + ]map instead of ['] + map-array and can only be used in a colon definition.
Anton Ertl