Best of GEnie..... November 1988
News from the GEnie Forth RoundTable
by Gary Smith

I have already devoted one column to the efforts of the X3/J14 Technical Committee, their task of defining the future structure of an ANS Standard Forth, and input to that effort on GEnie. The working draft of X3/J14 is BASISn, where n is the current level. I had commented briefly on some notable differences between BASIS4N and BASIS5 - very briefly. A week later Bob Berkey literally dissected BASIS5 by verse and line. The following message from Bob Berkey and Greg Bailey's rebuttal serve to reflect the nature of both better than anything I might say to induce your participation in this endeavor. Please read this exchange, make notes as you agree or disagree, and put them on GEnie in Category 10 for all to benefit from.

Category 10, Topic 2
Message 124 Sun Oct 09, 1988
R.BERKEY at 09:39 PDT

Notes, comments, reactions and opinions--after 24 hours with BASIS5

Page  Note

p. 2  The Forth 77 and Forth 78 documents are no longer "pertinent".

p. 3  But parsing can be delimited by a string or a set of strings, and doesn't have to throw away leading delimiters. Suggest that the parsing concept here be called "word parsing." Parsing that only looks for a single trailing delimiter could be called "delimiter parsing".

p. 4  "address, compilation" has become "compilation token" and "execution token". I like the use of "compilation token", because "compilation address" in 83 was a misnomer.

p. 5  character. Character sets are now implementation defined, and not necessarily ASCII. Wow, this is a real biggie. It's not possible right now for a program to know what characters are in the character set. Numbers can be detected with CONVERT , but that's about it. It appears that output has new restrictions--can't print addresses or bit-masks.

p. 42  Now that a char is implementation defined, ASCII char is a misnomer. If I am on an EBCDIC system will ASCII . give me the value of an EBCDIC "."
(as BASIS5 now states)? From my viewpoint either the implementation defined character set needs to go or ASCII and [ASCII] should go.

All  Propose that whenever a phrase from the Definitions of Terms is used that it be italicized. This would help in recognizing the intent of the language. The typesetting looks great. One exception, I'd like to see the Forth words without justification. Also, I find that Forth words without a trailing space are hard to syntactically distinguish from following punctuation. In fact it might be more correct and useful to have the "word name" definition specify that a word name includes a trailing space--I hope to develop a proposal or a paper on this.

All  The word "cell" is back! In 79 it meant a 16-bit word. It took a vacation during Forth-83, and is now back as 16 or more bits. But, has this word been dormant long enough for its former meaning not to cause confusion? I doubt it.

p. 44  BLOCK is no longer in the required word set. Give me a 100 squared with a terminal and I will be able to have an ANS standard Forth! Since FILE words are also not required, a Standard Program no longer has a mass storage capability.

p. 16  Variables can not be ticked ('). Don't know why.

p. 19  PARSE does not appear in the glossary.

p. 24  This is the page with stack parameter abbreviations. Things are starting to get hairy. The wrap-around number type (w) has been eliminated from the standard along with the arbitrary bits number type (8b, 16b, 32b). A new number type, unspecified, has been added, but using the old label w. Oh no, DON'T REUSE THE OLD "w" WITH A NEW DEFINITION!!!! FIND ANOTHER LETTER!!!!!!!!!!! What's wrong with "x"?

The confusion of the 83 "w" and the BASIS5 "w" is pointless and I trust the committee will change it, but the deeper change involved is technically a dynamite issue. Any of the words in 83 that had wrap-around numbers have, from the programmer's viewpoint, been radically altered.

Take + for example.
In BASIS5 it has a stack of ( n1 n2 -- n3 ). n has a range of -32767 to 32767 or larger. In 83 any input to + ( w1 w2 -- w3 ) produces a known output. The BASIS5 + allows the program to use fewer than half of these combinations--$7FE,002 compared to $1,000,000 if you are counting.

Oh no, I'm not believing this. LOOP is back to a sort of 79 definition. This must be just a mistake. Mustn't it? Maybe not, what with consideration for overflow on a one's complement machine. Yet, now that I look further at the semantics under DO , I see that w DUP DO ... LOOP will execute at least 65,536 times. The two ideas seem contradictory so no doubt more changes are coming here.

Here's another implication with the lost "w": Forth can no longer detect a carry. The idea is to use D+ instead of + to add two numbers, and if the top of the stack is non-zero, a carry has occurred.

p. 24  addr is now a data type rather than a number type. This has yet-to-be-discovered implications. char, bit-mask, and r (real) are also new data types. Specification is needed to know what input, arithmetic, logical, and output operators, etc., can be used with what data types.

p. 28  " ("quote") New word compiles a string. It leaves addr and count. Good. CONVERT (with a new name) should also have addr and count. One proposal is to use a new name: NUMBER? ( adr count -- flag )

p. 3  ( ) ABORT" " .( ) and ." " will produce unexpected behavior. Only " (quote) has survived the capacity to handle a null string. I think this is a bad change. ABORT" " I use regularly, and the other three cases make Forth look flaky and/or unreliable.

p. 56  FIND This word still refers to compilation address, which phrase is no longer in the definition of terms. Should have been changed along with ' ['] EXECUTE and >BODY to execution token(?)

p. 42  Code and data are separated in the 1983 standard. , "comma" ALLOT and HERE are properties of the program, that the system can borrow during compilation if desired.
COMPILE can ignore HERE .

D2* and 2* are an interesting pair: D2* is a logical shift, and 2* is an arithmetic shift. Does this mean something?

And, if you said to yourself "d-2-times" and "d-times" as you read that last sentence, they are now "d-2-star" and "d-star". First they told me it's "star", then they said in 1980 (Forth-79) that it's "times". So for eight years I've tried to use "times" and now I am supposed to go back to "star". I assume Forth, Inc. is involved in this because they let Brodie publish those cutesy and memorable pictures in his book with the non-standard names. Hmmm, come to think of it, one of his more memorable pictures was "slash", and now I see that "divide" has not been changed, just "times". So this is yet a third name for */ and */MOD (?!?)

As I think about how I pronounce these words I find there is a dichotomy between "star-slash" and "times-divide".

D2*      always "d-2-times"
2*       always "2-times"
UM*      always "u-m-times"
*        sometimes "star" sometimes "times"
*/       more often "star-slash" than "times-divide"
*/MOD    more often "star-slash-mod" than "times-divide-mod"
D2/      usually "d-2-slash"
UM/MOD   usually "u-m-slash-mod"

Is that a pattern? I guess I learned it one way early on and still don't have it switched over. It appears to me that there are two sets of pronunciations in general use. Personally, I'd rather not see the standard pronunciations changed. There is an alternative, however, which may seem radical, but also fits our world--TWO pronunciations in BASIS6, Brodie and 79-Standard. But a third and new pronunciation? I hope that doesn't last.

And another naming decision from 79 has been overturned. The pictured numeric words are no longer "sharp" but "number-sign". As in <# "less-number-sign".

p. 69  new word: UNDO

p. 32-68  new words: <= <> >= 0<= 0<> 0>= U<= U>= U> Nine new words in the controlled reference section

p. 34  new words: 2>R 2R> The last two concern me because they are not the same as R> R> and >R >R .
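
The ordering concern with 2>R and 2R> can be sketched with a small Python model of the two stacks. This is only an illustration: it assumes the behavior later fixed in ANS Forth, where 2>R acts like SWAP >R >R and 2R> acts like R> R> SWAP, and BASIS5 itself may have worded it differently. The function names are hypothetical stand-ins for the Forth words.

```python
# Toy model: Python lists stand in for the data stack and the return stack.
# ASSUMPTION: 2>R behaves like SWAP >R >R and 2R> like R> R> SWAP
# (the ANS Forth definitions); the BASIS5 wording is not quoted here.

def to_r(data, ret):
    """>R : move the top data-stack item to the return stack."""
    ret.append(data.pop())

def r_from(data, ret):
    """R> : move the top return-stack item to the data stack."""
    data.append(ret.pop())

def two_to_r(data, ret):
    """2>R : move a pair to the return stack, keeping its order."""
    x2 = data.pop()
    x1 = data.pop()
    ret.append(x1)
    ret.append(x2)

def two_r_from(data, ret):
    """2R> : move a pair back to the data stack, keeping its order."""
    x2 = ret.pop()
    x1 = ret.pop()
    data.append(x1)
    data.append(x2)

def run(words, data):
    """Apply a sequence of word functions to a copy of the data stack."""
    data, ret = list(data), []
    for word in words:
        word(data, ret)
    return data

matched = run([two_to_r, two_r_from], [1, 2])        # 2>R 2R>
mixed_out = run([two_to_r, r_from, r_from], [1, 2])  # 2>R R> R>
mixed_in = run([to_r, to_r, two_r_from], [1, 2])     # >R >R 2R>
print(matched, mixed_out, mixed_in)
```

Under these assumptions the matched pair restores the original order, while pairing 2>R with R> R> (or >R >R with 2R>) silently swaps the two items.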
The above potential bug needs to be covered in the rationale.

p. 62  NEGATE is the radix-complement. NOT is the radix-minus-one's complement. Some of us don't know what these mean; these concepts need to be in the definition of terms.

p. 66  SP@ deleted from the controlled word set. What was wrong with leaving this word controlled?

I sense that another organization, like the contributors to this board, could take up the cause of collecting commonly used names, so as to avoid needless duplication of names, and miscommunication that results from one name coming to have two slightly different meanings. A good example of this is the long fetch with segment word used on the 80x86. Propose that SP@ be the first word in a new Topic here.

I've been trying to figure out how to run a Forth-83 program using a BASIS5 system. In a worst case one could implement the arithmetic/logical functions using 16 bytes to represent register bits. But probably it could be done with byte slices.

Implementation-requirement-removal proposals:

(1) Leave the function of BLK = 0 as implementation dependent. The older rule makes using block and buffer on a file system most inconvenient.

(2) Remove the implementation requirement that : (colon) sets the compilation vocabulary. I believe that this behavior is only for programmers with a line editor who would forget to manually switch out of the EDITOR with its I word.

One pattern is clear from BASIS5: the committee is working to remove implementation requirements and specifications. While reducing the functional capacity of a Standard Program, the number of Forth implementations that could meet the standard has been increased. At the same time that running a Forth-83 Standard Program on a generic BASIS5-Forth Standard System has become functionally impractical, the committee is allowing that Forth-83 Standard Systems are maximally supported.

------------

Category 10, Topic 2
Message 125 Sun Oct 09, 1988
G.BAILEY1 at 12:48 PDT

Howdy, Bob!
Let me be the first to thank you very much for your brief (har) remarks on your first reading of BASIS5. If your second is as fruitful, we need to get your input into usable form. The next meeting is too soon to get proposals in that would meet the two week rule, but if you have some time between now and then I would request that you get the less argumentative of them on paper and into Martin's hands anyway. We have not been two-weeking things introduced at the meetings unless really called for.

Proposals that clearly correct ambiguities or awkwardnesses of language without having any technical implications have been called "post" and are generally passed to the documentation committee without debate. Any such would be welcome at any time.

Issues of notation choice or implication that may have been overlooked could be simply pointed out, as you have already done, or specific solutions could be formulated (as you have also done). As a general comment your posting is already very useful, but if you have specific solutions that might be debatable I would suggest formulating each as a proposal.

In both of these areas, it might be useful for you or anyone else who wishes to take the trouble to think about the editorial aspects of the document to coordinate with Ted Dickens, who may be reached at (213) 477-7287. He chairs the documentation committee and would likely be willing to compare notes on things his group may already be planning to propose.

AS A MATTER OF PERSPECTIVE ON BASIS5:

Extensive editorial changes were necessary to beat our working document into the shape required by ANSI style. Rather than wrangle out endless detailed proposals, the documentation committee offered to produce a massively edited document that would presumably take less effort to complete than would the original. We accepted this offer, and BASIS4N, the immediate predecessor to BASIS5, resulted. It was adopted "warts and all" with the understanding that it had problems.
You have been noticing some of the warts.

The most risky part of Ted's work was that it was not supposed to have technical impact. However, some of the changes undoubtedly had such impact. For example, I had proposed that we delete the requirement for null strings, but the support for this proposal did not represent a strong consensus so the proposal has been committed. Thus Ted's committee was NOT given authorization to delete this requirement, and it should still be there. Likewise the technical implications of the stack notation changes have not been subjected to deliberation by the committee as a whole except in a few cases when we have been working on specific words. No doubt there are unintended effects.

Since there are known or suspected warts, proofreading for them is very useful. Please continue to do so.

I'd like to chat about a couple of selected points you raised just for fun.

Wil Baden had proposed the words CHAR and [CHAR] which were amended to be ASCII and [ASCII] (TP88-114) before adoption. He had been specifically interested in pursuing the decoupling from the ASCII character set. The committee was not quite ready to fully support this decoupling at that time. We have, through the definitions of address units, and the words CELL+ , CELLS , BYTE+ , and BYTES , admitted to the possibility of storage allocation exceeding 8 bits. However, "byte" as data is still defined as an assembly of 8 bits, "character" is still "a single-byte value" and the definition of "character" still references 2.1 (Referenced Standards) which consists of ASCII. I do not recall voting on the words "implementation defined" there, and in fact this seems to have been an editorial (!) change. Suggest that you submit a proposal that forces this issue; we need one.

I hope it isn't too soon to use the term "cell" again. For gosh sake it's been more than half a decade and will be nearly a full one before this thing has a pretty binding to wear.
I'd like to be able to use the term again within my own lifetime...

Wrap around numbers are an interesting concept on a one's complement computer. Put simply, they don't work that way. However, this doesn't mean that your code is broken. YOUR CODE HAS ALWAYS HAD A DEPENDENCY ON TWO'S COMPLEMENT HARDWARE IF YOU USE SUCH NUMBERS; HENCE IT HAS ALWAYS BEEN BROKEN RELATIVE TO RUNNING ON A ONE'S COMPLEMENT MACHINE. It will still run just fine on the type of hardware that the technique depends on.

What is missing here is some good prose to express what we are trying to say. There is absolutely nothing wrong with writing code that inherently depends on a particular ALU type or even cell width (like 0< to test the high bit of a boolean mask). This merely means that the application uses techniques that are hardware dependent, and the last thing Forth should do is to forbid people to exploit the hardware they have to work with. However, as soon as one starts doing so he is writing code that is HARDWARE DEPENDENT and should happily admit it. All we are trying to do here is to clarify what you can do that is NOT hardware dependent and call a spade a spade. Since there are many such dependencies in conventional usage of the same set of operators for dealing with the various data types (pairs versus doubles; numbers versus flags versus boolean masks) there is clearly much work left to be done.

The key thing about the "contractions" 2* and 2/ is that we inherited them from FORTH-83 defined as shifts, so they are not in fact contractions at all, at least at present. The definition of 2* as an "arithmetic" left shift has always given me a chuckle since if one is going to insist on two's complement hardware there is of course no need to make the distinction. On the other hand, one who uses this definition for 2* on a 1's complement machine to manipulate bits will be surprised to find that the arithmetic left shift is and must be CIRCULAR.
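
The claim that doubling on a one's complement machine amounts to a circular shift can be checked with a short Python sketch. It assumes a 16-bit cell and models one's complement by hand; the helper names are hypothetical and not taken from any draft of the standard.

```python
# Sketch: in 16-bit one's complement, doubling a value (without overflow)
# is a rotate-left of its bit pattern, not a plain logical left shift.
# ASSUMPTION: 16-bit cells; helper names are illustrative only.

BITS = 16
MASK = (1 << BITS) - 1        # 0xFFFF
MAX = (1 << (BITS - 1)) - 1   # 32767, largest magnitude representable

def encode(n):
    """One's complement bit pattern of n, for -MAX <= n <= MAX."""
    if not -MAX <= n <= MAX:
        raise ValueError("value out of one's complement range")
    return n if n >= 0 else (~(-n)) & MASK

def decode(bits):
    """Signed value of a one's complement bit pattern (0xFFFF reads as 0)."""
    return bits if bits <= MAX else -((~bits) & MASK)

def rotate_left(bits):
    """Circular left shift by one bit: the high bit re-enters at the bottom."""
    return ((bits << 1) | (bits >> (BITS - 1))) & MASK

def logical_left(bits):
    """Plain left shift by one bit: the high bit is lost, a zero enters."""
    return (bits << 1) & MASK

minus_one = encode(-1)                         # 0xFFFE
print(decode(rotate_left(minus_one)))          # the circular shift doubles -1
print(decode(logical_left(minus_one)))         # the plain shift does not
```

In this model the circular shift turns -1 into -2, while the plain shift turns it into -3, which is the surprise awaiting anyone who uses 2* to move bits.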
Unless the committee is willing to change this we will need to add controlled reference words that guarantee specific bit oriented shifting operations.

How to pronounce our favorite words is almost as good as counting angels for consuming debate time. I am hoping that the committee can avoid getting bogged down in this relatively irrelevant nonsense until the important part of the document makes sense.

SP@ was deleted because its existence directly conflicts with existing Forth hardware, such as the Novix and Harris chips. To the extent that programmer portability is important, dependency on stack addressing techniques forms habits that simply don't work and cannot be reasonably implemented on such hardware. Even though a word is merely "controlled" it is quite often the case that implementors are effectively forced to support everything in the book. If it can be shown that a word cannot be reasonably implemented on hardware DESIGNED to run Forth, then that word is an excellent candidate for the silent treatment.

Finally, in reference to your remark about trying to figure out how to run a Forth-83 program on a BASIS5 system: I don't know about you, but most of my systems are nearly BASIS5 so they would probably run reasonably well. If you have arithmetic that depends on a 16-bit ALU, you will have a problem. In my own experience conversion to a 32-bit or larger machine has not led to much of this. However, I will admit that you may have some difficulty in transporting some code that depends on 16-bit 2's complement byte addressing to many of the world's architectures howEVER you do it. One way to deal with the problem is to assert that you don't plan on using such equipment... then you don't have a problem any more. If instead you plan on such broad portability, then you (like all the rest of us) will probably need to clean up your act a little bit. We all play fast and loose with data types!

Thanks again --- Greg B.