Notes on the ISO-Prolog Standard

Author: Joachim Schimpf
Last modified: 2011-04-11

As constructive discussions on the prolog-standard mailing list are quite difficult, I have decided to quietly collect my comments and contributions here, so they can be considered by those who want to do so. I will try to address any feedback (no matter how I become aware of it) directly here on the web site. If you want to interactively discuss a concrete point, please email me personally.

Floating-point errors

include/1

min/2 and max/2

number_chars/2 and number_codes/2

pi/0

Signed numbers

uninstantiation error

Comments on 2nd Technical Corrigendum

Corrections for Arithmetic

Please include the following corrections to the core standard.

Section 9.1.4:

The definition of divF is incomplete: the zero-divisor case is missing (this can be seen by comparing signature and definition). Please correct as follows:

divF(x,y) = resultF(x/y, rndF) if y\=0
          = undefined          if x=0,y=0
          = zero_divisor       if x\=0,y=0

This mistake is apparently inherited from the first LIA standard.
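
The three cases of the corrected definition can be modelled directly in Prolog. This is only a minimal sketch: the predicate name div_f/3 and the value/1 and error/1 wrappers are hypothetical, and rounding and overflow (the resultF and rndF part) are simply left to the host system's float division.

```prolog
% div_f(+X, +Y, -Outcome): hypothetical model of the corrected divF.
% Outcome is value(Q) for a regular quotient, or error(E) where E is
% the exceptional value named in the corrected definition.
div_f(X, Y, value(Q)) :-
    Y =\= 0,
    Q is float(X) / float(Y).      % rounding/overflow left to the host
div_f(X, Y, error(undefined)) :-
    Y =:= 0,
    X =:= 0.
div_f(X, Y, error(zero_divisor)) :-
    Y =:= 0,
    X =\= 0.
```

The three clauses are mutually exclusive, which mirrors the observation that the signature of divF already lists zero_divisor while the original definition has no case producing it.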

Section 9.3.1.3 (and future ^/2 function):

Change the error condition (d) for zero raised to a negative exponent from undefined to zero_divisor.

Section 9.3.6.3:

Change (c) from "VX is zero or negative" to "VX is negative". Add case (d) saying "VX is zero" leads to evaluation_error(zero_divisor).

The idea here is that zero_divisor corresponds to an infinite result, while undefined is truly undefined. This is not my idea, I hasten to add, and every system that creates infinities must make this distinction anyway.
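
The proposed error conditions can be sketched as wrapper predicates. The names checked_log/2 and checked_power/3 are hypothetical, and the sketch assumes that 9.3.6 describes log/1 and 9.3.1 describes (**)/2; it only shows which evaluation_error each case would raise under the proposal, not how any particular system behaves today.

```prolog
% checked_log(+X, -L): log with the proposed error classification.
% Negative argument: no meaningful result, hence undefined.
% Zero argument: the result would be infinite, hence zero_divisor.
checked_log(X, L) :-
    (   X < 0
    ->  throw(error(evaluation_error(undefined), log/1))
    ;   X =:= 0
    ->  throw(error(evaluation_error(zero_divisor), log/1))
    ;   L is log(X)
    ).

% checked_power(+X, +Y, -P): zero raised to a negative exponent
% would be infinite, hence zero_divisor rather than undefined.
checked_power(X, Y, P) :-
    (   X =:= 0, Y < 0
    ->  throw(error(evaluation_error(zero_divisor), (**)/2))
    ;   P is float(X) ** float(Y)
    ).
```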

I have been told that these changes cannot be put into the corrigendum because they would make existing conforming implementations non-conforming. It is, however, the whole point of a "corrigendum" to correct mistakes, and inevitable that implementations may have to change to implement the corrections.

min and max

In the new section on min/2 and max/2, please do not refer to float_overflow when talking about the case that an integer is not exactly representable as a float! As defined in 9.1.4.2, float_overflow occurs when a result is greater than fmax, which is not the case we want to capture here.

See 2006 Mailing list discussion for a previous discussion on that topic.
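
To illustrate the distinction, here is a small sketch of a test for exact representability. The predicate name exactly_representable/1 is hypothetical, and the example values assume IEEE double-precision floats and integers of at least 64 bits.

```prolog
% exactly_representable(+I): I is an integer that survives a round
% trip through the float type unchanged.
exactly_representable(I) :-
    integer(I),
    I =:= truncate(float(I)).

% Two neighbouring integers around 2^53 (computed, not asserted):
sample_exact(I)   :- I is 1 << 53.
sample_inexact(I) :- I is (1 << 53) + 1.
```

Under the stated assumptions, the second sample value has no exact float counterpart although it is many orders of magnitude below fmax, so nothing overflows.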

Uninstantiation error

Whether such an error is needed is debatable. In the suggested case of the open/3 predicate being called with an instantiated output argument, the need for a new error goes away when we imagine that the check is performed after the new stream has been created, and just before it is unified with the result argument. Without the check, this unification would fail, and it is this failure that we want to supplant with an appropriate error. Elsewhere, such a situation would be signaled via a type_error (if types are different) or a domain_error (if types match, but value differs from what's expected). So why not

% newly opened stream would be unified with 99 -> type error
?- open(f1, read, 99).
type_error(stream, 99)

% newly opened stream would be unified with other stream -> domain error
?- open(f1, read, S), open(f2, read, S).
domain_error(stream, $stream(f1))
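
The check described above can be sketched generically, independent of streams. The helper bind_output/4 and its argument order are hypothetical; it stands for the last step of any predicate that has just computed a result value and is about to unify it with an output argument.

```prolog
% bind_output(?Arg, +Value, +TypeName, :TypeCheck)
% Unify Arg with the freshly computed Value, but replace the failure
% of that unification by an error:
%   - Arg is not of the expected type        -> type_error
%   - Arg has the right type, but differs    -> domain_error
bind_output(Arg, Value, TypeName, TypeCheck) :-
    (   var(Arg)
    ->  Arg = Value
    ;   \+ call(TypeCheck, Arg)
    ->  throw(error(type_error(TypeName, Arg), _))
    ;   Arg == Value
    ->  true
    ;   throw(error(domain_error(TypeName, Arg), _))
    ).
```

For example, bind_output(foo, 3, integer, integer) raises type_error(integer, foo), and bind_output(4, 3, integer, integer) raises domain_error(integer, 4); a stream-producing predicate would pass its own type test in place of integer/1.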

It has been argued that in some implementations open(t,write,S), close(S), open(t,write,S2), S==S2 may succeed, and thus the error case does not replace a failure. I hope it is undisputed that the intended semantics in that case is indeed a failure (even though the standard leaves it unspecified), and not to implement this failure is merely an oversight.

A case for the "uninstantiation" error can be made, however, in a completely different situation, where a variable is used as a quasi-identifier in a quantifier-like construct, e.g.

?- Y=foo, setof(X, Y^p(X,Y), L).

One could argue that this is likely to be a mistake and should be flagged by an "uninstantiation" error for Y. In this context, however, the name "uninstantiation" is unhelpful. A type_error(variable) would be quite appropriate, since these variables are not meant to ever become instantiated, so being a "variable" is their final destiny or, arguably, their "type".
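
A check of this kind could look roughly as follows. This is a sketch under assumptions: checked_setof/3 and check_quantifiers/1 are hypothetical names, p/2 is an arbitrary example relation, and the test is restricted to the simple case where the quantified term contains no variable at all, so that the quantification cannot have any effect.

```prolog
% Example relation, only for illustration.
p(1, a).
p(2, b).

% checked_setof(?Template, +Goal, -List): like setof/3, but flags
% V^Goal where V no longer contains any variable.
checked_setof(Template, Goal, List) :-
    check_quantifiers(Goal),
    setof(Template, Goal, List).

check_quantifiers(Goal) :-
    (   nonvar(Goal), Goal = V^Inner
    ->  (   ground(V)
        ->  throw(error(type_error(variable, V), setof/3))
        ;   true
        ),
        check_quantifiers(Inner)
    ;   true
    ).
```

With Y unbound, checked_setof(X, Y^p(X,Y), L) behaves like setof/3; after Y=foo the same call raises type_error(variable, foo).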

Another (non-ISO) argument that has been made for why we need this error is a predicate that attaches attributes to variables, but is called with a non-variable, e.g. put_attr(foo,attrname,attrval). There is, however, no reason why this should not be equivalent to put_attr(X,attrname,attrval),X=foo and behave accordingly, i.e. succeed or fail according to the semantics of the attribute.

pi/0

The pi/0 arithmetic function is problematic because of its return type. In a basic ISO system, the return type is simply float. A system that provides more than one representation for real numbers (e.g. different floating-point representations, or both floats and intervals) must settle for one of these types as the return type for pi (because it is argument-less, it cannot be polymorphic). It would have to be the one with the highest precision because it then can be converted to a lower-precision type as needed (whereas missing precision cannot be recovered later).

What this means, however, is that a multi-representation system cannot really commit to pi/0 returning the float-type (which may not have enough precision). Or, if it did, it would have to provide additional variants to return the higher precision representations (pi_binary_128/0, pi_decimal_64/0 etc).

A similar problem exists with regard to the type of literal constants such as 1.0. In a system with multiple representations, possible solutions are

different syntax - Common Lisp uses exponent markers such as 1.0S0, 1.0D0, 1.0L0 to denote different precisions. CON: precision is always explicit.
let constants always have the maximum precision type, and require explicit coercion when wanted. CON: when coercion is forgotten, the maximum precision will propagate through the computation unnecessarily (assuming usual contagion rules).
have a precision-flag (possibly module-local, like ECLiPSe's read_floats_as_breals) that determines the precision of constants and pi/0

Often, one wants to write code that does a computation in a type-generic way, i.e. uses and propagates the precision of its input arguments. Unfortunately, none of the above alternatives is much help with that.

For the case of pi, the easiest way to solve the problem is to provide a function pi/1, where pi(X) is equivalent to pi*X computed with the precision of X (but at least the smallest floating point precision, to avoid biblical surprises like 3=:=pi(1)).
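
In a system with only one float type, the idea degenerates to something very simple. The following sketch uses a hypothetical predicate pi_times/2 in place of an evaluable function pi/1 (which standard Prolog gives no portable way to define), and assumes that the constant pi is available as an evaluable atom.

```prolog
% pi_times(+X, -R): R is pi*X, computed at least in float precision.
% The explicit float/1 coercion is what avoids an integer result for
% integer X; a multi-representation system would instead keep the
% precision of X whenever that is higher than float.
pi_times(X, R) :-
    R is pi * float(X).
```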

Signed Numbers (Apr 2011)

Draft technical corrigendum 2 (Mar 2011) includes the introduction of a predefined prefix operator declaration and evaluable function for +/1. It seems that no corresponding modification for the signed number term syntax has been proposed. Unless I am misinterpreting something, this leads to an asymmetry between the minus and the plus sign:

Input	ISO 13211-1	DTC2 Mar 2011	ECLiPSe 6.0 native	My suggestion for ISO
`-1`	-1	-1	-1	-1
`+1`	error	+(1)	1	1
`- 1`	-1	-1	-(1)	-1
`+ 1`	error	+(1)	+(1)	1
`'-'1`	-1	-1	-(1) (version 6.1)	-(1)
`'+'1`	error	+(1)	+(1) (version 6.1)	+(1)
`'-' 1`	-1	-1	-(1)	-(1)
`'+' 1`	error	+(1)	+(1)	+(1)
`'-'/**/1`	-1	-1	-(1)	-(1)
`'+'/**/1`	error	+(1)	+(1)	+(1)
`- /**/1`	-1	-1	-(1)	-1
`+ /**/1`	error	+(1)	+(1)	1
`+1.0e+2`	error	+(1.0e2)	1.0e2	1.0e2
`1.0e'-'2`	error	error	error	error
`1.0e- 2`	error	error	error	error
`number_codes(N,"-1")`	N = -1	N = -1	N = -1	N = -1
`number_codes(N,"+1")`	error	error	N = 1	N = 1

I suggest adopting the last column as the correction for DTC2:

Accept + not only as prefix operator, but also as sign, such that integer(+3). This requires a change to 6.3.1.2. Rationale: Be consistent with the minus sign, and be consistent with the sign of floating-point exponents.
Do not accept quoted signs as signs, such that compound('-'3). Rationale: the acceptance of quoted signs was almost certainly unintended and merely an accident of "grammar reuse" in the specification. Note also that the sign of a floating-point exponent cannot be quoted.

On the other hand, it has been pointed out that allowing + as a sign is less "necessary" than allowing - (negative number syntax is necessary to allow writeq+read give the correct result with negative numbers), and therefore the +/- asymmetry may be considered ok.

NOTE (not for ISO - I think it's too late to change this aspect): I do of course think that the ECLiPSe choice (of not allowing space of any kind between sign and number) is the most consistent because:

The reason to allow space seems to be the same as the reason to allow quoted signs: (understandable) laziness in early implementations. It simplifies the lexer-parser interface, because the information about quotes and spaces does not have to be exported from the lexer.
But: spaces or quoted bits are not allowed elsewhere in floating point numbers, for instance.
I have heard the argument that one sometimes wants to read tables of numbers where the sign is separated by spaces. However, when such a table is read as data, this can easily be solved by postprocessing (in the simplest case, calling is/2 on the read term), which will be needed anyway if the input format is not completely under the programmer's control. For numbers occurring in programs, one would hope that the restriction still leaves enough room for programmers to satisfy their layout preferences.
A significant space in front of a number is tantamount to the significant space between a prefix functor and an opening parenthesis: -(a,b) is a -/2 term in functor notation, while - (a,b) is a -/1 term in prefix notation, i.e. the space causes the minus sign to be interpreted as prefix operator.
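
The parenthesis analogy in the last point can be checked with a small probe. The predicate names are hypothetical; the comments state what a reader following the standard's rule for functional notation is expected to produce, and an individual system may of course deviate.

```prolog
% No layout between functor and "(": functional notation, arity 2.
functor_notation(T) :- T = -(a,b).

% Layout before "(": prefix operator applied to the term (a,b),
% i.e. arity 1 with a ','/2 argument.
prefix_notation(T) :- T = - (a,b).

% arity_of(+T, -A): helper to inspect the result.
arity_of(T, A) :- functor(T, _, A).
```

The ECLiPSe treatment of numbers applies the same principle to - 1 versus -1.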

Comments on Minutes of WG17 2010 Edinburgh Meeting

term_variables/3

I assume this is meant to be a difference-list version of term_variables/2. Difference-list-variants of list-constructing predicates can be a good idea (findall/3 comes to mind, but also atom_codes/2 etc), but as the standard does not systematically provide these, it seems there should be a special reason for providing it in this case - what is it?

Apparently, the use case is the implementation of bagof/setof. The arguments made below still hold.

As opposed to the other examples quoted above, a difference-list version of term_variables/2 is a particularly bad idea because

   term_variables(T1, Vs, Vs1), term_variables(T2, Vs1, []).

is not the same as

   term_variables(T1-T2, Vs).

because it is supposed to return a duplicate-free variable list. The predicate invites bugs by suggesting a usage that is unlikely to give the expected results.

Note that the correct way to augment a term-variable-list is

   term_variables(T1, Vs1), term_variables(Vs1-T2, Vs).
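
The difference can be demonstrated without term_variables/3 itself, by modelling the difference-list chaining as plain concatenation. The predicates naive_vars/3, correct_vars/3 and app/3 are hypothetical helpers written for this sketch only.

```prolog
% app/3: plain list concatenation, defined here to stay self-contained.
app([], L, L).
app([H|T], L, [H|R]) :- app(T, L, R).

% What chaining two difference-list calls amounts to: the second
% call knows nothing about the variables found by the first.
naive_vars(T1, T2, Vs) :-
    term_variables(T1, Vs1),
    term_variables(T2, Vs2),
    app(Vs1, Vs2, Vs).

% The augmentation idiom from the text: feed the variables found so
% far back in, so that duplicates are recognised.
correct_vars(T1, T2, Vs) :-
    term_variables(T1, Vs1),
    term_variables(Vs1-T2, Vs).
```

For f(X,Y) and g(Y,Z), naive_vars/3 yields a four-element list in which Y occurs twice, while correct_vars/3 yields [X,Y,Z].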

Miscellaneous

number_chars/2 and number_codes/2 (Apr 2011)

Exceptions

The predicate as originally specified in the standard is unnecessarily limited in usefulness by requiring an error for the case that the right-hand side string cannot be parsed as a number. Failure would be much more useful, as the predicate could be readily used to "convert to a number if possible", allowing the following common pattern:

    ( number_chars(Num, Chars) ->
        <deal with a number Num>
    ;
        <deal with something else in Chars>
    )
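
With the predicate as currently specified, the pattern needs a wrapper that turns the syntax error back into failure. A minimal sketch, with the hypothetical name number_chars_or_fail/2:

```prolog
% number_chars_or_fail(?Num, +Chars): like number_chars/2, but fails
% instead of raising a syntax error when Chars is not a number.
% Other errors (e.g. instantiation errors) are deliberately passed on.
number_chars_or_fail(Num, Chars) :-
    catch(number_chars(Num, Chars),
          error(syntax_error(_), _),
          fail).
```

Every program that wants the "convert if possible" behaviour has to carry such a wrapper around, which is the practical cost of the error requirement.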

Accepted Language

Neumerkel's comparison drew my attention to other surprises in the ISO spec:

space and even comments can occur in the string before the actual number, i.e. number_codes(3," /*comment*/ 3") is supposed to succeed. Why comments? We are not parsing a program here!
On the other hand, space after the number is forbidden, i.e. number_codes(N, "3 ") is required to raise an exception. Why the asymmetry?
number_codes(3, "03") succeeds, but number_codes(3, "+3") is required to raise an exception. Do we want to allow redundant characters or not?
number_codes(N, "- 3.1e-2") is allowed, but number_codes(N, "-3.1e- 2") is not. Do we want to allow redundant space or not?

This all makes very little sense for a predicate that is supposed to convert back and forth between numbers and their various string representations. While it is clear that the mapping is not one-to-one, the flexibility allowed in the string-to-number backward direction looks completely arbitrary.

The reason for the strangeness is, of course, that the specification refers to the Prolog term syntax (probably to avoid having to deal with signs), rather than more directly to number token syntax. A more sensible, but still compact spec can be had by specifying the language accepted by number_chars separately in terms of token syntax, e.g.

    [sign] (integer token|float number token)

If it is felt that redundant spaces must be accepted, these can now be re-introduced explicitly (but preferably on both sides of the number, and without allowing comments). Nevertheless, for a built-in 'primitive', it would be cleaner to accept the pure number only.
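
As a sketch of how compact such a specification is, here is a recogniser for the "pure number" language written as a DCG over character lists. It is deliberately simplified: all nonterminal names are hypothetical, and only decimal integers and floats are covered (the other integer token forms, such as character-code, binary, octal and hexadecimal notation, are left out).

```prolog
% is_strict_number(+Chars): Chars is [sign] followed by a decimal
% integer or float token, with no layout or comments anywhere.
is_strict_number(Chars) :-
    phrase(strict_number, Chars).

strict_number --> sign, digits, fraction.

sign --> ['-'].
sign --> ['+'].
sign --> [].

digits --> digit, digits.
digits --> digit.

digit --> [C], { char_code(C, Code), Code >= 48, Code =< 57 }.

% A fraction is required before an exponent, as in the float token.
fraction --> ['.'], digits, exponent.
fraction --> [].

exponent --> [E], { E == e ; E == 'E' }, sign, digits.
exponent --> [].
```

This accepts "+3" and "-3.1e-2" as character lists, and rejects " 3", "3 " and "- 3" alike, so the asymmetries listed above disappear.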

It can be doubted whether point 1 above (allowing comments) and point 4 (space after sign) are really required by 8.16.7, because it mentions "a character sequence which could be output". No output primitive will output extra comments, and no output primitive according to 7.10.5 will output space between sign and number.

include/1

Suppose we have the following 3 files:

% top.pl
:- include('somedir/include_p').

% somedir/include_p.pl
:- ensure_loaded(p).

% somedir/p.pl
p.

and we compile top.pl. What current directory should we expect when encountering the ensure_loaded(p) directive in somedir/include_p.pl? With pure include-semantics, the above code does not work: the ensure_loaded behaves as if it occurred within top.pl directly. SWI-Prolog, YAP and ECLiPSe behave that way. SICStus (and similar directives in C) behave the other way, i.e. certain pathnames in included files are relative to the includee's location, not the includer's. However, the current directory is still the includer's, so the situation is a bit confusing. A discussion from the ECLiPSe point of view is in bug 678.

Thanks

Thanks for comments to: Richard O'Keefe, Paulo Moura, Ulrich Neumerkel.