Files processed with chpp
are passed through two stages, which
are, however, not sequential in nature but can rather be viewed as
coroutines. The first stage processes commands
(see section Commands). Command processing is a sequential process, i.e. no
loops or recursions occur. The second stage is macro processing
(see section The Meta-Char) and allows for loops as well as for recursion.
Command processing is line-oriented. It affects only lines which have,
as their first non-whitespace character, the command-char (#). All other
lines are passed through literally. Another function of command
processing is the concatenation of lines: if the last character of a
line is the backslash (\), then the backslash, the following newline and
all leading whitespace of the next line are ignored.
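The line concatenation rule can be modelled outside of chpp. The following is a minimal Python sketch, not chpp's actual implementation; the function name join_continuations is hypothetical and the sketch only covers the backslash rule described above.

```python
def join_continuations(text):
    """Join lines ending in a backslash with the following line.

    Hypothetical helper modelling the rule from the text: the backslash,
    the newline after it and all leading whitespace of the next line
    are dropped.
    """
    joined = []
    pending = None  # text of a line that ended in a backslash
    for line in text.split("\n"):
        if pending is not None:
            # Ignore the leading whitespace of the continued line.
            line = pending + line.lstrip()
            pending = None
        if line.endswith("\\"):
            pending = line[:-1]  # drop the backslash, wait for the next line
        else:
            joined.append(line)
    if pending is not None:
        # Trailing backslash on the last line: nothing left to join.
        joined.append(pending)
    return joined


print(join_continuations("abc\\\n   def\nghi"))
```

Continuations chain, so a line that ends in a backslash after being joined is joined with the next line as well.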
A line invoking a command consists of optional whitespace at the
beginning, the command-char #, optional whitespace, the command
name, whitespace and the command arguments (if any). Thus, the following
lines are all commands (given that the command names exist):
#abc
#def arg
# ghi too many arguments
while the following lines are not:
this is a line without commands.
although this line contains a # it is not a command.
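The shape of a command line can be expressed as a regular expression. This Python sketch is an illustration under assumptions: the name parse_command is hypothetical, and treating the command name as a run of non-whitespace characters is a guess, since the text does not define which characters a name may contain.

```python
import re

# optional whitespace, '#', optional whitespace, name, then optional arguments
COMMAND_LINE = re.compile(r"^\s*#\s*(\S+)(?:\s+(.*))?$")


def parse_command(line):
    """Return (name, arguments) for a command line, or None otherwise."""
    match = COMMAND_LINE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2) or ""


for line in ["#abc", "#def arg", "# ghi too many arguments",
             "although this line contains a # it is not a command."]:
    print(repr(line), "->", parse_command(line))
```

The last line is rejected because its first non-whitespace character is not the command-char.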
The command ! (exclamation mark) is a special case among
commands, as it does nothing, independent of its parameters, i.e. it can
be used to write comments, or, if used in the first line of a file, to
specify the command line to be used if the containing file is
executed. Thus, this is a "Hello world" program in chpp:
#! /usr/local/bin/chpp
Hello world!
After setting the executable bit for this file, it can be called like any command and will produce the output
Hello world!
Note that the exclamation mark must be followed by whitespace.
If the file is not found, an error message is produced.
The second stage processes everything that is passed through by the
first stage. It is called macro processing because its main use is the
expansion of macros. There is just one special character for this stage,
namely the meta-char (%). Only character sequences beginning with
the meta-char are modified by the macro processing stage. All other
characters are simply passed through. Since chpp was designed to
be non-intrusive, even uses of the meta-char which do not correspond to
the uses described in this chapter are copied verbatim. For example:
Temperature today is 10% above average.
    => Temperature today is 10% above average.
In cases where it is absolutely necessary that the meta-char not be
interpreted as special, it can be quoted with itself (i.e. %%),
yielding one meta-char. Example:
%<heinz=deinz>\
%%heinz evals to %heinz.
    => %heinz evals to deinz.
The only primitive type in chpp is the string. Values of that
type are referred to as scalars (see section Scalars). Values of any type
can be combined arbitrarily to lists (see section Lists) and hashes
(see section Hashes). Closures (see section Closures) also form a data type as
they can be stored and used, even though they cannot be directly
manipulated.
Scalars are strings of arbitrary length (including the length 0).
Lists are ordered collections of arbitrary values indexed by consecutive numbers starting at 0. It follows that lists cannot have gaps.
Hashes are ordered collections of arbitrary values indexed by arbitrary scalars, i.e. they establish a so-called key/value mapping.
A closure is a piece of code associated with an environment in which it
is to be executed, as created by lambda (or define, for
that matter). Thus, the names macro and closure actually stand for the
same thing, although one usually tends to call anonymous macros
(i.e. values returned by lambda) closures, whereas named closures
(i.e. defined macros) are usually called macros.
In order to be able to retain values for subsequent use it is necessary to store them in variables.
There are two different syntactic forms of variable access, called the short and the long form.
The short form consists of the meta-char followed by an optional
ampersand (&) followed by the variable name, e.g. %name or
%&name. The variable name is taken as-is, i.e. it is not
evaluated. The variable name ends with the first char that is not a
letter, a digit or the underscore (_). If a variable with the
given name does not exist, the whole string is not interpreted as a
variable access and is copied verbatim, i.e. %name evaluates to
%name if there is no variable with the name name.
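The short-form rules (name characters, verbatim copy for unknown names) can be sketched in Python as follows. This is an illustration only: expand_short_forms is a hypothetical name, letters are assumed to be ASCII, and the sketch ignores everything else the macro stage does, including %% quoting and the difference between copying and ampersand access.

```python
import re

# meta-char, optional ampersand, then letters, digits or underscores
SHORT_FORM = re.compile(r"%(&?)([A-Za-z0-9_]+)")


def expand_short_forms(text, variables):
    """Replace %name and %&name by the variable's value if it exists."""
    def replace(match):
        name = match.group(2)
        if name in variables:
            return str(variables[name])
        # Unknown variable: the whole sequence is copied verbatim.
        return match.group(0)
    return SHORT_FORM.sub(replace, text)


print(expand_short_forms("Hello %name, %other!", {"name": "world"}))
```

Because the name extends to the first character that is not a letter, digit or underscore, %named looks up the variable named, not name.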
The long form consists of the meta-char followed by the optional
ampersand and the variable name within angle brackets,
e.g. %<name> or %<&name>. The variable name is evaluated
before the variable is looked up, making it possible, for example, to
use variable names containing right angle brackets: the term
%<%'>>>'> accesses the variable >>>. If a variable with
the given name does not exist, an error message is issued. Note:
Although it is possible to use macros to construct variable names
(e.g. %<%name>), this feature is deprecated. Please don't use it.
If the variable is a list or a hash, it can be subscribed by appending
the index in brackets or curly braces to the name, in both the short and
the long form. In order to access nested data structures, any number of
such indexes can be used in one access, for example %name[3]{foo}.
Accessing a variable or a subscript without an ampersand produces a shallow copy of its value, i.e. accessing a list produces a copy of the list containing the elements of the original list. Example:
%<lst1=%list(a,b,c)>%<lst2=%lst1>\
%same(%&lst1,%&lst2) : %same(%&lst1[0],%&lst2[0])
    => 0 : 1
Accessing a variable or subscript with an ampersand produces the same value, i.e. can be used to bind two names to the same value:
%<str1=abc>%<str2=%&str1>\
%same(%&str1,%&str2)
    => 1
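Readers who know Python may find the following analogy for lists useful. It is only an analogy, not a description of how chpp works internally: copy.copy stands in for access without an ampersand, plain name binding stands in for access with an ampersand, and the variable alias is made up for this sketch.

```python
import copy

lst1 = ["a", "b", "c"]
lst2 = copy.copy(lst1)  # like %<lst2=%lst1>: a new list with the same elements
alias = lst1            # like %<alias=%&lst1>: a second name for the same list

lists_are_same = lst1 is lst2           # False: the copy is a distinct list
elements_are_same = lst1[0] is lst2[0]  # True: the copy is only shallow
alias_is_same = alias is lst1           # True: both names denote one value

print(int(lists_are_same), ":", int(elements_are_same))  # prints "0 : 1"
```

The analogy does not carry over to scalars, since Python strings are immutable while chpp allows a scalar to be changed in place through the ampersand form of assignment.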
There are several important issues to this:
chpp's built-in macros, however, are wise enough not to
copy their arguments when they don't need to. For example, calling
llength never copies its argument.
chpp user, as it employs garbage collection to free
unused memory.
A macro can be invoked by appending, in the short or long form of
variable access, to the variable name or subscript a left parenthesis
followed by the actual arguments separated by commas followed by a right
parenthesis, e.g. %list(a,b). The value that is yielded by the
macro invocation cannot be subscribed further, i.e. %list(a,b)[1]
is not allowed. However, see section Subscribing Non-Variables for a
method to achieve this goal.
Arguments of a macro-call are processed as follows: First, all leading and trailing whitespace from all arguments is removed. Then, the remaining strings are evaluated and the results are passed as arguments to the macro. In order to pass an argument with leading or trailing whitespace to a macro, it must be quoted. For example:
%define(foobar,arg,"%arg")
%foobar( xyz ) => "xyz"
%foobar( ) => ""
%foobar( %' ' ) => " "
%foobar(%' xyz ') => " xyz "
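The order of the two steps (strip first, evaluate afterwards) is what makes quoting preserve whitespace. A small Python sketch of that order follows. It is a toy model: process_argument and toy_evaluate are hypothetical names, and the evaluator understands nothing but the quoting form %'...' without escapes.

```python
import re

# Toy stand-in for evaluation: only %'...' quoting, no escapes, no macros.
QUOTED = re.compile(r"%'([^']*)'")


def toy_evaluate(text):
    """Replace every %'...' by the literal text between the quotes."""
    return QUOTED.sub(lambda match: match.group(1), text)


def process_argument(raw):
    """Strip leading and trailing whitespace, then evaluate the rest."""
    return toy_evaluate(raw.strip())


for raw in [" xyz ", " ", " %' ' ", "%' xyz '"]:
    print(repr(raw), "->", repr(process_argument(raw)))
```

Whitespace inside the quotes survives because it only becomes part of the argument after the stripping step has already run.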
It is possible to subscribe values that are not variables, for example ones that are returned from macros, by using a modified long form of variable access. Instead of the variable name the expression yielding the value enclosed in parentheses is used. Upon evaluation, the expression is evaluated and all following subscriptions are applied to its value. Example:
%<(%list(a,b))[1]> => b
Assignment syntax is an enhancement of the long form of variable access.
The last subscription (or the variable name, if no subscription is used)
is followed by an equal sign (=) which is followed by an
expression yielding the value to be assigned.
When assignment is attempted to an element of a list which is out of bounds, the list is enlarged. Elements between the formerly last element and the newly added element default to the empty string. Indexes less than 0 are not allowed.
Assigning to a key which is not yet part of a hash adds the key/value pair to the hash.
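The list enlargement rule can be written down in a few lines of Python. This is a sketch of the described behaviour, with the hypothetical name assign_element, and makes no claim about how chpp stores lists.

```python
def assign_element(items, index, value):
    """Assign items[index] = value, growing the list if necessary."""
    if index < 0:
        raise IndexError("indexes less than 0 are not allowed")
    # Fill the gap up to the new element with empty strings.
    while len(items) <= index:
        items.append("")
    items[index] = value
    return items


print(assign_element(["a"], 3, "d"))  # prints ['a', '', '', 'd']
```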
It is not possible to assign to a subscript of a value which is not
subscribeable, i.e. it is not possible to do %<bar[3]=foo> if
bar is not a list. To make bar an empty list, simply do
%<bar=%list()>.
Assignment usually changes a binding, be it in an environment or in a list or hash. This means that the sequence
%<value=%list()>%<value=%hash()>
first binds the name value to a newly created list and then
rebinds value to a newly created hash, leaving the old list
intact. When using the ampersand-form, however, the old value is changed
to the new value, which is a destructive process. Example:
%<value=abc>%<ref=%&value>%<&value=123>%ref => 123
When an assignment to a variable is executed for which there is no binding, a new binding in the global environment is created for this variable name.
chpp uses lexical scoping, using an environmental model, very
similar to Scheme's. An environment contains bindings of names to
values and a reference to its parent environment. The only environment
without parent is the global environment. Execution always takes place
in some environment. If a variable name has to be resolved, the current
environment is checked for whether it contains a binding for that
name. If it does not, its parent is checked, and so on until the global
environment is reached. If it does not contain a corresponding binding,
the variable name cannot be resolved and an error message is produced.
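The lookup procedure amounts to walking a chain of environments. Here is a minimal Python sketch of such a chain. The class and method names are hypothetical, and the assign method models the rule stated earlier that assigning to a name without a binding creates the binding in the global environment.

```python
class Environment:
    """A set of bindings plus a reference to the parent environment."""

    def __init__(self, parent=None):
        self.bindings = {}
        self.parent = parent  # None marks the global environment

    def lookup(self, name):
        env = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent
        raise NameError("cannot resolve variable " + name)

    def assign(self, name, value):
        env = self
        while True:
            if name in env.bindings or env.parent is None:
                # Rebind where the name was found, else bind globally.
                env.bindings[name] = value
                return
            env = env.parent


global_env = Environment()
local_env = Environment(parent=global_env)
global_env.bindings["x"] = "outer"
local_env.bindings["x"] = "inner"
print(local_env.lookup("x"), global_env.lookup("x"))  # prints "inner outer"
```

A binding in an inner environment shadows a binding of the same name further up the chain, which is what the two lookups at the end show.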
New environments are created upon several occasions:
locals expression. The new environment is set
up to contain bindings for all the variables mentioned as parameters to
locals. The parent environment of the new environment is the
environment active at the time of execution of the expression. The
environment is active throughout the body of the locals expression.
lambda or the value bound to a name as a result of an invocation
of define). The environment is set up to contain bindings for all
the parameters of the closure. The parent environment of this new
environment is the environment active at the time of the generation of
the closure, i.e. of the invocation of lambda. That makes it
possible to do things like this:
%define(newcounter,%locals(c,%<c=0>%lambda(%<c=%[c+1]>%c)))\
%<counter=%newcounter()>\
%counter() %counter() %counter()
    => 1 2 3
The parent environment for the environments of the closure invocations
is the environment created by locals, mapping c to 0, which is itself
created every time newcounter is executed. Thus, in a way, counter
carries a state, namely its private variable c. Had we called newcounter
a second time, a second counter would have been created, with its own c,
initially set to 0, completely independent of the first.
for, foreach and foreachkey expressions (see section Special Forms).
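The newcounter example above has a direct counterpart in any language with lexical closures. For comparison, this is roughly the same construction in Python, given as an analogy rather than as a translation of chpp semantics.

```python
def newcounter():
    c = 0  # private state, created anew on every call of newcounter

    def counter():
        nonlocal c  # refer to the c of the enclosing call
        c += 1
        return c

    return counter


counter = newcounter()
print(counter(), counter(), counter())  # prints "1 2 3"

other = newcounter()
print(other())  # prints "1": a second, independent counter
```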
chpp permits the evaluation of arithmetic expressions by
enclosing the expression in square brackets ([]) preceded by the
meta-char. The expression is first evaluated according to chpp
rules and the resulting value is treated as an arithmetic expression
which, in turn, yields a number. Whitespace between operators and
operands is ignored. The following table is a summary of all available
operators together with their arity. They are sorted by precedence, the
first line being of the highest precedence. All binary operators
evaluate from left to right. All operators have the same meaning as the
corresponding operators in the C language.
Operators           Arity
!, ~, -             unary
*, /, %             binary
+, -                binary
<, >, <=, >=        binary
==, !=              binary
&                   binary
^                   binary
|                   binary
&&                  binary
||                  binary
Precedence of operators can be overridden by using parentheses (()).
In order to make arithmetic expressions more readable, it is allowed to
refer to the values of variables within an arithmetic expression by
writing their names without a preceding meta-char. Note that subscription
and macro invocation using this syntax are not allowed.
Some examples:
%[1+2] => 3
%[1.5+3.3] => 4.800000
%[3==3] => 1
%[3!=3] => 0
%[(1+2)*(3+4)] => 21
%<x=4>%[%x+1] => 5
%<x=4>%[x+1] => 5
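To show how the precedence table and the left-to-right rule interact, here is a small precedence-climbing evaluator in Python. It is a sketch under several assumptions and not chpp's evaluator: it handles integers only (chpp also accepts numbers such as 1.5), it evaluates both operands of && and ||, and all names in it are hypothetical.

```python
import re

# Binary operators by precedence level, highest first, as in the table.
LEVELS = [["*", "/", "%"], ["+", "-"], ["<", ">", "<=", ">="],
          ["==", "!="], ["&"], ["^"], ["|"], ["&&"], ["||"]]
PREC = {op: len(LEVELS) - i for i, ops in enumerate(LEVELS) for op in ops}
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|&&|\|\||[<>=!]=|[-+*/%<>&^|!~()])")


def c_div(a, b):
    """Integer division truncating toward zero, as in C."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


APPLY = {
    "*": lambda a, b: a * b, "/": c_div,
    "%": lambda a, b: a - b * c_div(a, b),
    "+": lambda a, b: a + b, "-": lambda a, b: a - b,
    "<": lambda a, b: int(a < b), ">": lambda a, b: int(a > b),
    "<=": lambda a, b: int(a <= b), ">=": lambda a, b: int(a >= b),
    "==": lambda a, b: int(a == b), "!=": lambda a, b: int(a != b),
    "&": lambda a, b: a & b, "^": lambda a, b: a ^ b, "|": lambda a, b: a | b,
    "&&": lambda a, b: int(bool(a) and bool(b)),
    "||": lambda a, b: int(bool(a) or bool(b)),
}


def tokenize(text):
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("unexpected input at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text, variables=None):
    tokens, variables, pos = tokenize(text), variables or {}, 0

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def unary():
        tok = take()
        if tok == "!":
            return int(not unary())
        if tok == "~":
            return ~unary()
        if tok == "-":
            return -unary()
        if tok == "(":
            value = binary(1)
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        # A bare name refers to a variable, as described in the text.
        return int(tok) if tok.isdigit() else int(variables[tok])

    def binary(min_prec):
        left = unary()
        while pos < len(tokens) and PREC.get(tokens[pos], 0) >= min_prec:
            op = take()
            # The right operand may only contain tighter-binding operators,
            # which makes operators of equal precedence left-associative.
            left = APPLY[op](left, binary(PREC[op] + 1))
        return left

    result = binary(1)
    if pos != len(tokens):
        raise ValueError("unexpected token " + tokens[pos])
    return result


print(evaluate("(1+2)*(3+4)"), evaluate("x+1", {"x": "4"}))  # prints "21 5"
```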
To prevent some string from evaluation, it can be quoted by enclosing it
in a pair of single quotes ('), preceded by the meta-char. The
only special characters after the first single quote are the quote-char
(the backslash, \) and the closing single quote. The quote-char
gives special meanings to some characters following it: an n
becomes a newline character and a t is interpreted as a
tab character. All other characters preceded by the quote-char stand for
themselves. This includes the quote-char and the single quote,
i.e. %'\\\'' evaluates to \'.
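The escape rules inside a quoted string are simple enough to state as code. The following Python sketch processes the text between the quotes; the name unquote is hypothetical, and finding the closing quote is left out.

```python
def unquote(body):
    """Apply the quote-char rules to the text between the quotes."""
    special = {"n": "\n", "t": "\t"}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            # n and t are special; any other character stands for itself.
            out.append(special.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(unquote("\\\\\\'"))  # body is \\\' and the result is \'
```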
In order to evaluate a string twice, for example to evaluate the
contents of a variable, the string must be enclosed in curly braces
({}), preceded by the meta-char. Example:
%<a=abc>%<b=%%a>%{%b} => abc