

Language Reference

Files processed with chpp are passed through two stages, which are, however, not sequential in nature but can rather be viewed as coroutines. The first stage processes commands (see section Commands). Command processing is a sequential process, i.e. no loops or recursions occur. The second stage is macro processing (see section The Meta-Char) and allows for loops as well as for recursion.

Commands

Command processing is line-oriented. It affects only lines which have, as their first non-whitespace character, the command-char (#). All other lines are passed through literally. Another function of command processing is the concatenation of lines: If the last character of a line is the backslash (\), then the backslash, the following newline and all leading whitespace of the next line are ignored.
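The line-joining rule can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not chpp's implementation; the function name join_continued_lines is made up for illustration.

```python
def join_continued_lines(text):
    """Join lines that end in a backslash with the line that follows.

    Models the rule above: the backslash, the newline after it and the
    leading whitespace of the next line are all dropped.
    """
    result = []
    pending = ""        # text collected from lines that were continued
    continuing = False  # True if the previous line ended in a backslash
    for line in text.split("\n"):
        if continuing:
            line = line.lstrip()  # drop leading whitespace of the next line
        if line.endswith("\\"):
            pending += line[:-1]  # keep the line without its backslash
            continuing = True
        else:
            result.append(pending + line)
            pending = ""
            continuing = False
    if continuing:
        result.append(pending)    # input ended in a continued line
    return "\n".join(result)
```

Note that whitespace before the backslash is kept; only the whitespace at the start of the following line is removed.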

A line invoking a command consists of optional whitespace at the beginning, the command-char #, optional whitespace, the command name, whitespace and the command arguments (if any). Thus, the following lines are all commands (given that the command names exist):

#abc
    #def arg
 #  ghi too many arguments

while the following lines are not:

this is a line without commands.
although this line contains a # it is not a command.
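The shape of a command line can be captured by a regular expression. The following Python sketch is an illustration only: the names COMMAND_RE and parse_command are hypothetical, it assumes a command name is any run of non-whitespace characters, and it does not check whether the command actually exists.

```python
import re

# Optional whitespace, '#', optional whitespace, the command name, then
# (after whitespace) the rest of the line as the argument string.
COMMAND_RE = re.compile(r"^\s*#\s*(\S+)(?:\s+(.*))?$")

def parse_command(line):
    """Return (name, arguments) for a command line, or None otherwise."""
    match = COMMAND_RE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2) or ""
```

Applied to the examples above, the first three lines are recognized as commands and the last two are not.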

Comments

The command ! (exclamation mark) is a special case among commands: it does nothing, whatever its parameters. It can therefore be used to write comments or, if used in the first line of a file, to specify the command line to be used if the containing file is executed. Thus, this is a "Hello world" program in chpp:

#! /usr/local/bin/chpp
Hello world!

After setting the executable bit for this file, it can be called like any command and will produce the output

Hello world!

Note that the exclamation mark must be followed by whitespace.

Command Reference

Command: include filename
Includes the file filename. If filename is relative, it is first searched for in the directory of the including file, then in the directories contained in the include search path (see section Invoking chpp). If the file is not found, an error message is produced.
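The lookup order for relative file names can be sketched in Python as follows. The function resolve_include and its parameters are hypothetical, and the handling of absolute names (used as given) is an assumption, since the text above only describes the relative case.

```python
from pathlib import Path

def resolve_include(filename, including_file, search_path):
    """Return the path of the file to include, or None if it is not found."""
    target = Path(filename)
    if target.is_absolute():
        # Assumption: absolute names are used as given.
        return target if target.is_file() else None
    # Relative name: the directory of the including file comes first,
    # then the directories of the include search path, in order.
    directories = [Path(including_file).parent, *map(Path, search_path)]
    for directory in directories:
        candidate = directory / target
        if candidate.is_file():
            return candidate
    return None  # the caller would report an error here
```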

Command: define name value
Defines the global variable name to contain whatever value evaluates to.

Command: if condition
Evaluates condition. If its boolean value is FALSE, it skips everything up to the corresponding endif.

Command: ifdefined symbol
Command: ifdef symbol
If a variable with the name symbol does not exist, skips everything up to the corresponding endif.

Command: ifnotdefined symbol
Command: ifndef symbol
If a variable with the name symbol exists, skips everything up to the corresponding endif.

Command: error message
Produces an error message with the text message.

Command: discard
Command: disc
Discards everything up to the corresponding endd.
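The phrase "corresponding endif" suggests that conditional blocks may nest. Under that assumption, skipping can be modelled with a stack, as in the Python sketch below. The function filter_conditionals and its input format are invented for illustration; the sketch covers only the ifdef and ifndef families and leaves out if, discard and all other commands.

```python
# True: keep the block if the symbol is defined; False: keep it if not.
OPENERS = {"ifdef": True, "ifdefined": True,
           "ifndef": False, "ifnotdefined": False}

def filter_conditionals(items, defined):
    """Drop text lines that fall inside a skipped conditional block.

    `items` holds (command, argument) tuples for command lines and plain
    strings for all other lines; `defined` is a set of variable names.
    """
    output = []
    stack = []  # one flag per open block: is this block being kept?
    for item in items:
        if isinstance(item, tuple):
            name, argument = item
            if name in OPENERS:
                stack.append((argument in defined) == OPENERS[name])
            elif name == "endif":
                stack.pop()
            continue  # command lines themselves produce no output here
        if all(stack):  # kept only if every enclosing block is kept
            output.append(item)
    return output
```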

The Meta-Char

The second stage processes everything that is passed through by the first stage. It is called macro processing because its main use is the expansion of macros. There is just one special character for this stage, namely the meta-char (%). Only character sequences beginning with the meta-char are modified by the macro processing stage. All other characters are simply passed through. Since chpp was designed to be non-intrusive, even uses of the meta-char which do not correspond to the uses described in this chapter are copied verbatim. For example:

Temperature today is 10% above average.
=> Temperature today is 10% above average.

In cases where it is absolutely necessary that the meta-char not be interpreted as special, it can be quoted with itself (i.e. %%), yielding one meta-char. Example:

%<heinz=deinz>\
%%heinz evals to %heinz.
=> %heinz evals to deinz.

Data Types

The only primitive type in chpp is the string. Values of that type are referred to as scalars (see section Scalars). Values of any type can be combined arbitrarily into lists (see section Lists) and hashes (see section Hashes). Closures (see section Closures) also form a data type as they can be stored and used, even though they cannot be directly manipulated.

Scalars

Scalars are strings of arbitrary length (including the length 0).

Lists

Lists are ordered collections of arbitrary values indexed by consecutive numbers starting at 0. It follows that lists cannot have gaps.

Hashes

Hashes are ordered collections of arbitrary values indexed by arbitrary scalars, i.e. they establish a so-called key/value mapping.

Closures

A closure is a piece of code associated with an environment in which it is to be executed, as created by lambda (or define, for that matter). Thus, the names macro and closure actually stand for the same thing, although one usually tends to call anonymous macros (i.e. values returned by lambda) closures, whereas named closures (i.e. defined macros) are usually called macros.

Variables

In order to be able to retain values for subsequent use it is necessary to store them in variables.

Accessing Variables

There are two different syntactic forms of variable access, called the short and the long form.

The short form consists of the meta-char followed by an optional ampersand (&) followed by the variable name, e.g. %name or %&name. The variable name is taken as-is, i.e. is not evaluated. The variable name ends with the first char that is not a letter, a digit or the underscore (_). If a variable with the given name does not exist, the whole string is not interpreted as a variable access and is copied verbatim, i.e. %name evaluates to %name if there is no variable with the name name.
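The verbatim fallback of the short form, together with the %% quoting described earlier, can be modelled in Python. This is a simplified sketch with hypothetical names (SHORT_FORM_RE, expand_short_form); it assumes ASCII letters, treats all values as strings, and ignores subscripts, macro calls and the copy/reference distinction of the ampersand.

```python
import re

# Either '%%' (a quoted meta-char) or '%', an optional '&', and a name
# made of letters, digits and underscores.
SHORT_FORM_RE = re.compile(r"%%|%&?([A-Za-z0-9_]+)")

def expand_short_form(text, variables):
    """Expand short-form variable accesses in `text`."""
    def replace(match):
        if match.group(0) == "%%":
            return "%"              # quoted meta-char yields one meta-char
        name = match.group(1)
        if name in variables:
            return variables[name]
        return match.group(0)       # unknown name: copied verbatim
    return SHORT_FORM_RE.sub(replace, text)
```

A meta-char that is not followed by a name character, as in "10% above", never matches and is therefore passed through unchanged.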

The long form consists of the meta-char followed by the optional ampersand and the variable name within angle brackets, e.g. %<name> or %<&name>. The variable name is evaluated before the variable is looked up, making it possible, for example, to use variable names containing right angle brackets: the term %<%'>>>'> accesses the variable >>>. If no variable with that name exists, an error message is issued. Note: Although it is possible to use macros to construct variable names (e.g. %<%name>), this feature is deprecated. Please don't use it.

List and Hash Subscription

If the variable is a list or a hash, it can be subscribed by appending the index in brackets or curly braces to the name, in both the short and the long form. In order to access nested data structures, any number of such indexes can be used in one access, for example %name[3]{foo}.

Copies and References

Accessing a variable or a subscript without an ampersand produces a shallow copy of its value, i.e. accessing a list produces a copy of the list containing the elements of the original list. Example:

%<lst1=%list(a,b,c)>%<lst2=%lst1>\
%same(%&lst1,%&lst2) : %same(%&lst1[0],%&lst2[0])
=> 0 : 1

Accessing a variable or subscript with an ampersand produces the same value, i.e. can be used to bind two names to the same value:

%<str1=abc>%<str2=%&str1>\
%same(%&str1,%&str2)
=> 1
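Readers familiar with Python may find the following analogy helpful. It is an analogy only, not a description of chpp internals: Python's `is` plays the role of %same, copy.copy that of an access without ampersand, and plain name binding that of an access with ampersand.

```python
import copy

lst1 = ["a", "b", "c"]

# Access without ampersand: a shallow copy, i.e. a new list whose
# elements are the very same objects as those of the original.
lst2 = copy.copy(lst1)
copied_list_is_same = lst1 is lst2             # False, like the "0" above
copied_elements_are_same = lst1[0] is lst2[0]  # True, like the "1" above

# Access with ampersand: both names refer to the same value.
lst3 = lst1
referenced_is_same = lst1 is lst3              # True
```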

There are several important issues to this:

Macro Invocation

A macro is invoked by appending the actual arguments, separated by commas and enclosed in parentheses, to the variable name or subscript, in either the short or the long form of variable access, e.g. %list(a,b). The value yielded by the macro invocation cannot be subscribed further, i.e. %list(a,b)[1] is not allowed. However, see section Subscribing Non-Variables for a method to achieve this goal.

Arguments of a macro-call are processed as follows: First, all leading and trailing whitespace from all arguments is removed. Then, the remaining strings are evaluated and the results are passed as arguments to the macro. In order to pass an argument with leading or trailing whitespace to a macro, it must be quoted. For example:

%define(foobar,arg,"%arg")

%foobar(  xyz  )
=> "xyz"
%foobar(    )
=> ""
%foobar(  %'  '  )
=> "  "
%foobar(%'  xyz  ')
=> "  xyz  "

Subscribing Non-Variables

It is possible to subscribe values that are not variables, for example ones that are returned from macros, by using a modified long form of variable access. Instead of the variable name, the expression yielding the value is used, enclosed in parentheses. The expression is evaluated first and all following subscriptions are applied to its value. Example:

%<(%list(a,b))[1]>
=> b

Assignment

Assignment syntax is an enhancement of the long form of variable access. The last subscription (or the variable name, if no subscription is used) is followed by an equal sign (=) which is followed by an expression yielding the value to be assigned.

When assignment is attempted to an element of a list which is out of bounds, the list is enlarged. Elements between the formerly last element and the newly added element default to the empty string. Indexes less than 0 are not allowed.

Assigning to a key which is not yet part of a hash adds the key/value pair to the hash.

It is not possible to assign to a subscript of a value which is not subscribable, i.e. it is not possible to do %<bar[3]=foo> if bar is not a list. To make bar an empty list, simply do %<bar=%list()>.
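The rules for assigning to subscripts can be summarized in a short Python model, with a Python list standing in for a chpp list and a dict for a hash. The function assign_element is invented for this sketch and is not part of chpp.

```python
def assign_element(container, key, value):
    """Assign `value` to `container[key]` following the rules above."""
    if isinstance(container, list):
        index = int(key)
        if index < 0:
            raise IndexError("indexes less than 0 are not allowed")
        # Enlarge the list; new elements default to the empty string.
        while len(container) <= index:
            container.append("")
        container[index] = value
    elif isinstance(container, dict):
        container[key] = value  # adds the key/value pair if the key is new
    else:
        raise TypeError("value is not subscribable")
```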

Assignment usually changes a binding, be it in an environment or in a list or hash. This means that the sequence

%<value=%list()>%<value=%hash()>

first binds the name value to a newly created list and then rebinds value to a newly created hash, leaving the old list intact. When using the ampersand-form, however, the old value is changed to the new value, which is a destructive process. Example:

%<value=abc>%<ref=%&value>%<&value=123>%ref
=> 123
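The difference between rebinding a name and destructively changing a value can be pictured with a mutable box. The Python sketch below is a conceptual model under that assumption; the class Cell and the dictionary env are illustrative and say nothing about how chpp stores values.

```python
class Cell:
    """A mutable box standing in for a chpp value."""
    def __init__(self, content):
        self.content = content

env = {}
env["value"] = Cell("abc")       # %<value=abc>
env["ref"] = env["value"]        # %<ref=%&value>: one value, two names
env["value"].content = "123"     # %<&value=123>: the value itself changes
destructive_result = env["ref"].content   # "123", as in the example above

env["value"] = Cell("xyz")       # %<value=xyz>: only the name is rebound
rebinding_result = env["ref"].content     # still "123"
```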

When an assignment to a variable is executed for which there is no binding, a new binding in the global environment is created for this variable name.

Scoping Rules

chpp uses lexical scoping with an environment model very similar to Scheme's. An environment contains bindings for names to values and a reference to its parent environment. The only environment without parent is the global environment. Execution always takes place in some environment. If a variable name has to be resolved, the current environment is checked for whether it contains a binding for that name. If it does not, its parent is checked, and so on until the global environment is reached. If it does not contain a corresponding binding, the variable name cannot be resolved and an error message is produced.
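Name resolution along the chain of parent environments can be sketched in Python. The class Environment below is a hypothetical model, not chpp code. Its assign method additionally assumes that assigning to a name bound in an enclosing environment updates that binding, and otherwise creates the binding in the global environment.

```python
class Environment:
    """Bindings plus a reference to the parent environment."""

    def __init__(self, parent=None):
        self.bindings = {}
        self.parent = parent  # None only for the global environment

    def lookup(self, name):
        env = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent  # not found here: try the parent
        raise NameError(f"cannot resolve variable: {name}")

    def assign(self, name, value):
        env = self
        # Walk up until a binding is found or the global environment is
        # reached; an unbound name therefore ends up bound globally.
        while name not in env.bindings and env.parent is not None:
            env = env.parent
        env.bindings[name] = value
```

An inner binding shadows an outer one of the same name, because lookup stops at the first environment that contains the name.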

New environments are created upon several occasions:

Arithmetic Expansion

chpp permits the evaluation of arithmetic expressions by enclosing the expression in square brackets ([]) preceded by the meta-char. The expression is first evaluated according to chpp rules and the resulting value is treated as an arithmetic expression which, in turn, yields a number. Whitespace between operators and operands is ignored. The following table is a summary of all available operators together with their arity. They are sorted by precedence, the first line being of the highest precedence. All binary operators evaluate from left to right. All operators have the same meaning as the corresponding operators in the C language.

Operators         Arity
!, ~, -           unary
*, /, %           binary
+, -              binary
<, >, <=, >=      binary
==, !=            binary
&                 binary
^                 binary
|                 binary
&&                binary
||                binary

Precedence of operators can be overridden by using parentheses (()). In order to make arithmetic expressions more readable, it is allowed to refer to the value of a variable within an arithmetic expression by writing its name without a preceding meta-char. Note that subscription and macro invocation using this syntax are not allowed. Some examples:

%[1+2]
=> 3
%[1.5+3.3]
=> 4.800000
%[3==3]
=> 1
%[3!=3]
=> 0
%[(1+2)*(3+4)]
=> 21
%<x=4>%[%x+1]
=> 5
%<x=4>%[x+1]
=> 5

Quotation

To prevent a string from being evaluated, it can be quoted by enclosing it in a pair of single-quotes ('), preceded by the meta-char. The only special characters after the first single-quote are the quote-char (the backslash, \) and the closing single-quote. The quote-char gives special meanings to some characters following it: an n becomes a newline character and a t is interpreted as a tabulator. All other characters preceded by the quote-char stand for themselves. This includes the quote-char and the single-quote, i.e. %'\\\'' evaluates to \'.
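The escape rules of quotation are small enough to model directly. The Python function unquote below is a sketch with a made-up name; it decodes the text found between the opening and the closing quote and leaves finding the closing quote to the caller.

```python
def unquote(body):
    """Decode the body of a quotation according to the rules above."""
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            following = next(chars, "")  # character after the quote-char
            # n and t are special; anything else stands for itself.
            out.append({"n": "\n", "t": "\t"}.get(following, following))
        else:
            out.append(ch)
    return "".join(out)
```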

Explicit Evaluation

In order to evaluate a string twice, for example to evaluate the contents of a variable, the string must be enclosed in curly braces ({}), preceded by the meta-char. Example:

%<a=abc>%<b=%%a>%{%b}
=> abc

