lispreader

Reference manual

last updated 2 April 2005 for version 0.5

Mark Probst (schani@complang.tuwien.ac.at)


Introduction

What is lispreader?

lispreader is a small library for reading expressions in Lisp syntax. It was originally written to facilitate simple exchange of structured data between processes, but its main purpose is now to provide a framework for reading configuration files. To simplify interpretation of the data read, lispreader also provides functions for simple matching of expressions against patterns.

lispreader is also used in several applications to read and write data files. Lisp syntax is well suited to this, especially if the data is organized hierarchically.

What is lispreader not?

lispreader is not a Lisp system in that it cannot, by itself, interpret Lisp expressions. It only provides a subset of the features of libraries like Guile or librep (namely the reading of expressions) and thus does not compete directly with those. If all you need is a simple way to read Lisp expressions without interpreting them with Lisp semantics, you will probably be satisfied with lispreader.

Licence and Warranty

lispreader is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

lispreader is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with lispreader; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

Obtaining lispreader

lispreader is available for free download on the World Wide Web at the URL http://www.complang.tuwien.ac.at/schani/lispreader/.

Using lispreader in your programs

lispreader consists of only a few C files, namely `lispreader.c', `lispreader.h', `lispscan.h', `allocator.c', `allocator.h', `pools.c', and `pools.h'. To incorporate lispreader in your own programs, just add these files to your own program's files.

Syntax

lispreader can read lists consisting of other lists, symbols, strings, integers, real numbers and booleans. It also provides a syntax for specifying patterns.

Comments

Comments start with a semicolon (;) and extend to the end of the line:

; this line is completely ignored

Lists

Lists consist of so-called cons-pairs, or conses. A cons has two parts, its car and its cdr. A list is defined as either being the empty list, which is no cons at all, or as being a cons, the cdr of which is a list. The cars of these conses are the actual elements of the list.

An example: The list (a b c) consists of three conses, the cars of which are the symbols a, b and c. The structure can be depicted using a box diagram:

 _ _     _ _     _ _
|_|_|-->|_|_|-->|_|_|-->nil
 |       |       |
 v       v       v
 a       b       c

Each box denotes a cons, with the left half being its car and the right half being its cdr. nil denotes the empty list.

It is also possible to explicitly set the cdr of the last cons using the dot-notation: (a b . c) can be illustrated thus:

 _ _     _ _
|_|_|-->|_|_|-->c
 |       |
 v       v
 a       b

Note that this is technically not a list. Since the empty list can be written as (), the list (a b c) can be written using the dot-notation as (a . (b . (c . ()))).
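The difference between a proper list and a dotted structure can be sketched in C as follows. This is only an illustration under stated assumptions: cell_t, cell_cdr and proper_list_length are hypothetical names and do not reflect how lispreader represents its objects internally. As in the text above, the empty list is modelled as a null pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cell type for illustration; not lispreader's lisp_object_t.
   A cell is an atom if atom is non-NULL, otherwise it is a cons. */
typedef struct cell {
    const char *atom;     /* symbol name, or NULL for a cons */
    struct cell *car;     /* meaningful for a cons only */
    struct cell *cdr;     /* meaningful for a cons only */
} cell_t;

/* Returns the cdr of a cons. Must only be called on a cons. */
const cell_t *cell_cdr(const cell_t *obj)
{
    assert(obj != NULL && obj->atom == NULL);
    return obj->cdr;
}

/* Returns the number of elements if obj is a list, i.e. a chain of
   conses ending in the empty list (NULL). Returns -1 if the chain ends
   in an atom, as in (a b . c). */
int proper_list_length(const cell_t *obj)
{
    int n = 0;

    while (obj != NULL) {
        if (obj->atom != NULL)
            return -1;            /* dotted tail: not a list */
        n++;
        obj = cell_cdr(obj);
    }
    return n;
}
```

With cells built for (a b c) the function counts three conses, while for (a b . c) it reports that the structure is not a list because the last cdr is the symbol c rather than the empty list.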

Symbols

Symbols are pretty much everything that cannot be interpreted as anything else. They can have arbitrary length.

Integers

As integers are internally represented by int values, their range is restricted to the range of the int data type. Bignums are not supported.
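This manual does not say what happens when an integer outside the range of int is read. A program that generates data files may therefore want to check its values before writing them. The following is a minimal sketch of such a check using the standard strtol function; parse_int_token is a hypothetical helper and not part of lispreader.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Returns non-zero and stores the value in *out if token is a decimal
   integer that fits into an int. Returns zero otherwise.
   Hypothetical helper, not part of lispreader. */
int parse_int_token(const char *token, int *out)
{
    char *end;
    long v;

    assert(token != NULL && out != NULL);
    errno = 0;
    v = strtol(token, &end, 10);
    if (end == token || *end != '\0')
        return 0;                 /* empty, or trailing garbage */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                 /* does not fit into an int */
    *out = (int) v;
    return 1;
}
```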

Reals

Reals are internally represented by values of the float datatype. lispreader cannot yet interpret exponential notation or reals without digits before the dot.

Strings

Strings are delimited on both sides by double quotes (`"'). The backslash (`\') is used as the escape character. The sequence `\n' is interpreted as newline, `\t' as tab. All other escape sequences evaluate to the character after the backslash, e.g. `\\' denotes the backslash itself and `\"' denotes the double quote.
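The escape rules can be illustrated with a small decoder. This is a sketch of the rules as described in this section, not lispreader's scanner; unescape_string is a hypothetical name, and treating a trailing backslash as an error is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Decodes the text between the double quotes of a string literal into
   out, which must have room for strlen(in) + 1 bytes. Returns the
   decoded length, or -1 if the input ends with a lone backslash.
   Hypothetical helper, not part of lispreader. */
int unescape_string(const char *in, char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (*in != '\0') {
        char c = *in++;

        if (c == '\\') {
            if (*in == '\0')
                return -1;        /* nothing follows the backslash */
            c = *in++;
            if (c == 'n')
                c = '\n';
            else if (c == 't')
                c = '\t';
            /* any other character stands for itself */
        }
        out[n++] = c;
    }
    out[n] = '\0';
    return (int) n;
}
```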

Booleans

The boolean values true and false are represented by #t and #f, respectively.

Patterns

Patterns are used to represent classes of expressions. They contain no other value than the types of expressions they match against.

Patterns are written using a special list syntax where the opening parenthesis is replaced by `#?('. There are patterns for matching all types of simple expressions:

#?(symbol)
Any symbol.
#?(string)
Any string.
#?(integer)
Any integer.
#?(real)
Any real.
#?(boolean)
#t or #f.

Three other patterns have a wider scope:

#?(list)
Any list.
#?(number)
Any number, i.e., any integer or real.
#?(any)
Any expression (including lists).

It is also possible to construct a pattern matching at least one out of a given set of expressions, which themselves can contain patterns, using the or pattern. For example, the pattern #?(or (a . #?(list)) (b #?(integer))) matches the list (a #t 43) as well as the list (b 1), but not the list (b #f). As another example, #?(boolean) is equivalent to #?(or #t #f).

Pools

Introduction

Most applications of lispreader use it to quickly read bits of data from a file, process it, and then read the next bit, until the end of the file. If the file is big, it is an advantage if reading is fast. Part of the reading process is allocating memory for the data read, so fast memory allocation results in better reading performance.

lispreader comes with a memory allocator optimized for this application pattern, called the "pools" allocator. It is very fast, can allocate lots of small chunks of memory with virtually no overhead apart from the alignment padding, and can free all allocated memory at once. The downside is that freeing all allocated memory is the only way of freeing.

Using pools is not mandatory for using lispreader, but it increases performance significantly (by about a factor of 2) compared to the standard malloc allocator. If you never read files larger than a few tens of kilobytes, you will probably never notice, though.
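The idea behind such an allocator can be sketched as follows. This is not lispreader's implementation: arena_t and the arena_* functions are hypothetical, and the sketch uses a single fixed-size block, whereas a real pools allocator would obtain further blocks as needed. It shows the three properties named above: allocation is a pointer bump, the only per-chunk overhead is alignment padding, and memory is released or reused all at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical single-block arena; lispreader's pools_t differs. */
typedef struct {
    char *base;       /* block obtained from malloc */
    size_t size;      /* capacity in bytes */
    size_t used;      /* bytes handed out so far, including padding */
} arena_t;

/* Chunk sizes are rounded up to a multiple of this type's size so that
   every chunk is suitably aligned for ordinary data. */
typedef union { long long ll; long double ld; void *p; } arena_align_t;

/* Returns non-zero upon success, zero upon failure. */
int arena_init(arena_t *a, size_t size)
{
    assert(a != NULL);
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

/* Returns a null pointer if the request does not fit. */
void *arena_alloc(arena_t *a, size_t size)
{
    size_t unit = sizeof(arena_align_t);
    size_t padded;
    void *chunk;

    assert(a != NULL && a->base != NULL);
    if (size > SIZE_MAX - unit)
        return NULL;              /* rounding up would overflow */
    padded = (size + unit - 1) / unit * unit;
    if (padded > a->size - a->used)
        return NULL;
    chunk = a->base + a->used;
    a->used += padded;
    return chunk;
}

/* Makes the whole block available again without returning it. */
void arena_reset(arena_t *a)
{
    a->used = 0;
}

/* Returns the block to the system. */
void arena_free(arena_t *a)
{
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

Note that there is no function for freeing an individual chunk, which is the trade-off described above.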

Reference

Function: int init_pools (pools_t* pools)
Initializes the pools data structure pointed to by pools. After calling this function, the pools can be used to allocate memory via pools_alloc. Returns non-zero upon success, zero upon failure.

Function: void reset_pools (pools_t* pools)
Resets the pools pointed to by pools. This does not actually free the memory allocated from these pools, but reuses it for further allocations, i.e., the data previously allocated from them will be overwritten.

Function: void free_pools (pools_t* pools)
Frees all the memory allocated by pools.

Function: void* pools_alloc (pools_t* pools, size_t size)
Allocates a region of memory size bytes long from the pools pointed to by pools. Returns a null pointer if the allocation failed.

Allocators

Introduction

The allocator_t data structure is lispreader's interface to your memory allocator of choice:

typedef struct
{
    void* (*alloc) (void *allocator_data, size_t size);
    void (*free) (void *allocator_data, void *chunk);
    void *allocator_data;
} allocator_t;

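An allocator must provide two functions: alloc, which allocates a chunk of size bytes, and free, which frees a chunk previously obtained from alloc.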

Both functions are always passed the value of allocator_data as their first argument.

All lispreader functions which allocate or free (non-temporary) memory come in two versions: The "normal" version uses the standard malloc/free memory allocation mechanism. The *_with_allocator version takes a pointer to an allocator_t as its first argument and allocates and frees memory via that allocator.
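As an illustration of the interface, the following sketch fills in an allocator_t that forwards to malloc and free while counting calls. The typedef is repeated from above only so that the sketch compiles on its own; a real program would include lispreader's headers instead. alloc_stats_t, counting_alloc, counting_free and init_counting_allocator are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Repeated from the definition above so the sketch is self-contained. */
typedef struct
{
    void* (*alloc) (void *allocator_data, size_t size);
    void (*free) (void *allocator_data, void *chunk);
    void *allocator_data;
} allocator_t;

/* Hypothetical bookkeeping reached through allocator_data. */
typedef struct {
    size_t num_allocs;
    size_t num_frees;
} alloc_stats_t;

void *counting_alloc(void *allocator_data, size_t size)
{
    alloc_stats_t *stats = allocator_data;
    void *chunk = malloc(size);

    if (chunk != NULL)
        stats->num_allocs++;
    return chunk;
}

void counting_free(void *allocator_data, void *chunk)
{
    alloc_stats_t *stats = allocator_data;

    if (chunk != NULL)
        stats->num_frees++;
    free(chunk);
}

/* Sets up allocator so that both functions receive stats as their
   first argument. */
void init_counting_allocator(allocator_t *allocator, alloc_stats_t *stats)
{
    assert(allocator != NULL && stats != NULL);
    stats->num_allocs = 0;
    stats->num_frees = 0;
    allocator->alloc = counting_alloc;
    allocator->free = counting_free;
    allocator->allocator_data = stats;
}
```

An allocator initialized this way could then be passed as the first argument to the *_with_allocator functions, for instance lisp_read_with_allocator and lisp_free_with_allocator.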

Reference

Global Variable: allocator_t malloc_allocator
This is an allocator which uses the standard malloc and free memory allocation functions.

Function: void init_pools_allocator (allocator_t* allocator, pools_t* pools)
Initializes the data structure pointed to by allocator to use the pools allocator pointed to by pools. Note that the free function for the pools allocator does not free memory, so you'll have to free the pools yourself.

lispreader Reference

Reading expressions

Function: lisp_stream_t* lisp_stream_init_path (lisp_stream_t* stream, const char* path)
Initializes stream to be a file stream reading from the file with path path. Returns a null pointer if the file cannot be opened. The caller is supposed to use the function lisp_stream_free_path to close the file.

This function should be preferred over lisp_stream_init_file because it uses memory mapping if possible, resulting in better parsing performance.

Function: void lisp_stream_free_path (lisp_stream_t* stream)
Closes the file associated with the file stream stream.

Function: lisp_stream_t* lisp_stream_init_file (lisp_stream_t* stream, FILE* file)
Initializes stream to be a file stream reading from file. The caller is still responsible for closing file when it is no longer needed.

Function: lisp_stream_t* lisp_stream_init_string (lisp_stream_t* stream, char* buf)
Initializes stream to be a string stream reading from buf. buf is not copied by this function, hence the effects of reading from the stream after modifying buf are undefined.

Function: lisp_stream_t* lisp_stream_init_any (lisp_stream_t* stream, void* data, int (*next_char) (void *data), void (*unget_char) (char c, void *data))
Initializes stream to be a user-defined stream. The function next_char is used to read individual characters from the stream. It must return EOF upon end-of-file and on all invocations succeeding the invocation that first returned EOF. unget_char is called to push back a character for reading it again. The character pushed back is always the character returned by the last call to next_char. The next call to next_char must return that character. unget_char is never called twice without at least a call to next_char in between. data is always passed to next_char and unget_char. No other action whatsoever is performed on data, i.e. should it point to a dynamically allocated memory region, the application is responsible for freeing it after the stream has been closed.
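A pair of callbacks satisfying this contract might look as follows for a source that reads from a memory buffer. This is a sketch: mem_source_t, mem_next_char and mem_unget_char are hypothetical names, and silently ignoring a push-back after end-of-file is an assumption made for robustness, not documented behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>    /* for EOF */

/* Hypothetical state passed as the data argument. */
typedef struct {
    const char *buf;  /* characters to deliver */
    size_t len;       /* number of characters in buf */
    size_t pos;       /* index of the next character */
    int hit_eof;      /* non-zero once EOF has been returned */
} mem_source_t;

/* Returns the next character, or EOF at the end of the buffer and on
   every call after that. */
int mem_next_char(void *data)
{
    mem_source_t *src = data;

    if (src->pos >= src->len) {
        src->hit_eof = 1;
        return EOF;
    }
    return (unsigned char) src->buf[src->pos++];
}

/* Pushes back c, which is the character most recently returned by
   mem_next_char, so stepping back one position is sufficient. */
void mem_unget_char(char c, void *data)
{
    mem_source_t *src = data;

    if (src->hit_eof)
        return;       /* assumption: nothing to push back after EOF */
    assert(src->pos > 0 && src->buf[src->pos - 1] == c);
    src->pos--;
}
```

Given a mem_source_t variable src, the stream would be set up with lisp_stream_init_any(&stream, &src, mem_next_char, mem_unget_char).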

Function: lisp_object_t* lisp_read (lisp_stream_t* in)
Function: lisp_object_t* lisp_read_with_allocator (allocator_t* allocator, lisp_stream_t* in)
Reads a Lisp expression from the stream in and returns it. The caller is responsible for deallocating its memory using lisp_free/lisp_free_with_allocator.

Function: lisp_object_t* lisp_read_from_string (char* buf)
Function: lisp_object_t* lisp_read_from_string_with_allocator (allocator_t* allocator, const char* buf)
Reads a Lisp expression from the string buf and returns it. The caller is responsible for deallocating its memory using lisp_free/lisp_free_with_allocator. Although buf may contain more than one expression, only the first is read. If you need to read more than one expression from a string, use lisp_read/lisp_read_with_allocator on a string stream created by lisp_stream_init_string.

Writing expressions

Function: int lisp_dump (lisp_object_t* obj, FILE* out)
Writes the external representation of obj, which can be read again by lisp_read, to out.

Examining expressions

Function: int lisp_type (lisp_object_t* obj)
Returns the type of the lisp object obj.

The returned type can be one of

LISP_TYPE_NIL
The empty list.
LISP_TYPE_SYMBOL
A symbol.
LISP_TYPE_INTEGER
An integer.
LISP_TYPE_REAL
A real.
LISP_TYPE_STRING
A string.
LISP_TYPE_CONS
A cons-pair.
LISP_TYPE_BOOLEAN
A boolean.
LISP_TYPE_PATTERN_CONS
A cons-pair of a pattern. The interpretation of these should be left to the function lisp_compile_pattern.
LISP_TYPE_EOF
Indicates that end-of-file occurred while reading the expression.
LISP_TYPE_PARSE_ERROR
Indicates a malformed expression.

Function: int lisp_nil_p (lisp_object_t* obj)
If obj is the empty list, returns a non-zero value, otherwise zero.

Function: int lisp_integer_p (lisp_object_t* obj)
If obj is an integer object, returns a non-zero value, otherwise zero.

Function: int lisp_integer (lisp_object_t* obj)
Returns the integer value for obj. This function must not be called when the type of obj is not LISP_TYPE_INTEGER.

Function: int lisp_real_p (lisp_object_t* obj)
If obj is a real object, returns a non-zero value, otherwise zero.

Function: float lisp_real (lisp_object_t* obj)
Returns the real value for obj. This function must not be called when the type of obj is neither LISP_TYPE_REAL nor LISP_TYPE_INTEGER.

Function: int lisp_symbol_p (lisp_object_t* obj)
If obj is a symbol, returns a non-zero value, otherwise zero.

Function: char* lisp_symbol (lisp_object_t* obj)
Returns the string for the symbol stored in obj. This function must not be called when the type of obj is not LISP_TYPE_SYMBOL.

Function: int lisp_string_p (lisp_object_t* obj)
If obj is a string object, returns a non-zero value, otherwise zero.

Function: char* lisp_string (lisp_object_t* obj)
Returns the string value for obj. This function must not be called when the type of obj is not LISP_TYPE_STRING.

Function: int lisp_boolean_p (lisp_object_t* obj)
If obj is a boolean object, returns a non-zero value, otherwise zero.

Function: int lisp_boolean (lisp_object_t* obj)
Returns the boolean value for obj. If obj represents false, the result is 0, otherwise some integer not equal to 0. This function must not be called when the type of obj is not LISP_TYPE_BOOLEAN.

Function: int lisp_cons_p (lisp_object_t* obj)
If obj is a cons, returns a non-zero value, otherwise zero.

Function: lisp_object_t* lisp_car (lisp_object_t* obj)
Returns the car of the cons stored in obj. This function must not be called when the type of obj is not LISP_TYPE_CONS.

Function: lisp_object_t* lisp_cdr (lisp_object_t* obj)
Returns the cdr of the cons stored in obj. This function must not be called when the type of obj is not LISP_TYPE_CONS.

Function: lisp_object_t* lisp_cxr (lisp_object_t* obj, const char* x)
x must be a string consisting only of the characters a and d. Returns the object obtained by applying lisp_car (for a) and lisp_cdr (for d) to obj, processing the characters of x in reverse order, i.e., from right to left. As an example, lisp_cxr(o,"ad") is equivalent to lisp_car(lisp_cdr(o)).
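The right-to-left traversal can be sketched on a hypothetical cons type. node_t and node_cxr are illustrative names only and do not show how lisp_cxr is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cons cell; not lispreader's lisp_object_t. */
typedef struct node {
    int value;            /* payload, used by leaf nodes only */
    struct node *car;
    struct node *cdr;
} node_t;

/* Applies car for 'a' and cdr for 'd', starting with the last
   character of x, so that "ad" means car(cdr(obj)). */
node_t *node_cxr(node_t *obj, const char *x)
{
    size_t i = strlen(x);

    while (i > 0) {
        char op = x[--i];

        assert(op == 'a' || op == 'd');
        assert(obj != NULL);
        obj = (op == 'a') ? obj->car : obj->cdr;
    }
    return obj;
}
```

For a three-element list, "a" selects the first element, "ad" the second and "add" the third, mirroring the usual Lisp names car, cadr and caddr.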

Function: int lisp_list_length (lisp_object_t* obj)
Returns the length of the list stored in obj. A list is defined as the empty list, which is represented by a null pointer, or a cons, the cdr of which is a list.

Function: lisp_object_t* lisp_list_nth_cdr (lisp_object_t* obj, int n)
Returns the result of iterating lisp_cdr n times on obj.

Function: lisp_object_t* lisp_list_nth (lisp_object_t* obj, int n)
Returns the car of the result of applying lisp_list_nth_cdr on obj with n.

Creating expressions

Function: lisp_object_t* lisp_nil ()
Returns the empty list.

Function: lisp_object_t* lisp_make_integer (int value)
Function: lisp_object_t* lisp_make_integer_with_allocator (allocator_t* allocator, int value)
Returns an integer object with the value value.

Function: lisp_object_t* lisp_make_real (float value)
Function: lisp_object_t* lisp_make_real_with_allocator (allocator_t* allocator, float value)
Returns a real object with the value value.

Function: lisp_object_t* lisp_make_symbol (const char* value)
Function: lisp_object_t* lisp_make_symbol_with_allocator (allocator_t* allocator, const char* value)
Returns a symbol object with the name value.

Function: lisp_object_t* lisp_make_string (const char* value)
Function: lisp_object_t* lisp_make_string_with_allocator (allocator_t* allocator, const char* value)
Returns a string object with the value value.

Function: lisp_object_t* lisp_make_cons (lisp_object_t* car, lisp_object_t* cdr)
Function: lisp_object_t* lisp_make_cons_with_allocator (allocator_t* allocator, lisp_object_t* car, lisp_object_t* cdr)
Returns a cons object with car car and cdr cdr.

Function: lisp_object_t* lisp_make_boolean (int value)
Function: lisp_object_t* lisp_make_boolean_with_allocator (allocator_t* allocator, int value)
Returns a boolean. Its value is false if value is 0, true otherwise.

Matching expressions against patterns

Function: int lisp_compile_pattern (lisp_object_t** obj, int* num_subs)
Prepares the expression *obj for use as a pattern in the lisp_match_pattern function. The expression is modified in the process. If num_subs is non-null, the number of pattern expressions (including all sub-expressions) will be written to *num_subs.

Returns 0 if an error occurred, non-zero on success. Note that the expression could have been modified even if the function returned 0.

Function: int lisp_match_pattern (lisp_object_t* pattern, lisp_object_t* obj, lisp_object_t** vars, int num_subs)
Matches the pattern pattern (which must have previously been compiled using lisp_compile_pattern) against obj, storing the resulting subexpressions in vars, if it is non-null. num_subs should be the number of sub-patterns in pattern, if vars is non-null. Otherwise, it is ignored.

Patterns are counted by their special opening parenthesis (`#?(') from left to right, beginning with 0. For example, in the pattern expression (a #?(or #?(integer) #?(string)) #?(symbol)), the or-pattern has index 0, the integer index 1, the string index 2 and the symbol index 3. This means that upon matching this pattern against (a 1 b), the integer 1 is stored in vars[0] and vars[1] and the symbol b is stored in vars[3]. The values for unmatched parts, like vars[2], are set to an expression of type LISP_TYPE_PARSE_ERROR.

Returns 0 if the match was unsuccessful, non-zero on success.
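Since patterns are numbered by their `#?(' openers, the size needed for vars can be estimated from the pattern text when the count from lisp_compile_pattern is not at hand. The following sketch counts the openers while skipping string literals and comments. count_pattern_openers is a hypothetical helper, not part of lispreader, and it assumes that `#?(' does not occur in any other context.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the occurrences of "#?(" in pattern outside of string
   literals and comments. Hypothetical helper, not part of lispreader. */
int count_pattern_openers(const char *pattern)
{
    const char *p = pattern;
    int count = 0;

    assert(pattern != NULL);
    while (*p != '\0') {
        if (*p == '"') {
            /* skip a string literal, honouring backslash escapes */
            p++;
            while (*p != '\0' && *p != '"') {
                if (*p == '\\' && p[1] != '\0')
                    p++;
                p++;
            }
            if (*p == '"')
                p++;
        } else if (*p == ';') {
            /* skip a comment up to the end of the line */
            while (*p != '\0' && *p != '\n')
                p++;
        } else if (strncmp(p, "#?(", 3) == 0) {
            count++;
            p += 3;
        } else {
            p++;
        }
    }
    return count;
}
```

For the pattern expression discussed above, the count is 4, matching the indices 0 to 3.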

Function: int lisp_match_string (char* pattern_string, lisp_object_t* obj, lisp_object_t** vars)
Reads an expression from pattern_string, compiles it using lisp_compile_pattern and matches obj against it using lisp_match_pattern, storing the resulting subexpressions in vars, if it is non-null.

Returns non-zero if reading and matching were successful, 0 otherwise.

Freeing expressions

Function: void lisp_free (lisp_object_t* obj)
Function: void lisp_free_with_allocator (allocator_t* allocator, lisp_object_t* obj)
Frees all memory occupied by obj, including all its subexpressions.

An Example

The following program reads expressions from standard input, prints the string `parse error' when a parse error occurs, exits on end-of-file and, if an entered expression is of the form (+ number1 number2), prints the sum of number1 and number2.

#include <stdio.h>
#include <lispreader.h>

int
main (void)
{
    lisp_object_t *obj;
    lisp_stream_t stream;

    /* read expressions from standard input */
    lisp_stream_init_file(&stream, stdin);

    while (1)
    {
        int type;

        obj = lisp_read(&stream);
        type = lisp_type(obj);
        if (type != LISP_TYPE_EOF && type != LISP_TYPE_PARSE_ERROR)
        {
            /* one entry for each `#?(' in the pattern below */
            lisp_object_t *vars[2];

            if (lisp_match_string("(+ #?(number) #?(number))",
                                  obj, vars))
                printf("%f\n", lisp_real(vars[0])
                               + lisp_real(vars[1]));

        }
        else if (type == LISP_TYPE_PARSE_ERROR)
            printf("parse error\n");

        /* free the object before acting on end-of-file */
        lisp_free(obj);

        if (type == LISP_TYPE_EOF)
            break;
    }

    return 0;
}

Function Index

f

  • free_pools

i

  • init_pools
  • init_pools_allocator

l

  • lisp_boolean
  • lisp_boolean_p
  • lisp_car
  • lisp_cdr
  • lisp_compile_pattern
  • lisp_cons_p
  • lisp_cxr
  • lisp_dump
  • lisp_free
  • lisp_free_with_allocator
  • lisp_integer
  • lisp_integer_p
  • lisp_list_length
  • lisp_list_nth
  • lisp_list_nth_cdr
  • lisp_make_boolean
  • lisp_make_boolean_with_allocator
  • lisp_make_cons
  • lisp_make_cons_with_allocator
  • lisp_make_integer
  • lisp_make_integer_with_allocator
  • lisp_make_real
  • lisp_make_real_with_allocator
  • lisp_make_string
  • lisp_make_string_with_allocator
  • lisp_make_symbol
  • lisp_make_symbol_with_allocator
  • lisp_match_pattern
  • lisp_match_string
  • lisp_nil
  • lisp_nil_p
  • lisp_read
  • lisp_read_from_string
  • lisp_read_from_string_with_allocator
  • lisp_read_with_allocator
  • lisp_real
  • lisp_real_p
  • lisp_stream_free_path
  • lisp_stream_init_any
  • lisp_stream_init_file
  • lisp_stream_init_path
  • lisp_stream_init_string
  • lisp_string
  • lisp_string_p
  • lisp_symbol
  • lisp_symbol_p
  • lisp_type

p

  • pools_alloc

r

  • reset_pools

This document was generated on 2 April 2005 using texi2html 1.56k.