lispreader
What is lispreader?
lispreader is a small library for reading expressions in Lisp
syntax. It was originally written to facilitate simple exchange of
structured data between processes, but its main purpose is now to provide
a framework for reading configuration files. To simplify interpretation
of the data read, lispreader also provides functions for simple
matching of expressions against patterns.
lispreader is also used in several applications to read and
write data files. Lisp syntax is very suitable for doing this,
especially if the data is organized hierarchically.
What is lispreader not?
lispreader is not a Lisp system in that it cannot, by itself,
interpret Lisp expressions. It only provides a subset of the features of
libraries like Guile or librep (namely the reading of expressions) and
thus does not compete directly with those. If all you need is a simple
way to read Lisp expressions without interpreting them with Lisp
semantics, you will probably be satisfied with lispreader.
lispreader is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 2 of the License, or (at
your option) any later version.
lispreader is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with lispreader; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
lispreader
lispreader is available for free download on the World Wide Web
at the URL http://www.complang.tuwien.ac.at/schani/lispreader/.
lispreader in your programs
lispreader consists of only a few C files, namely
`lispreader.c', `lispreader.h', `lispscan.h',
`allocator.c', `allocator.h', `pools.c', and
`pools.h'. To incorporate lispreader in your own
programs, just add these files to your own program's files.
lispreader can read lists consisting of other lists, symbols,
strings, integers, real numbers and booleans. It also provides a syntax
for specifying patterns.
Comments are started by the semicolon (;) and reach until the end
of the line:
    ; this line is completely ignored
Lists consist of so-called cons-pairs, or conses. A cons is constituted by its car and cdr. A list is defined as either being the empty list, which is no cons at all, or as being a cons, the cdr of which is a list. The cars of these conses are the actual elements of the list.
An example: The list (a b c) consists of three conses, the cars
of which are the symbols a, b and c. The structure
can be depicted using a box diagram:
     _ _     _ _     _ _
    |_|_|-->|_|_|-->|_|_|-->nil
     |       |       |
     v       v       v
     a       b       c
Each box denotes a cons, with the left half being its car and the right
half being its cdr. nil denotes the empty list.
It is also possible to explicitly set the cdr of the last cons using the
dot-notation: (a b . c) can be illustrated thus:
     _ _     _ _
    |_|_|-->|_|_|-->c
     |       |
     v       v
     a       b
Note that this is technically not a list. Since the empty list can be
written as (), the list (a b c) can be written using the
dot-notation as (a . (b . (c . ()))).
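The definition of a list given above can be sketched in C. The cell_t type and both functions below are hypothetical and exist only for this illustration; they are not lispreader's own object representation. A NULL pointer stands for the empty list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cell type, for illustration only (not lispreader's). */
typedef struct cell {
    const char *symbol;   /* non-NULL for a symbol, NULL for a cons */
    struct cell *car;     /* element held by a cons */
    struct cell *cdr;     /* rest of the chain; NULL is the empty list */
} cell_t;

/* Fills in c as a cons with the given car and cdr. */
void make_cons(cell_t *c, cell_t *car, cell_t *cdr)
{
    assert(c != NULL);
    c->symbol = NULL;
    c->car = car;
    c->cdr = cdr;
}

/* Returns 1 if obj is the empty list or a cons whose cdr is a list,
   0 otherwise. */
int is_list(const cell_t *obj)
{
    while (obj != NULL) {
        if (obj->symbol != NULL)
            return 0;     /* the chain ends in a symbol, as in (a b . c) */
        obj = obj->cdr;
    }
    return 1;             /* the chain ends in the empty list */
}

/* Counts the conses of a chain, stopping at the first non-cons cdr. */
int count_conses(const cell_t *obj)
{
    int n = 0;

    while (obj != NULL && obj->symbol == NULL) {
        n++;
        obj = obj->cdr;
    }
    return n;
}
```

With these definitions, a chain built like (a b c) has three conses and is a list, while a chain built like (a b . c) has two conses and is not.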
Symbols are pretty much everything that cannot be interpreted as anything else. They can have arbitrary length.
As integers are internally represented by int values, their range
is restricted to the range of the int data type. Bignums are not
supported.
Reals are internally represented by values of the float
datatype. lispreader cannot yet interpret exponential notation or
reals without digits before the dot.
Strings are delimited on both sides by double quotes (`"'). The backslash (`\') is used as escape character. The sequence `\n' is interpreted as newline, `\t' as tab. All other escape sequences evaluate to the char after the backslash, e.g. `\\' denotes the backslash itself and `\"' denotes the double quote.
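The escape rules above can be sketched as a small decoding routine. The function unescape is a hypothetical helper written for this illustration, not a lispreader function; it works on the text between the double quotes and assumes that a lone backslash at the very end stands for itself.

```c
#include <assert.h>
#include <stddef.h>

/* Decodes the body of a string literal (the text between the double
   quotes) into out, which must have room for strlen(in) + 1 bytes.
   Returns the decoded length. Hypothetical helper, not part of
   lispreader. */
size_t unescape(const char *in, char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (*in != '\0') {
        char c = *in++;

        if (c == '\\' && *in != '\0') {
            c = *in++;
            if (c == 'n')
                c = '\n';             /* \n is a newline */
            else if (c == 't')
                c = '\t';             /* \t is a tab */
            /* any other escaped character stands for itself */
        }
        out[n++] = c;
    }
    out[n] = '\0';
    return n;
}
```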
The boolean values true and false are represented by #t and #f,
respectively.
Patterns are used to represent classes of expressions. They contain no other value than the types of expressions they match against.
Patterns are written using a special list syntax where the opening parenthesis is replaced by `#?('. There are patterns for matching all types of simple expressions:
#?(symbol)
#?(string)
#?(integer)
#?(real)
#?(boolean) matches #t or #f.
Three other patterns have a wider scope:
#?(list)
#?(number)
#?(any)
It is also possible to construct a pattern matching at least one out of
a given set of expressions, which themselves can contain patterns, using
the or pattern. For example, the pattern
#?(or (a . #?(list)) (b #?(integer))) matches the list (a #t 43)
as well as the list (b 1), but not the list (b #f). As another
example, #?(boolean) is equivalent to #?(or #t #f).
Most applications of lispreader use it to quickly read a bit of
data from a file, process it, and then read the next bit, until the
end of the file. If the file is big, it is an advantage if reading is
fast. Part of the reading process is allocating memory for the data
read, so fast memory allocation results in better reading performance.
lispreader comes with a memory allocator optimized for this
application pattern, called the "pools" allocator. It is very fast,
can allocate lots of small chunks of memory with virtually no overhead
apart from the alignment padding, and can free all allocated memory at
once. The downside is that freeing all allocated memory is the only
way of freeing.
Using pools is not mandatory for using lispreader, but it
increases performance significantly (by about a factor of 2) compared
to the standard malloc allocator. If you never read files larger than
a few tens of kilobytes, you will probably never notice, though.
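The trade-off described above (fast allocation of small chunks, padding only for alignment, and freeing everything at once) can be illustrated with a minimal arena sketch. The arena_t type and the arena_* functions are hypothetical names for this example; lispreader's actual pools allocator is not shown here and may be organized quite differently. The alignment of 16 bytes is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { ARENA_ALIGN = 16 };  /* assumed alignment, for illustration */

/* Hypothetical single-block arena; not lispreader's pools_t. */
typedef struct {
    unsigned char *base;    /* block obtained from malloc */
    size_t size;            /* total bytes in the block */
    size_t used;            /* bytes handed out so far */
} arena_t;

/* Returns non-zero upon success, zero upon failure. */
int arena_init(arena_t *a, size_t size)
{
    assert(a != NULL);
    a->base = malloc(size);
    a->size = (a->base != NULL) ? size : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Hands out the next chunk; the only overhead is alignment padding. */
void *arena_alloc(arena_t *a, size_t size)
{
    size_t padded = (size + ARENA_ALIGN - 1) / ARENA_ALIGN * ARENA_ALIGN;
    void *chunk;

    if (padded < size || padded > a->size - a->used)
        return NULL;        /* overflow or out of space */
    chunk = a->base + a->used;
    a->used += padded;
    return chunk;
}

/* Frees all chunks at once; single chunks cannot be freed. */
void arena_reset(arena_t *a)
{
    a->used = 0;
}

/* Returns the block itself to the system. */
void arena_destroy(arena_t *a)
{
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}
```

An application following the read-process-repeat pattern would allocate from the arena while reading one expression, process it, and then call arena_reset before reading the next one.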
pools_alloc. Returns non-zero upon success, zero upon failure.
The allocator_t data structure is lispreader's interface
to your memory allocator of choice:
    typedef struct
    {
        void* (*alloc) (void *allocator_data, size_t size);
        void (*free) (void *allocator_data, void *chunk);
        void *allocator_data;
    } allocator_t;
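An allocator must provide two functions: alloc, which returns a chunk of at least size bytes, and free, which releases a chunk previously obtained from alloc.
Both functions are always passed the value of allocator_data as their first argument.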
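As an illustration of this interface, the following sketch wraps malloc and free in an allocator that counts chunks. The allocator_t definition is repeated from above so that the example compiles on its own; counters_t, counting_alloc, counting_free and counting_allocator_init are hypothetical names that do not exist in lispreader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Repeated from the definition above so the sketch is self-contained. */
typedef struct {
    void* (*alloc) (void *allocator_data, size_t size);
    void (*free) (void *allocator_data, void *chunk);
    void *allocator_data;
} allocator_t;

/* Hypothetical bookkeeping data, reached through allocator_data. */
typedef struct {
    size_t live;    /* chunks allocated and not yet freed */
    size_t total;   /* chunks allocated overall */
} counters_t;

static void *counting_alloc(void *allocator_data, size_t size)
{
    counters_t *c = allocator_data;
    void *chunk = malloc(size);

    if (chunk != NULL) {
        c->live++;
        c->total++;
    }
    return chunk;
}

static void counting_free(void *allocator_data, void *chunk)
{
    counters_t *c = allocator_data;

    if (chunk != NULL) {
        c->live--;
        free(chunk);
    }
}

/* Sets up a so that both functions receive c as their first argument. */
void counting_allocator_init(allocator_t *a, counters_t *c)
{
    assert(a != NULL && c != NULL);
    c->live = 0;
    c->total = 0;
    a->alloc = counting_alloc;
    a->free = counting_free;
    a->allocator_data = c;
}
```

Keeping the state behind allocator_data rather than in globals means several independent allocators of the same kind can be in use at once.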
All lispreader functions which allocate or free (non-temporary)
memory come in two versions: The "normal" version uses the standard
malloc/free memory allocation mechanism. The *_with_allocator
version takes a pointer to an allocator_t as its first argument and
allocates and frees memory via that allocator.
malloc and free memory allocation functions.
lispreader Reference
lisp_stream_free_path to close the file.
This function should be preferred over lisp_stream_init_file
because it uses memory mapping if possible, resulting in better
parsing performance.
EOF upon end-of-file and on all invocations succeeding the invocation
that first returned EOF. unget_char is called to push back a character for
reading it again. The character pushed back is always the character
returned by the last call to next_char. The next call to
next_char must return that character. unget_char is never
called twice without at least a call to next_char in
between. data is always passed to next_char and
unget_char. No other action whatsoever is performed on data,
i.e. should it point to a dynamically allocated memory region, the
application is responsible for freeing it after the stream has been
closed.
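The contract for next_char and unget_char can be sketched with a character source backed by a string. The function signatures used here (int next_char(void *data) and void unget_char(char c, void *data)) are assumptions, as are the names string_source_t, string_next_char and string_unget_char; check `lispreader.h' for the actual prototypes before relying on them. The sketch also assumes that EOF itself is never pushed back.

```c
#include <assert.h>
#include <stdio.h>    /* EOF, size_t */

/* Hypothetical state, passed to both callbacks as data. */
typedef struct {
    const char *text;   /* NUL-terminated input */
    size_t pos;         /* index of the next character to return */
} string_source_t;

/* Returns the next character, or EOF at the end of the text and on
   every later call (pos is not advanced past the terminator). */
int string_next_char(void *data)
{
    string_source_t *s = data;

    if (s->text[s->pos] == '\0')
        return EOF;
    return (unsigned char)s->text[s->pos++];
}

/* Pushes back c, which must be the character returned by the last
   call to string_next_char. */
void string_unget_char(char c, void *data)
{
    string_source_t *s = data;

    assert(s->pos > 0 && s->text[s->pos - 1] == c);
    s->pos--;
}
```

Because only the most recently returned character is ever pushed back, stepping the position back by one is sufficient; no separate push-back buffer is needed.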
lisp_free/lisp_free_with_allocator.
lisp_free/lisp_free_with_allocator. Although buf may contain more than one
expression, only the first is read. If you need to read more than one
expression from a string, use lisp_read/lisp_read_with_allocator
on a string stream created by lisp_stream_init_string.
lisp_read, to out.
obj.
The returned type can be one of
LISP_TYPE_NIL
LISP_TYPE_SYMBOL
LISP_TYPE_INTEGER
LISP_TYPE_REAL
LISP_TYPE_STRING
LISP_TYPE_CONS
LISP_TYPE_BOOLEAN
LISP_TYPE_PATTERN_CONS
lisp_compile_pattern
.
LISP_TYPE_EOF
LISP_TYPE_PARSE_ERROR
LISP_TYPE_INTEGER.
LISP_TYPE_REAL or LISP_TYPE_INTEGER.
LISP_TYPE_SYMBOL.
LISP_TYPE_STRING.
0, otherwise some integer not equal to 0. This function must not be
called when the type of obj is not LISP_TYPE_BOOLEAN.
LISP_TYPE_CONS.
LISP_TYPE_CONS.
a and d. Returns the object resulting from applying lisp_car and
lisp_cdr to obj according to x, read in reverse order, with a
standing for the former and d for the latter. As an example,
lisp_cxr(o,"ad") is equivalent to lisp_car(lisp_cdr(o)).
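The reverse-order rule can be made concrete with a small sketch on a toy data type. The node_t type and the node_car, node_cdr and node_cxr functions are hypothetical stand-ins written for this example; they mirror the described behavior but are not lispreader's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical object type: either a cons or an integer atom. */
typedef struct node {
    struct node *car;   /* valid only if is_cons is non-zero */
    struct node *cdr;   /* valid only if is_cons is non-zero */
    int value;          /* payload of an atom */
    int is_cons;
} node_t;

static node_t *node_car(node_t *n)
{
    assert(n != NULL && n->is_cons);
    return n->car;
}

static node_t *node_cdr(node_t *n)
{
    assert(n != NULL && n->is_cons);
    return n->cdr;
}

/* Applies the operations named in x from its last character to its
   first, so "ad" means car(cdr(obj)). */
node_t *node_cxr(node_t *obj, const char *x)
{
    size_t i = strlen(x);

    while (i > 0) {
        char op = x[--i];

        assert(op == 'a' || op == 'd');
        obj = (op == 'a') ? node_car(obj) : node_cdr(obj);
    }
    return obj;
}
```

Reading the string right to left matches the way nested calls are written: the innermost call, which is applied first, appears last.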
lisp_cdr n times on obj.
lisp_list_nth_cdr on obj with n.
0, true otherwise.
*obj for use as a pattern in the lisp_match_pattern function. The
expression is modified in the process. If num_subs is non-null, the
number of pattern expressions (including all sub-expressions) will be
written to *num_subs.
Returns 0 if an error occurred, non-zero on success. Note that
the expression could have been modified even if the function returned 0.
lisp_compile_pattern) against obj, storing
the resulting subexpressions in vars, if it is
non-null. num_subs should be the number of sub-patterns in
pattern, if vars is non-null. Otherwise, it is ignored.
Patterns are counted by their special opening parenthesis (`#?(')
from left to right, beginning with 0. For example, in the pattern
expression (a #?(or #?(integer) #?(string)) #?(symbol)), the
or-pattern has index 0, the integer index 1, the string index 2 and
the symbol index 3. This means that upon matching this pattern
against (a 1 b), the integer 1 is stored in vars[0] and vars[1] and
the symbol b is stored in vars[3]. The values for unmatched parts,
like vars[2], are set to an expression of type LISP_TYPE_PARSE_ERROR.
Returns 0 if the match was unsuccessful, non-zero on success.
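The numbering rule can be checked with a short sketch that works on the textual form of a pattern. Both functions are hypothetical helpers for this illustration and are not part of lispreader; they simply scan for `#?(' and therefore assume that this sequence does not occur inside a string literal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Counts the pattern openers in text, i.e. the size needed for vars. */
int count_pattern_openers(const char *text)
{
    int count = 0;

    assert(text != NULL);
    while ((text = strstr(text, "#?(")) != NULL) {
        count++;
        text += 3;      /* continue after this opener */
    }
    return count;
}

/* Returns the index of the pattern whose opener starts at the given
   byte offset, or -1 if no opener starts there. */
int pattern_index_at(const char *text, size_t offset)
{
    const char *p = text;
    int index = 0;

    assert(text != NULL);
    while ((p = strstr(p, "#?(")) != NULL) {
        if ((size_t)(p - text) == offset)
            return index;
        index++;
        p += 3;
    }
    return -1;
}
```

For the pattern expression used in the example above, count_pattern_openers yields 4, which is why vars needs four elements even though only two expressions are matched directly.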
lisp_compile_pattern and matches obj against it using
lisp_match_pattern, storing the resulting subexpressions in
vars, if it is non-null.
Returns non-zero if reading and matching were successful, 0 otherwise.
The following program reads expressions from standard input, prints the
string `parse error' when a parse error occurs, exits on
end-of-file and, if an entered expression is of the form
(+ number1 number2), prints the sum of number1 and number2.
    #include <stdio.h>      /* printf, stdin */
    #include <lispreader.h>

    int
    main (void)
    {
        lisp_object_t *obj;
        lisp_stream_t stream;

        lisp_stream_init_file(&stream, stdin);

        while (1)
        {
            int type;

            obj = lisp_read(&stream);
            type = lisp_type(obj);
            if (type != LISP_TYPE_EOF && type != LISP_TYPE_PARSE_ERROR)
            {
                /* one slot per pattern opener in the pattern below */
                lisp_object_t *vars[2];

                if (lisp_match_string("(+ #?(number) #?(number))", obj, vars))
                    printf("%f\n", lisp_real(vars[0]) + lisp_real(vars[1]));
            }
            else if (type == LISP_TYPE_PARSE_ERROR)
                printf("parse error\n");
            lisp_free(obj);

            if (type == LISP_TYPE_EOF)
                break;
        }

        return 0;
    }
This document was generated on 2 April 2005 using texi2html 1.56k.