Generated: June 1, 2002, 16:56:16
Copyright © 2002, Kurt Nørmark

Reference Manual of the Text Collection and Skipping Library

Kurt Nørmark ©    normark@cs.auc.dk    Department of Computer Science    Aalborg University    Denmark    

Source file: lib/collect-skip.scm
LAML Version 17.10 (June 1, 2002) full

This library contains a number of functions which collect and skip characters in a text file. These functions may, for instance, be used to parse a file.

It is assumed that the variable ip references an input port. The assignment of ip must be done externally to this library, and after the library is loaded.

The main functions can be found in the section Collection and skipping functions below.

This library has been developed as part of an SGML Document Type Definition (DTD) parser. Internal documentation exists for the DTD parser, and it therefore also covers some aspects of the functions in this library.

Table of Contents:
1. Look ahead buffer and queue.
2. Collection and skipping functions.
3. Useful predicates for skipping and collecting.

Alphabetic index:
advance-look-ahead    (advance-look-ahead n)    Provided that there are at least n characters in the reading queue, advance next-read with n positions.
char-predicate    (char-predicate ch)    Return a predicate function which matches the character ch.
collect-balanced-until    (collect-balanced-until char-pred-1 char-pred-2)    This collection procedure returns a balanced collection given two char predicates.
collect-until    (collect-until p)    Return the string collected from the input port ip.
collect-until-string    (collect-until-string str . inclusive)    Collect characters until str is encountered.
end-of-line?    (end-of-line? ch)    Is ch an end of line character?
ensure-look-ahead    (ensure-look-ahead n)    Make sure that there are at least n characters in the look ahead queue.
eof?    (eof? ch)    Is ch an end of file character?
is-white-space?    (is-white-space? ch)    Is ch a white space character?
look-ahead-char    (look-ahead-char)    Return the first character in the look ahead vector.
look-ahead-prefix    (look-ahead-prefix lgt)    Return a lgt character string from the peeked chars in the queue.
match-look-ahead?    (match-look-ahead? str)    Return whether the queue contents match the string str.
max-look-ahead    max-look-ahead    The length of the cyclic look ahead buffer.
max-look-ahead-prefix    (max-look-ahead-prefix)    Return the entire look ahead queue as a string.
peek-a-char    (peek-a-char)    Peek a character from the input port, but queue it for subsequent reading at "the peek end".
peek-chars    (peek-chars n)    Peek n characters.
put-back-a-char-read-end    (put-back-a-char-read-end ch)    Put ch back at the front end of the "queue" (where read-a-char operates).
put-back-a-char-write-end    (put-back-a-char-write-end ch)    Put ch back at the rear end of the queue (where peek-a-char operates).
put-back-a-string    (put-back-a-string str which-end)    Put str back in queue.
read-a-char    (read-a-char)    Read from the look ahead buffer.
read-a-string    (read-a-string n)    Read and return a string of length n.
reset-look-ahead-buffer    (reset-look-ahead-buffer)    Reset the look ahead buffer.
skip-string    (skip-string str if-not-message)    Assume that str is just in front of us.
skip-until-string    (skip-until-string str . inclusive)    Skip characters until str is encountered.
skip-while    (skip-while p)    Skip characters while p holds.

 

1.   LOOK AHEAD BUFFER AND QUEUE.
The functions in this section manipulate a look ahead queue, which sits between the input port ip and the application. Via this buffer it is possible to implement look ahead on the input port.


max-look-ahead


Form
max-look-ahead

Description
The length of the cyclic look ahead buffer. Predefined to 2000 characters.


reset-look-ahead-buffer


Form
(reset-look-ahead-buffer)

Description
Reset the look ahead buffer.


peek-a-char


Form
(peek-a-char)

Description
Peek a character from the input port, but queue it for subsequent reading at "the peek end". This function always reads one character via read-char.


peek-chars


Form
(peek-chars n)

Description
Peek n characters.


read-a-char


Form
(read-a-char)

Description
Read from the look ahead buffer. Only if this buffer is empty, read from the port. Reads from "the read end" of the queue.
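The interplay between peek-a-char and read-a-char can be illustrated with a small, self-contained model. This is only a sketch: it keeps the queue as a list rather than a cyclic buffer, reads from a string port, and all names prefixed with demo- are hypothetical and not part of the library.

```scheme
;; Illustrative model only. The library itself uses a cyclic buffer and the
;; global port ip; the demo- names below are hypothetical.
(define demo-ip (open-input-string "<!ELEMENT"))
(define demo-queue '())   ; oldest character first

;; Read one character from the port and queue it at the peek end.
(define (demo-peek-a-char)
  (let ((ch (read-char demo-ip)))
    (set! demo-queue (append demo-queue (list ch)))
    ch))

;; Take the oldest queued character; touch the port only if the queue is empty.
(define (demo-read-a-char)
  (if (null? demo-queue)
      (read-char demo-ip)
      (let ((ch (car demo-queue)))
        (set! demo-queue (cdr demo-queue))
        ch)))

(define peeked-1 (demo-peek-a-char))  ; #\<, now queued
(define peeked-2 (demo-peek-a-char))  ; #\!, now queued
(define read-1 (demo-read-a-char))    ; #\<, taken from the queue
(define read-2 (demo-read-a-char))    ; #\!, taken from the queue
(define read-3 (demo-read-a-char))    ; #\E, queue is empty, so read from the port
```

Peeked characters are thus seen a second time when they are read, which is what makes look ahead possible.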


read-a-string


Form
(read-a-string n)

Description
Read and return a string of length n. Should take eof into account such that a string shorter than n can be returned.


look-ahead-prefix


Form
(look-ahead-prefix lgt)

Description
Return a lgt character string from the peeked chars in the queue.


max-look-ahead-prefix


Form
(max-look-ahead-prefix)

Description
Return the entire look ahead queue as a string.


look-ahead-char


Form
(look-ahead-char)

Description
Return the first character in the look ahead vector. As a precondition, the look ahead queue is assumed not to be empty.


match-look-ahead?


Form
(match-look-ahead? str)

Description
Return whether the queue contents match the string str. The queue must contain (length str) characters in order to call this function. If not, an error is issued. This is a proper function (apart from the error condition).


ensure-look-ahead


Form
(ensure-look-ahead n)

Description
Make sure that there are at least n characters in the look ahead queue.


put-back-a-char-write-end


Form
(put-back-a-char-write-end ch)

Description
Put ch back at the rear end of the queue (where peek-a-char operates).


put-back-a-char-read-end


Form
(put-back-a-char-read-end ch)

Description
Put ch back at the front end of the "queue" (where read-a-char operates).


put-back-a-string


Form
(put-back-a-string str which-end)

Description
Put str back in the queue. The second parameter which-end controls whether to put it back at the read end or the write end. Possible values are 'read-end and 'write-end.


advance-look-ahead


Form
(advance-look-ahead n)

Description
Provided that there are at least n characters in the reading queue, advance next-read with n positions. Hereby queued characters are skipped. Not used in DTD parsing.


 

2.   COLLECTION AND SKIPPING FUNCTIONS.
This section contains a number of higher level collection and skipping functions. These functions use the functions from the previous section. The functions in this section are the most important in this library.


collect-until


Form
(collect-until p)

Description
Return the string collected from the input port ip. The collection stops when the predicate p holds on the character read. The last read character (the first character on which p holds) is left as the oldest character in the queue.
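The behaviour described for collect-until can be sketched with standard Scheme alone. The stand-in below is hypothetical (the name demo-collect-until and the explicit port parameter are not part of the library), and it uses the standard peek-char in place of the look ahead queue. Stopping at end of file is an assumption of the sketch.

```scheme
;; Hypothetical stand-in for collect-until on an explicit port.
;; Standard peek-char plays the role of the look ahead queue.
(define (demo-collect-until p port)
  (let loop ((chars '()))
    (let ((ch (peek-char port)))
      (if (or (eof-object? ch) (p ch))      ; assumption: also stop at end of file
          (list->string (reverse chars))    ; ch stays unread on the port
          (begin
            (read-char port)
            (loop (cons ch chars)))))))

(define demo-port (open-input-string "name=value"))
(define collected
  (demo-collect-until (lambda (ch) (char=? ch #\=)) demo-port))
(define next-char (read-char demo-port))
;; collected is "name"; next-char is #\=, the character on which p held
```

The point to notice is that the terminating character is not consumed, so a following read still sees it.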


collect-balanced-until


Form
(collect-balanced-until char-pred-1 char-pred-2)

Description
This collection procedure returns a balanced collection given two char predicates. Return the string collected from the input port ip. The collection stops when the predicate char-pred-2 holds on the character read. However, if char-pred-1 becomes true it has to be matched by char-pred-2 without causing a termination of the collection. The last read character (the first character on which char-pred-2 holds) is processed by this function. As a precondition assume that if char-pred-1 holds then char-pred-2 does not hold, and vice versa.
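One possible reading of the balanced collection is sketched below in standard Scheme. The name demo-collect-balanced-until and the explicit port are hypothetical. The sketch assumes that the input starts with an opening character and that the matching closing character is consumed and included in the result; the library's own treatment of these details may differ.

```scheme
;; Hypothetical sketch of a balanced collection on an explicit port.
;; Assumption: the closing character that ends the collection is consumed
;; and included in the returned string.
(define (demo-collect-balanced-until open? close? port)
  (let loop ((chars '()) (depth 0))
    (let ((ch (read-char port)))
      (if (eof-object? ch)
          (list->string (reverse chars))
          (let ((depth-1 (cond ((open? ch) (+ depth 1))
                               ((close? ch) (- depth 1))
                               (else depth)))
                (chars-1 (cons ch chars)))
            (if (and (close? ch) (<= depth-1 0))
                (list->string (reverse chars-1))
                (loop chars-1 depth-1)))))))

(define balanced-port (open-input-string "(a (b) c) d"))
(define balanced
  (demo-collect-balanced-until
    (lambda (ch) (char=? ch #\())
    (lambda (ch) (char=? ch #\)))
    balanced-port))
;; balanced is "(a (b) c)"; the inner ")" does not end the collection
```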


skip-while


Form
(skip-while p)

Description
Skip characters while p holds. The first character on which p fails is left as the oldest character in the queue. The predicate is taken not to hold at end of file.
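A minimal sketch of this skipping behaviour, again in standard Scheme with a hypothetical name (demo-skip-while) and an explicit port instead of the library's ip and queue:

```scheme
;; Hypothetical stand-in for skip-while on an explicit port.
;; At end of file the predicate is treated as failing, so the loop stops.
(define (demo-skip-while p port)
  (let loop ()
    (let ((ch (peek-char port)))
      (if (and (not (eof-object? ch)) (p ch))
          (begin (read-char port) (loop))
          #t))))

(define ws-port (open-input-string "   <tag>"))
(demo-skip-while char-whitespace? ws-port)
(define first-kept (read-char ws-port))
;; first-kept is #\<, the first character on which the predicate failed
```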


skip-string


Form
(skip-string str if-not-message)

Description
Assume that str is just in front of us. Skip through it. If str is not in front of us, a fatal error occurs with if-not-message as error message.


skip-until-string


Form
(skip-until-string str . inclusive)

Description
Skip characters until str is encountered. If inclusive, also skip str. It is assumed as a precondition that the length of str is at least one.


collect-until-string


Form
(collect-until-string str . inclusive)

Description
Collect characters until str is encountered. If inclusive, also collect str. It is assumed as a precondition that the length of str is at least one.
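The effect of the optional inclusive parameter can be illustrated on an in-memory string. The helpers below are hypothetical and work on a string rather than on the port ip. Returning the whole text when str never occurs is an assumption of the sketch, not documented behaviour of the library.

```scheme
;; Hypothetical helper: index of the first occurrence of str in text, or #f.
(define (demo-index-of str text)
  (let ((n (string-length str))
        (m (string-length text)))
    (let loop ((i 0))
      (cond ((> (+ i n) m) #f)
            ((string=? (substring text i (+ i n)) str) i)
            (else (loop (+ i 1)))))))

;; Hypothetical stand-in for collect-until-string working on a string.
(define (demo-collect-until-string str text inclusive?)
  (let ((i (demo-index-of str text)))
    (cond ((not i) text)   ; assumption: collect everything if str is absent
          (inclusive? (substring text 0 (+ i (string-length str))))
          (else (substring text 0 i)))))

(define comment-text "<!-- note --> rest")
(define exclusive-result (demo-collect-until-string "-->" comment-text #f))
(define inclusive-result (demo-collect-until-string "-->" comment-text #t))
;; exclusive-result is "<!-- note "
;; inclusive-result is "<!-- note -->"
```

Skipping until a string follows the same pattern, except that the characters are discarded instead of returned.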


 

3.   USEFUL PREDICATES FOR SKIPPING AND COLLECTING.


is-white-space?


Form
(is-white-space? ch)

Description
Is ch a white space character?


end-of-line?


Form
(end-of-line? ch)

Description
Is ch an end of line character?


eof?


Form
(eof? ch)

Description
Is ch an end of file character?


char-predicate


Form
(char-predicate ch)

Description
Return a predicate function which matches the character ch. This is a higher order function.
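A higher order function of this kind can be sketched as follows. The name demo-char-predicate is hypothetical, and the use of eqv? is an assumption; the library's own definition may compare characters differently.

```scheme
;; Hypothetical sketch: build a predicate that recognizes one given character.
;; eqv? is used so that non-character values, such as an end of file object,
;; simply fail to match instead of causing an error.
(define (demo-char-predicate ch)
  (lambda (c) (eqv? c ch)))

(define is-gt? (demo-char-predicate #\>))
(define gt-match (is-gt? #\>))      ; #t
(define gt-mismatch (is-gt? #\<))   ; #f
```

Such a predicate is a natural argument to the collection and skipping functions of the previous section, which all take character predicates.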


This documentation has been extracted automatically from the Scheme source file by means of the Schemedoc tool