Generated: May 6, 2005, 18:55:11

LAML Background

I have used Scheme for approximately 18 years. As such it was natural for me to use Scheme for CGI programming purposes a few years ago. In order to use Scheme for CGI programming it was necessary to generate text with HTML markup. Therefore I implemented a simple library of functions, which helped me do so. I only implemented the functions that I actually needed. Concretely, I wrote one function for each HTML tag, in order to be able to mirror HTML as closely as possible. Today this is the html.scm library (manual), which is now completely obsolete. However, a number of tools still rely on this library.

It soon became clear that this ad hoc approach was inappropriate - and almost never ending. Therefore I generated a set of functions from a list of single tags and a list of double tags. This is the html-v1.scm library (manual) - also now completely obsolete.
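
The generation step can be sketched roughly as follows. This is not the html-v1.scm source: the helper names (make-double-tag, make-single-tag, mirror) and the short tag lists are made up for illustration, and the sketch builds closures at run time, which may differ from how the real library was produced.

```scheme
;; Hypothetical sketch: generate mirror functions from two tag lists.

;; Build a mirror function for a double tag such as <b>...</b>.
(define (make-double-tag name)
  (lambda contents
    (string-append "<" name ">"
                   (apply string-append contents)
                   "</" name ">")))

;; Build a mirror function for a single tag such as <br>.
(define (make-single-tag name)
  (lambda ()
    (string-append "<" name ">")))

;; Tiny excerpts of the two tag lists, chosen for illustration only.
(define double-tags '("b" "em" "p" "title"))
(define single-tags '("br" "hr"))

;; Association list from tag name to its generated mirror function.
(define mirror-table
  (append
   (map (lambda (name) (cons name (make-double-tag name))) double-tags)
   (map (lambda (name) (cons name (make-single-tag name))) single-tags)))

;; Look up a generated mirror function by tag name.
(define (mirror name)
  (let ((entry (assoc name mirror-table)))
    (if entry
        (cdr entry)
        (error "No mirror function for tag:" name))))

(define b  (mirror "b"))
(define p  (mirror "p"))
(define br (mirror "br"))

(display (p "Some " (b "bold") " text" (br)))
(newline)
;; prints: <p>Some <b>bold</b> text<br></p>
```

Adding a tag then amounts to adding a name to one of the two lists, instead of writing another function by hand.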

It turned out to be difficult to keep the list of HTML mirror functions in accord with the HTML standard - at that time HTML 4.0. The next natural step was therefore to extract the information about HTML from the Document Type Definition (DTD). I searched for a good and convenient representation of the HTML DTD, but could not find one. As a consequence I wrote an ad hoc parser with which I was able to extract all information about HTML 4.0, as well as other versions of course. On this basis I constructed an exact mirror of HTML in Scheme, which included attribute checking and assured well-formedness, but no document validation.
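
To give an idea of what attribute checking involves, here is a minimal sketch. It assumes a hand-written table, allowed-attributes, standing in for the information that the DTD parser would extract; the table contents are a small excerpt, and the function names are hypothetical, not part of the LAML libraries.

```scheme
;; Hypothetical sketch: check attribute names against a per-element table.

;; Excerpt of allowed attributes per element. In a real mirror this table
;; would be derived from the DTD, not written by hand.
(define allowed-attributes
  '((a href name title)
    (img src alt width height)))

;; Extract the names from a property list such as (href "x.html" title "X").
(define (attribute-names plist)
  (if (or (null? plist) (null? (cdr plist)))
      '()
      (cons (car plist) (attribute-names (cddr plist)))))

;; Return the attribute names in plist that the element does not allow.
(define (invalid-attributes element plist)
  (let ((entry (assq element allowed-attributes)))
    (if (not entry)
        (error "Unknown element:" element)
        (let loop ((names (attribute-names plist)) (bad '()))
          (cond ((null? names) (reverse bad))
                ((memq (car names) (cdr entry)) (loop (cdr names) bad))
                (else (loop (cdr names) (cons (car names) bad))))))))

(display (invalid-attributes 'a '(href "sec1.html" colour "red")))
(newline)
;; prints: (colour)
```

An empty result means that every attribute name is known for the element; a mirror function could report any other result as a warning or a fatal error.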

As the next development I have fine tuned this mirror to validate the document. I.e., while generating the HTML code the composition of the document is checked against the DTD. If there are errors they are reported as warnings (or fatal errors, if you so desire). In order to do the validation, we work with an internal document represented as a syntax tree (in terms of a nested list structure). We transform this tree to an HTML string by a tree traversal (using the render function), and we are careful not to generate a lot of garbage during this process.
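
The idea of rendering a nested list by a tree traversal can be sketched as below. The node layout (tag, attribute property list, children) and the name render-node are assumptions made for this example; the internal representation and the render function in LAML may well look different. The sketch writes directly to an output port instead of concatenating intermediate strings, which is one simple way of limiting garbage. It does no escaping of special characters.

```scheme
;; Hypothetical sketch: render a syntax tree held as a nested list.
;; A node is (tag attributes child ...), where tag is a symbol, attributes
;; is a property list, and each child is a string or another node.
(define doc
  '(p ()
      "A text with a "
      (a (href "subsection/sec1.html") "link")
      " to a "
      (b () "subsection")
      "."))

;; Write the attributes as name="value" pairs, each preceded by a space.
(define (render-attributes plist port)
  (if (and (pair? plist) (pair? (cdr plist)))
      (begin
        (display " " port)
        (display (car plist) port)
        (display "=\"" port)
        (display (cadr plist) port)
        (display "\"" port)
        (render-attributes (cddr plist) port))))

;; Traverse the tree depth first and write the markup to port.
(define (render-node node port)
  (if (string? node)
      (display node port)
      (let ((tag (car node)))
        (display "<" port)
        (display tag port)
        (render-attributes (cadr node) port)
        (display ">" port)
        (for-each (lambda (child) (render-node child port)) (cddr node))
        (display "</" port)
        (display tag port)
        (display ">" port))))

(render-node doc (current-output-port))
(newline)
;; prints: <p>A text with a <a href="subsection/sec1.html">link</a> to a <b>subsection</b>.</p>
```

Because the document exists as a tree before it is rendered, checks such as validation can inspect it as data before any text is produced.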

As the latest development we have established the XML-in-LAML facility, which makes it possible to create mirrors of XML languages in Scheme. Several XML languages can co-exist. All three XHTML languages (strict, transitional, frameset), SVG, and LENO (our own language for presentation of teaching material) use the XML-in-LAML facility. From version 20, the generation of the validation predicates of the XML mirror functions is fully automated.

Soon it became clear that the HTML mirror functions can be used directly when we write static HTML files. Thus instead of using HTML or XML directly, we use the Scheme mirror functions. (Of course many people use WYSIWYG editors, but these are complicated in their own way, and you are not 100% in control. These are not for programmers...) Thus, instead of writing:

  <point> 
    A text with a 
    <a href="subsection/sec1.html">link</a> 
    to a <b>subsection</b>.
  </point>

we write something like

  (point
    "A text with a" 
    (a "link" 'href "subsection/sec1.html")
    "to a" (b "subsection") _ ".")

point is not a standard tag, but a function we have programmed in Scheme. The point function can also be generated by the XML-in-LAML mirror generation tool.
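
As an illustration of the calling convention used above, the following sketch shows how a mirror function might separate attributes (a symbol followed by a value) from textual content. All names here (split-arguments, attributes->string, element) are invented for the example and are not the LAML implementation. The sketch also does not handle white space between content items or the _ marker, so the spaces are written explicitly in the strings.

```scheme
;; Hypothetical sketch: mirror functions that accept content strings and
;; attribute pairs in the same argument list.

;; Split the arguments into (contents . attributes). A symbol followed by
;; a value is taken as an attribute pair; everything else is content.
(define (split-arguments args)
  (let loop ((rest args) (contents '()) (attributes '()))
    (cond ((null? rest)
           (cons (reverse contents) (reverse attributes)))
          ((and (symbol? (car rest)) (pair? (cdr rest)))
           (loop (cddr rest)
                 contents
                 (cons (cons (car rest) (cadr rest)) attributes)))
          (else
           (loop (cdr rest) (cons (car rest) contents) attributes)))))

;; Turn ((href . "x.html")) into the string  href="x.html"  with a leading space.
(define (attributes->string attributes)
  (apply string-append
         (map (lambda (pair)
                (string-append " " (symbol->string (car pair))
                               "=\"" (cdr pair) "\""))
              attributes)))

;; Build a mirror function for the element with the given name.
(define (element name)
  (lambda args
    (let ((parts (split-arguments args)))
      (string-append "<" name (attributes->string (cdr parts)) ">"
                     (apply string-append (car parts))
                     "</" name ">"))))

(define a (element "a"))
(define b (element "b"))
;; An author-defined tag is made in exactly the same way as a standard one.
(define point (element "point"))

(display
 (point "A text with a "
        (a "link" 'href "subsection/sec1.html")
        " to a " (b "subsection") "."))
(newline)
;; prints: <point>A text with a <a href="subsection/sec1.html">link</a> to a <b>subsection</b>.</point>
```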

For people who like Lisp this is great. We are able to make and use our own tags, just like in XML. All the functions corresponding to HTML elements are part of the LAML libraries. By executing the Scheme program we return a text in pure HTML. (Or more correctly, we return a data structure - a list that represents a syntax tree - which can be rendered as HTML.) Such a transformation is, by the way, also what happens to most XML documents today, by means of XSL for instance. Using Scheme we do not need grammars or DTDs. We are free to form the abstractions we like, and it is very easy and straightforward for a trained Lisp programmer. We can mirror XML as closely as we want or need. We can also just form the abstraction we find most attractive.

Now, the road is clear for the definition of LAML styles for various document types. I have made a variety of such styles for various purposes, such as course notes and home pages. The manuals referred to above are also written using a LAML style. In fact, the manuals are extracted from 'doc comments' in Scheme programs, and as such made 'automatically'. And this file is written in LAML, of course. I never write a plain HTML file anymore. By the way, I wrote a simple article style in less than half a day, mirroring the few LaTeX commands I use in my papers. This style is not yet released...

The main advantages of using LAML and Scheme instead of HTML/XML are the following:

It is based on functional programming techniques. Emitting separate start tags, end tags, and other fragments via print statements is simply the wrong way to do it.
Programmed solutions are available anywhere in a document, and at any time in the writing process.
You can abstract from details by writing and applying Scheme functions.
You can process a LAML file with less effort than saving it to a file. Using the LAML system from Emacs, C-o saves and processes the laml buffer. A buffer with the file f.laml generates the file f.html. In order to just save f.laml you would usually type C-x C-s.
When you are used to Lisp expressions you can navigate over balanced tags/expressions very easily using your favorite Lisp editor. Doing the same on HTML takes an advanced HTML editor...
A LAML document is a program, which generates a low level HTML equivalent when it is executed. This provides ultimate flexibility for programmers. Other people can just use the Netscape Gold Editor, or a similarly user friendly editor...

We find it important to provide access to XML and HTML elements via functions, not macros or just list data forms. The reason is that we want to author our XML/HTML documents in a functional programming context, where higher-order functions are especially attractive. It is only possible to do this in a flexible way if all the elements of the markup languages are represented as functions. You should read the paper 'Web Programming in Scheme with LAML' (pdf) if you are interested in learning more.
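
A small sketch may show why functions matter here. It uses simplified, string-producing stand-ins for the mirror functions (the element helper and the functions item-list and marked-up-list are made up for the example, and are not the LAML definitions). Because li and em are ordinary functions, they can be passed to map or to our own functions, which would not be possible with macros.

```scheme
;; Hypothetical sketch: element functions used as higher-order arguments.

;; Simplified stand-in for a mirror function: content strings only.
(define (element name)
  (lambda contents
    (string-append "<" name ">"
                   (apply string-append contents)
                   "</" name ">")))

(define ul (element "ul"))
(define li (element "li"))
(define em (element "em"))

;; li is mapped over the items, and ul is applied to the resulting list.
(define (item-list items)
  (apply ul (map li items)))

;; The markup function is a parameter: any element function will do.
(define (marked-up-list markup items)
  (item-list (map markup items)))

(display (marked-up-list em '("Scheme" "LAML")))
(newline)
;; prints: <ul><li><em>Scheme</em></li><li><em>LAML</em></li></ul>
```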

Good so far. But there is a problem, namely the fact that all text has to be written as strings within string quotes. In practice, you will need to pass many small strings to HTML mirror functions or your own functions. It turns out that we can alleviate the problem of writing such strings via good editor support. First, just write the string without markup. After that, apply the markup using a special editor command called embed, which splits the string in parts, and embeds the selected string into a Lisp function calling form. If needed, the embed command can also take care of surrounding string concatenation. The reverse unembed command is also available. The split command breaks an existing string in two parts, and again it takes care of surrounding string concatenation if needed; the reverse unsplit command is available as well. These commands are all available via the LAML major mode in Emacs. We have used the commands heavily on this document, for instance. It turns out that when you use these editor commands during the document creation process, you almost never experience errors due to malformed Lisp expressions in your document.

If you are a Lisp programmer I am confident that you will also enjoy doing HTML and XML-like work in Scheme. There are programmatic alternatives to Scheme around, typically based on functional languages with emphasis on strong typing. Types can - to some degree - be used to ensure grammatical correctness of your documents. I feel the LAML work is more pragmatic and more flexible. This boils down to Lisp vs. languages in the more theoretical areas of Computer Science.

For further information you can consult the papers I have written, see the LAML Home page. The LAML tutorial is also a good place to start.

Kurt Nørmark
normark@cs.aau.dk
http://www.cs.auc.dk/~normark/
