Generated: Thursday, January 26, 2006, 23:38:52. Copyright © 2006, Kurt Nørmark.

Link Checking in LAML

The facilities described below are available from LAML version 29.

For a long time, LAML has been able to validate attributes and contents. Recently, checking for proper use of ID attributes has also been introduced.

Linking errors are a major source of frustration in web documents. Concretely, the problem arises when a URL points to a non-existing web resource. From a practical point of view, it is therefore very useful to check all links when the document is generated. This is now possible in LAML. LAML is able to check relative links, with and without an explicitly given base URL (given by the base element of XHTML, for instance), as well as absolute links. Checking absolute links requires an implementation of the LAML compatibility function url-target-exists?. (If you use MzScheme, LAML has such an implementation. However, in LAML version 29 this function does not give accurate results!)

The variable xml-link-checking (in the library lib/xml-in-laml/xml-in-laml.scm) controls the amount of link checking. You can ask for no link checking, checking of relative links, checking of absolute links, and checking of all links. See the documentation of the variable xml-link-checking.
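As a rough illustration of the four settings, the sketch below selects which of the collected URLs get checked. It is written in Python rather than LAML's Scheme, and the mode names and the function urls_to_check are hypothetical. The actual values accepted by xml-link-checking are given in its documentation.

```python
def urls_to_check(mode, relative_urls, absolute_urls):
    """Select the URLs to check for a given (hypothetical) checking mode."""
    if mode == "none":
        return []
    if mode == "relative":
        return list(relative_urls)
    if mode == "absolute":
        return list(absolute_urls)
    if mode == "all":
        return list(relative_urls) + list(absolute_urls)
    # Anything else is treated as a configuration error in this sketch.
    raise ValueError("unknown link checking mode: " + repr(mode))
```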

When you use an XHTML language in LAML, everything is provided for you. By default, xml-link-checking is set to check relative links.

Some supported XML languages in LAML come with a URL extractor function and a base URL extractor function. They are returned by the higher-order functions url-extractor-of-xml-language and base-url-extractor-of-xml-language. The function set-xml-link-checking-functions is used to define the URL extractor function and the base URL extractor function. The mirror function library calls this function with the appropriate URL extractor functions.

Link checking in LAML works as follows:

  1. Document generation
    The document is generated as usual. This gives a LAML AST of the document.
  2. Link collection
    The function collect-links-for-later-checking-in-ast! is called on the AST. This function collects all URLs that must be checked. As a result, the two global variables relative-url-list-for-later-checking and absolute-url-list-for-later-checking are assigned. The second parameter of collect-links-for-later-checking-in-ast! gives the absolute file path of the location where the file is written in the file system (the base URL). This information is important for checking relative links in static web documents. Notice that the links cannot be collected before we know the file location of the document.
  3. Relative link checking
    The function check-relative-url-list! checks the existence of resources addressed by the URLs in relative-url-list-for-later-checking. This boils down to file and directory existence checks. Problems are reported via the function xml-check-error. Notice that link checking should be done as late as possible, after all files that the document may link to have been written.
  4. Absolute link checking
    The function check-absolute-url-list! checks the existence of resources addressed by the URLs in absolute-url-list-for-later-checking. This function relies on the LAML compatibility function url-target-exists? (for which we do not yet have a good implementation in the LAML distribution). Errors are reported as for relative links.
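
The collection and relative checking steps above can be sketched as follows. This is a minimal illustration in Python, not LAML's Scheme implementation: the names partition_urls and check_relative_urls are hypothetical, and the sketch assumes that a URL with a scheme (such as http:) counts as absolute and that relative URLs are resolved against the directory of the written document.

```python
import os
from urllib.parse import urlsplit


def partition_urls(urls):
    """Split URLs into (relative, absolute) lists.

    Assumption of this sketch: a URL with a scheme counts as absolute.
    """
    relative, absolute = [], []
    for url in urls:
        (absolute if urlsplit(url).scheme else relative).append(url)
    return relative, absolute


def check_relative_urls(relative_urls, document_path):
    """Return the relative URLs that name no existing file or directory.

    Each URL is resolved against the directory of document_path, the
    absolute path where the generated document is written.
    """
    base_dir = os.path.dirname(document_path)
    missing = []
    for url in relative_urls:
        path_part = urlsplit(url).path  # ignores any '#anchor' or '?query'
        target = os.path.normpath(os.path.join(base_dir, path_part))
        if not os.path.exists(target):
            missing.append(url)
    return missing
```

Because the check is a plain file system lookup, it only gives meaningful results once the linked files have actually been written, which is why checking is deferred to the end.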

In the standard LAML setup, step 2 is done by write-html, when a document is rendered (linearized and written to a file). You can alternatively do it 'manually' from other contexts, of course.

Steps 3 and 4 are done from the function end-laml. Again, you can check the collected links yourself, using the functions described above, if you need to.

Notice that anchor parts, following '#' in a URL, are not checked by LAML.
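
To make the consequence concrete, here is a small Python sketch (an illustration only, not LAML code; the helper without_anchor is hypothetical) of reducing a URL to the part that an existence check would look at. A link such as chapter.html#sec2 is accepted as long as chapter.html exists, whether or not the anchor sec2 is defined in it.

```python
from urllib.parse import urldefrag


def without_anchor(url):
    """Return the URL with any '#anchor' part removed."""
    return urldefrag(url).url
```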

The link checking framework, as described in this document, can be used on pre-existing XML languages as well as on your own XML languages. Just provide URL extractor functions for your own XML language, install them with set-xml-link-checking-functions, and arrange for link collection and link checking (using the functions from above). Notice, however, that your own XML language is typically translated to XHTML, which will check its links. Therefore, it is not always necessary to do a lot of link checking in your own, high-level XML languages.

A final remark on relative links: it is better to prevent errors than to find them at a later point in time. When you make relative links in LAML, the Emacs command M-x insert-relative-file-path is very useful together with the Emacs embedding command (M-x embed).

 
Kurt Nørmark
normark@cs.aau.dk
http://www.cs.aau.dk/~normark/