Simple Uses of Flex

Flex is a source generator, like Gperf, Bison, and others. It takes a list of regular expressions and actions, and produces a fast function that triggers the action associated with the pattern it recognizes. As with Gperf and Bison, the input syntax allows for a prologue, containing Flex directives and possibly some user declarations and initializations, and an epilogue, typically holding additional functions:

%{
user-prologue
%}
flex-directives
%%
regular-expression-1    action-1
regular-expression-2    action-2
%%
user-epilogue
Example 6.11: Structure of a Flex Input File

Each pair of regular-expression and action is listed on its own line. The regular-expression must start in the first column; otherwise the line is treated as code to be output inside the function that will be produced. This can be used to leave comments in the input. An action may be enclosed in braces if it spans several lines, and an action consisting of | stands for "same as the next action".

When run, flex produces a file named lex.yy.c, containing a C program including, in addition to your user-prologue and user-epilogue, one function:

int yylex () Function
Scan the FILE *yyin, which defaults to the standard input, for tokens. Trigger the action associated with the regular-expression that matched. Return 0 when it should no longer be called, typically when the end of the file is reached. Otherwise, it typically returns the kind of the token that has just been recognized.

For instance, this simple Flex input file is meant to recognize rude words, and to express its surprise at unknown words:

%{  /* -*- C -*- */
%}
%%
"sh*t"               |
"f*k"                |
"win*ows"            |
"Huh? What the f*?"  printf ("I don't like you saying `%s'.\n", yytext);
^.*$                 printf ("Huh? What the f* `%s'?\n", yytext);
\n                   /* Ignore. */
Example 6.12: rude-3.l -- Recognizing Rude Words With Flex

which we can try now:

$ flex -orude-3.c rude-3.l
$ gcc -Wall -o rude-3 rude-3.c -lfl
error-->rude-3.c:1020: warning: `yyunput' defined but not used
$ echo 'dear' | ./rude-3
Huh? What the f* `dear'?

We took care to write the most general rules last, and to provide a rule that prevents newline characters from being echoed to the standard output: by default, Flex copies any character that no rule matches to the output. The extra rule is needed because . does not match the newline character, hence ^.*$ doesn't cover it.

You have certainly noticed that we did not provide any main, yet the program works: the Flex library, libfl, provides a default main which calls yylex until the end of file is found.
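
As a rough sketch of what such a default main boils down to, the loop below keeps calling a yylex-style function until it returns 0. The names run_until_done and fake_yylex are hypothetical stand-ins, introduced only so that the snippet compiles without a Flex-generated scanner; the call counter is likewise an addition for illustration and is not something libfl provides.

```c
/* Type of a yylex-like function: it returns 0 once scanning is over. */
typedef int (*lexer_fn) (void);

/* Call the scanner until it returns 0, as a default main would.
   Returns how many nonzero results were seen (illustration only). */
int
run_until_done (lexer_fn lex)
{
  int results = 0;
  while (lex () != 0)
    ++results;
  return results;
}

/* Hypothetical stand-in scanner: reports three tokens, then 0. */
static int remaining = 3;

int
fake_yylex (void)
{
  return remaining > 0 ? remaining-- : 0;
}
```

In rude-3 none of the actions return, so the real yylex keeps scanning and comes back only once, with 0, at the end of the input.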

Let's try an actual match:

$ echo 'Huh? What the f*?' | ./rude-3
Huh? What the f* `Huh? What the f*?'?

Huh? What the f* Huh? What the f* `Huh? What the f*?'?? It was supposed to recognize it!

You just fell into so-called right contexts: $ is exactly equivalent to /\n, which stands for "if followed by a newline". The bad news is that when /right-context is used, the number of characters matched by right-context counts when electing the longest match (but of course is not included in yyleng). In other words, ^.*$ matched the whole line plus the newline, hence it wins over the dedicated pattern.

As much as possible, you should avoid depending upon contexts, i.e., if you ever design a language, make sure it can be scanned without them. If we simply replace ^.*$ with .* in Example 6.12, then we have:

$ echo 'Huh? What the f*?' | ./rude-4
I don't like you saying `Huh? What the f*?'.
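
To see why dropping the right context changes the winner, here is a minimal model of the election between the two competing rules. It assumes the input holds a single newline-terminated line, and it is not how Flex is implemented (Flex compiles all the rules into one automaton); winning_rule and literal_length are hypothetical names used for this sketch only.

```c
#include <stddef.h>
#include <string.h>

/* Length matched by a literal pattern at the start of LINE, or 0. */
static size_t
literal_length (const char *line, const char *literal)
{
  size_t n = strlen (literal);
  return strncmp (line, literal, n) == 0 ? n : 0;
}

/* Elect the winner between a literal rule (listed first) and a
   catch-all rule (listed last).  WITH_CONTEXT nonzero models `^.*$',
   zero models `.*'.  Returns 0 for the literal rule, 1 for the
   catch-all.  Assumes LINE ends with a newline. */
int
winning_rule (const char *line, const char *literal, int with_context)
{
  size_t lit = literal_length (line, literal);
  /* `.*' consumes everything up to, but excluding, the newline. */
  size_t all = strcspn (line, "\n");
  /* With `$', the newline counts in the length being compared. */
  if (with_context)
    all += 1;
  /* Longest match wins; on a tie, the rule listed first wins. */
  return (lit > 0 && lit >= all) ? 0 : 1;
}
```

With the line "Huh? What the f*?\n", the literal rule matches 17 characters. The `^.*$' variant is credited with 18, so it wins; the `.*' variant ties at 17, and the tie goes to the literal rule because it is listed first.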