MP4H - A Macro Processor for HTML Documents

Introduction

The mp4h software is a macro-processor specifically designed to deal with HTML documents. It allows powerful programming constructs, with a syntax familiar to HTML authors.

This software is based on Meta-HTML, written by Brian J. Fox. Although the two syntaxes look similar, the source code is completely different. Indeed, a subset of Meta-HTML was used as part of a more complex program, WML (Website Meta Language), written by Ralf S. Engelschall, which I have maintained since January 1999. For licensing reasons it was hard to hack Meta-HTML, so I decided to write my own macro-processor.

Instead of writing it from scratch, I preferred to build on another macro-processor engine. I chose GNU m4, written by René Seindal, because of its numerous advantages: this software is stable, robust and very well documented. This version of mp4h is derived from GNU m4 version 1.4n, which is a development version.

The mp4h software is not an HTML editor; its sole purpose is to provide an easy way to define your own macros inside HTML documents. There is no plan to add features that automagically produce valid HTML documents. If you want to clean up or validate your code, simply use a post-processor such as tidy.

Command line options

Optional arguments are enclosed in square brackets. All synonyms of an option share the same syntax, so when a long option accepts an argument, the short option does too.

The call syntax is

mp4h [options] [filename [filename] ...]
Options are described below. If no filename is specified, or if its name is -, then characters are read from standard input.

Operation modes

--help display a help message and exit
--version output mp4h version information and exit
-E --fatal-warnings stop execution after first warning
-Q --quiet --silent suppress some warnings for builtins

Preprocessor features

-I --include=DIRECTORY search this directory second for includes
-D --define=NAME[=VALUE] enter NAME as having VALUE, or empty
-U --undefine=COMMAND delete builtin COMMAND

Limits control

-H --hashsize=PRIME set symbol lookup hash table size (default 509)
-L --nesting-limit=NUMBER change artificial nesting limit (default 250)

Debugging

-d --debug=FLAGS set debug level (no FLAGS implies `aeq')
-t --trace=NAME trace NAME when it will be defined
-l --arglength=NUMBER restrict macro tracing size
-o --error-output=FILE redirect debug and trace output

Flags are any of:

t trace for all macro calls, not only debugging-on'ed
a show actual arguments
e show expansion
c show before collect, after collect and after call
x add a unique macro call id, useful with c flag
f say current input file name
l say current input line number
p show results of path searches
i show changes in input files
V shorthand for all of the above flags

Description

The mp4h software is a macro-processor, which means that keywords are replaced by other text. This chapter describes all primitives. As mp4h has been specially designed for HTML documents, its syntax is very similar to HTML, with tags and attributes. One important feature has no equivalent in HTML: comments that run to the end of the line. All text following three semicolons is discarded until the end of the line, for example

;;;  This is a comment

Function Macros

The definition of new tags is the most common task provided by mp4h. As with HTML, macro names are case insensitive. In this documentation, only lowercase letters are used. There are two kinds of tags: simple and complex. A simple tag has the following form:

<name [attributes]>
whereas a complex tag looks like:
<name [attributes]>
body
</name>

In the macro descriptions below, a slash indicates a complex tag, and the letter V indicates that attributes are read verbatim, without expansion (see the chapter on macro expansion for further details).

 

/   define-tag
name
[attributes=verbatim]
[endtag=required]
[whitespace=delete]

This function lets you define your own tags. First argument is the command name. Replacement text is the function body.

Source
<define-tag foo>bar</define-tag>
<foo>
Output
bar
Although whitespace usually has little effect on HTML, it is important to note that
<define-tag foo>bar</define-tag>
and
<define-tag foo>
bar
</define-tag>
are not equivalent: the latter form contains two newlines that are not present in the former.

 

/   provide-tag
name
[attributes=verbatim]
[endtag=required]
[whitespace=delete]

This command is similar to the previous one, except that no operation is performed if this command is already defined.

 

    let
new old

Copy a function. This command is useful to save a macro definition before redefining it.

Source
<define-tag foo>one</define-tag>
<let bar foo>
<define-tag foo>two</define-tag>
<foo><bar>
Output
twoone

 

    undef
name

Delete a command definition.

Source
<define-tag foo>one</define-tag>
<undef foo>
<foo>
Output
<foo>

 

/   set-hook
name
[position=before|after]
[action=insert|append|replace]

Add text to a predefined macro. This mechanism allows existing macros to be modified without having to worry about their type, whether complex or not.

Source
<let foo add>
<set-hook foo position=before>
Before</set-hook>
<set-hook foo position=after>
After</set-hook>
<foo 1 2 3 4>
Output
Before10
After

 

    get-hook
name
[position=before|after]

Print current hooks of a macro.

Source
Text inserted with position=before:<get-hook foo position=before>!
Text inserted with position=after:<get-hook foo position=after>!
Output
Text inserted with position=before:
Before!
Text inserted with position=after:
After!

Variables

Variables are a special case of simple tags, because they do not accept attributes. In fact their use is different, because variables contain text whereas macros act like operators. A nice feature of variables is that they can be manipulated as arrays: a variable can be treated as a newline-separated list, which enables the powerful manipulation functions described below.

 

    set-var
name[=value]
[name[=value]] ...

This command sets variables.

 

  V set-var-verbatim
name[=value]
[name[=value]] ...

As above but attributes are read verbatim.

 

    get-var
name
[name] ...

Show variable contents. If a numeric value within square brackets is appended to a variable name, it represents the index of an array. The first index of arrays is 0 by convention.

Source
<set-var version="0.10.1">
This is version <get-var version>
Output
This is version 0.10.1
Source
<set-var foo="0
1
2
3">
<get-var foo[2] foo[0] foo>
Output
200
1
2
3
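The indexing rule can be modelled outside mp4h. The Python sketch below only illustrates the behaviour described above and is not how mp4h is implemented; the function name get_var, the dictionary used as variable store, and the empty result for an out-of-range index are assumptions made for the illustration.

```python
# Illustrative model of <get-var>: a variable holds one string, and
# name[i] selects the i-th newline-separated line (first index is 0).
import re

def get_var(variables, *names):
    """Concatenate the values of the requested variables or array items."""
    out = []
    for name in names:
        m = re.fullmatch(r"(.+)\[(\d+)\]", name)
        if m:
            lines = variables.get(m.group(1), "").split("\n")
            index = int(m.group(2))
            # Assumption: an index past the end yields an empty string.
            out.append(lines[index] if index < len(lines) else "")
        else:
            out.append(variables.get(name, ""))
    return "".join(out)

variables = {"foo": "0\n1\n2\n3"}
print(get_var(variables, "foo[2]", "foo[0]", "foo"))
```

The three results are printed back to back, which is why the first output line of the example above reads 200.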

 

  V get-var-once
name
[name] ...

As above but attributes are not expanded.

Source
<define-tag foo>0.10.1</define-tag>
<set-var version="<foo>">;;;
Here is version <get-var version>
<set-var-verbatim version="<foo>">;;;
Here is version <get-var version>
<set-var-verbatim version="<foo>">;;;
Here is version <get-var-once version>
Output
Here is version 0.10.1
Here is version 0.10.1
Here is version <foo>

 

    preserve
name

All variables are global; there is no variable or macro scope. For this reason a stack is used to preserve variables. When this command is invoked, the first argument is the name of a variable. The value of this variable is put at the top of the stack and the variable is reset to an empty string.

 

    restore
name

This is the opposite: first argument is a variable name, this variable is set to the value found at the top of the stack, and this value is popped.
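The interplay of preserve and restore can be pictured with a small Python model. It is a sketch under the assumption that a single shared stack is used, as the description suggests; the names variables, stack, preserve and restore are illustrative and do not come from mp4h's source.

```python
# Illustrative model of <preserve> / <restore> with one shared stack.
variables = {}
stack = []

def preserve(name):
    """Push the current value of `name` and reset it to an empty string."""
    stack.append(variables.get(name, ""))
    variables[name] = ""

def restore(name):
    """Pop the value on top of the stack back into `name`."""
    variables[name] = stack.pop()

variables["title"] = "outer"
preserve("title")              # title is now an empty string
variables["title"] = "inner"   # temporary value, e.g. inside a macro body
restore("title")               # the saved value comes back
print(variables["title"])
```

Used in pairs at the beginning and end of a macro body, the two commands give the effect of a local variable even though all variables are global.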

 

    unset-var
name
[name] ...

Undefine variables.

 

    var-exists
name

Returns true when this variable exists.

 

    increment
name
[by=value]

Increment the variable whose name is the first argument. Default increment is one.

Source
<set-var i=10>
<get-var i>
<increment i><get-var i>
<increment i by="-3"><get-var i>
Output
10
11
8

 

    decrement
name
[by=value]

Decrement the variable whose name is the first argument. Default decrement is one.

Source
<set-var i=10>
<get-var i>
<decrement i><get-var i>
<decrement i by="3"><get-var i>
Output
10
9
6

 

    copy-var
src
dest

Copy a variable into another.

Source
<set-var i=10>
<copy-var i j>
<get-var j>
Output
10

 

    defvar
name
value

If this variable is not defined or is defined to an empty string, then it is set to the second argument.

Source
<unset-var title>
<defvar title "Title"><get-var title>
<defvar title "New title"><get-var title>
Output
Title
Title

 

    symbol-info
name

Show information about symbols. If it is a variable name, the word STRING is printed, followed by the number of lines contained in this variable. If it is a macro name, one of the following messages is printed: PRIM COMPLEX, PRIM TAG, USER COMPLEX or USER TAG.

Source
<set-var x="0\n1\n2\n3\n4">
<define-tag foo>bar</define-tag>
<define-tag bar endtag=required>quux</define-tag>
<symbol-info x>
<symbol-info symbol-info>
<symbol-info define-tag>
<symbol-info foo>
<symbol-info bar>
Output
STRING
5
PRIM TAG
PRIM COMPLEX
USER TAG
USER COMPLEX

String Functions

 

    string-length
string

Prints the length of the string.

Source
<set-var foo="0
1
2
3">;;;
<string-length <get-var foo>>
<set-var foo="0 1 2 3">;;;
<set-var l=<string-length <get-var foo>>>;;;
<get-var l>
Output
7
7

 

    downcase
string

Convert to lowercase letters.

Source
<downcase "Does it work?">
Output
does it work?

 

    upcase
string

Convert to uppercase letters.

Source
<upcase "Does it work?">
Output
DOES IT WORK?

 

    capitalize
string

Convert to a title, with a capital letter at the beginning of every word.

Source
<capitalize "Does it work?">
Output
Does It Work?

 

    substring
string
[start [end]]

Extracts a substring from a string. The first argument is the original string; the second and third are the start and end indexes respectively. By convention the first character has index 0.

Source
<set-var foo="abcdefghijk">
<substring <get-var foo> 4>
<substring <get-var foo> 4 6>
Output
efghijk
ef
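Judging from the example above, the end index is exclusive: characters 4 and 5 are returned for the range 4 to 6. This matches Python's slice notation, so the behaviour can be sketched as follows. The function name is illustrative, and only non-negative indexes are considered since the text does not describe negative ones.

```python
# Illustrative model of <substring>: zero-based start, exclusive end.
def substring(string, start=0, end=None):
    """Return the characters from `start` up to, but not including, `end`."""
    return string[start:end]

foo = "abcdefghijk"
print(substring(foo, 4))
print(substring(foo, 4, 6))
```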

 

    subst-in-string
string
regexp
[replacement]
[singleline=true]

Replace a regular expression in a string by a replacement text.

Source
<set-var foo="abcdefghijk">
<subst-in-string <get-var foo> "[c-e]">
<subst-in-string <get-var foo> "([c-e])" "\\1 ">
Output
abfghijk
abc d e fghijk
Source
<set-var foo="abcdefghijk\nabcdefghijk\nabcdefghijk">
<subst-in-string <get-var foo> ".$" "">
<subst-in-string <get-var foo> ".$" "" singleline=true>
Output
abcdefghij
abcdefghij
abcdefghij
abcdefghijk
abcdefghijk
abcdefghij
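The effect of singleline=true is easy to misread: with it, the whole string is treated as one line, so $ only matches at the very end. The Python sketch below reproduces the outputs above with the standard re module. It is an approximation for illustration only, because Python's regular expression dialect is not the one mp4h uses, and the function name and signature are assumptions.

```python
# Illustrative model of <subst-in-string> and its singleline attribute.
import re

def subst_in_string(string, regexp, replacement="", singleline=False):
    """Replace every match of `regexp`; by default $ matches at each line end."""
    flags = 0 if singleline else re.MULTILINE
    return re.sub(regexp, replacement, string, flags=flags)

foo = "abcdefghijk\nabcdefghijk\nabcdefghijk"
print(subst_in_string(foo, ".$"))                    # last letter of every line removed
print(subst_in_string(foo, ".$", singleline=True))   # only the final letter removed
```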

 

    subst-in-var
name
regexp
[replacement]
[singleline=true]

Performs substitutions inside variable content.

 

    string-eq
string1
string2
[caseless=true]

Returns true if first two arguments are equal.

Source
1:<string-eq "aAbBcC" "aabbcc">
2:<string-eq "aAbBcC" "aAbBcC">
Output
1:
2:true
Source
1:<string-eq "aAbBcC" "aabbcc" caseless=true>
2:<string-eq "aAbBcC" "aAbBcC" caseless=true>
Output
1:true
2:true

 

    string-neq
string1
string2
[caseless=true]

Returns true if the first two arguments are not equal.

Source
1:<string-neq "aAbBcC" "aabbcc">
2:<string-neq "aAbBcC" "aAbBcC">
Output
1:true
2:
Source
1:<string-neq "aAbBcC" "aabbcc" caseless=true>
2:<string-neq "aAbBcC" "aAbBcC" caseless=true>
Output
1:
2:

 

    string-compare
string1
string2
[caseless=true]

Compares two strings and returns one of the values less, greater or equal depending on this comparison.

Source
1:<string-compare "aAbBcC" "aabbcc">
2:<string-compare "aAbBcC" "aAbBcC">
Output
1:less
2:equal
Source
1:<string-compare "aAbBcC" "aabbcc" caseless=true>
Output
1:equal
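The first example returns less because uppercase ASCII letters sort before lowercase ones, so "aA" precedes "aa". A Python sketch of the comparison, assuming a plain character-code ordering (the function name is illustrative):

```python
# Illustrative model of <string-compare>.
def string_compare(string1, string2, caseless=False):
    """Return 'less', 'greater' or 'equal' using character-code ordering."""
    if caseless:
        string1, string2 = string1.lower(), string2.lower()
    if string1 < string2:
        return "less"
    if string1 > string2:
        return "greater"
    return "equal"

print(string_compare("aAbBcC", "aabbcc"))
print(string_compare("aAbBcC", "aabbcc", caseless=True))
```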

 

    match
string
regexp
[caseless=true]
[action=report|extract|delete|startpos|endpos|length]
Source
1:<match "abcdefghijk" "[c-e]+">
2:<match "abcdefghijk" "[c-e]+" action=extract>
3:<match "abcdefghijk" "[c-e]+" action=delete>
4:<match "abcdefghijk" "[c-e]+" action=startpos>
5:<match "abcdefghijk" "[c-e]+" action=endpos>
6:<match "abcdefghijk" "[c-e]+" action=length>
Output
1:true
2:cde
3:abfghijk
4:2
5:5
6:3
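The six actions can be summarised with a Python model built on the standard re module. This is a sketch that reproduces the example above, not mp4h's code: the function name is illustrative, Python's regexp dialect differs from mp4h's, and the results returned when nothing matches (an empty string, or the unchanged string for delete) are assumptions.

```python
# Illustrative model of the <match> actions.
import re

def match(string, regexp, action="report", caseless=False):
    """Search `regexp` in `string` and report on the first match."""
    m = re.search(regexp, string, re.IGNORECASE if caseless else 0)
    if m is None:
        # Assumption: no match gives an empty result, or the input for delete.
        return string if action == "delete" else ""
    if action == "report":
        return "true"
    if action == "extract":
        return m.group(0)
    if action == "delete":
        return string[:m.start()] + string[m.end():]
    if action == "startpos":
        return str(m.start())
    if action == "endpos":
        return str(m.end())
    if action == "length":
        return str(m.end() - m.start())
    raise ValueError("unknown action: " + action)

for action in ("report", "extract", "delete", "startpos", "endpos", "length"):
    print(action, match("abcdefghijk", "[c-e]+", action))
```

Note that endpos is the index just past the match, so length equals endpos minus startpos.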

 

    char-offsets
string
character
[caseless=true]

Prints an array containing the indexes where the character appears in the string.

Source
1:<char-offsets "abcdAbCdaBcD" a>
2:<char-offsets "abcdAbCdaBcD" a caseless=true>
Output
1:0
8
2:0
4
8
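As the result is an array, the offsets are separated by newlines. A short Python model of this behaviour follows; the function name is illustrative only.

```python
# Illustrative model of <char-offsets>: newline-separated list of indexes.
def char_offsets(string, character, caseless=False):
    """Return the zero-based positions of `character`, one per line."""
    if caseless:
        string, character = string.lower(), character.lower()
    return "\n".join(str(i) for i, c in enumerate(string) if c == character)

print(char_offsets("abcdAbCdaBcD", "a"))
print(char_offsets("abcdAbCdaBcD", "a", caseless=True))
```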

 

    set-regexp-syntax
[type=basic|extended]

This command controls which regular expressions are used in the macros described above. There are only two possible values: basic and extended. The former are basic regular expressions and the latter are extended regular expressions. By default extended regular expressions are used.

Source
<set-var foo="abcdefghijk">
<set-regexp-syntax type=basic>
<subst-in-string <get-var foo> "([c-e]+)" ":\\1:">
<subst-in-string <get-var foo> "\\([c-e]\\{1,\\}\\)" ":\\1:">
<set-regexp-syntax type=extended>
<subst-in-string <get-var foo> "([c-e]+)" ":\\1:">
<subst-in-string <get-var foo> "\\([c-e]\\{1,\\}\\)" ":\\1:">
Output
abcdefghijk
ab:cde:fghijk

ab:cde:fghijk
abcdefghijk

 

    get-regexp-syntax

Prints the current regexp type.

Source
<get-regexp-syntax>
Output
extended

Arrays

With mp4h one can easily deal with string arrays. Variables can be treated as a single value or as a newline separated list of strings. Thus after defining

<set-var digits="0
1
2
3">
one can view its content or one of these values:
Source
<get-var digits>
<get-var digits[2]>
Output
0
1
2
3
2

 

    array-size
name

Returns an array size which is the number of lines present in the variable.

Source
<array-size digits>
Output
4

 

    array-push
name
value

Add a value (or more if this value contains newlines) at the end of an array.

Source
<array-push digits "10\n11\n12">
<get-var digits>
Output
0
1
2
3
10
11
12

 

    array-pop
name

Removes the top-level value of an array and returns it.

 

    array-add-unique
name
value
[caseless=true]

Add a value at the end of an array if this value is not already present in this variable.

Source
<array-add-unique digits 2>
<get-var digits>
Output
0
1
2
3
10
11
12

 

    array-concat
name1
[name2] ...

Concatenates all arrays into the first one.

Source
<set-var foo="foo">
<set-var bar="bar">
<array-concat foo bar><get-var foo>
Output
foo
bar

 

    array-member
name
value
[caseless=true]

If value is contained in the array, returns its index; otherwise returns -1.

Source
<array-member digits 11>
Output
5

 

    array-shift
name
offset
[start=start]

Shifts an array. If offset is negative, indexes below 0 are lost. If offset is positive, first indexes are filled with empty strings.

Source
<array-shift digits 2>
Now: <get-var digits>
<array-shift digits -4>
And: <get-var digits>
Output
Now: 

0
1
2
3
10
11
12

And: 2
3
10
11
12
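The two directions of a shift can be modelled in a few lines of Python. The sketch reproduces the example above on a newline-separated string; the function name is illustrative and the optional start attribute is deliberately not modelled, since its exact meaning is not described here.

```python
# Illustrative model of <array-shift> on a newline-separated value.
def array_shift(value, offset):
    """Shift array items: pad the front with empty items or drop the first ones."""
    lines = value.split("\n")
    if offset >= 0:
        lines = [""] * offset + lines    # new leading items are empty strings
    else:
        lines = lines[-offset:]          # items shifted below index 0 are lost
    return "\n".join(lines)

digits = "0\n1\n2\n3\n10\n11\n12"
digits = array_shift(digits, 2)
print("Now: " + digits)
digits = array_shift(digits, -4)
print("And: " + digits)
```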

 

    sort
name
[caseless=true]
[numeric=true]
[sortorder=reverse]

Sort lines of an array in place. Default is to sort lines alphabetically.

Source
<sort digits><get-var digits>
Output
12
2
3

Numerical operators

These operators perform basic arithmetic operations. When all operands are integers result is an integer too, otherwise it is a float. These operators are self-explanatory.

 

    add
number1
number2
[number3] ...

 

    substract
number1
number2
[number3] ...

 

    multiply
number1
number2
[number3] ...

 

    divide
number1
number2
[number3] ...

 

    min
number1
number2
[number3] ...

 

    max
number1
number2
[number3] ...
Source
<add 1 2 3 4 5 6>
<add 1 2 3 4 5 6.>
Output
21
21.000000
Source
<define-tag factorial whitespace=delete>
<ifeq %0 1 1 <multiply %0 "<factorial <substract %0 1>>">>
</define-tag>
<factorial 6>
Output
720
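The integer versus float rule stated at the top of this section can be sketched in Python. This is a model of the rule for add only, with operands given as they are written in the source; the function name, the integer test and the use of the "%f" format are assumptions chosen to reproduce the output shown above.

```python
# Illustrative model of <add>: integer result only if every operand is an integer.
import re

def add(*operands):
    """Sum operands given as strings, e.g. add('1', '2', '6.')."""
    if all(re.fullmatch(r"[+-]?\d+", o) for o in operands):
        return str(sum(int(o) for o in operands))
    # At least one operand is written as a float: print six decimals.
    return "%f" % sum(float(o) for o in operands)

print(add("1", "2", "3", "4", "5", "6"))
print(add("1", "2", "3", "4", "5", "6."))
```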

 

    modulo
number1
number2

Unlike functions listed above the modulo function cannot handle more than 2 arguments, and these arguments must be integers.

Source
<modulo 345 7>
Output
2

The following functions compare two numbers and return true when the comparison holds. If one argument is not a number, the comparison is false.

 

    gt
number1
number2

Returns true if first argument is greater than second.

 

    lt
number1
number2

Returns true if first argument is less than second.

 

    eq
number1
number2

Returns true if arguments are equal.

 

    neq
number1
number2

Returns true if arguments are not equal.

Relational operators

 

    not
string

Returns true if string is empty, otherwise returns an empty string.

 

    and
string
[string] ...

Returns the last argument if all arguments are non empty.

 

    or
string
[string] ...

Returns the first non empty argument.
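In these operators an empty string stands for false and any non-empty string for true. The following Python model restates the three definitions; the trailing underscores only avoid Python keywords, and the empty result when a condition fails is an assumption consistent with the description of not.

```python
# Illustrative model of <not>, <and> and <or> on strings.
def not_(string):
    """Return 'true' for an empty string, otherwise an empty string."""
    return "true" if string == "" else ""

def and_(*strings):
    """Return the last argument if all arguments are non-empty."""
    return strings[-1] if strings and all(strings) else ""

def or_(*strings):
    """Return the first non-empty argument."""
    return next((s for s in strings if s), "")

print(repr(not_("")), repr(and_("a", "b")), repr(or_("", "b", "c")))
```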

Flow functions

 

  V group
statement
[statement] ...
[separator=string]

This function groups multiple statements into a single one. Some examples will be seen below with conditional operations.

A less intuitive but very helpful use of this macro is to preserve newlines when whitespace=delete is specified.

Source
<define-tag text1>
Text on
3 lines without
whitespace=delete
</define-tag>
<define-tag text2 whitespace=delete>
Text on
3 lines with
whitespace=delete
</define-tag>
<define-tag text3 whitespace=delete>
<group "Text on
3 lines with
whitespace=delete">
</define-tag>
<text1>
<text2>
<text3>
Output
Text on
3 lines without
whitespace=delete

Text on3 lines withwhitespace=delete
Text on
3 lines with
whitespace=delete

Note that newlines are suppressed in text2, and the result is probably not what was wanted.

 

  V noexpand
command
[command] ...

Prints its arguments without expansion. They will never be expanded unless the expand tag is used to cancel this noexpand tag.

 

    expand
command
[command] ...

Cancels the effect of the noexpand tag.

Source
<subst-in-string "=LT=define-tag foo>bar</define-tag>" "=LT=" "<">
<foo>
<subst-in-string "=LT=define-tag foo>quux</define-tag>" "=LT=" "<noexpand "<">">
<foo>
Output
bar
<define-tag foo>quux</define-tag>
bar

 

  V if
string
then-clause
[else-clause]

If string is non-empty, the second argument is evaluated; otherwise the third argument is evaluated.

Source
<define-tag test whitespace=delete>
<if %0 "yes" "no">
</define-tag>
<test "string">
<test "">
Output
yes
no

 

  V ifeq
string1
string2
then-clause
[else-clause]

If the first two arguments are identical strings, the third argument is evaluated; otherwise the fourth argument is evaluated.

 

  V ifneq
string1
string2
then-clause
[else-clause]

If the first two arguments are not identical strings, the third argument is evaluated; otherwise the fourth argument is evaluated.

 

/   when
string

When the argument is not empty, the function body is evaluated.

 

/ V while
cond

While the condition is true, the function body is evaluated.

Source
<set-var i=10>
<while <gt <get-var i> 0>>;;;
  <get-var i> <decrement i>;;;
</while>
Output
10 9 8 7 6 5 4 3 2 1 

 

/   foreach
variable
array
[start=start]
[end=end]
[step=step]

This macro is similar to Perl's foreach: a variable loops over the array values and the function body is evaluated for each value. The first argument is a generic variable name, and the second is the name of an array.

Source
<set-var x="1\n2\n3\n4\n5\n6">
<foreach i x><get-var i> </foreach>
Output
1 2 3 4 5 6 

 

  V var-case
var1=value1 action1
[var2=value2 action2] ...

This command performs multiple conditions with a single instruction.

Source
<set-var i=0>
<define-tag test>
<var-case x=1 <group <increment i> x<get-var i>> x=2 <group <decrement i> x<get-var i>> y=1 <group <increment i> y<get-var i>> y=2 <group <decrement i> y<get-var i>>>
</define-tag>
<set-var x=1 y=2><test>
<set-var x=0 y=2><test>
Output
x1y0


y-1
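The example shows that every clause whose test succeeds is executed in order, not just the first one: with x=1 and y=2 both the x=1 and the y=2 actions run. The Python sketch below models this reading of the example. All names in it (var_case, bump, clauses) are hypothetical helpers for the illustration.

```python
# Illustrative model of <var-case>: run the action of every matching clause.
variables = {"i": "0"}

def var_case(clauses):
    """clauses: list of (name, value, action); action() returns the text produced."""
    out = []
    for name, value, action in clauses:
        if variables.get(name, "") == value:
            out.append(action())
    return "".join(out)

def bump(delta, label):
    """Build an action that changes i by `delta` and prints label + i."""
    def action():
        variables["i"] = str(int(variables["i"]) + delta)
        return label + variables["i"]
    return action

clauses = [("x", "1", bump(+1, "x")), ("x", "2", bump(-1, "x")),
           ("y", "1", bump(+1, "y")), ("y", "2", bump(-1, "y"))]

variables.update(x="1", y="2")
first = var_case(clauses)
variables.update(x="0", y="2")
second = var_case(clauses)
print(first, second)
```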

 

    break

Breaks the innermost while loop.

Source
<set-var i=10>
<while <gt <get-var i> 0>>;;;
  <get-var i> <decrement i>;;;
  <ifeq <get-var i> 5 <break>>;;;
</while>
Output
10 9 8 7 6 

 

    return
[up=number]
string

This command immediately exits from the innermost macro. A message may also be inserted. Because this macro changes token parsing, its use may be hazardous in some situations.

 

    warning

Prints a warning on standard error.

 

    exit
[status=rc]
[message=string]

Immediately exits program.

 

    at-end-of-file

This is a special command: all attributes are stored and will be expanded after end of input.

File functions

 

    directory-contents
dirname
[matching=regexp]

Returns a newline separated list of files contained in a given directory.

Source
<directory-contents . matching=".*\\.mp4h$">
Output
mp4h.mp4h

 

    file-exists
filename

Returns true if file exists.

 

    get-file-properties
filename

Returns an array of information about this file: size, type, ctime, mtime, atime, owner and group.

Source
<get-file-properties mp4h.mp4h>
Output
43659
FILE
954610154
954610154
954610155
barbier
users

 

    include
filename
[alt=action]
[verbatim=true]

Read input from another file.

 

/   comment

This tag does nothing, its body is simply discarded.

 

    set-eol-comment
[string]

Change comment characters.

Debugging

When constructs become complex they can be hard to debug. The functions listed below are very useful when you cannot figure out what is wrong. They are not perfect yet and must be improved in future releases.

 

    function-def
name

Prints the replacement text of a user defined macro. For instance, the macro used to generate all examples of this documentation is

Source
<function-def example>
Output
<set-var-verbatim verb-body=%ubody>
<subst-in-var verb-body "&" "&amp;">
<subst-in-var verb-body "<" "&lt;">
<subst-in-var verb-body ">" "&gt;">
<set-var body=%body>
<subst-in-var body "&" "&amp;">
<subst-in-var body "<" "&lt;">
<subst-in-var body ">" "&gt;">
<subst-in-var body "<three-colon>\n[ \t]*" "" singleline=true>
<subst-in-var body "<three-colon>[^;][^\n]*\n[ \t]*" "" singleline=true>
<subst-in-var body "<three-colon>$" "">
<subst-in-var body "^\n+" "" singleline=true>
<table border=2 cellpadding=0 cellspacing=0 width="80%" summary="">
    <tr><th bgcolor="#ccccff"><lang:example-source></th></tr>
    <tr><td bgcolor="#ccff99" width="80%"><dnl>
      <pre><get-var-once verb-body></pre><dnl>
    </td></tr>
    <tr><th bgcolor="#ccccff"><lang:example-output></th></tr>
    <tr><td bgcolor="#ff99cc" width="80%"><dnl>
      <pre><get-var-once body></pre><dnl>
    </td></tr>
</table>

 

    debugmode
string

This command acts like the -d flag but can be changed dynamically.

 

    debugfile
filename

Selects a file where debugging messages are diverted. If this filename is empty, debugging messages are sent back to standard error, and if it is set to - these messages are discarded.

Note: There is no way to print these debugging messages into the document being processed.

 

    debugging-on
name
[name] ...

Declare these macros as traced, i.e. information about these macros will be printed if the -d flag or the debugmode macro is used.

 

    debugging-off
name
[name] ...

These macros are no longer traced.

Miscellaneous

 

    __file__
[name]

Without argument this macro prints current input filename. With an argument, this macro sets the string returned by future invocation of this macro.

 

    __line__
[number]

Without argument this macro prints the current line number in the input file. With an argument, this macro sets the number returned by future invocations of this macro.

Source
This is <__file__>, line <__line__>.
Output
This is mp4h.mp4h, line 1601.

If you look closely at the source code you will see that this number is wrong. Indeed, the line number reported is that of the end of the entire block containing this instruction.

 

    __version__

Prints the version of mp4h.

 

    dnl

Discard all characters until a newline is reached. This macro ensures that the following string is a comment, regardless of the value of the comment characters.

Source
<dnl>This is a comment
foo
<dnl>This is a comment
bar
Output
foo
bar

 

    date
[epoch]

Prints local time according to the epoch passed as argument. If there is no argument, current local time is printed.

Source
<date>
<set-var info=<get-file-properties <__file__>>>
<date <get-var info[2]>>
Output
Sat Apr  1 19:29:15 2000

Sat Apr  1 19:29:14 2000

 

    timer

Prints the time spent since the last call to this macro. The printed value is the number of clock ticks, and so depends on your CPU.

Source
The number of clock ticks since the beginning of generation of
this documentation by <mp4h> is:
<timer>
Output
The number of clock ticks since the beginning of generation of
this documentation by <b>mp4h</b> is:
user 52
sys 0

 

    mp4h-l10n
name=value

Set locale-specific variables. By default, the portable "C" locale is selected.

Source
<mp4h-l10n LC_NUMERIC=fr_FR>
<add 1,2 3,4>
Output
4,600000

 

    mp4h-output-radix
number

Change the output format of floats by setting the number of digits after the decimal point. Default is to print numbers in the "%.6f" format.

Source
<add 1.2 3.4>
<mp4h-output-radix 2>
<add 1.2 3.4>
Output
4.600000

4.60
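The setting corresponds to the precision of a printf-style float conversion: six digits after the decimal point by default, as in the first output above, and two after the call. A small Python illustration, where format_float is a hypothetical helper and not an mp4h function:

```python
# Illustrative model of the float output format controlled by the radix setting.
def format_float(value, radix=6):
    """Format `value` with `radix` digits after the decimal point."""
    return "%.*f" % (radix, value)

print(format_float(1.2 + 3.4))      # default precision
print(format_float(1.2 + 3.4, 2))   # after setting the radix to 2
```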

Macro expansion

This part describes the internal mechanism of macro expansion. It must be as precise and exhaustive as possible, so contact me if you have any suggestions.

Basics

Let us begin with some examples:
Source
<define-tag foo>
This is a simple tag
</define-tag>
<define-tag bar endtag=required>
This is a complex tag
</define-tag>
<foo>
<bar>Body function</bar>
Output
This is a simple tag


This is a complex tag

User defined macros may have attributes like HTML tags. To handle these attributes in replacement text, following conventions have been adopted (mostly derived from Meta-HTML):

Note: Input expansion is completely different in Meta-HTML and in mp4h. With Meta-HTML it is sometimes necessary to use other constructs like %xbody and %qbody. In order to improve compatibility with Meta-HTML, these constructs are recognized and are interpreted like %body. Another feature provided for compatibility reasons is that for simple tags %body and %attributes are equivalent. These features are in the current mp4h version but may disappear in future releases.

Attributes

Attributes are separated by spaces, tabs or newlines, and each attribute must be a valid mp4h entity. For instance, with the definitions above, <bar> cannot be an attribute since it must be closed by </bar>. But this is valid:

<foo <foo>>
or even
<foo <foo name=src url=ici>>
In these examples, the foo tag has only one argument.

Under certain circumstances it is necessary to group multiple statements into a single one. This can be done with double quotes or with the group primitive, e.g.

<foo "This is the 1st attribute" <group and the second>>

Note: Unlike in HTML, single quotes cannot replace double quotes for this purpose.

If double quotes appear in an argument, they must be escaped by a backslash \.

Source
  <set-var text="Text with double quotes \" inside">
  <get-var text>
Output
  
  Text with double quotes " inside

Macro evaluation

Macros are characterized by

Characters are read from input until a < is found. Then the macro name is read. After that, attributes are read, verbatim or not depending on how the macro has been defined. If the macro is complex, its body is read verbatim. When this is finished, some special sequences in the replacement text are replaced (like %body, %attributes, %0, %1, etc.) and the resulting text is put on the input stack in order to be rescanned.

Note: By default attributes are evaluated before any replacement.
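The substitution step can be pictured with a very rough Python sketch. It only covers %body, %attributes and numbered sequences, and it assumes that %0 is the first attribute (as the earlier examples suggest) and that %attributes is the space-separated list of attributes. Quoting, rescanning of the result and all other special sequences are left out, and the function name is hypothetical.

```python
# Rough illustration of the replacement step; not mp4h's actual algorithm.
import re

def expand_replacement(text, attributes, body=""):
    """Substitute %body, %attributes and %N in a macro's replacement text."""
    def repl(m):
        token = m.group(1)
        if token == "body":
            return body
        if token == "attributes":
            return " ".join(attributes)   # assumption: joined with spaces
        index = int(token)
        # Assumption: a missing attribute expands to an empty string.
        return attributes[index] if index < len(attributes) else ""
    return re.sub(r"%(body|attributes|\d+)", repl, text)

print(expand_replacement("<tt>%body</tt>", [], body="example"))
print(expand_replacement("<b>%0</b> and <i>%1</i>", ["one", "two"]))
```

In mp4h the text produced by this step is then pushed back on the input, so any tags it contains are expanded in turn.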

Consider the following example, to change text in typewriter font:

<define-tag text-tt endtag=required whitespace=delete>
<tt>%body</tt>
</define-tag>

This definition has a major drawback:

Source
<text-tt>This is an <text-tt>example</text-tt></text-tt>
Output
<tt>This is an <tt>example</tt></tt>
We would like the inner tags to be removed.

First idea is to use an auxiliary variable to know whether we still are inside such an environment:

<set-var _text:tt=0>
<define-tag text-tt endtag=required whitespace=delete>
<increment _text:tt>
<ifeq <get-var _text:tt> 1 "<tt>">
%body
<ifeq <get-var _text:tt> 1 "</tt>">
<decrement _text:tt>
</define-tag>
Source
<text-tt>This is an <text-tt>example</text-tt></text-tt>
Output
<tt>This is an example</tt>

But if we use simple tags, as in the example below, our definition does not seem to work. It is because attributes are expanded before they are put into replacement text.

Source
<define-tag opt><text-tt>%attributes</text-tt></define-tag>
<opt "This is an <opt example>">
Output
<tt>This is an <tt>example</tt></tt>

To prevent this problem we have to forbid attribute expansion with attributes=verbatim:

Source
<define-tag opt attributes=verbatim>;;;
<text-tt>%attributes</text-tt>;;;
</define-tag>
<opt "This is an <opt example>">
Output
<tt>This is an example</tt>

Author

Denis Barbier <barbier@imacs.polytechnique.fr>

Thanks

Sincere thanks to Brian J. Fox for writing Meta-HTML and René Seindal for maintaining this wonderful macro parser called GNU m4.