OmniMark Developer Resources  

Literate Programming Using OmniMark


3. Weaving

Weaving is in essence a formatting process, where code references are replaced by links to the actual code blocks. In Knuth's original system, the output of the weaving step could be used as the input to TeX. Here, we'll generate well-formed XML as the output of the weaving step; this well-formed XML can then be used as the input to some formatting program. Alternatively, the well-formed XML output could be viewed directly in a browser using a stylesheet.

The premise of literate programming is that the author has chosen a presentation order for the program. We therefore do not need to worry about re-ordering sections. However, we must resolve cross-references from one program section to another. This can be handled by using referents.
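The idea behind referents can be sketched outside OmniMark. The following Python fragment is only an illustration of deferred resolution, not a description of how OmniMark implements referents: the class `ReferentOutput` and its method names are hypothetical. Output is collected as a list of pieces, a reference to a not-yet-known value is recorded as a marker, and the markers are filled in once every value has been set.

```python
class ReferentOutput:
    """Collects output; referents are placeholders whose values arrive later."""

    def __init__(self):
        self.pieces = []   # literal strings and ("ref", name) markers, in order
        self.values = {}   # referent name -> final text

    def write(self, text):
        self.pieces.append(text)

    def write_referent(self, name):
        # The value may not be known yet, so only record where it belongs.
        self.pieces.append(("ref", name))

    def set_referent(self, name, value):
        self.values[name] = value

    def resolve(self):
        # Called once the whole document has been processed.
        return "".join(
            p if isinstance(p, str) else self.values[p[1]] for p in self.pieces
        )


out = ReferentOutput()
out.write("see ")
out.write_referent("key:weave")         # forward reference to a later block
out.write(" for details")
out.set_referent("key:weave", "<4 weaving a program>")
result = out.resolve()
```

Because the marker is resolved only at the end, a section can refer to a code block that is defined later in the presentation order.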

Our goal should be to simplify the task of the formatter as much as possible; this way, we make it easier to support various output targets. It seems reasonable, therefore, to put each section into its own file: I find it easier to re-combine split files than to split a monolithic file. The main file consists of a list of section files: it will provide a driver for the formatter.

The weaving process is begun when we hit the program element:

<4 weaving a program> =
element "program"
   <35 generate weaved filename>

Generating a filename for the weaved output is handled by <34 generating output filenames>.

Once we've parsed the document, we can generate a list of sections and output the main document. The main document is generated according to the template

<5 main page template> =
"<?xml version=%"1.0%"?>%n"
|| "<weaved type=%"main%">%n"
|| "<program-name>[:program name:]</program-name>%n"
|| "<sections>%n"
|| "[:section list:]%n"
|| "</sections>%n"
|| "</weaved>%n"

The [:section list:] template parameter is generated dynamically using

<6 generate section list> =
   open s as buffer
   using output as s
      repeat over section-numbers & section-titles
         save-clear template-parameters
         set new template-parameters{"section name"} to key of section-titles
         set new template-parameters{"section number"} to "d" % section-numbers
         set new template-parameters{"section title"} to section-titles
         output emit-template templates{"section list item"}
      again
   close s
   set new template-parameters{"section list"} to s

and consists of multiple copies of the following template, one for each section in the input document

<7 section list item template> =
"<section><filename>[:section name:]</filename>"
|| "<number>[:section number:]</number>"
|| "<title>[:section title:]</title></section>%n"

We will see shortly that the section-numbers and section-titles shelves are built up when weaving the various sections in the literate program (see <9 weaving a section>).
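To make the expansion concrete, here is a minimal sketch in Python (not OmniMark) of how a section list could be assembled from the item template and then substituted into the main page. The two templates are transcribed from <5 main page template> and <7 section list item template>; the `expand` helper, the function names, and the sample data are hypothetical and stand in for the tool's own emit-template function, which may behave differently.

```python
import re

MAIN_PAGE = (
    '<?xml version="1.0"?>\n'
    '<weaved type="main">\n'
    '<program-name>[:program name:]</program-name>\n'
    '<sections>\n'
    '[:section list:]\n'
    '</sections>\n'
    '</weaved>\n'
)
SECTION_LIST_ITEM = (
    '<section><filename>[:section name:]</filename>'
    '<number>[:section number:]</number>'
    '<title>[:section title:]</title></section>\n'
)


def expand(template, params):
    """Replace each [:name:] placeholder with params[name] (hypothetical helper)."""
    return re.sub(r"\[:(.+?):\]", lambda m: params[m.group(1)], template)


def section_list(sections):
    """sections: (filename, number, title) tuples in presentation order."""
    return "".join(
        expand(SECTION_LIST_ITEM, {
            "section name": filename,
            "section number": str(number),
            "section title": title,
        })
        for filename, number, title in sections
    )


def main_page(program_name, sections):
    # The section list is generated first, then passed in as one parameter.
    return expand(MAIN_PAGE, {
        "program name": program_name,
        "section list": section_list(sections),
    })


page = main_page("demo", [("s1.xml", 1, "Intro"), ("s2.xml", 2, "Weaving")])
```

Generating the list into a buffer first and passing it in as a single parameter keeps the main page template free of any looping logic.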

Meanwhile, the [:program name:] template parameter is picked up from the title child of the program element. We will need the program name for each section we format, so we store it in the program-name global.

<8 weaving miscellaneous elements> =
element "title" when parent is "program"
   set program-name to "%sc"

Combining all of this, the steps for handling the program element are

<4 weaving a program> +=
   clear template-parameters
   open new weaved-file with referents-allowed as file weaved-filename
   using output as weaved-file
   do
      local stream s

      output "%c"
      <6 generate section list>
      set new template-parameters{"program name"} to program-name
      output emit-template templates{"main page"}
   done
   close weaved-file
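Since the main page is meant to drive the formatter, it may help to see how a formatter could read it. The sketch below uses Python's standard xml.etree.ElementTree module and assumes the main page has exactly the structure produced by <5 main page template> and <7 section list item template>; the function name `list_sections` and the sample file names are hypothetical.

```python
import xml.etree.ElementTree as ET


def list_sections(main_page_xml):
    """Return (filename, number, title) for each section listed in a main page."""
    root = ET.fromstring(main_page_xml)
    return [
        (s.findtext("filename"), int(s.findtext("number")), s.findtext("title"))
        for s in root.find("sections").findall("section")
    ]


# Hypothetical main page, shaped like the output of the main page template.
SAMPLE = (
    '<?xml version="1.0"?>\n'
    '<weaved type="main">\n'
    '<program-name>demo</program-name>\n'
    '<sections>\n'
    '<section><filename>s1.xml</filename><number>1</number>'
    '<title>Intro</title></section>\n'
    '<section><filename>s2.xml</filename><number>2</number>'
    '<title>Weaving</title></section>\n'
    '\n'
    '</sections>\n'
    '</weaved>\n'
)

sections = list_sections(SAMPLE)
```

A formatter could then open each listed file in turn, which is the re-combining step that the one-file-per-section layout is meant to keep simple.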

As mentioned earlier, it seems reasonable to put each section of the weaved program into its own file. We therefore need to generate a new filename each time we encounter a section element:

<9 weaving a section> =
element "section"
   local stream s

   <3 clear template parameters>
   <35 generate weaved filename>

Sections are numbered, and their titles must be collected for the main page.

<9 weaving a section> +=
   increment section-number
   new section-titles{weaved-filename}
   set new section-numbers{weaved-filename} to section-number
   set s with referents-allowed to "%c"

Although we create the entry on the shelf here, the section's title will only be stored in the section-titles shelf when we encounter the title element:

<8 weaving miscellaneous elements> +=
element "title" when parent is "section"
   set section-titles to "%sc"
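The two-step bookkeeping (reserve an entry when the section starts, fill in its title when the title element is seen) can be mimicked with ordinary dictionaries. This Python sketch is an analogy only: the class `SectionIndex` and its methods are hypothetical, and the `current` attribute stands in for OmniMark's notion of the most recently added shelf item, which is an assumption about how the unkeyed `set section-titles` finds its target.

```python
class SectionIndex:
    def __init__(self):
        self.section_number = 0     # mirrors the section-number global
        self.section_numbers = {}   # keyed on the weaved filename
        self.section_titles = {}    # keyed on the weaved filename
        self.current = None         # key of the entry added most recently

    def begin_section(self, weaved_filename):
        self.section_number += 1
        self.section_titles[weaved_filename] = ""   # reserved, title not yet known
        self.section_numbers[weaved_filename] = self.section_number
        self.current = weaved_filename

    def set_title(self, title):
        # Called when the section's title element is encountered.
        self.section_titles[self.current] = title

    def entries(self):
        # Dictionaries keep insertion order, so this is presentation order.
        return [
            (name, self.section_numbers[name], self.section_titles[name])
            for name in self.section_numbers
        ]


index = SectionIndex()
index.begin_section("s1.xml")
index.set_title("Intro")
index.begin_section("s2.xml")
index.set_title("Weaving")
```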

Just like with the program element, once we've parsed the contents of the section, we collect the template parameters

<9 weaving a section> +=
   set new template-parameters{"program name"} to program-name
   set new template-parameters{"section number"} to "d" % section-numbers
   set new template-parameters{"section title"} to section-titles
   set new template-parameters{"section contents"} with referents-allowed to s
   open new weaved-file with referents-allowed as file weaved-filename
   using output as weaved-file
      output emit-template templates{"section page"}
   close weaved-file

and emit the section page template:

<10 section page template> =
"<?xml version=%"1.0%"?>%n"
|| "<weaved type=%"section%">%n"
|| "<program-name>[:program name:]</program-name>%n"
|| "<number>[:section number:]</number>%n"
|| "<title>[:section title:]</title>%n"
|| "<section>%n"
|| "[:section contents:]"
|| "</section>%n"
|| "</weaved>%n"

Oddly enough, the rule for element code is the longest in the program, even though weaving code is simpler than tangling it (see <28 tangling code>).

<11 weaving code> =
element "code"
   <3 clear template parameters>
   set new template-parameters{"code body"} with referents-allowed to "%c"
   do when attribute "id" is specified
      <16 weaving identified code>
   else
      <17 weaving anonymous code>
   done

The reason is fairly simple, however: the formatting of a code element changes depending on whether or not this is the first occurrence of this particular code block. The first time we see a code block (called, say, some code block), we must emit a cross-reference anchor as well as the XML required to support the following formatted herald:

<99 some code block> =
   ... the code goes here ...

Specifically, we use the following template the first time we see a code block.

<12 template code identified> =
"<code-body type=%"identified%">[?key:code id?]<name>[:code name:]</name>%n"
|| "<code>%n[:code body:]%n</code>%n"
|| "</code-body>"

We also need to generate a block number.

<13 generating a block's number> =
   increment block-number
   set new block-numbers{code-key} to block-number
   set new template-parameters{"block number"} to "d" % block-number

However, the second and subsequent occurrences of a code block are formatted as

<99 some code block> +=
   ... more code goes here ...

Note that the = from the first example has turned into +=, indicating that this block is appending code to a previous block. More precisely, we use a template similar to <12 template code identified>:

<14 template code identified appended> =
"<code-body type=%"identified appended%">[?key:code id?]"
|| "<name>[:code name:]</name>%n"
|| "<code>%n[:code body:]%n</code>%n"
|| "</code-body>"

The attribute type is used by the formatting program to determine whether this is the first occurrence of the code block or not. In the case where this is a subsequent occurrence, we need to determine what number was previously assigned to this block; we store this in the block-numbers shelf, which is keyed on the block identifier:

<15 determining a block's number> =
set new template-parameters{"block number"} to "d" % block-numbers{code-key}

The templates <12 template code identified> and <14 template code identified appended> could be combined into a single template that has an extra parameterisation specifying whether the current block is a first occurrence or not. However, I prefer to have templates be self-contained as much as possible, rather than have parts generated conditionally in the code. Otherwise, it becomes more difficult to track down where different parts of a template are generated.

Combining these elements, weaving identified code is relatively simple:

<16 weaving identified code> =
   local switch block-exists
   local string code-key initial { "key:" || "lg" % attribute "id" }

   set block-exists to referents has key code-key & referents{code-key} is attached
   do when block-exists
      <15 determining a block's number>
   else
      <13 generating a block's number>
   done
   set new template-parameters{"code id"} to "lg" % attribute "id"
   set new template-parameters{"filename"} to weaved-filename
   set referent code-key to emit-template templates{"code pointer"}

If the code element does not provide a name attribute, we can re-use the id attribute.

<16 weaving identified code> +=
   do when attribute "name" is specified
      set referent ("name:" || "lg" % attribute "id") to attribute "name"
      set new template-parameters{"code name"} to attribute "name"
   else
      set referent ("name:" || "lg" % attribute "id") to attribute "id"
      set new template-parameters{"code name"} to attribute "id"
   done
   output emit-template templates{block-exists -> "identified code appended" | "identified code"}
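The numbering logic can be summarised in a few lines of Python. This is a sketch under stated assumptions rather than a translation: the class `BlockNumbers` and its `herald` method are hypothetical; first occurrence is detected here by looking in the dictionary, whereas the OmniMark code tests whether the referent is attached; the "lg" format is assumed to lower-case the id; and the returned string imitates the herald a formatter might render, which the weaver itself does not produce.

```python
class BlockNumbers:
    def __init__(self):
        self.block_number = 0     # mirrors the block-number global
        self.block_numbers = {}   # mirrors the block-numbers shelf

    def herald(self, block_id, name=None):
        code_key = "key:" + block_id.lower()   # assumed effect of "lg" % id
        appended = code_key in self.block_numbers
        if not appended:
            # First occurrence: allocate the next number and remember it.
            self.block_number += 1
            self.block_numbers[code_key] = self.block_number
        number = self.block_numbers[code_key]
        # Fall back to the id when no name attribute was given.
        label = name if name is not None else block_id
        return "<%d %s> %s" % (number, label, "+=" if appended else "=")


blocks = BlockNumbers()
first = blocks.herald("weave", "weaving a program")
other = blocks.herald("misc")
again = blocks.herald("Weave", "weaving a program")
```

Here `first` is a first occurrence, `other` gets the next number and uses its id as its name, and `again` reuses the number assigned to `first` and is marked as appended.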

If a code element does not provide an id attribute, then all we do is emit the code verbatim, with no herald:

<17 weaving anonymous code> =
output emit-template templates{"anonymous code"}

which uses the template

<18 template anonymous code> =
"<code-body type=%"anonymous%">%n"
|| "<code>%n[:code body:]%n</code>%n"
|| "</code-body>%n"

In our literate programming tool, cross-references are represented using undefined general entities (i.e., general entities that are not defined in the document's internal DTD subset). When an entity is encountered in the input, the external-text-entity rule (<33 handling a cross-reference>) fires, translating the entity into a processing instruction. In the weaving process, this processing instruction is further translated into a cross-reference to the appropriate code section:

<19 weaving a code reference> =
processing-instruction "code-reference " any+ => reference-name
   <3 clear template parameters>
   set new template-parameters{"reference"} to reference-name
   output emit-template templates{"code reference"}

This uses the template

<20 template code reference> =
"<code-reference>[?key:reference?]"
|| "<name>[?name:reference?]</name></code-reference>"

Apart from section titles, we have not dealt with any of the textual elements of the document (e.g., paragraphs, italicised sections, and so on). These elements are handled by the formatting program (or a stylesheet); it is sufficient to let them pass through essentially unaffected, into the XML output.

<21 weaving formatting elements> =
element ("p" | "b" | "i" | "tt")
   <3 clear template parameters>
   set new template-parameters{"element name"} to "%lq"
   set new template-parameters{"element content"} with referents-allowed to "%sc"
   output emit-template templates{"identity"}

The same applies to the data content in a code block, except that we need to escape the characters that are reserved in XML, replacing each with its predefined entity:

<22 escaping data-content> =
data-content when element is "code"
   repeat scan "%c"
   match "<"
      output "&lt;"
   match ">"
      output "&gt;"
   match "%""
      output "&quot;"
   match "'"
      output "&apos;"
   match "&"
      output "&amp;"
   match [any \ "<>%"'&"]+ => t
      output t
   again
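For comparison, the same single-pass escaping can be written in Python. The names `XML_ESCAPES` and `escape_code` are hypothetical. The point of the sketch is that each input character is examined exactly once, so an ampersand that is already part of the input (for example in `&lt;`) is escaped once and the entities produced by the escaping are never escaped again.

```python
# The five characters with predefined entities in XML.
XML_ESCAPES = {
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&apos;",
    "&": "&amp;",
}


def escape_code(text):
    """Escape reserved XML characters in one left-to-right pass."""
    return "".join(XML_ESCAPES.get(ch, ch) for ch in text)


escaped = escape_code('if a < b && s == "x"')
```

A chain of whole-string replacements would give the same result only if the ampersand were handled first; scanning character by character avoids that ordering concern.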

To complete the weaving process, we need to declare the global shelves mentioned earlier,

<2 global shelves> +=
global string weaved-filename variable
global stream weaved-file variable
global string weaved-filename-suffix initial { ".xml" }
global string program-name
global integer block-numbers variable
global integer block-number
global integer section-number
global string section-titles variable
global integer section-numbers variable

and declare the group that holds all the rules together:

<23 weaving> =

Previous section: Template Processing

Next section: Tangling
