Literate Programming Using OmniMark

1.	Introduction
2.	Template Processing
3.	Weaving
4.	Tangling
5.	Handling Cross-References
6.	Main Loop
7.	Generating Output Filenames
8.	Epilogue

3. Weaving

Weaving is in essence a formatting process, where code references are replaced by links to the actual code blocks. In Knuth's original system, the output of the weaving step could be used as the input to TeX. Here, we'll generate well-formed XML as the output of the weaving step; this well-formed XML can then be used as the input to some formatting program. Alternatively, the well-formed XML output could be viewed directly in a browser using a stylesheet.

The premise of literate programming is that the author has chosen a presentation order for the program. We therefore do not need to worry about re-ordering sections. However, we must resolve cross-references from one program section to another. This can be handled by using referents.

Our goal should be to simplify the task of the formatter as much as possible; this way, we make it easier to support various output targets. It seems reasonable, therefore, to put each section into its own file: I find it easier to re-combine split files than to split a monolithic file. The main file consists of a list of section files: it will provide a driver for the formatter.

The weaving process is begun when we hit the program element:

<4 weaving a program> =


   element "program"
      <35 generate weaved filename>

Generating a filename for the weaved output is handled by <34 generating output filenames>.

Once we've parsed the document, we can generate a list of sections and output the main document. The main document is generated according to the template

<5 main page template> =


      "<?xml version=%"1.0%"?>%n"
   || "<weaved type=%"main%">%n"
   || "<program-name>[:program name:]</program-name>%n"
   || "<sections>%n"
   || "[:section list:]%n"
   || "</sections>%n"
   || "</weaved>%n"
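The substitution itself is defined in the Template Processing section. As a rough model only, the [:name:] placeholders behave like keyed string replacement. The Python sketch below is an illustration outside OmniMark: fill_template is a hypothetical helper, the parameter values are invented, and raising an error on a missing parameter is an assumption rather than a statement about what emit-template does.

```python
import re

MAIN_PAGE = (
    '<?xml version="1.0"?>\n'
    '<weaved type="main">\n'
    '<program-name>[:program name:]</program-name>\n'
    '<sections>\n'
    '[:section list:]\n'
    '</sections>\n'
    '</weaved>\n'
)

def fill_template(template, parameters):
    """Replace every [:name:] placeholder with parameters[name]."""
    def lookup(match):
        name = match.group(1)
        if name not in parameters:
            # Assumption: a missing parameter is treated as an error.
            raise KeyError(f"no template parameter named {name!r}")
        return parameters[name]
    return re.sub(r"\[:(.+?):\]", lookup, template)

page = fill_template(MAIN_PAGE, {
    "program name": "Example program",          # invented value
    "section list": "<section>...</section>",   # normally built dynamically
})
print(page)
```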

The [:section list:] template parameter is generated dynamically using

<6 generate section list> =


         open s as buffer
         using output as s
            repeat over section-numbers & section-titles
               save-clear template-parameters
               set new template-parameters{"section name"}   to key of section-titles
               set new template-parameters{"section number"} to "d" % section-numbers
               set new template-parameters{"section title"}  to section-titles

               output emit-template templates{"section list item"}
            again
         close s
         set new template-parameters{"section list"} to s

and consists of multiple copies of the following template, one for each section in the input document

<7 section list item template> =


      "<section><filename>[:section name:]</filename>"
   || "<number>[:section number:]</number>"
   || "<title>[:section title:]</title></section>%n"

We will see shortly that the section-numbers and section-titles shelves are built up when weaving the various sections in the literate program (see <9 weaving a section>).
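To make the data flow concrete, here is a Python model of the two keyed shelves and the loop that renders one list item per section. Dicts stand in for the shelves, both keyed on the weaved filename. The filenames and titles are invented for illustration, and add_section is a hypothetical helper that compresses the numbering and title steps into one call.

```python
# Dicts stand in for the keyed shelves; insertion order plays the role of shelf order.
section_numbers = {}
section_titles = {}

ITEM = (
    "<section><filename>{name}</filename>"
    "<number>{number}</number>"
    "<title>{title}</title></section>\n"
)

def add_section(filename, title):
    """Record one section under its weaved filename in both shelves."""
    section_numbers[filename] = len(section_numbers) + 1
    section_titles[filename] = title

def section_list():
    """Iterate the titles and look up each matching number by key."""
    return "".join(
        ITEM.format(name=key, number=section_numbers[key], title=title)
        for key, title in section_titles.items()
    )

add_section("weave-1.xml", "Introduction")          # hypothetical filenames
add_section("weave-2.xml", "Template Processing")
print(section_list())
```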

Meanwhile, the [:program name:] template parameter is picked up from the title child of the program element. We will need the program name for each section we format, so we store it in the program-name global.

<8 weaving miscellaneous elements> =


   element "title" when parent is "program"
      set program-name to "%sc"

Combining all of this, the steps for handling the program element are

<4 weaving a program> +=


      clear template-parameters

      open new weaved-file with referents-allowed as file weaved-filename
      using output as weaved-file
      do
         local stream s

         output "%c"

         <6 generate section list>
         set new template-parameters{"program name"} to program-name

         output emit-template templates{"main page"}
      done
      close weaved-file
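For orientation, a main page produced this way might be consumed by a formatter as follows. This is a sketch in Python rather than OmniMark, using the standard xml.etree.ElementTree module. The sample document and its filenames are made up; the element names are taken from the main page and section list item templates.

```python
import xml.etree.ElementTree as ET

MAIN_PAGE = """<?xml version="1.0"?>
<weaved type="main">
<program-name>Example program</program-name>
<sections>
<section><filename>example-1.xml</filename><number>1</number><title>Introduction</title></section>
<section><filename>example-2.xml</filename><number>2</number><title>Weaving</title></section>
</sections>
</weaved>
"""

def section_files(main_page_xml):
    """Return (number, title, filename) for each section, in document order."""
    root = ET.fromstring(main_page_xml)
    return [
        (int(s.findtext("number")), s.findtext("title"), s.findtext("filename"))
        for s in root.iter("section")
    ]

# A formatter could open each listed file in turn and re-combine the sections.
for number, title, filename in section_files(MAIN_PAGE):
    print(f"{number}. {title} -> {filename}")
```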

As mentioned earlier, it seems reasonable to put each section of the weaved program into its own file. We therefore need to generate a new filename each time we encounter a section element:

<9 weaving a section> =


   element "section"
      local stream s

      <3 clear template parameters>

      <35 generate weaved filename>

Sections are numbered, and their titles must be collected for the main page.

<9 weaving a section> +=


      increment section-number
      new section-titles{weaved-filename}
      set new section-numbers{weaved-filename} to section-number

      set s with referents-allowed to "%c"

Although we create the entry on the shelf here, the section's title will only be stored in the section-titles shelf when we encounter the title element:

<8 weaving miscellaneous elements> +=


   element "title" when parent is "section"
      set section-titles to "%sc"
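The two steps work together because the title rule assigns to the shelf without naming a key. The Python model below assumes, as the code above appears to rely on, that such an unindexed assignment addresses the most recently added item. The Shelf class and the filenames are hypothetical and exist only to show the ordering of events.

```python
class Shelf:
    """Toy keyed shelf: ordered items, with the newest item as the default."""

    def __init__(self):
        self._items = {}          # dict preserves insertion order

    def new(self, key, value=""):
        """Add an empty (or initialised) item under a fresh key."""
        if key in self._items:
            raise KeyError(f"duplicate key {key!r}")
        self._items[key] = value

    def set_default(self, value):
        """Assign to the most recently added item (the assumed default)."""
        last_key = next(reversed(self._items))
        self._items[last_key] = value

    def items(self):
        return list(self._items.items())

titles = Shelf()
titles.new("weave-1.xml")            # entry created when the section starts
titles.set_default("Introduction")   # filled in when its title is seen
titles.new("weave-2.xml")
titles.set_default("Weaving")
print(titles.items())
```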

Just like with the program element, once we've parsed the contents of the section, we collect the template parameters

<9 weaving a section> +=


      set new template-parameters{"program name"}   to program-name
      set new template-parameters{"section number"} to "d" % section-numbers
      set new template-parameters{"section title"}  to section-titles

      set new template-parameters{"section contents"} with referents-allowed to s

      open new weaved-file with referents-allowed as file weaved-filename
      using output as weaved-file
         output emit-template templates{"section page"}
      close weaved-file

and emit the section page template:

<10 section page template> =


      "<?xml version=%"1.0%"?>%n"
   || "<weaved type=%"section%">%n"
   || "<program-name>[:program name:]</program-name>%n"
   || "<number>[:section number:]</number>%n"
   || "<title>[:section title:]</title>%n"
   || "<section>%n"
   || "[:section contents:]"
   || "</section>%n"
   || "</weaved>%n"

Oddly enough, the rule for element code is the longest in the program, even though weaving code is simpler than tangling it (see <28 tangling code>).

<11 weaving code> =


   element "code"
      <3 clear template parameters>

      set new template-parameters{"code body"} with referents-allowed to "%c"

      do when attribute "id" is specified
         <16 weaving identified code>

      else
         <17 weaving anonymous code>
      done

The reason is fairly simple, however: the formatting of a code element changes depending on whether or not this is the first occurrence of this particular code block. The first time we see a code block (called, say, some code block), we must emit a cross-reference anchor as well as the XML required to support the following formatted herald:


<99 some code block> = 
   ... the code goes here ...

Specifically, we use the following template the first time we see a code block.

<12 template code identified> =


      "<code-body type=%"identified%">[?key:code id?]<name>[:code name:]</name>%n"
   || "<code>%n[:code body:]%n</code>%n"
   || "</code-body>"

We also need to generate a block number.

<13 generating a block's number> =


            increment block-number
            set new block-numbers{code-key}             to block-number
            set new template-parameters{"block number"} to "d" % block-number

However, the second and subsequent occurrences of a code block are formatted as


<99 some code block> += 
   ... more code goes here ...

Note that the = from the first example has turned into +=, indicating that this block is appending code to a previous block. More precisely, we use a template similar to <12 template code identified>:

<14 template code identified appended> =


      "<code-body type=%"identified appended%">[?key:code id?]"
   || "<name>[:code name:]</name>%n"
   || "<code>%n[:code body:]%n</code>%n"
   || "</code-body>"

The attribute type is used by the formatting program to determine if this is the first occurrence of the code block or not. In the case where this is a subsequent occurrence, we need to determine what number was previously assigned to this block; we store this in the block-numbers shelf, which is keyed on the block identifier:

<15 determining a block's number> =


            set new template-parameters{"block number"} 
               to "d" % block-numbers{code-key}
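The two numbering cases can be modelled in a few lines of Python. This is a simplified sketch: it decides "first occurrence or not" by testing the dict directly, whereas the weaving code makes that decision by checking referents, and the herald function and block identifiers are invented for the example.

```python
block_numbers = {}   # keyed on the block identifier, like the block-numbers shelf

def herald(block_id, name):
    """Return the herald line for one occurrence of a named code block."""
    if block_id in block_numbers:
        number = block_numbers[block_id]      # later occurrence: reuse the number
        operator = "+="
    else:
        number = len(block_numbers) + 1       # first occurrence: allocate the next number
        block_numbers[block_id] = number
        operator = "="
    return f"<{number} {name}> {operator}"

print(herald("weave-program", "weaving a program"))
print(herald("weave-section", "weaving a section"))
print(herald("weave-program", "weaving a program"))
```

The third call reuses the number allocated by the first and switches the operator to +=, which is the distinction the formatter reads from the type attribute.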

The templates <12 template code identified> and <14 template code identified appended> could be combined into a single template that has an extra parameterisation specifying whether the current block is a first occurrence or not. However, I prefer to have templates be self-contained as much as possible, rather than have parts generated conditionally in the code. Otherwise, it becomes more difficult to track down where different parts of a template are generated.

Combining these elements, weaving identified code is relatively simple

<16 weaving identified code> =


         local switch block-exists
         local string code-key     initial { "key:" || "lg" % attribute "id" }

         set block-exists to referents has key code-key 
                             & referents{code-key} is attached

         do when block-exists
            <15 determining a block's number>

         else
            <13 generating a block's number>
         done

         set new template-parameters{"code id"}  to "lg" % attribute "id"
         set new template-parameters{"filename"} to weaved-filename
         set referent code-key to emit-template templates{"code pointer"}

If the code element does not provide a name attribute, we can re-use the id attribute.

<16 weaving identified code> +=


         do when attribute "name" is specified
            set referent ("name:" || "lg" % attribute "id") to attribute "name"
            set new template-parameters{"code name"} to attribute "name"

         else
            set referent ("name:" || "lg" % attribute "id") to attribute "id"
            set new template-parameters{"code name"} to attribute "id"
         done

         output emit-template templates{block-exists -> "identified code appended" 
                                                      | "identified code"}

If a code element does not provide an id attribute, then all we do is emit the code verbatim, with no herald:

<17 weaving anonymous code> =


         output emit-template templates{"anonymous code"}

which uses the template

<18 template anonymous code> =


      "<code-body type=%"anonymous%">%n"
   || "<code>%n[:code body:]%n</code>%n"
   || "</code-body>%n"

In our literate programming tool, cross-references are represented using undefined general entities (i.e., general entities that are not defined in the document's internal DTD subset). When an entity is encountered in the input, the external-text-entity rule (<33 handling a cross-reference>) fires, translating the entity into a processing instruction. In the weaving process, this processing instruction is further translated into a cross-reference to the appropriate code section:

<19 weaving a code reference> =


   processing-instruction "code-reference " any+ => reference-name
      <3 clear template parameters>

      set new template-parameters{"reference"} to reference-name

      output emit-template templates{"code reference"}

This uses the template

<20 template code reference> =


      "<code-reference>[?key:reference?]"
   || "<name>[?name:reference?]</name></code-reference>"
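Referents are what make forward references work: a reference can be written to the output before the block it points to has been seen, and the value is filled in once it is known. The Python model below imitates that with placeholders resolved in a final pass. It illustrates the idea only. The Output class is hypothetical, and the reading of [?name:reference?] as "a referent whose name is the name: prefix followed by the parameter value" is an inference from the code above, not a description of OmniMark's API.

```python
class Output:
    """Collects text and named placeholders, resolving them at the end."""

    def __init__(self):
        self._parts = []       # literal strings and ("ref", name) tuples
        self._referents = {}

    def write(self, text):
        self._parts.append(text)

    def write_referent(self, name):
        self._parts.append(("ref", name))   # value may not be known yet

    def set_referent(self, name, value):
        self._referents[name] = value

    def resolve(self):
        """Substitute every placeholder; fail on any that was never set."""
        resolved = []
        for part in self._parts:
            if isinstance(part, tuple):
                name = part[1]
                if name not in self._referents:
                    raise KeyError(f"unresolved referent {name!r}")
                resolved.append(self._referents[name])
            else:
                resolved.append(part)
        return "".join(resolved)

out = Output()
out.write("<code-reference><name>")
out.write_referent("name:escape")             # used before it is defined
out.write("</name></code-reference>")
out.set_referent("name:escape", "escaping data-content")
print(out.resolve())
```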

Apart from section titles, we have not dealt with any of the textual elements of the document (e.g., paragraphs, italicised sections, and so on). These elements are handled by the formatting program (or a stylesheet); it is sufficient to let them pass through essentially unaffected, into the XML output.

<21 weaving formatting elements> =


   element ("p" | "b" | "i" | "tt")
      <3 clear template parameters>

      set new template-parameters{"element name"}                           to "%lq"
      set new template-parameters{"element content"} with referents-allowed to "%sc"

      output emit-template templates{"identity"}

The same applies to the data content in a code block, except that we need to replace the characters reserved by XML with the corresponding predefined entities:

<22 escaping data-content> =


   data-content when element is "code"
      repeat scan "%c"
      match "<"
         output "&lt;"

      match ">"
         output "&gt;"

      match "%""
         output "&quot;"

      match "'"
         output "&apos;"

      match "&"
         output "&amp;"

      match [any \ "<>%"'&"]+ => t
         output t
      again
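For comparison, the same escaping can be written in Python as a single left-to-right pass. The point of scanning once, rather than applying five replacements in sequence, is that an ampersand introduced by one replacement is never escaped again. The escape_code name is invented for this sketch.

```python
import re

# The five characters reserved by XML and their predefined entities.
ENTITIES = {"<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&apos;", "&": "&amp;"}

def escape_code(text):
    """Escape XML-reserved characters in a single left-to-right pass."""
    return re.sub(r"[<>\"'&]", lambda m: ENTITIES[m.group(0)], text)

print(escape_code('if a < b & c > "d"'))
```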

To complete the weaving process, we need to declare the global shelves mentioned earlier,

<2 global shelves> +=


global string  weaved-filename        variable
global stream  weaved-file            variable
global string  weaved-filename-suffix          initial { ".xml" }

global string  program-name

global integer block-numbers          variable
global integer block-number

global integer section-number
global string  section-titles         variable
global integer section-numbers        variable

and declare the group that holds all the rules together:

<23 weaving> =


group "weave"
   <4 weaving a program>
   <9 weaving a section>
   <11 weaving code>
   <19 weaving a code reference>
   <21 weaving formatting elements>
   <8 weaving miscellaneous elements>
   <22 escaping data-content>


Copyright © Stilo International Ltd 2022. All information on this website is protected under Stilo's copyright.
OmniMark, the OmniMark swirl logo and Stilo are registered trademarks of Stilo International Ltd. All rights reserved.
