Contents:

Warning

This documentation is incomplete. It will be finished as soon as possible.

1. TODO

2. Introduction

QML is a markup language designed by Baron R.K. von Wolfsheild for Qtask. This program parses a string of QML text into a QML document tree (that can then be converted to, say, XHTML with an appropriate "emitter").

Parsing is done with a pipeline made of three stages. The first stage uses parse on the input string and sends QML commands to the second stage; the second stage, whose main function is greatly simplifying the implementation of the third stage, ensures command balancing ("open" commands must be followed by "close" commands in the correct order, and correctly nested; also there are two levels of commands, "block" and "inline", that must nest properly), preprocessing and translating commands, and then sends them to the third stage; the third stage converts the stream of commands into a QML document tree.

The second and third stages are implemented as finite state machines (FSMs). The document tree that comes out of the third stage is optimized with a tree rewriting engine before returning it. The three rewriting rules also process the =repeat and =data commands; after the document has been completely expanded and optimized (which may require more than one pass), any table of contents is created and headers are numbered; numbers for enum lists are calculated too. (Numbering and TOC creation has to be done last because of =repeat.)

3. Overview

This section is the one users of this scanner are interested the most, because it shows the "interface".

The QML scanner uses a stack-based finite state machine interpreter and a data structure matching and rewriting engine, so we include them here. The qml-scanner object is defined, which contains all the code for the scanner.

Two functions are available for users: scan-doc and search. scan-doc takes a string as its argument and returns a QML document tree (a block). Please note that scan-doc is not reentrant and you cannot call it recursively (not that there is any way for you to do so). You can use the /with refinement to provide a block of pairs of strings, to set default options for commands such as =box and =table. (See the Example usage.) It is also possible to use the /keep refinement to stop scan-doc from resetting default options, so that you can use the ones that have been set in a previous session; note that if you use /keep, /with is ignored, so you should not provide both. (There is also a limitation, in that /keep does not actually keep the default value for "alias".)

search takes a string and a QML document tree; it performs a simple substring search inside the document (ignoring formatting etc.). A block is returned, with pairs of header and text snippet (with one exception explained below); the text snippet is a string of about 100 characters around the substring match, and the header is a QML header node (a block) for the "section" that contains the text (i.e., the last header encountered before the string was found). The returned block contains a pair of header and snippet for each occurrence of the substring in the document. If the substring is found before encountering any header in the document, the result block will contain the word 'doc-start in place of a header block.

Note that the text snippet does not contain any formatting (but paragraphs are separated with a newline).

search also returns anchor name exact matches: if the document contains an anchor whose name matches the text string exactly, then the first pair in the result block will be the header for the section that contains the anchor (or 'doc-start), and the anchor node. (You can distinguish this case because the "snippet" is a block instead of a string; also, this can only happen for the first pair in the result block, and since anchor names need to be unique only one can match.)

Overview

#include %fsm.r
#include %rewrite.r

qml-scanner: context [
 The parser (stage 1)
 Command options parsing
 Setting default options
 Balancing (stage 2): the stage2 function
 Generating QML document tree (stage 3)

 Searching a QML document

 ; the interface
 
 search: func [
  "Search a QML document tree for a substring"
  doc [block!] "The QML document tree (as returned by SCAN-DOC)"
  text [string!] "The substring to search for"
  
  /local search's locals
 ] [
  Search a QML document tree for a substring
 ]

 scan-doc: func [
  {Parse a QML text string and return a QML document tree}
  text [string!]
  /with defaults [block!] "Default options for commands"
  /keep {Keep default options from previous session (ignores /with)}
 ] [
  Scan a text string and return a QML document tree
 ]
]

3.1 Example usage

To scan a document, just call scan-doc with the document text; you can use the /with refinement to set default options for commands. (Commands that support default options are: =toc (sets the default number style), =alias (sets the default "magic" sequence), =table, =row, =column, =cell, =box and =image.) You can pass the resulting document tree to an emitter (such as the XHTML emitter) to generate a document that the user can read.

To search a substring in a document just call search passing the document tree and the substring to search for.

Example usage

text: "...some QML text..."
doc: qml-scanner/scan-doc text

doc: qml-scanner/scan-doc/with text ["box" "yellow dashed" "table" "all dotted"]

search-results: qml-scanner/search doc "substring"

4. The parser (stage 1)

The first stage of the pipeline is the parser. The parse-qml function just uses parse on the text string.

The parser (stage 1)

The parse rules
The set-magic function

parse-qml: func [text [string!] magic [string! none!]] [
 ; initialize magic char
 set-magic any [magic "="]
 parse/all text qml-rule
]

4.1 The parse rules

The parse-qml function uses qml-rule to parse the text; everything that is not a command, is considered text and is handed over to the second stage (with the stage2 function, see Balancing (stage 2): the stage2 function).

The txt-chars charset is initialized by the set-magic function (see The set-magic function).

The parse rules

qml-rule: [
 some [commands | text]
]

commands: [
 parse rule for commands
]

txt: none
txt-chars: none
text: [copy txt some txt-chars (stage2 [text:] txt)]

4.2 parse rule for commands

The heart of the parser is the commands rule.

The newline character is treated like a command (produces the "^/" command); spaces at the end and at the beginning of lines are ignored. (As already noted, the stage2 function is used to hand the commands over to the second stage of the pipeline; see Balancing (stage 2): the stage2 function.)

Other commands are introduced by magic-char (set by the set-magic function, see The set-magic function); there are some special commands that require their own parse rules, but most commands are parsed by the rule shown here. This parses normal opening commands (like "=cmd"), opening commands with options ("=cmd[...]" or "=cmd{...}"), and ending commands ("=cmd." or "=cmd/"); commands with options eat at most one space following them (i.e., "=cmd[...]text" is exactly the same as "=cmd[...] text"; if you need a space after a command with options, use two spaces); ending commands do not eat any space (so "=cmd.text" is not the same as "=cmd. text"); commands without options require at least a space following them and will eat all spaces following them. Note that the mk: magic-char :mk construct allows putting commands one after another without spaces between them (e.g., "=b=i"). Note that this rule also parses the dot command "=.".

As you can see each command is sent to the second stage as a pair of values: the command as a string, and the command's options (which can be none). As you have already seen in The parse rules, this is also true for text, which is sent as the text: (a set-word!) command with the text as options (a set-word! is used so that there is no clash with other commands, e.g. if we ever need to add a =text command and so on).

parse rule for commands

any spc newline any spc (stage2 "^/" none)
|
magic-char [
 Special commands
 |
 ; ignore a = at the end of the text
 end
 |
 copy cmd any cmd-chars [
  "[" copy options to "]" skip opt spc (stage2 cmd options)
  |
  "{" copy options to "}" skip opt spc (stage2 cmd options)
  |
  ["." | "/"] (stage2 join any [cmd ""] "." none)
  |
  [some spc | mk: [newline | magic-char] :mk | end] (stage2 cmd none)
 ]
 |
 ; malformed command?
 (stage2 [text:] magic-char)
]

4.2.1 Special commands

Some command require custom parse rules. I've tried to keep this to a minimum, but Reichart loves special cases. ;-)

Special commands

magic-char (stage2 [text:] magic-char)
|
[" " | mk: newline :mk] (stage2 " " none)
|
"alias" some spc copy cmd some cmd-chars any spc (set-magic cmd)
|
"csv" [
 "[" copy options to "]" skip any spc opt newline (options: refinements/parse-arg-string "csv" any [options ""])
 |
 "{" copy options to "}" skip any spc opt newline (options: refinements/parse-arg-string "csv" any [options ""])
 |
 any spc opt newline (options: context [name: show: none])
] (csv: make block! 256) some [
 [magic-char "csv" ["." | "/"] any spc opt newline | end] (stage2 "csv" make options [contents: csv]) break
 |
 [copy txt to newline newline | copy txt to end] (append/only csv parse/all txt ",")
]
|
copy cmd escape-cmd [
 "[" copy options to "]" skip any spc opt newline
 |
 "{" copy options to "}" skip any spc opt newline
 |
 any spc opt newline (options: rejoin [magic-char cmd "."])
] [copy txt to options options any spc opt newline | copy txt to end] (stage2 cmd txt)
|
some "-" [some spc opt newline | newline | mk: magic-char :mk | end] (stage2 "-" none)
|
copy cmd some ">" [some spc | mk: [newline | magic-char] :mk | end] (stage2 ">" length? cmd)
|
"[" copy options to "]" skip (stage2 "" options)
|
"{" copy options to "}" skip (stage2 "" options)
|
"," (stage2 "," none)
|
"repeat" (options: make block! 16) [
 (opt-open-char: "[" opt-close-char: "]") rebol-options (stage2 "repeat" options)
 |
 (opt-open-char: "{" opt-close-char: "}") rebol-options (stage2 "repeat" options)
 |
 ["." | "/"] (stage2 "repeat." none)
]

4.2.2 Words used by the commands rule

The commands parse rule uses a number of words that we need to make local. It also uses the rebol-options subrule which we define here.

cmd-chars and magic-char are initialized by the set-magic function (see The set-magic function); escape-cmd is the list of escape commands.

rebol-options uses the load-next function to parse the options as REBOL values; in turn it uses load/next to parse the text and tries to recover in case of errors.

The parse rules +≡

mk: cmd: options: csv: none
spc: charset " ^-"
spc+: charset " ^-^/"
cmd-chars: none
magic-char: none

escape-cmd: ["HTML" | "REBOL" | "MakeDoc" | "Example"]

opt-open-char: "[" opt-close-char: "]"
rebol-options: [
 opt-open-char
 txt: (txt: load-next options txt) :txt
 some [any spc+ opt-close-char break | end break | txt: (txt: load-next options txt) :txt]
 opt spc
]
load-next: func [out text /local val] [
 if error? try [
  set [val text] load/next text
  insert/only tail out val
 ] [
  insert tail out copy/part text text: any [find text opt-close-char tail text]
 ]
 text
]

4.3 The set-magic function

This function initializes the values of magic-char, cmd-chars and txt-chars. cmd-chars is a charset containing the characters that are valid in command names; txt-chars contains the characters that are valid for text outside commands.

The set-magic function

set-magic: func [magic [string!]] [
 ; magic cannot be empty
 if empty? magic [magic: "="]
 magic-char: magic
 cmd-chars: complement charset join " ^-^/[]{}./" first magic-char
 txt-chars: complement charset join "^/" first magic-char
]

5. Balancing (stage 2): the stage2 function

The second stage of the pipeline is meant to simplify the third stage by "normalizing" the command stream. It takes care of ensuring command balancing, and handles the space, dot and comma commands ("= ", "=." and "=,"). Parsing of command options is also done in the stage2 function (just because this is a convenient place to do it, even though conceptually it belongs to the first stage).

This stage uses a Finite State Machine, handling each command as an event. Please see the documentation of the FSM interpreter for more informations (linked in from Overview).

As seen in the first stage, the stage2 function is invoked with a command and its options. A command can be a string! or a set-word!; for simplicity set-words can be passed in a block (as in stage2 [text:] txt), so if cmd is a block we just extract the first value. The options are parsed with the parse-command-options function; the cmd and opts words are then set in the stage2-ctx object, so that they are available to the FSM rules, and cmd is issued as an event for the state machine. (stage2 needs to do another thing after this, we'll discuss this later on.)

Balancing (stage 2): the stage2 function

stage2-fsm: make fsm! [ ]

stage2: func [cmd opts] [
 if block? cmd [cmd: first cmd]
 stage2-ctx/opts: parse-command-options stage2-ctx/cmd: cmd :opts
 stage2-fsm/event cmd
 Additional code for the stage2 function
]

stage2-ctx: context [
 cmd: opts: none
 Balancing (stage 2): the Finite State Machine
]

Initialization and termination of the FSM

The merge-style function

6. Command options parsing

Command options (also called "refinements" in the code) are parsed by the parse-command-options function, which is called by the stage2 function (although, conceptually options parsing belongs to the first stage; the stage2 function parses the options before actually processing the command in the FSM).

Only some commands need their options to be parsed; if cmd is one of them, and options is a string, then the parse-arg-string function is used. Otherwise options is returned as-is.

Command options parsing

parse-command-options: func [cmd options] [
 either all [
  string? options
  find [
   "table" "row" "column" "cell" "box"
   "image" "font" "f" "span" "data"
  ] cmd
 ] [
  refinements/parse-arg-string cmd options
 ] [
  options
 ]
]

refinements: context [
 Option values' types
 Type map
 Object map
 parse-arg-string's support functions
 parse-arg-string: func [cmd args /local parse-arg-string's locals] [
  Parse the args string into an object
 ]
]

6.1 Option values' types

Options are parsed as a list of values; option values can be of a number of different types, each recognized by syntax (like in REBOL) and parsed by a specific parse rule. Values can be of one of the following types: flag!, set-word!, color!, string!, integer!, url!, percent!, pair! and comma-pair!; after being parsed, each value is represented by a REBOL value, and each type is mapped to a REBOL type: flag! is mapped to word!; set-word!, string!, integer!, url! and pair! are mapped to the respective REBOL types; color! is mapped to one of issue!, refinement! or tuple! (depending on how the color was specified, if as a hex string, a color name or a tuple); percent! is mapped to money!; comma-pair! is mapped to block!.

The types object holds the parse rules for the option types. The flag-word and set-word rules are set dynamically based on the command that is being parsed. Note that strings can be specified even without quotes, so the string! rule must be applied last (other types need to be tried first); also note that we only allow integer percent values, even tough we use money! to represent them; also, comma-pair! can contain integers or percents, so the block representing it can contain integer! or money! values.

Option values' types

types: context [
 flag!: [flag-word [some spc | end]]
 set-word!: [set-word any spc]
 color!: [
  [
   color-keyword
   |
   tuple
   |
   [opt "#" copy value 6 hex-digits | "#" copy value 3 hex-digits] (value: to issue! value)
  ]
  [some spc | end]
 ]
 string!: [
  [{"} copy value some dquotechars {"} | "'" copy value some quotechars "'" | copy value some chars]
  [some spc | end]
 ]
 integer!: [copy value some digits (value: to system/words/integer! value) [some spc | end]]
 url!: [
  copy value [some urlchars ":" 0 2 "/" some urlchars any ["/" some urlchars]]
  (value: to system/words/url! value)
  [some spc | end]
 ]
 percent!: [copy value 1 3 digits "%" (value: to money! value) [some spc | end]]
 pair!: [copy value [some digits "x" some digits] (value: to system/words/pair! value) [some spc | end]]
 comma-pair!: [
  (value: make block! 4)
  copy val some digits ["%" (append value to money! val) | none (append value to integer! val)]
  ","
  copy val some digits ["%" (append value to money! val) | none (append value to integer! val)]
  [some spc | end]
 ]
]

6.1.1 Subrules and other words used by the parse rules

As already said above, flag-word and set-word are dynamically set to a command-specific rule. See Rules for flag! and Rules for set-word!.

Option values' types +≡

value: val: none
chars: complement spc: charset " ^-^/"
urlchars: complement charset {"':/ ^- }
dquotechars: complement charset {"}
quotechars: complement charset "'"
digits: charset "1234567890"
hex-digits: union digits charset "ABCDEFabcdef"
flag-word: none
set-word: none
Rules for color!
Rules for flag!
Rules for set-word!

6.1.2 Rule for parsing a generic value

The rule value-rule can be used to parse a generic (non flag or set-word) value; the order of the rules is significant. (Note that it's written this way for compatibility with older REBOLs.)

Option values' types +≡

value-rule: bind [color! | percent! | pair! | comma-pair! | integer! | url! | string!] in types 'self

6.1.3 Rules for color!

Rules for color!

color-keyword: [
 "clear" (value: /transparent) | copy value [
  "aliceblue" | "antiquewhite" | "aqua" | "aquamarine" | "azure" |
  "beige" | "bisque" | "black" | "blanchedalmond" | "blue" |
  "blueviolet" | "brown" | "burlywood" | "cadetblue" | "chartreuse" |
  "chocolate" | "coral" | "cornflowerblue" | "cornsilk" | "crimson" |
  "cyan" | "darkblue" | "darkcyan" | "darkgoldenrod" | "darkgray" |
  "darkgreen" | "darkkhaki" | "darkmagenta" | "darkolivegreen" |
  "darkorange" | "darkorchid" | "darkred" | "darksalmon" |
  "darkseagreen" | "darkslateblue" | "darkslategray" | "darkturquoise" |
  "darkviolet" | "deeppink" | "deepskyblue" | "dimgray" |
  "dodgerblue" | "feldspar" | "firebrick" | "floralwhite" |
  "forestgreen" | "fuchsia" | "gainsboro" | "ghostwhite" | "gold" |
  "goldenrod" | "gray" | "green" | "greenyellow" | "honeydew" |
  "hotpink" | "indianred" | "indigo" | "ivory" | "khaki" | "lavender" |
  "lavenderblush" | "lawngreen" | "lemonchiffon" | "lightblue" |
  "lightcoral" | "lightcyan" | "lightgoldenrodyellow" | "lightgreen" |
  "lightgrey" | "lightpink" | "lightsalmon" | "lightseagreen" |
  "lightskyblue" | "lightslateblue" | "lightslategray" |
  "lightsteelblue" | "lightyellow" | "lime" | "limegreen" | "linen" |
  "magenta" | "maroon" | "mediumaquamarine" | "mediumblue" |
  "mediumorchid" | "mediumpurple" | "mediumseagreen" |
  "mediumslateblue" | "mediumspringgreen" | "mediumturquoise" |
  "mediumvioletred" | "midnightblue" | "mintcream" | "mistyrose" |
  "moccasin" | "navajowhite" | "navy" | "oldlace" | "olive" |
  "olivedrab" | "orange" | "orangered" | "orchid" | "palegoldenrod" |
  "palegreen" | "paleturquoise" | "palevioletred" | "papayawhip" |
  "peachpuff" | "peru" | "pink" | "plum" | "powderblue" | "purple" |
  "red" | "rosybrown" | "royalblue" | "saddlebrown" | "salmon" |
  "sandybrown" | "seagreen" | "seashell" | "sienna" | "silver" |
  "skyblue" | "slateblue" | "slategray" | "snow" | "springgreen" |
  "steelblue" | "tan" | "teal" | "thistle" | "tomato" | "turquoise" |
  "violet" | "violetred" | "wheat" | "white" | "whitesmoke" | "yellow" |
  "yellowgreen" | "transparent"
 ] (value: to refinement! value)
]
tuple: [
 copy value [1 3 digits "." 1 3 digits "." 1 3 digits] (value: attempt [to tuple! value])
]

6.1.4 Rules for flag!

Each command has a list of available flags. Note that order is important (because of the way the bold and italic rules are defined, for example).

Rules for flag!

flag-words: [
 "table" [
  outline | dashed | dotted | solid | borderless | vertical | horizontal | all | hide | headerless |
  center | left | right | justify | middle | top | bottom | imagecenter | imageleft |
  imageright | imagemiddle | imagetop | imagebottom | float | space2 | tilev | shadow | rounded |
  tileh | tileless | tile | boxcenter | boxleft | boxright | times | helv | courier | bold | italic
 ]
 "cell" "row" "column" [
  outline | dashed | dotted | solid | borderless | all |
  center | left | right | justify | middle | top | bottom | imagecenter | imageleft |
  imageright | imagemiddle | imagetop | imagebottom | tilev |
  tileh | tileless | tile | times | helv | courier | bold | italic
 ]
 "box" [
  outline | dashed | dotted | solid | borderless |
  center | left | right | justify | middle | top | bottom | imagecenter | imageleft |
  imageright | imagemiddle | imagetop | imagebottom | float | tilev |
  tileh | tileless | tile | boxcenter | boxleft | boxright | times | helv | courier | shadow | rounded |
  bold | italic
 ]
 "image" [
  outline | dashed | dotted | solid | borderless | float |
  boxleft | space ;| shadow | rounded
 ]
 "font" "f" [
  times | helv | courier | bold | italic | space
 ]
 "span" none
 "csv" [show]
 "data" none
]
bold: ["b" opt "old" (value: 'bold)]
italic: ["i" opt ["talic" opt "s"] (value: 'italic)]
vertical: [["vertical" | "tablev"] (value: 'vertical)]
float: [["float" | "flow"] (value: 'float)]
tilev: ["tilev" opt "ertical" (value: 'tilev)]
tileh: ["tileh" opt "orizontal" (value: 'tileh)]
space2: ["space" (value: 'force-space)]

We have only specified the parse rules for a small set of flags; all the others are generated automatically: for example, [dashed] becomes ["dashed" (value: 'dashed)]. This is done by processing the flag-words block using parse.

Rules for flag! +≡

rule: word: none
parse flag-words [
 some [
  some string! set rule block! (
   while [not tail? rule] [
    either all [rule/1 <> '| not block? get/any word: rule/1] [
     rule: insert/only change rule
      form word to paren! compose [value: (to lit-word! word)]
    ] [rule: next rule]
   ]
  )
  |
  some string! rule: 'none (rule/1: [end skip])
 ]
]

6.1.5 Flag actions

By default, a flag sets the respective word to true in the object that represents the parsed options. Some flags, however, do something different; for example, the dashed flag sets the word outline-style to 'dashed. The flag-actions object contains the flags that do something different from the default.

Rules for flag! +≡

flag-actions: context [
 dashed: [outline-style: 'dashed]
 dotted: [outline-style: 'dotted]
 solid: [outline-style: 'solid]
 outline: [outline-style: 'solid]
 borderless: [outline-style: 'borderless]
 rounded: [outline-style: 'rounded]
 center: [text-halign: 'center]
 left: [text-halign: 'left]
 right: [text-halign: 'right]
 justify: [text-halign: 'justify]
 middle: [text-valign: 'middle]
 top: [text-valign: 'top]
 bottom: [text-valign: 'bottom]
 imagecenter: [image-halign: 'center]
 imageleft: [image-halign: 'left]
 imageright: [image-halign: 'right]
 imagemiddle: [image-valign: 'center]
 imagetop: [image-valign: 'top]
 imagebottom: [image-valign: 'bottom]
 tile: [image-tiling: 'both]
 tilev: [image-tiling: 'vertical]
 tileh: [image-tiling: 'horizontal]
 tileless: [image-tiling: 'neither]
 times: [typeface: 'times]
 helv: [typeface: 'helvetica]
 courier: [typeface: 'courier]
 boxcenter: [position: 'center]
 boxright: [position: 'right]
 boxleft: [position: 'left]
]

6.1.6 Rules for set-word!

Set-words are parsed in a way similar to flags. Each command has a list of set-words that are accepted.

Rules for set-word!

set-words: [
 "table" [
  color | typeface | fontsize | background | outline | dashed | dotted | solid | image | width | height |
  name
 ]
 "cell" "row" "column" [
  color | typeface | fontsize | background | outline | dashed | dotted | solid | image | width | height |
  column | row
 ]
 "box" [
  color | typeface | fontsize | background | outline | dashed | dotted | solid | image | width | height
 ]
 "image" [
  background | outline | dashed | dotted | solid | src | width | height | space
 ]
 "font" "f" [
  color | typeface | fontsize | background | space
 ]
 "span" none
 "csv" [name]
 "data" [name | index]
]
color: [["colo" opt "u" "r:" | "foreground:" | "fg:"] (value: first [color:])]
typeface: [opt "type" "face:" (value: first [typeface:])]
fontsize: ["size" opt "face" ":" (value: first [fontsize:])]
background: [["background:" | "bg:"] (value: first [background:])]
width: ["w" opt "idth" ":" (value: first [width:])]
height: ["h" opt "eight" ":" (value: first [height:])]
column: ["c" opt "olumn" ":" (value: first [column:])]
row: ["r" opt "ow" ":" (value: first [row:])]

Like with flags, most set-word rules are generated automatically from the above block. This translates [outline] to ["outline:" (value: first [outline:])].

Rules for set-word! +≡

parse set-words [
 some [
  some string! set rule block! (
   while [not tail? rule] [
    either all [rule/1 <> '| not block? get/any word: rule/1] [
     rule: insert/only change rule
      append form word ":" to paren! compose/deep [value: first [(to set-word! word)]]
    ] [rule: next rule]
   ]
  )
  |
  some string! rule: 'none (rule/1: [end skip])
 ]
]

6.1.7 Set-word actions

By default a set-word sets the respective word to the value following it; some however do a special action.

Rules for set-word! +≡

set-actions: context [
 outline: solid: [outline-color: value outline-style: 'solid]
 dashed: [outline-color: value outline-style: 'dashed]
 dotted: [outline-color: value outline-style: 'dotted]
 column: [position: as-pair value 1]
 row: [position: as-pair 1 value]
]

6.1.8 Set-word expected types

Each set-word expects a value of a specified type (of one of a list of specified types) following it; if the following value cannot by parsed with the type rule specified, the assignment is ignored. (That is, you cannot set the color to an integer, or the width to a color.)

Rules for set-word! +≡

var-types: context [
 color: types/color!
 typeface: types/string!
 fontsize: types/integer!
 space: types/integer!
 background: types/color!
 outline: dashed: dotted: solid: types/color!
 width: height: bind [percent! | integer!] in types 'self
 column: row: types/integer!
 image: bind [url! | string!] in types 'self
 name: types/string!
 index: bind [pair! | integer!] in types 'self
]

6.2 Type map

When an option value (not a flag or set-word) is specified without any variable to set it to (i.e., no set-word precedes it), a "type map" is used to determine what variable should be set. Each command has its own type map; for example, the first color! value in the options for "table" is set as the background, the second as the text color, and so on.

Type map

type-map: [
 "table" [
  color! [background color outline-color]
  string! [image typeface]
  integer! [width height fontsize]
  url! [image]
  percent! [width height]
  pair! [table-size]
  comma-pair! [(width: value/1 height: value/2)]
 ]
 "row" "column" [
  color! [background color outline-color]
  string! [image typeface]
  integer! [position width height fontsize]
  url! [image]
  percent! [width height]
  pair! [none]
  comma-pair! [(width: value/1 height: value/2)]
 ]
 "cell" [
  color! [background color outline-color]
  string! [image typeface]
  integer! [width height fontsize]
  url! [image]
  percent! [width height]
  pair! [position]
  comma-pair! [(width: value/1 height: value/2)]
 ]
 "box" [
  color! [background color outline-color]
  string! [image typeface]
  integer! [width height fontsize]
  url! [image]
  percent! [width height]
  pair! [none]
  comma-pair! [(width: value/1 height: value/2)]
 ]
 "image" [
  color! [outline-color background]
  string! [src]
  integer! [width height]
  url! [src]
  percent! [width height]
  pair! [none]
  comma-pair! [(width: value/1 height: value/2)]
 ]
 "font" "f" [
  color! [color background]
  string! [typeface]
  integer! [fontsize space]
  url! [none]
  percent! [none]
  pair! [none]
  comma-pair! [none]
 ]
 "span" [
  color! [none]
  string! [none]
  integer! [none]
  url! [none]
  percent! [none]
  pair! [start end]
  comma-pair! [none]
 ]
 "csv" [
  color! [none]
  string! [name]
  integer! [none]
  url! [none]
  percent! [none]
  pair! [none]
  comma-pair! [none]
 ]
 "data" [
  color! [none]
  string! [name]
  integer! [index]
  url! [none]
  percent! [none]
  pair! [index]
  comma-pair! [none]
 ]
]

Since values are mapped after being parsed, and so after being represented as REBOL types, the types specified above need to be replaced by the respective REBOL types that represent them; as usual we do this using parse on the type-map block.

Type map +≡

parse type-map [
 some [
  some string! set rule block! (
   foreach [from to] [
    color! [issue! refinement! tuple!]
    percent! money!
    comma-pair! block!
   ] [
    replace rule from to
   ]
  )
 ]
]

6.3 Object map

The object-map block defines the template options object for each command.

Object map

object-map: [
 "table" [
  background: color: outline-color: image: typeface: width: height: fontsize:
  table-size: bold: italic: outline-style: vertical: text-halign: text-valign:
  image-halign: image-valign: float: image-tiling: position: force-space: shadow:
  all: hide: name: headerless: horizontal: none
 ]
 "row" "column" "cell" [
  background: color: outline-color: image: typeface: width: height: fontsize:
  bold: italic: outline-style: text-halign: text-valign:
  image-halign: image-valign: image-tiling: position: all: none
 ]
 "box" [
  background: color: outline-color: image: typeface: width: height: fontsize:
  bold: italic: outline-style: text-halign: text-valign:
  image-halign: image-valign: float: image-tiling: position: shadow: none
 ]
 "image" [
  background: outline-color: src: width: height: outline-style:
  float: position: space: none ;shadow: none
 ]
 "font" "f" [
  bold: italic: typeface: color: background: fontsize: space: none
 ]
 "span" [
  start: end: none
 ]
 "csv" [
  show: name: none
 ]
 "data" [
  name: index: none
 ]
]

6.4 Parse the args string into an object

Now that we have all that we need, we can write the code for the parse-arg-string function. First of all, we do some initialization, by setting flag-word and set-word to the appropriate rules, and by finding the correct type map and object template. Then we use parse on the args string: we can have a flag, in which case we either do the action specified in flag-actions or set the respective word to true; or we can have one or more set-words followed by a value; or we can have a value by itself, to be handled as specified in the type map (using the set-value-from-type function).

Note that we allow more than one set-word to be set to one value, like in "width: height: 100" which sets both the width and the heigth to 100. The accepted type of the value is the accepted type for the last variable specified, which is a problem, because you can write "face: size: 10" and have the font face set to 10, which is incorrect. However, since such a case is a user error anyway, we're not handling it for now. In the future, we need to make sure that the value is acceptable for all the words it is being set to.

Also note that we need to do some magic to make string work correctly even in assence of quotes; for example, in the case "=font[Arial Black]", since there are no quotes, two string values would be parsed, i.e. "Arial" and "Black", while we want it to be parsed as one string value, "Arial Black": so when we parse a string value, we check if the previous value was a string value too, and in this case we join the two values and consider it one. This is what the last-str word is used for.

Parse the args string into an object

flag-word: select* flag-words cmd
set-word: select* set-words cmd
tmap: select* type-map cmd
obj: get-obj cmd
tflag!: types/flag!
tset-word!: types/set-word!
parse/all args [
 any spc some [
  tflag! (
   last-str: none
   either in flag-actions value [
    do bind get in flag-actions value in obj 'self
   ] [
    set in obj value true
   ]
  )
  |
  (vars: clear [])
  some [tset-word! (append vars value: to word! :value)] (var-type: get in var-types value)
  var-type (
   last-str: either string? :value [value] [none]
   foreach var vars [
    either in set-actions var [
     do bind get in set-actions var in obj 'self
    ] [
     set in obj var value
    ]
   ]
  )
  |
  value-rule (
   either string? :value [
    either last-str [
     insert insert tail last-str " " value
    ] [
     last-str: value
     set-value-from-type tmap obj
    ]
   ] [
    last-str: none
    set-value-from-type tmap obj
   ]
  )
 ]
]
obj

6.4.1 parse-arg-string's locals

parse-arg-string's locals

obj tmap var-type last-str vars tset-word! tflag!

6.5 parse-arg-string's support functions

First, we need a version of select that can handle the maps we defined above, where more than one key map to the same (block!) value. This is easily done with parse.

The get-obj function is trivial, too; set-value-from-type uses the type map to decide which word in obj should be set to value (note that value is not passed as an argument; it's defined in the refinements object). It uses select* to get the list of words for the given type, then sets the first word in that list that is still set to none. If the block contains a paren!, it is evaluated instead. (See for example how this is used for comma-pair! in the type maps.)

parse-arg-string's support functions

select*: func [block value] [
 parse block [to value to block! set block block! | (block: none)]
 block
]

get-obj: func [cmd] [make object! select* object-map cmd]

set-value-from-type: func [tmap obj] [
 foreach word select* tmap type?/word :value [
  if paren? :word [
   do bind to block! word in obj 'self
   break
  ]
  if all [word <> 'none none? get word: in obj word] [
   set word value
   break
  ]
 ]
]

7. Balancing (stage 2): the Finite State Machine

The stage 2 FSM has two states (plus three "support" states); this is because QML commands are of two types: "block" level commands (a block is something that can contain paragraphs of text), and "inline" level commands (such as text, formatting commands, and so on - i.e. everything that can be inside a paragraph).

What this FSM needs to do is ensure that all opening and closing commands are correctly balanced, and that there is no block level command inside inline level commands. Formatting commands (almost all the inline level commands are formatting commands) need to be handled somewhat specially, since they can span across paragraphs (that is, exiting from inline level does not end the formatting) and they can be closed out of order; this means that the FSM has to close and reopen commands as needed to transform the input stream into a stream that is balanced and nests block and inline commands correctly.

Balancing (stage 2): the Finite State Machine

Stage 2 functions
The stage 2 FSM states

7.1 The stage 2 FSM states

The initial state of the FSM is the in-block state, which handles block level commands. (Note, we also use the stage 2 to convert commands that have aliases to their canonical name, i.e. "font" becomes "f", and so on.)

After processing, commands are passed to the stage3 function (see Generating QML document tree (stage 3)).

The stage 2 FSM states

in-block: [
 "'" "’" "`" {"} in-line-comment
 ";" "comment" "rem" in-comment
 
 default: (stage3 cmd opts)
 
 "word" (stage3 ":" opts)
 "def" (stage3 "::" opts)
 
 "example" "html" "rebol" "makedoc" "csv" (stage3 cmd opts) eat-one-newline
 
 "table" "center" "left" "justify" "right" "repeat" (open-block cmd opts) eat-one-newline
 "box" (open-block cmd opts)
 
 "box." "table." "cell." "center." "left."
  "justify." "right." "toc." "repeat." (close-block cmd) eat-one-newline
 
 "toc" (close-block "toc." open-block "toc" opts)
 
 "." (remove-all-inline close-all-block) eat-one-newline
 " " (remove-last-inline) eat-one-newline
 "," (remove-all-inline) eat-one-newline
 
 "" "link" "anchor" "a" "li" "image" "data" text: (reopen-inline) continue in-inline
 
 "b" "bold" (add-inline "b" opts) eat-one-newline
 "u" "underline" (add-inline "u" opts) eat-one-newline
 "i" "italics" "italic" (add-inline "i" opts) eat-one-newline
 "font" "f" (add-inline "f" opts) eat-one-newline
 "s" "strike" "strikethrough" (add-inline "s" opts) eat-one-newline
 
 "b." "bold." (remove-inline "b") eat-one-newline
 "u." "underline." (remove-inline "u") eat-one-newline
 "i." "italics." "italic." (remove-inline "i") eat-one-newline
 "font." "f." (remove-inline "f") eat-one-newline
 "s." "strike." "strikethrough." (remove-inline "s") eat-one-newline
 
 "row" "column" "row." "column." "span"
  (close-block/upto "cell." "table" stage3 cmd opts) eat-one-newline
 "cell" (close-block/upto "cell." "table" open-block cmd opts) eat-one-newline
]

The in-inline state processes commands at inline level.

The stage 2 FSM states +≡

in-inline: [
 "'" "’" "`" {"} in-line-comment
 ";" "comment" "rem" in-comment
 default: (stage3 cmd opts)
 
 "." (close-all-inline stage3 "^/" none) continue return
 " " (close-last-inline)
 "," (close-all-inline)
 
 "link" "li" (open-special "link" opts)
 "anchor" "a" (open-special "anchor" opts)
 "link." "li." (close-special "link.")
 "anchor." "a." (close-special "anchor.")
 
 "b" "bold" (open-inline "b" opts)
 "u" "underline" (open-inline "u" opts)
 "i" "italics" "italic" (open-inline "i" opts)
 "s" "strike" "strikethrough" (open-inline "s" opts)
 "font" "f" (open-inline "f" opts)
 "b." "bold." (close-inline "b.")
 "u." "underline." (close-inline "u.")
 "i." "italics." "italic." (close-inline "i.")
 "s." "strike." "strikethrough." (close-inline "s.")
 "font." "f." (close-inline "f.")
 
 "repeat" (open-inline "repeat" opts)
 "repeat." (close-repeat)
 
 "^/" (end-inline) return
 "box" "table" "c" "center" "center." "box." "table." "-"
 "1" "1'" "2" "2'" "3" "3'" "*" "**" "#" "##" "csv"
 ">" ":" "::" "word" "def" "example" "toc" "cell" "cell."
 "row" "column" "left" "right" "left." "right." "r" "l" "span"
 "html" "rebol" "makedoc" "justify" "j" "justify." "toc."
 "o" "x" "2’" "1’" "3’" "row." "column." "4" "5" "6"
  (end-inline) continue return
 
 close-inline: return
]

The other states we need are in-line-comment, in-comment and eat-one-newline.

The stage 2 FSM states +≡

in-line-comment: [
 "^/" return
]
in-comment: [
 ";." "comment." "rem." return
]
eat-one-newline: [
 "^/" return
 default: continue return
]

7.2 Stage 2 functions

Stage 2 functions

; block commands stack
block-stack: [ ]
; inline commands stack
inline-stack: [ ]
; special inline commands
special: none
end-inline: does [
 if special [close-special special]
 close-repeat/only
 temp-close-inline
 stage3 "^/" none
]
; open special inline command (=link or =anchor)
open-special: func [cmd opts] [
 ; close previous special if open
 if special [close-special special]
 special: join cmd "."
 temp-close-inline
 stage3 cmd opts
 reopen-inline
]
; close special inline command
close-special: func [cmd] [
 if special = cmd [
  temp-close-inline
  stage3 cmd none
  reopen-inline
  special: none
 ]
]
; close all open block commands
close-all-block: does [
 ; process in reverse order
 block-stack: skip tail block-stack -2
 while [not empty? block-stack] [
  ; close command
  stage3 join block-stack/1 "." none
  block-stack: skip clear block-stack -2
 ]
]
; reopen all inline commands that where temporarily closed
reopen-inline: does [
 foreach [cmd opts] inline-stack [
  stage3 cmd opts
 ]
]
; remove last command from inline stack
remove-last-inline: has [cmd] [
 either empty? inline-stack [
  if find ["left" "right" "center" "justify"] cmd: pick tail block-stack -2 [
   stage3 join cmd "." none
   clear skip tail block-stack -2
  ]
 ] [
  clear skip tail inline-stack -2
 ]
]
; remove all inline commands
remove-all-inline: has [cmd] [
 clear inline-stack
 block-stack: tail block-stack
 while [find ["left" "right" "center" "justify"] cmd: pick block-stack -2] [
  stage3 join cmd "." none
  block-stack: skip block-stack -2
 ]
 block-stack: head clear block-stack
]
; open block command
open-block: func [cmd opts] [
 stage3 cmd opts
 insert/only insert tail block-stack cmd opts
]
; remove specific inline command
remove-inline: func [cmd] [
 if cmd: find/skip inline-stack cmd 2 [
  remove/part cmd 2
 ]
]
; close specific block command
close-block: func [cmd /upto noclosecmd /local] [
 ; cmd ends with a ., compute opening command without the .
 remove back tail cmd: copy cmd
 ; is the block open?
 if local: find/skip/last block-stack cmd 2 [
  if upto [
   ; closing cmd should never close noclosecmd
   ; so if cmd appears before noclosecmd, it should
   ; not actually be closed
   noclosecmd: find/skip/last block-stack noclosecmd 2
   if all [noclosecmd (index? local) < index? noclosecmd] [
    exit
   ]
  ]
  ; close all blocks up to specified one (included)
  block-stack: tail block-stack
  until [
   block-stack: skip block-stack -2
   ; close this block
   stage3 join block-stack/1 "." none
   block-stack/1 = cmd
  ]
  block-stack: head clear block-stack
 ]
]
; close all open inline commands
close-all-inline: does [
 ; process in reverse order
 inline-stack: skip tail inline-stack -2
 while [not empty? inline-stack] [
  ; close command
  ; since we happen to close and reopen inline commands
  ; a lot, it is possible that we are closing a command that
  ; had just been opened; to avoid that, we check the out block,
  ; and directly remove the open command if that was the last command
  ;either inline-stack/1 = pick tail out -2 [
  ; clear skip tail out -2
  ;] [
   stage3 join inline-stack/1 "." none
  ;]
  inline-stack: skip clear inline-stack -2
 ]
 ; close special commands too
 if special [close-special special]
 ; close alignment commands too
 block-stack: tail block-stack
 while [find ["left" "right" "center" "justify"] cmd: pick block-stack -2] [
  ; we need to emit a newline too if we close an alignment
  if not close-inline? [
   stage3 "^/" none
   ; forces the fsm to go back to in-block state
   close-inline?: yes
  ]
  stage3 join cmd "." none
  block-stack: skip block-stack -2
 ]
 block-stack: head clear block-stack
]
; open inline command
open-inline: func [cmd opts] [
 stage3 cmd opts
 insert/only insert tail inline-stack cmd opts
]
; add inline command (to be opened when we enter inline mode)
add-inline: func [cmd opts] [
 insert/only insert tail inline-stack cmd opts
]
; close last opened inline command
close-last-inline: has [cmd] [
 ; if empty, check for special and alignment
 if empty? inline-stack [
  if special [
   close-special special
   exit
  ]
  if find ["left" "right" "center" "justify"] cmd: pick tail block-stack -2 [
   stage3 "^/" none
   ; forces the fsm to go back to in-block state
   close-inline?: yes
   stage3 join cmd "." none
   clear skip tail block-stack -2
  ]
  exit
 ]
 cmd: pick tail inline-stack -2
 ; check if this had just been opened (see note above)
 ;either cmd = pick tail out -2 [
 ; clear skip tail out -2
 ;] [
  stage3 join cmd "." none
 ;]
 ; remove from stack
 clear skip tail inline-stack -2
]
; close specific inline command
close-inline: func [cmd] [
 ; this is special because we need to close all commands up to
 ; the specified one (included), but then reopen all except the specified
 ; one. (so that "=b text =i text =b. text" becomes "=b text =i text =i. =b. =i text")
 ; cmd ends with a ., compute opening command without the .
 remove back tail cmd: copy cmd
 ; is the command open?
 if find/skip inline-stack cmd 2 [
  ; close all commands up to specified one (included)
  inline-stack: tail inline-stack
  until [
   inline-stack: skip inline-stack -2
   ; see note in close-all-inline
   ;either inline-stack/1 = pick tail out -2 [
   ; clear skip tail out -2
   ;] [
    stage3 join inline-stack/1 "." none
   ;]
   inline-stack/1 = cmd
  ]
  ; remove this command
  remove/part inline-stack 2
  ; reopen all other commands
  foreach [cmd opts] inline-stack [
   stage3 cmd opts
  ]
  inline-stack: head inline-stack
 ]
]
; temporarily close all inline commands (to be reopened later)
; this happens at block end... so that "=b =box." becomes "=b =b. =box. =b"
temp-close-inline: does [
 ; if empty, we have nothing to do
 if empty? inline-stack [exit]
 ; process in reverse order
 inline-stack: tail inline-stack
 until [
  inline-stack: skip inline-stack -2
  ; close command
  ; since we happen to close and reopen inline commands
  ; a lot, it is possible that we are closing a command that
  ; had just been opened; to avoid that, we check the out block,
  ; and directly remove the open command if that was the last command
  ;either inline-stack/1 = pick tail out -2 [
  ; clear skip tail out -2
  ;] [
   stage3 join inline-stack/1 "." none
  ;]
  head? inline-stack
 ]
]
close-inline?: no
close-repeat: func [/only] [
 ; similar to close-inline, but does not reopen commands; also could
 ; be closing a block repeat
 ; is the command open?
 either find/skip inline-stack "repeat" 2 [
  ; close all commands up to specified one (included)
  inline-stack: tail inline-stack
  until [
   inline-stack: skip inline-stack -2
   stage3 join inline-stack/1 "." none
   inline-stack/1 = "repeat"
  ]
  ; remove from stack
  inline-stack: head clear inline-stack
 ] [
  ; maybe it's a block repeat?
  if all [not only find/skip block-stack "repeat" 2] [
   ; go back to block mode
   if special [close-special special]
   temp-close-inline
   stage3 "^/" none
   close-inline?: yes
   close-block "repeat."
  ]
 ]
 
]

7.3 Initialization and termination of the FSM

Initialization and termination of the FSM

init-stage2: does [
 ; initialize stuff
 clear stage2-ctx/block-stack
 clear stage2-ctx/inline-stack
 stage2-ctx/special: none
 ;clear stage2-ctx/anchors
 stage2-fsm/init stage2-ctx/in-block
]
end-stage2: does [
 ; cause everything to be closed
 stage2-fsm/event "."
 ; actually, not needed...
 stage2-fsm/end
]

7.4 Additional code for the stage2 function

Additional code for the stage2 function

if stage2-ctx/close-inline? [
 stage2-fsm/event first [close-inline:]
 stage2-ctx/close-inline?: no
]

7.5 The merge-style function

The merge-style function

merge-style: func [old new /copy /local val] [
 if object? new [
  either object? old [
   if copy [old: make old [ ]]
   foreach word next first new [
    if val: get in new word [
     set in old word val
    ]
   ]
  ] [
   old: new
  ]
 ]
 old
]

8. Scan a text string and return a QML document tree

Scan a text string and return a QML document tree

init-stage2
init-stage3
if not keep [set-defaults defaults]
parse-qml text if defaults [select defaults "alias"]
end-stage2
; returns output
end-stage3

8.1 Search a QML document tree for a substring

Search a QML document tree for a substring

res: make block! 16
doc: build-search-index doc
if anchor: find doc/2 text [
 insert/only insert/only res anchor/3 anchor/2
]
foreach [header string] skip doc 2 [
 while [string: find/tail string text] [
  insert insert/only tail res header copy/part skip string -50 120
 ]
]
res

8.1.1 search's locals

search's locals

res anchor

9. Searching a QML document

Searching a QML document

build-search-index: func [doc [block!] /local rule result val anchor] [
 result: copy/deep [toa [] doc-start ""]
 rule: [
  into [
   [
    'para | 'header4 | 'header5 | 'header6 | 'bullets | 'enum | 'checks |
    'definitions | 'box | 'section | 'center | 'left | 'right | 'justify |
    'item | 'check | 'term | 'desc | 'header1* | 'header2* | 'header3*
   ] opt ['opts skip] any rule (insert tail last result newline)
   |
   val: ['header1 | 'header2 | 'header3] opt ['opts skip]
    (insert insert/only tail result val make string! 256)
    any rule (insert tail last result newline)
   |
   'escape string! set val string! (insert insert tail last result val newline)
   |
   'table opt ['opts skip] opt [into ['columns to end]] any [
    into [
     'row opt ['opts skip] opt 'header any [
      into ['cell opt ['opts skip] opt ['span skip] opt 'header any rule (insert tail last result " ")]
     ] (insert tail last result newline)
    ]
   ]
   |
   ['bold | 'italic | 'strike | 'link | 'alink | 'font] opt ['opts skip] any rule
   |
   anchor: 'anchor opt [
    'opts set val block! (
     if val: select val [name:] [
      insert/only insert/only insert tail result/2 val anchor pick tail result -2
     ]
    )
   ] any rule
  ]
  |
  set val string! (insert tail last result val)
  |
  skip
 ]
 parse doc ['qml any rule]
 result
]

10. Setting default options

Setting default options

default-number-style:
default-table:
default-row:
default-column:
default-cell:
default-box:
default-image: none
; defaults that cannot be currently set
default-data: context [name: "csv" index: none]
default-repeat: [csv in csv]
merge-style: func [old new /copy /local val] [
 if object? new [
  either object? old [
   if copy [old: make old [ ]]
   foreach word next first new [
    if val: get in new word [
     set in old word val
    ]
   ]
  ] [
   old: new
  ]
 ]
 old
]
set-defaults: func [defaults /local w] [
 default-number-style: default-table: default-row: default-column:
 default-cell: default-box: default-image: none
 if block? defaults [
  foreach [cmd opts] defaults [
   if w: select [
    "toc" default-number-style
    "table" default-table
    "row" default-row
    "column" default-column
    "cell" default-cell
    "box" default-box
    "image" default-image
   ] cmd [
    set w parse-command-options cmd opts
   ]
  ]
 ]
]

11. Generating QML document tree (stage 3)

Generating QML document tree (stage 3)

stage3: func [cmd opts] [
 ;brkpnt "STAGE 3" [cmd opts]
 if block? cmd [cmd: first cmd]
 stage3-ctx/cmd: cmd
 stage3-ctx/opts: opts
 stage3-fsm/event cmd
]
stage3-fsm: make fsm! [ ]
out: [ ]
init-stage3: does [
 clear out
 clear stage3-ctx/blocks
 clear stage3-ctx/vars
 clear stage3-ctx/anchors
 stage3-ctx/local-vars: copy [[ ]]
 stage3-ctx/csvid: stage3-ctx/tabid: 1
 insert out 'qml
 ;stage3-fsm/tracing: yes
 stage3-fsm/init stage3-ctx/initial
]
end-stage3: does [
 stage3-fsm/end
 rewrite out rewrite-rules
 make-toc out
 set-enum-counts out
 out
]
stage3-ctx: context [
 Stage 3 FSM
]
rewrite-rules: use [x y z] [[
 Stage 3 rewrite rules (optimization, handling of =data and =repeat)
]]
Words used by the rewrite rules
Header numbering
Postprocessing
process-link: func [target] [
 either parse/all target [["http://" | "mailto:" | "ftp://" | "www."] to end] [
  compose [target: (target) class: "external" text: (target)]
 ] [
  compose [target: (join http://www.qtask.com/qwiki.cgi?goto= target) class: "internal" text: (target)]
 ]
]

11.1 Header numbering

Header numbering

numbering: context [
 ; only 6 levels supported
 toc-counters: [0 0 0 0 0 0]
 ; number style for each level
 toc-style: ["1. " "1[.1] "]
 chars: complement charset "1AaIi[]"
 make-number: func [level /local style i res mk1 mk2 rpt term cont] [
  ; toc-style set to none means no numbering
  if not toc-style [return ""]
  i: 1
  style: any [pick toc-style level last toc-style]
  res: make string! 16
  ; first increment/reset counters
  poke toc-counters level 1 + pick toc-counters level
  change/dup skip toc-counters level 0 subtract length? toc-counters level
  ; generate number
  term: [
   mk1: any chars mk2: (insert/part tail res mk1 mk2)
   [
    "1" (insert tail res pick toc-counters i)
    |
    "A" (insert tail res pick "ABCDEFGHIJKLMNOPQRSTUVWXYZ" min 26 pick toc-counters i)
    |
    "a" (insert tail res pick "abcdefghijklmnopqrstuvwxyz" min 26 pick toc-counters i)
    |
    "I" (insert tail res uppercase to-roman pick toc-counters i)
    |
    "i" (insert tail res to-roman pick toc-counters i)
   ]
   mk1: any chars mk2: (insert/part tail res mk1 mk2)
   (i: i + 1 cont: either i > level ['break] [[ ]]) cont
  ]
  parse/all/case style [
   some [rpt: "[" some term "]" (cont: either i > level ['break] [[:rpt]]) cont | term]
   rpt: (insert tail res rpt)
  ]
  res
 ]
 romans: [
  ["" "i" "ii" "iii" "iv" "v" "vi" "vii" "viii" "ix"]
  ["" "x" "xx" "xxx" "xl" "l" "lx" "lxx" "lxxx" "xc"]
  ["" "c" "cc" "ccc" "cd" "d" "dc" "dcc" "dccc" "cm"]
 ]
 ; form an integer from 0 to 999 with roman numerals
 to-roman: func [int /local res] [
  int: form int
  res: make string! 16
  forall int [
   insert tail res pick pick romans length? int int/1 - #"/"
  ]
  res
 ]
 ; set toc-style
 set-style: func [style] [
  style: any [style default-number-style]
  if none? style [toc-style: none exit]
  if not toc-style: select/case [
   "1" ["1. " "1[.1] "]
   "A" ["A. " "A[.1] "]
   "a" ["a) " "a1) " "a1[.1]) "]
   "I" ["I. " "I[.1] "]
   "i" ["i) " "i[.1]) "]
  ] style [
   toc-style: parse/all style "|"
  ]
 ]
 ; reset counters
 reset: does [change/dup toc-counters 0 6]
]

11.2 Postprocessing

Postprocessing

collect: func [output doc rule /local node] [
 match doc [set node into rule (append/only output copy/deep node)]
 output
]
mkopts: func [level id] [compose [number: (numbering/make-number level) id: (id)]]
make-toc: func [doc /local style toc here l headid] [
 headid: 1
 match doc [
  'section
  [here: 'opts set style string! (remove/part here 2) :here | (style: none)]
  (numbering/set-style style)
  into ['toc toc: to end] here: to end (
   numbering/reset
   collect toc here [
    ['header1 (l: 1) | 'header2 (l: 2) | 'header3 (l: 3)]
    [
     'opts set here block! (append here mkopts l headid headid: headid + 1)
     |
     here: (here: insert/only insert here 'opts mkopts l headid headid: headid + 1) :here
    ]
    to end
    |
    ; add number but do not collect!
    ['header4 (l: 4) | 'header5 (l: 5) | 'header6 (l: 6)]
    [
     'opts set here block! (insert insert tail here [number:] numbering/make-number l)
     |
     here: (here: insert/only insert here 'opts compose [number: (numbering/make-number l)]) :here
    ]
    ; fail (so that it is not collected)
    end skip
   ]
   ; remove anchors from TOC
   rewrite toc [
    [into ['anchor opt ['opts skip] here: to end]] [(here)]
   ]
  )
 ]
]
count: func [counter options] [
 if block? options [
  options: construct options
  if all [in options 'force integer? options: attempt [to integer! options/force]] [set counter options]
 ]
 options: get counter
 set counter 1 + options
 options
]
set-enum-counts: func [doc /local count1 count2 opts] [
 count1: count2: 1
 match doc [
  'enum some [
   into [
    'item [
     'opts set opts block! (insert insert tail opts [number:] count 'count1 opts)
     |
     opts: (opts: insert/only insert opts 'opts compose [number: (count 'count1 opts)]) :opts
    ] (count2: 1) to end
    |
    'enum some [
     into [
      'item [
       'opts set opts block! (insert insert tail opts [number:] count 'count2 opts)
       |
       opts: (opts: insert/only insert opts 'opts compose [number: (count 'count2 opts)]) :opts
      ] to end
     ]
     |
     skip
    ]
   ]
   |
   skip
  ]
 ]
]

11.3 Stage 3 FSM

Stage 3 FSM

cmd: opts: none

emit: func [val] [
 repend out val
]
blocks: [ ]
open-block: func [name opts /only] [
 insert/only tail blocks out
 insert/only tail out out: make block! 16
 emit name
 if not none? opts [
  if all [object? opts not only] [opts: make-style opts]
  emit ['opts opts]
 ]
]
close-block: does [
 if empty? blocks [exit]
 out: last blocks
 remove back tail blocks
]
; shortcut to make the state inheritance trick readable.
; appending the parent state's directives to the new directives
; makes the new state "inherit" them.
inherit: func [parent-state new-directives] [
 append new-directives parent-state
]
make-style: func [obj /ignore block /local] [
 local: make block! length? obj: third obj
 block: any [block [ ]]
 foreach [word val] obj [
  if all [:val not find block to word! word] [insert/only insert tail local word :val]
 ]
 local
]

; common state directives, used by almost all states
common: [
 ; by default emit unknown commands verbatim
 default: (emit [reduce ['command cmd opts]])
 ; the dot command makes the fsm go to the initial state
 ; note: this should not happen anymore, since "." commands are handled by the scanner.
 "." rewind? initial
 ; collected anchor names block from the scanner
 ; (normally just at the beginning of the doc)
 ;anchors: (anchors: opts)
]
; basic block state, also the initial state
anchors: [ ]
; variables that can be used by =data
vars: [ ]
tabid: 1
open-table: func [opts] [
 if not object? opts [
  opts: refinements/get-obj "table"
 ]
 if not opts/name [
  opts/name: join "table" tabid
  if tabid = 1 [insert insert tail vars "table" context [type: 'alias dest: "table1"]]
  tabid: tabid + 1
 ]
 open-block/only 'table-proto opts
 insert insert tail vars opts/name context [type: 'table-proto name: opts/name contents: out]
]
in-block: initial: inherit common [
 "" "link" "anchor" "image" "b" "i" "f" "data" text: (open-block 'para none) continue in-para (close-block)
 after-para: ( )
 ; newlines produce an empty paragraph in the initial state
 "^/" (emit [[para]])
 "-" (emit [[hrule]])
 "1" (open-block 'header1 none) in-para (close-block)
 "2" (open-block 'header2 none) in-para (close-block)
 "3" (open-block 'header3 none) in-para (close-block)
 "4" (open-block 'header4 none) in-para (close-block)
 "5" (open-block 'header5 none) in-para (close-block)
 "6" (open-block 'header6 none) in-para (close-block)
 "1'" "1’" (open-block 'header1* none) in-para (close-block)
 "2'" "2’" (open-block 'header2* none) in-para (close-block)
 "3'" "3’" (open-block 'header3* none) in-para (close-block)
 "*" "**" (open-block 'bullets none) continue in-ulist (close-block)
 "#" "##" (open-block 'enum none) continue in-olist (close-block)
 "o" "x" (open-block 'checks none) continue in-checklist (close-block)
 ">" (open-block 'para compose [indent: (opts)]) in-para (close-block)
 "::" (open-block 'para [indent: 2]) in-item (close-block)
 ":" (open-block 'definitions none) continue in-dlist (close-block)
 "box" (open-block 'box merge-style/copy default-box opts) in-box (close-block)
 "toc" (open-block 'section opts open-block 'toc none) in-toc (close-block)
 "table" (open-table opts) in-table (close-block)
 "c" (open-block 'para [text-halign: center]) in-para (close-block)
 "center" (open-block 'center none) in-center (close-block)
 "l" (open-block 'para [text-halign: left]) in-para (close-block)
 "left" (open-block 'left none) in-left (close-block)
 "r" (open-block 'para [text-halign: right]) in-para (close-block)
 "right" (open-block 'right none) in-right (close-block)
 "j" (open-block 'para [text-halign: justify]) in-para (close-block)
 "justify" (open-block 'justify none) in-just (close-block)
 "example" "html" "rebol" "makedoc" (emit [reduce ['escape cmd opts]])
 "csv" (handle-csv opts)
 "repeat" (open-block 'repeat any [opts default-repeat]) in-repeat (close-block)
 ; these cannot actually happen in the initial state, but they could
 ; happen in all the states inheriting from in-block
 "center." continue rewind? in-center
 "left." continue rewind? in-left
 "right." continue rewind? in-right
 "justify." continue rewind? in-just
 "box." continue rewind? in-box
 "table." continue rewind? in-table
]
CSV data handling

11.4 CSV data handling

CSV data handling

csvid: 1
handle-csv: func [data] [
 if not object? data [exit]
 data: make data [type: 'csv]
 either data/name [
  insert insert tail vars data/name data
 ] [
  insert insert tail vars join "csv" csvid data
  if csvid = 1 [insert insert tail vars "csv" data]
  csvid: csvid + 1
 ]
 if data/show [
  open-block 'table none
  foreach row data/contents [
   open-block 'row none
   foreach column row [
    emit [reduce ['cell reduce ['para column]]]
   ]
   close-block
  ]
  close-block
 ]
]

11.5 Other states

Stage 3 FSM +≡

; inside a centered block. same as initial state, except that =center. will end it.
in-center: inherit in-block [
 "center." override after-para return
]
; inside a left aligned block. same as initial state, except that =left. will end it.
in-left: inherit in-block [
 "left." override after-para return
]
; inside a right aligned block. same as initial state, except that =right. will end it.
in-right: inherit in-block [
 "right." override after-para return
]
; inside a justified block. same as initial state, except that =justify. will end it.
in-just: inherit in-block [
 "justify." override after-para return
]

in-repeat: inherit in-block [
 "repeat." override after-para return
]

; inside a paragraph (or other inline content)
in-para: inherit common [
 text: (emit opts)
 ; anything else ends the paragraph
 "^/" override after-para return
 ;default: continue return
 ; formatting
 "b" (open-block 'bold none) in-bold (close-block)
 "i" (open-block 'italic none) in-italic (close-block)
 "s" (open-block 'strike none) in-strike (close-block)
 ;"u" (emit <u>) in-underline (emit </u>)
 ; the empty command is actually the =[qwiki link] command
 "" (emit [reduce ['qlink opts]])
 "link" (open-block 'link-proto opts) in-link (close-block)
 "f" (open-block 'font opts) in-font-inline (close-block)
 "image" (emit [reduce ['image 'opts make-style merge-style/copy default-image opts]])
 "anchor" (open-block 'anchor opts if opts [insert/only insert tail anchors opts out]) in-anchor (close-block)
 "data" (emit [reduce ['data make-style merge-style/copy default-data opts]])
 "repeat" (open-block 'repeat any [opts default-repeat]) in-repeat-inline (close-block)
]
in-repeat-inline: inherit in-para [
 "repeat." return
 ; note that "^/" cannot happen here thanks to stage2
]
; inside a link
in-link: inherit in-para [
 "link." return
]
; inside an anchor
in-anchor: inherit in-para [
 "anchor." return
]
; inside inline font command
in-font-inline: inherit in-para [
 "f." return
]
; inside inline bold
in-bold: inherit in-para [
 "b" in-bold ; recurse (to enforce balancing)
 "b." return
]
; inside inline italic
in-italic: inherit in-para [
 "i" in-italic ; recurse
 "i." return
]
; inside strikethrough
in-strike: inherit in-para [
 "s" in-strike ; recurse
 "s." return
]
; inside underline
in-underline: inherit in-para [
 "u" in-underline ; recurse
 "u." return
]

; inside a term definition block (=:)
in-dlist: [
 ; a new =: starts a new definition in the same block
 ":" (open-block 'term none) in-item (close-block)
 ; =:: starts the definition description
 "::" (open-block 'desc none) in-item (close-block)
 after-para: ( )
 ; anything else outside of a description ends the definition block
 default: continue return
]

; inside a checklist
in-checklist: [
 ; new items
 "o" (open-block 'check compose [checked: (no)]) in-item (close-block)
 "x" (open-block 'check compose [checked: (yes)]) in-item (close-block)
 ; nothing to do after para
 after-para: ( )
 ; anything else ends the list
 default: continue return
]

; inside a bullet list
in-ulist: [
 ; new item
 "*" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]]) in-item (close-block)
 ; start sublists
 "**" (open-block 'bullets none) continue in-ulist2 (close-block)
 "##" (open-block 'enum none) continue in-olist2 (close-block)
 ; nothing to do after para
 after-para: ( )
 ; anything else ends the list
 default: continue return
]
; inside a numbered list
in-olist: [
 ; new item
 "#" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]]) in-item (close-block)
 ; start sublists
 "##" (open-block 'enum none) continue in-olist2 (close-block)
 "**" (open-block 'bullets none) continue in-ulist2 (close-block)
 ; nothing to do after para
 after-para: ( )
 ; anything else ends the list
 default: continue return
]
; bullet sublist
in-ulist2: [
 ; new item
 "**" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]]) in-item (close-block)
 ; nothing to do after para
 after-para: ( )
 ; anything else ends the list
 default: continue return
]
; numbered sublist
in-olist2: [
 ; new item
 "##" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]]) in-item (close-block)
 ; nothing to do after para
 after-para: ( )
 ; anything else ends the list
 default: continue return
]
; inside a list's item
in-item: in-para

; inside a box
; box should work this way:
; if there's something after the =box command and before the newline, treat
; it as the box's header; otherwise, if there's a newline immediately following
; the =box command, emit no header
in-box: [
 ; newline just after =box, emit no header
 ; also go to box-contents after the header
 "^/" after-para: box-contents
 ; a block level command just after =box, emit no header
 default: continue box-contents
 ; inline level command means that we have a header
 "" "link" "b" "i" "f" "s" "anchor" "data" "image" text: (open-block 'title none) continue in-para (close-block) ; "u"
 ; close command or coming back from box-contents
 "box." override after-para return
]
box-contents: inherit in-block [
 ; same as initial except that ends on =box.
 "box." continue return
]

in-toc: [
 ; newline just after =toc, emit no title
 ; also go to in-block after the header
 "^/" after-para: (close-block) in-toc2
 ; a block level command just after =toc, emit no title
 default: (close-block) continue in-toc2
 ; inline level command means that we have a title
 "" "link" "b" "i" "f" "s" "anchor" "data" "image" text: (open-block 'title none) continue in-para (close-block) ; "u"
 ; close command or coming back from in-toc2
 "toc." (close-block) override after-para return
]
in-toc2: inherit in-block [
 "toc." override after-para 2 return
]

Generating tables

11.6 Generating tables

Generating tables

; table states (see also table code below)
; inside a table
in-table: inherit common [
 "row" (emit ['row opts])
 "column" (emit ['column opts])
 ; go back to saved position
 "row." "column." (emit 'return)
 "cell" (open-block/only 'cell opts) in-cell-block (close-block)
 "table." return
 ; define a spanning cell
 "span" (emit ['span opts])
 ; anything else starts a new cell
 default: (open-block 'cell none) continue in-cell (close-block)
 "repeat" (open-block 'repeat any [opts default-repeat]) in-table-repeat (close-block)
]
in-table-repeat: inherit in-table [
 "repeat." return
]
; inside an implied table cell (no =cell command given)
in-cell: inherit in-block [
 ; lists are treated specially in tables... each item in a cell
 "*" (
  open-block 'bullets none
  open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]]
 ) in-item (close-block close-block)
 "**" (
  open-block 'bullets none
  open-block 'bullets none
  open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]]
 ) in-item (close-block close-block close-block)
 "#" (
  open-block 'enum none
  open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]]
 ) in-item (close-block close-block)
 "##" (
  open-block 'enum none
  open-block 'enum none
  open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]]
 ) in-item (close-block close-block close-block)
 "::" (open-block 'para [indent: 2]) in-item (close-block)
 ":" (open-block 'definitions none) override define in-cell-dlist (close-block)
 "o" (open-block 'checks none open-block 'check no) in-item (close-block close-block)
 "x" (open-block 'checks none open-block 'check yes) in-item (close-block close-block)
 ; end current cell, go back to in-table state
 after-para: "^/" return
]
; inside a block table cell (=cell ... =cell.)
in-cell-block: inherit in-block [
 ; end current cell
 "cell." return
]
; inside a term definition block (=:) inside a cell
in-cell-dlist: [
 define: (open-block 'term none) in-item (close-block)
 ; =:: starts the definition description
 "::" (open-block 'desc none) in-item (close-block)
 after-para: ( )
 ; anything else outside of a description ends the definition block and the cell
 default: continue 2 return
]

header?: func [type col row] [
 switch type [
  horiz [row = 1]
  vert [col = 1]
  both [any [row = 1 col = 1]]
  none [false]
 ]
]
generate-table: func [opts body /local table result content tmp i j header] [
 result: copy [table]
 if opts [insert/only insert tail result 'opts make-style/ignore opts [name]]
 table: make table-state! [
  style: opts
  if all [object? style style/table-size] [size: style/table-size]
  table: make block! 16
  columns: make block! 16
 ]
 ; first, eval any 'repeat etc.
 body: rewrite copy body rewrite-rules
 parse body [
  some [
   'row set opts skip (add-row table opts)
   |
   'column set opts skip (add-col table opts)
   |
   'return (table-go-back table)
   |
   into ['cell ['opts set opts skip | (opts: none)] content: to end (add-cell table opts content)]
   |
   'span set opts skip (make-span table opts) opt [
    into ['cell ['opts set opts skip | (opts: none)] content: to end (set-cell table opts content)]
   ]
  ]
 ]
 header: 'horiz
 if object? table/style [
  if table/style/name [
   poke find vars table/style/name 2 table
  ]
  if table/style/hide [return [hidden-table]]
  if table/style/vertical [header: either table/style/horizontal ['both] ['vert]]
  if table/style/headerless [header: 'none]
 ]
 insert/only tail result content: copy [columns]
 foreach col table/columns [
  insert/only tail content either col [reduce ['column 'opts make-style col]] [[column]]
 ]
 j: 1
 foreach row table/table [
  ; emit a row
  ; row can be none if it's empty
  either object? row [
   i: 1
   insert/only tail result content: copy [row]
   if row/style [insert/only insert tail content 'opts make-style row/style]
   if all [find [horiz both] header j = 1] [insert tail content 'header]
   foreach cell row/contents [
    either object? cell [
     ; emit cell, unless it is hidden by spanned cell
     if cell/type = 'cell [
      insert/only tail content compose [
       cell (either cell/style ['opts] [[ ]])
        (either cell/style [reduce [make-style/ignore cell/style [position]]] [[ ]])
        (either cell/spansize ['span] [[ ]]) (any [cell/spansize [ ]])
        (either header? header i j ['header] [[ ]])
        (cell/out)
      ]
     ]
    ] [
     ; cell can be none if it's empty
     tmp: make-cell-style table none row pick table/columns i
     insert/only tail content compose [
      cell (either tmp ['opts] [[ ]])
       (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]])
       (either header? header i j ['header] [[ ]])
     ]
    ]
    i: i + 1
   ]
   loop table/size/x - length? row/contents [
    tmp: make-cell-style table none row pick table/columns i
    insert/only tail content compose [
     cell (either tmp ['opts] [[ ]])
      (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]])
      (either header? header i j ['header] [[ ]])
    ]
    i: i + 1
   ]
  ] [
   ; if row contains a spanned cell, then it is never empty (we put references there)
   insert/only tail result content: copy [row]
   if all [find [horiz both] header j = 1] [insert tail content 'header]
   if not any [table/style]
   repeat i table/size/x [
    tmp: make-cell-style table none none pick table/columns i
    insert/only tail content compose [
     cell (either tmp ['opts] [[ ]])
      (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]])
      (either header? header i j ['header] [[ ]])
    ]
   ]
  ]
  j: j + 1
 ]
 loop table/size/y - length? table/table [
  insert/only tail result content: copy [row]
  if all [find [horiz both] header j = 1] [insert tail content 'header]
  repeat i table/size/x [
   tmp: make-cell-style table none none pick table/columns i
   insert/only tail content compose [
    cell (either tmp ['opts] [[ ]]) (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]])
     (either header? header i j ['header] [[ ]])
   ]
  ]
  j: j + 1
 ]
 result
]

; Table handling - needs two passes
table-state!: context [
 type: 'table
 ; block of rows
 table:
 ; list of column styles
 columns:
 ; current row
 currow:
 ; current cell
 curcell:
 ; current position (pair!)
 curpos: none
 ; direction (where to add new cells, 1x0 = to the right, 0x1 = to the bottom, and so on)
 dir: 0x1
 ; table size in cells
 size: 0x0
 ; table style
 style: none
 ; =row and =column save current position so that =row. and =column. can go back to it
 savepos: savedir: none
]
inherit-style: func [dest source words] [
 foreach word words [
  if none? get in dest word [
   set in dest word get in source word
  ]
 ]
 dest
]
make-cell-style: func [table-state style row col] [
 if not style [
  style: context [
   background: color: outline-color: image: typeface: width: height: fontsize:
   bold: italic: outline-style: text-halign: text-valign:
   image-halign: image-valign: image-tiling: position: none
  ]
 ]
 ; inherit outline style from table if needed
 if all [table-state/style table-state/style/all] [
  inherit-style style table-state/style [outline-color outline-style]
 ]
 either table-state/dir = 1x0 [
  ; row style has priority, so it must be applied last
  if all [col col/all] [
   ; background is there to override header cells
   inherit-style style col [
    background color outline-color typeface height fontsize bold italic outline-style
    text-halign text-valign
   ]
  ]
  if all [row row/style row/style/all] [
   inherit-style style row/style [
    background outline-color width outline-style
   ]
  ]
 ] [
  ; column style has priority, so it must be applied last
  if all [row row/style row/style/all] [
   inherit-style style row/style [
    background outline-color width outline-style
   ]
  ]
  if all [col col/all] [
   ; background is there to override header cells
   inherit-style style col [
    background color outline-color typeface height fontsize bold italic outline-style
    text-halign text-valign
   ]
  ]
 ]
 ; if no value has been specified, just cancel whole style
 if parse second style [object! some none! end] [style: none]
 style
]
make-row: func [table-state pos style' /local row] [
 ; do we need to add rows?
 if pos/y > length? table-state/table [
  insert/dup tail table-state/table none pos/y - length? table-state/table
 ]
 ; create it if it does not exist
 either row: pick table-state/table pos/y [
  row/style: merge-style row/style style'
 ] [
  ; (poke does not return ROW in older REBOLs)
  poke table-state/table pos/y row: context [
   ; row contents (cells)
   contents: make block! 16
   style: style'
   ; first row is header unless vertical style has been specified, or table is headerless
   ;header: if not all [table-state/style any [table-state/style/vertical table-state/style/headerless]] [pos/y = 1]
  ]
 ]
 row
]
make-col: func [table-state pos style /local col] [
 ; do we need to add rows?
 if pos/x > length? table-state/columns [
  insert/dup tail table-state/columns none pos/x - length? table-state/columns
 ]
 ; create it if it does not exist
 either col: pick table-state/columns pos/x [
  merge-style col style
 ] [
  poke table-state/columns pos/x style
  ; (poke does not return STYLE in older REBOLs)
  style
 ]
]
add-row: func [table-state args /local pos] [
 ; save current direction (for =row.)
 table-state/savedir: table-state/dir
 ; add new row (=row command)
 ; see if position has been specified
 if object? args [
  pos: args/position
  if pair? pos [pos: pos/y]
 ]
 ; if not, assign default position
 either pos [
  ; save current position (for =row.)
  table-state/savepos: table-state/curpos
  table-state/curpos: 0x1 * pos
 ] [
  either table-state/curpos [
   ; position must be saved so that =row. does not make the next
   ; cell overwrite the row
   table-state/savepos: table-state/curpos + 0x1
   table-state/curpos: table-state/curpos * 0x1 + 0x1
  ] [
   table-state/savepos: table-state/curpos: 0x1
  ]
 ]
 ; create row object
 args: merge-style/copy default-row args
 make-row table-state table-state/curpos args
 ; =row command also means switching to horizontal direction for new cells
 table-state/dir: 1x0
]
add-col: func [table-state args /local pos] [
 ; save current direction (for =column.)
 table-state/savedir: table-state/dir
 ; add new column (=column command)
 ; see if position has been specified
 if object? args [
  pos: args/position
  if pair? pos [pos: pos/x]
 ]
 ; if not, assign default position
 either pos [
  ; save current position (for =column.)
  table-state/savepos: table-state/curpos
  table-state/curpos: 1x0 * pos
 ] [
  either table-state/curpos [
   ; position must be saved so that =column. does not make the next
   ; cell overwrite the column
   table-state/savepos: table-state/curpos + 1x0
   table-state/curpos: table-state/curpos * 1x0 + 1x0
  ] [
   table-state/savepos: table-state/curpos: 1x0
  ]
 ]
 ; create col object
 args: merge-style/copy default-column args
 make-col table-state table-state/curpos args
 ; =column command also means switching to vertical direction for new cells
 table-state/dir: 0x1
]
table-go-back: func [table-state] [
 ; =row. or =column.
 if all [table-state/savepos table-state/savedir] [
  table-state/curpos: table-state/savepos
  table-state/dir: table-state/savedir
  table-state/currow: pick table-state/table table-state/curpos/y
  table-state/curcell: pick table-state/currow/contents table-state/curpos/x
  table-state/savepos: table-state/savedir: none
 ]
]
; make room for a new cell in the table and create a cell object
make-cell: func [table-state pos style' contents /span spanrc /local row cell] [
 row: make-row table-state pos none
 ; do we need to add cells?
 if pos/x > length? row/contents [
  insert/dup tail row/contents none pos/x - length? row/contents
 ]
 table-state/curpos: pos
 table-state/currow: row
 ; set this as the current cell (creates it if it does not exist)
 either cell: pick row/contents pos/x [
  if cell/type = 'span [
   ; span inside a span breaks the outer span
   either span [
    ; is the outer span already destroyed?
    either cell/reference/position [
     ; no, break it
     break-span table-state cell/reference pos pos + spanrc
     ; try again
     return make-cell/span table-state pos style' contents spanrc
    ] [
     ; yes, just overwrite it
     poke row/contents pos/x none
     ; try again
     return make-cell/span table-state pos style' contents spanrc
    ]
   ] [
    ; was this a destroyed cell?
    either cell/reference/position [
     ; no
     cell: cell/reference
    ] [
     ; yes, overwrite it
     poke row/contents pos/x none
     ; try again
     return make-cell table-state pos style' contents
    ]
   ]
  ]
  cell/style: merge-style cell/style style'
  if span [
   ; redefine the span?
   if cell/spansize [
    ; if we're reducing the span size we need to break the old span up
    if any [cell/spansize/y > spanrc/y cell/spansize/x > spanrc/x] [
     break-span table-state cell pos pos + spanrc
     ; try again
     return make-cell/span table-state pos style' contents spanrc
    ]
   ]
   cell/spansize: spanrc
  ]
  cell/out: contents
  table-state/curcell: cell
 ] [
  poke row/contents pos/x table-state/curcell: context [
   ; cell type - real cell (cells hidden by a spanning cell have type 'span)
   type: 'cell
   ; cell position in the table
   position: pos
   ; cell contents
   out: contents
   style: make-cell-style table-state style' row pick table-state/columns pos/x
   ; spanning cell?
   spansize: if span [spanrc]
   ; cell is header either if it is in the first row, or if vertical mode has been specified
   ; and it is in the first column; unless table is headerless
   ;header: either table-state/style [
   ; if not table-state/style/headerless [
   ; either table-state/style/vertical [pos/x = 1] [pos/y = 1]
   ; ]
   ;] [
   ; pos/y = 1
   ;]
  ]
  table-state/curcell
 ]
]
; break up a span (because a new span overlaps it)
break-span: func [table-state spancell breakstart breakend /local cellstart cellend] [
 ; we need to "subtract" the rectangle from breakstart to breakend from spancell
 ; we break spancell horizontally or vertically depending on the current table mode
 cellstart: spancell/position
 cellend: spancell/position + spancell/spansize
 ; spancell must be destroyed
 ; this flags it as destroyed when referencing from another cell
 spancell/spansize: none
 spancell/position: none
 ; remove it from the table
 poke get in pick table-state/table cellstart/y 'contents cellstart/x none
 ; remove position from original style (as it must not propagate to new cells)
 if spancell/style [spancell/style/position: none]
 either table-state/dir = 1x0 [
  ; horizontal mode
  if breakstart/y > cellstart/y [
   ; we have a row at the top
   make-span table-state context [
    start: cellstart
    end: cellend
    end/y: breakstart/y - 1
   ]
   cellstart/y: breakstart/y
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakend/y < cellend/y [
   ; we have a row at the bottom
   make-span table-state context [
    start: cellstart
    start/y: breakend/y + 1
    end: cellend
   ]
   cellend/y: breakend/y
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakstart/x > cellstart/x [
   ; we have a column at the left
   make-span table-state context [
    start: cellstart
    end: cellend
    end/x: breakstart/x - 1
   ]
   cellstart/x: breakstart/x
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakend/x < cellend/x [
   ; we have a column at the right
   make-span table-state context [
    start: cellstart
    start/x: breakend/x + 1
    end: cellend
   ]
   cellend/x: breakend/x
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
 ] [
  ; vertical mode
  if breakstart/x > cellstart/x [
   ; we have a column at the left
   make-span table-state context [
    start: cellstart
    end: cellend
    end/x: breakstart/x - 1
   ]
   cellstart/x: breakstart/x
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakend/x < cellend/x [
   ; we have a column at the right
   make-span table-state context [
    start: cellstart
    start/x: breakend/x + 1
    end: cellend
   ]
   cellend/x: breakend/x
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakstart/y > cellstart/y [
   ; we have a row at the top
   make-span table-state context [
    start: cellstart
    end: cellend
    end/y: breakstart/y - 1
   ]
   cellstart/y: breakstart/y
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakend/y < cellend/y [
   ; we have a row at the bottom
   make-span table-state context [
    start: cellstart
    start/y: breakend/y + 1
    end: cellend
   ]
   cellend/y: breakend/y
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
 ]
]
; make a cell reference (cell hidden by a spanned cell)
make-reference: func [table-state pos cell /local row old save] [
 ; make room if needed
 row: make-row table-state pos none
 ; do we need to add cells?
 if pos/x > length? row/contents [
  insert/dup tail row/contents none pos/x - length? row/contents
 ]
 ; are we overlapping another span?
 if all [old: pick row/contents pos/x old/type = 'span old/reference/position] [
  ; need to save curcell, curpos, currow and out as break-span may change them
  save: reduce bind [curcell curpos currow] in table-state 'self
  break-span table-state old/reference cell/position cell/position + cell/spansize
  set bind [curcell curpos currow] in table-state 'self save
 ]
 ; overwrite any cell
 poke row/contents pos/x local: context [
  ; cell type - span
  type: 'span
  reference: cell
 ]
 local
]
; add a new cell (=cell command, or new paragraph)
add-cell: func [table-state args contents /local pos] [
 ; see if position has been specified
 if object? args [pos: args/position]
 ; if not, assign default position (curpos + dir)
 if not pos [
  pos: any [table-state/curpos 1x0]
  pos: pos + table-state/dir
 ]
 ; create cell or pick existing one at given position
 args: merge-style/copy default-cell args
 make-cell table-state pos args contents
 ; update table size
 table-state/size: max table-state/size pos
]
; set the style for the current cell (=span=cell case)
set-cell: func [table-state args contents] [
 either all [args args/position] [
  ; if position has been specified, this is a new cell
  add-cell table-state args contents
 ] [
  table-state/curcell/style: merge-style table-state/curcell/style args
  table-state/curcell/out: contents
 ]
]
; create a cell span
make-span: func [table-state args /local cell pos] [
 ; arguments must be provided, and valid, otherwise command is ignored
 if not all [
  object? args pair? args/start pair? args/end
  set bind [start end] in args 'self reduce [min args/start args/end max args/start args/end]
 ] [exit]
 ; not really a span... make a normal cell
 if args/start = args/end [
  make-cell table-state args/start none [ ]
  exit
 ]
 ; the idea here is that we create a cell with spansize
 ; then we put a reference to it in all the cells it covers
 if cell: make-cell/span table-state args/start none [ ] 1x1 + args/end - args/start [
  pos: args/start + 1x0
  while [pos/y <= args/end/y] [
   while [pos/x <= args/end/x] [
    make-reference table-state pos cell
    pos: pos + 1x0
   ]
   pos/x: args/start/x
   pos: pos + 0x1
  ]
  ; update table size
  table-state/size: max table-state/size args/end
 ]
]

11.7 Stage 3 rewrite rules (optimization, handling of =data and =repeat)

Stage 3 rewrite rules (optimization, handling of =data and =repeat)

['table-proto ['opts set x skip | (x: none)] y: to end] [(stage3-ctx/generate-table x y)]
[into ['hidden-table]] [ ]
[into ['repeat 'opts set x block! into ['enum y: to end]]] [[enum [repeat opts [(x)] (y)]]]
[into ['repeat 'opts set x block! into ['bullets y: to end]]] [[bullets [repeat opts [(x)] (y)]]]
[into ['repeat 'opts set x block! y: to end]] [(stage3-ctx/eval-repeat x y)]
[into ['data none!]] [ ]
[into ['data set x block!]] [(stage3-ctx/eval-data x)]
[into ['qlink none!]] [ ]
['qlink set x string!] [(stage3-ctx/eval-qlink x)]
['link-proto 'opts set x url!] [link opts [target: (x)]]
['link-proto 'opts set x string!] [(stage3-ctx/eval-link x)]
['anchor 'opts set x string!] [anchor opts [name: (x)]]
; unwind cells outside of table (can be created by =data)
[y: into ['cell-if opt ['opts skip] opt ['span skip] opt ['header] x: to end]]
 [(either 'row = first head y [y/1/1: 'cell copy/part y 1] [x])]
; unwind block level inside paras (can be created by =data)
[into [x: 'para 'opts set z block! any [y: into [block-level to end] :y break | skip] into [block-level to end] to end]]
 [[(copy/part x y)] (copy/part y 1) [para opts [(z)] (next y)]]
[into [x: 'para any [y: into [block-level to end] :y break | skip] into [block-level to end] to end]]
 [[(copy/part x y)] (copy/part y 1) [para (next y)]]
[y: ['para | 'item] opt ['opts skip] any [z: into ['para to end] :z break | skip] into ['para to end] to end] [(
 rewrite copy y [
  [into ['para 'opts set x block! y: to end]] [[font opts [(x)] (y)]]
  [into ['para opt ['opts skip] y: to end]] [(y)]
 ]
)]
; box with only title
['box ['opts set x block! | (x: [ ])] into ['title y: to end] end] [box opts [(x)] [para (y)]]
; optimizations
[into ['bold]] [ ]
[into ['italic]] [ ]
[into ['strike]] [ ]
[into ['font opt ['opts skip]]] [ ]
; font with no opts
[into ['font x: [block! | string!] to end]] [(x)]
['font 'opts set x block! into ['bold y: to end] end] [font opts [(x) bold: (true)] (y)]
['font 'opts set x block! into ['italic y: to end] end] [font opts [(x) italic: (true)] (y)]
['bold into ['font 'opts set x block! y: to end] end] [font opts [(x) bold: (true)] (y)]
['italic into ['font 'opts set x block! y: to end] end] [font opts [(x) italic: (true)] (y)]
['link 'opts set x block! into ['font 'opts set y block! z: to end] end] [link opts [(x) (y)] (z)]
['alink 'opts set x block! into ['font 'opts set y block! z: to end] end] [alink opts [(x) (y)] (z)]
['anchor 'opts set x block! into ['font 'opts set y block! z: to end] end] [anchor opts [(x) (y)] (z)]
['para ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [para opts [(x) (y)] (z)]
['header1 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header1 opts [(x) (y)] (z)]
['header1* ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header1* opts [(x) (y)] (z)]
['header2 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header2 opts [(x) (y)] (z)]
['header2* ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header2* opts [(x) (y)] (z)]
['header3 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header3 opts [(x) (y)] (z)]
['header3* ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header3* opts [(x) (y)] (z)]
['header4 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header4 opts [(x) (y)] (z)]
['header5 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header5 opts [(x) (y)] (z)]
['header6 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header6 opts [(x) (y)] (z)]
['item ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [item opts [(x) (y)] (z)]
['check ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [check opts [(x) (y)] (z)]
['term ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [term opts [(x) (y)] (z)]
['desc ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [desc opts [(x) (y)] (z)]
['title ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [title opts [(x) (y)] (z)]

11.7.1 Words used by the rewrite rules

Words used by the rewrite rules

block-level: [
 'hrule | 'header1 | 'header2 | 'header3 | 'header4 | 'header5 |
 'header6 | 'bullets | 'enum | 'checks | 'definitions | 'box |
 'table | 'center | 'left | 'right | 'justify | 'escape | 'header1* |
 'header2* | 'header3*
]

11.8 Other functions

Stage 3 FSM +≡

eval-qlink: func [target /local a] [
 either a: select anchors target [
  compose/deep [alink opts [target: (target)] (skip a 3)]
 ] [
  compose/deep [link opts [(a: process-link target)] (select a [text:])]
 ]
]
eval-link: func [target] [
 either find anchors target [
  compose/deep [alink opts [target: (target)]]
 ] [
  compose/deep [link opts [(process-link target)]]
 ]
]
eval-data: func [opts /local val p] [
 opts: construct/with opts context [name: none index: none]
 val: eval-var opts/name
 if not val [return [ ]]
 either object? val [
  p: in pickers val/type
  if p [
   do get in get p type?/word opts/index val opts/index
  ]
 ] [
  reduce [val]
 ]
]
pickers: context [
 csv: context [
  none!: func [val index] [
   "Not yet."
  ]
  integer!: func [val index] [
   csv-row/none! context [content: pick val/contents index] none
  ]
  pair!: func [val index] [
   pick pick val/contents index/y index/x
  ]
 ]
 table: context [
  none!: func [val index] [
   "Not yet."
  ]
  integer!: func [val index] [
   "Not yet."
  ]
  pair!: func [val index] [
   val: pick get in pick val/table index/y 'contents index/x
   if val/type = 'span [val: val/reference]
   cell/none! val none
  ]
 ]
 table-proto: context [
  none!: integer!: pair!: func [val index] [
   rewrite val/contents rewrite-rules
   eval-data compose [name: (val/name) index: (index)]
  ]
 ]
 alias: context [
  none!: integer!: pair!: func [val index] [
   eval-data compose [name: (val/dest) index: (index)]
  ]
 ]
 cell: context [
  none!: integer!: pair!: func [val index] [
   compose/deep [[
    cell-if (either val/style ['opts] [[ ]])
     (either val/style [reduce [make-style/ignore val/style [position]]] [[ ]])
     (either val/spansize ['span] [[ ]]) (any [val/spansize [ ]])
     (val/out)
   ]]
  ]
 ]
 csv-row: context [
  none!: func [val index] [
   either empty? val/content [
    [ ]
   ] [
    index: make block! 3 * length? val/content
    insert index first val/content
    foreach cell next val/content [
     insert insert tail index " " cell
    ]
    index
   ]
  ]
  integer!: func [val index] [
   pick val/content index
  ]
  pair!: func [val index] [
   pick val/content index/x
  ]
 ]
 table-row: none
]
eval-var: func [var] [
 if var: any [select/skip last local-vars var 2 select/skip vars var 2] [first var]
]
local-vars: [[ ]]
eval-repeat: func [spec body /local result var1 var2 val val2 val3 iter] [
 result: make block! 16
 parse spec [
  integer! end (loop spec/1 [insert tail result copy/deep body]) ; no need to recurse in this case
  |
  set var1 [word! | string! | into [some [word! | string!]]] opt 'in set var2 [word! | string!] end (
   if val: eval-var form var2 [
    insert/only tail local-vars local: copy last local-vars
    local: tail local
    if object? val [
     iter: in iterators val/type
     if iter [iter: get iter iter/iterate val local var1 result body]
    ]
    remove back tail local-vars
   ]
  )
  |
  set var1 [word! | string!]
  opt 'from set val skip
  opt 'to set val2 skip
  [['by | 'skip | 'step | none] set val3 skip | (val3: none)] end (
   attempt [
    insert/only tail local-vars local: copy last local-vars
    local: tail local
    for i val val2 any [val3 either val > val2 [-1] [1]] [
     clear local
     insert insert tail local form var1 form i
     insert tail result rewrite copy/deep body rewrite-rules ; recurse
    ]
    remove back tail local-vars
   ]
  )
 ]
 result
]
iterators: context [
 csv: context [
  iterate: func [val locals var result body /local bind-var] [
   bind-var: get in binders either block? var ['multi] ['single]
   foreach row val/contents [
    clear locals
    bind-var locals var row
    insert tail result rewrite copy/deep body rewrite-rules ; recurse
   ]
  ]
  binders: context [
   single: func [local var row] [
    insert insert tail local form var context [type: 'csv-row content: row]
   ]
   multi: func [local vars row] [
    foreach var vars [
     insert insert tail local form var row/1
     row: next row
    ]
   ]
  ]
 ]
 table: context [
  iterate: func [table locals var result body /local bind-var rows] [
   bind-var: get in binders either block? var ['multi] ['single]
   rows: table/table
   if not all [table/style any [table/style/headerless table/style/vertical]] [rows: next rows]
   foreach row rows [
    clear locals
    bind-var locals var row
    insert tail result rewrite copy/deep body rewrite-rules ; recurse
   ]
  ]
  binders: context [
   single: func [local var row] [
    insert insert tail local form var make row [type: 'table-row]
   ]
   multi: func [locals vars row /local i cell] [
    i: 1
    foreach var vars [
     cell: pick row/contents i
     if cell/type = 'span [cell: cell/reference]
     insert insert tail locals form var cell
     i: i + 1
    ]
   ]
  ]
 ]
 table-proto: context [
  iterate: func [tablep local var result body] [
   rewrite tablep/contents rewrite-rules
   table/iterate eval-var tablep/name local var result body
  ]
 ]
 alias: context [
  iterate: func [alias locals var result body /local iter] [
   alias: eval-var alias/dest
   iter: in iterators alias/type
   if iter [
    iter/iterate alias locals var result body
   ]
  ]
 ]
]