Qtask Markup Language - parser and other common code

Date	Version	Description	Author
16-Feb-2006	1.1.0	History start
16-Feb-2006	1.2.0	Fixed a bug with escape command parsing
20-Feb-2006	1.3.0	Moved options parsing from emitters to here
13-Mar-2006	1.4.0	Options parsing now collects values without a name; =, now treated as end command
13-Mar-2006	1.5.0	Added second pass to balance commands
17-Mar-2006	1.6.0	PARSE rules no more eat newlines after commands
17-Mar-2006	1.7.0	Removed header option for boxes
17-Mar-2006	1.8.0	=- now eats one newline
18-Mar-2006	1.9.0	Added args for =row and =column
18-Mar-2006	1.10.0	Changed box options
18-Mar-2006	1.11.0	=>>> etc.; fixed space only lines
21-Mar-2006	1.12.0	Spaces at beginning of lines are now ignored
23-Mar-2006	1.13.0	Changed handling of =cell, =row and =column in second pass
24-Mar-2006	1.14.0	Spaces no more required after ]
27-Mar-2006	1.15.0	New command options handling
27-Mar-2006	1.16.0	Added comment handling to second pass
29-Mar-2006	1.17.0	Minor changes to options handling; supports dashed: color etc. too
29-Mar-2006	1.18.0	Split =c and =center into two separate commands
29-Mar-2006	1.19.0	=word and =def as aliases for =: and =::
29-Mar-2006	1.20.0	Added new comma-pair! option type
29-Mar-2006	1.21.0	Second pass now eats newlines that should be ignored, and handles inline cmds in block mode differently
30-Mar-2006	1.22.1	Fixed table balancing
30-Mar-2006	1.23.1	Added boxcenter etc to =table
30-Mar-2006	1.24.1	Added #FFF etc. as color
30-Mar-2006	1.25.1	Changed escape commands (=example[end])
30-Mar-2006	1.26.1	Added =left, =right, =l, =r
30-Mar-2006	1.27.1	Added =span
1-Apr-2006	1.28.1	Added =justify, =j
5-Apr-2006	1.29.1	Added =table[space]
5-Apr-2006	1.30.1	Added shadow and rounded to =table
6-Apr-2006	1.31.1	Added all flag for table, row and column
6-Apr-2006	1.32.1	= and =, now close =left etc. too
18-Apr-2006	1.33.1	Fixed balancing for =toc
18-Apr-2006	1.34.1	Added =o and =x
21-Apr-2006	1.35.1	Added =s and =u
21-Apr-2006	1.36.1	Now collects font style for =toc
21-Apr-2006	1.37.1	Now collects font style for headers numbers
22-Apr-2006	1.38.1	Moved merge-style here (from xhtml emitter) and added /copy (fixes bug)
22-Apr-2006	1.39.1	Added =font[space]
22-Apr-2006	1.40.1	Added new color keywords, improved compatibility with older REBOLs
24-Apr-2006	1.41.1	Added =row. and =column.
24-Apr-2006	1.42.1	New =image (now inline)
24-Apr-2006	1.43.1	Added =4, =5 and =6
26-Apr-2006	1.44.1	Added a few missing colors
27-Apr-2006	1.45.1	The / char is now an alias for . (ending commands)
27-Apr-2006	1.46.1	Added initial support for =anchor, changed =link
28-Apr-2006	1.47.1	Finished =anchor support
12-May-2006	2.1.0	Started rewriting as RLP with new architecture
15-May-2006	2.2.0	Added table support
15-May-2006	2.3.0	Added =csv
17-May-2006	2.4.0	Added =data
17-May-2006	2.5.0	Added rewriting engine
17-May-2006	2.6.0	Added parsing for =repeat
17-May-2006	2.7.0	Added balancing for =repeat
17-May-2006	2.8.0	Basic =repeat support
18-May-2006	2.9.0	New table handling, supporting =repeat in =table
18-May-2006	2.10.0	Initial toc support
19-May-2006	2.11.0	Added numbering (finished toc), anchors
19-May-2006	2.12.0	Fixed links and qlinks
19-May-2006	2.13.0	Fixed =#
19-May-2006	2.14.0	=data[name] when name is a =table
20-May-2006	2.15.0	Added search
23-May-2006	2.16.0	Improved =repeat and =data, testing
24-May-2006	2.17.0	Rewritten numbering/counting to be =repeat friendly
24-May-2006	2.18.0	Added optimizations
25-May-2006	2.19.0	Changed =table[space]
25-May-2006	2.20.1	Added support for default options
26-May-2006	2.21.1	Made words optional in the repeat dialect
26-May-2006	2.22.1	Auto naming for tables and csv
26-May-2006	2.23.1	Added =table[headerless]
26-May-2006	2.24.1	=repeat on table now skips header row
29-May-2006	2.25.1	Fixed box with only a title
2-Jun-2006	2.26.1	Added defaults for =repeat and =data
7-Jun-2006	2.27.1	Fixed problem with =data, =image and =anchor as =box title
14-Jun-2006	2.28.1	Now removes anchors from TOC
14-Jun-2006	2.29.1	Search now lists anchor exact matches too
14-Jun-2006	2.30.1	Added =table[horizontal vertical]
16-Jun-2006	2.31.1	Added =image[space] and changed =image[image: ...] to =image[src: ...]
16-Jun-2006	2.32.1	Added /keep refinement to scan-doc
16-Jun-2006	2.33.1	Added process-link function (to be overridden by users)
29-Jun-2006	2.34.1	Fixed =span
21-Jul-2006	2.35.1	=l meant both one-line =left and abbreviation for =link; fixed (now =link can be abbreviated as =li)
21-Jul-2006	2.36.1	Fixed problem with =table[borderless]
21-Aug-2006	2.37.1	Release 2.0i: first public release

4. The parser (stage 1)

The first stage of the pipeline is the parser. The parse-qml function just uses parse on the text string.

〈The parser (stage 1)〉 ≡

〈The parse rules〉
〈The set-magic function〉

parse-qml: func [text [string!] magic [string! none!]] [
; initialize magic char
set-magic any [magic "="]
parse/all text qml-rule
]

4.1 The parse rules

The parse-qml function uses qml-rule to parse the text; everything that is not a command, is considered text and is handed over to the second stage (with the stage2 function, see 〈Balancing (stage 2): the stage2 function〉).

The txt-chars charset is initialized by the set-magic function (see 〈The set-magic function〉).

〈The parse rules〉 ≡

qml-rule: [
some [commands | text]
]

commands: [
〈parse rule for commands〉
]

txt: none
txt-chars: none
text: [copy txt some txt-chars (stage2 [text:] txt)]

4.2 parse rule for commands

The heart of the parser is the commands rule.

The newline character is treated like a command (produces the "^/" command); spaces at the end and at the beginning of lines are ignored. (As already noted, the stage2 function is used to hand the commands over to the second stage of the pipeline; see 〈Balancing (stage 2): the stage2 function〉.)

Other commands are introduced by magic-char (set by the set-magic function, see 〈The set-magic function〉); there are some special commands that require their own parse rules, but most commands are parsed by the rule shown here. This parses normal opening commands (like "=cmd"), opening commands with options ("=cmd[...]" or "=cmd{...}"), and ending commands ("=cmd." or "=cmd/"); commands with options eat at most one space following them (i.e., "=cmd[...]text" is exactly the same as "=cmd[...] text"; if you need a space after a command with options, use two spaces); ending commands do not eat any space (so "=cmd.text" is not the same as "=cmd. text"); commands without options require at least a space following them and will eat all spaces following them. Note that the mk: magic-char :mk construct allows putting commands one after another without spaces between them (e.g., "=b=i"). Note that this rule also parses the dot command "=.".

As you can see each command is sent to the second stage as a pair of values: the command as a string, and the command's options (which can be none). As you have already seen in 〈The parse rules〉, this is also true for text, which is sent as the text: (a set-word!) command with the text as options (a set-word! is used so that there is no clash with other commands, e.g. if we ever need to add a =text command and so on).

〈parse rule for commands〉 ≡

any spc newline any spc (stage2 "^/" none)
|
magic-char [
〈Special commands〉
|
; ignore a = at the end of the text
end
|
copy cmd any cmd-chars [
  "[" copy options to "]" skip opt spc (stage2 cmd options)
  |
  "{" copy options to "}" skip opt spc (stage2 cmd options)
  |
  ["." | "/"] (stage2 join any [cmd ""] "." none)
  |
  [some spc | mk: [newline | magic-char] :mk | end] (stage2 cmd none)
]
|
; malformed command?
(stage2 [text:] magic-char)
]

4.2.1 Special commands

Some command require custom parse rules. I've tried to keep this to a minimum, but Reichart loves special cases. ;-)

magic-char followed by magic-char allows inserting magic-char in the text;
magic-char followed by a space or a newline is the space command (i.e., " ");
"alias" is special for two reasons: first, its option is not enclosed in braces, but follows the =alias command (like "=alias ~"); second, it does not produce a command, but calls the set-magic function, which changes the value of magic-char and other things; (note, that since we are using some cmd-chars to parse the new magic-char, you cannot write something like "=alias =", which is probably ok, but maybe should be fixed);
"csv" is similar to an escape command (see below), in that text between "=csv" and "=csv." is not processed as QML text, but as CSV text; however, "=csv." must happen at the beginning of a line (it also eats spaces and one newline). The CSV data is sent as the command's options;
escape commands are used to include non-QML text in a QML document; you can specify an ending sequence for the escaped text using braces, otherwise the respective ending command is used as the ending sequence (e.g., "=html ... =html." or "=html[END] ... END"); note that one newline is eaten;
the horizontal ruler command "-" allows any number of dashes to be used (i.e., "=-" is the same as "=---" and so on) and eats one newline;
the indent command ">" takes as option the amount of indentation, but this is specified by the number of ">" characters following magic-char (e.g., "=>>>" becomes ">" 3);
the empty command ("=[...]" or "={...}" which become ["" "..."]) is special because it does not eat any following space;
the comma command is special because it does not eat any following spaces (behaving like the space and dot commands);
the repeat command is special because its options are parsed as REBOL block; this is the only case where you can have a "]" inside options created with "[" or a "}" inside options created with "{"; the rebol-options rule is used to parse its options.

〈Special commands〉 ≡

magic-char (stage2 [text:] magic-char)
|
[" " | mk: newline :mk] (stage2 " " none)
|
"alias" some spc copy cmd some cmd-chars any spc (set-magic cmd)
|
"csv" [
"[" copy options to "]" skip any spc opt newline (options: refinements/parse-arg-string "csv" any [options ""])
|
"{" copy options to "}" skip any spc opt newline (options: refinements/parse-arg-string "csv" any [options ""])
|
any spc opt newline (options: context [name: show: none])
] (csv: make block! 256) some [
[magic-char "csv" ["." | "/"] any spc opt newline | end] (stage2 "csv" make options [contents: csv]) break
|
[copy txt to newline newline | copy txt to end] (append/only csv parse/all txt ",")
]
|
copy cmd escape-cmd [
"[" copy options to "]" skip any spc opt newline
|
"{" copy options to "}" skip any spc opt newline
|
any spc opt newline (options: rejoin [magic-char cmd "."])
] [copy txt to options options any spc opt newline | copy txt to end] (stage2 cmd txt)
|
some "-" [some spc opt newline | newline | mk: magic-char :mk | end] (stage2 "-" none)
|
copy cmd some ">" [some spc | mk: [newline | magic-char] :mk | end] (stage2 ">" length? cmd)
|
"[" copy options to "]" skip (stage2 "" options)
|
"{" copy options to "}" skip (stage2 "" options)
|
"," (stage2 "," none)
|
"repeat" (options: make block! 16) [
(opt-open-char: "[" opt-close-char: "]") rebol-options (stage2 "repeat" options)
|
(opt-open-char: "{" opt-close-char: "}") rebol-options (stage2 "repeat" options)
|
["." | "/"] (stage2 "repeat." none)
]

4.2.2 Words used by the commands rule

The commands parse rule uses a number of words that we need to make local. It also uses the rebol-options subrule which we define here.

cmd-chars and magic-char are initialized by the set-magic function (see 〈The set-magic function〉); escape-cmd is the list of escape commands.

rebol-options uses the load-next function to parse the options as REBOL values; in turn it uses load/next to parse the text and tries to recover in case of errors.

〈The parse rules〉 +≡

mk: cmd: options: csv: none
spc: charset " ^-"
spc+: charset " ^-^/"
cmd-chars: none
magic-char: none

escape-cmd: ["HTML" | "REBOL" | "MakeDoc" | "Example"]

opt-open-char: "[" opt-close-char: "]"
rebol-options: [
opt-open-char
txt: (txt: load-next options txt) :txt
some [any spc+ opt-close-char break | end break | txt: (txt: load-next options txt) :txt]
opt spc
]
load-next: func [out text /local val] [
if error? try [
  set [val text] load/next text
  insert/only tail out val
] [
  insert tail out copy/part text text: any [find text opt-close-char tail text]
]
text
]

4.3 The set-magic function

This function initializes the values of magic-char, cmd-chars and txt-chars. cmd-chars is a charset containing the characters that are valid in command names; txt-chars contains the characters that are valid for text outside commands.

〈The set-magic function〉 ≡

set-magic: func [magic [string!]] [
; magic cannot be empty
if empty? magic [magic: "="]
magic-char: magic
cmd-chars: complement charset join " ^-^/[]{}./" first magic-char
txt-chars: complement charset join "^/" first magic-char
]

5. Balancing (stage 2): the stage2 function

The second stage of the pipeline is meant to simplify the third stage by "normalizing" the command stream. It takes care of ensuring command balancing, and handles the space, dot and comma commands ("= ", "=." and "=,"). Parsing of command options is also done in the stage2 function (just because this is a convenient place to do it, even though conceptually it belongs to the first stage).

This stage uses a Finite State Machine, handling each command as an event. Please see the documentation of the FSM interpreter for more informations (linked in from 〈Overview〉).

As seen in the first stage, the stage2 function is invoked with a command and its options. A command can be a string! or a set-word!; for simplicity set-words can be passed in a block (as in stage2 [text:] txt), so if cmd is a block we just extract the first value. The options are parsed with the parse-command-options function; the cmd and opts words are then set in the stage2-ctx object, so that they are available to the FSM rules, and cmd is issued as an event for the state machine. (stage2 needs to do another thing after this, we'll discuss this later on.)

〈Balancing (stage 2): the stage2 function〉 ≡

stage2-fsm: make fsm! [ ]

stage2: func [cmd opts] [
if block? cmd [cmd: first cmd]
stage2-ctx/opts: parse-command-options stage2-ctx/cmd: cmd :opts
stage2-fsm/event cmd
〈Additional code for the stage2 function〉
]

stage2-ctx: context [
cmd: opts: none
〈Balancing (stage 2): the Finite State Machine〉
]

〈Initialization and termination of the FSM〉

〈The merge-style function〉

6. Command options parsing

Command options (also called "refinements" in the code) are parsed by the parse-command-options function, which is called by the stage2 function (although, conceptually options parsing belongs to the first stage; the stage2 function parses the options before actually processing the command in the FSM).

Only some commands need their options to be parsed; if cmd is one of them, and options is a string, then the parse-arg-string function is used. Otherwise options is returned as-is.

〈Command options parsing〉 ≡

parse-command-options: func [cmd options] [
either all [
  string? options
  find [
   "table" "row" "column" "cell" "box"
   "image" "font" "f" "span" "data"
  ] cmd
] [
  refinements/parse-arg-string cmd options
] [
  options
]
]

refinements: context [
〈Option values' types〉
〈Type map〉
〈Object map〉
〈parse-arg-string's support functions〉
parse-arg-string: func [cmd args /local 〈parse-arg-string's locals〉] [
  〈Parse the args string into an object〉
]
]

6.1 Option values' types

Options are parsed as a list of values; option values can be of a number of different types, each recognized by syntax (like in REBOL) and parsed by a specific parse rule. Values can be of one of the following types: flag!, set-word!, color!, string!, integer!, url!, percent!, pair! and comma-pair!; after being parsed, each value is represented by a REBOL value, and each type is mapped to a REBOL type: flag! is mapped to word!; set-word!, string!, integer!, url! and pair! are mapped to the respective REBOL types; color! is mapped to one of issue!, refinement! or tuple! (depending on how the color was specified, if as a hex string, a color name or a tuple); percent! is mapped to money!; comma-pair! is mapped to block!.

The types object holds the parse rules for the option types. The flag-word and set-word rules are set dynamically based on the command that is being parsed. Note that strings can be specified even without quotes, so the string! rule must be applied last (other types need to be tried first); also note that we only allow integer percent values, even tough we use money! to represent them; also, comma-pair! can contain integers or percents, so the block representing it can contain integer! or money! values.

〈Option values' types〉 ≡

types: context [
flag!: [flag-word [some spc | end]]
set-word!: [set-word any spc]
color!: [
  [
   color-keyword
   |
   tuple
   |
   [opt "#" copy value 6 hex-digits | "#" copy value 3 hex-digits] (value: to issue! value)
  ]
  [some spc | end]
]
string!: [
  [{"} copy value some dquotechars {"} | "'" copy value some quotechars "'" | copy value some chars]
  [some spc | end]
]
integer!: [copy value some digits (value: to system/words/integer! value) [some spc | end]]
url!: [
  copy value [some urlchars ":" 0 2 "/" some urlchars any ["/" some urlchars]]
  (value: to system/words/url! value)
  [some spc | end]
]
percent!: [copy value 1 3 digits "%" (value: to money! value) [some spc | end]]
pair!: [copy value [some digits "x" some digits] (value: to system/words/pair! value) [some spc | end]]
comma-pair!: [
  (value: make block! 4)
  copy val some digits ["%" (append value to money! val) | none (append value to integer! val)]
  ","
  copy val some digits ["%" (append value to money! val) | none (append value to integer! val)]
  [some spc | end]
]
]

6.1.1 Subrules and other words used by the parse rules

As already said above, flag-word and set-word are dynamically set to a command-specific rule. See 〈Rules for flag!〉 and 〈Rules for set-word!〉.

〈Option values' types〉 +≡

value: val: none
chars: complement spc: charset " ^-^/"
urlchars: complement charset {"':/ ^- }
dquotechars: complement charset {"}
quotechars: complement charset "'"
digits: charset "1234567890"
hex-digits: union digits charset "ABCDEFabcdef"
flag-word: none
set-word: none
〈Rules for color!〉
〈Rules for flag!〉
〈Rules for set-word!〉

6.1.2 Rule for parsing a generic value

The rule value-rule can be used to parse a generic (non flag or set-word) value; the order of the rules is significant. (Note that it's written this way for compatibility with older REBOLs.)

〈Option values' types〉 +≡

6.1.3 Rules for color!

〈Rules for color!〉 ≡

6.1.4 Rules for flag!

Each command has a list of available flags. Note that order is important (because of the way the bold and italic rules are defined, for example).

〈Rules for flag!〉 ≡

We have only specified the parse rules for a small set of flags; all the others are generated automatically: for example, [dashed] becomes ["dashed" (value: 'dashed)]. This is done by processing the flag-words block using parse.

〈Rules for flag!〉 +≡

rule: word: none
parse flag-words [
some [
  some string! set rule block! (
   while [not tail? rule] [
    either all [rule/1 <> '| not block? get/any word: rule/1] [
     rule: insert/only change rule
      form word to paren! compose [value: (to lit-word! word)]
    ] [rule: next rule]
   ]
  )
  |
  some string! rule: 'none (rule/1: [end skip])
]
]

6.1.5 Flag actions

By default, a flag sets the respective word to true in the object that represents the parsed options. Some flags, however, do something different; for example, the dashed flag sets the word outline-style to 'dashed. The flag-actions object contains the flags that do something different from the default.

〈Rules for flag!〉 +≡

flag-actions: context [
dashed: [outline-style: 'dashed]
dotted: [outline-style: 'dotted]
solid: [outline-style: 'solid]
outline: [outline-style: 'solid]
borderless: [outline-style: 'borderless]
rounded: [outline-style: 'rounded]
center: [text-halign: 'center]
left: [text-halign: 'left]
right: [text-halign: 'right]
justify: [text-halign: 'justify]
middle: [text-valign: 'middle]
top: [text-valign: 'top]
bottom: [text-valign: 'bottom]
imagecenter: [image-halign: 'center]
imageleft: [image-halign: 'left]
imageright: [image-halign: 'right]
imagemiddle: [image-valign: 'center]
imagetop: [image-valign: 'top]
imagebottom: [image-valign: 'bottom]
tile: [image-tiling: 'both]
tilev: [image-tiling: 'vertical]
tileh: [image-tiling: 'horizontal]
tileless: [image-tiling: 'neither]
times: [typeface: 'times]
helv: [typeface: 'helvetica]
courier: [typeface: 'courier]
boxcenter: [position: 'center]
boxright: [position: 'right]
boxleft: [position: 'left]
]

6.1.6 Rules for set-word!

Set-words are parsed in a way similar to flags. Each command has a list of set-words that are accepted.

〈Rules for set-word!〉 ≡

Like with flags, most set-word rules are generated automatically from the above block. This translates [outline] to ["outline:" (value: first [outline:])].

〈Rules for set-word!〉 +≡

parse set-words [
some [
  some string! set rule block! (
   while [not tail? rule] [
    either all [rule/1 <> '| not block? get/any word: rule/1] [
     rule: insert/only change rule
      append form word ":" to paren! compose/deep [value: first [(to set-word! word)]]
    ] [rule: next rule]
   ]
  )
  |
  some string! rule: 'none (rule/1: [end skip])
]
]

6.1.7 Set-word actions

By default a set-word sets the respective word to the value following it; some however do a special action.

〈Rules for set-word!〉 +≡

set-actions: context [
outline: solid: [outline-color: value outline-style: 'solid]
dashed: [outline-color: value outline-style: 'dashed]
dotted: [outline-color: value outline-style: 'dotted]
column: [position: as-pair value 1]
row: [position: as-pair 1 value]
]

6.1.8 Set-word expected types

Each set-word expects a value of a specified type (of one of a list of specified types) following it; if the following value cannot by parsed with the type rule specified, the assignment is ignored. (That is, you cannot set the color to an integer, or the width to a color.)

〈Rules for set-word!〉 +≡

var-types: context [
color: types/color!
typeface: types/string!
fontsize: types/integer!
space: types/integer!
background: types/color!
outline: dashed: dotted: solid: types/color!
width: height: bind [percent! | integer!] in types 'self
column: row: types/integer!
image: bind [url! | string!] in types 'self
name: types/string!
index: bind [pair! | integer!] in types 'self
]

6.2 Type map

When an option value (not a flag or set-word) is specified without any variable to set it to (i.e., no set-word precedes it), a "type map" is used to determine what variable should be set. Each command has its own type map; for example, the first color! value in the options for "table" is set as the background, the second as the text color, and so on.

〈Type map〉 ≡

type-map: [
"table" [
  color! [background color outline-color]
  string! [image typeface]
  integer! [width height fontsize]
  url! [image]
  percent! [width height]
  pair! [table-size]
  comma-pair! [(width: value/1 height: value/2)]
]
"row" "column" [
  color! [background color outline-color]
  string! [image typeface]
  integer! [position width height fontsize]
  url! [image]
  percent! [width height]
  pair! [none]
  comma-pair! [(width: value/1 height: value/2)]
]
"cell" [
  color! [background color outline-color]
  string! [image typeface]
  integer! [width height fontsize]
  url! [image]
  percent! [width height]
  pair! [position]
  comma-pair! [(width: value/1 height: value/2)]
]
"box" [
  color! [background color outline-color]
  string! [image typeface]
  integer! [width height fontsize]
  url! [image]
  percent! [width height]
  pair! [none]
  comma-pair! [(width: value/1 height: value/2)]
]
"image" [
  color! [outline-color background]
  string! [src]
  integer! [width height]
  url! [src]
  percent! [width height]
  pair! [none]
  comma-pair! [(width: value/1 height: value/2)]
]
"font" "f" [
  color! [color background]
  string! [typeface]
  integer! [fontsize space]
  url! [none]
  percent! [none]
  pair! [none]
  comma-pair! [none]
]
"span" [
  color! [none]
  string! [none]
  integer! [none]
  url! [none]
  percent! [none]
  pair! [start end]
  comma-pair! [none]
]
"csv" [
  color! [none]
  string! [name]
  integer! [none]
  url! [none]
  percent! [none]
  pair! [none]
  comma-pair! [none]
]
"data" [
  color! [none]
  string! [name]
  integer! [index]
  url! [none]
  percent! [none]
  pair! [index]
  comma-pair! [none]
]
]

Since values are mapped after being parsed, and so after being represented as REBOL types, the types specified above need to be replaced by the respective REBOL types that represent them; as usual we do this using parse on the type-map block.

〈Type map〉 +≡

parse type-map [
some [
  some string! set rule block! (
   foreach [from to] [
    color! [issue! refinement! tuple!]
    percent! money!
    comma-pair! block!
   ] [
    replace rule from to
   ]
  )
]
]

6.3 Object map

The object-map block defines the template options object for each command.

〈Object map〉 ≡

object-map: [
"table" [
  background: color: outline-color: image: typeface: width: height: fontsize:
  table-size: bold: italic: outline-style: vertical: text-halign: text-valign:
  image-halign: image-valign: float: image-tiling: position: force-space: shadow:
  all: hide: name: headerless: horizontal: none
]
"row" "column" "cell" [
  background: color: outline-color: image: typeface: width: height: fontsize:
  bold: italic: outline-style: text-halign: text-valign:
  image-halign: image-valign: image-tiling: position: all: none
]
"box" [
  background: color: outline-color: image: typeface: width: height: fontsize:
  bold: italic: outline-style: text-halign: text-valign:
  image-halign: image-valign: float: image-tiling: position: shadow: none
]
"image" [
  background: outline-color: src: width: height: outline-style:
  float: position: space: none ;shadow: none
]
"font" "f" [
  bold: italic: typeface: color: background: fontsize: space: none
]
"span" [
  start: end: none
]
"csv" [
  show: name: none
]
"data" [
  name: index: none
]
]

6.4 Parse the args string into an object

Now that we have all that we need, we can write the code for the parse-arg-string function. First of all, we do some initialization, by setting flag-word and set-word to the appropriate rules, and by finding the correct type map and object template. Then we use parse on the args string: we can have a flag, in which case we either do the action specified in flag-actions or set the respective word to true; or we can have one or more set-words followed by a value; or we can have a value by itself, to be handled as specified in the type map (using the set-value-from-type function).

Note that we allow more than one set-word to be set to one value, like in "width: height: 100" which sets both the width and the heigth to 100. The accepted type of the value is the accepted type for the last variable specified, which is a problem, because you can write "face: size: 10" and have the font face set to 10, which is incorrect. However, since such a case is a user error anyway, we're not handling it for now. In the future, we need to make sure that the value is acceptable for all the words it is being set to.

Also note that we need to do some magic to make string work correctly even in assence of quotes; for example, in the case "=font[Arial Black]", since there are no quotes, two string values would be parsed, i.e. "Arial" and "Black", while we want it to be parsed as one string value, "Arial Black": so when we parse a string value, we check if the previous value was a string value too, and in this case we join the two values and consider it one. This is what the last-str word is used for.

〈Parse the args string into an object〉 ≡

flag-word: select* flag-words cmd
set-word: select* set-words cmd
tmap: select* type-map cmd
obj: get-obj cmd
tflag!: types/flag!
tset-word!: types/set-word!
parse/all args [
any spc some [
  tflag! (
   last-str: none
   either in flag-actions value [
    do bind get in flag-actions value in obj 'self
   ] [
    set in obj value true
   ]
  )
  |
  (vars: clear [])
  some [tset-word! (append vars value: to word! :value)] (var-type: get in var-types value)
  var-type (
   last-str: either string? :value [value] [none]
   foreach var vars [
    either in set-actions var [
     do bind get in set-actions var in obj 'self
    ] [
     set in obj var value
    ]
   ]
  )
  |
  value-rule (
   either string? :value [
    either last-str [
     insert insert tail last-str " " value
    ] [
     last-str: value
     set-value-from-type tmap obj
    ]
   ] [
    last-str: none
    set-value-from-type tmap obj
   ]
  )
]
]
obj

6.4.1 parse-arg-string's locals

〈parse-arg-string's locals〉 ≡

obj tmap var-type last-str vars tset-word! tflag!

6.5 parse-arg-string's support functions

First, we need a version of select that can handle the maps we defined above, where more than one key map to the same (block!) value. This is easily done with parse.

The get-obj function is trivial, too; set-value-from-type uses the type map to decide which word in obj should be set to value (note that value is not passed as an argument; it's defined in the refinements object). It uses select* to get the list of words for the given type, then sets the first word in that list that is still set to none. If the block contains a paren!, it is evaluated instead. (See for example how this is used for comma-pair! in the type maps.)

〈parse-arg-string's support functions〉 ≡

select*: func [block value] [
parse block [to value to block! set block block! | (block: none)]
block
]

get-obj: func [cmd] [make object! select* object-map cmd]

set-value-from-type: func [tmap obj] [
foreach word select* tmap type?/word :value [
  if paren? :word [
   do bind to block! word in obj 'self
   break
  ]
  if all [word <> 'none none? get word: in obj word] [
   set word value
   break
  ]
]
]

7. Balancing (stage 2): the Finite State Machine

The stage 2 FSM has two states (plus three "support" states); this is because QML commands are of two types: "block" level commands (a block is something that can contain paragraphs of text), and "inline" level commands (such as text, formatting commands, and so on - i.e. everything that can be inside a paragraph).

What this FSM needs to do is ensure that all opening and closing commands are correctly balanced, and that there is no block level command inside inline level commands. Formatting commands (almost all the inline level commands are formatting commands) need to be handled somewhat specially, since they can span across paragraphs (that is, exiting from inline level does not end the formatting) and they can be closed out of order; this means that the FSM has to close and reopen commands as needed to transform the input stream into a stream that is balanced and nests block and inline commands correctly.

〈Balancing (stage 2): the Finite State Machine〉 ≡

〈Stage 2 functions〉
〈The stage 2 FSM states〉

7.1 The stage 2 FSM states

The initial state of the FSM is the in-block state, which handles block level commands. (Note, we also use the stage 2 to convert commands that have aliases to their canonical name, i.e. "font" becomes "f", and so on.)

After processing, commands are passed to the stage3 function (see 〈Generating QML document tree (stage 3)〉).

Both in block and inline level we need to strip off comments. (Maybe, we should not parse them at all.) There are two kind of comments: one-line comments and normal comments; one-line comments are started by the "='" (or its variations) command, and end at the first newline; normal comments are started by "=;" (or "=rem" etc.) and end with "=;." (or "=rem." and so on). We use the in-line-comment and in-comment states to handle them; see below.
By default, commands are sent to stage3 as they are.
"word" and "def" are translated to ":" and "::".
The escape commands "example", "html", "rebol" and "makedoc" and the special command "csv" need to ignore a newline that immediately follows them; the eat-one-newline state is used for this (see below).
Block commands are opened by the open-block function (see 〈Stage 2 functions〉); "table", "center", "left", "justify", "right" and "repeat" also need to eat an immediately following newline, while "box" doesn't need to. (Note, that "repeat" is special, because it is a neutral command, that can happen both at block and inline levels, so its interpretation depends on the current state. There are a number of unresolved issues because of this, see below.)
Block commands are closed by the close-block function, and they all eat one newline.
The "toc" command cannot be nested, so it closes any previously opened "toc" when opening a new one.
The three commands ".", "," and " " are special; they (respectively) close everything that is open, end all the formatting commands, and end the last formatting command. Since we are at the block level in this state, inline commands are never open at this point and don't need to be closed; however, since formatting commands remain valid across paragraphs, we keep track of them, and need to remove them from the list of inline commands that will be reopened when we switch to inline level.
The "", "link", "anchor", "a", "li", "image", "data" and text: commands make the FSM switch to the in-inline state, because they mean we are switching to inline level from block level. The reopen-inline function is used to reopen any formatting command that is still active.
Formatting commands found while in block level do not make the FSM switch to inline level; they just are activated and will be opened as soon as we switch to inline level. The add-inline function is used to activate them. They also eat one newline when used in block level.
Formatting commands are deactivated with the remove-inline function, and eat one newline.
The table commands "row", "column", "row.", "column." and "span" are special, because they close the current cell (but without crossing the boundary of the current table, see 〈Stage 2 functions〉); they also eat one newline. The "cell" command also does not nest, and closes any previously opened cell, without crossing the table boundary, and it eats one newline too.

〈The stage 2 FSM states〉 ≡

in-block: [
"'" "�" "`" {"} in-line-comment
";" "comment" "rem" in-comment

default: (stage3 cmd opts)

"word" (stage3 ":" opts)
"def" (stage3 "::" opts)

"example" "html" "rebol" "makedoc" "csv" (stage3 cmd opts) eat-one-newline

"table" "center" "left" "justify" "right" "repeat" (open-block cmd opts) eat-one-newline
"box" (open-block cmd opts)

"box." "table." "cell." "center." "left."
"justify." "right." "toc." "repeat." (close-block cmd) eat-one-newline

"toc" (close-block "toc." open-block "toc" opts)

"." (remove-all-inline close-all-block) eat-one-newline
" " (remove-last-inline) eat-one-newline
"," (remove-all-inline) eat-one-newline

"" "link" "anchor" "a" "li" "image" "data" text: (reopen-inline) continue in-inline

"b" "bold" (add-inline "b" opts) eat-one-newline
"u" "underline" (add-inline "u" opts) eat-one-newline
"i" "italics" "italic" (add-inline "i" opts) eat-one-newline
"font" "f" (add-inline "f" opts) eat-one-newline
"s" "strike" "strikethrough" (add-inline "s" opts) eat-one-newline

"b." "bold." (remove-inline "b") eat-one-newline
"u." "underline." (remove-inline "u") eat-one-newline
"i." "italics." "italic." (remove-inline "i") eat-one-newline
"font." "f." (remove-inline "f") eat-one-newline
"s." "strike." "strikethrough." (remove-inline "s") eat-one-newline

"row" "column" "row." "column." "span"
(close-block/upto "cell." "table" stage3 cmd opts) eat-one-newline
"cell" (close-block/upto "cell." "table" open-block cmd opts) eat-one-newline
]

The in-inline state processes commands at inline level.

We handle comments like in in-block, and by default commands are passed as-is.
The "=." command at inline level closes all inline commands, adds a newline (to end the inline level and make the stage 3 to go back to block level), and continues to in-block, which in turn will close all block level commands; "= " and "=," respectively close the last and all inline commands.
"=link" and "=anchor" are special inline commands, and behave differently from other inline commands. See 〈Stage 2 functions〉 for more details.
Formatting commands are handled by the open-inline and close-inline functions; they behave differently from open-block and close-block because inline commands can be opened and closed out of order; these functions take care of reordering the commands so that in the stream that arrives at the third stage commands are balanced and correctly nested. See 〈Stage 2 functions〉 for more details.
The "=repeat" command is a problem here; first of all, "=repeat." could both be closing an inline or a block level repeat; the close-repeat function checks for this and if it was closing a block level repeat, it ends the inline level. (To do this, it generates the close-inline: event, see 〈Stage 2 functions〉 for more details.) But the major problem is that repeat does not behave like inline commands, but it does not behave like the link and anchor special commands; for this reason, there are a few unresolved issues here. In particular, we have problems if a "=repeat" in inline level is not correctly balanced and nested with other commands. I hope to fix this in the future.
The inline mode is ended by a newline, or by any block level commands. When inline mode ends, we need to close a special command if it is open, close any inline level repeat, temporarely close all inline commands, and emit a newline command; the end-inline function does this.
The close-inline: event is generated by functions that conditionally need to end the inline mode; see 〈Stage 2 functions〉 for more details.

〈The stage 2 FSM states〉 +≡

in-inline: [
"'" "�" "`" {"} in-line-comment
";" "comment" "rem" in-comment
default: (stage3 cmd opts)

"." (close-all-inline stage3 "^/" none) continue return
" " (close-last-inline)
"," (close-all-inline)

"link" "li" (open-special "link" opts)
"anchor" "a" (open-special "anchor" opts)
"link." "li." (close-special "link.")
"anchor." "a." (close-special "anchor.")

"b" "bold" (open-inline "b" opts)
"u" "underline" (open-inline "u" opts)
"i" "italics" "italic" (open-inline "i" opts)
"s" "strike" "strikethrough" (open-inline "s" opts)
"font" "f" (open-inline "f" opts)
"b." "bold." (close-inline "b.")
"u." "underline." (close-inline "u.")
"i." "italics." "italic." (close-inline "i.")
"s." "strike." "strikethrough." (close-inline "s.")
"font." "f." (close-inline "f.")

"repeat" (open-inline "repeat" opts)
"repeat." (close-repeat)

"^/" (end-inline) return
"box" "table" "c" "center" "center." "box." "table." "-"
"1" "1'" "2" "2'" "3" "3'" "*" "**" "#" "##" "csv"
">" ":" "::" "word" "def" "example" "toc" "cell" "cell."
"row" "column" "left" "right" "left." "right." "r" "l" "span"
"html" "rebol" "makedoc" "justify" "j" "justify." "toc."
"o" "x" "2�" "1�" "3�" "row." "column." "4" "5" "6"
(end-inline) continue return

close-inline: return
]

The other states we need are in-line-comment, in-comment and eat-one-newline.

〈The stage 2 FSM states〉 +≡

in-line-comment: [
"^/" return
]
in-comment: [
";." "comment." "rem." return
]
eat-one-newline: [
"^/" return
default: continue return
]

7.2 Stage 2 functions

〈Stage 2 functions〉 ≡

; block commands stack
block-stack: [ ]
; inline commands stack
inline-stack: [ ]
; special inline commands
special: none
end-inline: does [
if special [close-special special]
close-repeat/only
temp-close-inline
stage3 "^/" none
]
; open special inline command (=link or =anchor)
open-special: func [cmd opts] [
; close previous special if open
if special [close-special special]
special: join cmd "."
temp-close-inline
stage3 cmd opts
reopen-inline
]
; close special inline command
close-special: func [cmd] [
if special = cmd [
  temp-close-inline
  stage3 cmd none
  reopen-inline
  special: none
]
]
; close all open block commands
close-all-block: does [
; process in reverse order
block-stack: skip tail block-stack -2
while [not empty? block-stack] [
  ; close command
  stage3 join block-stack/1 "." none
  block-stack: skip clear block-stack -2
]
]
; reopen all inline commands that where temporarily closed
reopen-inline: does [
foreach [cmd opts] inline-stack [
  stage3 cmd opts
]
]
; remove last command from inline stack
remove-last-inline: has [cmd] [
either empty? inline-stack [
  if find ["left" "right" "center" "justify"] cmd: pick tail block-stack -2 [
   stage3 join cmd "." none
   clear skip tail block-stack -2
  ]
] [
  clear skip tail inline-stack -2
]
]
; remove all inline commands
remove-all-inline: has [cmd] [
clear inline-stack
block-stack: tail block-stack
while [find ["left" "right" "center" "justify"] cmd: pick block-stack -2] [
  stage3 join cmd "." none
  block-stack: skip block-stack -2
]
block-stack: head clear block-stack
]
; open block command
open-block: func [cmd opts] [
stage3 cmd opts
insert/only insert tail block-stack cmd opts
]
; remove specific inline command
remove-inline: func [cmd] [
if cmd: find/skip inline-stack cmd 2 [
  remove/part cmd 2
]
]
; close specific block command
close-block: func [cmd /upto noclosecmd /local] [
; cmd ends with a ., compute opening command without the .
remove back tail cmd: copy cmd
; is the block open?
if local: find/skip/last block-stack cmd 2 [
  if upto [
   ; closing cmd should never close noclosecmd
   ; so if cmd appears before noclosecmd, it should
   ; not actually be closed
   noclosecmd: find/skip/last block-stack noclosecmd 2
   if all [noclosecmd (index? local) < index? noclosecmd] [
    exit
   ]
  ]
  ; close all blocks up to specified one (included)
  block-stack: tail block-stack
  until [
   block-stack: skip block-stack -2
   ; close this block
   stage3 join block-stack/1 "." none
   block-stack/1 = cmd
  ]
  block-stack: head clear block-stack
]
]
; close all open inline commands
close-all-inline: does [
; process in reverse order
inline-stack: skip tail inline-stack -2
while [not empty? inline-stack] [
  ; close command
  ; since we happen to close and reopen inline commands
  ; a lot, it is possible that we are closing a command that
  ; had just been opened; to avoid that, we check the out block,
  ; and directly remove the open command if that was the last command
  ;either inline-stack/1 = pick tail out -2 [
  ; clear skip tail out -2
  ;] [
   stage3 join inline-stack/1 "." none
  ;]
  inline-stack: skip clear inline-stack -2
]
; close special commands too
if special [close-special special]
; close alignment commands too
block-stack: tail block-stack
while [find ["left" "right" "center" "justify"] cmd: pick block-stack -2] [
  ; we need to emit a newline too if we close an alignment
  if not close-inline? [
   stage3 "^/" none
   ; forces the fsm to go back to in-block state
   close-inline?: yes
  ]
  stage3 join cmd "." none
  block-stack: skip block-stack -2
]
block-stack: head clear block-stack
]
; open inline command
open-inline: func [cmd opts] [
stage3 cmd opts
insert/only insert tail inline-stack cmd opts
]
; add inline command (to be opened when we enter inline mode)
add-inline: func [cmd opts] [
insert/only insert tail inline-stack cmd opts
]
; close last opened inline command
close-last-inline: has [cmd] [
; if empty, check for special and alignment
if empty? inline-stack [
  if special [
   close-special special
   exit
  ]
  if find ["left" "right" "center" "justify"] cmd: pick tail block-stack -2 [
   stage3 "^/" none
   ; forces the fsm to go back to in-block state
   close-inline?: yes
   stage3 join cmd "." none
   clear skip tail block-stack -2
  ]
  exit
]
cmd: pick tail inline-stack -2
; check if this had just been opened (see note above)
;either cmd = pick tail out -2 [
; clear skip tail out -2
;] [
  stage3 join cmd "." none
;]
; remove from stack
clear skip tail inline-stack -2
]
; close specific inline command
close-inline: func [cmd] [
; this is special because we need to close all commands up to
; the specified one (included), but then reopen all except the specified
; one. (so that "=b text =i text =b. text" becomes "=b text =i text =i. =b. =i text")
; cmd ends with a ., compute opening command without the .
remove back tail cmd: copy cmd
; is the command open?
if find/skip inline-stack cmd 2 [
  ; close all commands up to specified one (included)
  inline-stack: tail inline-stack
  until [
   inline-stack: skip inline-stack -2
   ; see note in close-all-inline
   ;either inline-stack/1 = pick tail out -2 [
   ; clear skip tail out -2
   ;] [
    stage3 join inline-stack/1 "." none
   ;]
   inline-stack/1 = cmd
  ]
  ; remove this command
  remove/part inline-stack 2
  ; reopen all other commands
  foreach [cmd opts] inline-stack [
   stage3 cmd opts
  ]
  inline-stack: head inline-stack
]
]
; temporarily close all inline commands (to be reopened later)
; this happens at block end... so that "=b =box." becomes "=b =b. =box. =b"
temp-close-inline: does [
; if empty, we have nothing to do
if empty? inline-stack [exit]
; process in reverse order
inline-stack: tail inline-stack
until [
  inline-stack: skip inline-stack -2
  ; close command
  ; since we happen to close and reopen inline commands
  ; a lot, it is possible that we are closing a command that
  ; had just been opened; to avoid that, we check the out block,
  ; and directly remove the open command if that was the last command
  ;either inline-stack/1 = pick tail out -2 [
  ; clear skip tail out -2
  ;] [
   stage3 join inline-stack/1 "." none
  ;]
  head? inline-stack
]
]
close-inline?: no
close-repeat: func [/only] [
; similar to close-inline, but does not reopen commands; also could
; be closing a block repeat
; is the command open?
either find/skip inline-stack "repeat" 2 [
  ; close all commands up to specified one (included)
  inline-stack: tail inline-stack
  until [
   inline-stack: skip inline-stack -2
   stage3 join inline-stack/1 "." none
   inline-stack/1 = "repeat"
  ]
  ; remove from stack
  inline-stack: head clear inline-stack
] [
  ; maybe it's a block repeat?
  if all [not only find/skip block-stack "repeat" 2] [
   ; go back to block mode
   if special [close-special special]
   temp-close-inline
   stage3 "^/" none
   close-inline?: yes
   close-block "repeat."
  ]
]

]

7.3 Initialization and termination of the FSM

〈Initialization and termination of the FSM〉 ≡

init-stage2: does [
; initialize stuff
clear stage2-ctx/block-stack
clear stage2-ctx/inline-stack
stage2-ctx/special: none
;clear stage2-ctx/anchors
stage2-fsm/init stage2-ctx/in-block
]
end-stage2: does [
; cause everything to be closed
stage2-fsm/event "."
; actually, not needed...
stage2-fsm/end
]

7.4 Additional code for the stage2 function

〈Additional code for the stage2 function〉 ≡

if stage2-ctx/close-inline? [
stage2-fsm/event first [close-inline:]
stage2-ctx/close-inline?: no
]

7.5 The merge-style function

〈The merge-style function〉 ≡

merge-style: func [old new /copy /local val] [
if object? new [
  either object? old [
   if copy [old: make old [ ]]
   foreach word next first new [
    if val: get in new word [
     set in old word val
    ]
   ]
  ] [
   old: new
  ]
]
old
]

8. Scan a text string and return a QML document tree

〈Scan a text string and return a QML document tree〉 ≡

init-stage2
init-stage3
if not keep [set-defaults defaults]
parse-qml text if defaults [select defaults "alias"]
end-stage2
; returns output
end-stage3

8.1 Search a QML document tree for a substring

〈Search a QML document tree for a substring〉 ≡

res: make block! 16
doc: build-search-index doc
if anchor: find doc/2 text [
insert/only insert/only res anchor/3 anchor/2
]
foreach [header string] skip doc 2 [
while [string: find/tail string text] [
insert insert/only tail res header copy/part skip string -50 120
]
]
res

8.1.1 search's locals

〈search's locals〉 ≡

res anchor

9. Searching a QML document

〈Searching a QML document〉 ≡

10. Setting default options

〈Setting default options〉 ≡

default-number-style:
default-table:
default-row:
default-column:
default-cell:
default-box:
default-image: none
; defaults that cannot be currently set
default-data: context [name: "csv" index: none]
default-repeat: [csv in csv]
merge-style: func [old new /copy /local val] [
if object? new [
  either object? old [
   if copy [old: make old [ ]]
   foreach word next first new [
    if val: get in new word [
     set in old word val
    ]
   ]
  ] [
   old: new
  ]
]
old
]
set-defaults: func [defaults /local w] [
default-number-style: default-table: default-row: default-column:
default-cell: default-box: default-image: none
if block? defaults [
  foreach [cmd opts] defaults [
   if w: select [
    "toc" default-number-style
    "table" default-table
    "row" default-row
    "column" default-column
    "cell" default-cell
    "box" default-box
    "image" default-image
   ] cmd [
    set w parse-command-options cmd opts
   ]
  ]
]
]

11. Generating QML document tree (stage 3)

〈Generating QML document tree (stage 3)〉 ≡

stage3: func [cmd opts] [
;brkpnt "STAGE 3" [cmd opts]
if block? cmd [cmd: first cmd]
stage3-ctx/cmd: cmd
stage3-ctx/opts: opts
stage3-fsm/event cmd
]
stage3-fsm: make fsm! [ ]
out: [ ]
init-stage3: does [
clear out
clear stage3-ctx/blocks
clear stage3-ctx/vars
clear stage3-ctx/anchors
stage3-ctx/local-vars: copy [[ ]]
stage3-ctx/csvid: stage3-ctx/tabid: 1
insert out 'qml
;stage3-fsm/tracing: yes
stage3-fsm/init stage3-ctx/initial
]
end-stage3: does [
stage3-fsm/end
rewrite out rewrite-rules
make-toc out
set-enum-counts out
out
]
stage3-ctx: context [
〈Stage 3 FSM〉
]
rewrite-rules: use [x y z] [[
〈Stage 3 rewrite rules (optimization, handling of =data and =repeat)〉
]]
〈Words used by the rewrite rules〉
〈Header numbering〉
〈Postprocessing〉
process-link: func [target] [
either parse/all target [["http://" | "mailto:" | "ftp://" | "www."] to end] [
compose [target: (target) class: "external" text: (target)]
] [
compose [target: (join http://www.qtask.com/qwiki.cgi?goto= target) class: "internal" text: (target)]
]
]

11.1 Header numbering

〈Header numbering〉 ≡

numbering: context [
; only 6 levels supported
toc-counters: [0 0 0 0 0 0]
; number style for each level
toc-style: ["1. " "1[.1] "]
chars: complement charset "1AaIi[]"
make-number: func [level /local style i res mk1 mk2 rpt term cont] [
  ; toc-style set to none means no numbering
  if not toc-style [return ""]
  i: 1
  style: any [pick toc-style level last toc-style]
  res: make string! 16
  ; first increment/reset counters
  poke toc-counters level 1 + pick toc-counters level
  change/dup skip toc-counters level 0 subtract length? toc-counters level
  ; generate number
  term: [
   mk1: any chars mk2: (insert/part tail res mk1 mk2)
   [
    "1" (insert tail res pick toc-counters i)
    |
    "A" (insert tail res pick "ABCDEFGHIJKLMNOPQRSTUVWXYZ" min 26 pick toc-counters i)
    |
    "a" (insert tail res pick "abcdefghijklmnopqrstuvwxyz" min 26 pick toc-counters i)
    |
    "I" (insert tail res uppercase to-roman pick toc-counters i)
    |
    "i" (insert tail res to-roman pick toc-counters i)
   ]
   mk1: any chars mk2: (insert/part tail res mk1 mk2)
   (i: i + 1 cont: either i > level ['break] [[ ]]) cont
  ]
  parse/all/case style [
   some [rpt: "[" some term "]" (cont: either i > level ['break] [[:rpt]]) cont | term]
   rpt: (insert tail res rpt)
  ]
  res
]
romans: [
  ["" "i" "ii" "iii" "iv" "v" "vi" "vii" "viii" "ix"]
  ["" "x" "xx" "xxx" "xl" "l" "lx" "lxx" "lxxx" "xc"]
  ["" "c" "cc" "ccc" "cd" "d" "dc" "dcc" "dccc" "cm"]
]
; form an integer from 0 to 999 with roman numerals
to-roman: func [int /local res] [
  int: form int
  res: make string! 16
  forall int [
   insert tail res pick pick romans length? int int/1 - #"/"
  ]
  res
]
; set toc-style
set-style: func [style] [
  style: any [style default-number-style]
  if none? style [toc-style: none exit]
  if not toc-style: select/case [
   "1" ["1. " "1[.1] "]
   "A" ["A. " "A[.1] "]
   "a" ["a) " "a1) " "a1[.1]) "]
   "I" ["I. " "I[.1] "]
   "i" ["i) " "i[.1]) "]
  ] style [
   toc-style: parse/all style "|"
  ]
]
; reset counters
reset: does [change/dup toc-counters 0 6]
]

11.2 Postprocessing

〈Postprocessing〉 ≡

collect: func [output doc rule /local node] [
match doc [set node into rule (append/only output copy/deep node)]
output
]
mkopts: func [level id] [compose [number: (numbering/make-number level) id: (id)]]
make-toc: func [doc /local style toc here l headid] [
headid: 1
match doc [
  'section
  [here: 'opts set style string! (remove/part here 2) :here | (style: none)]
  (numbering/set-style style)
  into ['toc toc: to end] here: to end (
   numbering/reset
   collect toc here [
    ['header1 (l: 1) | 'header2 (l: 2) | 'header3 (l: 3)]
    [
     'opts set here block! (append here mkopts l headid headid: headid + 1)
     |
     here: (here: insert/only insert here 'opts mkopts l headid headid: headid + 1) :here
    ]
    to end
    |
    ; add number but do not collect!
    ['header4 (l: 4) | 'header5 (l: 5) | 'header6 (l: 6)]
    [
     'opts set here block! (insert insert tail here [number:] numbering/make-number l)
     |
     here: (here: insert/only insert here 'opts compose [number: (numbering/make-number l)]) :here
    ]
    ; fail (so that it is not collected)
    end skip
   ]
   ; remove anchors from TOC
   rewrite toc [
    [into ['anchor opt ['opts skip] here: to end]] [(here)]
   ]
  )
]
]
count: func [counter options] [
if block? options [
  options: construct options
  if all [in options 'force integer? options: attempt [to integer! options/force]] [set counter options]
]
options: get counter
set counter 1 + options
options
]
set-enum-counts: func [doc /local count1 count2 opts] [
count1: count2: 1
match doc [
  'enum some [
   into [
    'item [
     'opts set opts block! (insert insert tail opts [number:] count 'count1 opts)
     |
     opts: (opts: insert/only insert opts 'opts compose [number: (count 'count1 opts)]) :opts
    ] (count2: 1) to end
    |
    'enum some [
     into [
      'item [
       'opts set opts block! (insert insert tail opts [number:] count 'count2 opts)
       |
       opts: (opts: insert/only insert opts 'opts compose [number: (count 'count2 opts)]) :opts
      ] to end
     ]
     |
     skip
    ]
   ]
   |
   skip
  ]
]
]

11.3 Stage 3 FSM

〈Stage 3 FSM〉 ≡

cmd: opts: none

emit: func [val] [
repend out val
]
blocks: [ ]
open-block: func [name opts /only] [
insert/only tail blocks out
insert/only tail out out: make block! 16
emit name
if not none? opts [
  if all [object? opts not only] [opts: make-style opts]
  emit ['opts opts]
]
]
close-block: does [
if empty? blocks [exit]
out: last blocks
remove back tail blocks
]
; shortcut to make the state inheritance trick readable.
; appending the parent state's directives to the new directives
; makes the new state "inherit" them.
inherit: func [parent-state new-directives] [
append new-directives parent-state
]
make-style: func [obj /ignore block /local] [
local: make block! length? obj: third obj
block: any [block [ ]]
foreach [word val] obj [
  if all [:val not find block to word! word] [insert/only insert tail local word :val]
]
local
]

; common state directives, used by almost all states
common: [
; by default emit unknown commands verbatim
default: (emit [reduce ['command cmd opts]])
; the dot command makes the fsm go to the initial state
; note: this should not happen anymore, since "." commands are handled by the scanner.
"." rewind? initial
; collected anchor names block from the scanner
; (normally just at the beginning of the doc)
;anchors: (anchors: opts)
]
; basic block state, also the initial state
anchors: [ ]
; variables that can be used by =data
vars: [ ]
tabid: 1
open-table: func [opts] [
if not object? opts [
  opts: refinements/get-obj "table"
]
if not opts/name [
  opts/name: join "table" tabid
  if tabid = 1 [insert insert tail vars "table" context [type: 'alias dest: "table1"]]
  tabid: tabid + 1
]
open-block/only 'table-proto opts
insert insert tail vars opts/name context [type: 'table-proto name: opts/name contents: out]
]
in-block: initial: inherit common [
"" "link" "anchor" "image" "b" "i" "f" "data" text: (open-block 'para none) continue in-para (close-block)
after-para: ( )
; newlines produce an empty paragraph in the initial state
"^/" (emit [[para]])
"-" (emit [[hrule]])
"1" (open-block 'header1 none) in-para (close-block)
"2" (open-block 'header2 none) in-para (close-block)
"3" (open-block 'header3 none) in-para (close-block)
"4" (open-block 'header4 none) in-para (close-block)
"5" (open-block 'header5 none) in-para (close-block)
"6" (open-block 'header6 none) in-para (close-block)
"1'" "1�" (open-block 'header1* none) in-para (close-block)
"2'" "2�" (open-block 'header2* none) in-para (close-block)
"3'" "3�" (open-block 'header3* none) in-para (close-block)
"*" "**" (open-block 'bullets none) continue in-ulist (close-block)
"#" "##" (open-block 'enum none) continue in-olist (close-block)
"o" "x" (open-block 'checks none) continue in-checklist (close-block)
">" (open-block 'para compose [indent: (opts)]) in-para (close-block)
"::" (open-block 'para [indent: 2]) in-item (close-block)
":" (open-block 'definitions none) continue in-dlist (close-block)
"box" (open-block 'box merge-style/copy default-box opts) in-box (close-block)
"toc" (open-block 'section opts open-block 'toc none) in-toc (close-block)
"table" (open-table opts) in-table (close-block)
"c" (open-block 'para [text-halign: center]) in-para (close-block)
"center" (open-block 'center none) in-center (close-block)
"l" (open-block 'para [text-halign: left]) in-para (close-block)
"left" (open-block 'left none) in-left (close-block)
"r" (open-block 'para [text-halign: right]) in-para (close-block)
"right" (open-block 'right none) in-right (close-block)
"j" (open-block 'para [text-halign: justify]) in-para (close-block)
"justify" (open-block 'justify none) in-just (close-block)
"example" "html" "rebol" "makedoc" (emit [reduce ['escape cmd opts]])
"csv" (handle-csv opts)
"repeat" (open-block 'repeat any [opts default-repeat]) in-repeat (close-block)
; these cannot actually happen in the initial state, but they could
; happen in all the states inheriting from in-block
"center." continue rewind? in-center
"left." continue rewind? in-left
"right." continue rewind? in-right
"justify." continue rewind? in-just
"box." continue rewind? in-box
"table." continue rewind? in-table
]
〈CSV data handling〉

11.4 CSV data handling

〈CSV data handling〉 ≡

csvid: 1
handle-csv: func [data] [
if not object? data [exit]
data: make data [type: 'csv]
either data/name [
  insert insert tail vars data/name data
] [
  insert insert tail vars join "csv" csvid data
  if csvid = 1 [insert insert tail vars "csv" data]
  csvid: csvid + 1
]
if data/show [
  open-block 'table none
  foreach row data/contents [
   open-block 'row none
   foreach column row [
    emit [reduce ['cell reduce ['para column]]]
   ]
   close-block
  ]
  close-block
]
]

11.5 Other states

〈Stage 3 FSM〉 +≡

; inside a centered block. same as initial state, except that =center. will end it.
in-center: inherit in-block [
"center." override after-para return
]
; inside a left aligned block. same as initial state, except that =left. will end it.
in-left: inherit in-block [
"left." override after-para return
]
; inside a right aligned block. same as initial state, except that =right. will end it.
in-right: inherit in-block [
"right." override after-para return
]
; inside a justified block. same as initial state, except that =justify. will end it.
in-just: inherit in-block [
"justify." override after-para return
]

in-repeat: inherit in-block [
"repeat." override after-para return
]

; inside a paragraph (or other inline content)
in-para: inherit common [
text: (emit opts)
; anything else ends the paragraph
"^/" override after-para return
;default: continue return
; formatting
"b" (open-block 'bold none) in-bold (close-block)
"i" (open-block 'italic none) in-italic (close-block)
"s" (open-block 'strike none) in-strike (close-block)
;"u" (emit <u>) in-underline (emit </u>)
; the empty command is actually the =[qwiki link] command
"" (emit [reduce ['qlink opts]])
"link" (open-block 'link-proto opts) in-link (close-block)
"f" (open-block 'font opts) in-font-inline (close-block)
"image" (emit [reduce ['image 'opts make-style merge-style/copy default-image opts]])
"anchor" (open-block 'anchor opts if opts [insert/only insert tail anchors opts out]) in-anchor (close-block)
"data" (emit [reduce ['data make-style merge-style/copy default-data opts]])
"repeat" (open-block 'repeat any [opts default-repeat]) in-repeat-inline (close-block)
]
in-repeat-inline: inherit in-para [
"repeat." return
; note that "^/" cannot happen here thanks to stage2
]
; inside a link
in-link: inherit in-para [
"link." return
]
; inside an anchor
in-anchor: inherit in-para [
"anchor." return
]
; inside inline font command
in-font-inline: inherit in-para [
"f." return
]
; inside inline bold
in-bold: inherit in-para [
"b" in-bold ; recurse (to enforce balancing)
"b." return
]
; inside inline italic
in-italic: inherit in-para [
"i" in-italic ; recurse
"i." return
]
; inside strikethrough
in-strike: inherit in-para [
"s" in-strike ; recurse
"s." return
]
; inside underline
in-underline: inherit in-para [
"u" in-underline ; recurse
"u." return
]

; inside a term definition block (=:)
in-dlist: [
; a new =: starts a new definition in the same block
":" (open-block 'term none) in-item (close-block)
; =:: starts the definition description
"::" (open-block 'desc none) in-item (close-block)
after-para: ( )
; anything else outside of a description ends the definition block
default: continue return
]

; inside a checklist
in-checklist: [
; new items
"o" (open-block 'check compose [checked: (no)]) in-item (close-block)
"x" (open-block 'check compose [checked: (yes)]) in-item (close-block)
; nothing to do after para
after-para: ( )
; anything else ends the list
default: continue return
]

; inside a bullet list
in-ulist: [
; new item
"*" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]]) in-item (close-block)
; start sublists
"**" (open-block 'bullets none) continue in-ulist2 (close-block)
"##" (open-block 'enum none) continue in-olist2 (close-block)
; nothing to do after para
after-para: ( )
; anything else ends the list
default: continue return
]
; inside a numbered list
in-olist: [
; new item
"#" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]]) in-item (close-block)
; start sublists
"##" (open-block 'enum none) continue in-olist2 (close-block)
"**" (open-block 'bullets none) continue in-ulist2 (close-block)
; nothing to do after para
after-para: ( )
; anything else ends the list
default: continue return
]
; bullet sublist
in-ulist2: [
; new item
"**" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]]) in-item (close-block)
; nothing to do after para
after-para: ( )
; anything else ends the list
default: continue return
]
; numbered sublist
in-olist2: [
; new item
"##" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]]) in-item (close-block)
; nothing to do after para
after-para: ( )
; anything else ends the list
default: continue return
]
; inside a list's item
in-item: in-para

; inside a box
; box should work this way:
; if there's something after the =box command and before the newline, treat
; it as the box's header; otherwise, if there's a newline immediately following
; the =box command, emit no header
in-box: [
; newline just after =box, emit no header
; also go to box-contents after the header
"^/" after-para: box-contents
; a block level command just after =box, emit no header
default: continue box-contents
; inline level command means that we have a header
"" "link" "b" "i" "f" "s" "anchor" "data" "image" text: (open-block 'title none) continue in-para (close-block) ; "u"
; close command or coming back from box-contents
"box." override after-para return
]
box-contents: inherit in-block [
; same as initial except that ends on =box.
"box." continue return
]

in-toc: [
; newline just after =toc, emit no title
; also go to in-block after the header
"^/" after-para: (close-block) in-toc2
; a block level command just after =toc, emit no title
default: (close-block) continue in-toc2
; inline level command means that we have a title
"" "link" "b" "i" "f" "s" "anchor" "data" "image" text: (open-block 'title none) continue in-para (close-block) ; "u"
; close command or coming back from in-toc2
"toc." (close-block) override after-para return
]
in-toc2: inherit in-block [
"toc." override after-para 2 return
]

〈Generating tables〉

11.6 Generating tables

〈Generating tables〉 ≡

; table states (see also table code below)
; inside a table
in-table: inherit common [
"row" (emit ['row opts])
"column" (emit ['column opts])
; go back to saved position
"row." "column." (emit 'return)
"cell" (open-block/only 'cell opts) in-cell-block (close-block)
"table." return
; define a spanning cell
"span" (emit ['span opts])
; anything else starts a new cell
default: (open-block 'cell none) continue in-cell (close-block)
"repeat" (open-block 'repeat any [opts default-repeat]) in-table-repeat (close-block)
]
in-table-repeat: inherit in-table [
"repeat." return
]
; inside an implied table cell (no =cell command given)
in-cell: inherit in-block [
; lists are treated specially in tables... each item in a cell
"*" (
  open-block 'bullets none
  open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]]
) in-item (close-block close-block)
"**" (
  open-block 'bullets none
  open-block 'bullets none
  open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]]
) in-item (close-block close-block close-block)
"#" (
  open-block 'enum none
  open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]]
) in-item (close-block close-block)
"##" (
  open-block 'enum none
  open-block 'enum none
  open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]]
) in-item (close-block close-block close-block)
"::" (open-block 'para [indent: 2]) in-item (close-block)
":" (open-block 'definitions none) override define in-cell-dlist (close-block)
"o" (open-block 'checks none open-block 'check no) in-item (close-block close-block)
"x" (open-block 'checks none open-block 'check yes) in-item (close-block close-block)
; end current cell, go back to in-table state
after-para: "^/" return
]
; inside a block table cell (=cell ... =cell.)
in-cell-block: inherit in-block [
; end current cell
"cell." return
]
; inside a term definition block (=:) inside a cell
in-cell-dlist: [
define: (open-block 'term none) in-item (close-block)
; =:: starts the definition description
"::" (open-block 'desc none) in-item (close-block)
after-para: ( )
; anything else outside of a description ends the definition block and the cell
default: continue 2 return
]

header?: func [type col row] [
switch type [
  horiz [row = 1]
  vert [col = 1]
  both [any [row = 1 col = 1]]
  none [false]
]
]
generate-table: func [opts body /local table result content tmp i j header] [
result: copy [table]
if opts [insert/only insert tail result 'opts make-style/ignore opts [name]]
table: make table-state! [
  style: opts
  if all [object? style style/table-size] [size: style/table-size]
  table: make block! 16
  columns: make block! 16
]
; first, eval any 'repeat etc.
body: rewrite copy body rewrite-rules
parse body [
  some [
   'row set opts skip (add-row table opts)
   |
   'column set opts skip (add-col table opts)
   |
   'return (table-go-back table)
   |
   into ['cell ['opts set opts skip | (opts: none)] content: to end (add-cell table opts content)]
   |
   'span set opts skip (make-span table opts) opt [
    into ['cell ['opts set opts skip | (opts: none)] content: to end (set-cell table opts content)]
   ]
  ]
]
header: 'horiz
if object? table/style [
  if table/style/name [
   poke find vars table/style/name 2 table
  ]
  if table/style/hide [return [hidden-table]]
  if table/style/vertical [header: either table/style/horizontal ['both] ['vert]]
  if table/style/headerless [header: 'none]
]
insert/only tail result content: copy [columns]
foreach col table/columns [
  insert/only tail content either col [reduce ['column 'opts make-style col]] [[column]]
]
j: 1
foreach row table/table [
  ; emit a row
  ; row can be none if it's empty
  either object? row [
   i: 1
   insert/only tail result content: copy [row]
   if row/style [insert/only insert tail content 'opts make-style row/style]
   if all [find [horiz both] header j = 1] [insert tail content 'header]
   foreach cell row/contents [
    either object? cell [
     ; emit cell, unless it is hidden by spanned cell
     if cell/type = 'cell [
      insert/only tail content compose [
       cell (either cell/style ['opts] [[ ]])
        (either cell/style [reduce [make-style/ignore cell/style [position]]] [[ ]])
        (either cell/spansize ['span] [[ ]]) (any [cell/spansize [ ]])
        (either header? header i j ['header] [[ ]])
        (cell/out)
      ]
     ]
    ] [
     ; cell can be none if it's empty
     tmp: make-cell-style table none row pick table/columns i
     insert/only tail content compose [
      cell (either tmp ['opts] [[ ]])
       (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]])
       (either header? header i j ['header] [[ ]])
     ]
    ]
    i: i + 1
   ]
   loop table/size/x - length? row/contents [
    tmp: make-cell-style table none row pick table/columns i
    insert/only tail content compose [
     cell (either tmp ['opts] [[ ]])
      (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]])
      (either header? header i j ['header] [[ ]])
    ]
    i: i + 1
   ]
  ] [
   ; if row contains a spanned cell, then it is never empty (we put references there)
   insert/only tail result content: copy [row]
   if all [find [horiz both] header j = 1] [insert tail content 'header]
   if not any [table/style]
   repeat i table/size/x [
    tmp: make-cell-style table none none pick table/columns i
    insert/only tail content compose [
     cell (either tmp ['opts] [[ ]])
      (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]])
      (either header? header i j ['header] [[ ]])
    ]
   ]
  ]
  j: j + 1
]
loop table/size/y - length? table/table [
  insert/only tail result content: copy [row]
  if all [find [horiz both] header j = 1] [insert tail content 'header]
  repeat i table/size/x [
   tmp: make-cell-style table none none pick table/columns i
   insert/only tail content compose [
    cell (either tmp ['opts] [[ ]]) (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]])
     (either header? header i j ['header] [[ ]])
   ]
  ]
  j: j + 1
]
result
]

; Table handling - needs two passes
table-state!: context [
type: 'table
; block of rows
table:
; list of column styles
columns:
; current row
currow:
; current cell
curcell:
; current position (pair!)
curpos: none
; direction (where to add new cells, 1x0 = to the right, 0x1 = to the bottom, and so on)
dir: 0x1
; table size in cells
size: 0x0
; table style
style: none
; =row and =column save current position so that =row. and =column. can go back to it
savepos: savedir: none
]
inherit-style: func [dest source words] [
foreach word words [
  if none? get in dest word [
   set in dest word get in source word
  ]
]
dest
]
make-cell-style: func [table-state style row col] [
if not style [
  style: context [
   background: color: outline-color: image: typeface: width: height: fontsize:
   bold: italic: outline-style: text-halign: text-valign:
   image-halign: image-valign: image-tiling: position: none
  ]
]
; inherit outline style from table if needed
if all [table-state/style table-state/style/all] [
  inherit-style style table-state/style [outline-color outline-style]
]
either table-state/dir = 1x0 [
  ; row style has priority, so it must be applied last
  if all [col col/all] [
   ; background is there to override header cells
   inherit-style style col [
    background color outline-color typeface height fontsize bold italic outline-style
    text-halign text-valign
   ]
  ]
  if all [row row/style row/style/all] [
   inherit-style style row/style [
    background outline-color width outline-style
   ]
  ]
] [
  ; column style has priority, so it must be applied last
  if all [row row/style row/style/all] [
   inherit-style style row/style [
    background outline-color width outline-style
   ]
  ]
  if all [col col/all] [
   ; background is there to override header cells
   inherit-style style col [
    background color outline-color typeface height fontsize bold italic outline-style
    text-halign text-valign
   ]
  ]
]
; if no value has been specified, just cancel whole style
if parse second style [object! some none! end] [style: none]
style
]
make-row: func [table-state pos style' /local row] [
; do we need to add rows?
if pos/y > length? table-state/table [
  insert/dup tail table-state/table none pos/y - length? table-state/table
]
; create it if it does not exist
either row: pick table-state/table pos/y [
  row/style: merge-style row/style style'
] [
  ; (poke does not return ROW in older REBOLs)
  poke table-state/table pos/y row: context [
   ; row contents (cells)
   contents: make block! 16
   style: style'
   ; first row is header unless vertical style has been specified, or table is headerless
   ;header: if not all [table-state/style any [table-state/style/vertical table-state/style/headerless]] [pos/y = 1]
  ]
]
row
]
make-col: func [table-state pos style /local col] [
; do we need to add rows?
if pos/x > length? table-state/columns [
  insert/dup tail table-state/columns none pos/x - length? table-state/columns
]
; create it if it does not exist
either col: pick table-state/columns pos/x [
  merge-style col style
] [
  poke table-state/columns pos/x style
  ; (poke does not return STYLE in older REBOLs)
  style
]
]
add-row: func [table-state args /local pos] [
; save current direction (for =row.)
table-state/savedir: table-state/dir
; add new row (=row command)
; see if position has been specified
if object? args [
  pos: args/position
  if pair? pos [pos: pos/y]
]
; if not, assign default position
either pos [
  ; save current position (for =row.)
  table-state/savepos: table-state/curpos
  table-state/curpos: 0x1 * pos
] [
  either table-state/curpos [
   ; position must be saved so that =row. does not make the next
   ; cell overwrite the row
   table-state/savepos: table-state/curpos + 0x1
   table-state/curpos: table-state/curpos * 0x1 + 0x1
  ] [
   table-state/savepos: table-state/curpos: 0x1
  ]
]
; create row object
args: merge-style/copy default-row args
make-row table-state table-state/curpos args
; =row command also means switching to horizontal direction for new cells
table-state/dir: 1x0
]
add-col: func [table-state args /local pos] [
; save current direction (for =column.)
table-state/savedir: table-state/dir
; add new column (=column command)
; see if position has been specified
if object? args [
  pos: args/position
  if pair? pos [pos: pos/x]
]
; if not, assign default position
either pos [
  ; save current position (for =column.)
  table-state/savepos: table-state/curpos
  table-state/curpos: 1x0 * pos
] [
  either table-state/curpos [
   ; position must be saved so that =column. does not make the next
   ; cell overwrite the column
   table-state/savepos: table-state/curpos + 1x0
   table-state/curpos: table-state/curpos * 1x0 + 1x0
  ] [
   table-state/savepos: table-state/curpos: 1x0
  ]
]
; create col object
args: merge-style/copy default-column args
make-col table-state table-state/curpos args
; =column command also means switching to vertical direction for new cells
table-state/dir: 0x1
]
table-go-back: func [table-state] [
; =row. or =column.
if all [table-state/savepos table-state/savedir] [
  table-state/curpos: table-state/savepos
  table-state/dir: table-state/savedir
  table-state/currow: pick table-state/table table-state/curpos/y
  table-state/curcell: pick table-state/currow/contents table-state/curpos/x
  table-state/savepos: table-state/savedir: none
]
]
; make room for a new cell in the table and create a cell object
make-cell: func [table-state pos style' contents /span spanrc /local row cell] [
row: make-row table-state pos none
; do we need to add cells?
if pos/x > length? row/contents [
  insert/dup tail row/contents none pos/x - length? row/contents
]
table-state/curpos: pos
table-state/currow: row
; set this as the current cell (creates it if it does not exist)
either cell: pick row/contents pos/x [
  if cell/type = 'span [
   ; span inside a span breaks the outer span
   either span [
    ; is the outer span already destroyed?
    either cell/reference/position [
     ; no, break it
     break-span table-state cell/reference pos pos + spanrc
     ; try again
     return make-cell/span table-state pos style' contents spanrc
    ] [
     ; yes, just overwrite it
     poke row/contents pos/x none
     ; try again
     return make-cell/span table-state pos style' contents spanrc
    ]
   ] [
    ; was this a destroyed cell?
    either cell/reference/position [
     ; no
     cell: cell/reference
    ] [
     ; yes, overwrite it
     poke row/contents pos/x none
     ; try again
     return make-cell table-state pos style' contents
    ]
   ]
  ]
  cell/style: merge-style cell/style style'
  if span [
   ; redefine the span?
   if cell/spansize [
    ; if we're reducing the span size we need to break the old span up
    if any [cell/spansize/y > spanrc/y cell/spansize/x > spanrc/x] [
     break-span table-state cell pos pos + spanrc
     ; try again
     return make-cell/span table-state pos style' contents spanrc
    ]
   ]
   cell/spansize: spanrc
  ]
  cell/out: contents
  table-state/curcell: cell
] [
  poke row/contents pos/x table-state/curcell: context [
   ; cell type - real cell (cells hidden by a spanning cell have type 'span)
   type: 'cell
   ; cell position in the table
   position: pos
   ; cell contents
   out: contents
   style: make-cell-style table-state style' row pick table-state/columns pos/x
   ; spanning cell?
   spansize: if span [spanrc]
   ; cell is header either if it is in the first row, or if vertical mode has been specified
   ; and it is in the first column; unless table is headerless
   ;header: either table-state/style [
   ; if not table-state/style/headerless [
   ; either table-state/style/vertical [pos/x = 1] [pos/y = 1]
   ; ]
   ;] [
   ; pos/y = 1
   ;]
  ]
  table-state/curcell
]
]
; break up a span (because a new span overlaps it)
break-span: func [table-state spancell breakstart breakend /local cellstart cellend] [
; we need to "subtract" the rectangle from breakstart to breakend from spancell
; we break spancell horizontally or vertically depending on the current table mode
cellstart: spancell/position
cellend: spancell/position + spancell/spansize
; spancell must be destroyed
; this flags it as destroyed when referencing from another cell
spancell/spansize: none
spancell/position: none
; remove it from the table
poke get in pick table-state/table cellstart/y 'contents cellstart/x none
; remove position from original style (as it must not propagate to new cells)
if spancell/style [spancell/style/position: none]
either table-state/dir = 1x0 [
  ; horizontal mode
  if breakstart/y > cellstart/y [
   ; we have a row at the top
   make-span table-state context [
    start: cellstart
    end: cellend
    end/y: breakstart/y - 1
   ]
   cellstart/y: breakstart/y
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakend/y < cellend/y [
   ; we have a row at the bottom
   make-span table-state context [
    start: cellstart
    start/y: breakend/y + 1
    end: cellend
   ]
   cellend/y: breakend/y
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakstart/x > cellstart/x [
   ; we have a column at the left
   make-span table-state context [
    start: cellstart
    end: cellend
    end/x: breakstart/x - 1
   ]
   cellstart/x: breakstart/x
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakend/x < cellend/x [
   ; we have a column at the right
   make-span table-state context [
    start: cellstart
    start/x: breakend/x + 1
    end: cellend
   ]
   cellend/x: breakend/x
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
] [
  ; vertical mode
  if breakstart/x > cellstart/x [
   ; we have a column at the left
   make-span table-state context [
    start: cellstart
    end: cellend
    end/x: breakstart/x - 1
   ]
   cellstart/x: breakstart/x
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakend/x < cellend/x [
   ; we have a column at the right
   make-span table-state context [
    start: cellstart
    start/x: breakend/x + 1
    end: cellend
   ]
   cellend/x: breakend/x
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakstart/y > cellstart/y [
   ; we have a row at the top
   make-span table-state context [
    start: cellstart
    end: cellend
    end/y: breakstart/y - 1
   ]
   cellstart/y: breakstart/y
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
  if breakend/y < cellend/y [
   ; we have a row at the bottom
   make-span table-state context [
    start: cellstart
    start/y: breakend/y + 1
    end: cellend
   ]
   cellend/y: breakend/y
   ; new span gets the style from original span
   set-cell table-state spancell/style [ ]
  ]
]
]
; make a cell reference (cell hidden by a spanned cell)
make-reference: func [table-state pos cell /local row old save] [
; make room if needed
row: make-row table-state pos none
; do we need to add cells?
if pos/x > length? row/contents [
  insert/dup tail row/contents none pos/x - length? row/contents
]
; are we overlapping another span?
if all [old: pick row/contents pos/x old/type = 'span old/reference/position] [
  ; need to save curcell, curpos, currow and out as break-span may change them
  save: reduce bind [curcell curpos currow] in table-state 'self
  break-span table-state old/reference cell/position cell/position + cell/spansize
  set bind [curcell curpos currow] in table-state 'self save
]
; overwrite any cell
poke row/contents pos/x local: context [
  ; cell type - span
  type: 'span
  reference: cell
]
local
]
; add a new cell (=cell command, or new paragraph)
add-cell: func [table-state args contents /local pos] [
; see if position has been specified
if object? args [pos: args/position]
; if not, assign default position (curpos + dir)
if not pos [
  pos: any [table-state/curpos 1x0]
  pos: pos + table-state/dir
]
; create cell or pick existing one at given position
args: merge-style/copy default-cell args
make-cell table-state pos args contents
; update table size
table-state/size: max table-state/size pos
]
; set the style for the current cell (=span=cell case)
set-cell: func [table-state args contents] [
either all [args args/position] [
  ; if position has been specified, this is a new cell
  add-cell table-state args contents
] [
  table-state/curcell/style: merge-style table-state/curcell/style args
  table-state/curcell/out: contents
]
]
; create a cell span
make-span: func [table-state args /local cell pos] [
; arguments must be provided, and valid, otherwise command is ignored
if not all [
  object? args pair? args/start pair? args/end
  set bind [start end] in args 'self reduce [min args/start args/end max args/start args/end]
] [exit]
; not really a span... make a normal cell
if args/start = args/end [
  make-cell table-state args/start none [ ]
  exit
]
; the idea here is that we create a cell with spansize
; then we put a reference to it in all the cells it covers
if cell: make-cell/span table-state args/start none [ ] 1x1 + args/end - args/start [
  pos: args/start + 1x0
  while [pos/y <= args/end/y] [
   while [pos/x <= args/end/x] [
    make-reference table-state pos cell
    pos: pos + 1x0
   ]
   pos/x: args/start/x
   pos: pos + 0x1
  ]
  ; update table size
  table-state/size: max table-state/size args/end
]
]

11.7 Stage 3 rewrite rules (optimization, handling of =data and =repeat)

〈Stage 3 rewrite rules (optimization, handling of =data and =repeat)〉 ≡

['table-proto ['opts set x skip | (x: none)] y: to end] [(stage3-ctx/generate-table x y)]
[into ['hidden-table]] [ ]
[into ['repeat 'opts set x block! into ['enum y: to end]]] [[enum [repeat opts [(x)] (y)]]]
[into ['repeat 'opts set x block! into ['bullets y: to end]]] [[bullets [repeat opts [(x)] (y)]]]
[into ['repeat 'opts set x block! y: to end]] [(stage3-ctx/eval-repeat x y)]
[into ['data none!]] [ ]
[into ['data set x block!]] [(stage3-ctx/eval-data x)]
[into ['qlink none!]] [ ]
['qlink set x string!] [(stage3-ctx/eval-qlink x)]
['link-proto 'opts set x url!] [link opts [target: (x)]]
['link-proto 'opts set x string!] [(stage3-ctx/eval-link x)]
['anchor 'opts set x string!] [anchor opts [name: (x)]]
; unwind cells outside of table (can be created by =data)
[y: into ['cell-if opt ['opts skip] opt ['span skip] opt ['header] x: to end]]
[(either 'row = first head y [y/1/1: 'cell copy/part y 1] [x])]
; unwind block level inside paras (can be created by =data)
[into [x: 'para 'opts set z block! any [y: into [block-level to end] :y break | skip] into [block-level to end] to end]]
[[(copy/part x y)] (copy/part y 1) [para opts [(z)] (next y)]]
[into [x: 'para any [y: into [block-level to end] :y break | skip] into [block-level to end] to end]]
[[(copy/part x y)] (copy/part y 1) [para (next y)]]
[y: ['para | 'item] opt ['opts skip] any [z: into ['para to end] :z break | skip] into ['para to end] to end] [(
rewrite copy y [
[into ['para 'opts set x block! y: to end]] [[font opts [(x)] (y)]]
[into ['para opt ['opts skip] y: to end]] [(y)]
]
)]
; box with only title
['box ['opts set x block! | (x: [ ])] into ['title y: to end] end] [box opts [(x)] [para (y)]]
; optimizations
[into ['bold]] [ ]
[into ['italic]] [ ]
[into ['strike]] [ ]
[into ['font opt ['opts skip]]] [ ]
; font with no opts
[into ['font x: [block! | string!] to end]] [(x)]
['font 'opts set x block! into ['bold y: to end] end] [font opts [(x) bold: (true)] (y)]
['font 'opts set x block! into ['italic y: to end] end] [font opts [(x) italic: (true)] (y)]
['bold into ['font 'opts set x block! y: to end] end] [font opts [(x) bold: (true)] (y)]
['italic into ['font 'opts set x block! y: to end] end] [font opts [(x) italic: (true)] (y)]
['link 'opts set x block! into ['font 'opts set y block! z: to end] end] [link opts [(x) (y)] (z)]
['alink 'opts set x block! into ['font 'opts set y block! z: to end] end] [alink opts [(x) (y)] (z)]
['anchor 'opts set x block! into ['font 'opts set y block! z: to end] end] [anchor opts [(x) (y)] (z)]
['para ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [para opts [(x) (y)] (z)]
['header1 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header1 opts [(x) (y)] (z)]
['header1* ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header1* opts [(x) (y)] (z)]
['header2 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header2 opts [(x) (y)] (z)]
['header2* ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header2* opts [(x) (y)] (z)]
['header3 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header3 opts [(x) (y)] (z)]
['header3* ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header3* opts [(x) (y)] (z)]
['header4 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header4 opts [(x) (y)] (z)]
['header5 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header5 opts [(x) (y)] (z)]
['header6 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header6 opts [(x) (y)] (z)]
['item ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [item opts [(x) (y)] (z)]
['check ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [check opts [(x) (y)] (z)]
['term ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [term opts [(x) (y)] (z)]
['desc ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [desc opts [(x) (y)] (z)]
['title ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [title opts [(x) (y)] (z)]

11.7.1 Words used by the rewrite rules

〈Words used by the rewrite rules〉 ≡

11.8 Other functions

〈Stage 3 FSM〉 +≡

eval-qlink: func [target /local a] [
either a: select anchors target [
  compose/deep [alink opts [target: (target)] (skip a 3)]
] [
  compose/deep [link opts [(a: process-link target)] (select a [text:])]
]
]
eval-link: func [target] [
either find anchors target [
  compose/deep [alink opts [target: (target)]]
] [
  compose/deep [link opts [(process-link target)]]
]
]
eval-data: func [opts /local val p] [
opts: construct/with opts context [name: none index: none]
val: eval-var opts/name
if not val [return [ ]]
either object? val [
  p: in pickers val/type
  if p [
   do get in get p type?/word opts/index val opts/index
  ]
] [
  reduce [val]
]
]
pickers: context [
csv: context [
  none!: func [val index] [
   "Not yet."
  ]
  integer!: func [val index] [
   csv-row/none! context [content: pick val/contents index] none
  ]
  pair!: func [val index] [
   pick pick val/contents index/y index/x
  ]
]
table: context [
  none!: func [val index] [
   "Not yet."
  ]
  integer!: func [val index] [
   "Not yet."
  ]
  pair!: func [val index] [
   val: pick get in pick val/table index/y 'contents index/x
   if val/type = 'span [val: val/reference]
   cell/none! val none
  ]
]
table-proto: context [
  none!: integer!: pair!: func [val index] [
   rewrite val/contents rewrite-rules
   eval-data compose [name: (val/name) index: (index)]
  ]
]
alias: context [
  none!: integer!: pair!: func [val index] [
   eval-data compose [name: (val/dest) index: (index)]
  ]
]
cell: context [
  none!: integer!: pair!: func [val index] [
   compose/deep [[
    cell-if (either val/style ['opts] [[ ]])
     (either val/style [reduce [make-style/ignore val/style [position]]] [[ ]])
     (either val/spansize ['span] [[ ]]) (any [val/spansize [ ]])
     (val/out)
   ]]
  ]
]
csv-row: context [
  none!: func [val index] [
   either empty? val/content [
    [ ]
   ] [
    index: make block! 3 * length? val/content
    insert index first val/content
    foreach cell next val/content [
     insert insert tail index " " cell
    ]
    index
   ]
  ]
  integer!: func [val index] [
   pick val/content index
  ]
  pair!: func [val index] [
   pick val/content index/x
  ]
]
table-row: none
]
eval-var: func [var] [
if var: any [select/skip last local-vars var 2 select/skip vars var 2] [first var]
]
local-vars: [[ ]]
eval-repeat: func [spec body /local result var1 var2 val val2 val3 iter] [
result: make block! 16
parse spec [
  integer! end (loop spec/1 [insert tail result copy/deep body]) ; no need to recurse in this case
  |
  set var1 [word! | string! | into [some [word! | string!]]] opt 'in set var2 [word! | string!] end (
   if val: eval-var form var2 [
    insert/only tail local-vars local: copy last local-vars
    local: tail local
    if object? val [
     iter: in iterators val/type
     if iter [iter: get iter iter/iterate val local var1 result body]
    ]
    remove back tail local-vars
   ]
  )
  |
  set var1 [word! | string!]
  opt 'from set val skip
  opt 'to set val2 skip
  [['by | 'skip | 'step | none] set val3 skip | (val3: none)] end (
   attempt [
    insert/only tail local-vars local: copy last local-vars
    local: tail local
    for i val val2 any [val3 either val > val2 [-1] [1]] [
     clear local
     insert insert tail local form var1 form i
     insert tail result rewrite copy/deep body rewrite-rules ; recurse
    ]
    remove back tail local-vars
   ]
  )
]
result
]
iterators: context [
csv: context [
  iterate: func [val locals var result body /local bind-var] [
   bind-var: get in binders either block? var ['multi] ['single]
   foreach row val/contents [
    clear locals
    bind-var locals var row
    insert tail result rewrite copy/deep body rewrite-rules ; recurse
   ]
  ]
  binders: context [
   single: func [local var row] [
    insert insert tail local form var context [type: 'csv-row content: row]
   ]
   multi: func [local vars row] [
    foreach var vars [
     insert insert tail local form var row/1
     row: next row
    ]
   ]
  ]
]
table: context [
  iterate: func [table locals var result body /local bind-var rows] [
   bind-var: get in binders either block? var ['multi] ['single]
   rows: table/table
   if not all [table/style any [table/style/headerless table/style/vertical]] [rows: next rows]
   foreach row rows [
    clear locals
    bind-var locals var row
    insert tail result rewrite copy/deep body rewrite-rules ; recurse
   ]
  ]
  binders: context [
   single: func [local var row] [
    insert insert tail local form var make row [type: 'table-row]
   ]
   multi: func [locals vars row /local i cell] [
    i: 1
    foreach var vars [
     cell: pick row/contents i
     if cell/type = 'span [cell: cell/reference]
     insert insert tail locals form var cell
     i: i + 1
    ]
   ]
  ]
]
table-proto: context [
  iterate: func [tablep local var result body] [
   rewrite tablep/contents rewrite-rules
   table/iterate eval-var tablep/name local var result body
  ]
]
alias: context [
  iterate: func [alias locals var result body /local iter] [
   alias: eval-var alias/dest
   iter: in iterators alias/type
   if iter [
    iter/iterate alias locals var result body
   ]
  ]
]
]

Qtask Markup Language - parser and other common code

Gabriele Santilli<giesse@rebol.it>

21-Aug-2006, 2.37.1

Contents:

Warning

1. TODO

2. Introduction

3. Overview

3.1 Example usage

4. The parser (stage 1)

4.1 The parse rules

4.2 parse rule for commands

4.2.1 Special commands

4.2.2 Words used by the commands rule

4.3 The set-magic function

5. Balancing (stage 2): the stage2 function

6. Command options parsing

6.1 Option values' types

6.1.1 Subrules and other words used by the parse rules

6.1.2 Rule for parsing a generic value

6.1.3 Rules for color!

6.1.4 Rules for flag!

6.1.5 Flag actions

6.1.6 Rules for set-word!

6.1.7 Set-word actions

6.1.8 Set-word expected types

6.2 Type map

6.3 Object map

6.4 Parse the args string into an object

6.4.1 parse-arg-string's locals

6.5 parse-arg-string's support functions

7. Balancing (stage 2): the Finite State Machine

7.1 The stage 2 FSM states

7.2 Stage 2 functions

7.3 Initialization and termination of the FSM

7.4 Additional code for the stage2 function

7.5 The merge-style function

8. Scan a text string and return a QML document tree

8.1 Search a QML document tree for a substring

8.1.1 search's locals

9. Searching a QML document

10. Setting default options

11. Generating QML document tree (stage 3)

11.1 Header numbering

11.2 Postprocessing

11.3 Stage 3 FSM

11.4 CSV data handling

11.5 Other states

11.6 Generating tables

11.7 Stage 3 rewrite rules (optimization, handling of =data and =repeat)

11.7.1 Words used by the rewrite rules

11.8 Other functions