Qtask Markup Language - parser and other common code Purpose: { This program implements the base for QML (Qtask Markup Language) converters (for example it's the base for a QML to XHTML converter used in Qtask), by implementing the parsing of a QML text string into a QML document tree. } Author: "Gabriele Santilli" EMail: giesse@rebol.it File: %qml-base.r License: { Copyright (c) 2006 Prolific Publishing, Inc. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. } Date: 21-Aug-2006 Version: 2.37.1 ; majorv.minorv.status ; status: 0: unfinished; 1: testing; 2: stable History: [ 16-Feb-2006 1.1.0 "History start" 16-Feb-2006 1.2.0 "Fixed a bug with escape command parsing" 20-Feb-2006 1.3.0 "Moved options parsing from emitters to here" 13-Mar-2006 1.4.0 "Options parsing now collects values without a name; =, now treated as end command" 13-Mar-2006 1.5.0 "Added second pass to balance commands" 17-Mar-2006 1.6.0 "PARSE rules no more eat newlines after commands" 17-Mar-2006 1.7.0 "Removed header option for boxes" 17-Mar-2006 1.8.0 "=- now eats one newline" 18-Mar-2006 1.9.0 "Added args for =row and =column" 18-Mar-2006 1.10.0 "Changed box options" 18-Mar-2006 1.11.0 "=>>> etc.; fixed space only lines" 21-Mar-2006 1.12.0 "Spaces at beginning of lines are now ignored" 23-Mar-2006 1.13.0 "Changed handling of =cell, =row and =column in second pass" 24-Mar-2006 1.14.0 "Spaces no more required after ]" 27-Mar-2006 1.15.0 "New command options handling" 27-Mar-2006 1.16.0 "Added comment handling to second pass" 29-Mar-2006 1.17.0 "Minor changes to options handling; supports dashed: color etc. too" 29-Mar-2006 1.18.0 "Split =c and =center into two separate commands" 29-Mar-2006 1.19.0 "=word and =def as aliases for =: and =::" 29-Mar-2006 1.20.0 "Added new comma-pair! option type" 29-Mar-2006 1.21.0 "Second pass now eats newlines that should be ignored, and handles inline cmds in block mode differently" 30-Mar-2006 1.22.1 "Fixed table balancing" 30-Mar-2006 1.23.1 "Added boxcenter etc to =table" 30-Mar-2006 1.24.1 "Added #FFF etc. as color" 30-Mar-2006 1.25.1 "Changed escape commands (=example[end])" 30-Mar-2006 1.26.1 "Added =left, =right, =l, =r" 30-Mar-2006 1.27.1 "Added =span" 1-Apr-2006 1.28.1 "Added =justify, =j" 5-Apr-2006 1.29.1 "Added =table[space]" 5-Apr-2006 1.30.1 "Added shadow and rounded to =table" 6-Apr-2006 1.31.1 "Added all flag for table, row and column" 6-Apr-2006 1.32.1 "= and =, now close =left etc. too" 18-Apr-2006 1.33.1 "Fixed balancing for =toc" 18-Apr-2006 1.34.1 "Added =o and =x" 21-Apr-2006 1.35.1 "Added =s and =u" 21-Apr-2006 1.36.1 "Now collects font style for =toc" 21-Apr-2006 1.37.1 "Now collects font style for headers numbers" 22-Apr-2006 1.38.1 "Moved merge-style here (from xhtml emitter) and added /copy (fixes bug)" 22-Apr-2006 1.39.1 "Added =font[space]" 22-Apr-2006 1.40.1 "Added new color keywords, improved compatibility with older REBOLs" 24-Apr-2006 1.41.1 "Added =row. and =column." 24-Apr-2006 1.42.1 "New =image (now inline)" 24-Apr-2006 1.43.1 "Added =4, =5 and =6" 26-Apr-2006 1.44.1 "Added a few missing colors" 27-Apr-2006 1.45.1 "The / char is now an alias for . (ending commands)" 27-Apr-2006 1.46.1 "Added initial support for =anchor, changed =link" 28-Apr-2006 1.47.1 "Finished =anchor support" 12-May-2006 2.1.0 "Started rewriting as RLP with new architecture" 15-May-2006 2.2.0 "Added table support" 15-May-2006 2.3.0 "Added =csv" 17-May-2006 2.4.0 "Added =data" 17-May-2006 2.5.0 "Added rewriting engine" 17-May-2006 2.6.0 "Added parsing for =repeat" 17-May-2006 2.7.0 "Added balancing for =repeat" 17-May-2006 2.8.0 "Basic =repeat support" 18-May-2006 2.9.0 "New table handling, supporting =repeat in =table" 18-May-2006 2.10.0 "Initial toc support" 19-May-2006 2.11.0 "Added numbering (finished toc), anchors" 19-May-2006 2.12.0 "Fixed links and qlinks" 19-May-2006 2.13.0 "Fixed =#" 19-May-2006 2.14.0 "=data[name] when name is a =table" 20-May-2006 2.15.0 "Added search" 23-May-2006 2.16.0 "Improved =repeat and =data, testing" 24-May-2006 2.17.0 "Rewritten numbering/counting to be =repeat friendly" 24-May-2006 2.18.0 "Added optimizations" 25-May-2006 2.19.0 "Changed =table[space]" 25-May-2006 2.20.1 "Added support for default options" 26-May-2006 2.21.1 "Made words optional in the repeat dialect" 26-May-2006 2.22.1 "Auto naming for tables and csv" 26-May-2006 2.23.1 "Added =table[headerless]" 26-May-2006 2.24.1 "=repeat on table now skips header row" 29-May-2006 2.25.1 "Fixed box with only a title" 2-Jun-2006 2.26.1 "Added defaults for =repeat and =data" 7-Jun-2006 2.27.1 "Fixed problem with =data, =image and =anchor as =box title" 14-Jun-2006 2.28.1 "Now removes anchors from TOC" 14-Jun-2006 2.29.1 "Search now lists anchor exact matches too" 14-Jun-2006 2.30.1 "Added =table[horizontal vertical]" 16-Jun-2006 2.31.1 "Added =image[space] and changed =image[image: ...] to =image[src: ...]" 16-Jun-2006 2.32.1 "Added /keep refinement to scan-doc" 16-Jun-2006 2.33.1 "Added process-link function (to be overridden by users)" 29-Jun-2006 2.34.1 "Fixed =span" 21-Jul-2006 2.35.1 "=l meant both one-line =left and abbreviation for =link; fixed (now =link can be abbreviated as =li)" 21-Jul-2006 2.36.1 "Fixed problem with =table[borderless]" 21-Aug-2006 2.37.1 "Release 2.0i: first public release" ] \note Warning This documentation is incomplete. It will be finished as soon as possible. /note ===TODO *sections (in source...) *get-anchors, get-links etc. *debug colors: olivedrab limegreen lavenderblush greenyellow goldenrod blueviolet aquamarine *fix "=c =1 something" ===Introduction QML is a markup language designed by Baron R.K. von Wolfsheild for Qtask. This program parses a string of QML text into a QML document tree (that can then be converted to, say, XHTML with an appropriate "emitter"). Parsing is done with a pipeline made of three stages. The first stage uses |parse| on the input string and sends QML commands to the second stage; the second stage, whose main function is greatly simplifying the implementation of the third stage, ensures command balancing ("open" commands must be followed by "close" commands in the correct order, and correctly nested; also there are two levels of commands, "block" and "inline", that must nest properly), preprocessing and translating commands, and then sends them to the third stage; the third stage converts the stream of commands into a QML document tree. The second and third stages are implemented as finite state machines (FSMs). The document tree that comes out of the third stage is optimized with a tree rewriting engine before returning it. The three rewriting rules also process the =repeat and =data commands; after the document has been completely expanded and optimized (which may require more than one pass), any table of contents is created and headers are numbered; numbers for enum lists are calculated too. (Numbering and TOC creation has to be done last because of =repeat.) ===Overview This section is the one users of this scanner are interested the most, because it shows the "interface". The QML scanner uses a stack-based finite state machine interpreter and a data structure matching and rewriting engine, so we include them here. The |qml-scanner| object is defined, which contains all the code for the scanner. Two functions are available for users: |scan-doc| and |search|. |scan-doc| takes a string as its argument and returns a QML document tree (a block). Please note that |scan-doc| is not reentrant and you cannot call it recursively (not that there is any way for you to do so). You can use the |/with| refinement to provide a block of pairs of strings, to set default options for commands such as =box and =table. (See the |-example1-|.) It is also possible to use the |/keep| refinement to stop |scan-doc| from resetting default options, so that you can use the ones that have been set in a previous session; note that if you use |/keep|, |/with| is ignored, so you should not provide both. (There is also a limitation, in that |/keep| does not actually keep the default value for |"alias"|.) |search| takes a string and a QML document tree; it performs a simple substring search inside the document (ignoring formatting etc.). A block is returned, with pairs of header and text snippet (with one exception explained below); the text snippet is a string of about 100 characters around the substring match, and the header is a QML header node (a block) for the "section" that contains the text (i.e., the last header encountered before the string was found). The returned block contains a pair of header and snippet for each occurrence of the substring in the document. If the substring is found before encountering any header in the document, the result block will contain the word |'doc-start| in place of a header block. Note that the text snippet does not contain any formatting (but paragraphs are separated with a newline). |search| also returns anchor name exact matches: if the document contains an anchor whose name matches the |text| string exactly, then the first pair in the result block will be the header for the section that contains the anchor (or |'doc-start|), and the anchor node. (You can distinguish this case because the "snippet" is a block instead of a string; also, this can only happen for the first pair in the result block, and since anchor names need to be unique only one can match.) -main-: #include %fsm.r #include %rewrite.r qml-scanner: context [ -parser- -refinements- -default-handling- -stage2- -stage3- -search- ; the interface search: func [ "Search a QML document tree for a substring" doc [block!] "The QML document tree (as returned by SCAN-DOC)" text [string!] "The substring to search for" /local -search-locals- ] [ -search-doc- ] scan-doc: func [ "Parse a QML text string and return a QML document tree" text [string!] /with defaults [block!] "Default options for commands" /keep "Keep default options from previous session (ignores /with)" ] [ -scan-doc- ] ] ---Example usage To scan a document, just call |scan-doc| with the document text; you can use the |/with| refinement to set default options for commands. (Commands that support default options are: =toc (sets the default number style), =alias (sets the default "magic" sequence), =table, =row, =column, =cell, =box and =image.) You can pass the resulting document tree to an emitter (such as the XHTML emitter) to generate a document that the user can read. To search a substring in a document just call |search| passing the document tree and the substring to search for. -example1-: text: {...some QML text...} doc: qml-scanner/scan-doc text doc: qml-scanner/scan-doc/with text ["box" "yellow dashed" "table" "all dotted"] search-results: qml-scanner/search doc "substring" ===The parser (stage 1) The first stage of the pipeline is the parser. The |parse-qml| function just uses |parse| on the |text| string. -parser-: -parse-rules- -set-magic-function- parse-qml: func [text [string!] magic [string! none!]] [ ; initialize magic char set-magic any [magic "="] parse/all text qml-rule ] ---The parse rules The |parse-qml| function uses |qml-rule| to parse the text; everything that is not a command, is considered text and is handed over to the second stage (with the |stage2| function, see |-stage2-|). The |txt-chars| charset is initialized by the |set-magic| function (see |-set-magic-function-|). -parse-rules-: qml-rule: [ some [commands | text] ] commands: [ -commands-rule- ] txt: none txt-chars: none text: [copy txt some txt-chars (stage2 [text:] txt)] ---|parse| rule for commands The heart of the parser is the |commands| rule. The newline character is treated like a command (produces the |"^/"| command); spaces at the end and at the beginning of lines are ignored. (As already noted, the |stage2| function is used to hand the commands over to the second stage of the pipeline; see |-stage2-|.) Other commands are introduced by |magic-char| (set by the |set-magic| function, see |-set-magic-function-|); there are some special commands that require their own parse rules, but most commands are parsed by the rule shown here. This parses normal opening commands (like |"=cmd"|), opening commands with options (|"=cmd[...]"| or |"=cmd{...}"|), and ending commands (|"=cmd."| or |"=cmd/"|); commands with options eat at most one space following them (i.e., |"=cmd[...]text"| is exactly the same as |"=cmd[...] text"|; if you need a space after a command with options, use two spaces); ending commands do not eat any space (so |"=cmd.text"| is *not* the same as |"=cmd. text"|); commands without options require at least a space following them and will eat all spaces following them. Note that the |mk: magic-char :mk| construct allows putting commands one after another without spaces between them (e.g., |"=b=i"|). Note that this rule also parses the dot command |"=."|. As you can see each command is sent to the second stage as a pair of values: the command as a string, and the command's options (which can be |none|). As you have already seen in |-parse-rules-|, this is also true for text, which is sent as the |text:| (a |set-word!|) command with the text as options (a |set-word!| is used so that there is no clash with other commands, e.g. if we ever need to add a =text command and so on). -commands-rule-: any spc newline any spc (stage2 "^/" none) | magic-char [ -special-commands- | ; ignore a = at the end of the text end | copy cmd any cmd-chars [ "[" copy options to "]" skip opt spc (stage2 cmd options) | "{" copy options to "}" skip opt spc (stage2 cmd options) | ["." | "/"] (stage2 join any [cmd ""] "." none) | [some spc | mk: [newline | magic-char] :mk | end] (stage2 cmd none) ] | ; malformed command? (stage2 [text:] magic-char) ] +++Special commands Some command require custom parse rules. I've tried to keep this to a minimum, but *Reichart loves special cases*. ;-) *|magic-char| followed by |magic-char| allows inserting |magic-char| in the text; *|magic-char| followed by a space or a newline is the *space* command (i.e., |" "|); *|"alias"| is special for two reasons: first, its option is not enclosed in braces, but follows the =alias command (like |"=alias ~"|); second, it does not produce a command, but calls the |set-magic| function, which changes the value of |magic-char| and other things; (note, that since we are using |some cmd-chars| to parse the new |magic-char|, you cannot write something like |"=alias ="|, which is probably ok, but maybe should be fixed); *|"csv"| is similar to an escape command (see below), in that text between |"=csv"| and |"=csv."| is not processed as QML text, but as CSV text; however, |"=csv."| must happen at the beginning of a line (it also eats spaces and one newline). The CSV data is sent as the command's options; *escape commands are used to include non-QML text in a QML document; you can specify an ending sequence for the escaped text using braces, otherwise the respective ending command is used as the ending sequence (e.g., |"=html ... =html."| or |"=html[END] ... END"|); note that one newline is eaten; *the horizontal ruler command |"-"| allows any number of dashes to be used (i.e., |"=-"| is the same as |"=---"| and so on) and eats one newline; *the indent command |">"| takes as option the amount of indentation, but this is specified by the number of |">"| characters following |magic-char| (e.g., |"=>>>"| becomes |">" 3|); *the empty command (|"=[...]"| or |"={...}"| which become |["" "..."]|) is special because it does not eat any following space; *the comma command is special because it does not eat any following spaces (behaving like the space and dot commands); *the repeat command is special because its options are parsed as REBOL block; this is the only case where you can have a |"]"| inside options created with |"["| or a |"}"| inside options created with |"{"|; the |rebol-options| rule is used to parse its options. -special-commands-: magic-char (stage2 [text:] magic-char) | [" " | mk: newline :mk] (stage2 " " none) | "alias" some spc copy cmd some cmd-chars any spc (set-magic cmd) | "csv" [ "[" copy options to "]" skip any spc opt newline (options: refinements/parse-arg-string "csv" any [options ""]) | "{" copy options to "}" skip any spc opt newline (options: refinements/parse-arg-string "csv" any [options ""]) | any spc opt newline (options: context [name: show: none]) ] (csv: make block! 256) some [ [magic-char "csv" ["." | "/"] any spc opt newline | end] (stage2 "csv" make options [contents: csv]) break | [copy txt to newline newline | copy txt to end] (append/only csv parse/all txt ",") ] | copy cmd escape-cmd [ "[" copy options to "]" skip any spc opt newline | "{" copy options to "}" skip any spc opt newline | any spc opt newline (options: rejoin [magic-char cmd "."]) ] [copy txt to options options any spc opt newline | copy txt to end] (stage2 cmd txt) | some "-" [some spc opt newline | newline | mk: magic-char :mk | end] (stage2 "-" none) | copy cmd some ">" [some spc | mk: [newline | magic-char] :mk | end] (stage2 ">" length? cmd) | "[" copy options to "]" skip (stage2 "" options) | "{" copy options to "}" skip (stage2 "" options) | "," (stage2 "," none) | "repeat" (options: make block! 16) [ (opt-open-char: "[" opt-close-char: "]") rebol-options (stage2 "repeat" options) | (opt-open-char: "{" opt-close-char: "}") rebol-options (stage2 "repeat" options) | ["." | "/"] (stage2 "repeat." none) ] +++Words used by the |commands| rule The |commands| parse rule uses a number of words that we need to make local. It also uses the |rebol-options| subrule which we define here. |cmd-chars| and |magic-char| are initialized by the |set-magic| function (see |-set-magic-function-|); |escape-cmd| is the list of escape commands. |rebol-options| uses the |load-next| function to parse the options as REBOL values; in turn it uses |load/next| to parse the text and tries to recover in case of errors. -parse-rules-: mk: cmd: options: csv: none spc: charset " ^-" spc+: charset " ^-^/" cmd-chars: none magic-char: none escape-cmd: ["HTML" | "REBOL" | "MakeDoc" | "Example"] opt-open-char: "[" opt-close-char: "]" rebol-options: [ opt-open-char txt: (txt: load-next options txt) :txt some [any spc+ opt-close-char break | end break | txt: (txt: load-next options txt) :txt] opt spc ] load-next: func [out text /local val] [ if error? try [ set [val text] load/next text insert/only tail out val ] [ insert tail out copy/part text text: any [find text opt-close-char tail text] ] text ] ---The |set-magic| function This function initializes the values of |magic-char|, |cmd-chars| and |txt-chars|. |cmd-chars| is a charset containing the characters that are valid in command names; |txt-chars| contains the characters that are valid for text outside commands. -set-magic-function-: set-magic: func [magic [string!]] [ ; magic cannot be empty if empty? magic [magic: "="] magic-char: magic cmd-chars: complement charset join " ^-^/[]{}./" first magic-char txt-chars: complement charset join "^/" first magic-char ] ===Balancing (stage 2): the |stage2| function The second stage of the pipeline is meant to simplify the third stage by "normalizing" the command stream. It takes care of ensuring command balancing, and handles the space, dot and comma commands (|"= "|, |"=."| and |"=,"|). Parsing of command options is also done in the |stage2| function (just because this is a convenient place to do it, even though conceptually it belongs to the first stage). This stage uses a Finite State Machine, handling each command as an event. Please see the documentation of the FSM interpreter for more informations (linked in from |-main-|). As seen in the first stage, the |stage2| function is invoked with a command and its options. A command can be a |string!| or a |set-word!|; for simplicity set-words can be passed in a block (as in |stage2 [text:] txt|), so if |cmd| is a block we just extract the first value. The options are parsed with the |parse-command-options| function; the |cmd| and |opts| words are then set in the |stage2-ctx| object, so that they are available to the FSM rules, and |cmd| is issued as an event for the state machine. (|stage2| needs to do another thing after this, we'll discuss this later on.) -stage2-: stage2-fsm: make fsm! [ ] stage2: func [cmd opts] [ if block? cmd [cmd: first cmd] stage2-ctx/opts: parse-command-options stage2-ctx/cmd: cmd :opts stage2-fsm/event cmd -additional-stage2-code- ] stage2-ctx: context [ cmd: opts: none -stage2-fsm- ] -stage2-initialization- -merge-style-function- ===Command options parsing Command options (also called "refinements" in the code) are parsed by the |parse-command-options| function, which is called by the |stage2| function (although, conceptually options parsing belongs to the first stage; the |stage2| function parses the options before actually processing the command in the FSM). Only some commands need their options to be parsed; if |cmd| is one of them, and |options| is a string, then the |parse-arg-string| function is used. Otherwise |options| is returned as-is. -refinements-: parse-command-options: func [cmd options] [ either all [ string? options find [ "table" "row" "column" "cell" "box" "image" "font" "f" "span" "data" ] cmd ] [ refinements/parse-arg-string cmd options ] [ options ] ] refinements: context [ -types- -type-map- -object-map- -parse-arg-string-support- parse-arg-string: func [cmd args /local -pas-locals-] [ -parse-arg-string-code- ] ] ---Option values' types Options are parsed as a list of *values*; option values can be of a number of different *types*, each recognized by syntax (like in REBOL) and parsed by a specific |parse| rule. Values can be of one of the following types: |flag!|, |set-word!|, |color!|, |string!|, |integer!|, |url!|, |percent!|, |pair!| and |comma-pair!|; after being parsed, each value is represented by a REBOL value, and each type is mapped to a REBOL type: |flag!| is mapped to |word!|; |set-word!|, |string!|, |integer!|, |url!| and |pair!| are mapped to the respective REBOL types; |color!| is mapped to one of |issue!|, |refinement!| or |tuple!| (depending on how the color was specified, if as a hex string, a color name or a tuple); |percent!| is mapped to |money!|; |comma-pair!| is mapped to |block!|. The |types| object holds the |parse| rules for the option types. The |flag-word| and |set-word| rules are set dynamically based on the command that is being parsed. Note that strings can be specified even without quotes, so the |string!| rule must be applied last (other types need to be tried first); also note that we only allow integer percent values, even tough we use |money!| to represent them; also, |comma-pair!| can contain integers or percents, so the block representing it can contain |integer!| or |money!| values. -types-: types: context [ flag!: [flag-word [some spc | end]] set-word!: [set-word any spc] color!: [ [ color-keyword | tuple | [opt "#" copy value 6 hex-digits | "#" copy value 3 hex-digits] (value: to issue! value) ] [some spc | end] ] string!: [ [{"} copy value some dquotechars {"} | {'} copy value some quotechars {'} | copy value some chars] [some spc | end] ] integer!: [copy value some digits (value: to system/words/integer! value) [some spc | end]] url!: [ copy value [some urlchars ":" 0 2 "/" some urlchars any ["/" some urlchars]] (value: to system/words/url! value) [some spc | end] ] percent!: [copy value 1 3 digits "%" (value: to money! value) [some spc | end]] pair!: [copy value [some digits "x" some digits] (value: to system/words/pair! value) [some spc | end]] comma-pair!: [ (value: make block! 4) copy val some digits ["%" (append value to money! val) | none (append value to integer! val)] "," copy val some digits ["%" (append value to money! val) | none (append value to integer! val)] [some spc | end] ] ] +++Subrules and other words used by the |parse| rules As already said above, |flag-word| and |set-word| are dynamically set to a command-specific rule. See |-flags-| and |-set-words-|. -types-: value: val: none chars: complement spc: charset " ^-^/" urlchars: complement charset {"':/ ^-^/} dquotechars: complement charset {"} quotechars: complement charset {'} digits: charset "1234567890" hex-digits: union digits charset "ABCDEFabcdef" flag-word: none set-word: none -color-rules- -flags- -set-words- +++Rule for parsing a generic value The rule |value-rule| can be used to parse a generic (non flag or set-word) value; the order of the rules is significant. (Note that it's written this way for compatibility with older REBOLs.) -types-: value-rule: bind [color! | percent! | pair! | comma-pair! | integer! | url! | string!] in types 'self +++Rules for |color!| -color-rules-: color-keyword: [ "clear" (value: /transparent) | copy value [ "aliceblue" | "antiquewhite" | "aqua" | "aquamarine" | "azure" | "beige" | "bisque" | "black" | "blanchedalmond" | "blue" | "blueviolet" | "brown" | "burlywood" | "cadetblue" | "chartreuse" | "chocolate" | "coral" | "cornflowerblue" | "cornsilk" | "crimson" | "cyan" | "darkblue" | "darkcyan" | "darkgoldenrod" | "darkgray" | "darkgreen" | "darkkhaki" | "darkmagenta" | "darkolivegreen" | "darkorange" | "darkorchid" | "darkred" | "darksalmon" | "darkseagreen" | "darkslateblue" | "darkslategray" | "darkturquoise" | "darkviolet" | "deeppink" | "deepskyblue" | "dimgray" | "dodgerblue" | "feldspar" | "firebrick" | "floralwhite" | "forestgreen" | "fuchsia" | "gainsboro" | "ghostwhite" | "gold" | "goldenrod" | "gray" | "green" | "greenyellow" | "honeydew" | "hotpink" | "indianred" | "indigo" | "ivory" | "khaki" | "lavender" | "lavenderblush" | "lawngreen" | "lemonchiffon" | "lightblue" | "lightcoral" | "lightcyan" | "lightgoldenrodyellow" | "lightgreen" | "lightgrey" | "lightpink" | "lightsalmon" | "lightseagreen" | "lightskyblue" | "lightslateblue" | "lightslategray" | "lightsteelblue" | "lightyellow" | "lime" | "limegreen" | "linen" | "magenta" | "maroon" | "mediumaquamarine" | "mediumblue" | "mediumorchid" | "mediumpurple" | "mediumseagreen" | "mediumslateblue" | "mediumspringgreen" | "mediumturquoise" | "mediumvioletred" | "midnightblue" | "mintcream" | "mistyrose" | "moccasin" | "navajowhite" | "navy" | "oldlace" | "olive" | "olivedrab" | "orange" | "orangered" | "orchid" | "palegoldenrod" | "palegreen" | "paleturquoise" | "palevioletred" | "papayawhip" | "peachpuff" | "peru" | "pink" | "plum" | "powderblue" | "purple" | "red" | "rosybrown" | "royalblue" | "saddlebrown" | "salmon" | "sandybrown" | "seagreen" | "seashell" | "sienna" | "silver" | "skyblue" | "slateblue" | "slategray" | "snow" | "springgreen" | "steelblue" | "tan" | "teal" | "thistle" | "tomato" | "turquoise" | "violet" | "violetred" | "wheat" | "white" | "whitesmoke" | "yellow" | "yellowgreen" | "transparent" ] (value: to refinement! value) ] tuple: [ copy value [1 3 digits "." 1 3 digits "." 1 3 digits] (value: attempt [to tuple! value]) ] +++Rules for |flag!| Each command has a list of available flags. Note that order is important (because of the way the |bold| and |italic| rules are defined, for example). -flags-: flag-words: [ "table" [ outline | dashed | dotted | solid | borderless | vertical | horizontal | all | hide | headerless | center | left | right | justify | middle | top | bottom | imagecenter | imageleft | imageright | imagemiddle | imagetop | imagebottom | float | space2 | tilev | shadow | rounded | tileh | tileless | tile | boxcenter | boxleft | boxright | times | helv | courier | bold | italic ] "cell" "row" "column" [ outline | dashed | dotted | solid | borderless | all | center | left | right | justify | middle | top | bottom | imagecenter | imageleft | imageright | imagemiddle | imagetop | imagebottom | tilev | tileh | tileless | tile | times | helv | courier | bold | italic ] "box" [ outline | dashed | dotted | solid | borderless | center | left | right | justify | middle | top | bottom | imagecenter | imageleft | imageright | imagemiddle | imagetop | imagebottom | float | tilev | tileh | tileless | tile | boxcenter | boxleft | boxright | times | helv | courier | shadow | rounded | bold | italic ] "image" [ outline | dashed | dotted | solid | borderless | float | boxleft | space ;| shadow | rounded ] "font" "f" [ times | helv | courier | bold | italic | space ] "span" none "csv" [show] "data" none ] bold: ["b" opt "old" (value: 'bold)] italic: ["i" opt ["talic" opt "s"] (value: 'italic)] vertical: [["vertical" | "tablev"] (value: 'vertical)] float: [["float" | "flow"] (value: 'float)] tilev: ["tilev" opt "ertical" (value: 'tilev)] tileh: ["tileh" opt "orizontal" (value: 'tileh)] space2: ["space" (value: 'force-space)] We have only specified the parse rules for a small set of flags; all the others are generated automatically: for example, |[dashed]| becomes |["dashed" (value: 'dashed)]|. This is done by processing the flag-words block using |parse|. -flags-: rule: word: none parse flag-words [ some [ some string! set rule block! ( while [not tail? rule] [ either all [rule/1 <> '| not block? get/any word: rule/1] [ rule: insert/only change rule form word to paren! compose [value: (to lit-word! word)] ] [rule: next rule] ] ) | some string! rule: 'none (rule/1: [end skip]) ] ] +++Flag actions By default, a flag sets the respective word to |true| in the object that represents the parsed options. Some flags, however, do something different; for example, the |dashed| flag sets the word |outline-style| to |'dashed|. The |flag-actions| object contains the flags that do something different from the default. -flags-: flag-actions: context [ dashed: [outline-style: 'dashed] dotted: [outline-style: 'dotted] solid: [outline-style: 'solid] outline: [outline-style: 'solid] borderless: [outline-style: 'borderless] rounded: [outline-style: 'rounded] center: [text-halign: 'center] left: [text-halign: 'left] right: [text-halign: 'right] justify: [text-halign: 'justify] middle: [text-valign: 'middle] top: [text-valign: 'top] bottom: [text-valign: 'bottom] imagecenter: [image-halign: 'center] imageleft: [image-halign: 'left] imageright: [image-halign: 'right] imagemiddle: [image-valign: 'center] imagetop: [image-valign: 'top] imagebottom: [image-valign: 'bottom] tile: [image-tiling: 'both] tilev: [image-tiling: 'vertical] tileh: [image-tiling: 'horizontal] tileless: [image-tiling: 'neither] times: [typeface: 'times] helv: [typeface: 'helvetica] courier: [typeface: 'courier] boxcenter: [position: 'center] boxright: [position: 'right] boxleft: [position: 'left] ] +++Rules for |set-word!| Set-words are parsed in a way similar to flags. Each command has a list of set-words that are accepted. -set-words-: set-words: [ "table" [ color | typeface | fontsize | background | outline | dashed | dotted | solid | image | width | height | name ] "cell" "row" "column" [ color | typeface | fontsize | background | outline | dashed | dotted | solid | image | width | height | column | row ] "box" [ color | typeface | fontsize | background | outline | dashed | dotted | solid | image | width | height ] "image" [ background | outline | dashed | dotted | solid | src | width | height | space ] "font" "f" [ color | typeface | fontsize | background | space ] "span" none "csv" [name] "data" [name | index] ] color: [["colo" opt "u" "r:" | "foreground:" | "fg:"] (value: first [color:])] typeface: [opt "type" "face:" (value: first [typeface:])] fontsize: ["size" opt "face" ":" (value: first [fontsize:])] background: [["background:" | "bg:"] (value: first [background:])] width: ["w" opt "idth" ":" (value: first [width:])] height: ["h" opt "eight" ":" (value: first [height:])] column: ["c" opt "olumn" ":" (value: first [column:])] row: ["r" opt "ow" ":" (value: first [row:])] Like with flags, most set-word rules are generated automatically from the above block. This translates |[outline]| to |["outline:" (value: first [outline:])]|. -set-words-: parse set-words [ some [ some string! set rule block! ( while [not tail? rule] [ either all [rule/1 <> '| not block? get/any word: rule/1] [ rule: insert/only change rule append form word ":" to paren! compose/deep [value: first [(to set-word! word)]] ] [rule: next rule] ] ) | some string! rule: 'none (rule/1: [end skip]) ] ] +++Set-word actions By default a set-word sets the respective word to the value following it; some however do a special action. -set-words-: set-actions: context [ outline: solid: [outline-color: value outline-style: 'solid] dashed: [outline-color: value outline-style: 'dashed] dotted: [outline-color: value outline-style: 'dotted] column: [position: as-pair value 1] row: [position: as-pair 1 value] ] +++Set-word expected types Each set-word expects a value of a specified type (of one of a list of specified types) following it; if the following value cannot by parsed with the type rule specified, the assignment is ignored. (That is, you cannot set the color to an integer, or the width to a color.) -set-words-: var-types: context [ color: types/color! typeface: types/string! fontsize: types/integer! space: types/integer! background: types/color! outline: dashed: dotted: solid: types/color! width: height: bind [percent! | integer!] in types 'self column: row: types/integer! image: bind [url! | string!] in types 'self name: types/string! index: bind [pair! | integer!] in types 'self ] ---Type map When an option value (not a flag or set-word) is specified without any variable to set it to (i.e., no set-word precedes it), a "type map" is used to determine what variable should be set. Each command has its own type map; for example, the first |color!| value in the options for |"table"| is set as the background, the second as the text color, and so on. -type-map-: type-map: [ "table" [ color! [background color outline-color] string! [image typeface] integer! [width height fontsize] url! [image] percent! [width height] pair! [table-size] comma-pair! [(width: value/1 height: value/2)] ] "row" "column" [ color! [background color outline-color] string! [image typeface] integer! [position width height fontsize] url! [image] percent! [width height] pair! [none] comma-pair! [(width: value/1 height: value/2)] ] "cell" [ color! [background color outline-color] string! [image typeface] integer! [width height fontsize] url! [image] percent! [width height] pair! [position] comma-pair! [(width: value/1 height: value/2)] ] "box" [ color! [background color outline-color] string! [image typeface] integer! [width height fontsize] url! [image] percent! [width height] pair! [none] comma-pair! [(width: value/1 height: value/2)] ] "image" [ color! [outline-color background] string! [src] integer! [width height] url! [src] percent! [width height] pair! [none] comma-pair! [(width: value/1 height: value/2)] ] "font" "f" [ color! [color background] string! [typeface] integer! [fontsize space] url! [none] percent! [none] pair! [none] comma-pair! [none] ] "span" [ color! [none] string! [none] integer! [none] url! [none] percent! [none] pair! [start end] comma-pair! [none] ] "csv" [ color! [none] string! [name] integer! [none] url! [none] percent! [none] pair! [none] comma-pair! [none] ] "data" [ color! [none] string! [name] integer! [index] url! [none] percent! [none] pair! [index] comma-pair! [none] ] ] Since values are mapped after being parsed, and so after being represented as REBOL types, the types specified above need to be replaced by the respective REBOL types that represent them; as usual we do this using |parse| on the |type-map| block. -type-map-: parse type-map [ some [ some string! set rule block! ( foreach [from to] [ color! [issue! refinement! tuple!] percent! money! comma-pair! block! ] [ replace rule from to ] ) ] ] ---Object map The |object-map| block defines the template options object for each command. -object-map-: object-map: [ "table" [ background: color: outline-color: image: typeface: width: height: fontsize: table-size: bold: italic: outline-style: vertical: text-halign: text-valign: image-halign: image-valign: float: image-tiling: position: force-space: shadow: all: hide: name: headerless: horizontal: none ] "row" "column" "cell" [ background: color: outline-color: image: typeface: width: height: fontsize: bold: italic: outline-style: text-halign: text-valign: image-halign: image-valign: image-tiling: position: all: none ] "box" [ background: color: outline-color: image: typeface: width: height: fontsize: bold: italic: outline-style: text-halign: text-valign: image-halign: image-valign: float: image-tiling: position: shadow: none ] "image" [ background: outline-color: src: width: height: outline-style: float: position: space: none ;shadow: none ] "font" "f" [ bold: italic: typeface: color: background: fontsize: space: none ] "span" [ start: end: none ] "csv" [ show: name: none ] "data" [ name: index: none ] ] ---Parse the |args| string into an object Now that we have all that we need, we can write the code for the |parse-arg-string| function. First of all, we do some initialization, by setting |flag-word| and |set-word| to the appropriate rules, and by finding the correct type map and object template. Then we use |parse| on the |args| string: we can have a flag, in which case we either do the action specified in |flag-actions| or set the respective word to |true|; or we can have one or more set-words followed by a value; or we can have a value by itself, to be handled as specified in the type map (using the |set-value-from-type| function). Note that we allow more than one set-word to be set to one value, like in |"width: height: 100"| which sets both the width and the heigth to 100. The accepted type of the value is the accepted type for the last variable specified, which is a problem, because you can write |"face: size: 10"| and have the font face set to |10|, which is incorrect. However, since such a case is a user error anyway, we're not handling it for now. In the future, we need to make sure that the value is acceptable for *all* the words it is being set to. Also note that we need to do some magic to make string work correctly even in assence of quotes; for example, in the case |"=font[Arial Black]"|, since there are no quotes, two string values would be parsed, i.e. |"Arial"| and |"Black"|, while we want it to be parsed as one string value, |"Arial Black"|: so when we parse a string value, we check if the previous value was a string value too, and in this case we join the two values and consider it one. This is what the |last-str| word is used for. -parse-arg-string-code-: flag-word: select* flag-words cmd set-word: select* set-words cmd tmap: select* type-map cmd obj: get-obj cmd tflag!: types/flag! tset-word!: types/set-word! parse/all args [ any spc some [ tflag! ( last-str: none either in flag-actions value [ do bind get in flag-actions value in obj 'self ] [ set in obj value true ] ) | (vars: clear []) some [tset-word! (append vars value: to word! :value)] (var-type: get in var-types value) var-type ( last-str: either string? :value [value] [none] foreach var vars [ either in set-actions var [ do bind get in set-actions var in obj 'self ] [ set in obj var value ] ] ) | value-rule ( either string? :value [ either last-str [ insert insert tail last-str " " value ] [ last-str: value set-value-from-type tmap obj ] ] [ last-str: none set-value-from-type tmap obj ] ) ] ] obj +++|parse-arg-string|'s locals -pas-locals-: obj tmap var-type last-str vars tset-word! tflag! ---|parse-arg-string|'s support functions First, we need a version of |select| that can handle the maps we defined above, where more than one key map to the same (|block!|) value. This is easily done with |parse|. The |get-obj| function is trivial, too; |set-value-from-type| uses the type map to decide which word in |obj| should be set to |value| (note that |value| is not passed as an argument; it's defined in the |refinements| object). It uses |select*| to get the list of words for the given type, then sets the first word in that list that is still set to |none|. If the block contains a |paren!|, it is evaluated instead. (See for example how this is used for |comma-pair!| in the type maps.) -parse-arg-string-support-: select*: func [block value] [ parse block [to value to block! set block block! | (block: none)] block ] get-obj: func [cmd] [make object! select* object-map cmd] set-value-from-type: func [tmap obj] [ foreach word select* tmap type?/word :value [ if paren? :word [ do bind to block! word in obj 'self break ] if all [word <> 'none none? get word: in obj word] [ set word value break ] ] ] ===Balancing (stage 2): the Finite State Machine The stage 2 FSM has two states (plus three "support" states); this is because QML commands are of two types: "block" level commands (a block is something that can contain paragraphs of text), and "inline" level commands (such as text, formatting commands, and so on - i.e. everything that can be *inside* a paragraph). What this FSM needs to do is ensure that all opening and closing commands are correctly balanced, and that there is no block level command inside inline level commands. Formatting commands (almost all the inline level commands are formatting commands) need to be handled somewhat specially, since they can span across paragraphs (that is, exiting from inline level does not end the formatting) and they can be closed out of order; this means that the FSM has to close and reopen commands as needed to transform the input stream into a stream that is balanced and nests block and inline commands correctly. -stage2-fsm-: -stage2-functions- -stage2-states- ---The stage 2 FSM states The initial state of the FSM is the |in-block| state, which handles block level commands. (Note, we also use the stage 2 to convert commands that have aliases to their canonical name, i.e. |"font"| becomes |"f"|, and so on.) After processing, commands are passed to the |stage3| function (see |-stage3-|). *Both in block and inline level we need to strip off comments. (Maybe, we should not parse them at all.) There are two kind of comments: one-line comments and normal comments; one-line comments are started by the |"='"| (or its variations) command, and end at the first newline; normal comments are started by |"=;"| (or |"=rem"| etc.) and end with |"=;."| (or |"=rem."| and so on). We use the |in-line-comment| and |in-comment| states to handle them; see below. *By default, commands are sent to |stage3| as they are. *|"word"| and |"def"| are translated to |":"| and |"::"|. *The escape commands |"example"|, |"html"|, |"rebol"| and |"makedoc"| and the special command |"csv"| need to ignore a newline that immediately follows them; the |eat-one-newline| state is used for this (see below). *Block commands are opened by the |open-block| function (see |-stage2-functions-|); |"table"|, |"center"|, |"left"|, |"justify"|, |"right"| and |"repeat"| also need to eat an immediately following newline, while |"box"| doesn't need to. (Note, that |"repeat"| is special, because it is a neutral command, that can happen both at block and inline levels, so its interpretation depends on the current state. There are a number of unresolved issues because of this, see below.) *Block commands are closed by the |close-block| function, and they all eat one newline. *The |"toc"| command cannot be nested, so it closes any previously opened |"toc"| when opening a new one. *The three commands |"."|, |","| and |" "| are special; they (respectively) close everything that is open, end all the formatting commands, and end the last formatting command. Since we are at the block level in this state, inline commands are never open at this point and don't need to be closed; however, since formatting commands remain valid across paragraphs, we keep track of them, and need to remove them from the list of inline commands that will be reopened when we switch to inline level. *The |""|, |"link"|, |"anchor"|, |"a"|, |"li"|, |"image"|, |"data"| and |text:| commands make the FSM switch to the |in-inline| state, because they mean we are switching to inline level from block level. The |reopen-inline| function is used to reopen any formatting command that is still active. *Formatting commands found while in block level do not make the FSM switch to inline level; they just are activated and will be opened as soon as we switch to inline level. The |add-inline| function is used to activate them. They also eat one newline when used in block level. *Formatting commands are deactivated with the |remove-inline| function, and eat one newline. *The table commands |"row"|, |"column"|, |"row."|, |"column."| and |"span"| are special, because they close the current cell (but without crossing the boundary of the current table, see |-stage2-functions-|); they also eat one newline. The |"cell"| command also does not nest, and closes any previously opened cell, without crossing the table boundary, and it eats one newline too. -stage2-states-: in-block: [ "'" "’" "`" {"} in-line-comment ";" "comment" "rem" in-comment default: (stage3 cmd opts) "word" (stage3 ":" opts) "def" (stage3 "::" opts) "example" "html" "rebol" "makedoc" "csv" (stage3 cmd opts) eat-one-newline "table" "center" "left" "justify" "right" "repeat" (open-block cmd opts) eat-one-newline "box" (open-block cmd opts) "box." "table." "cell." "center." "left." "justify." "right." "toc." "repeat." (close-block cmd) eat-one-newline "toc" (close-block "toc." open-block "toc" opts) "." (remove-all-inline close-all-block) eat-one-newline " " (remove-last-inline) eat-one-newline "," (remove-all-inline) eat-one-newline "" "link" "anchor" "a" "li" "image" "data" text: (reopen-inline) continue in-inline "b" "bold" (add-inline "b" opts) eat-one-newline "u" "underline" (add-inline "u" opts) eat-one-newline "i" "italics" "italic" (add-inline "i" opts) eat-one-newline "font" "f" (add-inline "f" opts) eat-one-newline "s" "strike" "strikethrough" (add-inline "s" opts) eat-one-newline "b." "bold." (remove-inline "b") eat-one-newline "u." "underline." (remove-inline "u") eat-one-newline "i." "italics." "italic." (remove-inline "i") eat-one-newline "font." "f." (remove-inline "f") eat-one-newline "s." "strike." "strikethrough." (remove-inline "s") eat-one-newline "row" "column" "row." "column." "span" (close-block/upto "cell." "table" stage3 cmd opts) eat-one-newline "cell" (close-block/upto "cell." "table" open-block cmd opts) eat-one-newline ] The |in-inline| state processes commands at inline level. *We handle comments like in |in-block|, and by default commands are passed as-is. *The |"=."| command at inline level closes all inline commands, adds a newline (to end the inline level and make the stage 3 to go back to block level), and *continues* to |in-block|, which in turn will close all block level commands; |"= "| and |"=,"| respectively close the last and all inline commands. *|"=link"| and |"=anchor"| are special inline commands, and behave differently from other inline commands. See |-stage2-functions-| for more details. *Formatting commands are handled by the |open-inline| and |close-inline| functions; they behave differently from |open-block| and |close-block| because inline commands can be opened and closed out of order; these functions take care of reordering the commands so that in the stream that arrives at the third stage commands are balanced and correctly nested. See |-stage2-functions-| for more details. *The |"=repeat"| command is a problem here; first of all, |"=repeat."| could both be closing an inline or a block level repeat; the |close-repeat| function checks for this and if it was closing a block level repeat, it ends the inline level. (To do this, it generates the |close-inline:| event, see |-stage2-functions-| for more details.) But the major problem is that repeat does not behave like inline commands, but it does not behave like the link and anchor special commands; for this reason, there are a few unresolved issues here. In particular, we have problems if a |"=repeat"| in inline level is not correctly balanced and nested with other commands. I hope to fix this in the future. *The inline mode is ended by a newline, or by any block level commands. When inline mode ends, we need to close a special command if it is open, close any inline level repeat, temporarely close all inline commands, and emit a newline command; the |end-inline| function does this. *The |close-inline:| event is generated by functions that conditionally need to end the inline mode; see |-stage2-functions-| for more details. -stage2-states-: in-inline: [ "'" "’" "`" {"} in-line-comment ";" "comment" "rem" in-comment default: (stage3 cmd opts) "." (close-all-inline stage3 "^/" none) continue return " " (close-last-inline) "," (close-all-inline) "link" "li" (open-special "link" opts) "anchor" "a" (open-special "anchor" opts) "link." "li." (close-special "link.") "anchor." "a." (close-special "anchor.") "b" "bold" (open-inline "b" opts) "u" "underline" (open-inline "u" opts) "i" "italics" "italic" (open-inline "i" opts) "s" "strike" "strikethrough" (open-inline "s" opts) "font" "f" (open-inline "f" opts) "b." "bold." (close-inline "b.") "u." "underline." (close-inline "u.") "i." "italics." "italic." (close-inline "i.") "s." "strike." "strikethrough." (close-inline "s.") "font." "f." (close-inline "f.") "repeat" (open-inline "repeat" opts) "repeat." (close-repeat) "^/" (end-inline) return "box" "table" "c" "center" "center." "box." "table." "-" "1" "1'" "2" "2'" "3" "3'" "*" "**" "#" "##" "csv" ">" ":" "::" "word" "def" "example" "toc" "cell" "cell." "row" "column" "left" "right" "left." "right." "r" "l" "span" "html" "rebol" "makedoc" "justify" "j" "justify." "toc." "o" "x" "2’" "1’" "3’" "row." "column." "4" "5" "6" (end-inline) continue return close-inline: return ] The other states we need are |in-line-comment|, |in-comment| and |eat-one-newline|. -stage2-states-: in-line-comment: [ "^/" return ] in-comment: [ ";." "comment." "rem." return ] eat-one-newline: [ "^/" return default: continue return ] ---Stage 2 functions -stage2-functions-: ; block commands stack block-stack: [ ] ; inline commands stack inline-stack: [ ] ; special inline commands special: none end-inline: does [ if special [close-special special] close-repeat/only temp-close-inline stage3 "^/" none ] ; open special inline command (=link or =anchor) open-special: func [cmd opts] [ ; close previous special if open if special [close-special special] special: join cmd "." temp-close-inline stage3 cmd opts reopen-inline ] ; close special inline command close-special: func [cmd] [ if special = cmd [ temp-close-inline stage3 cmd none reopen-inline special: none ] ] ; close all open block commands close-all-block: does [ ; process in reverse order block-stack: skip tail block-stack -2 while [not empty? block-stack] [ ; close command stage3 join block-stack/1 "." none block-stack: skip clear block-stack -2 ] ] ; reopen all inline commands that where temporarily closed reopen-inline: does [ foreach [cmd opts] inline-stack [ stage3 cmd opts ] ] ; remove last command from inline stack remove-last-inline: has [cmd] [ either empty? inline-stack [ if find ["left" "right" "center" "justify"] cmd: pick tail block-stack -2 [ stage3 join cmd "." none clear skip tail block-stack -2 ] ] [ clear skip tail inline-stack -2 ] ] ; remove all inline commands remove-all-inline: has [cmd] [ clear inline-stack block-stack: tail block-stack while [find ["left" "right" "center" "justify"] cmd: pick block-stack -2] [ stage3 join cmd "." none block-stack: skip block-stack -2 ] block-stack: head clear block-stack ] ; open block command open-block: func [cmd opts] [ stage3 cmd opts insert/only insert tail block-stack cmd opts ] ; remove specific inline command remove-inline: func [cmd] [ if cmd: find/skip inline-stack cmd 2 [ remove/part cmd 2 ] ] ; close specific block command close-block: func [cmd /upto noclosecmd /local] [ ; cmd ends with a ., compute opening command without the . remove back tail cmd: copy cmd ; is the block open? if local: find/skip/last block-stack cmd 2 [ if upto [ ; closing cmd should never close noclosecmd ; so if cmd appears before noclosecmd, it should ; not actually be closed noclosecmd: find/skip/last block-stack noclosecmd 2 if all [noclosecmd (index? local) < index? noclosecmd] [ exit ] ] ; close all blocks up to specified one (included) block-stack: tail block-stack until [ block-stack: skip block-stack -2 ; close this block stage3 join block-stack/1 "." none block-stack/1 = cmd ] block-stack: head clear block-stack ] ] ; close all open inline commands close-all-inline: does [ ; process in reverse order inline-stack: skip tail inline-stack -2 while [not empty? inline-stack] [ ; close command ; since we happen to close and reopen inline commands ; a lot, it is possible that we are closing a command that ; had just been opened; to avoid that, we check the out block, ; and directly remove the open command if that was the last command ;either inline-stack/1 = pick tail out -2 [ ; clear skip tail out -2 ;] [ stage3 join inline-stack/1 "." none ;] inline-stack: skip clear inline-stack -2 ] ; close special commands too if special [close-special special] ; close alignment commands too block-stack: tail block-stack while [find ["left" "right" "center" "justify"] cmd: pick block-stack -2] [ ; we need to emit a newline too if we close an alignment if not close-inline? [ stage3 "^/" none ; forces the fsm to go back to in-block state close-inline?: yes ] stage3 join cmd "." none block-stack: skip block-stack -2 ] block-stack: head clear block-stack ] ; open inline command open-inline: func [cmd opts] [ stage3 cmd opts insert/only insert tail inline-stack cmd opts ] ; add inline command (to be opened when we enter inline mode) add-inline: func [cmd opts] [ insert/only insert tail inline-stack cmd opts ] ; close last opened inline command close-last-inline: has [cmd] [ ; if empty, check for special and alignment if empty? inline-stack [ if special [ close-special special exit ] if find ["left" "right" "center" "justify"] cmd: pick tail block-stack -2 [ stage3 "^/" none ; forces the fsm to go back to in-block state close-inline?: yes stage3 join cmd "." none clear skip tail block-stack -2 ] exit ] cmd: pick tail inline-stack -2 ; check if this had just been opened (see note above) ;either cmd = pick tail out -2 [ ; clear skip tail out -2 ;] [ stage3 join cmd "." none ;] ; remove from stack clear skip tail inline-stack -2 ] ; close specific inline command close-inline: func [cmd] [ ; this is special because we need to close all commands up to ; the specified one (included), but then reopen all except the specified ; one. (so that "=b text =i text =b. text" becomes "=b text =i text =i. =b. =i text") ; cmd ends with a ., compute opening command without the . remove back tail cmd: copy cmd ; is the command open? if find/skip inline-stack cmd 2 [ ; close all commands up to specified one (included) inline-stack: tail inline-stack until [ inline-stack: skip inline-stack -2 ; see note in close-all-inline ;either inline-stack/1 = pick tail out -2 [ ; clear skip tail out -2 ;] [ stage3 join inline-stack/1 "." none ;] inline-stack/1 = cmd ] ; remove this command remove/part inline-stack 2 ; reopen all other commands foreach [cmd opts] inline-stack [ stage3 cmd opts ] inline-stack: head inline-stack ] ] ; temporarily close all inline commands (to be reopened later) ; this happens at block end... so that "=b =box." becomes "=b =b. =box. =b" temp-close-inline: does [ ; if empty, we have nothing to do if empty? inline-stack [exit] ; process in reverse order inline-stack: tail inline-stack until [ inline-stack: skip inline-stack -2 ; close command ; since we happen to close and reopen inline commands ; a lot, it is possible that we are closing a command that ; had just been opened; to avoid that, we check the out block, ; and directly remove the open command if that was the last command ;either inline-stack/1 = pick tail out -2 [ ; clear skip tail out -2 ;] [ stage3 join inline-stack/1 "." none ;] head? inline-stack ] ] close-inline?: no close-repeat: func [/only] [ ; similar to close-inline, but does not reopen commands; also could ; be closing a block repeat ; is the command open? either find/skip inline-stack "repeat" 2 [ ; close all commands up to specified one (included) inline-stack: tail inline-stack until [ inline-stack: skip inline-stack -2 stage3 join inline-stack/1 "." none inline-stack/1 = "repeat" ] ; remove from stack inline-stack: head clear inline-stack ] [ ; maybe it's a block repeat? if all [not only find/skip block-stack "repeat" 2] [ ; go back to block mode if special [close-special special] temp-close-inline stage3 "^/" none close-inline?: yes close-block "repeat." ] ] ] ---Initialization and termination of the FSM -stage2-initialization-: init-stage2: does [ ; initialize stuff clear stage2-ctx/block-stack clear stage2-ctx/inline-stack stage2-ctx/special: none ;clear stage2-ctx/anchors stage2-fsm/init stage2-ctx/in-block ] end-stage2: does [ ; cause everything to be closed stage2-fsm/event "." ; actually, not needed... stage2-fsm/end ] ---Additional code for the |stage2| function -additional-stage2-code-: if stage2-ctx/close-inline? [ stage2-fsm/event first [close-inline:] stage2-ctx/close-inline?: no ] ---The |merge-style| function -merge-style-function-: merge-style: func [old new /copy /local val] [ if object? new [ either object? old [ if copy [old: make old [ ]] foreach word next first new [ if val: get in new word [ set in old word val ] ] ] [ old: new ] ] old ] ===Scan a text string and return a QML document tree -scan-doc-: init-stage2 init-stage3 if not keep [set-defaults defaults] parse-qml text if defaults [select defaults "alias"] end-stage2 ; returns output end-stage3 ---Search a QML document tree for a substring -search-doc-: res: make block! 16 doc: build-search-index doc if anchor: find doc/2 text [ insert/only insert/only res anchor/3 anchor/2 ] foreach [header string] skip doc 2 [ while [string: find/tail string text] [ insert insert/only tail res header copy/part skip string -50 120 ] ] res +++|search|'s locals -search-locals-: res anchor ===Searching a QML document -search-: build-search-index: func [doc [block!] /local rule result val anchor] [ result: copy/deep [toa [] doc-start ""] rule: [ into [ [ 'para | 'header4 | 'header5 | 'header6 | 'bullets | 'enum | 'checks | 'definitions | 'box | 'section | 'center | 'left | 'right | 'justify | 'item | 'check | 'term | 'desc | 'header1* | 'header2* | 'header3* ] opt ['opts skip] any rule (insert tail last result newline) | val: ['header1 | 'header2 | 'header3] opt ['opts skip] (insert insert/only tail result val make string! 256) any rule (insert tail last result newline) | 'escape string! set val string! (insert insert tail last result val newline) | 'table opt ['opts skip] opt [into ['columns to end]] any [ into [ 'row opt ['opts skip] opt 'header any [ into ['cell opt ['opts skip] opt ['span skip] opt 'header any rule (insert tail last result " ")] ] (insert tail last result newline) ] ] | ['bold | 'italic | 'strike | 'link | 'alink | 'font] opt ['opts skip] any rule | anchor: 'anchor opt [ 'opts set val block! ( if val: select val [name:] [ insert/only insert/only insert tail result/2 val anchor pick tail result -2 ] ) ] any rule ] | set val string! (insert tail last result val) | skip ] parse doc ['qml any rule] result ] ===Setting default options -default-handling-: default-number-style: default-table: default-row: default-column: default-cell: default-box: default-image: none ; defaults that cannot be currently set default-data: context [name: "csv" index: none] default-repeat: [csv in csv] merge-style: func [old new /copy /local val] [ if object? new [ either object? old [ if copy [old: make old [ ]] foreach word next first new [ if val: get in new word [ set in old word val ] ] ] [ old: new ] ] old ] set-defaults: func [defaults /local w] [ default-number-style: default-table: default-row: default-column: default-cell: default-box: default-image: none if block? defaults [ foreach [cmd opts] defaults [ if w: select [ "toc" default-number-style "table" default-table "row" default-row "column" default-column "cell" default-cell "box" default-box "image" default-image ] cmd [ set w parse-command-options cmd opts ] ] ] ] ===Generating QML document tree (stage 3) -stage3-: stage3: func [cmd opts] [ ;brkpnt "STAGE 3" [cmd opts] if block? cmd [cmd: first cmd] stage3-ctx/cmd: cmd stage3-ctx/opts: opts stage3-fsm/event cmd ] stage3-fsm: make fsm! [ ] out: [ ] init-stage3: does [ clear out clear stage3-ctx/blocks clear stage3-ctx/vars clear stage3-ctx/anchors stage3-ctx/local-vars: copy [[ ]] stage3-ctx/csvid: stage3-ctx/tabid: 1 insert out 'qml ;stage3-fsm/tracing: yes stage3-fsm/init stage3-ctx/initial ] end-stage3: does [ stage3-fsm/end rewrite out rewrite-rules make-toc out set-enum-counts out out ] stage3-ctx: context [ -stage3-fsm- ] rewrite-rules: use [x y z] [[ -rewrite-rules- ]] -other-rwrules- -numbering- -postprocessing- process-link: func [target] [ either parse/all target [["http://" | "mailto:" | "ftp://" | "www."] to end] [ compose [target: (target) class: "external" text: (target)] ] [ compose [target: (join http://www.qtask.com/qwiki.cgi?goto= target) class: "internal" text: (target)] ] ] ---Header numbering -numbering-: numbering: context [ ; only 6 levels supported toc-counters: [0 0 0 0 0 0] ; number style for each level toc-style: ["1. " "1[.1] "] chars: complement charset "1AaIi[]" make-number: func [level /local style i res mk1 mk2 rpt term cont] [ ; toc-style set to none means no numbering if not toc-style [return ""] i: 1 style: any [pick toc-style level last toc-style] res: make string! 16 ; first increment/reset counters poke toc-counters level 1 + pick toc-counters level change/dup skip toc-counters level 0 subtract length? toc-counters level ; generate number term: [ mk1: any chars mk2: (insert/part tail res mk1 mk2) [ "1" (insert tail res pick toc-counters i) | "A" (insert tail res pick "ABCDEFGHIJKLMNOPQRSTUVWXYZ" min 26 pick toc-counters i) | "a" (insert tail res pick "abcdefghijklmnopqrstuvwxyz" min 26 pick toc-counters i) | "I" (insert tail res uppercase to-roman pick toc-counters i) | "i" (insert tail res to-roman pick toc-counters i) ] mk1: any chars mk2: (insert/part tail res mk1 mk2) (i: i + 1 cont: either i > level ['break] [[ ]]) cont ] parse/all/case style [ some [rpt: "[" some term "]" (cont: either i > level ['break] [[:rpt]]) cont | term] rpt: (insert tail res rpt) ] res ] romans: [ ["" "i" "ii" "iii" "iv" "v" "vi" "vii" "viii" "ix"] ["" "x" "xx" "xxx" "xl" "l" "lx" "lxx" "lxxx" "xc"] ["" "c" "cc" "ccc" "cd" "d" "dc" "dcc" "dccc" "cm"] ] ; form an integer from 0 to 999 with roman numerals to-roman: func [int /local res] [ int: form int res: make string! 16 forall int [ insert tail res pick pick romans length? int int/1 - #"/" ] res ] ; set toc-style set-style: func [style] [ style: any [style default-number-style] if none? style [toc-style: none exit] if not toc-style: select/case [ "1" ["1. " "1[.1] "] "A" ["A. " "A[.1] "] "a" ["a) " "a1) " "a1[.1]) "] "I" ["I. " "I[.1] "] "i" ["i) " "i[.1]) "] ] style [ toc-style: parse/all style "|" ] ] ; reset counters reset: does [change/dup toc-counters 0 6] ] ---Postprocessing -postprocessing-: collect: func [output doc rule /local node] [ match doc [set node into rule (append/only output copy/deep node)] output ] mkopts: func [level id] [compose [number: (numbering/make-number level) id: (id)]] make-toc: func [doc /local style toc here l headid] [ headid: 1 match doc [ 'section [here: 'opts set style string! (remove/part here 2) :here | (style: none)] (numbering/set-style style) into ['toc toc: to end] here: to end ( numbering/reset collect toc here [ ['header1 (l: 1) | 'header2 (l: 2) | 'header3 (l: 3)] [ 'opts set here block! (append here mkopts l headid headid: headid + 1) | here: (here: insert/only insert here 'opts mkopts l headid headid: headid + 1) :here ] to end | ; add number but do not collect! ['header4 (l: 4) | 'header5 (l: 5) | 'header6 (l: 6)] [ 'opts set here block! (insert insert tail here [number:] numbering/make-number l) | here: (here: insert/only insert here 'opts compose [number: (numbering/make-number l)]) :here ] ; fail (so that it is not collected) end skip ] ; remove anchors from TOC rewrite toc [ [into ['anchor opt ['opts skip] here: to end]] [(here)] ] ) ] ] count: func [counter options] [ if block? options [ options: construct options if all [in options 'force integer? options: attempt [to integer! options/force]] [set counter options] ] options: get counter set counter 1 + options options ] set-enum-counts: func [doc /local count1 count2 opts] [ count1: count2: 1 match doc [ 'enum some [ into [ 'item [ 'opts set opts block! (insert insert tail opts [number:] count 'count1 opts) | opts: (opts: insert/only insert opts 'opts compose [number: (count 'count1 opts)]) :opts ] (count2: 1) to end | 'enum some [ into [ 'item [ 'opts set opts block! (insert insert tail opts [number:] count 'count2 opts) | opts: (opts: insert/only insert opts 'opts compose [number: (count 'count2 opts)]) :opts ] to end ] | skip ] ] | skip ] ] ] ---Stage 3 FSM -stage3-fsm-: cmd: opts: none emit: func [val] [ repend out val ] blocks: [ ] open-block: func [name opts /only] [ insert/only tail blocks out insert/only tail out out: make block! 16 emit name if not none? opts [ if all [object? opts not only] [opts: make-style opts] emit ['opts opts] ] ] close-block: does [ if empty? blocks [exit] out: last blocks remove back tail blocks ] ; shortcut to make the state inheritance trick readable. ; appending the parent state's directives to the new directives ; makes the new state "inherit" them. inherit: func [parent-state new-directives] [ append new-directives parent-state ] make-style: func [obj /ignore block /local] [ local: make block! length? obj: third obj block: any [block [ ]] foreach [word val] obj [ if all [:val not find block to word! word] [insert/only insert tail local word :val] ] local ] ; common state directives, used by almost all states common: [ ; by default emit unknown commands verbatim default: (emit [reduce ['command cmd opts]]) ; the dot command makes the fsm go to the initial state ; note: this should not happen anymore, since "." commands are handled by the scanner. "." rewind? initial ; collected anchor names block from the scanner ; (normally just at the beginning of the doc) ;anchors: (anchors: opts) ] ; basic block state, also the initial state anchors: [ ] ; variables that can be used by =data vars: [ ] tabid: 1 open-table: func [opts] [ if not object? opts [ opts: refinements/get-obj "table" ] if not opts/name [ opts/name: join "table" tabid if tabid = 1 [insert insert tail vars "table" context [type: 'alias dest: "table1"]] tabid: tabid + 1 ] open-block/only 'table-proto opts insert insert tail vars opts/name context [type: 'table-proto name: opts/name contents: out] ] in-block: initial: inherit common [ "" "link" "anchor" "image" "b" "i" "f" "data" text: (open-block 'para none) continue in-para (close-block) after-para: ( ) ; newlines produce an empty paragraph in the initial state "^/" (emit [[para]]) "-" (emit [[hrule]]) "1" (open-block 'header1 none) in-para (close-block) "2" (open-block 'header2 none) in-para (close-block) "3" (open-block 'header3 none) in-para (close-block) "4" (open-block 'header4 none) in-para (close-block) "5" (open-block 'header5 none) in-para (close-block) "6" (open-block 'header6 none) in-para (close-block) "1'" "1’" (open-block 'header1* none) in-para (close-block) "2'" "2’" (open-block 'header2* none) in-para (close-block) "3'" "3’" (open-block 'header3* none) in-para (close-block) "*" "**" (open-block 'bullets none) continue in-ulist (close-block) "#" "##" (open-block 'enum none) continue in-olist (close-block) "o" "x" (open-block 'checks none) continue in-checklist (close-block) ">" (open-block 'para compose [indent: (opts)]) in-para (close-block) "::" (open-block 'para [indent: 2]) in-item (close-block) ":" (open-block 'definitions none) continue in-dlist (close-block) "box" (open-block 'box merge-style/copy default-box opts) in-box (close-block) "toc" (open-block 'section opts open-block 'toc none) in-toc (close-block) "table" (open-table opts) in-table (close-block) "c" (open-block 'para [text-halign: center]) in-para (close-block) "center" (open-block 'center none) in-center (close-block) "l" (open-block 'para [text-halign: left]) in-para (close-block) "left" (open-block 'left none) in-left (close-block) "r" (open-block 'para [text-halign: right]) in-para (close-block) "right" (open-block 'right none) in-right (close-block) "j" (open-block 'para [text-halign: justify]) in-para (close-block) "justify" (open-block 'justify none) in-just (close-block) "example" "html" "rebol" "makedoc" (emit [reduce ['escape cmd opts]]) "csv" (handle-csv opts) "repeat" (open-block 'repeat any [opts default-repeat]) in-repeat (close-block) ; these cannot actually happen in the initial state, but they could ; happen in all the states inheriting from in-block "center." continue rewind? in-center "left." continue rewind? in-left "right." continue rewind? in-right "justify." continue rewind? in-just "box." continue rewind? in-box "table." continue rewind? in-table ] -csv-handling- ---CSV data handling -csv-handling-: csvid: 1 handle-csv: func [data] [ if not object? data [exit] data: make data [type: 'csv] either data/name [ insert insert tail vars data/name data ] [ insert insert tail vars join "csv" csvid data if csvid = 1 [insert insert tail vars "csv" data] csvid: csvid + 1 ] if data/show [ open-block 'table none foreach row data/contents [ open-block 'row none foreach column row [ emit [reduce ['cell reduce ['para column]]] ] close-block ] close-block ] ] ---Other states -stage3-fsm-: ; inside a centered block. same as initial state, except that =center. will end it. in-center: inherit in-block [ "center." override after-para return ] ; inside a left aligned block. same as initial state, except that =left. will end it. in-left: inherit in-block [ "left." override after-para return ] ; inside a right aligned block. same as initial state, except that =right. will end it. in-right: inherit in-block [ "right." override after-para return ] ; inside a justified block. same as initial state, except that =justify. will end it. in-just: inherit in-block [ "justify." override after-para return ] in-repeat: inherit in-block [ "repeat." override after-para return ] ; inside a paragraph (or other inline content) in-para: inherit common [ text: (emit opts) ; anything else ends the paragraph "^/" override after-para return ;default: continue return ; formatting "b" (open-block 'bold none) in-bold (close-block) "i" (open-block 'italic none) in-italic (close-block) "s" (open-block 'strike none) in-strike (close-block) ;"u" (emit ) in-underline (emit ) ; the empty command is actually the =[qwiki link] command "" (emit [reduce ['qlink opts]]) "link" (open-block 'link-proto opts) in-link (close-block) "f" (open-block 'font opts) in-font-inline (close-block) "image" (emit [reduce ['image 'opts make-style merge-style/copy default-image opts]]) "anchor" (open-block 'anchor opts if opts [insert/only insert tail anchors opts out]) in-anchor (close-block) "data" (emit [reduce ['data make-style merge-style/copy default-data opts]]) "repeat" (open-block 'repeat any [opts default-repeat]) in-repeat-inline (close-block) ] in-repeat-inline: inherit in-para [ "repeat." return ; note that "^/" cannot happen here thanks to stage2 ] ; inside a link in-link: inherit in-para [ "link." return ] ; inside an anchor in-anchor: inherit in-para [ "anchor." return ] ; inside inline font command in-font-inline: inherit in-para [ "f." return ] ; inside inline bold in-bold: inherit in-para [ "b" in-bold ; recurse (to enforce balancing) "b." return ] ; inside inline italic in-italic: inherit in-para [ "i" in-italic ; recurse "i." return ] ; inside strikethrough in-strike: inherit in-para [ "s" in-strike ; recurse "s." return ] ; inside underline in-underline: inherit in-para [ "u" in-underline ; recurse "u." return ] ; inside a term definition block (=:) in-dlist: [ ; a new =: starts a new definition in the same block ":" (open-block 'term none) in-item (close-block) ; =:: starts the definition description "::" (open-block 'desc none) in-item (close-block) after-para: ( ) ; anything else outside of a description ends the definition block default: continue return ] ; inside a checklist in-checklist: [ ; new items "o" (open-block 'check compose [checked: (no)]) in-item (close-block) "x" (open-block 'check compose [checked: (yes)]) in-item (close-block) ; nothing to do after para after-para: ( ) ; anything else ends the list default: continue return ] ; inside a bullet list in-ulist: [ ; new item "*" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]]) in-item (close-block) ; start sublists "**" (open-block 'bullets none) continue in-ulist2 (close-block) "##" (open-block 'enum none) continue in-olist2 (close-block) ; nothing to do after para after-para: ( ) ; anything else ends the list default: continue return ] ; inside a numbered list in-olist: [ ; new item "#" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]]) in-item (close-block) ; start sublists "##" (open-block 'enum none) continue in-olist2 (close-block) "**" (open-block 'bullets none) continue in-ulist2 (close-block) ; nothing to do after para after-para: ( ) ; anything else ends the list default: continue return ] ; bullet sublist in-ulist2: [ ; new item "**" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]]) in-item (close-block) ; nothing to do after para after-para: ( ) ; anything else ends the list default: continue return ] ; numbered sublist in-olist2: [ ; new item "##" (open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]]) in-item (close-block) ; nothing to do after para after-para: ( ) ; anything else ends the list default: continue return ] ; inside a list's item in-item: in-para ; inside a box ; box should work this way: ; if there's something after the =box command and before the newline, treat ; it as the box's header; otherwise, if there's a newline immediately following ; the =box command, emit no header in-box: [ ; newline just after =box, emit no header ; also go to box-contents after the header "^/" after-para: box-contents ; a block level command just after =box, emit no header default: continue box-contents ; inline level command means that we have a header "" "link" "b" "i" "f" "s" "anchor" "data" "image" text: (open-block 'title none) continue in-para (close-block) ; "u" ; close command or coming back from box-contents "box." override after-para return ] box-contents: inherit in-block [ ; same as initial except that ends on =box. "box." continue return ] in-toc: [ ; newline just after =toc, emit no title ; also go to in-block after the header "^/" after-para: (close-block) in-toc2 ; a block level command just after =toc, emit no title default: (close-block) continue in-toc2 ; inline level command means that we have a title "" "link" "b" "i" "f" "s" "anchor" "data" "image" text: (open-block 'title none) continue in-para (close-block) ; "u" ; close command or coming back from in-toc2 "toc." (close-block) override after-para return ] in-toc2: inherit in-block [ "toc." override after-para 2 return ] -table-handling- ---Generating tables -table-handling-: ; table states (see also table code below) ; inside a table in-table: inherit common [ "row" (emit ['row opts]) "column" (emit ['column opts]) ; go back to saved position "row." "column." (emit 'return) "cell" (open-block/only 'cell opts) in-cell-block (close-block) "table." return ; define a spanning cell "span" (emit ['span opts]) ; anything else starts a new cell default: (open-block 'cell none) continue in-cell (close-block) "repeat" (open-block 'repeat any [opts default-repeat]) in-table-repeat (close-block) ] in-table-repeat: inherit in-table [ "repeat." return ] ; inside an implied table cell (no =cell command given) in-cell: inherit in-block [ ; lists are treated specially in tables... each item in a cell "*" ( open-block 'bullets none open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]] ) in-item (close-block close-block) "**" ( open-block 'bullets none open-block 'bullets none open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [type: (opts)]] ) in-item (close-block close-block close-block) "#" ( open-block 'enum none open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]] ) in-item (close-block close-block) "##" ( open-block 'enum none open-block 'enum none open-block 'item if all [opts opts: attempt [to integer! opts]] [compose [force: (opts)]] ) in-item (close-block close-block close-block) "::" (open-block 'para [indent: 2]) in-item (close-block) ":" (open-block 'definitions none) override define in-cell-dlist (close-block) "o" (open-block 'checks none open-block 'check no) in-item (close-block close-block) "x" (open-block 'checks none open-block 'check yes) in-item (close-block close-block) ; end current cell, go back to in-table state after-para: "^/" return ] ; inside a block table cell (=cell ... =cell.) in-cell-block: inherit in-block [ ; end current cell "cell." return ] ; inside a term definition block (=:) inside a cell in-cell-dlist: [ define: (open-block 'term none) in-item (close-block) ; =:: starts the definition description "::" (open-block 'desc none) in-item (close-block) after-para: ( ) ; anything else outside of a description ends the definition block and the cell default: continue 2 return ] header?: func [type col row] [ switch type [ horiz [row = 1] vert [col = 1] both [any [row = 1 col = 1]] none [false] ] ] generate-table: func [opts body /local table result content tmp i j header] [ result: copy [table] if opts [insert/only insert tail result 'opts make-style/ignore opts [name]] table: make table-state! [ style: opts if all [object? style style/table-size] [size: style/table-size] table: make block! 16 columns: make block! 16 ] ; first, eval any 'repeat etc. body: rewrite copy body rewrite-rules parse body [ some [ 'row set opts skip (add-row table opts) | 'column set opts skip (add-col table opts) | 'return (table-go-back table) | into ['cell ['opts set opts skip | (opts: none)] content: to end (add-cell table opts content)] | 'span set opts skip (make-span table opts) opt [ into ['cell ['opts set opts skip | (opts: none)] content: to end (set-cell table opts content)] ] ] ] header: 'horiz if object? table/style [ if table/style/name [ poke find vars table/style/name 2 table ] if table/style/hide [return [hidden-table]] if table/style/vertical [header: either table/style/horizontal ['both] ['vert]] if table/style/headerless [header: 'none] ] insert/only tail result content: copy [columns] foreach col table/columns [ insert/only tail content either col [reduce ['column 'opts make-style col]] [[column]] ] j: 1 foreach row table/table [ ; emit a row ; row can be none if it's empty either object? row [ i: 1 insert/only tail result content: copy [row] if row/style [insert/only insert tail content 'opts make-style row/style] if all [find [horiz both] header j = 1] [insert tail content 'header] foreach cell row/contents [ either object? cell [ ; emit cell, unless it is hidden by spanned cell if cell/type = 'cell [ insert/only tail content compose [ cell (either cell/style ['opts] [[ ]]) (either cell/style [reduce [make-style/ignore cell/style [position]]] [[ ]]) (either cell/spansize ['span] [[ ]]) (any [cell/spansize [ ]]) (either header? header i j ['header] [[ ]]) (cell/out) ] ] ] [ ; cell can be none if it's empty tmp: make-cell-style table none row pick table/columns i insert/only tail content compose [ cell (either tmp ['opts] [[ ]]) (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]]) (either header? header i j ['header] [[ ]]) ] ] i: i + 1 ] loop table/size/x - length? row/contents [ tmp: make-cell-style table none row pick table/columns i insert/only tail content compose [ cell (either tmp ['opts] [[ ]]) (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]]) (either header? header i j ['header] [[ ]]) ] i: i + 1 ] ] [ ; if row contains a spanned cell, then it is never empty (we put references there) insert/only tail result content: copy [row] if all [find [horiz both] header j = 1] [insert tail content 'header] if not any [table/style] repeat i table/size/x [ tmp: make-cell-style table none none pick table/columns i insert/only tail content compose [ cell (either tmp ['opts] [[ ]]) (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]]) (either header? header i j ['header] [[ ]]) ] ] ] j: j + 1 ] loop table/size/y - length? table/table [ insert/only tail result content: copy [row] if all [find [horiz both] header j = 1] [insert tail content 'header] repeat i table/size/x [ tmp: make-cell-style table none none pick table/columns i insert/only tail content compose [ cell (either tmp ['opts] [[ ]]) (either tmp [reduce [make-style/ignore tmp [position]]] [[ ]]) (either header? header i j ['header] [[ ]]) ] ] j: j + 1 ] result ] ; Table handling - needs two passes table-state!: context [ type: 'table ; block of rows table: ; list of column styles columns: ; current row currow: ; current cell curcell: ; current position (pair!) curpos: none ; direction (where to add new cells, 1x0 = to the right, 0x1 = to the bottom, and so on) dir: 0x1 ; table size in cells size: 0x0 ; table style style: none ; =row and =column save current position so that =row. and =column. can go back to it savepos: savedir: none ] inherit-style: func [dest source words] [ foreach word words [ if none? get in dest word [ set in dest word get in source word ] ] dest ] make-cell-style: func [table-state style row col] [ if not style [ style: context [ background: color: outline-color: image: typeface: width: height: fontsize: bold: italic: outline-style: text-halign: text-valign: image-halign: image-valign: image-tiling: position: none ] ] ; inherit outline style from table if needed if all [table-state/style table-state/style/all] [ inherit-style style table-state/style [outline-color outline-style] ] either table-state/dir = 1x0 [ ; row style has priority, so it must be applied last if all [col col/all] [ ; background is there to override header cells inherit-style style col [ background color outline-color typeface height fontsize bold italic outline-style text-halign text-valign ] ] if all [row row/style row/style/all] [ inherit-style style row/style [ background outline-color width outline-style ] ] ] [ ; column style has priority, so it must be applied last if all [row row/style row/style/all] [ inherit-style style row/style [ background outline-color width outline-style ] ] if all [col col/all] [ ; background is there to override header cells inherit-style style col [ background color outline-color typeface height fontsize bold italic outline-style text-halign text-valign ] ] ] ; if no value has been specified, just cancel whole style if parse second style [object! some none! end] [style: none] style ] make-row: func [table-state pos style' /local row] [ ; do we need to add rows? if pos/y > length? table-state/table [ insert/dup tail table-state/table none pos/y - length? table-state/table ] ; create it if it does not exist either row: pick table-state/table pos/y [ row/style: merge-style row/style style' ] [ ; (poke does not return ROW in older REBOLs) poke table-state/table pos/y row: context [ ; row contents (cells) contents: make block! 16 style: style' ; first row is header unless vertical style has been specified, or table is headerless ;header: if not all [table-state/style any [table-state/style/vertical table-state/style/headerless]] [pos/y = 1] ] ] row ] make-col: func [table-state pos style /local col] [ ; do we need to add rows? if pos/x > length? table-state/columns [ insert/dup tail table-state/columns none pos/x - length? table-state/columns ] ; create it if it does not exist either col: pick table-state/columns pos/x [ merge-style col style ] [ poke table-state/columns pos/x style ; (poke does not return STYLE in older REBOLs) style ] ] add-row: func [table-state args /local pos] [ ; save current direction (for =row.) table-state/savedir: table-state/dir ; add new row (=row command) ; see if position has been specified if object? args [ pos: args/position if pair? pos [pos: pos/y] ] ; if not, assign default position either pos [ ; save current position (for =row.) table-state/savepos: table-state/curpos table-state/curpos: 0x1 * pos ] [ either table-state/curpos [ ; position must be saved so that =row. does not make the next ; cell overwrite the row table-state/savepos: table-state/curpos + 0x1 table-state/curpos: table-state/curpos * 0x1 + 0x1 ] [ table-state/savepos: table-state/curpos: 0x1 ] ] ; create row object args: merge-style/copy default-row args make-row table-state table-state/curpos args ; =row command also means switching to horizontal direction for new cells table-state/dir: 1x0 ] add-col: func [table-state args /local pos] [ ; save current direction (for =column.) table-state/savedir: table-state/dir ; add new column (=column command) ; see if position has been specified if object? args [ pos: args/position if pair? pos [pos: pos/x] ] ; if not, assign default position either pos [ ; save current position (for =column.) table-state/savepos: table-state/curpos table-state/curpos: 1x0 * pos ] [ either table-state/curpos [ ; position must be saved so that =column. does not make the next ; cell overwrite the column table-state/savepos: table-state/curpos + 1x0 table-state/curpos: table-state/curpos * 1x0 + 1x0 ] [ table-state/savepos: table-state/curpos: 1x0 ] ] ; create col object args: merge-style/copy default-column args make-col table-state table-state/curpos args ; =column command also means switching to vertical direction for new cells table-state/dir: 0x1 ] table-go-back: func [table-state] [ ; =row. or =column. if all [table-state/savepos table-state/savedir] [ table-state/curpos: table-state/savepos table-state/dir: table-state/savedir table-state/currow: pick table-state/table table-state/curpos/y table-state/curcell: pick table-state/currow/contents table-state/curpos/x table-state/savepos: table-state/savedir: none ] ] ; make room for a new cell in the table and create a cell object make-cell: func [table-state pos style' contents /span spanrc /local row cell] [ row: make-row table-state pos none ; do we need to add cells? if pos/x > length? row/contents [ insert/dup tail row/contents none pos/x - length? row/contents ] table-state/curpos: pos table-state/currow: row ; set this as the current cell (creates it if it does not exist) either cell: pick row/contents pos/x [ if cell/type = 'span [ ; span inside a span breaks the outer span either span [ ; is the outer span already destroyed? either cell/reference/position [ ; no, break it break-span table-state cell/reference pos pos + spanrc ; try again return make-cell/span table-state pos style' contents spanrc ] [ ; yes, just overwrite it poke row/contents pos/x none ; try again return make-cell/span table-state pos style' contents spanrc ] ] [ ; was this a destroyed cell? either cell/reference/position [ ; no cell: cell/reference ] [ ; yes, overwrite it poke row/contents pos/x none ; try again return make-cell table-state pos style' contents ] ] ] cell/style: merge-style cell/style style' if span [ ; redefine the span? if cell/spansize [ ; if we're reducing the span size we need to break the old span up if any [cell/spansize/y > spanrc/y cell/spansize/x > spanrc/x] [ break-span table-state cell pos pos + spanrc ; try again return make-cell/span table-state pos style' contents spanrc ] ] cell/spansize: spanrc ] cell/out: contents table-state/curcell: cell ] [ poke row/contents pos/x table-state/curcell: context [ ; cell type - real cell (cells hidden by a spanning cell have type 'span) type: 'cell ; cell position in the table position: pos ; cell contents out: contents style: make-cell-style table-state style' row pick table-state/columns pos/x ; spanning cell? spansize: if span [spanrc] ; cell is header either if it is in the first row, or if vertical mode has been specified ; and it is in the first column; unless table is headerless ;header: either table-state/style [ ; if not table-state/style/headerless [ ; either table-state/style/vertical [pos/x = 1] [pos/y = 1] ; ] ;] [ ; pos/y = 1 ;] ] table-state/curcell ] ] ; break up a span (because a new span overlaps it) break-span: func [table-state spancell breakstart breakend /local cellstart cellend] [ ; we need to "subtract" the rectangle from breakstart to breakend from spancell ; we break spancell horizontally or vertically depending on the current table mode cellstart: spancell/position cellend: spancell/position + spancell/spansize ; spancell must be destroyed ; this flags it as destroyed when referencing from another cell spancell/spansize: none spancell/position: none ; remove it from the table poke get in pick table-state/table cellstart/y 'contents cellstart/x none ; remove position from original style (as it must not propagate to new cells) if spancell/style [spancell/style/position: none] either table-state/dir = 1x0 [ ; horizontal mode if breakstart/y > cellstart/y [ ; we have a row at the top make-span table-state context [ start: cellstart end: cellend end/y: breakstart/y - 1 ] cellstart/y: breakstart/y ; new span gets the style from original span set-cell table-state spancell/style [ ] ] if breakend/y < cellend/y [ ; we have a row at the bottom make-span table-state context [ start: cellstart start/y: breakend/y + 1 end: cellend ] cellend/y: breakend/y ; new span gets the style from original span set-cell table-state spancell/style [ ] ] if breakstart/x > cellstart/x [ ; we have a column at the left make-span table-state context [ start: cellstart end: cellend end/x: breakstart/x - 1 ] cellstart/x: breakstart/x ; new span gets the style from original span set-cell table-state spancell/style [ ] ] if breakend/x < cellend/x [ ; we have a column at the right make-span table-state context [ start: cellstart start/x: breakend/x + 1 end: cellend ] cellend/x: breakend/x ; new span gets the style from original span set-cell table-state spancell/style [ ] ] ] [ ; vertical mode if breakstart/x > cellstart/x [ ; we have a column at the left make-span table-state context [ start: cellstart end: cellend end/x: breakstart/x - 1 ] cellstart/x: breakstart/x ; new span gets the style from original span set-cell table-state spancell/style [ ] ] if breakend/x < cellend/x [ ; we have a column at the right make-span table-state context [ start: cellstart start/x: breakend/x + 1 end: cellend ] cellend/x: breakend/x ; new span gets the style from original span set-cell table-state spancell/style [ ] ] if breakstart/y > cellstart/y [ ; we have a row at the top make-span table-state context [ start: cellstart end: cellend end/y: breakstart/y - 1 ] cellstart/y: breakstart/y ; new span gets the style from original span set-cell table-state spancell/style [ ] ] if breakend/y < cellend/y [ ; we have a row at the bottom make-span table-state context [ start: cellstart start/y: breakend/y + 1 end: cellend ] cellend/y: breakend/y ; new span gets the style from original span set-cell table-state spancell/style [ ] ] ] ] ; make a cell reference (cell hidden by a spanned cell) make-reference: func [table-state pos cell /local row old save] [ ; make room if needed row: make-row table-state pos none ; do we need to add cells? if pos/x > length? row/contents [ insert/dup tail row/contents none pos/x - length? row/contents ] ; are we overlapping another span? if all [old: pick row/contents pos/x old/type = 'span old/reference/position] [ ; need to save curcell, curpos, currow and out as break-span may change them save: reduce bind [curcell curpos currow] in table-state 'self break-span table-state old/reference cell/position cell/position + cell/spansize set bind [curcell curpos currow] in table-state 'self save ] ; overwrite any cell poke row/contents pos/x local: context [ ; cell type - span type: 'span reference: cell ] local ] ; add a new cell (=cell command, or new paragraph) add-cell: func [table-state args contents /local pos] [ ; see if position has been specified if object? args [pos: args/position] ; if not, assign default position (curpos + dir) if not pos [ pos: any [table-state/curpos 1x0] pos: pos + table-state/dir ] ; create cell or pick existing one at given position args: merge-style/copy default-cell args make-cell table-state pos args contents ; update table size table-state/size: max table-state/size pos ] ; set the style for the current cell (=span=cell case) set-cell: func [table-state args contents] [ either all [args args/position] [ ; if position has been specified, this is a new cell add-cell table-state args contents ] [ table-state/curcell/style: merge-style table-state/curcell/style args table-state/curcell/out: contents ] ] ; create a cell span make-span: func [table-state args /local cell pos] [ ; arguments must be provided, and valid, otherwise command is ignored if not all [ object? args pair? args/start pair? args/end set bind [start end] in args 'self reduce [min args/start args/end max args/start args/end] ] [exit] ; not really a span... make a normal cell if args/start = args/end [ make-cell table-state args/start none [ ] exit ] ; the idea here is that we create a cell with spansize ; then we put a reference to it in all the cells it covers if cell: make-cell/span table-state args/start none [ ] 1x1 + args/end - args/start [ pos: args/start + 1x0 while [pos/y <= args/end/y] [ while [pos/x <= args/end/x] [ make-reference table-state pos cell pos: pos + 1x0 ] pos/x: args/start/x pos: pos + 0x1 ] ; update table size table-state/size: max table-state/size args/end ] ] ---Stage 3 rewrite rules (optimization, handling of =data and =repeat) -rewrite-rules-: ['table-proto ['opts set x skip | (x: none)] y: to end] [(stage3-ctx/generate-table x y)] [into ['hidden-table]] [ ] [into ['repeat 'opts set x block! into ['enum y: to end]]] [[enum [repeat opts [(x)] (y)]]] [into ['repeat 'opts set x block! into ['bullets y: to end]]] [[bullets [repeat opts [(x)] (y)]]] [into ['repeat 'opts set x block! y: to end]] [(stage3-ctx/eval-repeat x y)] [into ['data none!]] [ ] [into ['data set x block!]] [(stage3-ctx/eval-data x)] [into ['qlink none!]] [ ] ['qlink set x string!] [(stage3-ctx/eval-qlink x)] ['link-proto 'opts set x url!] [link opts [target: (x)]] ['link-proto 'opts set x string!] [(stage3-ctx/eval-link x)] ['anchor 'opts set x string!] [anchor opts [name: (x)]] ; unwind cells outside of table (can be created by =data) [y: into ['cell-if opt ['opts skip] opt ['span skip] opt ['header] x: to end]] [(either 'row = first head y [y/1/1: 'cell copy/part y 1] [x])] ; unwind block level inside paras (can be created by =data) [into [x: 'para 'opts set z block! any [y: into [block-level to end] :y break | skip] into [block-level to end] to end]] [[(copy/part x y)] (copy/part y 1) [para opts [(z)] (next y)]] [into [x: 'para any [y: into [block-level to end] :y break | skip] into [block-level to end] to end]] [[(copy/part x y)] (copy/part y 1) [para (next y)]] [y: ['para | 'item] opt ['opts skip] any [z: into ['para to end] :z break | skip] into ['para to end] to end] [( rewrite copy y [ [into ['para 'opts set x block! y: to end]] [[font opts [(x)] (y)]] [into ['para opt ['opts skip] y: to end]] [(y)] ] )] ; box with only title ['box ['opts set x block! | (x: [ ])] into ['title y: to end] end] [box opts [(x)] [para (y)]] ; optimizations [into ['bold]] [ ] [into ['italic]] [ ] [into ['strike]] [ ] [into ['font opt ['opts skip]]] [ ] ; font with no opts [into ['font x: [block! | string!] to end]] [(x)] ['font 'opts set x block! into ['bold y: to end] end] [font opts [(x) bold: (true)] (y)] ['font 'opts set x block! into ['italic y: to end] end] [font opts [(x) italic: (true)] (y)] ['bold into ['font 'opts set x block! y: to end] end] [font opts [(x) bold: (true)] (y)] ['italic into ['font 'opts set x block! y: to end] end] [font opts [(x) italic: (true)] (y)] ['link 'opts set x block! into ['font 'opts set y block! z: to end] end] [link opts [(x) (y)] (z)] ['alink 'opts set x block! into ['font 'opts set y block! z: to end] end] [alink opts [(x) (y)] (z)] ['anchor 'opts set x block! into ['font 'opts set y block! z: to end] end] [anchor opts [(x) (y)] (z)] ['para ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [para opts [(x) (y)] (z)] ['header1 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header1 opts [(x) (y)] (z)] ['header1* ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header1* opts [(x) (y)] (z)] ['header2 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header2 opts [(x) (y)] (z)] ['header2* ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header2* opts [(x) (y)] (z)] ['header3 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header3 opts [(x) (y)] (z)] ['header3* ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header3* opts [(x) (y)] (z)] ['header4 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header4 opts [(x) (y)] (z)] ['header5 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header5 opts [(x) (y)] (z)] ['header6 ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [header6 opts [(x) (y)] (z)] ['item ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [item opts [(x) (y)] (z)] ['check ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [check opts [(x) (y)] (z)] ['term ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [term opts [(x) (y)] (z)] ['desc ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [desc opts [(x) (y)] (z)] ['title ['opts set x block! | (x: [ ])] into ['font 'opts set y block! z: to end] end] [title opts [(x) (y)] (z)] +++Words used by the rewrite rules -other-rwrules-: block-level: [ 'hrule | 'header1 | 'header2 | 'header3 | 'header4 | 'header5 | 'header6 | 'bullets | 'enum | 'checks | 'definitions | 'box | 'table | 'center | 'left | 'right | 'justify | 'escape | 'header1* | 'header2* | 'header3* ] ---Other functions -stage3-fsm-: eval-qlink: func [target /local a] [ either a: select anchors target [ compose/deep [alink opts [target: (target)] (skip a 3)] ] [ compose/deep [link opts [(a: process-link target)] (select a [text:])] ] ] eval-link: func [target] [ either find anchors target [ compose/deep [alink opts [target: (target)]] ] [ compose/deep [link opts [(process-link target)]] ] ] eval-data: func [opts /local val p] [ opts: construct/with opts context [name: none index: none] val: eval-var opts/name if not val [return [ ]] either object? val [ p: in pickers val/type if p [ do get in get p type?/word opts/index val opts/index ] ] [ reduce [val] ] ] pickers: context [ csv: context [ none!: func [val index] [ "Not yet." ] integer!: func [val index] [ csv-row/none! context [content: pick val/contents index] none ] pair!: func [val index] [ pick pick val/contents index/y index/x ] ] table: context [ none!: func [val index] [ "Not yet." ] integer!: func [val index] [ "Not yet." ] pair!: func [val index] [ val: pick get in pick val/table index/y 'contents index/x if val/type = 'span [val: val/reference] cell/none! val none ] ] table-proto: context [ none!: integer!: pair!: func [val index] [ rewrite val/contents rewrite-rules eval-data compose [name: (val/name) index: (index)] ] ] alias: context [ none!: integer!: pair!: func [val index] [ eval-data compose [name: (val/dest) index: (index)] ] ] cell: context [ none!: integer!: pair!: func [val index] [ compose/deep [[ cell-if (either val/style ['opts] [[ ]]) (either val/style [reduce [make-style/ignore val/style [position]]] [[ ]]) (either val/spansize ['span] [[ ]]) (any [val/spansize [ ]]) (val/out) ]] ] ] csv-row: context [ none!: func [val index] [ either empty? val/content [ [ ] ] [ index: make block! 3 * length? val/content insert index first val/content foreach cell next val/content [ insert insert tail index " " cell ] index ] ] integer!: func [val index] [ pick val/content index ] pair!: func [val index] [ pick val/content index/x ] ] table-row: none ] eval-var: func [var] [ if var: any [select/sk