1. Introduction

Structure matching is the ability to match a pattern over a data structure. By using block parsing, it is possible to easily implement structure matching with parse rules. It also becomes easy to implement a rewriting engine that replaces a substructure matched by a pattern with another. Such an engine can be very useful in some special (but very interesting) cases, such as tree rewriting (which is a technique often used by optimizing compilers).

2. Overview

We provide two functions, match and rewrite. (The latter is based on the former.) The former takes a block or string as input, and tries to find a match for the parse rule given as second argument; it returns true if a match is found, otherwise returns false. (Since the pattern is a parse rule, you can use parens to apply any code to the matches.)

rewrite takes a block or string as input, and modifies it according to the rules given as second argument; these consist of a list of pairs of blocks. The first block of each pair is a parse rule to use as a pattern, while the second block is compose'ed and replaces the content matched by the pattern whenever it matches. This rewrite process is iterative, and is repeated until no pattern in the rules matches. (So, be careful as it's easy to get infinite loops, e.g. if your pattern always matches the second block.)


match: func [
 "Match a pattern over data"
 data [block! string!] "Data to match the pattern to"
 rule [block!] "PARSE rule to use as pattern"

 /local match's locals
] [
 Match the parse rule rule over data

rewrite: func [
 "Apply a list of rewrite rules to data"
 data [block! string!] "Data to change"
 rules [block!] "List of rewrite rules"
 /trace "Trace rewriting process (for debugging)"

 /local rewrite's locals
] [
 Apply the rewrite rules to data

3. Match the parse rule rule over data

We use result to hold the result of the match, which is false by default and becomes true as soon as we find one occurrence of the pattern. If data is a block, our parse rule needs to recurse into subblocks too; if it's a string instead, the into command is invalid; so we need two separate rules for strings and blocks.

As you see, rule is applied repeatedly over data. Note that it does not stop when a match is found, but it will look for other matches too; this is useful if you have code in the parse rule (for example you could collect all matches).

Match the parse rule rule over data

result: false
recurse: either block? data [
  some [
   rule (result: true)
   into recurse
] [
  some [
   rule (result: true)
parse data recurse

3.1 match's locals

match's locals

result recurse

3.2 Example

An interesting example use of match is a collect function that collects all matches of a giver rule:


collect: func [output data rule /local x] [
 match data [copy x rule (append/only output x)]

4. Apply the rewrite rules to data

We need to apply the given rewrite rules (list of pairs of blocks in rules) to data. If rules is empty, we have nothing to do, so we just return data.

Otherwise, we create a parse rules containing all the patterns in the list, in order. The pair [rule] [production] is added to the parse rule rules* as [rule] (prod: compose/deep [production]) |; this means rules* becomes a list of alternatives in the form [... | ... | ... |], and we need to remove the last |.

Then, we use match with a pattern that tries to match rules*, and replaces any match with prod (that has been computed in rules*). After replacing the parse cursor is set back to the beginning of the change, so as to apply the rules again on it. The call to match is repeated until it returns false (nothing more to rewrite).

If the /trace refinement has been used, each iteration of rewriting is printed on screen and a step-by-step tracing is performed.

Apply the rewrite rules to data

if empty? rules [return data]
rules*: make block! 16
foreach [pattern production] rules [
 insert insert/only insert/only tail rules* pattern make paren! compose/only [
  prod: compose/deep (production)
 ] '|
remove back tail rules*
until [
 if trace [probe data ask "? "]
 not match data [mk1: rules* mk2: (change/part mk1 prod mk2) :mk1]

4.1 rewrite's locals

rewrite's locals

rules* prod mk1 mk2

4.2 Examples


digits: charset "1234567890"
reverse rewrite reverse "1234567890" [[x: 4 digits] [(copy/part x 3) "," (pick x 4)]]
; rewrites "1234567890" to "1,234,567,890"

rules: [
 ['lit set x number!] [push (x)]
 ['plus into ['lit 1 1 0] set x block!] [(x)]
 ['plus set x block! into ['lit 1 1 0]] [(x)]
 ['mult into ['lit 1 1 1] set x block!] [(x)]
 ['mult set x block! into ['lit 1 1 1]] [(x)]
 ['plus set x block! set y block!] [(x) (y) add]
 ['mult set x block! set y block!] [(x) (y) mul]
test: [
 plus [plus [mult [lit 3] [lit 1]] [lit 2]] [lit 0]
rewrite test rules
; rewrites test into [push 3 push 2 add]