Author: | haxscramper |
---|
This module implements pattern matching for objects, tuples, sequences, key-value pairs, case and derived objects. DSL can also be used to create object trees (AST).
Example | Explanation |
---|---|
(fld: @val) |
Field fld into variable @val |
Kind() |
Object with .kind == Kind() [1] |
of Derived() |
Match object of derived type |
(@val, _) |
First element in tuple in @val |
(@val, @val) |
Tuple with two equal elements |
{"key" : @val} |
Table with "key", capture into @val [2] |
[_, _] |
Sequence with len == 2 [3] |
[_, .._] |
At least one element |
[_, all @val] |
All elements starting from index 1 |
[until @val == "2", .._] |
Capture all elements until first "2" [4] |
[until @val == 1, @val] |
All including first match |
[all @val == 12] |
All elements are == 12 , capture into @val |
[some @val == 12] |
At least one is == 12 , capture all matching into @val |
- [1] Kind fields can use shorted enum names - both
nnkStrLit
andStrLit
will work (prefixnnk
can be omitted) - [2] Or any object with
contains
and[]
defined (for necessary types) - [3] Or any object with
len
proc or field - [4] Note that sequence must match fully and it is necessary to have
.._
at the end in order to accept sequences of arbitrary length.
- seqs - matched using
[Patt1(), Patt2(), ..]
. Must havelen(): int
anditerator items(): T
defined. - tuples - matched using
(Patt1(), Patt2(), ..)
. - pairable - matched using
{Key: Patt()}
. Must have[Key]: T
defined.Key
is not a pattern - search for whole collection won't be performed. - set - matched using
{Val1, Val2, .._}
. Must havecontains
defined. If variable is captured thenVal1
must be comparable and collection should also implementitems
andincl
. - object - matched using
(field: Val)
. Case objects are matched usingKind(field: Val)
. If you want to check agains multiple values for kind field(kind: in SomeSetOfKinds)
To determine whether particular object matches pattern access
path is generated - sequence of fields and []
operators that you
would normally write by hand, like fld.subfield["value"].len
. Due to
support for method call syntax
there is no difference between field access and proc call, so things
like (len: < 12) also work as expected.
(fld: "3")
Match fieldfld
against"3"
. Generated access- is
expr.fld == "3"
. ["2"]
Match first element of expression agains patt. Generate- acess
expr[pos] == "2"
, wherepos
is an integer index for current position in sequence.
("2")
For each field generate access using [1]
{"key": "val"}
First check"key" in expr
and thenexpr["key"] == "val"
. No exception on missing keys, just fail match.
It is possible to have mixed assess for objects. Mixed object access
via (gg: _, [], {})
creates the same code for checking. E.g ([_])
is the same as [_]
, ({"key": "val"})
and is identical to just
{"key": "val"}
. You can also call functions and check their values
(like (len: _(it < 10))
or (len: in {0 .. 10})
) to check for
sequence length.
- Any operators with exception of
is
(subpattern) andof
(derived object subpattern) is considered final comparison and just pasted as-is into generated pattern match code. E.g.fld: in {2,3,4}
will generateexpr.fld in {2,3,4}
(fld: Patt())
- check ifexpr.fld
matches patternPatt()
(fld: _.matchesPredicate())
- if call tomatchesPredicate(expr.fld)
evaluates to true.
Notation: <expr>
refers to any possible combination of checks. For
example
fld: in {1,2,3}
-<expr>
isin {1,2,3}
[_]
-<expr>
is_
fld: Patt()
-<expr>
isPatt()
(fld: 12)
If rhs for key-value pair is integer, string or identifier then==
comparison will be generated.(fld: == ident("33"))
if rhs is a prefix of==
then==
will be generated. Any for of prefix operator will be converted toexpr.fld <op> <rhs>
.(fld: in {1, 3, 3})
or(fld: in Anything)
createsfld.expr in Anything
. Eitherin
ornotin
can be used.
Match can be bound to new variable. All variable declarations happen
via @varname
syntax.
- To bind element to variable without any additional checks do:
(fld: @varname)
- To bind element with some additional operator checks do:
(fld: @varname <operator> Value)
first perform check using<operator>
and then addValue
to@varname
-(fld: @hello is ("2" | "3"))
- Predicate checks:
fld: @a.matchPredicate()
- Arbitrary expression:
fld: @a(it mod 2 == 0)
. If expression has no type it is consideredtrue
.
Bind order: if check evaluates to true variable is bound immediately,
making it possible to use in other checks. [@head, any @tail !=
head]
is a valid pattern. First match head
and then any number
of @tail
elements. Can use any _(if it != head: tail.add it)
and declare tail
externally.
Variable is never rebound. After it is bound, then it will have the value of first binding.
- Any variadics are mapped to sequence
- Only once in alternative is option
- Explicitly optional is option
- Optional with default value is regular value
- Variable can be used only once if in alternative
Pattern | Injected variables |
---|---|
[@a] |
var a: typeof(expr[0]) |
{"key": @val} |
var val: typeof(expr["key"]) |
[all @a] |
var a: seq[typeof(expr[0])] |
[opt @val] |
var a: Option[typeof(expr[0])] |
[opt @val or default] |
var a: typeof(expr[0]) |
(fld: @val) |
var val: typeof(expr.fld) |
Input sequence: [1,2,3,4,5,6,5,6]
Pattern | Result | Comment |
---|---|---|
[_] |
Fail | Input sequence size mismatch |
[.._] |
Ok | |
[@a] |
Fail | Input sequence size mismatch |
[@a, .._] |
Ok, a = 1 |
|
[any @a, .._] |
Error | |
[any @a(it < 10)] |
Ok, a = [1..6] |
Capture all elements that match |
[until @a == 6, .._] |
Ok | All until first ocurrence of 6 |
[all @a == 6, .._] |
Ok a = [] |
All leading 6 |
[any @a(it > 100)] |
Fail | No elements > 100 |
[none @a(it in {6 .. 10})] |
Fail | There is an element == 6 |
[0 .. 2 is < 10, .._] |
Ok | First three elements < 10 |
[0 .. 2 is < 10] |
Fail | Missing trailing .._ |
until
non-greedy. Match everything until
<expr>
until <expr>
: match all until the first element that matches Expr
all
greedy. Match everything that matches
<expr>
all <expr>
: all elements should match Exprall @val is <expr>
: capture all elements in@val
if<expr>
is true for every one of them.
opt
Optional single element match - if sequence contains fewer elements than necessary element is considered missing. In that case either default fallback (if present) is used as value, or capture is set to None(T).
opt @a
: match optional element and bind it to aopt @a or "default"
: either match element to a or set a to "default"
any
greedy. Consume all sequence elements until the end and succeed only if at least one element has matched.
any @val is "d"
: capture all element that matchis "d"
none
- greedy. Consume all sequence elements until the end and succed only if any element has matched. EE
[m .. n @capture]
- Capture slice of elements from index m to n
Greedy patterns match until the end of a sequence and cannot be followed by anything else.
For sequence to match is must either be completely matched by all
subpatterns or have trailing .._
in pattern.
Sequence | Pattern | Match result |
---|---|---|
[1,2,3] |
[1,2]
[1, .._]
[1,2,_] |
Fail Ok Ok |
- capture all elements in sequence:
[all @elems]
- get all elements until (not including "d"):
[until @a is "d"]
- All leading "d":
[all @leading is "d"]
- Match first two elements and ignore the rest
[_, _, .._]
- Match optional third element
[_, _, opt @trail]
- Match third element and if not matched use default value
[_, _, opt @trail or "default"]
- Capture all elements until first separator:
[until @leading is "sep", @middle is "sep", all @trailing]
- Extract all conditions from IfStmt:
IfStmt([all ElseIf([@cond, _]), .._])
In addition to working with nested subpatterns it is possible to use pattern matching as simple text scanner, similar to strscans. Main difference is that it allows working on arbitrary sequences, meaning it is possible, for example, to operate on tokens, or as in this example on strings (for the sake of simplicity).
func allIs(str: string, chars: set[char]): bool = str.allIt(it in chars)
"2019-10-11 school start".split({'-', ' '}).assertMatch([
pref @dateParts(it.allIs({'0' .. '9'})),
pref _(it.allIs({' '})),
all @text
])
doAssert dateParts == @["2019", "10", "11"]
doAssert text == @["school", "start"]
Input tuple: (1, 2, "fa")
Pattern | Result | Comment |
---|---|---|
(_, _, _) |
Ok | Match all |
(@a, @a, _) |
Fail | |
(@a is (1 | 2), @a, _) |
Fail | [1] |
(1, 1 | 2, _) |
Ok |
- [1] Pattern backtracking is not performed,
@a
is first bound to 1, and in subsequent match attempts pattern fails.
Tuple element matches support any regular match expression like
@capture
, and not different from field matches. You can also use opt
@capture or "default"
in order to assign fallback value on tuple
unpacking.
(@a, (@b, _), _) := ("hello", ("world", 11), 0.2)
For matching object fields you can use (fld: value)
-
type
Obj = object
fld1: int8
func len(o: Obj): int = 0
case Obj():
of (fld1: < -10):
discard
of (len: > 10):
# can use results of function evaluation as fields - same idea as
# method call syntax in regular code.
discard
of (fld1: in {1 .. 10}):
discard
of (fld1: @capture):
doAssert capture == 0
For objects with Option[T]
fields it is possible to use field: opt
@capture or "default"
to either get capture value, or set it to fallback
expression.
Matching on .kind
field is a very common operation and has special
syntax sugar - ForStmt()
is functionally equivalent to (kind:
nnkForStmt)
, but much more concise.
nnk pefix can be omitted - in general if your enum field name folows nep1 naming conventions (each enum name starts with underscore prefix (common for all enum elements), followed PascalCase enum name.
Input AST
ForStmt
Ident "i"
Infix
Ident ".."
IntLit 1
IntLit 10
StmtList
Command
Ident "echo"
IntLit 12
ForStmt([== ident("i"), .._])
Only for loops withi
as variableForStmt([@a is Ident(), .._])
Capture for loop variableForStmt([@a.isTuple(), .._])
for loops in which first subnode satisfies predicateisTuple()
. Bind match toa
ForStmt([_, _, (len: in {1 .. 10})])
between one to ten statements in the for loop body- Using object name for pattern matching
ObjectName()
does not produce a hard error, but if.kind
field does not need to be checked(fld: <pattern>)
will be sufficient. - To check
.kind
against multiple operators prefixin
can be used -(kind: in {nnkForStmt, nnkWhileStmt})
It is possible to unpack regular object using tuple matcher syntax - in
this case overload for []
operator must be provided that accepts
static[FieldIndex]
argument and returns a field.
type
Point = object
x: int
y: int
proc `[]`(p: Point, idx: static[FieldIndex]): auto =
when idx == 0:
p.x
elif idx == 1:
p.y
else:
static:
error("Cannot unpack `Point` into three-tuple")
let point = Point(x: 12, y: 13)
(@x, @y) := point
assertEq x, 12
assertEq y, 13
Note auto
return type for []
proc - it is necessary if different
types of fields might be returned on tuple unpacking, but not mandatory.
If different fields have varying types when
must and static
be
used to allow for compile-time code selection.
By default object fields are either matched using recursive pattern, or
compared for equality (when field: "some value"
is used). It is also
possible to explicitly specify operator, for example using =~
from
std/pegs
module:
It should be noted that implicitly injected matches
variable is also
visible in the case branch.
Matching expressions using custom predicates is also possible. If it is not
necessary to capture matched element placeholder _.
should be used as a
first argument:
proc lenEq(s: openarray[int], value: int): bool = s.len == value
case [1, 2]:
of _.lenEq(3):
# fails
of _.lenEq(2):
# matches
To capture value using predicate placeholder can be replaced with
@capture
pattern:
let arr = @[@[1, 2], @[2, 3], @[4]]
discard arr.matches([any @capture.lenEq(2)])
doAssert capture == @[@[1, 2], @[2, 3]]
It is also possible to match derived ref
objects with patterns using
of
operator. It allows for runtime selection of different derived
types.
Note that of
operator is necessary for distinguishing between multiple
derived objects, or getting fields that are present only in derived types.
In addition to it performs isNil()
check in the object, so it might be
used in cases when you are not dealing with derived types.
Due to isNil()
check this pattern only makes sense when working with
ref
objects.
type
Base1 = ref object of RootObj
fld: int
First1 = ref object of Base1
first: float
Second1 = ref object of Base1
second: string
let elems: seq[Base1] = @[
Base1(fld: 123),
First1(fld: 456, first: 0.123),
Second1(fld: 678, second: "test"),
nil
]
for elem in elems:
case elem:
of of First1(fld: @capture1, first: @first):
# Only capture `Frist1` elements
doAssert capture1 == 456
doAssert first == 0.123
of of Second1(fld: @capture2, second: @second):
# Capture `second` field in derived object
doAssert capture2 == 678
doAssert second == "test"
of of Base1(fld: @default):
# Match all *non-nil* base elements
doAssert default == 123
else:
doAssert isNil(elem)
Pattern matchig also support key-value pairs - any type that has []
and
contains
defined for the necessary types can be used. In this example
we would use JsonNode
type from the standard library.
Input json in all examples in this section (node
variable)
{"menu": {
"id": "file",
"value": "File",
"popup": {
"menuitem": [
{"value": "New", "onclick": "CreateNewDoc()",
"ondrop": "OnDrop()"},
{"value": "Open", "onclick": "OpenDoc()"},
{"value": "Close", "onclick": "CloseDoc()"}
]
}
}}
Get each "value" from an array. Resulting match would be stored values
variable - @[%"New", %"Open", %"Close"]
It is also possible to mix key-value pairs, field and kind object matching.
In this example case first branch would trigger if node
contains
"value"
that is a jstring, with value "File"
. Second would trigger
if string value is anything else (but it must still be a jstring).
Collect "ondrop"
from all elements in array, providing fallback
values - drops
would contain @[%"OnDrop()", %"<defaulted>",
%"<defaulted>"]
Collect only explicitly specified values - capture @[%"OnDrop()"]
Some(@x)
and None()
is a special case that will be rewritten into
(isSome: true, get: @x)
and (isNone: true)
respectively. This is
made to allow better integration with optional types. [9]_ .
Note: implementation does not explicitly require to use
std/options.Option
type, but instead works with anything that provides
following functions:
isSome(): bool
(for Some() pattern check),isNone(): bool
(for None() pattern), andget(): T
(for getting value if type is some).
Some()
pattern can be used with ?=
to unpack optionals in
conditions:
for it in @[some(12), none(int)]:
if Some(@unpacked) ?= it:
doAssert unpacked == 12
Some()
and None()
checks are used only when working with
Option
type (and any that provides the same API). When such pattern is
encountered it is immediately transformed into isSome/isNone
calls.
opt
on the other hand is used for dealing with potentially missing
values and providing default fallback values. In sequences, tables or
optional fields. When used opt
might add one layer of optionality if
default value is not provided, or remove one layer if value is provided.
For deeply nested AST structures it might be really inconvenient to write
one-line expression with lots of ProcDef[@name is Ident() | Postfix[_,
@name is Ident()]]
and so on. But it is possible to use block syntax for
patterns if necessary -
ProcDef:
@name is Ident() | Postfix[_, @name is Ident()]
# Other pattern parts
In case of ProcDef:
pattern braces can be omitted because it is clear
that we are trying to match a case object here.
Tree matching syntax has a nice property of being extremely close
(copy-pastable) from treeRepr
for NimNode
. For a following proc declaration:
proc testProc1(arg1: int) {.unpackProc.} =
discard
We have an ast
ProcDef
Ident "testProc1"
Empty
Empty
FormalParams
Empty
IdentDefs
Ident "arg1"
Ident "int"
Empty
Empty
Empty
StmtList
DiscardStmt
Empty
That can be matched using following pattern:
makeTree
provides 'reversed' implementation of pattern matching,
which allows to construct tree from pattern, using variables.
Example of use
type
HtmlNodeKind = enum
htmlBase = "base"
htmlHead = "head"
htmlLink = "link"
HtmlNode = object
kind*: HtmlNodeKind
text*: string
subn*: seq[HtmlNode]
func add(n: var HtmlNode, s: HtmlNode) = n.subn.add s
discard makeTree(HtmlNode):
base:
link(text: "hello")
In order to construct tree, kind=
and add
have to be defined.
Internally DSL just creats resulting object, sets kind=
and then
repeatedly add
elements to it. In order to properties for objects
either the field has to be exported, or fld=
has to be defined
(where fld
is the name of property you want to set).