Plotnik is a pattern-matching language for tree-sitter syntax trees. It extends tree-sitter's query syntax with named expressions, recursion, and static type inference.
Tree-sitter predicates (#eq?, #match?) and directives (#set!) are not supported. Plotnik has its own inline predicate syntax (see Predicates).
NFA-based cursor walk with backtracking.
- Root-anchored: Matches the entire tree structure (like `^...$` in regex)
- Backtracking: Failed branches restore state and try alternatives
- Ordered choice: `[A B C]` tries branches left-to-right; first match wins
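The three properties above can be sketched with a toy matcher over a flat token list. This is a hypothetical model, not Plotnik's engine: the names `Matcher`, `tok`, `seq`, `choice`, and `matchesWhole` are invented for illustration, and real matching walks a tree rather than an array.

```typescript
// Hypothetical model, not Plotnik's engine: a matcher reads tokens starting
// at `pos` and calls the continuation `k` with the position it reached.
type Cont = (pos: number) => boolean;
type Matcher = (input: string[], pos: number, k: Cont) => boolean;

// Match a single token.
const tok = (t: string): Matcher => (input, pos, k) =>
  input[pos] === t && k(pos + 1);

// Match each matcher in turn, threading the position through.
const seq = (...ms: Matcher[]): Matcher => (input, pos, k) => {
  const step = (i: number, p: number): boolean => {
    const m = ms[i];
    return m === undefined ? k(p) : m(input, p, (next) => step(i + 1, next));
  };
  return step(0, pos);
};

// Ordered choice: try branches left to right from the same saved position.
// If the rest of the pattern fails after a branch, the next branch is tried.
const choice = (...ms: Matcher[]): Matcher => (input, pos, k) =>
  ms.some((m) => m(input, pos, k));

// Root-anchored: succeed only if the whole input is consumed.
const matchesWhole = (m: Matcher, input: string[]): boolean =>
  m(input, 0, (end) => end === input.length);

// Roughly: [a {a b}] followed by c
const pattern = seq(choice(tok("a"), seq(tok("a"), tok("b"))), tok("c"));
```

For the input `a b c`, the first branch matches `a` but the following `c` fails, so the position is restored and the second branch is tried.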
Comments and "extra" nodes (per tree-sitter grammar) are automatically skipped unless explicitly matched.
(function_declaration (identifier) @name (block) @body)
Matches even with comments between children:
function foo /* comment */() {
/* body */
}
The `.` anchor enforces adjacency, but its strictness depends on what's being anchored:
Between named nodes — skips trivia, disallows other named nodes:
(dotted_name (identifier) @a . (identifier) @b)
Matches a.b even if there's a comment like a /* x */ .b (trivia skipped), but won't match if another named node appears between them.
With anonymous nodes — strict, nothing skipped:
(array "[" . (identifier) @first) ; must be immediately after bracket
(call_expression (identifier) @fn . "(") ; no trivia between name and paren
When any side of the anchor is an anonymous node (literal token), the match is exact — no trivia allowed.
Rule: The anchor is as strict as its strictest operand. Anonymous nodes demand precision; named nodes tolerate trivia.
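The rule can be written as a small decision function. This is only a restatement of the prose in code; `Operand`, `AnchorPolicy`, and `anchorPolicy` are hypothetical names and not part of Plotnik.

```typescript
// Each side of an anchor is either a named node or an anonymous (literal) token.
type Operand = "named" | "anonymous";

interface AnchorPolicy {
  triviaAllowed: boolean;     // may comments or other extras sit in between?
  namedNodesAllowed: boolean; // may other named nodes sit in between?
}

// The anchor is as strict as its strictest operand.
function anchorPolicy(left: Operand, right: Operand): AnchorPolicy {
  const strict = left === "anonymous" || right === "anonymous";
  return { triviaAllowed: !strict, namedNodesAllowed: false };
}
```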
Node patterns are open — unmentioned children are ignored:
(binary_expression left: (identifier) @left)
Matches any binary_expression with an identifier in left, regardless of operator, right, etc.
Sequences {...} advance through siblings in order, skipping non-matching nodes.
field: pattern requires the child to have that field AND match the pattern:
(binary_expression
left: (identifier) @x
right: (number) @y
)
Fields participate in sequential matching — they're not independent lookups.
A .ptk file contains definitions:
```
; Helper (can also be used as entrypoint)
Expr = [(identifier) (number) (string)]
; Another definition
Stmt = (statement) @stmt
```
All definitions are entrypoints and included in the binary. Use --entry <Name> to select which one to execute.
Script (-q flag): Anonymous expressions allowed, auto-wrapped in language root.
plotnik exec -q '(identifier) @id' -s app.js
Module (.ptk files): Only named definitions allowed.
; ERROR in .ptk file
(identifier) @id
; OK
Query = (identifier) @id
A directory of .ptk files loaded as a single compilation unit.
- Flat namespace: `Foo` in `a.ptk` visible in `b.ptk` without imports
- Global uniqueness: Duplicate names are errors
- Non-recursive: Subdirectories are separate workspaces
- Dead code elimination: Unreachable internals stripped
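The flat-namespace and global-uniqueness rules amount to merging every file's definitions into one table and rejecting repeats. A minimal sketch, assuming a hypothetical `Workspace` shape (file name to definition names) and an invented `buildNamespace` helper; how Plotnik actually loads a workspace is not shown here.

```typescript
// Hypothetical input: file name -> names of the definitions it declares.
type Workspace = Record<string, string[]>;

// Build one flat namespace; the returned map records which file owns each name.
function buildNamespace(files: Workspace): Map<string, string> {
  const owner = new Map<string, string>();
  for (const file of Object.keys(files)) {
    for (const definition of files[file] ?? []) {
      const previous = owner.get(definition);
      if (previous !== undefined) {
        // Global uniqueness: the same name in two places is an error.
        throw new Error(`duplicate definition ${definition} in ${previous} and ${file}`);
      }
      owner.set(definition, file);
    }
  }
  return owner;
}
```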
Inferred from directory name (queries.ts/ → TypeScript, java-checks/ → Java). Override with -l/--lang.
- Single definition: Default entrypoint
- Multiple definitions: Use `--entry <Name>`
helpers.ptk:
Ident = (identifier)
DeepSearch = [
(Ident) @target
(_ (DeepSearch)*)
]
main.ptk:
AllIdentifiers = (program (DeepSearch)*)
| Kind | Case | Examples |
|---|---|---|
| Definitions, labels, types | PascalCase | `Expr`, `Statement`, `BinaryOp` |
| Node kinds | snake_case | `function_declaration`, `identifier` |
| Captures, fields | snake_case | `@name`, `@func_body` |
Tree-sitter allows @function.name; Plotnik requires @function_name because captures map to struct fields.
Plotnik infers output types from your query. See Type System for full details.
Query nesting does NOT create output nesting. All captures bubble up to the nearest scope boundary:
(function_declaration
name: (identifier) @name
body: (block
(return_statement (expression) @retval)))
Output type:
{ name: Node, retval: Node } // flat, not nested
The pattern is 4 levels deep, but the output is flat. You're extracting specific pieces from an AST, not reconstructing its shape.
Quantifiers (*, +) containing internal captures require a struct capture.
// ERROR: internal capture without struct capture
(method_definition name: (identifier) @name)*
// OK: struct capture on the group
{ (method_definition name: (identifier) @name) @method }* @methods
→ { methods: { method: Node, name: Node }[] }
This prevents association loss — each struct is a distinct object, not parallel arrays that lose per-iteration grouping. See Type System: Strict Dimensionality.
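To see what association loss looks like, consider a hypothetical variant where an inner capture is missing in some iterations. The data below is hand-written for illustration, and `SyntaxNode` is a trimmed stand-in for the `Node` interface; none of it is produced by Plotnik.

```typescript
// Trimmed stand-in for the Node interface; values below are made up.
interface SyntaxNode { kind: string; text: string }
const node = (kind: string, text: string): SyntaxNode => ({ kind, text });

// Hypothetical parallel-array output: three methods matched, but only two
// iterations captured a decorator. The arrays no longer line up.
const parallel = {
  methods: [
    node("method_definition", "a() {}"),
    node("method_definition", "b() {}"),
    node("method_definition", "c() {}"),
  ],
  decorators: [node("decorator", "@x"), node("decorator", "@y")],
};

// Pairing by index has to guess; here it attaches @y to b().
const guessed = parallel.methods.map((method, i) => ({
  method,
  decorator: parallel.decorators[i],
}));

// Struct-array output: each iteration is one object, so the grouping survives.
interface MethodEntry { method: SyntaxNode; decorator?: SyntaxNode }
const structured: MethodEntry[] = [
  { method: node("method_definition", "a() {}"), decorator: node("decorator", "@x") },
  { method: node("method_definition", "b() {}") },
  { method: node("method_definition", "c() {}"), decorator: node("decorator", "@y") },
];
```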
Default capture type — a reference to a tree-sitter node:
interface Node {
kind: string; // e.g. "identifier"
text: string; // source text
start: Position; // { row, column }
end: Position;
}
Quantifiers determine whether a field is singular, optional, or an array:
| Pattern | Output Type | Meaning |
|---|---|---|
| `(x) @a` | `a: T` | exactly one |
| `(x)? @a` | `a?: T` | zero or one |
| `(x)* @a` | `a: T[]` | zero or more (scalar list) |
| `(x)+ @a` | `a: [T, ...T[]]` | one or more (scalar list) |
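On the consuming side, the four shapes differ in what the type checker lets you assume. A short sketch, with `SyntaxNode` as a trimmed stand-in for `Node` and hypothetical type and function names:

```typescript
// Trimmed stand-in for the Node interface.
interface SyntaxNode { kind: string; text: string }

type ExactlyOne = { a: SyntaxNode };                   // (x) @a
type ZeroOrOne = { a?: SyntaxNode };                   // (x)? @a
type ZeroOrMore = { a: SyntaxNode[] };                 // (x)* @a
type OneOrMore = { a: [SyntaxNode, ...SyntaxNode[]] }; // (x)+ @a

// With `+`, the first element is statically known to exist.
function firstOfPlus(result: OneOrMore): SyntaxNode {
  return result.a[0];
}

// With `*`, the array may be empty, so the caller handles that case.
function firstOfStar(result: ZeroOrMore): SyntaxNode | undefined {
  return result.a[0];
}

// With `?`, the field itself may be missing.
function textOfOptional(result: ZeroOrOne): string {
  return result.a?.text ?? "";
}
```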
Node arrays work when the quantified pattern has no internal captures. For patterns with internal captures, use struct arrays:
| Pattern | Output Type | Meaning |
|---|---|---|
| `{...}* @items` | `items: T[]` | zero or more structs |
| `{...}+ @items` | `items: [T, ...]` | one or more structs |
| `{...}? @item` | `item?: T` | optional struct (bubbles if uncaptured) |
Capture a sequence {...} or alternation [...] to create a new scope. Braces alone don't introduce nesting:
{
(function_declaration
name: (identifier) @name
body: (_) @body
) @node
} @func
Output type:
{ func: { node: Node, name: Node, body: Node } }
The `@func` capture on the group creates a nested scope. All captures inside (`@node`, `@name`, `@body`) become fields of that nested object.
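The bubbling and scoping rules can be modeled in a few lines. This is a simplified sketch under stated assumptions: the `Pattern` and `Shape` types and the `inferShape` function are hypothetical, quantifiers and type annotations are ignored, and Plotnik's real inference is described in Type System.

```typescript
// Hypothetical, simplified pattern model (no quantifiers or type annotations).
interface Pattern {
  capture?: string;     // capture name without the "@"
  isGroup?: boolean;    // true for {...} or [...]
  children?: Pattern[];
}

type Shape = { [field: string]: "Node" | Shape };

// Write the fields contributed by `p` into `scope` and return `scope`.
function inferShape(p: Pattern, scope: Shape = {}): Shape {
  // Only a captured group opens a new scope; everything else bubbles up.
  const opensScope = p.isGroup === true && p.capture !== undefined;
  const inner: Shape = opensScope ? {} : scope;
  for (const child of p.children ?? []) inferShape(child, inner);
  if (p.capture !== undefined) scope[p.capture] = opensScope ? inner : "Node";
  return scope;
}

// (function_declaration name: (identifier) @name
//   body: (block (return_statement (expression) @retval)))
const flat = inferShape({
  children: [{ capture: "name" }, { children: [{ children: [{ capture: "retval" }] }] }],
});

// { (function_declaration name: (identifier) @name body: (_) @body) @node } @func
const nested = inferShape({
  isGroup: true,
  capture: "func",
  children: [{ capture: "node", children: [{ capture: "name" }, { capture: "body" }] }],
});
```

In this model, `flat` has the fields `name` and `retval` at the top level, while `nested` has a single `func` field holding `node`, `name`, and `body`.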
:: after a capture controls the output type:
| Annotation | Effect |
|---|---|
| `@x` | Inferred (usually `Node`) |
| `@x :: string` | Extract `node.text` as string |
| `@x :: T` | Name the type `T` in codegen |
Only :: string changes data; other :: T affect only generated type names.
Suppress captures from contributing to output with @_ or @_name:
Expr = (binary_expression left: (number) @left right: (number) @right)
; Without suppression: @left, @right bubble up
Query = (statement (Expr) @expr)
; Output: { expr: Node, left: Node, right: Node }
; With suppression: inner captures are suppressed
Query = (statement { (Expr) @_ } @expr)
; Output: { expr: Node }
Use cases:
- Match structurally, don't extract: Use a definition's pattern but discard its captures
- Wrap and isolate: `{ inner @_ } @outer` captures the outer node while suppressing inner captures
Rules:
- `@_` and `@_name` match like regular captures but produce no output
- Named suppressive captures (`@_foo`) are equivalent to `@_`; the name is documentation only
- Type annotations are not allowed on suppressive captures
- Nesting works: `@_outer` containing `@_inner` correctly suppresses both
Example:
{
(function_declaration
name: (identifier) @name :: string
body: (_) @body
) @node
} @func :: FunctionDeclaration
Output type:
interface FunctionDeclaration {
node: Node;
name: string; // :: string converted this
body: Node;
}
{
func: FunctionDeclaration;
}
| Pattern | Output |
|---|---|
| `@name` | Field in current scope |
| `(x)? @a` | Optional field |
| `(x)* @a` | Node array (no internal captures) |
| `{...}* @items` | Struct array (with internal captures) |
| `{...} @x` / `[...] @x` | Nested object (new scope) |
| `@x :: string` | String value |
| `@x :: T` | Custom type name |
Match named nodes (non-terminals and named terminals) by type:
(function_declaration)
(binary_expression (identifier) (number))
Children can be partial — this matches any binary_expression with at least one string_literal child:
(binary_expression (string_literal))
With captures:
(binary_expression
(identifier) @left
(number) @right)
Output type:
{ left: Node, right: Node }
Filter nodes by their text content with inline predicates:
(identifier == "foo") ; text equals "foo"
(identifier != "bar") ; text does not equal "bar"
(identifier ^= "get") ; text starts with "get"
(identifier $= "_id") ; text ends with "_id"
(identifier *= "test") ; text contains "test"
(identifier =~ /^[A-Z]/) ; text matches regex
(identifier !~ /^_/) ; text does not match regex
| Operator | Meaning |
|---|---|
| `==` | equals |
| `!=` | not equals |
| `^=` | starts with |
| `$=` | ends with |
| `*=` | contains |
| `=~` | matches regex |
| `!~` | does not match |
Regex patterns use /pattern/ syntax. Full Unicode is supported. Patterns match anywhere in the text (use ^ and $ anchors for full-match semantics).
(identifier =~ /^test_/) ; starts with "test_"
(identifier =~ /Handler$/) ; ends with "Handler"
(identifier =~ /^[A-Z][a-z]+(?:[A-Z][a-z]+)*$/) ; PascalCase
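The seven operators map onto ordinary string tests. The sketch below is a reading of the operator table, not Plotnik's implementation: `Predicate` and `holds` are hypothetical names, and JavaScript's `RegExp` stands in for whatever regex engine Plotnik uses, which may differ in details beyond these simple patterns.

```typescript
// Hypothetical predicate model: string operators carry a value,
// regex operators carry a pattern.
type Predicate =
  | { op: "==" | "!=" | "^=" | "$=" | "*="; value: string }
  | { op: "=~" | "!~"; pattern: RegExp };

// Evaluate a predicate against a node's source text.
function holds(text: string, p: Predicate): boolean {
  switch (p.op) {
    case "==": return text === p.value;
    case "!=": return text !== p.value;
    case "^=": return text.startsWith(p.value);
    case "$=": return text.endsWith(p.value);
    case "*=": return text.includes(p.value);
    // Unanchored: the pattern may match anywhere in the text.
    case "=~": return p.pattern.test(text);
    case "!~": return !p.pattern.test(text);
  }
}
```

Under this model, `/test/` matches `my_test_fn` because matching is unanchored, while `/^test_/` does not.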
Unsupported regex features (compile-time error):
- Backreferences (`\1`, `\2`)
- Lookahead/lookbehind (`(?=...)`, `(?!...)`, `(?<=...)`, `(?<!...)`)
- Named captures (`(?P<name>...)`)
Predicates don't affect output types — they're structural constraints like anchors.
Match literal tokens (operators, keywords, punctuation) with double or single quotes:
(binary_expression operator: "!=")
(return_statement "return")
Single quotes are equivalent to double quotes, useful when the query itself is wrapped in double quotes (e.g., in tool calls or JSON):
(return_statement 'return')
Anonymous nodes can be captured directly:
(binary_expression "+" @op)
"return" @keyword
Output type:
{
op: Node;
}
{
keyword: Node;
}
| Syntax | Matches |
|---|---|
| `(_)` | Any named node |
| `_` | Any node (named or anonymous) |
(call_expression function: (_) @fn)
(pair key: _ @key value: _ @value)
- `(ERROR)`: matches parser error nodes
- `(MISSING)`: matches nodes inserted by error recovery
- `(MISSING identifier)`: matches a specific missing node type
- `(MISSING ";")`: matches a missing anonymous node
(ERROR) @syntax_error
(MISSING ";") @missing_semicolon
Output type:
{
syntax_error: Node;
}
{
missing_semicolon: Node;
}
Query abstract node types directly, or narrow with `/`:
(expression) @expr
(expression/binary_expression) @binary
(expression/"()") @empty_parens
Constrain children to named fields. A field value must be a node pattern, an alternation, or a quantifier applied to one of these. Groups {...} are not allowed as direct field values.
(assignment_expression
left: (identifier) @target
right: (call_expression) @value)
Output type:
{ target: Node, value: Node }
With type annotations:
(assignment_expression
left: (identifier) @target :: string
right: (call_expression) @value)
Output type:
{ target: string, value: Node }
Quantifiers and captures after a field value apply to the entire field constraint, not just the value:
decorator: (decorator)* @decorators ; repeats the whole field
value: [A: (x) B: (y)] @kind ; captures the field (containing the alternation)
This allows repeating fields (useful for things like decorators in JavaScript). The capture still correctly produces the value's type — for alternations, you get the tagged union, not a raw node.
Assert a field is absent with -:
(function_declaration
name: (identifier) @name
-type_parameters)
Negated fields don't affect the output type — they're purely structural constraints:
{
name: Node;
}
- `?`: zero or one (optional)
- `*`: zero or more
- `+`: one or more (non-empty)
(function_declaration (decorator)? @decorator)
(function_declaration (decorator)* @decorators)
(function_declaration (decorator)+ @decorators)
Output types:
{ decorator?: Node }
{ decorators: Node[] }
{ decorators: [Node, ...Node[]] }
The `+` quantifier always produces non-empty arrays; there is no opt-out.
Plotnik also supports non-greedy variants: *?, +?, ??
Match sibling patterns in order with braces.
⚠️ Syntax difference from tree-sitter
- Tree-sitter: `((a) (b))` uses parentheses for sequences
- Plotnik: `{(a) (b)}` uses braces for sequences
This avoids ambiguity: `(foo)` is always a node, `{...}` is always a sequence. Using tree-sitter's `((a) (b))` syntax in Plotnik is a parse error.
Plotnik uses {...} to visually distinguish grouping from node patterns, and adds scope creation when captured ({...} @name).
{
(comment)
(function_declaration)
}
Quantifiers apply to sequences:
{
(number)
{"," (number)}*
}
Capture elements inside a sequence:
{
(decorator)* @decorators
(function_declaration) @fn
}
Output type:
{ decorators: Node[], fn: Node }
Capture the entire sequence with a type name:
{
(comment)+
(function_declaration) @fn
}+ @sections :: Section
Output type:
interface Section {
fn: Node;
}
{ sections: [Section, ...Section[]] }
Match alternatives with `[...]`:
- Untagged: Fields merge across branches
- Tagged (with labels): Discriminated union
[
(identifier)
(string_literal)
] @value
Captures merge: present in all branches → required; some branches → optional. Same-name captures must have compatible types.
Branches must be type-compatible. Bare nodes are auto-promoted to single-field structs when mixed with structured branches.
(statement
[
(assignment_expression left: (identifier) @left)
(call_expression function: (identifier) @func)
])
Output type:
{ left?: Node, func?: Node } // each appears in one branch only
When the same capture appears in all branches:
[
(identifier) @name
(string) @name
]
Output type:
{
name: Node;
} // required: present in all branches, same type
Mixed presence:
[
(binary_expression
left: (_) @x
right: (_) @y)
(identifier) @x
]
The second branch (identifier) @x is auto-promoted to a structure { x: Node }, making it compatible with the first branch.
Output type:
{ x: Node, y?: Node } // x in all branches (required), y in one (optional)
Type mismatch is an error:
[(identifier) @x :: string (number) @x :: number] // ERROR: @x has different types
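The merge rule for untagged alternations (required if present in every branch, optional otherwise, error on conflicting types) can be sketched as follows. The `Branch` and `MergedField` types and the `mergeBranches` function are hypothetical and only model the rule as stated; they are not Plotnik's type checker.

```typescript
// Hypothetical model: each branch maps capture name -> type name.
type Branch = Record<string, string>;
interface MergedField { type: string; optional: boolean }

function mergeBranches(branches: Branch[]): Map<string, MergedField> {
  const merged = new Map<string, MergedField>();
  for (const branch of branches) {
    for (const field of Object.keys(branch)) {
      const type = branch[field];
      if (type === undefined) continue;
      const seen = merged.get(field);
      if (seen !== undefined && seen.type !== type) {
        // Same-name captures must have compatible types.
        throw new Error(`@${field} has different types: ${seen.type} vs ${type}`);
      }
      // Required only if every branch captures this name.
      const optional = !branches.every((b) =>
        Object.prototype.hasOwnProperty.call(b, field),
      );
      merged.set(field, { type, optional });
    }
  }
  return merged;
}
```

Merging `{ x, y }` with `{ x }` yields a required `x` and an optional `y`, matching the mixed-presence example.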
With a capture on the alternation itself, the type is non-optional since exactly one branch must match:
[
(identifier)
(number)
] @value
Output type:
{
value: Node;
}
Labels create a discriminated union (`$tag` + `$data`):
[
Assign: (assignment_expression left: (identifier) @left)
Call: (call_expression function: (identifier) @func)
] @stmt :: Stmt
type Stmt =
| { $tag: "Assign"; $data: { left: Node } }
| { $tag: "Call"; $data: { func: Node } };
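A consumer can narrow on `$tag` with an ordinary `switch`. The sketch repeats the `Stmt` type so it is self-contained; `SyntaxNode` is a trimmed stand-in for `Node`, and `summarize` is a hypothetical helper.

```typescript
// Trimmed stand-in for the Node interface.
interface SyntaxNode { kind: string; text: string }

type Stmt =
  | { $tag: "Assign"; $data: { left: SyntaxNode } }
  | { $tag: "Call"; $data: { func: SyntaxNode } };

// Narrowing on $tag gives each case access to the right $data fields.
function summarize(stmt: Stmt): string {
  switch (stmt.$tag) {
    case "Assign":
      return `assigns to ${stmt.$data.left.text}`;
    case "Call":
      return `calls ${stmt.$data.func.text}`;
  }
}
```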
When a merge alternation produces a structure (branches have internal captures), the capture on the alternation must have an explicit type annotation for codegen:
(call_expression
function: [
(identifier) @fn
(member_expression property: (property_identifier) @method)
] @target :: Target)
Output type:
interface Target {
fn?: Node;
method?: Node;
}
{
target: Target;
}
The `.` anchor constrains sibling positions. Anchors don't affect types; they're structural constraints.
Anchor behavior depends on the node types being anchored:
| Pattern | Trivia Between | Named Nodes Between |
|---|---|---|
| `(a) . (b)` | Allowed | Disallowed |
| `"x" . (b)` | Disallowed | Disallowed |
| `(a) . "x"` | Disallowed | Disallowed |
| `"x" . "y"` | Disallowed | Disallowed |
When anchoring named nodes, trivia (comments, whitespace) is skipped but no other named nodes may appear between. When any operand is an anonymous node (literal token), the anchor enforces exact adjacency — nothing in between.
First child:
(array . (identifier) @first)
Last child:
(block (_) @last .)
(dotted_name (identifier) @a . (identifier) @b)
Without the anchor, @a and @b would match non-adjacent pairs too. With the anchor, only consecutive identifiers match (trivia like comments between them is tolerated).
For strict token-level adjacency:
(call_expression (identifier) @fn . "(")
Here, no trivia is allowed between the function name and the opening parenthesis because "(" is an anonymous node.
Anchors are structural constraints only — they don't affect output types:
{ first: Node }
{ last: Node }
{ a: Node, b: Node }
Anchors ignore anonymous nodes.
Anchors require parent node context to be meaningful:
Valid positions:
(parent . (first)) ; first child anchor
(parent (last) .) ; last child anchor
(parent (a) . (b)) ; adjacent siblings
(parent {. (a) (b) .}) ; anchors in sequence inside node
{(a) . (b)} ; interior anchor (between items)
Invalid positions:
Q = . (a) ; definition level (no parent node)
Q = {. (a)} ; sequence boundary without parent
Q = {(a) .} ; sequence boundary without parent
Q = [(a) . (b)] ; directly in alternation
To anchor within alternation branches, wrap in a sequence:
Q = [{(a) . (b)} (c)] ; valid: anchor inside sequence branch
The rules:
- Boundary anchors (at start/end of sequence) need a parent named node to provide first/last child or adjacent sibling semantics
- Interior anchors (between items in a sequence) are always valid because both sides are explicitly defined
- Alternations cannot contain anchors directly — anchors must be inside a branch expression
Define reusable patterns:
BinaryOp =
(binary_expression
left: (_) @left
operator: _ @op
right: (_) @right)
Use as node types:
(return_statement (BinaryOp) @expr)
Encapsulation: (Name) matches but extracts nothing. You must capture ((Name) @x) to access fields. This separates structural reuse from data extraction.
Named expressions define both pattern and type:
Expr = [(BinaryOp) (UnaryOp) (identifier) (number)]
Named expressions can self-reference:
NestedCall =
(call_expression
function: [(identifier) @name (NestedCall) @inner]
arguments: (arguments))
Matches a(), a()(), a()()(), etc. → { name?: Node, inner?: NestedCall }
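The recursive output type can be consumed with plain recursion. A sketch with a hand-written value shaped like a match of `a()()`; `SyntaxNode` is a trimmed stand-in for `Node`, and `callDepth` and `calleeName` are hypothetical helpers.

```typescript
// Trimmed stand-in for the Node interface.
interface SyntaxNode { kind: string; text: string }

// Mirrors { name?: Node, inner?: NestedCall }
interface NestedCall { name?: SyntaxNode; inner?: NestedCall }

// Number of call layers: a() is 1, a()() is 2, and so on.
function callDepth(call: NestedCall): number {
  return call.inner === undefined ? 1 : 1 + callDepth(call.inner);
}

// The identifier at the bottom of the chain, if any.
function calleeName(call: NestedCall): string | undefined {
  return call.inner === undefined ? call.name?.text : calleeName(call.inner);
}

// Hand-written value shaped like a match of a()()
const twoCalls: NestedCall = { inner: { name: { kind: "identifier", text: "a" } } };
```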
Tagged recursive example:
MemberChain = [
Base: (identifier) @name
Access: (member_expression
object: (MemberChain) @object
property: (property_identifier) @property)
]
Statement = [
Assign: (assignment_expression
left: (identifier) @target :: string
right: (Expression) @value)
Call: (call_expression
function: (identifier) @func :: string
arguments: (arguments (Expression)* @args))
Return: (return_statement
(Expression)? @value)
]
Expression = [
Ident: (identifier) @name :: string
Num: (number) @value :: string
Str: (string) @value :: string
]
Root = (program (Statement)+ @statements)
Output types:
type Statement =
| { $tag: "Assign"; $data: { target: string; value: Expression } }
| { $tag: "Call"; $data: { func: string; args: Expression[] } }
| { $tag: "Return"; $data: { value?: Expression } };
type Expression =
| { $tag: "Ident"; $data: { name: string } }
| { $tag: "Num"; $data: { value: string } }
| { $tag: "Str"; $data: { value: string } };
type Root = {
statements: [Statement, ...Statement[]];
};
| Feature | Tree-sitter | Plotnik |
|---|---|---|
| Capture | `@name` | `@name` (snake_case only) |
| Suppressive capture | | `@_` or `@_name` |
| Type annotation | | `@x :: T` |
| Text extraction | | `@x :: string` |
| Named node | `(type)` | `(type)` |
| Anonymous node | `"text"` | `"text"` |
| Any node | `_` | `_` |
| Any named node | `(_)` | `(_)` |
| Field constraint | `field: pattern` | `field: pattern` |
| Negated field | `!field` | `-field` |
| Quantifiers | `?` `*` `+` | `?` `*` `+` |
| Non-greedy | | `??` `*?` `+?` |
| Sequence | `((a) (b))` | `{(a) (b)}` |
| Alternation | `[a b]` | `[a b]` |
| Tagged alternation | | `[A: (a) B: (b)]` |
| Anchor | `.` | `.` |
| Predicate | `(#eq? @x "foo")` | `(node == "foo")` |
| Regex predicate | `(#match? @x "p")` | `(node =~ /p/)` |
| Named expression | | `Name = pattern` |
| Use named expression | | `(Name)` |
Priority-based suppression: when diagnostics overlap, lower-priority ones are hidden. You see the root cause, not cascading symptoms.
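One way to picture priority-based suppression is a filter over source spans. This is a sketch under loud assumptions: the `Diagnostic` shape, numeric priorities where higher wins, and overlap defined as intersecting byte ranges are all hypothetical, and Plotnik's actual rules may differ.

```typescript
// Hypothetical diagnostic shape; Plotnik's real structure may differ.
interface Diagnostic {
  message: string;
  start: number;    // byte offset, inclusive (assumed)
  end: number;      // byte offset, exclusive (assumed)
  priority: number; // higher wins (assumed)
}

// Two spans overlap when each starts before the other ends.
const overlaps = (a: Diagnostic, b: Diagnostic): boolean =>
  a.start < b.end && b.start < a.end;

// Keep a diagnostic unless a higher-priority one overlaps it.
function visibleDiagnostics(all: Diagnostic[]): Diagnostic[] {
  return all.filter(
    (d) => !all.some((other) => other.priority > d.priority && overlaps(d, other)),
  );
}
```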