Commit 346edf4

docs: Update README.md (#305)
1 parent 48c0a80 commit 346edf4

1 file changed

Lines changed: 89 additions & 170 deletions

README.md

@@ -1,6 +1,3 @@
-<br/>
-<br/>
-
 <p align="center">
 <img width="400" alt="The logo: a curled wood shaving on a workbench" src="https://github.com/user-attachments/assets/8f1162aa-5769-415d-babe-56b962256747" />
 </p>
@@ -11,7 +8,7 @@
 
 <p align="center">
 A type-safe query language for <a href="https://tree-sitter.github.io">Tree-sitter</a>.<br/>
-Query in, typed data out.
+Powered by the <a href="https://github.com/bearcove/arborium">arborium</a> grammar collection.
 </p>
 
 <br/>
@@ -26,202 +23,124 @@
 <br/>
 
 <p align="center">
-⚠️ <a href="#status">ALPHA STAGE</a>: not for production use ⚠️<br/>
+<sub>
+⚠️ Beta: not for production use ⚠️<br/>
+</sub>
 </p>
 
 <br/>
-<br/>
-
-## The problem
-
-Tree-sitter solved parsing. It powers syntax highlighting and code navigation at GitHub, drives the editing experience in Zed, Helix, and Neovim. It gives you a fast, accurate, incremental syntax tree for virtually any language.
-
-The hard problem now is what comes _after_ parsing: extracting structured data from the tree:
-
-```typescript
-function extractFunction(node: SyntaxNode): FunctionInfo | null {
-  if (node.type !== "function_declaration") {
-    return null;
-  }
-  const name = node.childForFieldName("name");
-  const body = node.childForFieldName("body");
-  if (!name || !body) {
-    return null;
-  }
-  return {
-    name: name.text,
-    body,
-  };
-}
-```
-
-Every extraction requires a new function, each one a potential source of bugs that won't surface until production.
-
-## The solution
-
-Plotnik extends Tree-sitter queries with type annotations:
-
-```clojure
-(function_declaration
-  name: (identifier) @name :: string
-  body: (statement_block) @body
-) @func :: FunctionInfo
-```
-
-The query describes structure, and Plotnik infers the output type:
-
-```typescript
-interface FunctionInfo {
-  name: string;
-  body: SyntaxNode;
-}
-```
-
-This structure is guaranteed by the query engine. No defensive programming needed.
-
-## But what about Tree-sitter queries?
-
-Tree-sitter already has queries:
-
-```clojure
-(function_declaration
-  name: (identifier) @name
-  body: (statement_block) @body)
-```
-
-The result is a flat capture list:
 
-```typescript
-query.matches(tree.rootNode);
-// → [{ captures: [{ name: "name", node }, { name: "body", node }] }, ...]
-```
-
-The assembly layer is up to you:
-
-```typescript
-const name = match.captures.find((c) => c.name === "name")?.node;
-const body = match.captures.find((c) => c.name === "body")?.node;
-if (!name || !body) throw new Error("Missing capture");
-return { name: name.text, body };
-```
-
-This means string-based lookup, null checks, and manual type definitions kept in sync by convention.
-
-Tree-sitter queries are designed for matching. Plotnik adds the typing layer: the query _is_ the type definition.
-
-## Why Plotnik?
+Tree-sitter gives you the syntax tree. Extracting structured data from it still means writing imperative navigation code, null checks, and maintaining type definitions by hand. Plotnik makes extraction declarative: write a pattern, get typed data. The query is the type definition.
 
-| Hand-written extraction | Plotnik |
-| -------------------------- | ---------------------------- |
-| Manual navigation | Declarative pattern matching |
-| Runtime type errors | Compile-time type inference |
-| Repetitive extraction code | Single-query extraction |
-| Ad-hoc data structures | Generated structs/interfaces |
+## Features
 
-Plotnik extends Tree-sitter's query syntax with:
+- [x] Static type inference from query structure
+- [x] Named expressions for composition and reuse
+- [x] Recursion for nested structures
+- [x] Tagged unions (discriminated unions)
+- [x] TypeScript type generation
+- [x] CLI: `exec` for matches, `infer` for types, `ast`/`trace`/`dump` for debug
+- [ ] Grammar verification (validate queries against tree-sitter node types)
+- [ ] Compile-time queries via proc-macro
+- [ ] LSP server
+- [ ] Editor extensions
 
-- **Named expressions** for composition and reuse
-- **Recursion** for arbitrarily nested structures
-- **Type annotations** for precise output shapes
-- **Alternations**: untagged for simplicity, tagged for precision (discriminated unions)
+## Example
 
-## Use cases
+Extract function signatures from Rust. `Type` references itself to handle nested generics like `Option<Vec<String>>`.
 
-- **Scripting:** Count patterns, extract metrics, audit dependencies
-- **Custom linters:** Encode your business rules and architecture constraints
-- **LLM Pipelines:** Extract signatures and types as structured data for RAG
-- **Code Intelligence:** Outline views, navigation, symbol extraction across grammars
-
-## Language design
-
-Start simple—extract all function names from a file:
+`query.ptk`:
 
 ```clojure
-Functions = (program
-  {(function_declaration name: (identifier) @name :: string)}* @functions)
-```
+Type = [
+  Simple: [(type_identifier) (primitive_type)] @name :: string
+  Generic: (generic_type
+    type: (type_identifier) @name :: string
+    type_arguments: (type_arguments (Type)* @args))
+]
 
-Plotnik infers the output type:
+Func = (function_item
+  name: (identifier) @name :: string
+  parameters: (parameters
+    (parameter
+      pattern: (identifier) @param :: string
+      type: (Type) @type
+    )* @params))
 
-```typescript
-type Functions = {
-  functions: { name: string }[];
-};
+Funcs = (source_file (Func)* @funcs)
 ```
 
-Scale up to tagged unions for richer structure:
-
-```clojure
-Statement = [
-  Assign: (assignment_expression
-    left: (identifier) @target :: string
-    right: (Expression) @value)
-  Call: (call_expression
-    function: (identifier) @func :: string
-    arguments: (arguments (Expression)* @args))
-]
+`lib.rs`:
 
-Expression = [
-  Ident: (identifier) @name :: string
-  Num: (number) @value :: string
-]
+```rust
+fn get(key: Option<Vec<String>>) {}
 
-TopDefinitions = (program (Statement)+ @statements)
+fn set(key: String, val: i32) {}
 ```
 
-This produces:
+Plotnik infers TypeScript types from the query structure. `Type` is recursive: `args: Type[]`.
 
-```typescript
-type Statement =
-  | { $tag: "Assign"; $data: { target: string; value: Expression } }
-  | { $tag: "Call"; $data: { func: string; args: Expression[] } };
+```sh
+❯ plotnik infer query.ptk -l rust
+export type Type =
+  | { $tag: "Simple"; $data: { name: string } }
+  | { $tag: "Generic"; $data: { name: string; args: Type[] } };
 
-type Expression =
-  | { $tag: "Ident"; $data: { name: string } }
-  | { $tag: "Num"; $data: { value: string } };
+export interface Func {
+  name: string;
+  params: { param: string; type: Type }[];
+}
 
-type TopDefinitions = {
-  statements: [Statement, ...Statement[]];
-};
+export interface Funcs {
+  funcs: Func[];
+}
 ```
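
The recursive `Type` union in the added `infer` output is easiest to judge from the consuming side. The sketch below copies the two-variant `Type` from that output and folds a value back into Rust-like syntax. `renderType` and the sample value are illustrative names chosen here, not part of Plotnik.

```typescript
// Copied from the `plotnik infer` output shown in the diff.
type Type =
  | { $tag: "Simple"; $data: { name: string } }
  | { $tag: "Generic"; $data: { name: string; args: Type[] } };

// Hypothetical helper (not a Plotnik API): render a `Type` as Rust-like text.
// Switching on `$tag` narrows `$data`, so `args` is only visible for "Generic".
function renderType(t: Type): string {
  switch (t.$tag) {
    case "Simple":
      return t.$data.name;
    case "Generic":
      return `${t.$data.name}<${t.$data.args.map(renderType).join(", ")}>`;
  }
}

// Sample value mirroring the type of `key` in `fn get(key: Option<Vec<String>>)`.
const optionVecString: Type = {
  $tag: "Generic",
  $data: {
    name: "Option",
    args: [
      {
        $tag: "Generic",
        $data: {
          name: "Vec",
          args: [{ $tag: "Simple", $data: { name: "String" } }],
        },
      },
    ],
  },
};

console.log(renderType(optionVecString)); // prints "Option<Vec<String>>"
```

Because the union is closed, the compiler can check that the `switch` covers every variant; adding a third alternative to the query would surface here as a type error rather than a runtime surprise.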
 
-Then process the results:
-
-```typescript
-for (const stmt of result.statements) {
-  switch (stmt.$tag) {
-    case "Assign":
-      console.log(`Assignment to ${stmt.$data.target}`);
-      break;
-    case "Call":
-      console.log(
-        `Call to ${stmt.$data.func} with ${stmt.$data.args.length} args`,
-      );
-      break;
-  }
+Run the query against `lib.rs` to extract structured JSON:
+
+```sh
+❯ plotnik exec query.ptk lib.rs
+{
+  "funcs": [
+    {
+      "name": "get",
+      "params": [{
+        "param": "key",
+        "type": {
+          "$tag": "Generic",
+          "$data": {
+            "name": "Option",
+            "args": [{
+              "$tag": "Generic",
+              "$data": {
+                "name": "Vec",
+                "args": [{ "$tag": "Simple", "$data": { "name": "String" } }]
+              }
+            }]
+          }
+        }
+      }]
+    },
+    {
+      "name": "set",
+      "params": [
+        { "param": "key", "type": { "$tag": "Simple", "$data": { "name": "String" } } },
+        { "param": "val", "type": { "$tag": "Simple", "$data": { "name": "i32" } } }
+      ]
+    }
+  ]
 }
 ```
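
Since `plotnik exec` prints plain JSON, a TypeScript program that reads that output from another process only gets `any` back from `JSON.parse`. One possible way to recover the inferred types at that boundary is a small hand-written guard. In this sketch, `isRecord`, `isType`, `isParam`, `isFunc`, `isFuncs`, and `parseFuncs` are hypothetical helpers (Plotnik is not known to ship them), and the sample input is a trimmed copy of the `set` entry from the output above.

```typescript
// Types as printed by `plotnik infer` for the example query.
type Type =
  | { $tag: "Simple"; $data: { name: string } }
  | { $tag: "Generic"; $data: { name: string; args: Type[] } };
interface Func {
  name: string;
  params: { param: string; type: Type }[];
}
interface Funcs {
  funcs: Func[];
}

// Narrow `unknown` to an object whose properties can be inspected.
function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// Recursive guard for the tagged union; mirrors the shape of `Type`.
function isType(v: unknown): v is Type {
  if (!isRecord(v)) return false;
  const tag = v["$tag"];
  const data = v["$data"];
  if (!isRecord(data) || typeof data["name"] !== "string") return false;
  if (tag === "Simple") return true;
  const args = data["args"];
  return tag === "Generic" && Array.isArray(args) && args.every(isType);
}

function isParam(v: unknown): v is Func["params"][number] {
  return isRecord(v) && typeof v["param"] === "string" && isType(v["type"]);
}

function isFunc(v: unknown): v is Func {
  if (!isRecord(v)) return false;
  const params = v["params"];
  return typeof v["name"] === "string" && Array.isArray(params) && params.every(isParam);
}

function isFuncs(v: unknown): v is Funcs {
  if (!isRecord(v)) return false;
  const funcs = v["funcs"];
  return Array.isArray(funcs) && funcs.every(isFunc);
}

// Parse JSON text and fail loudly if it does not match the inferred shape.
function parseFuncs(json: string): Funcs {
  const value: unknown = JSON.parse(json);
  if (!isFuncs(value)) throw new Error("JSON does not match the inferred Funcs shape");
  return value;
}

// Trimmed sample: only the `set` function from the example output.
const sample = JSON.stringify({
  funcs: [
    {
      name: "set",
      params: [
        { param: "key", type: { $tag: "Simple", $data: { name: "String" } } },
        { param: "val", type: { $tag: "Simple", $data: { name: "i32" } } },
      ],
    },
  ],
});

const parsed = parseFuncs(sample);
console.log(parsed.funcs[0].params.map((p) => p.param)); // prints [ 'key', 'val' ]
```

The guard is redundant wherever the engine's guarantee holds end to end; it is only meant for cases where the JSON crosses a process or file boundary before reaching typed code.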
 
-For the detailed specification, see the [Language Reference](docs/lang-reference.md).
+## Why
 
-## Documentation
+Pattern matching over syntax trees is powerful, but tree-sitter queries produce flat capture lists. You still need to assemble the results, handle missing captures, and define types by hand. Plotnik closes this gap: the query describes structure, the engine guarantees it.
 
-- [CLI Guide](docs/cli.md) — Command-line tool usage
-- [Language Reference](docs/lang-reference.md) — Complete syntax and semantics
-- [Type System](docs/type-system.md) — How output types are inferred from queries
-- [Runtime Engine](docs/runtime-engine.md) — VM execution model (for contributors)
-
-## Supported Languages
-
-Plotnik bundles 15 languages out of the box: Bash, C, C++, CSS, Go, HTML, Java, JavaScript, JSON, Python, Rust, TOML, TSX, TypeScript, and YAML. The underlying [arborium](https://github.com/bearcove/arborium) collection includes 60+ permissively-licensed grammars—additional languages can be enabled as needed.
-
-## Status
-
-**Working now:** Parser with error recovery, type inference, query execution, CLI tools (`check`, `dump`, `infer`, `exec`, `trace`, `tree`, `langs`).
-
-**Next up:** CLI distribution (Homebrew, npm), language bindings (TypeScript/WASM, Python), LSP server, editor extensions.
+## Documentation
 
-⚠️ Alpha stage—API may change. Not for production use.
+- [CLI Guide](docs/cli.md)
+- [Language Reference](docs/lang-reference.md)
+- [Type System](docs/type-system.md)
 
 ## Acknowledgments
 