Commit 87c188b

docs: Separate pre_effects and post_effects in ADR-0003 (#105)
1 parent 04c3eca commit 87c188b

1 file changed

Lines changed: 57 additions & 40 deletions

docs/adr/ADR-0003-query-intermediate-representation.md

@@ -68,12 +68,13 @@ Each named definition has an entry point. The default entry is the last definiti
6868

6969
```rust
7070
struct Transition {
71-
matcher: Option<Matcher>, // None = epsilon (no node consumed)
72-
pre_anchored: bool, // must match at current position, no scanning
73-
post_anchored: bool, // after match, cursor must be at last sibling
74-
effects: Vec<Effect>, // data construction ops emitted on success
71+
matcher: Option<Matcher>, // None = epsilon (no node consumed)
72+
pre_anchored: bool, // must match at current position, no scanning
73+
post_anchored: bool, // after match, cursor must be at last sibling
74+
pre_effects: Vec<Effect>, // effects before match (consume previous current)
75+
post_effects: Vec<Effect>, // effects after match (consume new current)
7576
ref_marker: Option<RefTransition>, // call boundary marker
76-
next: Vec<TransitionId>, // successors; order = priority (first = greedy)
77+
next: Vec<TransitionId>, // successors; order = priority (first = greedy)
7778
}
7879

7980
enum RefTransition {
@@ -189,12 +190,13 @@ enum Container<'a> {
189190

190191
For any given transition, the execution order is strict to ensure data consistency during backtracking:
191192

192-
1. **Match**: Validate node kind/fields. If fail, abort.
193-
2. **Enter**: Push `Frame` with current `builder.watermark()`.
194-
3. **Effects**: Emit new effects (committed tentatively).
195-
4. **Exit**: Pop `Frame` (validate return).
193+
1. **Enter**: Push `Frame` with current `builder.watermark()`.
194+
2. **Pre-Effects**: Emit `pre_effects` (uses previous `current` value).
195+
3. **Match**: Validate node kind/fields. On failure, roll back to the watermark and abort.
196+
4. **Post-Effects**: Emit `post_effects` (uses new `current` value).
197+
5. **Exit**: Pop `Frame` (validate return).
196198

197-
This order ensures that if a definition call succeeds, its effects are present. If it fails later, the watermark saved during `Enter` allows rolling back all effects emitted by that definition.
199+
This order ensures correct behavior during epsilon elimination. Pre-effects run before the match overwrites `current`, allowing effects like `PushElement` to be safely merged from preceding epsilon transitions. Post-effects run after, for effects that need the newly matched node.
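
The five steps above can be sketched roughly as follows. This is a minimal illustration, not the ADR's implementation: `Builder`, `step`, the string-based `matcher`, and the trimmed-down `Effect` enum are hypothetical stand-ins, and the `Frame` stack from the Enter/Exit steps is reduced to a single saved watermark.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Effect {
    StartArray,
    PushElement,
}

/// Simplified transition (assumption: a matcher is just a node-kind string).
struct Transition {
    matcher: Option<&'static str>, // None = epsilon (no node consumed)
    pre_effects: Vec<Effect>,
    post_effects: Vec<Effect>,
}

/// Append-only effect log; the watermark is its length at a point in time.
#[derive(Default)]
struct Builder {
    effects: Vec<Effect>,
}

impl Builder {
    fn watermark(&self) -> usize {
        self.effects.len()
    }

    fn rollback(&mut self, mark: usize) {
        self.effects.truncate(mark);
    }
}

/// Runs one transition against `node_kind`; returns whether it succeeded.
fn step(t: &Transition, node_kind: &str, b: &mut Builder) -> bool {
    // 1. Enter: remember where the effect log stood.
    let mark = b.watermark();
    // 2. Pre-effects: emitted before the match can overwrite `current`.
    b.effects.extend(t.pre_effects.iter().cloned());
    // 3. Match: on failure, discard everything emitted since Enter.
    if let Some(kind) = t.matcher {
        if kind != node_kind {
            b.rollback(mark);
            return false;
        }
    }
    // 4. Post-effects: emitted once the new `current` is in place.
    b.effects.extend(t.post_effects.iter().cloned());
    // 5. Exit: nothing to pop in this reduced sketch.
    true
}
```

Because the watermark is taken before the pre-effects are emitted, a failed match leaves the effect log exactly as it was on entry.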
198200

199201
#### Example
200202

@@ -208,24 +210,26 @@ Func = (function_declaration
208210

209211
Input: `function foo(a, b) {}`
210212

211-
Effect stream:
213+
Effect stream (annotated with pre/post classification):
212214

213215
```
214-
StartObject
215-
(match "foo")
216-
Field("name")
217-
StartArray
218-
(match "a")
219-
ToString
220-
PushElement
221-
(match "b")
222-
ToString
223-
PushElement
224-
EndArray
225-
Field("params")
226-
EndObject
216+
pre: StartObject
217+
(match "foo")
218+
post: Field("name")
219+
pre: StartArray
220+
(match "a")
221+
post: ToString
222+
post: PushElement
223+
(match "b")
224+
post: ToString
225+
post: PushElement
226+
post: EndArray
227+
post: Field("params")
228+
post: EndObject
227229
```
228230

231+
Note: In the raw graph, effects live on epsilon transitions between matches. The pre/post classification determines where they land after epsilon elimination. `StartObject` and `StartArray` are pre-effects (setup before matching). `Field`, `PushElement`, `ToString`, and `End*` are post-effects (consume the matched node or finalize containers).
232+
229233
Execution trace:
230234

231235
| Effect | current | stack |
@@ -304,27 +308,31 @@ Same structure, different `next` order. The first successor has priority.
304308
Array construction uses epsilon transitions with effects:
305309

306310
```
307-
T0: ε + StartArray next: [T1]
308-
T1: ε (branch) next: [T2, T5] // try match or exit
311+
T0: ε + StartArray next: [T1] // pre-effect: setup array
312+
T1: ε (branch) next: [T2, T4] // try match or exit
309313
T2: Match(expr) next: [T3]
310-
T3: ε + PushElement next: [T1] // loop back
311-
T4: ε + EndArray next: [T5]
312-
T5: ε + Field("items") next: [...]
314+
T3: ε + PushElement next: [T1] // post-effect: consume matched node
315+
T4: ε + EndArray next: [T5] // post-effect: finalize array
316+
T5: ε + Field("items") next: [...] // post-effect: assign to field
313317
```
314318

319+
After epsilon elimination, `PushElement` from T3 merges into T2 as a post-effect. `StartArray` from T0 merges into T2 as a pre-effect on the first iteration only, because loop iterations enter from T3, not T0.
320+
315321
Backtracking naturally handles partial arrays: truncating the effect stream removes uncommitted `PushElement` effects.
316322

317323
### Scopes
318324

319325
Nested objects from `{...} @name` use `StartObject`/`EndObject` effects:
320326

321327
```
322-
T0: ε + StartObject next: [T1]
328+
T0: ε + StartObject next: [T1] // pre-effect: setup object
323329
T1: ... (sequence contents) next: [T2]
324-
T2: ε + EndObject next: [T3]
325-
T3: ε + Field("name") next: [...]
330+
T2: ε + EndObject next: [T3] // post-effect: finalize object
331+
T3: ε + Field("name") next: [...] // post-effect: assign to field
326332
```
327333

334+
`StartObject` is a pre-effect (merges forward). `EndObject` and `Field` are post-effects (merge backward onto preceding match).
335+
328336
### Tagged Alternations
329337

330338
Tagged branches use `StartVariant` to create explicit tagged structures.
@@ -420,19 +428,28 @@ struct Interpreter<'a> {
420428

421429
### Epsilon Elimination (Optimization)
422430

423-
After initial construction, epsilon transitions can be eliminated by computing epsilon closures:
431+
After initial construction, epsilon transitions can be eliminated by computing epsilon closures. The `pre_effects`/`post_effects` split is essential for correctness here.
432+
433+
**Why the split matters**: A match transition overwrites `current` with the matched node. Effects from *preceding* epsilon transitions (like `PushElement`) need the *previous* `current` value. Without the split, merging them into a single post-match list would use the wrong value.
424434

425435
```
426-
Before:
427-
T0: ε + StartArray next: [T1]
428-
T1: ε + Field next: [T2]
429-
T2: Match(kind) next: [T3]
436+
Before (raw graph):
437+
T1: Match(A) next: [T2] // current = A
438+
T2: ε + PushElement next: [T3] // pushes A (correct)
439+
T3: Match(B) next: [...] // current = B
430440
431-
After:
432-
T0': Match(kind) + [StartArray, Field] next: [T3']
441+
After elimination (with split):
442+
T3': pre: [PushElement], Match(B), post: [] // PushElement runs before Match(B), pushes A ✓
443+
444+
Wrong (without split, effects merged as post):
445+
T3': Match(B) + [PushElement] // PushElement runs after Match(B), pushes B ✗
433446
```
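
The difference between the correct and the wrong merge can be reproduced with a toy model of `current`. The sketch below is illustrative only: `State`, `apply`, and `run` are hypothetical names, nodes are plain strings, and it assumes `PushElement` consumes whatever `current` holds at the moment it runs.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Effect {
    PushElement,
}

struct State {
    current: Option<&'static str>, // most recently matched node, if any
    array: Vec<&'static str>,      // array under construction
}

/// Applies one effect (assumption: PushElement moves `current` into the array).
fn apply(e: Effect, s: &mut State) {
    match e {
        Effect::PushElement => {
            if let Some(v) = s.current.take() {
                s.array.push(v);
            }
        }
    }
}

/// Runs one merged transition: pre-effects, then the match, then post-effects.
fn run(pre: &[Effect], matched: &'static str, post: &[Effect], s: &mut State) {
    for e in pre {
        apply(*e, s);
    }
    s.current = Some(matched); // the match overwrites `current`
    for e in post {
        apply(*e, s);
    }
}
```

Starting from `current = A`, running `Match(B)` with `PushElement` as a pre-effect pushes A, while the same effect placed as a post-effect pushes B instead.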
434447

435-
Effects from eliminated epsilons accumulate on the surviving match transition. This is why `effects` is `Vec<Effect>` rather than `Option<Effect>`.
448+
**Accumulation rules**:
449+
- Effects from incoming epsilon paths → accumulate into `pre_effects`
450+
- Effects from outgoing epsilon paths → accumulate into `post_effects`
451+
452+
This is why both are `Vec<Effect>` rather than `Option<Effect>`.
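
As a rough illustration of the accumulation rules, the sketch below collapses a purely linear chain of transitions. It is one possible policy under simplifying assumptions, not the ADR's algorithm: there is no branching or looping, epsilon effects always merge forward into the next match as `pre_effects`, and only epsilons trailing the final match merge backward as `post_effects`. `Raw`, `Merged`, and `eliminate_linear` are hypothetical names.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Effect {
    StartArray,
    PushElement,
    EndArray,
}

/// Raw graph node in a linear chain (assumption: no branches or loops).
#[derive(Debug, Clone, PartialEq)]
enum Raw {
    Epsilon(Vec<Effect>),
    Match(&'static str),
}

/// Surviving match transition after elimination.
#[derive(Debug, PartialEq)]
struct Merged {
    matcher: &'static str,
    pre_effects: Vec<Effect>,
    post_effects: Vec<Effect>,
}

fn eliminate_linear(chain: &[Raw]) -> Vec<Merged> {
    let mut out: Vec<Merged> = Vec::new();
    let mut pending: Vec<Effect> = Vec::new();
    for t in chain {
        match t {
            // Effects on an epsilon wait for the next match to absorb them.
            Raw::Epsilon(effects) => pending.extend(effects.iter().cloned()),
            // Incoming epsilon effects become this match's pre_effects.
            Raw::Match(kind) => out.push(Merged {
                matcher: *kind,
                pre_effects: std::mem::take(&mut pending),
                post_effects: Vec::new(),
            }),
        }
    }
    // Trailing epsilons have no following match, so they land on the last
    // match as post_effects. A chain with no match at all yields nothing.
    if let Some(last) = out.last_mut() {
        last.post_effects.append(&mut pending);
    }
    out
}
```

Several epsilons can feed a single match, which is where a `Vec<Effect>` is needed: in the chain `ε+StartArray, Match(A), ε+PushElement, Match(B), ε+PushElement, ε+EndArray`, the last match ends up with two post-effects.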
436453

437454
**Reference expansion**: For definition references, epsilon elimination propagates `Enter`/`Exit` markers to surviving transitions:
438455
