Conversation
…imize lexer perf (~34% faster)
… skip dollar/comma passes
a838967 to
e3919cc
Compare
There was a problem hiding this comment.
Pull request overview
This PR focuses on improving parser performance by eliminating intermediate allocations during lex/parse and by merging the $ and , tree-transform passes, while also adding a simple benchmark harness and updating project tooling/CI for Yarn 4.
Changes:
- Replace the previous token-buffer-based pipeline with a single-pass
lexAndBuild()parser insrc/index.ts. - Optimize tree transforms in
src/tree.ts(no-slice helpers + new combinedresolveDollarComma) and add equivalence tests. - Add benchmarking support (
src/bench.ts,tsconfig.bench.json,yarn bench) and update Yarn/CI/docs.
Reviewed changes
Copilot reviewed 11 out of 13 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
src/index.ts |
New single-pass lexer+builder and conditional $/, transform invocation. |
src/tree.ts |
No-slice comma/dollar helpers + new combined resolveDollarComma pass. |
src/parser.test.ts |
Adds equivalence tests to validate combined $/, behavior. |
src/bench.ts |
Adds a benchmark runner for local perf testing. |
tsconfig.bench.json |
Adds a bench-specific TS config (Node types, ES2022 target). |
package.json |
Adds bench script, bumps version, updates dev deps, pins Yarn via packageManager. |
Agents.md |
Adds contributor/agent perf/testing/tooling notes. |
.yarnrc.yml |
Configures Yarn to use node-modules linker. |
.gitignore |
Ignores bench artifacts and Yarn state files. |
.github/workflows/upload.yaml |
Updates Node setup + Yarn caching/immutable installs + uses yarn compile. |
.github/workflows/npm-publish.yaml |
Updates Node setup + Yarn caching/immutable installs + uses yarn compile. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ```bash | ||
| yarn test # run all 22 Jest tests | ||
| yarn bench # benchmark on bundled test fixtures (×20, ~16 KB) | ||
| BENCH_FILE=/absolute/path/to/large.cirru yarn bench # benchmark on a real file |
There was a problem hiding this comment.
This doc line hard-codes "run all 22 Jest tests", but src/parser.test.ts currently defines more than 22 tests (and later in this document it also mentions 27). Consider avoiding a specific count here (or update it) so it doesn't go stale.
| | File | Purpose | | ||
| | ----------------------- | ----------------------------------------------------------- | | ||
| | `src/index.ts` | Parser entry point; `lexAndBuild`, `parse`, `parseOneLiner` | | ||
| | `src/tree.ts` | Tree helpers: `resolveDollar`, `resolveComma`, utilities | | ||
| | `src/types.ts` | `ELexState`, `ELexControl`, `ICirruNode` | | ||
| | `src/parser.test.ts` | All 22 tests | | ||
| | `src/bench.ts` | Benchmark script | | ||
| | `tsconfig.bench.json` | Separate tsconfig for bench (es2022, node types) | | ||
| | `tsconfig-compile.json` | Library build config | | ||
| | `test/cirru/*.cirru` | Test input fixtures | | ||
| | `test/ast/*.json` | Expected AST outputs | | ||
| | `lib/` | Compiled library output (committed) | |
There was a problem hiding this comment.
The File Map claims lib/ is "Compiled library output (committed)", but .gitignore ignores lib/* and CI builds lib/ via yarn compile. Please clarify whether lib/ is expected to be committed or generated, and update this row accordingly to avoid confusing contributors.
| | File | Purpose | | |
| | ----------------------- | ----------------------------------------------------------- | | |
| | `src/index.ts` | Parser entry point; `lexAndBuild`, `parse`, `parseOneLiner` | | |
| | `src/tree.ts` | Tree helpers: `resolveDollar`, `resolveComma`, utilities | | |
| | `src/types.ts` | `ELexState`, `ELexControl`, `ICirruNode` | | |
| | `src/parser.test.ts` | All 22 tests | | |
| | `src/bench.ts` | Benchmark script | | |
| | `tsconfig.bench.json` | Separate tsconfig for bench (es2022, node types) | | |
| | `tsconfig-compile.json` | Library build config | | |
| | `test/cirru/*.cirru` | Test input fixtures | | |
| | `test/ast/*.json` | Expected AST outputs | | |
| | `lib/` | Compiled library output (committed) | | |
| | File | Purpose | | |
| | ----------------------- | ----------------------------------------------------------------------- | | |
| | `src/index.ts` | Parser entry point; `lexAndBuild`, `parse`, `parseOneLiner` | | |
| | `src/tree.ts` | Tree helpers: `resolveDollar`, `resolveComma`, utilities | | |
| | `src/types.ts` | `ELexState`, `ELexControl`, `ICirruNode` | | |
| | `src/parser.test.ts` | All 22 tests | | |
| | `src/bench.ts` | Benchmark script | | |
| | `tsconfig.bench.json` | Separate tsconfig for bench (es2022, node types) | | |
| | `tsconfig-compile.json` | Library build config | | |
| | `test/cirru/*.cirru` | Test input fixtures | | |
| | `test/ast/*.json` | Expected AST outputs | | |
| | `lib/` | Compiled library output (generated; not committed—run `yarn compile`) | |
|
|
||
| - Framework: Jest + `ts-jest` | ||
| - Test file: `src/parser.test.ts` | ||
| - Fixtures: `test/cirru/*.cirru` (source) + `test/ast/*.json` (expected output) | ||
| - Every optimization must keep all **27 tests passing**. |
There was a problem hiding this comment.
The testing section states "keep all 27 tests passing", but src/parser.test.ts currently has 26 test(...) cases. To avoid future drift, consider removing the exact number here (or ensure it matches the actual Jest test count).
| | `src/index.ts` | Parser entry point; `lexAndBuild`, `parse`, `parseOneLiner` | | ||
| | `src/tree.ts` | Tree helpers: `resolveDollar`, `resolveComma`, utilities | | ||
| | `src/types.ts` | `ELexState`, `ELexControl`, `ICirruNode` | | ||
| | `src/parser.test.ts` | All 22 tests | |
There was a problem hiding this comment.
The File Map entry for src/parser.test.ts says "All 22 tests", but the test file currently contains more than 22 test cases. Consider removing the specific number (or updating it) so the file map stays accurate over time.
| | `src/parser.test.ts` | All 22 tests | | |
| | `src/parser.test.ts` | Jest tests for the parser | |
| case CHAR_LPAREN: | ||
| if (isOdd(indentCount)) throw new Error(`Invalid indentation size ${indentCount}`); | ||
| flushIndent(indentCount >> 1); | ||
| emitOpen(); | ||
| state = ELexState.space; | ||
| indentCount = 0; | ||
| break; | ||
| default: | ||
| if (isOdd(indentCount)) throw new Error(`Invalid indentation size ${indentCount}`); | ||
| flushIndent(indentCount >> 1); | ||
| state = ELexState.token; | ||
| tokenStart = pointer - 1; | ||
| indentCount = 0; | ||
| break; |
There was a problem hiding this comment.
In ELexState.indent, a leading ) is currently treated as the start of a token (falls through to default) instead of emitting a close paren. This makes ) behave inconsistently depending on lexer state and can break inputs where a line begins with ) (it would produce a literal ")" token instead of closing a list). Add a CHAR_RPAREN case here that mirrors the CHAR_LPAREN handling (validate indent, flushIndent(...), then emitClose() and switch to space).
| const emitClose = () => { | ||
| const completed = current!; | ||
| if (stack.length === 0) { | ||
| result.push(completed); | ||
| current = null; | ||
| } else { | ||
| throw new Error(`Unexpected chunk ${JSON.stringify(chunk)}`); | ||
| current = stack.pop()!; | ||
| current.push(completed); | ||
| } | ||
| } | ||
| }; | ||
| }; |
There was a problem hiding this comment.
emitClose() uses a non-null assertion (current!) and will throw a runtime TypeError if a close paren or indentation-driven close is encountered when current is null (e.g., malformed input with an extra )). The previous implementation threw a controlled parse error for unexpected closes; consider guarding emitClose() and throwing an explicit Error (e.g. unexpected ")").
| state = ELexState.escape; | ||
| break; | ||
| case CHAR_NEWLINE: | ||
| throw new Error("Expected newline in string"); |
There was a problem hiding this comment.
The error message thrown on newline inside a string literal is misleading: throw new Error("Expected newline in string") is triggered because a newline was encountered before the closing quote (i.e., an unterminated/invalid string). Consider updating the message to something like "Unexpected newline in string" or "Unterminated string literal" to match the actual error condition.
| throw new Error("Expected newline in string"); | |
| throw new Error("Unexpected newline in string"); |
No description provided.