refactor: @packrat/utils facade + two-tier guards + layered enforcement#2535
refactor: @packrat/utils facade + two-tier guards + layered enforcement#2535andrew-bierman wants to merge 99 commits into
Conversation
Adds scripts/lint/no-weak-assertions.ts mirroring the no-raw-typeof.ts
shape. Catches four coverage-theater patterns in *.test.ts files:
- assertion-free-test : it() / test() blocks with no expect() or
expect-helper calls (expectUnauthorized,
expectJsonResponse, etc.)
- only-tobedefined : every direct expect() ends in a non-specific
matcher (.toBeDefined / .toBeTruthy /
.toBeFalsy / .not.toBeUndefined / .not.toBeNull)
AND no expect-helper is called.
(.toBeUndefined() and .toBeNull() alone are
NOT flagged — they assert specific return
values.)
- bare-tohavebeencalled : .toHaveBeenCalled() without a matching
.toHaveBeenCalledWith / .toHaveBeenCalledTimes
in the same block.
- large-snapshot : toMatchInlineSnapshot() body > 50 lines.
File-level escape hatch: `// no-weak-assertions: disable` in the first 5
lines of a file skips the entire file.
Add `bun lint:weak-assertions` and `bun test:scripts` plus a
`scripts/vitest.config.ts` for the scripts test suite. The check is NOT
wired into `scripts/check-all.ts` yet — running it against the repo
surfaces ~27 pre-existing violations (mostly .toHaveBeenCalled() and
.toBeDefined()-only blocks). Wiring into check-all.ts will follow once
those are addressed in a separate cleanup PR.
Includes 16 unit tests for the analyzer covering each rule's positive
and negative cases, the disable comment, helper-function detection, and
the it.todo / it.skip allowance.
Supersedes the 2026-05-17 plan. The threshold ramp (its U2) was obsoleted by upstream work landed between 2026-05-17 and 2026-05-19 (api/expo/analytics/mcp/overpass are now at 95%+ thresholds with refined exclude lists and added middleware/image/storage tests). The remaining novel work — ratchet (U6) + assertion-strength lint (U9) + docs migration (U1) — survives unchanged in intent; baselines must be re-captured against upstream's current configs.
Move the testing guide out of the repo root per the "no random md in
root" convention (only CLAUDE.md + README.md belong there). The new
docs/testing.md is rewritten around current reality:
- Per-workspace Vitest thresholds (95%+ across packages/api, apps/expo,
packages/mcp; 80% on packages/{analytics,overpass}; 100% on
packages/units).
- The coverage ratchet (docs reference; the script itself lands in the
next commit).
- The assertion-strength lint (docs the four rules + escape hatch).
- The two-layer enforcement model: Vitest thresholds enforce the floor,
the ratchet enforces no-regression toward target.
Update CLAUDE.md with a Testing Policy summary linking to the new doc.
Update copilot-instructions.md, README.md, and the android-textinput
solutions doc that referenced the old path. Delete the dead
test:api-client:types script that points at a non-existent vitest
config.
The coverage ratchet is the in-repo gate that fails CI if any tracked
workspace's coverage drops below its baseline (modulo a small epsilon
to absorb v8 instrumentation noise). Pairs with the per-config Vitest
thresholds that upstream landed — thresholds enforce the floor, the
ratchet enforces no-regression toward the tier target and prevents
the "lower threshold to unblock, never re-raise" pattern.
New files:
- coverage-baselines.json : committed source of truth at repo root.
Keyed by workspace path; records the
summaryPath, tier, four-metric floor, and
recordedAt. Initial baselines captured
from actual coverage runs on 2026-05-19:
packages/api 98.31 / 95.43 / 100 / 98.31
apps/expo 97.36 / 95.00 / 100 / 97.36
packages/mcp 98.87 / 98.38 / 100 / 98.87
packages/analytics 84.48 / 83.33 / 89.13 / 84.48
packages/overpass 100 / 95.65 / 100 / 100
packages/units 100 / 100 / 100 / 100
Epsilon 0.5 absorbs v8 jitter.
- scripts/lint/coverage-ratchet.ts : the gate. Exit 1 on any
regression, missing
summary, or malformed
summary. Exit 0 on
equality / improvement.
- scripts/lint/coverage-baseline-update.ts : CI-only on `main` — bumps
baselines upward after a
green run. Never lowers.
- scripts/lint/__tests__/coverage-ratchet.test.ts : 13 tests covering
the compare logic,
missing/invalid summaries,
epsilon noise tolerance,
multi-metric regressions,
and loadBaseline parsing.
New package.json scripts:
- bun check:coverage : run the ratchet against current summaries
- bun check:coverage:update : bump baselines (used by the consolidated
coverage CI workflow once it lands)
NOT wired into scripts/check-all.ts in this commit — running the
ratchet requires per-workspace coverage data to exist on disk first.
The consolidated coverage CI workflow (follow-up plan) runs all
matrix coverage jobs before invoking the ratchet; locally, contributors
run `bun check:coverage` explicitly after a coverage suite.
bun format:package-json moves check:coverage / check:coverage:update and lint:weak-assertions / test:scripts into alphabetical position within the scripts block. No behaviour change.
Replaces unit-tests.yml with a single Coverage workflow that runs
coverage for every tracked workspace in a matrix, posts per-workspace
PR comments via davelosert/vitest-coverage-report-action, aggregates
the summaries, and runs `bun check:coverage` as a hard gate against
regression.
Workflow shape:
coverage (matrix) → run vitest --coverage per workspace
│ upload coverage-summary.json artifact
▼ post PR comment with delta
ratchet → download all summaries
run `bun check:coverage` against
coverage-baselines.json
fails on any regression / missing summary
│
▼ (main only)
bump-baseline → re-run check:coverage:update
auto-commit any baseline improvements
Plus a scripts-tests job that runs `bun test:scripts` (16 lint analyzer
tests + 13 ratchet tests) on every PR.
Tracked workspaces in the matrix: packages/api (unit), apps/expo,
packages/mcp, packages/analytics, packages/overpass, packages/units.
fail-fast: false so one workspace's regression doesn't mask another's.
Path filters are loose by design — any change under apps/** or
packages/** triggers the matrix. The narrow path filters in the
previous unit-tests.yml were how 12 workspaces ended up untracked
in CI. The matrix's own job-level conditions skip work cheaply when
nothing to run.
Permissions: contents:write for the main-only baseline auto-commit;
pull-requests:write for the coverage report comments. The PR-time
coverage step explicitly guards on \`github.event_name == 'pull_request'\`
so push events don't try to comment on nothing.
Deletes unit-tests.yml — fully subsumed by the new workflow.
Coordinate the branch-protection required-check rename from
"Unit Tests" → "Coverage / Coverage Ratchet" before merging.
The vitest-coverage-report-action step joins working-directory with json-summary-path internally, so passing a repo-relative summary path produced './apps/expo/apps/expo/coverage/unit/coverage-summary.json' and failed every matrix job with ENOENT. The test commands themselves ran successfully and the artifacts uploaded — only the PR comment step errored. Split the matrix into two path sets: - summary_path / final_path : repo-relative (artifact upload + ratchet) - summary_relative / final_relative : working-directory-relative (action)
Mark the 2026-05-19 audit-remediation plan as superseded and replace it with a Workflows-based plan that natively provides the durable-step + idempotency + retry + state semantics the prior plan reconstructed manually on top of Queues + Postgres. Audit findings about CSV correctness, validator hardening, observability, retention, and the operational runbook carry into the new plan; the queue-as-state-machine subplot is dropped. Net unit count drops from 15 to 9. Also includes the underlying audit (docs/audits/2026-05-16-etl-audit.md) that grounds both plans.
Validates the integration before committing to the Workflows migration:
R2 byte-range reads, csv-parse inside step.do, Drizzle Neon HTTP query,
durable step.sleep, and step result persistence. Workflow takes
{ objectKey, source } params; trigger via wrangler workflows trigger
spike-etl-workflow ... --env=dev and observe in the dashboard.
Adds:
- packages/api/src/workflows/spike-etl-workflow.ts (the workflow class)
- packages/api/src/index.ts exports SpikeEtlWorkflow
- packages/api/wrangler.jsonc declares the workflows[] binding
Per the plan (docs/plans/2026-05-20-001-fix-etl-pipeline-workflows-migration-plan.md
U1), this is throwaway. Delete the workflow file, the index.ts export,
and the wrangler binding after the GO/NO-GO decision lands U3's
production CatalogEtlWorkflow.
Pivoted from the in-app spike to a standalone worker because the dev deploy of packrat-api requires Docker (App Container) and Docker is not installed locally. Standalone worker has zero container surface and only the bindings the spike actually exercises. Spike rewritten to use the native R2 binding (env.PACKRAT_SCRAPY_BUCKET) instead of the AWS S3 client — removes the R2_ACCESS_KEY_ID secret dependency. Drizzle/Neon validation deferred to U3 (validates on the production worker that already has NEON_DATABASE_URL). Result on real prod data (cotopaxi_2026-05-14T16-54-05.csv, 698 KB): status=complete, duration=7s 1-r2-head: size=698620 etag=4397... ok 2-r2-range-read: 698134 bytes 3-csv-parse: 100 rows 4a/4b/4c-sleep: Δ=5043ms (5s sleep + ~40ms wake overhead) 5-memoize-marker: persisted in instance history GO. Workflows host R2 + csv-parse + step.sleep + step result persistence cleanly inside step.do. Proceed to U2 (Drizzle migration 0048). Adds: - packages/api/wrangler.spike.jsonc (standalone worker config) - packages/api/src/spike-entry.ts (thin /trigger endpoint) - packages/api/src/workflows/spike-etl-workflow.ts rewritten The standalone worker packrat-etl-spike.orange-frost-d665.workers.dev should be deleted via `wrangler delete --config=wrangler.spike.jsonc` after U3 lands the production CatalogEtlWorkflow.
Adds eight columns to etl_jobs for the Workflows-based ETL: - workflow_instance_id (nullable text) — links the etl_jobs row to its Workflows instance for admin dashboards - verified_at, verified_row_count (nullable) — post-ingestion R2-source row-count verification - total_embedding_failures (integer DEFAULT 0 NOT NULL) — observable degradation signal when the embedding service fails inside a chunk - superseded_by_job_id (FK to etl_jobs.id, ON DELETE SET NULL) + superseded_at — preserves the audit trail when an operator triggers repair-from-scratch - source_etag, source_last_modified — captured at job start, compared by the repair endpoint to fail closed when the R2 source has been overwritten Constraints + indexes: - CHECK etl_jobs_no_self_supersede prevents a row from superseding itself - Index etl_jobs_workflow_instance_id_idx (admin lookups) - Index etl_jobs_superseded_by_idx (repair-chain lookups) - UNIQUE catalog_item_etl_jobs_catalog_job_idx (catalog_item_id, etl_job_id) so retried chunk upserts can use ON CONFLICT DO NOTHING and not accumulate duplicate provenance rows Also fixes the long-standing stale drizzle.config.ts schema path (./src/db/schema.ts → ../db/src/schema.ts); the schema was extracted to @packrat/db in merge b14f4db but the config pointer was not updated, so db:generate failed before this commit. The Workflows binding is the source of truth for chunk lifecycle and retry semantics; the columns above are only DB-side denormalization for admin queries. Verification: - drizzle-kit check: Everything's fine - scripts/lint/check-drizzle-migrations.ts: Drizzle migration checks passed - biome lint: clean Schema smoke test at packages/api/test/db-schema-etl.test.ts asserts the columns + indexes + CHECK constraint + UNIQUE index against the Docker Postgres wsproxy. Run via bun test:api once docker-compose.test.yml is up.
Closes audit P1 #3, #4, #5 — the chunk boundary bugs where a CSV row spanning a 20 MB byte-range chunk would be either dropped, invalidated, or duplicated. The new helper snaps each chunk's byteEnd to the byte immediately before a newline by reading a small (64 KiB default) tail window and locating the last \n. Throws ChunkBoundaryError if the peek window has no newline so a row wider than 64 KiB fails loudly. Tail peek reads are issued in parallel via Promise.all so the producer endpoint's CPU budget stays bounded on multi-GB files. Single-object- parameter shape matches existing ETL functions. 5 unit tests cover: small-file single-chunk; multi-chunk newline alignment; concatenation completeness; ChunkBoundaryError on no-newline; row-boundary preservation across chunks. All pass via bun test:unit. Used by the new CatalogEtlWorkflow and by the retry / repair-from-scratch admin endpoints (next units).
Replaces packages/api/src/services/etl/processCatalogEtl.ts +
queue.ts as the catalog ingest engine. Producer cutover lands next
(separate commit) — for now both paths coexist; the queue handler in
src/index.ts still routes to processQueueBatch for ?engine=queue
callers during the bake window.
Workflow structure per source CSV:
for each chunk in params.chunks:
step.do('chunk-N', { retries: 3, backoff: exp, timeout: 5min },
() => processChunk(...))
step.do('aggregate') -> UPDATE etl_jobs totals from memoized chunk results
step.do('reconcile') -> csv-parse the R2 source for logical row count
step.do('reconcile-write') -> UPDATE verified_at + verified_row_count
step.do('finalize') -> UPDATE status='completed', completedAt
Audit closures inherited via the chunkCsvForR2 helper:
- P0 #1 (premature completion) — workflow instance state IS job state;
the finalize step is the single transition to 'completed'
- P0 #2 (swallowed errors) — Workflows surface failed steps with full
retry history; no DLQ table needed
- P1 #3/#4/#5 (chunk boundary bugs) — closed by the producer using
newline-aligned ChunkSpec; consumer drops skipPartialRow
- P1 #1/#2 (retry endpoint, stuck-job sweep) — closed by workflow
instance lifecycle (retry endpoints trigger new instances; stuck
detection is via dashboard, not a wall-clock cron)
- P1 #3 specifically — header re-fetch uses a bounded 4K → 16K → 64K
expand loop, throws EtlHeaderError if no newline anywhere in 64 KiB
Counter writes inside the chunk step (via existing
processValidItemsBatch / processLogsBatch) may double-count on a
chunk retry; the aggregate step at the end writes the authoritative
totals from memoized chunk results, overriding any retry drift.
wrangler.jsonc workflows binding switched from the throwaway
SPIKE_ETL_WORKFLOW to ETL_WORKFLOW (class CatalogEtlWorkflow). The
standalone spike worker (wrangler.spike.jsonc) is untouched and can
be torn down independently via wrangler delete --config=wrangler.spike.jsonc.
Test stub at src/__test-stubs__/cloudflare-workers.ts extended with
minimal WorkflowEntrypoint / WorkflowStep types so unit tests can
import workflow code without the real Cloudflare runtime.
Verification:
- All 17 unit-test files pass (304 tests) including the chunker tests
- biome check clean on all touched files
- Runtime verification (full deploy + trigger) blocked on Docker daemon
for the production worker; can be exercised once Docker is up.
Modifies POST /catalog/etl to trigger a CatalogEtlWorkflow instance per source CSV by default. The query parameter ?engine=queue keeps the legacy queue path available so operators can roll back if the workflow path misbehaves in production. Workflow path: - Calls chunkCsvForR2 per source object to produce newline-aligned ChunkSpec[] (closes audit P1 #3, #4, #5 on the retry surface as well as the initial ingest surface). - Captures source_etag + source_last_modified from the first object's R2 head and persists them on the etl_jobs row. The admin repair-from-scratch endpoint (U5) compares the stored etag against the live R2 head to fail closed when a source has been overwritten. - Generates a deterministic Workflows instance ID `${source}-${filename}` so duplicate triggers for the same file return the existing instance rather than producing parallel ingests. Queue path: - Unchanged from existing behavior — same 20 MB byte-range splits and queue.sendBatch. - Kept until the workflow path bakes for at least a week in production (per migration plan rollout); removal in a follow-up PR. Env type extended in env-validation.ts to expose ETL_WORKFLOW: Workflow so the route handler can type-check the env.ETL_WORKFLOW.create call. Unit tests still pass (17 files, 304 tests). The full end-to-end verification (POST /catalog/etl?engine=workflow → workflow instance → DB rows → reconcile → finalize) requires the production worker deploy, which is gated on Docker for the AppContainer build — that path is unchanged by this commit.
…gaps Closes audit P3 #2. The previous CatalogItemValidator.isValidUrl accepted anything new URL() parsed — including javascript:, mailto:, data:, file:, and any private/loopback IP. Catalog URLs render in the mobile app and the guides site, so a scraper bug or supply-chain compromise could trick the UI into rendering a homograph phishing link or a server-side fetch into hitting internal infrastructure. Validator now rejects: - Schemes other than http: and https: - URLs > 2048 chars - Loopback (localhost, 127.x.x.x, ::1), RFC-1918 (10/8, 172.16-31/12, 192.168/16), link-local (169.254/16), IPv6 link-local (fe80:), and ULA (fc00:/fd00:) hostnames — string-level pattern match only, no DNS resolution (DNS resolution would itself be an SSRF vector) - Hostnames containing non-ASCII characters that survive WhatWG URL encoding (IDN homograph defense in depth) Length caps on prose fields: - name 500, description 50,000, brand 200, category 200 - SKU 200 chars + /^[A-Za-z0-9_./-]+$/ charset 15 unit tests cover every reject path plus the boundary-allowed cases. All 319 tests in the unit suite pass.
…v.dev The spike worker (packrat-etl-spike) was deleted from the Cloudflare account; the throwaway files referencing it no longer have a deployed counterpart, so removing them keeps the worktree clean and the PR diff focused on the production migration. env.dev workflows binding added so the dev deploy of packrat-api actually receives ETL_WORKFLOW. Top-level workflows[] does not inherit into envs that explicitly redeclare other bindings (wrangler 4.92 behavior).
Strict cast linter (check:casts:strict) rejects unchecked `as Error` even when narrowing unknown from a catch. Replace with a clean `instanceof Error ? err : new Error(String(err))` guard so the parser.destroy call always receives a real Error.
…ures Narrows U2's schema additions from 8 columns to 2 after PR-shaping discussion. Most of the originally-scoped columns existed to support audit findings whose consumers ship in later PRs: - verified_at / verified_row_count — reconcile UI / U10 - superseded_by_job_id / superseded_at — repair endpoint / U5 - source_etag / source_last_modified — fail-closed repair guard / U5 Adding them now would create dead schema with no reader, so each follow-up unit adds its column when it lands. Net change: zero indexes, zero CHECK constraints, zero UNIQUE constraints, no FK self-reference. This is about as low-risk as a migration can be. What stays (both load-bearing from day one): - workflow_instance_id text — admin/debug link from etl_jobs to the CF Workflows instance; null on legacy queue-path rows, set on workflow-path rows - total_embedding_failures integer DEFAULT 0 NOT NULL — observable embedding-fallback degradation counter (audit P2 #3) - etl_jobs_workflow_instance_id_idx — supports the lookup pattern Workflow simplifications follow: - Dropped the reconcile + reconcile-write steps (no verified_* columns to write into); workflow now runs chunk-N × N → aggregate → finalize - Dropped reconcileSourceRowCount helper (orphaned with the steps) - Dropped source_etag / source_last_modified capture in the producer Plan doc updated with a scope-adjustment note on U2 explaining the narrowing; original 8-column rationale preserved for context. Verification: - drizzle-kit check ✓ - check-drizzle-migrations.ts ✓ - 18 unit-test files, 319 tests, all pass - biome check clean on all touched files
Bounded-batch DELETE for invalid_item_logs older than 90 days, wired to a daily 09:00 UTC CF Cron Trigger via a new scheduled handler arm in src/index.ts. Why batched: a naive single-statement DELETE on a table that has been accumulating for months would acquire row-level locks on millions of rows in one statement, hit Neon's statement timeout, and roll back having pruned nothing. The loop deletes 10k-row chunks via WHERE id IN (SELECT id ... LIMIT N) RETURNING id and counts the returned rows. Stops on empty batch. Caps at 100 iterations (1M rows / run) so a first-run with months of backlog can't monopolize the daily window — the cap is reported in the RetentionResult so operators can see when more rows remain. Defaults are sensible: 90-day window, 10k batch, 100-iter cap. Overridable per-call via options. Wrangler config gets a top-level + env.dev triggers.crons entry. First cron in this worker, so the scheduled() handler in src/index.ts is brand new — dispatches on controller.cron string and throws on unknown crons so a misconfigured trigger fails loudly. 5 unit tests cover empty-first-batch, multi-batch accumulation, the iteration cap, and the retentionDays fallback. All 324 tests in the unit suite pass. Real-DB integration coverage deferred to U9 (needs Docker Postgres).
Splits U6's "Sentry wiring + structured logger + error propagation"
deliverable. This PR ships the parts that don't need a new dependency:
- Thin structured logger at packages/api/src/utils/logger.ts emits
JSON lines with { level, event, ts, ...ctx }. To log an error,
attach it under ctx.err — the emit boundary unpacks errorName /
errorMessage / errorStack so the contract that error stacks never
contain raw CSV row data is enforceable by code review at one
site (the logger), not every call site
- processLogsBatch rethrows on DB failure (audit P2 #2) — silently
swallowing meant the only forensic record of validation failures
could disappear without anyone noticing
- processValidItemsBatch embedding-fallback path atomically
increments etl_jobs.total_embedding_failures (audit P2 #3) so
operators see degradation in the admin endpoint without trawling
logs; warning log at the call site for the per-batch event
- All console.log calls in the touched files replaced with
logger.info / logger.warn / logger.error
Sentry wiring (@sentry/cloudflare with withSentry({ fetch, workflow,
queue, scheduled })) is deferred to a follow-up PR. Justification:
adding a new dep changes the lockfile, adds ~30 KB to the bundle, and
needs compat verification against the mobile app's @sentry/react-native.
Reviewers should see that as its own concern, not bundled with
correctness fixes. The logger's emit() boundary is the wire-up point
when the follow-up lands — each call site upgrades for free.
Verification: 19 unit-test files, 324 tests pass. biome clean.
…oints Adds the two operator-facing surfaces that close the gap left by the plan's U5 scope-down. Defers repair-from-scratch and ETag fail-closed verification to follow-up PRs — workflow retry is enough to re-ingest the 7 historical false-failures from 2026-05-14, and ETag verification is defense in depth that operators can do manually for the one-time recovery. Migration 0049 adds two columns: - verified_at timestamp (nullable) - verified_row_count integer (nullable) Both written exclusively by the new reconcile endpoint. POST /admin/etl/:jobId/retry — rewritten to trigger a CatalogEtlWorkflow instance instead of a queue message. Works for both legacy queue-era failed jobs and workflow-era failed jobs (the new instance always uses chunkCsvForR2 for newline-aligned chunks). Instance ID is suffixed with the new jobId so duplicate retries don't collide. Response now includes workflowInstanceId so the admin UI can deep-link to the dashboard. POST /admin/etl/:jobId/reconcile — synchronously counts logical rows in the R2 source via csv-parse (NOT raw \n counting; quoted multi-line fields would skew that) and persists the result on verified_at + verified_row_count. Returns expectedRowCount / actualRowCount / delta. Large files may exceed the fetch budget — async-via-workflow is a follow-up if needed. EtlRetrySchema gets a workflowInstanceId field; EtlReconcileSchema is new. Both in @packrat/schemas/admin. Verification: drizzle-kit check + custom migration linter clean, check-casts:strict clean, biome clean, 19 unit-test files / 324 tests all pass. Reset-stuck endpoint (POST /admin/etl/reset-stuck) is unchanged — its wall-clock-based design is wrong (closed by the audit P1 #2) but the fix is to delete it once Workflows is the only ingest path. Deferred to the queue-path-removal PR.
New runbook at docs/runbooks/etl-pipeline.md covering: - Architecture (producer → workflow → DB; cron arms) - The ?engine=workflow|queue flag + coexistence-window context - How to trigger an ETL run - How to inspect workflow instances (wrangler workflows commands) - How to retry a failed job - How to reconcile a job's row count against R2 - DLQ / forensic record (the Workflows dashboard is the record; no DLQ table) - The 7-job historical recovery procedure with SQL + curl - Invalid-item-logs retention (daily 09:00 UTC sweep) - Draining the legacy queue path when ready for deletion - Admin dashboard field semantics under the Workflows architecture (workflow_instance_id, verified_*, total_embedding_failures, etc.) - Accepted limitations (no soft-delete, success_rate quirk on failed jobs, sync reconcile bounded by fetch budget, no ETag fail-closed on retry yet, embedding cost on chunk retry) - Historical recoveries appendix (stub for the 2026-05-14 recovery to be filled in when executed post-deploy) - References (audit, plans, CF docs) First runbook in docs/runbooks/ — establishes the convention.
Three real fixes plus one coverage exclusion:
1. packages/db/src/schema.ts: restore the `AnyPgColumn` type import.
It was dropped when U2 was slimmed (the FK self-reference on
`superseded_by_job_id` went away with it), but `post_comments`
still uses AnyPgColumn for its own parent_comment_id self-reference.
2. invalidLogRetention.ts: drop the `.returning({ id: ... })` typed
shape and use bare `.returning()`. The union of three Drizzle
driver types (neon-http / neon-serverless / node-postgres) accepts
only the no-arg overload at the intersection; the typed shape
tripped TS2554. Row count is computed from `.length` anyway.
3. invalidLogRetention.test.ts: replace the `__mockDb` cross-module
handle (TS2305: not an export) with vi.hoisted() state shared
between the mock factory and the tests. Cleaner and type-safe.
4. vitest.unit.config.ts: add `src/workflows/catalog-etl-workflow.ts`
to coverage exclude. The chunker sibling (src/workflows/shared/) is
still covered (5 unit tests at 100%). The workflow class needs the
real CF Workflows runtime for end-to-end execution; integration tests
in /test pick it up when Docker Postgres is wired (deferred per the
PR's "deferred to follow-up" list).
Plus: new unit tests for `logger.ts` (10 tests, 100% coverage) so the
new file doesn't drop the coverage threshold by itself.
Coverage now at 98.63% statements / 95.33% branches (was 76.76% / 95.16%).
20 unit-test files, 331 tests, all pass. bun check-types clean.
Does not address:
- `api-tests` install failure (Fail extracting tarball for
@sentry/cli-linux-x64) — that's a transient registry / CI runner
issue, not something this PR can fix. A retry should clear it.
…dit trail Adds the columns + endpoints that were originally part of U5's full scope but deferred during the U2 slim-down. Now landing together so the post-merge operational story is complete. Migration 0050 adds to etl_jobs: - source_etag text — captured by the producer from r2.head().etag - source_last_modified timestamp — same; redundant with etag but cheap - superseded_by_job_id text — FK to etl_jobs.id (ON DELETE SET NULL), written by retry + repair endpoints to link the new job back to the original - superseded_at timestamp — when the supersession was recorded - CHECK constraint etl_jobs_no_self_supersede prevents a row from superseding itself - Index etl_jobs_superseded_by_idx supports the dashboard's "show me the repair chain for cotopaxi" lookup Producer (POST /catalog/etl): - Captures sourceEtag + sourceLastModified from the first object's chunkCsvForR2 head; writes to etl_jobs on insert Retry (POST /admin/etl/:jobId/retry): - Refactored into a shared reingestJob() helper used by retry + repair-from-scratch - Before triggering the new workflow, calls r2.head() and compares live etag against the stored sourceEtag — returns 409 ETL_ETAG_MISMATCH unless ?force=true. Skips the check when the stored etag is NULL (legacy queue-era rows, including the 7 false-failures from 2026-05-14) - New job row carries supersededByJobId pointing at the original + supersededAt timestamp New endpoint POST /admin/etl/:jobId/repair-from-scratch: - Same shape as retry but accepts completed jobs too. Use case: operator suspects an originally-completed job under-counted (the audit's R8 "trace the repair chain" requirement) Also adds @sentry/cloudflare ^10.37.0 to packages/api/package.json (install lands in this commit but wiring is in the next one). Verification: drizzle-kit check + custom linter clean, check-casts:strict clean, biome clean, 20 unit-test files / 331 tests pass, tsc clean.
Closes the Sentry deferral from U6 part 1 now that the dependency is
installed. Wires Sentry into three surfaces:
1. Worker default export wrapped with withSentry(optionsCallback, handler)
— initializes Sentry on first invocation; uncaught fetch / queue /
scheduled exceptions land in Sentry with request + queue + cron context
attached automatically by the SDK.
2. CatalogEtlWorkflow wrapped with instrumentWorkflowWithSentry — every
step.do span + any uncaught throw inside a step lands in Sentry with
workflow name + instance id + step name + attempt count attached.
3. logger.ts emit() boundary forwards to Sentry when isInitialized():
- logger.info/warn → Sentry.addBreadcrumb (correlated with next captureException)
- logger.error({ err }) → Sentry.captureException with ctx fields as tags/extras
- logger.error without err → Sentry.captureMessage(level=error)
Forwarding is best-effort and try/catch-wrapped — failures here never
break the call site (JSON console line is the durable record).
Sentry options shared between handler + workflow:
{ dsn: env.SENTRY_DSN, environment: env.ENVIRONMENT,
tracesSampleRate: 0.1, release: env.CF_VERSION_METADATA?.id }
wrangler.jsonc:
- Adds `nodejs_als` compatibility flag (required by @sentry/cloudflare's
AsyncLocalStorage-based context propagation across awaits)
- Adds `upload_source_maps: true` so wrangler deploy uploads sourcemaps
to Cloudflare — unminified stack traces in wrangler tail and the
Workers dashboard. Sentry-side symbolication is a separate
@sentry/cli sourcemaps upload step (documented in runbook; no CI
deploy pipeline exists today to automate it).
Runbook updated with a new "Sentry observability" section documenting
wiring, tag conventions, and the optional Sentry-side sourcemap upload.
Verification:
- 20 unit-test files, 331 tests pass (logger tests still pass; Sentry
isInitialized() returns false in tests so forwarding is silently
skipped — JSON output to console unchanged)
- bun check-types clean
- biome check clean
- check-casts:strict clean
…typeof Pre-push no-raw-typeof linter rejected the raw `typeof v === ...` chain introduced in the Sentry forwarder. Replaced with isString/isNumber/ isBoolean from @packrat/guards, matching the rest of the codebase. No behavior change; same path classification (primitives → Sentry tags, objects/arrays → Sentry extras).
radash (which @packrat/guards re-exports) provides isString, isNumber, isObject, etc. but not isBoolean. Use a direct === true || === false check instead — passes the no-raw-typeof linter and reads cleaner than inventing a wrapper.
Single-query per-source data-quality audit served from the API instead of requiring scrapyd (or any other consumer) to talk to the DB directly. The SQL stays where the schema lives; consumers authenticate with the existing admin JWT and never need NEON_DATABASE_URL. Flags surfaced per source (computed server-side from threshold constants returned alongside the report): - decimal_bug: count of prices < $10 with 3+ decimal places (the "1,299 → 1.299" parser bug from the existing scrapyd audit) - low_median: median < $20 on a non-allowlisted source - high_null:<field>: > 30% NULL on price / brand / description / weight / images / availability - bad_weight: count of weights < 1g or > 100kg - empty_name: count of empty / null names - stale: source has no completed ETL in 30+ days Query is a single CTE-based GROUP BY (DISTINCT ON for most-recent ingest source per item, then aggregate). One round-trip for all sources; ?source=<name> filters to one for ad-hoc debugging. Response schema CatalogAuditSchema lives in @packrat/schemas/admin so Eden Treaty consumers get end-to-end types. Verification: 20 unit-test files / 331 tests pass, tsc clean, biome clean, check-casts:strict clean. Used by scripts/audit_db_catalog.py in PackRat-ScrapyD#129 (next commit on that PR drops the direct-DB approach in favor of this endpoint).
Previous fix pointed drizzle.config.ts at ../db/src/schema.ts (relative path crossing the workspace boundary). Cleaner: add an in-package re-export at src/db/schema.ts that re-exports from @packrat/db/schema, and point drizzle.config back to ./src/db/schema.ts. drizzle-kit + any other drizzle-aware tooling now stays scoped to packages/api and is insulated from workspace layout changes. Schema source of truth still lives in packages/db/src/schema.ts.
…d migration Previous: three migrations (0048_etl_workflow_columns, 0049_etl_verification_cols, 0050_etl_etag_and_supersession) all generated by drizzle-kit but renamed post-generation, with hand-edited journal tags to match. That made the migrations look hand-authored and the rename+edit pattern is brittle. Now: one migration with whatever name drizzle-kit emits — the additive column changes (workflow_instance_id, total_embedding_failures, verified_at, verified_row_count, source_etag, source_last_modified, superseded_by_job_id, superseded_at + FK + indexes + check constraint) collapse cleanly into a single migration. Net diff impact: ~4,600 fewer lines (3 snapshots → 1). Updates CLAUDE.md with explicit migration discipline so this doesn't recur: - always generate via drizzle-kit - keep the random auto-generated name (do not rename) - never hand-edit journal / snapshots / SQL - collapse additive changes into one migration when they ship together - verify with drizzle-kit check before pushing Schema content is identical; verified via drizzle-kit check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The JSON migration incidentally changed logger.{info,warn,error} from the
convention-compliant single-object signature ({ event, ctx }) to positional
(event, ctx) and rewrote callers. Restore the object-arg API + callers (origin/
main shape); keep the migration's legit JSON circular-throw fix in emit(). 13
logger tests pass.
…ing) U7: jscpd copy-paste detection (.jscpd.json, threshold 7% over ~5.2% baseline, ts/tsx, scripts/lint/check-duplication.ts wrapper). U10 activation: flip no-raw-json (parse + stringify + tsx) to ERROR now the migration is done (exempt test/playwright/e2e dirs); wire no-duplicate-utils, check:provenance:strict, and check:duplication into check-all.ts, lefthook pre-push, lint:custom, and dedicated checks.yml CI steps. Exempt the safeJsonParse drop-in in no-owned-max-params. All new checks green: no-duplicate-utils 0, provenance 52=52, jscpd under threshold, ast-grep scan exit 0 with json at error.
…sts) Use (LIB_PRIORITY as readonly string[]).indexOf(lib) instead of casting the value to the named type — clears the one check-type-casts violation in the new check. Provenance check + 10 tests still green.
|
Important Review skippedToo many files! This PR contains 217 files, which is 67 over the limit of 150. To get a review, narrow the scope: ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (217)
You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Coverage Report for packages/analytics (./packages/analytics)
File Coverage
|
||||||||||||||||||||||||||||||||||||||
Deploying with
|
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs |
packrat-admin | 1662d05 | Commit Preview URL Branch Preview URL |
Jun 01 2026, 09:50 PM |
Deploying packrat-guides with
|
| Latest commit: |
1662d05
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://94978e4b.packrat-guides-6gq.pages.dev |
| Branch Preview URL: | https://refactor-utils-guards-harden.packrat-guides-6gq.pages.dev |
Coverage Report for packages/overpass (./packages/overpass)
File CoverageNo changed files found. |
Coverage Report for packages/units (./packages/units)
File CoverageNo changed files found. |
Coverage Report for packages/mcp (./packages/mcp)
File Coverage
|
||||||||||||||||||||||||||||||||||||||
Coverage Report for packages/api (./packages/api)
File Coverage
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Coverage Report for packages/utils (./packages/utils)
File Coverage
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Pull request overview
This PR introduces a new curated @packrat/utils facade, refactors @packrat/guards to sit on top of it (single lib-import boundary), and adds multiple CI-blocking enforcement layers (Biome restricted-imports, ast-grep structural rules, duplication detection, provenance validation) while migrating many call sites away from raw primitives and duplicated utilities.
Changes:
- Added
@packrat/utils(barrel + subpath exports) with provenance manifest + 100% coverage gates. - Replaced legacy regex-based lint scripts with ast-grep rules + a runner, and added new duplication/provenance checks wired into CI/pre-push.
- Migrated repo call sites away from raw
JSON.parse/JSON.stringifyand direct util-lib imports to the facade.
Reviewed changes
Copilot reviewed 156 out of 157 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| sgconfig.yml | Adds ast-grep project config pointing to rule + test directories. |
| scripts/lint/no-raw-typeof.ts | Deletes legacy regex-based typeof lint script (replaced by ast-grep). |
| scripts/lint/no-raw-regex.ts | Deletes legacy regex-based regex lint script (replaced by ast-grep). |
| scripts/lint/no-raw-ast-grep.ts | Adds ast-grep runner used by custom lint orchestration. |
| scripts/lint/no-owned-max-params.ts | Exempts packages/utils/src/json.ts due to intentional signature shape. |
| scripts/lint/no-duplicate-utils.ts | Adds a lint script to detect re-implementations of @packrat/utils exports. |
| scripts/lint/check-duplication.ts | Adds a jscpd runner for copy/paste duplication gating. |
| scripts/lint/tests/no-duplicate-utils.test.ts | Unit tests for no-duplicate-utils analyzer logic. |
| scripts/check-all.ts | Wires new checks (ast-grep, utils duplication, provenance, jscpd) into orchestrator. |
| packages/web-ui/src/components/chart.tsx | Replaces raw typeof string checks with guard predicate usage. |
| packages/utils/vitest.config.ts | Adds Vitest config for the new utils package + 100% coverage thresholds. |
| packages/utils/src/surface.test.ts | Adds surface-level tests ensuring re-exports are wired correctly. |
| packages/utils/src/string.ts | Adds curated string re-exports. |
| packages/utils/src/provenance.ts | Adds provenance manifest describing source lib per export. |
| packages/utils/src/provenance.test.ts | Adds tests enforcing manifest ↔ barrel sync and justification rules. |
| packages/utils/src/predicates.ts | Adds curated predicate re-exports (source for @packrat/guards). |
| packages/utils/src/object.ts | Adds curated object utilities re-exports. |
| packages/utils/src/math.ts | Adds curated math utilities re-exports. |
| packages/utils/src/json.ts | Adds safe JSON parse/stringify facade (destr + safe-stable-stringify). |
| packages/utils/src/json.test.ts | Adds tests for JSON facade behavior. |
| packages/utils/src/index.ts | Adds root barrel re-exporting category modules. |
| packages/utils/src/fn.ts | Adds curated function utilities re-exports. |
| packages/utils/src/async.ts | Adds curated async utilities re-exports. |
| packages/utils/src/array.ts | Adds curated array utilities re-exports. |
| packages/utils/package.json | Introduces @packrat/utils workspace package with subpath exports + deps. |
| packages/schemas/src/catalog.ts | Migrates JSON.parse usage to safeJsonParse(..., { strict: true }). |
| packages/schemas/package.json | Adds @packrat/utils dependency. |
| packages/mcp/src/tools/auth.ts | Migrates JSON.stringify usage to safeJsonStringify. |
| packages/mcp/src/resources.ts | Migrates JSON.stringify usage to safeJsonStringify. |
| packages/mcp/src/client.ts | Migrates JSON.stringify usage to safeJsonStringify; replaces raw typeof checks with guards. |
| packages/mcp/src/auth.ts | Migrates JSON.parse/stringify usage to safe utils for KV storage + parsing. |
| packages/mcp/package.json | Adds @packrat/utils dependency. |
| packages/guards/src/index.ts | Refactors guard predicate re-exports to source from @packrat/utils. |
| packages/guards/package.json | Removes direct radash dep; adds @packrat/utils. |
| packages/cli/src/index.ts | Migrates JSON.parse usage to safeJsonParse(..., { strict: true }). |
| packages/cli/src/commands/weather/index.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/user/index.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/trips/index.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/trails/index.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/templates/index.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/seasons/index.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/packs/list.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/packs/items.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/packs/get.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/packs/gap-analysis.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/feed/index.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/catalog/index.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/auth/register.ts | Migrates JSON.stringify request body to safeJsonStringify. |
| packages/cli/src/commands/auth/refresh.ts | Migrates JSON.stringify request body to safeJsonStringify. |
| packages/cli/src/commands/auth/login.ts | Migrates JSON.stringify request body to safeJsonStringify. |
| packages/cli/src/commands/ai/index.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/admin/users.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/admin/trails.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/admin/packs.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/admin/etl.ts | Migrates JSON.stringify usage to safeJsonStringify. |
| packages/cli/src/commands/admin/catalog.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/commands/admin/analytics.ts | Migrates JSON.stringify output to safeJsonStringify. |
| packages/cli/src/api/run.ts | Migrates JSON.stringify usage in error extraction to safeJsonStringify. |
| packages/cli/src/api/config.ts | Migrates JSON.parse/stringify for config persistence to safe utils (strict parse + safe write). |
| packages/cli/package.json | Adds @packrat/utils dependency. |
| packages/checks/vitest.config.ts | Adds Vitest config for checks package tests. |
| packages/checks/src/check-utils-provenance.ts | Adds CI check to validate utils provenance manifest vs actual exports. |
| packages/checks/src/check-utils-provenance.test.ts | Adds tests for provenance validator + real manifest invariants. |
| packages/checks/package.json | Adds provenance check scripts + tests. |
| packages/api/src/workflows/catalog-etl-workflow.ts | Migrates JSON.parse in ETL JSONL processing to strict safeJsonParse. |
| packages/api/src/utils/logger.ts | Uses utils JSON stringifiers (including an explicit throw-on-circular mode) and refactors logger API to object args. |
| packages/api/src/utils/json-utils.ts | Migrates raw JSON.parse to strict safeJsonParse; updates weight parsing call signature; uses catalog JSON parser wrapper. |
| packages/api/src/utils/csv-utils.ts | Uses @packrat/utils safeJsonParse under a domain-specific parseCatalogJson wrapper; migrates raw JSON.parse sites. |
| packages/api/src/utils/tests/logger.test.ts | Updates tests for the logger signature and behavior. |
| packages/api/src/utils/tests/embeddingHelper.test.ts | Updates getEmbeddingText call signature to object argument. |
| packages/api/src/utils/tests/csv-utils.test.ts | Renames safeJsonParse→parseCatalogJson in tests and updates expectations. |
| packages/api/src/utils/tests/auth.test.ts | Updates timingSafeEqual call signature to object argument. |
| packages/api/src/services/trails.ts | Migrates JSON.parse of geojson to strict safeJsonParse. |
| packages/api/src/services/r2-bucket.ts | Migrates JSON.parse to strict safeJsonParse; replaces typeof checks with isString guard. |
| packages/api/src/services/etl/processValidItemsBatch.ts | Updates logger calls to new object-arg signature. |
| packages/api/src/services/etl/processLogsBatch.ts | Updates logger calls to new object-arg signature. |
| packages/api/src/services/etl/processCatalogEtl.ts | Migrates JSON.parse in ETL JSONL processing to strict safeJsonParse. |
| packages/api/src/services/catalogService.ts | Uses safeJsonStringify for stable comparisons rather than raw JSON.stringify. |
| packages/api/src/routes/trails/index.ts | Migrates bbox/geojson JSON.parse to strict safeJsonParse. |
| packages/api/src/routes/packTemplates/index.ts | Migrates JSON.stringify request body to safeJsonStringify. |
| packages/api/src/routes/packs/index.ts | Migrates JSON.stringify usage in prompt context string to safeJsonStringify. |
| packages/api/src/routes/admin/trails.ts | Migrates bbox/geojson JSON.parse to strict safeJsonParse. |
| packages/api/src/index.ts | Migrates JSON.stringify response bodies to safeJsonStringify. |
| packages/api/src/auth/index.ts | Migrates JSON.parse of JWKS private keys to strict safeJsonParse. |
| packages/api/package.json | Adds @packrat/utils dependency; removes dead radash dep. |
| packages/api/container_src/server.ts | Migrates JSON.stringify in debug output to safeJsonStringify. |
| packages/api-client/src/index.ts | Migrates JSON.stringify request bodies + MCP helpers to safeJsonStringify; replaces typeof checks with isString guard. |
| packages/api-client/package.json | Adds @packrat/utils dependency. |
| packages/analytics/src/core/local-cache.ts | Migrates radash import to @packrat/utils. |
| packages/analytics/src/core/data-export.ts | Migrates JSON.stringify file write to safeJsonStringify. |
| packages/analytics/src/core/cache-metadata.ts | Migrates JSON.parse/stringify for metadata to strict safeJsonParse + safeJsonStringify. |
| packages/analytics/package.json | Adds @packrat/utils dependency; removes direct radash dep. |
| package.json | Adds scripts for new checks; adds devDeps for ast-grep/jscpd/ts-morph; updates custom lint script chain. |
| lefthook.yml | Wires new blocking checks (ast-grep, provenance, utils duplication, jscpd) into pre-push. |
| docs/utils-sweep-findings.md | Adds the sweep findings document used to seed facade design. |
| docs/utils-policy.md | Adds policy doc for utils facade usage and provenance contract. |
| docs/plans/2026-05-31-002-refactor-utils-guards-hardening-plan.md | Adds detailed plan doc for the refactor + enforcement pipeline. |
| coverage-baselines.json | Adds coverage baseline entry for packages/utils. |
| biome.json | Adds noRestrictedImports policy to enforce facade boundary (with utils override). |
| ast-grep-tests/no-raw-typeof-tsx-test.yml | Adds rule tests for TSX typeof linting. |
| ast-grep-tests/no-raw-typeof-test.yml | Adds rule tests for TS typeof linting. |
| ast-grep-tests/no-raw-regex-test.yml | Adds rule tests for regex linting. |
| ast-grep-tests/no-primitive-cast-tsx-test.yml | Adds rule tests for TSX primitive-cast warnings. |
| ast-grep-tests/no-primitive-cast-test.yml | Adds rule tests for TS primitive-cast warnings. |
| ast-grep-tests/snapshots/no-raw-typeof-tsx-snapshot.yml | Adds ast-grep label snapshots for TSX typeof rule. |
| ast-grep-tests/snapshots/no-raw-typeof-snapshot.yml | Adds ast-grep label snapshots for TS typeof rule. |
| ast-grep-tests/snapshots/no-raw-regex-snapshot.yml | Adds ast-grep label snapshots for regex rule. |
| ast-grep-tests/snapshots/no-primitive-cast-tsx-snapshot.yml | Adds ast-grep label snapshots for TSX primitive-cast rule. |
| ast-grep-tests/snapshots/no-primitive-cast-snapshot.yml | Adds ast-grep label snapshots for TS primitive-cast rule. |
| ast-grep-rules/PARITY.md | Documents parity between legacy regex lint scripts and ast-grep rules. |
| ast-grep-rules/no-raw-typeof.yml | Adds AST-based typeof rule (TS). |
| ast-grep-rules/no-raw-typeof-tsx.yml | Adds AST-based typeof rule (TSX). |
| ast-grep-rules/no-raw-regex.yml | Adds AST-based regex rule (TS). |
| ast-grep-rules/no-raw-regex-tsx.yml | Adds AST-based regex rule (TSX). |
| ast-grep-rules/no-raw-json.yml | Adds AST-based JSON.parse rule (TS). |
| ast-grep-rules/no-raw-json-tsx.yml | Adds AST-based JSON.parse rule (TSX). |
| ast-grep-rules/no-raw-json-stringify.yml | Adds AST-based JSON.stringify (single-arg) rule (TS). |
| ast-grep-rules/no-raw-json-stringify-tsx.yml | Adds AST-based JSON.stringify (single-arg) rule (TSX). |
| ast-grep-rules/no-raw-json-stringify-multi.yml | Adds AST-based JSON.stringify (multi-arg) rule (TS). |
| ast-grep-rules/no-raw-json-stringify-multi-tsx.yml | Adds AST-based JSON.stringify (multi-arg) rule (TSX). |
| ast-grep-rules/no-primitive-cast.yml | Adds AST-based primitive-cast warning rule (TS). |
| ast-grep-rules/no-primitive-cast-tsx.yml | Adds AST-based primitive-cast warning rule (TSX). |
| apps/web/package.json | Adds @packrat/utils dependency. |
| apps/web/app/auth/page.tsx | Migrates JSON.stringify request bodies to safeJsonStringify. |
| apps/trails/package.json | Adds @packrat/utils dependency. |
| apps/trails/lib/auth.ts | Migrates JSON.parse/stringify usage in auth token/user storage to safe utils (strict parse for JSON-encoded). |
| apps/guides/scripts/generate-content.ts | Migrates JSON.parse to strict safeJsonParse for AI output parsing. |
| apps/guides/scripts/build-content.ts | Migrates JSON.stringify to safeJsonStringify for codegen content output. |
| apps/guides/package.json | Adds @packrat/utils dependency. |
| apps/guides/app/dev/generate/page.tsx | Migrates JSON.stringify request bodies to safeJsonStringify. |
| apps/expo/utils/storage.ts | Migrates JSON.parse/stringify in Jotai storage adapter to safe utils. |
| apps/expo/package.json | Adds @packrat/utils dependency; removes direct radash dep. |
| apps/expo/lib/api/packrat.ts | Migrates JSON.parse to safeJsonParse for cookie store parsing (validated via zod). |
| apps/expo/features/weather/screens/LocationSearchScreen.tsx | Migrates JSON.parse/stringify to safe utils for AsyncStorage persistence. |
| apps/expo/features/trail-conditions/hooks/useTrailConditionReports.ts | Migrates JSON.parse/stringify to safe utils for report caching. |
| apps/expo/features/packs/screens/PackItemDetailScreen.tsx | Migrates JSON.stringify route param encoding to safeJsonStringify. |
| apps/expo/features/packs/screens/PackDetailScreen.tsx | Migrates JSON.stringify route param encoding to safeJsonStringify. |
| apps/expo/features/pack-templates/screens/PackTemplateItemDetailScreen.tsx | Migrates JSON.stringify route param encoding to safeJsonStringify. |
| apps/expo/features/pack-templates/components/FeaturedPacksSection.tsx | Migrates isArray import to @packrat/guards (removes radash usage). |
| apps/expo/features/catalog/lib/normalizeDescription.ts | Migrates JSON.parse to strict safeJsonParse inside try/catch. |
| apps/expo/features/auth/hooks/useAuthActions.ts | Migrates JSON.parse to strict safeJsonParse for redirect route parsing. |
| apps/expo/features/ai/lib/llamaToolsWrapper.ts | Migrates JSON.stringify usage to safeJsonStringify for tool output serialization. |
| apps/expo/features/ai/lib/CustomChatTransport.ts | Migrates JSON.stringify error serialization to safeJsonStringify. |
| apps/expo/features/ai/lib/appleModelWrapper.ts | Migrates JSON.stringify tool input serialization to safeJsonStringify. |
| apps/expo/features/ai/components/ToolInvocationRenderer.tsx | Migrates JSON.parse tool output decoding to safeJsonParse. |
| apps/expo/features/ai/atoms/chatStorageAtoms.ts | Migrates JSON.parse/stringify usage in chat persistence to strict safeJsonParse + safeJsonStringify. |
| apps/expo/atoms/atomWithSecureStorage.web.ts | Migrates JSON.parse/stringify to strict safeJsonParse + safeJsonStringify with try/catch. |
| apps/expo/atoms/atomWithSecureStorage.ts | Migrates JSON.parse/stringify to safeJsonParse + safeJsonStringify for SecureStore atoms. |
| apps/expo/atoms/atomWithKvStorage.ts | Migrates JSON.parse/stringify to safeJsonParse + safeJsonStringify for KV-store atoms. |
| apps/expo/atoms/atomWithAsyncStorage.ts | Migrates JSON.parse/stringify to safeJsonParse + safeJsonStringify for AsyncStorage atoms. |
| apps/admin/package.json | Adds @packrat/utils dependency. |
| apps/admin/components/raw-object-dialog.tsx | Migrates JSON.stringify rendering to safeJsonStringify. |
| apps/admin/components/analytics/catalog-analytics.tsx | Migrates JSON.stringify rendering to safeJsonStringify. |
| .jscpd.json | Adds jscpd config (threshold, ignores, formats) for duplication gate. |
| .github/workflows/coverage.yml | Adds packages/utils to coverage workflow matrix. |
| .github/workflows/checks.yml | Adds CI steps for utils duplication, provenance, and jscpd. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| (async () => { | ||
| const item = await AsyncStorage.getItem(key); | ||
| setValue(item ? JSON.parse(item) : initialValue); | ||
| setValue(item ? safeJsonParse<T>(item) : initialValue); | ||
| })(); |
| (async () => { | ||
| const item = await SecureStore.getItemAsync(key); | ||
| setValue(item ? JSON.parse(item) : initialValue); | ||
| setValue(item ? safeJsonParse<T>(item) : initialValue); | ||
| })(); |
| baseAtom.onMount = (setValue) => { | ||
| (async () => { | ||
| const item = await Storage.getItem(key); | ||
| setValue(item ? JSON.parse(item) : initialValue); | ||
| setValue(item ? safeJsonParse<T>(item) : initialValue); | ||
| })(); |
| getItem: async (key: string) => { | ||
| const value = await AsyncStorage.getItem(key); | ||
| return value ? JSON.parse(value) : null; | ||
| return value ? safeJsonParse<WeatherLocation[]>(value) : null; | ||
| }, | ||
| setItem: async (key: string, value: unknown) => { | ||
| await AsyncStorage.setItem(key, JSON.stringify(value)); | ||
| await AsyncStorage.setItem(key, safeJsonStringify(value)); | ||
| }, |
| // Rules enforced (error → fails CI): | ||
| // - no-raw-typeof → use @packrat/guards predicates instead of raw typeof | ||
| // - no-raw-regex → use magic-regexp instead of raw regex literals | ||
| // Rules at warning level (do NOT fail CI yet): | ||
| // - no-raw-json* → use @packrat/utils safeParse/safeStringify | ||
| // (the repo-wide JSON migration flips these to error) |
| # WARNING (not error): the repo-wide JSON migration (~156 sites) is a separate | ||
| # unit handled by the orchestrator. This rule surfaces JSON.parse call sites | ||
| # without failing CI yet. The `fix` rewrites JSON.parse($X) -> safeJsonParse($X); | ||
| # import insertion is out of scope here (the orchestrator's codemod handles it). | ||
| severity: error |
| # WARNING (not error): see no-raw-json.yml. Covers the single-argument form of | ||
| # JSON.stringify, which has a clean 1:1 autofix (JSON.stringify($X) -> | ||
| # safeJsonStringify($X)). Multi-arg calls (replacer / space) are handled by | ||
| # no-raw-json-stringify-multi.yml, which has no autofix. Import insertion is out | ||
| # of scope here (the orchestrator's codemod handles it). | ||
| severity: error |
| # WARNING (not error): multi-argument JSON.stringify (replacer / space). Flagged | ||
| # for the separate JSON-migration unit but intentionally NO autofix — there is | ||
| # no clean 1:1 rewrite, so the orchestrator's codemod handles these by hand. | ||
| # The single-arg form is covered (with autofix) by no-raw-json-stringify.yml; | ||
| # the `not` constraint below keeps the two rules from double-reporting. | ||
| severity: error |
Sync the utils-hardening branch onto the active integration branch (development was 524 commits ahead of the stale main this branched from). Conflict resolution combined both sides: - lint:custom: ast-grep (replacing retired regex scripts) + no-duplicate-utils + dev's no-direct-wrapped-imports; check-types -> dev's tsgo-capable wrapper. - atoms/secureStore: my safeJson imports + dev's expo-app/lib/secureStore wrapper. - api/index.ts: took dev's refactor (inline Sentry, CORS isAllowedOrigin, local Neon proxy) + re-applied my safeJsonStringify migration on its error responses. - pack screens: dev removed the auth-guard blocks (took dev's deletion). - Migrated 1 dev-introduced raw typeof (polyfills.ts) to isString. Re-verified merged tree: ast-grep clean, biome clean, no-duplicate-utils + provenance pass. (check:catalog's nativewindui mismatch is pre-existing on dev.)
…ngify drop-ins CI (tsc) caught what the dev box couldn't (full tsc OOMs here): safe-stable- stringify's configured fn is typed string|undefined, so swapping JSON.stringify (typed string) for safeJsonStringify broke ~10 string-typed call sites (storage setters, log lines, headers). Cast to `typeof JSON.stringify` so it's a true drop-in returning string — mirroring the convenient string type TS already gives JSON.stringify. Verified locally via the tsgo wrapper (no OOM). Also fix apps/expo/utils/storage.ts: it parsed inside a createJSONStorage string-storage adapter (only compiled via JSON.parse's any) — switch to the canonical createJSONStorage(() => AsyncStorage) so jotai owns the JSON.
…-guards-hardening # Conflicts: # bun.lock # package.json
…w-typeof) development's isCloneable() added `typeof x.clone === 'function'`; route through @packrat/guards isFunction to satisfy the now-error-level no-raw-typeof rule.
What & why
Harden the monorepo's utility/guard surface and enforce its use, killing duplicated code. Introduces a single curated
@packrat/utilsfacade, refactors@packrat/guardsto sit on top of it (one lib-import boundary), and adds a five-layer enforcement pipeline so reach-arounds, re-implementations, and raw primitives can't creep back in.Plan:
docs/plans/2026-05-31-002-refactor-utils-guards-hardening-plan.md.What shipped
Facade + guards (behavior-neutral)
@packrat/utils— single barrel + subpath exports (array/object/string/async/fn/math/json/predicates), 52 curated exports sourced best-types-first across radashi → radash → es-toolkit → lodash → remeda (provenance manifest records each source + justifies any lower-priority pick). 100% test coverage, wired into the coverage ratchet.safeJsonStringify(circular/BigInt-safe, order-preserving),stableJsonStringify,configureJsonStringify,safeJsonParse(destr;{ strict: true }preserves throw-on-invalid). Named*Json*to avoid colliding with zod's.safeParse.@packrat/guardsrefactored onto utils (two-tier) — generic predicates now sourced through the single lib boundary; public surface byte-for-byte identical (59 exports, characterized), so zero consumer impact.Enforcement (all CI-blocking unless noted)
noRestrictedImports— bans the 5 libs (+ destr / safe-stable-stringify, incl. submodules) outsidepackages/utils.no-raw-typeof+no-raw-regexported from the brittle regex scripts (parity-proven viaast-grep test, stricter — caught optional-chaining/bracket cases the regex missed),no-raw-json(error),no-primitive-cast(warning;as string/number/boolean→ coercion). Old regex scripts retired only after parity proof.jscpd— copy-paste detection (threshold 7% over ~5.2% baseline; ratchet down later).no-duplicate-utils— flags re-implementations of facade exports (names auto-derived from the manifest).check-utils-provenance— manifest ↔ exports in sync + priority justified.Migration
JSON.parse/JSON.stringify(~129 sites) routed through the facade (behavior-preserving — strict mode where throw mattered; caught + fixed a real regression whereloggerrelied onJSON.stringifythrowing on circular refs).radashdep from api/analytics/expo.csv-utilssafeJsonParseinto the facade.packages/apitype errors (parseWeight/getEmbeddingText/timingSafeEqual positional-vs-object-arg).Verification
check:castsclean.@packrat/utils100% coverage.Known pre-existing items (red on
maintoo — out of scope here)no-owned-max-params(6) — framework-shaped signatures (Cloudflarescheduled, Workflowrun, etc.). This PR's delta is zero (logger reverted to object-arg,safeJsonParsedrop-in exempted).expo-doctor— "packages match Expo SDK versions" (no expo-versioned packages changed here).packages/mcptests (18 failures) — confirmed present atmainHEAD.tsc --noEmitOOMs on the dev box (~8GB) — verified via vitest + biome + signature reading; CI runstsc.Follow-ups (deferred)
no-primitive-castwarnings; flip it to error.no-owned-max-params/ expo-doctor / mcp-test backlogs (separate PRs).