Open
47 commits
c4b01d5
docs(etl): pivot remediation plan from Queues+outbox to Workflows
andrew-bierman May 20, 2026
334fbdb
feat(etl): U1 Workflows spike (throwaway POC)
andrew-bierman May 20, 2026
9216908
feat(etl): U1 standalone spike worker — Workflows verified GO
andrew-bierman May 20, 2026
a32f738
feat(etl): U2 schema migration 0048 — Workflows-aware columns
andrew-bierman May 20, 2026
35a45d8
feat(etl): row-boundary-aligned R2 chunker (chunkCsvForR2)
andrew-bierman May 20, 2026
bcc7c9e
feat(etl): CatalogEtlWorkflow — durable ETL via Cloudflare Workflows
andrew-bierman May 20, 2026
eec7ec8
feat(etl): producer cutover — default to Workflow, retain queue fallback
andrew-bierman May 20, 2026
b99bb49
feat(etl): U4 validator hardening — close SSRF, IDN, length, charset …
andrew-bierman May 20, 2026
f92dacd
chore(etl): remove standalone spike worker; wire ETL_WORKFLOW into en…
andrew-bierman May 20, 2026
4af57df
fix(etl): drop `as Error` casts in CatalogEtlWorkflow
andrew-bierman May 20, 2026
fa6ceea
refactor(etl): slim U2 to workflow_instance_id + total_embedding_fail…
andrew-bierman May 20, 2026
1f8432f
feat(etl): U7 invalid_item_logs retention sweep
andrew-bierman May 20, 2026
a1f942c
feat(etl): U6 part 1 — structured logger + error propagation fixes
andrew-bierman May 20, 2026
53bb3a7
feat(etl): U5 (minimal) — workflow-aware retry + reconcile admin endp…
andrew-bierman May 20, 2026
64d1f67
docs(etl): U8 operator runbook for the Workflows pipeline
andrew-bierman May 20, 2026
4bac86e
fix(etl): CI failures — type errors, coverage threshold, hoisted mock
andrew-bierman May 20, 2026
4672fc8
feat(etl): migration 0050 — ETag fail-closed repair + supersession au…
andrew-bierman May 20, 2026
10dbf60
feat(etl): U6 part 2 — @sentry/cloudflare wiring
andrew-bierman May 20, 2026
cbae081
fix(etl): logger uses @packrat/guards type predicates instead of raw …
andrew-bierman May 20, 2026
0f1c057
fix(etl): drop isBoolean import — @packrat/guards doesn't export it
andrew-bierman May 20, 2026
cbab838
feat(etl): GET /api/admin/analytics/catalog/audit endpoint
andrew-bierman May 21, 2026
24423c1
refactor(db): drizzle.config schema path uses in-package re-export
andrew-bierman May 21, 2026
5187b6d
chore(etl): consolidate ETL migrations to single drizzle-kit-generate…
andrew-bierman May 21, 2026
9980ed4
fix: address P0/P1 review findings on ETL workflow
andrew-bierman May 21, 2026
51c77ed
docs: fix plan doc contradiction and stale runbook section
andrew-bierman May 21, 2026
a9e7c3e
🐛 fix: chunk boundary byte offset and retention returning type
andrew-bierman May 21, 2026
086ed13
🐛 fix(etl): address PR review feedback — chunker guards + docs
andrew-bierman May 21, 2026
5b031e1
Merge pull request #2462 from PackRat-AI/fix/etl-pipeline-workflows-m…
andrew-bierman May 21, 2026
8397d9e
fix: strip .csv from workflow instance ID (CF Workflows invalid_id)
andrew-bierman May 21, 2026
0f404b5
🧪 test(api): add unit tests for catalog ETL instanceId construction
andrew-bierman May 21, 2026
9513605
style(api): fix Biome useTemplate lint in instanceId test
andrew-bierman May 21, 2026
d32dff8
🛡️ fix(etl): handle malformed CSV rows gracefully instead of aborting
andrew-bierman May 21, 2026
7b1e6d2
🐛 fix(etl): reduce chunk size 20MB→5MB to prevent WorkflowTimeoutError
andrew-bierman May 21, 2026
f8f7be5
fix(api): clamp KV expirationTtl to minimum 60s (#2466)
mikib0 May 21, 2026
6353df4
🐛 fix(etl): use parser line number in on_skip error log
andrew-bierman May 21, 2026
36f1317
Merge pull request #2465 from PackRat-AI/fix/etl-catalog-sprint-fixes
andrew-bierman May 21, 2026
c64cf9b
✨ feat(etl): add JSONL/NDJSON support to catalog ETL pipeline
andrew-bierman May 21, 2026
603d281
🛠️ fix(json-utils): use @packrat/guards, add unit tests for coverage
andrew-bierman May 21, 2026
916732b
🛠️ fix(etl): replace unsafe casts with @packrat/guards, fix test
andrew-bierman May 21, 2026
2639f80
🛠️ fix(etl): address CR/Copilot comments — chunk skip, imports, types
andrew-bierman May 21, 2026
1b27205
🛠️ fix(json-utils): correct Biome import sort order
andrew-bierman May 21, 2026
4af87df
🛠️ fix(etl): drop explicit err type on on_skip to fix TS overload res…
andrew-bierman May 21, 2026
534e3f6
🛠️ fix(etl): guard err possibly-undefined in on_skip (TS18048)
andrew-bierman May 21, 2026
cd4e13e
🛠️ fix: use pre-computed `message` var in on_skip console.warn
andrew-bierman May 21, 2026
46da63e
🛠️ fix: collapse console.warn to single line for Biome formatter
andrew-bierman May 21, 2026
3af10be
fix(etl): capture csv pump promise to prevent silent hang on R2 errors
andrew-bierman May 21, 2026
ad4d009
🔀 chore: merge development into feat/jsonl-etl-support
andrew-bierman May 22, 2026
18 changes: 18 additions & 0 deletions CLAUDE.md
@@ -195,6 +195,24 @@ Defined in root `tsconfig.json`:
- Migrations: Drizzle Kit (`drizzle-kit`)
- Embeddings: pgvector with 1536 dimensions

### Migration discipline (read before touching `packages/api/drizzle/`)

1. **Always generate via drizzle-kit.** Edit `packages/api/src/db/schema.ts` (or `packages/db/src/schema.ts` for the shared workspace), then run from the API package:

```bash
cd packages/api && bun run db:generate
```

Drizzle-kit emits a random-name file like `0048_loud_squirrel_girl.sql`. That random name is fine — keep it. The naming convention here is "whatever drizzle-kit gives you."

2. **Do not rename a generated migration file.** The `meta/_journal.json` `tag` field, the migration SQL filename, and the snapshot filename all encode the migration identity together. Renaming any one of them (even with corresponding journal edits) makes the migration look hand-authored and creates drift that future drizzle-kit operations can mis-handle.

3. **Do not hand-edit `meta/_journal.json`, `meta/*_snapshot.json`, or the generated SQL.** If the generated migration is wrong, fix the schema, delete the bad migration + snapshot + journal entry, and regenerate. Do not patch around it.

4. **Collapse additive changes into one migration when they ship together** — fewer snapshot files in the diff, easier to revert as a unit. Splitting only makes sense when migrations need to land in separate releases.

5. **Verify after generating.** Run `bunx drizzle-kit check` from `packages/api/` — it validates the snapshot chain is internally consistent. Run before pushing.
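
The identity rule in items 2 and 3 can be illustrated with a small drift check. This is only a sketch, not part of the repository and not a replacement for `bunx drizzle-kit check`: the function name `findMigrationDrift`, the `JournalEntry` shape, and the assumption that snapshots are named after the tag's numeric prefix (for example `0048_snapshot.json`) are all hypothetical and should be checked against the actual contents of `packages/api/drizzle/`.

```typescript
// Hypothetical helper for illustration; names and file-layout assumptions are not from the repo.
interface JournalEntry {
  idx: number;
  tag: string; // e.g. "0048_loud_squirrel_girl"
}

interface MigrationDrift {
  missingSql: string[]; // journal tags with no matching .sql file
  orphanSql: string[]; // .sql files that no journal entry refers to
  missingSnapshots: string[]; // expected snapshot files that are absent
}

function findMigrationDrift(
  entries: JournalEntry[],
  sqlFiles: string[],
  snapshotFiles: string[],
): MigrationDrift {
  const journalTags = new Set(entries.map((e) => e.tag));
  // Strip the ".sql" suffix so filenames can be compared with journal tags.
  const sqlTags = new Set(
    sqlFiles.filter((f) => f.endsWith(".sql")).map((f) => f.slice(0, -4)),
  );
  const snapshots = new Set(snapshotFiles);

  const missingSql = Array.from(journalTags).filter((t) => !sqlTags.has(t));
  const orphanSql = Array.from(sqlTags)
    .filter((t) => !journalTags.has(t))
    .map((t) => `${t}.sql`);
  // Assumption: each snapshot is named "<numeric prefix of tag>_snapshot.json".
  const missingSnapshots = entries
    .map((e) => `${e.tag.split("_")[0]}_snapshot.json`)
    .filter((name) => !snapshots.has(name));

  return { missingSql, orphanSql, missingSnapshots };
}

// Example: the SQL file was renamed by hand, so the journal tag no longer matches it.
const drift = findMigrationDrift(
  [{ idx: 48, tag: "0048_loud_squirrel_girl" }],
  ["0048_add_workflow_columns.sql"],
  ["0048_snapshot.json"],
);
console.log(drift);
```

In this example the renamed file shows up twice: the journal tag lands in `missingSql` and the renamed file lands in `orphanSql`, which is the kind of drift the rules above are meant to prevent.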

## EAS Build Profiles

| Profile | Use | Distribution |
21 changes: 20 additions & 1 deletion bun.lock

