Add event import functionality with validation and command support#183
Merged
Conversation
…l timestamp - Accepts fully-populated *inventory.Event; preserves ev.TimestampUTC - Returns new DB-assigned autoincrement event_id - EntityPathChangedEvent is a no-op (returns 0, nil; no row inserted) - All other event types go through applyEventTx as normal Closes #176
- HasEvents: returns true if events table is non-empty (EXISTS short-circuit) - ClearAllData: deletes entities_current then events in FK-safe order; leaves schema_metadata intact Closes #177
… validation
- ImportOptions{Replace, Continue} and ImportResult{Replayed, Failed, Warnings, Errors}
- Non-monotonic event_id input returns error with no DB writes
- EntityPathChangedEvent records buffered for post-reparent validation, not replayed
- Path-changed count/payload mismatches increment Warnings via structured logger
- --continue accumulates per-event errors in ImportResult.Errors instead of aborting
- importRunner struct extracts per-record logic to keep cognitive complexity under limit
- 10 tests covering all acceptance criteria
Closes #178
…tput - cmd/import/db.go: importApp interface (ImportEvents + HasEvents) - cmd/import/import.go: NewImportCmd/NewDefaultImportCmd, --file/--replace/--yes flags - Reads NDJSON from stdin or --file, parses each line as ExportResult - Prints 'Replayed: N, Failed: M, Warnings: W' to stderr; suppressed by --quiet - app.HasEvents delegation added to *app.App to satisfy importApp interface - Registered in cmd/root.go - 5 tests covering all acceptance criteria Closes #179
- Non-empty DB without --replace → error before any replay - --replace without --yes → error, ClearAllData not called - --replace --yes → ClearAllData called before ImportEvents - --yes without --replace → silently accepted - --continue → passes Continue:true to ImportOptions, exits 0 on partial failure - app.ClearAllData delegation added to *app.App - Extract checkReplaceFlags/openInput/parseNDJSON to keep runImport under complexity limit - 5 new tests (10 total in cmd/import) Closes #180
Move ClearAllData ownership from the import command into App.ImportEvents via ImportOptions.Replace. The cmd layer now only decides whether to set Replace=true (based on --replace --yes); the app layer is the single owner of the destructive step. Removes the dead HasEvents probe in importEvents that discarded its boolean return — checking store responsiveness redundantly with the ClearAllData call that immediately followed. Updates TestRunImport_ReplaceAndYes_ClearsBeforeImport to assert the new cmd-layer contract (Replace=true passed through), and adds an app-layer test exercising opts.Replace=true end-to-end against a real SQLite app. No behavior change for the CLI user.
…d-import failure Add a Level 2 dry-validation pass to importEvents that runs after the monotonic event_id check and before the opts.Replace ClearAllData call. For every record: - ParseEventType must succeed (rejects unknown event types) - Payload must unmarshal into a JSON object (rejects malformed JSON, non-object payloads, truncated bytes) Both checks happen before any DB write. A file that would have left the database empty after ClearAllData + mid-replay failure now errors before the destructive step, leaving the original database intact. This is Level 2 of a four-level validation scale. Level 3 (per-event-type payload schema validation) is tracked as deferred follow-up work — it would require the app layer to import handler-specific payload schemas from internal/eventbus, which is a layering decision that deserves its own ADR. The --continue test was updated: it previously used an unknown event type to trigger per-event errors, which the new pre-flight pass now rejects unconditionally. Replaced with a rename-of-nonexistent-entity case that passes Level 2 validation but fails at replay time inside applyEventTx, which is what --continue is actually for.
bufio.Scanner's default 64 KiB per-token limit is too tight for a single NDJSON line that includes a long human-typed Note — import would fail with bufio.ErrTooLong on otherwise-valid input. Raise the per-line cap to 16 MiB. This is 4-5 orders of magnitude above realistic event sizes (typical payloads + notes are a few hundred bytes each) while still failing loudly on garbage files (e.g. binary blobs masquerading as NDJSON). The 16 MiB ceiling is also well below SQLite's default row-size limit, so the bottleneck stays at the parser with a clear error rather than deep in the storage layer. Also adds a CONTEXT.md glossary entry for Note making the canonical 'human-typed, used infrequently, not for blobs' constraint explicit — the size cap is the operational expression of that constraint, so they land together.
ImportResult.Warnings was an int counter. Callers could see 'how many warnings' but not 'which ones' — diagnostic detail was only visible via the structured logger, which the CLI summary did not surface. Split into two fields kept in lockstep: - WarningCount int (the count, was previously named Warnings) - Warnings []error (one descriptive error per increment) Invariant: len(Warnings) == WarningCount. A private importRunner.warn(err) helper bumps both fields together, so call sites stay one line and the two fields cannot drift. The CLI summary now prints the count on the existing summary line and one indented 'warning: <message>' line per entry in Warnings, suppressed by --quiet. This makes path-changed-mismatch diagnostics actionable without requiring the user to inspect logs. ADR 0005 updated to document the new warning surface.
…ReplayEvent Dispatch and ReplayEvent shared ~95% of their bodies — same INSERT SQL, same LastInsertId handling, same applyEventTx call — differing only in how TimestampUTC and EntityID were sourced. Extract a private Bus.writeEvent(ctx, *inventory.Event) (int64, error) that owns the INSERT + LastInsertId + applyEventTx sequence. Dispatch builds an *inventory.Event with a time.Now() timestamp and payload-parsed entity_id, then delegates. ReplayEvent's EntityPathChangedEvent early return is preserved at the public method, and every other event type delegates to the shared helper. The helper does a value-copy of its input (inserted := *ev) before setting EventID and calling applyEventTx, so callers' input structs are never mutated. Net diff: ~95 LOC → ~75 LOC in bus.go; one canonical write path for every future change. ADR 0004 updated with an implementation note: the public separation remains because the timestamp/entity_id sourcing genuinely differs, but any further divergence should be expressed as a parameter to writeEvent rather than re-splitting into two implementations.
ClearAllData hardcodes entities_current and events as the two tables it deletes. Per CONTEXT.md, entities_current is currently the only projection table — so the hardcoding is correct, not a missing abstraction. Add an inline comment block making the invariant and the tripwire explicit: adding a second projection table requires both updating ClearAllData and recording an ADR (the second projection breaks the documented single- projection invariant and that decision deserves to be captured). This is the comment-only version of the more elaborate projection-registry abstraction that was considered and rejected: with one projection, the registry is just a one-element slice, and the comment-as-tripwire makes the dependency discoverable without inventing the abstraction.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This pull request introduces a new
importcommand for the CLI, enabling users to restore inventory state by importing events from an NDJSON file. The implementation includes robust input handling, user safety checks for destructive operations, and a comprehensive suite of tests. It also updates documentation to clarify the import process and warning handling.Major features and changes:
New Import Command Implementation
importcommand (cmd/import/import.go,cmd/import/db.go) that reads events from an NDJSON file or stdin and replays them into the database, with options for replacing existing data, continuing on errors, and detailed summary output. [1] [2]cmd/root.go. [1] [2]Robust Input and Error Handling
CONTEXT.mdexplaining the rationale. [1] [2]--yesconfirmation for destructive--replaceoperations and preventing import if the database is non-empty unless explicitly replaced.Comprehensive Testing
cmd/import/import_test.go), covering valid/invalid input, error handling, quiet mode, large event notes, warning surfaces, and destructive operation safeguards.Documentation Updates
writeEventhelper and the handling of warnings inImportResult, in the ADR documentation. [1] [2]References:
[1] [2] [3] [4] [5] [6] [7] [8]