Merged
52 commits
88a8550
Plan: Add OpenSearch provider planning artifacts
bfarmer67 May 2, 2026
41becc0
Feature: Scaffold OpenSearch provider project
bfarmer67 May 2, 2026
18e8669
Test: Add OpenSearch Testcontainers harness and hello-world smoke test
bfarmer67 May 2, 2026
b2febba
Test: Hyperbee.Templating first-contact spike (Task 0.4)
bfarmer67 May 2, 2026
f6fbe8a
Feature: AST + Parlot grammar + safe-default merge middleware (Task 0.5)
bfarmer67 May 2, 2026
d74b65b
Test: Phase 0 spike wire-level integration tests (Task 0.6)
bfarmer67 May 2, 2026
dc958b8
Plan: Mark Phase 0 Task 0.6 checkboxes done
bfarmer67 May 2, 2026
0b40551
ADR: 0016 OpenSearch provider does not use file-level templating
bfarmer67 May 2, 2026
95825f0
Refactor: Remove Hyperbee.Templating dependency per ADR-0016
bfarmer67 May 2, 2026
bb4aea7
Docs: Align requirements/plan/design with ADR-0016 (no file-level tem…
bfarmer67 May 2, 2026
11f10ea
Docs: Update design spec Key Decisions section with all 6 ADR links
bfarmer67 May 2, 2026
70249f1
Feature: Phase 1 Slice A - Bootstrapper foundation (ADR-0014)
bfarmer67 May 2, 2026
ab17af8
Feature: Phase 1 Slice B (partial) - Index init steps + DI wiring
bfarmer67 May 2, 2026
9f356cc
Feature: Phase 1 - LockHandle + OpenSearchRecordStore (R-04, R-05, AD…
bfarmer67 May 2, 2026
baa08bf
Feature: Phase 1 - Foundation verbs grammar (R-08a)
bfarmer67 May 2, 2026
a337803
Plan: Update Phase 1 status (~70% done; statement compilers + integra…
bfarmer67 May 2, 2026
78191bf
Test+Fix: Phase 0+1 validated against real OpenSearch (17/17 pass)
bfarmer67 May 2, 2026
41f2cd3
Feature: Phase 1 Slice C.1 - StatementDispatcher (validated 27/27 aga…
bfarmer67 May 2, 2026
f6a6d90
Feature: Phase 1 Slice C.2 - OpenSearchResourceRunner (end-to-end mig…
bfarmer67 May 2, 2026
6514145
Feature: Phase 1 complete - ImplicitWaitMiddleware + R-24b lock tests
bfarmer67 May 2, 2026
147634f
Feature: Phase 2 Slice 2.1 - ALIAS verbs (R-16, NF-2)
bfarmer67 May 2, 2026
1708c32
Feature: Phase 2 Slice 2.2 - template, component, ISM policy verbs
bfarmer67 May 2, 2026
f2b529d
Feature: Phase 2 Slice 2.3 - MIGRATE INDEX composite (R-30)
bfarmer67 May 2, 2026
a7f2cd3
Feature: Phase 2 Slice 2.4 - WHEN VERSION + composed_of-aware refinement
bfarmer67 May 2, 2026
5628e29
Feature: Phase 2 Slice 2.5 - Down direction + R-19 partial-rollback l…
bfarmer67 May 2, 2026
3feb41e
Feature: Phase 3 Slices 3.1 + 3.2 - OpenSearch runner + samples projects
bfarmer67 May 2, 2026
95f3866
Docs: OpenSearch provider README - full statement syntax reference
bfarmer67 May 2, 2026
ef80dd4
Feature: Phase 3 Slice 3.4 - Authentication (Basic, ApiKey, mTLS) per…
bfarmer67 May 2, 2026
6e9df50
Refactor: Phase 3 Slice 3.5 - Body-source grammar with three resoluti…
bfarmer67 May 2, 2026
8d9b5b2
Test: Phase 2 Slice 2.11 - multi-node Testcontainers harness + R-28b …
bfarmer67 May 3, 2026
94cce59
Feature: Phase 3 Slice 3.2 - AWS SigV4 extension + endpoint loud-fail…
bfarmer67 May 3, 2026
92a8229
CI: Phase 3 Slice 3.6 - multi-node Testcontainers tests on every PR (…
bfarmer67 May 3, 2026
c85e586
Feature: Phase 2 Slice 2.8 - context filter (R-15) wired into resourc…
bfarmer67 May 3, 2026
20f8ee2
Feature: Phase 3 Slice 3.3 - BulkAllObservable wrapper (R-20)
bfarmer67 May 3, 2026
6336d1a
Feature: Phase 2 Slice 2.9 - WaitMode.PerMigration + NO WAIT modifier…
bfarmer67 May 3, 2026
167e881
Refactor: drop bodies/ subfolder convention from single-body samples
bfarmer67 May 3, 2026
8dcf44d
Test: Phase 2 Slice 2.12 - R-24c gap-fill (production scenarios c/d/g…
bfarmer67 May 3, 2026
f629a67
Docs: Phase 3 Slice 3.7 - AWS Managed OpenSearch scheduled validation…
bfarmer67 May 3, 2026
00a7694
Docs: Phase 3 Slice 3.8 - top-level docs include OpenSearch + templat…
bfarmer67 May 3, 2026
0e8a8e5
Docs: Phase 3 Slice 3.9 - ADR compliance audit (0001-0017)
bfarmer67 May 3, 2026
c4d87c2
Docs: archive completed OpenSearch provider plan
bfarmer67 May 3, 2026
e221887
chore: format code with dotnet format
invalid-email-address May 3, 2026
163196f
Hardening: EOF-anchor parser; close ADR-0009 + ADR-0016 audit soft spots
bfarmer67 May 3, 2026
3d32d00
Docs: ADR audit - mark ADR-0009 + ADR-0016 soft spots closed by 163196f
bfarmer67 May 3, 2026
939895a
chore: format code with dotnet format
invalid-email-address May 3, 2026
f408293
Hardening: ADR-0012 options-factory wiring + R-24c (f) coverage
bfarmer67 May 3, 2026
1d92bc0
Merge remote-tracking branch 'origin/devs/bfarmer/provider-opensearch…
bfarmer67 May 3, 2026
0b2eec0
chore: format code with dotnet format
invalid-email-address May 3, 2026
978acec
Docs: site - complete statement references for OpenSearch and Aerospike
bfarmer67 May 4, 2026
823430d
CI: full git history for multi-node workflow (fix Nerdbank.GitVersion…
bfarmer67 May 4, 2026
3e13527
Test: stabilize multi-node harness for shared CI runners
bfarmer67 May 4, 2026
b5a0343
CI: move multi-node workflow from PR trigger to nightly schedule
bfarmer67 May 4, 2026
92 changes: 92 additions & 0 deletions .github/workflows/multi_node_tests.yml
@@ -0,0 +1,92 @@
name: Multi-Node Integration Tests

# R-28b: multi-node Testcontainers Compose CI, run nightly and on manual dispatch.
# Spins up a 3-node OpenSearch cluster and runs the [TestCategory("MultiNode")]
# tests that exercise behaviors single-node Testcontainers fundamentally
# masks (GREEN-threshold, replica allocation, shard relocation under load,
# PA-2 lock-index replicas:0 invariant).
#
# This workflow is intentionally separate from the shared `Run Tests`
# workflow because:
# - It requires Docker (the shared workflow may not).
# - It is heavier than unit tests (3 JVMs at ~512MB each, ~30s cluster
# formation per test class).
# - It compiles the integration test assembly with EnableIntegrationTests
# (which flips the `#if INTEGRATIONS` gate) — a property-driven
# define-constants flip rather than a source-level edit.

on:
  schedule:
    # Nightly at 03:00 UTC. Multi-node Testcontainers (3 OpenSearch JVMs)
    # is too heavy and currently too flaky on shared `ubuntu-latest` PR
    # runners to gate PRs (connection-reset under load on a single-endpoint
    # connection pool, and inter-class container churn). The tests pass
    # locally; running them nightly catches regressions without holding up
    # PR merges. Stabilization for PR-trigger is tracked as follow-up work.
    - cron: '0 3 * * *'
  workflow_dispatch:

permissions:
  contents: read

concurrency:
  group: multi-node-${{ github.ref }}
  cancel-in-progress: true

jobs:
  multi-node:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: |
            8.0.x
            9.0.x
            10.0.x

      - name: Restore
        run: dotnet restore tests/Hyperbee.Migrations.Integration.Tests/Hyperbee.Migrations.Integration.Tests.csproj

      - name: Build (with EnableIntegrationTests)
        run: >-
          dotnet build
          tests/Hyperbee.Migrations.Integration.Tests/Hyperbee.Migrations.Integration.Tests.csproj
          -c Release
          --no-restore
          /p:EnableIntegrationTests=true

      - name: Run multi-node tests (TestCategory=MultiNode)
        # Tests use [TestCategory("MultiNode")] so this filter picks them up
        # without affecting other test classes. The MultiNode test class's
        # [ClassInitialize] spins up the 3-node cluster.
        # HYPERBEE_TESTS_SKIP_SINGLE_NODE=true bypasses the assembly-level
        # single-node container startup (Mongo, Postgres, Couchbase,
        # Aerospike, single-node OpenSearch) since the MultiNode tests
        # don't need any of them.
        env:
          HYPERBEE_TESTS_SKIP_SINGLE_NODE: "true"
        run: >-
          dotnet test
          tests/Hyperbee.Migrations.Integration.Tests/Hyperbee.Migrations.Integration.Tests.csproj
          -c Release
          -f net10.0
          --no-build
          --filter "TestCategory=MultiNode"
          --logger "trx;LogFileName=multinode.trx"
          --logger "console;verbosity=normal"
          /p:EnableIntegrationTests=true

      - name: Upload test results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: multi-node-test-results
          path: '**/*.trx'
          if-no-files-found: warn
4 changes: 4 additions & 0 deletions Directory.Packages.props
@@ -41,6 +41,10 @@
<PackageVersion Include="Nerdbank.GitVersioning" Version="3.9.50" />
<!-- Aerospike Provider -->
<PackageVersion Include="Aerospike.Client" Version="8.2.0" />
<!-- OpenSearch Provider -->
<PackageVersion Include="OpenSearch.Client" Version="1.8.0" />
<PackageVersion Include="OpenSearch.Net" Version="1.8.0" />
<PackageVersion Include="OpenSearch.Net.Auth.AwsSigV4" Version="1.8.0" />
<!-- Parsing -->
<PackageVersion Include="Parlot" Version="1.5.7" />
<!-- Testing Framework -->
4 changes: 4 additions & 0 deletions Hyperbee.Migrations.slnx
@@ -4,12 +4,14 @@
<Project Path="runners/Hyperbee.MigrationRunner.Couchbase/Hyperbee.MigrationRunner.Couchbase.csproj" />
<Project Path="runners/Hyperbee.MigrationRunner.MongoDB/Hyperbee.MigrationRunner.MongoDB.csproj" />
<Project Path="runners/Hyperbee.MigrationRunner.Postgres/Hyperbee.MigrationRunner.Postgres.csproj" />
<Project Path="runners/Hyperbee.MigrationRunner.OpenSearch/Hyperbee.MigrationRunner.OpenSearch.csproj" />
</Folder>
<Folder Name="/Samples/">
<Project Path="runners/samples/Hyperbee.Migrations.Aerospike.Samples/Hyperbee.Migrations.Aerospike.Samples.csproj" />
<Project Path="runners/samples/Hyperbee.Migrations.Couchbase.Samples/Hyperbee.Migrations.Couchbase.Samples.csproj" />
<Project Path="runners/samples/Hyperbee.Migrations.MongoDB.Samples/Hyperbee.Migrations.MongoDB.Samples.csproj" />
<Project Path="runners/samples/Hyperbee.Migrations.Postgres.Samples/Hyperbee.Migrations.Postgres.Samples.csproj" />
<Project Path="runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Hyperbee.Migrations.OpenSearch.Samples.csproj" />
</Folder>
<Folder Name="/Solution Items/">
<File Path="Directory.Build.props" />
@@ -40,6 +42,8 @@
</Folder>
<Project Path="src/Hyperbee.Migrations.Providers.Aerospike/Hyperbee.Migrations.Providers.Aerospike.csproj" />
<Project Path="src/Hyperbee.Migrations.Providers.Couchbase/Hyperbee.Migrations.Providers.Couchbase.csproj" />
<Project Path="src/Hyperbee.Migrations.Providers.OpenSearch/Hyperbee.Migrations.Providers.OpenSearch.csproj" />
<Project Path="src/Hyperbee.Migrations.Providers.OpenSearch.Aws/Hyperbee.Migrations.Providers.OpenSearch.Aws.csproj" />
<Project Path="src/Hyperbee.Migrations.Providers.MongoDB/Hyperbee.Migrations.Providers.MongoDB.csproj" />
<Project Path="src/Hyperbee.Migrations.Providers.Postgres/Hyperbee.Migrations.Providers.Postgres.csproj" />
<Project Path="src/Hyperbee.Migrations/Hyperbee.Migrations.csproj" />
4 changes: 2 additions & 2 deletions README.md
@@ -15,9 +15,9 @@ The Cron Helper uses HangFire Cronos.
### Features include:

* Easy integration
-* Supports **Aerospike**, **Couchbase**, **MongoDB** and **PostgreSQL**
+* Supports **Aerospike**, **Couchbase**, **MongoDB**, **OpenSearch**, and **PostgreSQL**
* Resource Migrations
-* Migrations can be defined as embedded resource files (SQL, N1QL, AQL, MongoDB commands, JSON documents) alongside code-based migrations, enabling database changes without recompilation.
+* Migrations can be defined as embedded resource files (SQL, N1QL, AQL, MongoDB commands, OpenSearch DDL, JSON documents) alongside code-based migrations, enabling database changes without recompilation.
* Preventing simultaneous migrations
* By default, Hyperbee Migrations prevents parallel migration runner execution.
* Profiles
60 changes: 60 additions & 0 deletions docs/decisions/0011-hybrid-parser-runtime-injection.md
@@ -0,0 +1,60 @@
# ADR-0011: Hybrid Parser+Runtime Injection for OpenSearch Safe Defaults

**Status:** Accepted
**Date:** 2026-05-02

## Context

The OpenSearch provider must apply safe defaults to prevent silent data corruption. Two are load-bearing:

- `op_type: create` injection on `REINDEX` request bodies (closes PM-3 from assessment 0002 — re-runs of a partially-completed reindex would otherwise double-write or skip new docs)
- `dynamic: strict` injection on `CREATE INDEX` mappings (eliminates mapping explosion; per R-17 must be component-template-aware: skipped when body has `composed_of`)

Two extreme architectures were rejected:

1. **Pure runtime middleware** (Approach A in `/nop:propose` for this provider) — applied during request dispatch on fully-built JSON. Cannot satisfy R-18's parse-time syntactic detection of unsafe ops with file/line/recognized-verb error context; component-template detection requires a JSON-tree walk on every dispatch; UNSAFE/NO WAIT justification token validation must happen at parse anyway. Existing providers (Couchbase, Aerospike, MongoDB) use pure runtime patterns, but those providers don't face JSON-body-merging hazards at OpenSearch's scale.

2. **Pure parser** (Approach B in propose) — AST emits a final correct payload; runtime is a thin transport. Cannot route logs through `SecretScrubber` (R-10/R-25), cannot emit structured WARN events from response paths, cannot observe Tasks API progress. Loses runtime observability entirely.

The assessment 0002 meta-finding established that *"documentation as a fix for correctness hazards on the laziest path is anti-pattern."* Safe defaults must be enforced in code, not documented in samples. The Independent Review's pattern claim (Red-Blue₂ Phase 3.75) was validated 4-of-5 contested, demanding parser-level enforcement for `op_type: create`, component-template-aware `dynamic: strict`, and `ALIAS SWAP` atomic-precondition.

The forces in tension: parse-time correctness (error messages, structural detection, AST-level intent) vs. runtime concerns (live request/response observation, secret scrubbing, structured event emission). Neither extreme satisfies the requirements.

## Decision

We will use a hybrid: parser owns *intent*, runtime owns *execution*.

**Parser layer (Parlot, per ADR-0001) produces:**
- AST nodes carrying safe-default flags (`op_type:create=true` on `REINDEX`, `dynamic:strict=auto` on `CREATE INDEX`)
- Component-template-aware flag computation (`dynamic:strict=auto` resolves to off when AST body has `composed_of`)
- Parse-time syntactic enumeration of unsafe operations (R-18) with file/index/recognized-verb error context
- UNSAFE/NO WAIT justification token validation (non-empty reason required)
- Semantic version comparison (R-15a) — parsed to `System.Version` at parse time
- `MIGRATE INDEX` composite verb decomposition into `CREATE INDEX` + `REINDEX` + `ALIAS SWAP` AST nodes (R-30)
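
The parser-side responsibilities listed above can be illustrated with a minimal C# sketch. Every type and method name here (`Statement`, `CreateIndex`, `Reindex`, `AliasSwap`, `AstSketch`, and the `DecomposeMigrateIndex` parameters) is hypothetical and chosen only for illustration; the provider's actual AST shapes and `MIGRATE INDEX` syntax may differ. It assumes the statement body is available as a `System.Text.Json.Nodes.JsonObject`.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json.Nodes;

// Hypothetical AST node shapes; not the provider's real types.
public abstract record Statement;
public sealed record CreateIndex(string Index, JsonObject Body, bool DynamicStrict) : Statement;
public sealed record Reindex(string Source, string Dest, bool OpTypeCreate) : Statement;
public sealed record AliasSwap(string Alias, string From, string To) : Statement;

public static class AstSketch
{
    // dynamic:strict=auto resolves to off when the body declares composed_of.
    public static bool ResolveDynamicStrict(JsonObject body) =>
        !body.ContainsKey("composed_of");

    // Decomposes a composite MIGRATE INDEX into three flagged AST nodes.
    // The parameter list is an assumption about what the verb carries.
    public static IReadOnlyList<Statement> DecomposeMigrateIndex(
        string alias, string source, string dest, JsonObject destBody) =>
        new Statement[]
        {
            new CreateIndex(dest, destBody, ResolveDynamicStrict(destBody)),
            new Reindex(source, dest, OpTypeCreate: true),
            new AliasSwap(alias, source, dest),
        };
}
```

In this sketch the flags are computed once, at parse time, and travel with the node; nothing here issues an HTTP request.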

**Runtime middleware layer applies:**
- `SafeDefaultMergeMiddleware` — merges AST flags into the JSON tree during request build
- `ImplicitWaitMiddleware` — issues scoped `_cluster/health` per `WaitMode` (R-12)
- `TasksApiPollMiddleware` — handles `?wait_for_completion=false` flow (R-11)
- `SecretScrubberSink` — wraps `ILogger`; redacts `SecretMarker` content-hashes from all output (R-10/R-25)

The two layers communicate through the AST. The parser cannot dispatch HTTP; the runtime cannot reject ill-formed grammar.
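
As a rough illustration of the runtime half, the following C# sketch merges already-resolved flags into a user-supplied JSON body. `SafeDefaultMergeSketch` and its methods are hypothetical names, not the actual `SafeDefaultMergeMiddleware` API. Two behaviors are assumptions not stated in this ADR: an explicit user-supplied `dynamic` or `op_type` value is left untouched, and `op_type` is placed under `dest`, where the OpenSearch reindex API expects it.

```csharp
using System;
using System.Text.Json.Nodes;

// Hypothetical merge helper; the real middleware may differ.
public static class SafeDefaultMergeSketch
{
    // CREATE INDEX: add mappings.dynamic = "strict" when the AST flag is on.
    public static JsonObject MergeCreateIndex(JsonObject body, bool dynamicStrict)
    {
        if (!dynamicStrict)
            return body;

        var mappings = body["mappings"] as JsonObject;
        if (mappings == null)
        {
            // Assumption: a non-object "mappings" value is left for the server to reject.
            if (body.ContainsKey("mappings"))
                return body;
            mappings = new JsonObject();
            body["mappings"] = mappings;
        }

        // Assumption: an explicit user value wins over the default.
        if (!mappings.ContainsKey("dynamic"))
            mappings["dynamic"] = "strict";

        return body;
    }

    // REINDEX: add dest.op_type = "create" when the AST flag is on.
    public static JsonObject MergeReindex(JsonObject body, bool opTypeCreate)
    {
        if (!opTypeCreate)
            return body;

        var dest = body["dest"] as JsonObject;
        if (dest != null && !dest.ContainsKey("op_type"))
            dest["op_type"] = "create";

        return body;
    }
}
```

The merge only reads flags and edits the JSON tree; deciding whether a flag applies (for example, detecting `composed_of`) stays on the parser side.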

## Consequences

**Easier:**
- Parse-time errors carry full positional context (file, statement index, recognized-verb-so-far) — operators don't debug runtime stack traces for grammar issues
- Component-template detection is structural (presence of `composed_of` key on the AST) — no fragile JSON-tree walking at runtime
- Safe-default behavior changes are localized: new safe-default → new AST flag + new merge rule; observability changes are middleware-only
- Consumers extending the grammar add AST nodes with flags; they don't write middleware
- Unit tests against the parser are fast and don't require an OpenSearch container

**Harder:**
- Two layers must stay coordinated; the merge logic in middleware must correctly handle arbitrary user-supplied JSON bodies without losing AST flag intent
- The riskiest assumption in this architecture: runtime middleware can correctly merge AST safe-default flags into user-supplied JSON. This must be validated via a Phase 1 spike before any other implementation work
- Documentation must distinguish "parser-resolvable" decisions (compile-time) from "runtime-resolvable" decisions (dispatch-time) — failing to teach this distinction breeds confusion among future maintainers

**Constrains:**
- Any new safe-default behavior must declare its intent at the AST level (parser-resolvable) AND provide a runtime merge path
- Extending grammar via consumer DI is a parser-side decision (Parlot grammar composition); extending observability is a middleware-side decision
- Future ADRs about parser changes must consider whether the change requires a corresponding middleware update
61 changes: 61 additions & 0 deletions docs/decisions/0012-with-production-defaults-extension.md
@@ -0,0 +1,61 @@
# ADR-0012: WithProductionDefaults() Extension Method (Not Environment Profile Enum)

**Status:** Accepted
**Date:** 2026-05-02

## Context

Several requirements coordinate dev-vs-prod safety defaults that must change together:

- `ClusterHealthThreshold` (R-03): Yellow / Green
- `WaitMode` (R-12): PerStatement / PerMigration
- `RequireUnsafeJustification` (R-18): false / true
- `ContextResolutionPolicy` (R-15): SkipIfUnset / RequireExplicit

In assessment 0002's Synthesis phase (Phase 2), the proposed solution was an `EnvironmentProfile = Development | Production` enum: one operator decision would flip all four behaviors. The synthesis explicitly flagged this as load-bearing — if the maintainer rejected the enum, the entire synthesis would collapse.

Independent Review (Phase 3.5) rejected the enum on three grounds:

1. **Hidden coupling** — flipping `Profile` silently flips four behaviors. The operator sees `Profile = Production` and must remember (or look up) what that implies. This is the laziest-path footgun the Mechanism Design analysis explicitly warns against.
2. **Contradicts a stated goal** — the user goal "same migrations run unchanged across all three topologies" applies to migration *files*, not DI configuration. An environment enum in DI re-introduces environment-aware switches that consumers reasoned about *not* having.
3. **Discoverability** — an enum value is set once at config time; an extension method shows in IntelliSense at the registration site, is grep-able in code review, and is callable as part of an audit trail.

Red-Blue₂ (Phase 3.75) resolved this contested point: Red (the IR's position) won; the synthesis was modified.

The forces in tension: operator ergonomics (one decision flips four defaults coherently) vs lazy-path safety (no hidden coupling); maintainer simplicity (one named noun consolidates the behaviors) vs IntelliSense-level discoverability.

## Decision

We will provide `services.AddOpenSearchMigrations(opts => { ... }).WithProductionDefaults();` as the single forcing function for production safety defaults.

The extension method explicitly sets:
- `ClusterHealthThreshold = Green`
- `WaitMode = PerMigration`
- `RequireUnsafeJustification = true`
- `ContextResolutionPolicy = RequireExplicit`

Per-option settings the operator chains AFTER `WithProductionDefaults()` win — the extension does not re-apply defaults if values were explicitly set later in the chain.
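
The ordering rule can be sketched without any DI container. In the C# below, `OpenSearchMigrationOptions` and `OptionsBuilderSketch` are hypothetical stand-ins for whatever `AddOpenSearchMigrations` returns; the option and enum names come from this ADR, and the development defaults are taken from the first value of each pair in the Context section. The sketch assumes configuration steps are applied in registration order, so a later step overwrites an earlier one.

```csharp
using System;
using System.Collections.Generic;

public enum ClusterHealthThreshold { Yellow, Green }
public enum WaitMode { PerStatement, PerMigration }
public enum ContextResolutionPolicy { SkipIfUnset, RequireExplicit }

// Hypothetical options type initialized with the development defaults.
public sealed class OpenSearchMigrationOptions
{
    public ClusterHealthThreshold ClusterHealthThreshold { get; set; } = ClusterHealthThreshold.Yellow;
    public WaitMode WaitMode { get; set; } = WaitMode.PerStatement;
    public bool RequireUnsafeJustification { get; set; } = false;
    public ContextResolutionPolicy ContextResolutionPolicy { get; set; } = ContextResolutionPolicy.SkipIfUnset;
}

// Hypothetical builder: records configuration steps and replays them in order.
public sealed class OptionsBuilderSketch
{
    private readonly List<Action<OpenSearchMigrationOptions>> _steps = new();

    public OptionsBuilderSketch Configure(Action<OpenSearchMigrationOptions> step)
    {
        _steps.Add(step);
        return this;
    }

    public OpenSearchMigrationOptions Build()
    {
        var options = new OpenSearchMigrationOptions();
        foreach (var step in _steps)
            step(options); // later steps overwrite earlier ones
        return options;
    }
}

public static class ProductionDefaultsExtensions
{
    // Sets the four production values as one more ordered step.
    public static OptionsBuilderSketch WithProductionDefaults(this OptionsBuilderSketch builder) =>
        builder.Configure(o =>
        {
            o.ClusterHealthThreshold = ClusterHealthThreshold.Green;
            o.WaitMode = WaitMode.PerMigration;
            o.RequireUnsafeJustification = true;
            o.ContextResolutionPolicy = ContextResolutionPolicy.RequireExplicit;
        });
}
```

With this shape, `new OptionsBuilderSketch().WithProductionDefaults().Configure(o => o.WaitMode = WaitMode.PerStatement).Build()` keeps the other three production values and takes `PerStatement` from the later step.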

We will NOT provide an `EnvironmentProfile` enum. We will NOT auto-detect production environment from `DOTNET_ENVIRONMENT` / `ASPNETCORE_ENVIRONMENT` and apply defaults silently.

The startup banner (R-25) emits all resolved defaults at INFO so that operators can verify in production logs what is actually set.

## Consequences

**Easier:**
- Production deployments call one discoverable extension; the call site shows what changed without operators reading documentation
- Audit trails (git blame, code review) trivially identify which deployments use production defaults
- Resolved defaults are visible in production logs (R-25 banner), so operators can verify what is actually set
- Per-option overrides chain after the extension and win cleanly — no inheritance/override magic
- Extension method approach generalizes: future named bundles (`.WithCanaryDefaults()`, `.WithMigrationDryRunDefaults()`) follow the same pattern

**Harder:**
- Operators must call the extension explicitly; setting an environment variable alone does not enable the production safety defaults
- Developers running locally with `DOTNET_ENVIRONMENT=Production` won't get prod defaults unless they call the extension explicitly — this is intentional but requires onboarding
- The runner project (R-26) must document the extension call in its sample `Program.cs`; new adopters who skip docs may ship dev defaults to prod
- A future regret about explicit-only opt-in cannot be reversed without superseding this ADR

**Constrains:**
- Future "named profile" requests (Staging, Canary) must justify avoiding the same hidden-coupling concern; if added, they should be additional extension methods, not enum values
- Per-option default changes must be reflected in the extension method's body; drift between "what's documented as production-safe" and "what the extension sets" must be tested
- The startup banner is required for completeness — without it, the extension's effects are invisible in deployed environments