…it-index ADR

Three small improvements informed by a comparative analysis against an external OpenSearch provider implementation:

#1 Parser drive-letter guard. The body-path validator already rejected leading `/` and `\`, and `..` segments, per ADR-0017, but the lexer excluded `:`, so `@C:/foo` produced a confusing parse error rather than the intended "absolute path" message. Allow `:` in the lexer accept set, then explicitly reject drive-letter prefixes (`C:`, `c:`, ...) and any other `:` in the path. Closes a cross-platform asymmetry where an author on Windows could write a path that's silently rooted on Linux.

#5 Rejection-sweep tests. Theory-style coverage for the drive-letter shape (`C:/foo`, `C:\foo`, `c:/foo`, `Z:\foo`, `a:/x.json`) plus a test for stray `:` in a path. Existing absolute-path and `..` tests unchanged.

#4 Forward-attachment sample (9000-ForwardAttachmentLifecycle). New sample demonstrating the declarative attachment pattern for greenfield pipelines: index template with `template.aliases` block + ISM policy with `ism_template.index_patterns` block. No runtime APPLY POLICY, no runtime ALIAS ADD; the cluster handles attachment lazily as new indices roll over. Provider README's ISM section grows a "Forward attachment vs runtime apply" subsection making the choice explicit. Sample 4000 stays as the runtime-apply backfill demonstration; samples README pairs them with a one-paragraph explanation of when to use which.

ADR-0018 split-index trade-off. Captures why the OpenSearch provider ships two indices (`.migrations` ledger + `.migrations-lock` lock) while Aerospike/Couchbase/MongoDB/Postgres co-locate. Reason: PA-2 lock `replicas:0` mitigation against replica-write coupling under N-runner contention; the ledger keeps cluster-default durability.

ADR-0017 also updated to mention the drive-letter check in the parse-time validation surface.
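The rejection rules described in #1 can be sketched roughly as follows. This is a language-neutral illustration in Python, not the provider's actual parser code; the function names, the regular expression, and the error messages are all hypothetical, and only the rejected shapes come from the commit message.

```python
import re

# Hypothetical sketch: the real validator lives in the provider's parser
# and is not shown in this PR text.
_DRIVE_PREFIX = re.compile(r"^[A-Za-z]:")


def validate_body_path(path: str) -> None:
    """Raise ValueError unless `path` is a plain relative body path."""
    if path.startswith(("/", "\\")):
        raise ValueError(f"absolute path not allowed: {path!r}")
    if _DRIVE_PREFIX.match(path):
        # Checked before the generic ':' rule so Windows-style paths get
        # the "absolute path" message rather than a generic one.
        raise ValueError(f"absolute path (drive letter) not allowed: {path!r}")
    if ":" in path:
        raise ValueError(f"':' not allowed in body path: {path!r}")
    # Split on both separators so '..' is caught on either platform.
    if ".." in re.split(r"[/\\]", path):
        raise ValueError(f"'..' segment not allowed: {path!r}")


def is_rejected(path: str) -> bool:
    """Convenience wrapper for sweep-style checks."""
    try:
        validate_body_path(path)
    except ValueError:
        return True
    return False


if __name__ == "__main__":
    for candidate in ["C:/foo", "C:\\foo", "c:/foo", "Z:\\foo", "a:/x.json",
                      "bodies/po:licy.json", "bodies/policy.json"]:
        print(candidate, "rejected" if is_rejected(candidate) else "accepted")
```

Ordering the drive-letter check ahead of the generic `:` check is the design point: both reject the path, but only the first gives a Windows author the message that explains why.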
ISM attachment to an index series is three different problems, not one:

- Greenfield (future indices auto-attach via ism_template) -> sample 9000
- One-time backfill (existing indices need a policy) -> sample 4000
- Ongoing reconciliation (policy evolves over time) -> sample 9001 (NEW)

Sample 9001 demonstrates the reconciliation pattern: a `[Migration(N, journal: false)]` that re-runs APPLY POLICY against the wildcard pattern on every startup. ISM's change_policy is idempotent for already-on-policy indices, so re-running is cheap and convergent. The wildcard form is correct because the set of indices to reconcile changes as new ones roll over and old ones are deleted.

Provider README's 'Forward attachment vs runtime apply' subsection expanded into a 'Three temporal scopes for ISM attachment' table so the choice between the three patterns is explicit, not implicit. Samples README adds the same matrix and points at the provider README as the canonical explainer.

The three are stackable in a mature pipeline (greenfield at install, backfill when an existing series first adopts the policy, reconciliation as the policy evolves). Many pipelines never need more than one -- but the idea is to choose deliberately rather than reach for runtime APPLY POLICY by default.
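For the greenfield scope, the declarative attachment comes from two request bodies that share an index pattern. The sketch below builds them as plain Python dictionaries to show the shape; the series name, alias name, state name, and priority are made up for illustration, and this is the raw OpenSearch REST body shape rather than the provider's migration syntax or the contents of sample 9000.

```python
import json

# Hypothetical series pattern and alias; only the `template.aliases` and
# `ism_template.index_patterns` block names come from the PR text.
INDEX_PATTERN = "logs-app-*"

# Body for an index template: every new index matching the pattern gets
# the alias at creation time, so no runtime ALIAS ADD is needed.
index_template = {
    "index_patterns": [INDEX_PATTERN],
    "template": {
        "aliases": {"logs-app-read": {}},
    },
}

# Body for an ISM policy: the ism_template block tells the cluster to
# attach this policy to new matching indices, so no runtime APPLY POLICY.
ism_policy = {
    "policy": {
        "description": "Illustrative lifecycle policy (hypothetical)",
        "default_state": "hot",
        "states": [{"name": "hot", "actions": [], "transitions": []}],
        "ism_template": {
            "index_patterns": [INDEX_PATTERN],
            "priority": 100,
        },
    }
}

if __name__ == "__main__":
    print(json.dumps(index_template, indent=2))
    print(json.dumps(ism_policy, indent=2))
```

Neither body touches indices that already exist, which is why the backfill and reconciliation scopes still need a runtime APPLY POLICY.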
…ee-scope ISM framing

Two doc gaps surfaced during a documentation review:

1. Path-validation list (provider README + docs/site/opensearch.md) didn't mention the drive-letter rejection added in this PR's parser change. A Windows author writing `@C:/foo` would get a clear runtime error but no docs telling them what shapes the validator catches.
2. docs/site/opensearch.md ISM section described only runtime APPLY POLICY. The provider README and samples README in this PR introduce the 'Three temporal scopes for ISM attachment' framing (greenfield via ism_template, one-time backfill, ongoing reconciliation) and reference samples 4000/9000/9001 -- the site doc was the last surface still describing only the backfill case.

Site doc updates kept ASCII-only per site-build constraint.
Summary
Three small, defensive improvements to the OpenSearch provider, informed by a comparative analysis against a sibling external implementation. None changes existing API or behavior for valid migrations.
- Parser drive-letter guard. The body-path validator already rejected leading `/`, `\`, and `..` per ADR-0017, but the lexer excluded `:`, so `@C:/foo` produced a confusing parse error rather than the intended "absolute path" message. Now `:` is in the lexer accept set, and a drive-letter prefix (`C:`, `c:`, ...) or any other `:` is rejected with a clear remediation message. Closes a Windows-vs-Linux authoring asymmetry.
- Rejection-sweep tests. Theory-style coverage for `C:/foo`, `C:\foo`, `c:/foo`, `Z:\foo`, `a:/x.json`, plus a test for stray `:` in a path.
- Forward-attachment sample (9000). New sample `ForwardAttachmentLifecycle` demonstrating the declarative-attachment pattern for greenfield pipelines: index template with `template.aliases` block + ISM policy with `ism_template.index_patterns` block. No runtime `APPLY POLICY`, no runtime `ALIAS ADD`; the cluster wires both lazily as new indices roll over. Sample 4000 stays as the runtime-apply backfill demonstration; samples README pairs them.
- ADR-0018 split-index trade-off. Captures why the provider ships separate ledger and lock indices: the lock's `replicas:0` invariant requires distinct durability profiles.

What this PR does NOT do
Two ideas from the source analysis were considered and deliberately excluded:
- `APPLY POLICY ON INDICES (a, b, c)` grammar variant. The justification (re-run determinism) doesn't apply because hyperbee's runner skips already-journaled migrations; recovery paths (`Journal=false`, `Down`+`Up`, operator deleting the journal entry) deliberately want current-state wildcard matching. Adding a literal-list form would solve a problem we don't have and create a real one (authors must enumerate auto-discoverable indices). If review-time clarity ever matters, the right shape is to capture the matched indices in dispatch logs / the journal record.
- `CREATE POLICY` 409 → CAS retry. The doc's main motivation (re-run after partial failure) doesn't apply for the same reason. The narrow remaining case (a `[Migration(N, Journal=false)]` migration that creates a policy) is reasonable to address as an authoring constraint rather than a verb-level idempotency hack.

Test plan
- `dotnet build`: provider, samples, runner all clean.
- `dotnet test` unit suite: 356/356 pass on net10.0; new drive-letter rejection rows pass.
- `dotnet format --verify-no-changes`: clean.