From d9065765208d58f10c0d4e7c72c453bd687c09a0 Mon Sep 17 00:00:00 2001
From: Brenton Farmer
Date: Mon, 4 May 2026 11:33:56 -0700
Subject: [PATCH 1/3] OpenSearch provider hygiene: parser guard, forward-attach sample, split-index ADR

Three small improvements informed by a comparative analysis against an
external OpenSearch provider implementation:

#1 Parser drive-letter guard. The body-path validator already rejected
leading `/` and `\` and `..` segments per ADR-0017, but the lexer
excluded `:` so `@C:/foo` produced a confusing parse error rather than
the intended "absolute path" message. Allow `:` in the lexer accept set,
then explicitly reject drive-letter prefixes (`C:`, `c:`, ...) and any
other `:` in the path. Closes a cross-platform asymmetry where an author
on Windows could write a path that's silently rooted on Linux.

#5 Rejection-sweep tests. Theory-style coverage for the drive-letter
shape (`C:/foo`, `C:\foo`, `c:/foo`, `Z:\foo`, `a:/x.json`) plus a test
for stray `:` in a path. Existing absolute-path and `..` tests unchanged.

#4 Forward-attachment sample (9000-ForwardAttachmentLifecycle). New
sample demonstrating the declarative attachment pattern for greenfield
pipelines: index template with `template.aliases` block + ISM policy
with `ism_template.index_patterns` block. No runtime APPLY POLICY, no
runtime ALIAS ADD; the cluster handles attachment lazily as new indices
roll over. Provider README's ISM section grows a "Forward attachment vs
runtime apply" subsection making the choice explicit. Sample 4000 stays
as the runtime-apply backfill demonstration; samples README pairs them
with a one-paragraph explanation of when to use which.

ADR-0018 split-index trade-off. Captures why the OpenSearch provider
ships two indices (.migrations ledger + .migrations-lock lock) while
Aerospike/Couchbase/MongoDB/Postgres co-locate.
Reason: PA-2 lock `replicas:0` mitigation against replica-write coupling
under N-runner contention; the ledger keeps cluster-default durability.

ADR-0017 also updated to mention the drive-letter check in the
parse-time validation surface.
---
 docs/decisions/0017-body-source-grammar.md    | 10 ++-
 .../0018-split-ledger-and-lock-indices.md     | 74 +++++++++++++++++++
 docs/decisions/INDEX.md                       |  1 +
 ...erbee.Migrations.OpenSearch.Samples.csproj |  8 ++
 .../9000-ForwardAttachmentLifecycle.cs        | 52 +++++++++++++
 .../README.md                                 | 11 +++
 .../bodies/component.json                     | 18 +++++
 .../bodies/policy.json                        | 26 +++++++
 .../bodies/template.json                      | 10 +++
 .../statements.json                           | 16 ++++
 .../Grammar/OpenSearchStatementParser.cs      | 30 ++++++--
 .../README.md                                 | 24 +++++-
 .../Internal/BodySourceParserTests.cs         | 28 +++++++
 13 files changed, 301 insertions(+), 7 deletions(-)
 create mode 100644 docs/decisions/0018-split-ledger-and-lock-indices.md
 create mode 100644 runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Migrations/9000-ForwardAttachmentLifecycle.cs
 create mode 100644 runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/component.json
 create mode 100644 runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/policy.json
 create mode 100644 runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/template.json
 create mode 100644 runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/statements.json

diff --git a/docs/decisions/0017-body-source-grammar.md b/docs/decisions/0017-body-source-grammar.md
index 542180c..1dca60d 100644
--- a/docs/decisions/0017-body-source-grammar.md
+++ b/docs/decisions/0017-body-source-grammar.md
@@ -139,11 +139,19 @@ contract; migrating existing resources is optional.

 ### Path validation (parse-time)

-The grammar accepts characters `[a-zA-Z0-9_\-./\\]` in `@path`.
+The grammar accepts characters `[a-zA-Z0-9_\-./\\:]` in `@path`. The
+`:` is in the lexer's accept set only so a drive-letter prefix surfaces
+as a clean "absolute path" error rather than a generic parse failure.
 Validation rejects at parse time:

 - Absolute paths (leading `/` or `\`) — body files must be inside the migration's resource folder.
+- Drive-letter prefix (`C:`, `c:`, ...) — same reason. `Path.IsPathRooted`
+  is platform-dependent so an author editing on one host could otherwise
+  produce a manifest that's silently rooted on another.
+- Any other `:` in the path — embedded resource names don't use it; the
+  reject is mechanical because a `:` that isn't a drive-letter prefix
+  is almost certainly an authoring mistake.
 - `..` segments — no parent-directory traversal; each migration's body files stay self-contained.

diff --git a/docs/decisions/0018-split-ledger-and-lock-indices.md b/docs/decisions/0018-split-ledger-and-lock-indices.md
new file mode 100644
index 0000000..2040696
--- /dev/null
+++ b/docs/decisions/0018-split-ledger-and-lock-indices.md
@@ -0,0 +1,74 @@
+# ADR-0018: OpenSearch Provider Splits Ledger and Lock into Two Indices
+
+**Status:** Accepted
+**Date:** 2026-05-04
+
+## Context
+
+Hyperbee's NoSQL provider family — Aerospike, Couchbase, MongoDB, Postgres — co-locates the migration ledger and the run-lock in a single namespace, distinguished by document id. A reviewer comparing the OpenSearch provider against a sibling internal implementation flagged that the OpenSearch provider deviates from this convention: it ships with two indices, `.migrations` (ledger) and `.migrations-lock` (lock), defaulted by `OpenSearchMigrationOptions.LedgerIndex` and `LockIndex` respectively.
+
+The deviation is intentional but, until now, undocumented as an ADR. The reviewer's observation is correct in two ways:
+
+1. **The convention exists.** The other four providers co-locate; the OpenSearch provider is the outlier.
+2. **The deviation is load-bearing.** It exists to serve a specific concurrency invariant that the other providers don't face in the same shape.
+
+The risk of leaving this implicit is that a future provider author copies the wrong convention — either co-locating in OpenSearch when they shouldn't, or splitting in a future provider when they shouldn't. This ADR captures the reasoning so the next implementer makes the choice deliberately.
+
+## The deviation
+
+OpenSearch's primary-shard write contract is shard-replica coupling: a primary write blocks until each in-sync replica acknowledges the write. Under N-runner concurrent lock acquire (R-24b), the lock primary shard is contended; replica-write coupling adds a second source of tail latency on top of the contention itself.
+
+The mitigation (PA-2 from assessment 0002, encoded in `LockIndexInitStep`) is to create the lock index with `number_of_replicas: 0`. The lock document then writes to a single primary with no replica fan-out — eliminating replica-write coupling as a tail-latency contributor under contention.
+
+The ledger index has the opposite needs: it's a forensic record (R-06) used after the fact to answer "what migrations ran, when, in what direction, against what state." Durability matters; tail latency under concurrent writes does not (the lock serializes writes to the ledger). The ledger gets the cluster's normal replica configuration.
+
+Two distinct durability/latency profiles, two indices. The other providers don't face this trade-off because:
+
+- **Aerospike** uses native CAS on a record key in a configured namespace; durability is a namespace-level setting and is not coupled to replica-write semantics on a per-record basis.
+- **Couchbase** uses bucket-level durability; the lock is a single document with provider-level coordination.
+- **MongoDB** uses a collection-level write concern; the lock is a single document with `findOneAndUpdate` semantics.
+- **Postgres** uses `pg_advisory_lock`; the lock is not a row in the ledger table at all.
+
+In each case, the lock's durability story is decoupled from the ledger's, either by language (advisory lock vs row) or by configuration knob (namespace, bucket, collection write concern). OpenSearch couples them through index settings — which means decoupling the two requires two indices.
+
+## Decision
+
+The OpenSearch provider will continue to ship two indices:
+
+- `LedgerIndex` (default `.migrations`) — strict-mapped ledger per R-06, with the cluster's normal replica configuration.
+- `LockIndex` (default `.migrations-lock`) — `number_of_replicas: 0` per PA-2 mitigation, asserted by `LockIndexInitStep`.
+
+We will not introduce an option to combine them into a single index. The combined-index shape would either lose the PA-2 mitigation (if the index were configured for ledger-grade durability) or compromise the ledger's durability (if the index were configured for `replicas: 0`). Neither trade-off is worth the cross-provider symmetry.
+
+If a future operator deployment is so IAM-restricted that index creation is gated to a single index, we will reconsider — but only as a documented constrained-mode opt-in, never as a default. ADR-0013's `AssumeIndicesExist` already covers the IAM-restricted case for both indices; no additional surface is needed today.
+
+## Consequences
+
+**Easier:**
+
+- The lock's tail-latency story is clean: under R-24b N-runner contention, the lock primary shard's write path has no replica-coupling component.
+- The ledger's durability story is clean: it inherits the cluster's normal replica configuration without per-index special-casing.
+- Operators in non-AWS environments who configure cluster-wide replica counts get exactly what they expect for both indices.
+
+**Harder:**
+
+- Operators must monitor / back up two indices. In practice this is one extra entry in any backup or alerting tool; the entries are co-located by name (`.migrations*` glob covers both).
+- Cross-provider documentation has to surface the asymmetry. This ADR is the canonical reference; the provider README's "Quick start" continues to default both indices for the common case.
+- The next provider author asking "should I co-locate or split?" must read this ADR. The default answer is co-locate (the house style); split only when the lock and ledger have distinct durability or latency requirements that the underlying engine couples through shared configuration.
+
+**Constrains:**
+
+- The lock and ledger indices are part of the public contract of `OpenSearchMigrationOptions`. Removing either as a top-level index requires a superseding ADR.
+- The PA-2 invariant (`number_of_replicas: 0` on the lock index) is asserted at startup by `LockIndexInitStep`; weakening this assertion requires a superseding ADR.
+
+## Relation to other ADRs
+
+- **ADR-0013 (Always-Create Lock and Ledger Indices in InitializeAsync with Explicit Override)** — this ADR refines the model that one introduced. ADR-0013 names the two indices and the always-create behavior; this ADR captures *why there are two*.
+- **ADR-0005 (Provider-Native Distributed Locking)** — preserved. The split is an OpenSearch-specific implementation choice for native locking; the cross-provider lock contract is unchanged.
+
+## Implementation
+
+- `OpenSearchMigrationOptions.LedgerIndex` (default `.migrations`) and `OpenSearchMigrationOptions.LockIndex` (default `.migrations-lock`).
+- `LedgerIndexInitStep` creates the ledger with strict R-06 mapping; replica configuration follows cluster default.
+- `LockIndexInitStep` creates the lock with `number_of_replicas: 0` and asserts the value when the index already exists. Mismatch fails at startup with a remediation message.
+- `OpenSearchRecordStore` reads the ledger and the lock through these options. There is no path that writes lock state to the ledger index (or vice versa).
diff --git a/docs/decisions/INDEX.md b/docs/decisions/INDEX.md
index 8aa475c..298788f 100644
--- a/docs/decisions/INDEX.md
+++ b/docs/decisions/INDEX.md
@@ -19,3 +19,4 @@
 | 0015 | [Parser is Offline-Pure; All I/O is Runtime Middleware](0015-parser-offline-pure-all-io-runtime.md) | Accepted | 2026-05-02 | Clarifying corollary of ADR-0011; resolves R-30 template lookup ambiguity by deferring all I/O (including template body resolution) to runtime middleware |
 | 0016 | [OpenSearch Provider Does Not Use File-Level Templating](0016-no-file-level-templating.md) | Accepted | 2026-05-02 | Strikes R-10; matches Aerospike/Couchbase/MongoDB/Postgres house style (typed options + runtime substitution); deletes Phase 0 Task 0.4 work; removes Hyperbee.Templating dependency |
 | 0017 | [Body-Source Grammar — Three Resolution Forms](0017-body-source-grammar.md) | Accepted | 2026-05-02 | `WITH BODY @path` direct file reference + `bodies.` structured section + ADR-0009 sibling-property fallback for back-compat; parse-time path validation rejects absolute paths and `..` traversal |
+| 0018 | [OpenSearch Provider Splits Ledger and Lock into Two Indices](0018-split-ledger-and-lock-indices.md) | Accepted | 2026-05-04 | Captures why the OpenSearch provider deviates from the Aerospike/Couchbase/MongoDB/Postgres single-namespace convention: distinct durability/latency profiles (PA-2 lock `replicas:0`, normal ledger durability) require two indices |
diff --git a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Hyperbee.Migrations.OpenSearch.Samples.csproj b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Hyperbee.Migrations.OpenSearch.Samples.csproj
index f80bf4d..36e533e 100644
--- a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Hyperbee.Migrations.OpenSearch.Samples.csproj
+++ b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Hyperbee.Migrations.OpenSearch.Samples.csproj
@@ -15,6 +15,10 @@
+
+
+
+
@@ -28,6 +32,10 @@
+
+
+
+
diff --git a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Migrations/9000-ForwardAttachmentLifecycle.cs b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Migrations/9000-ForwardAttachmentLifecycle.cs
new file mode 100644
index 0000000..fb8b0f2
--- /dev/null
+++ b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Migrations/9000-ForwardAttachmentLifecycle.cs
@@ -0,0 +1,52 @@
+using Hyperbee.Migrations.Providers.OpenSearch.Resources;
+
+namespace Hyperbee.Migrations.OpenSearch.Samples.Migrations;
+
+// Sample 9: forward-attachment lifecycle for greenfield pipelines.
+//
+// Contrast with sample 4 (IsmPolicyAndApply), which demonstrates the
+// runtime APPLY POLICY path — necessary when you need to attach a policy
+// to indices that ALREADY exist (backfill).
+//
+// For pipelines starting clean — daily rollover indices for a new
+// application, fresh log streams, anything where the migration runs
+// before the indices exist — declarative attachment is preferable. The
+// migration installs only the cluster-level scaffolding:
+//
+//   CREATE COMPONENT — shared settings/mappings, declared once.
+//   CREATE TEMPLATE  — `index_patterns` matches the rollover series; the
+//                      template's `template.aliases` block wires the
+//                      alias automatically when a matching index is
+//                      created.
+//   CREATE POLICY    — the policy body's `ism_template.index_patterns`
+//                      block attaches the policy to any matching index
+//                      at creation time.
+//
+// Note: there is NO runtime APPLY POLICY and NO runtime ALIAS ADD. The
+// first index in the series — created later by the application, by daily
+// rollover, or by a successor migration — picks up everything: settings,
+// mappings, alias, lifecycle policy.
+//
+// When to use this pattern vs. sample 4:
+//
+//   - greenfield series (no existing indices)        -> sample 9 pattern
+//   - existing indices that need a new policy        -> sample 4 pattern
+//   - new policy applies to BOTH existing and future -> both: sample 4
+//                                                       pattern PLUS an
+//                                                       `ism_template`
+//                                                       block in the policy
+//
+// Caveat: `ism_template` inside an ISM policy body is the modern endpoint
+// (`_plugins/_ism/policies`). Older AWS-managed clusters served by the
+// legacy `_opendistro/_ism` endpoint may not recognize it; the bootstrap
+// `IsmEndpointDetectStep` resolves which endpoint is active, but the
+// declarative `ism_template` shape itself is a property of the modern
+// schema. If you target a legacy endpoint, fall back to sample 4's
+// runtime APPLY for forward attachment.
+
+[Migration( 9000 )]
+public class ForwardAttachmentLifecycle( OpenSearchResourceRunner runner ) : Migration
+{
+    public override Task UpAsync( CancellationToken cancellationToken = default )
+        => runner.StatementsFromAsync( "statements.json", cancellationToken );
+}
diff --git a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/README.md b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/README.md
index 291c888..936ff3d 100644
--- a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/README.md
+++ b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/README.md
@@ -15,11 +15,22 @@ this assembly via `Migrations:FromPaths` and runs them in version order.
 | 6000 | **`MigrateIndexComposite`** | **Featured: `MIGRATE INDEX` composite — the canonical template-propagation pattern (R-30)** | Form 2 |
 | 7000 | `ReversibleAlias` | Opt-in `rollback` per statement; partial-rollback ledger semantics (R-19) | (no bodies — DDL-only rollback) |
 | 8000 | `UnsafeReindex` | `REINDEX UNSAFE("")` — opt-out of `op_type:create` | Form 2 |
+| 9000 | `ForwardAttachmentLifecycle` | Greenfield-only: declarative attachment via `template.aliases` + `ism_template` — **no runtime `APPLY POLICY` or `ALIAS ADD`** | Form 1 — direct `WITH BODY @path` for each body |

 **Sample 6 is the headline.** Adopters asking "how do I apply a template/mapping change to existing data?" should be pointed at `MigrateIndexComposite` first; the long-form sample 2 exists to show what the composite expands to.

+**Samples 4 and 9 are paired.** Sample 4 demonstrates runtime attachment —
+`APPLY POLICY` on a wildcard pattern, used when the indices already exist.
+Sample 9 demonstrates declarative attachment — the policy's `ism_template`
+block and the template's `aliases:` block — used when the migration runs
+before the indices exist (greenfield rollover series). For most new
+pipelines, sample 9's pattern is preferred: the cluster handles attachment
+lazily, no follow-up migration is needed when the first index is created,
+and the wired-state of the cluster is fully described by the templates and
+policies, not by ad-hoc runtime calls.
+
 **Body-source forms.** ADR-0017 defines three resolution forms for `WITH BODY` references.
 The samples deliberately demonstrate all of them so authors can compare the trade-offs side by side:
diff --git a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/component.json b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/component.json
new file mode 100644
index 0000000..1433c24
--- /dev/null
+++ b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/component.json
@@ -0,0 +1,18 @@
+{
+  "template": {
+    "settings": {
+      "number_of_shards": 1,
+      "number_of_replicas": 1,
+      "refresh_interval": "30s"
+    },
+    "mappings": {
+      "dynamic": "strict",
+      "properties": {
+        "@timestamp": { "type": "date" },
+        "level": { "type": "keyword" },
+        "msg": { "type": "text" },
+        "service": { "type": "keyword" }
+      }
+    }
+  }
+}
diff --git a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/policy.json b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/policy.json
new file mode 100644
index 0000000..d2233ad
--- /dev/null
+++ b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/policy.json
@@ -0,0 +1,26 @@
+{
+  "policy": {
+    "description": "Forward-attaching lifecycle policy. The `ism_template.index_patterns` block tells the cluster to attach this policy to any new index whose name matches `sample_app_events-*` at creation time. No runtime APPLY POLICY is required for indices that don't exist yet.",
+    "default_state": "hot",
+    "states": [
+      {
+        "name": "hot",
+        "actions": [],
+        "transitions": [
+          { "state_name": "delete", "conditions": { "min_index_age": "30d" } }
+        ]
+      },
+      {
+        "name": "delete",
+        "actions": [{ "delete": {} }],
+        "transitions": []
+      }
+    ],
+    "ism_template": [
+      {
+        "index_patterns": ["sample_app_events-*"],
+        "priority": 100
+      }
+    ]
+  }
+}
diff --git a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/template.json b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/template.json
new file mode 100644
index 0000000..09294ee
--- /dev/null
+++ b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/bodies/template.json
@@ -0,0 +1,10 @@
+{
+  "index_patterns": ["sample_app_events-*"],
+  "priority": 100,
+  "composed_of": ["sample_app_events_mappings"],
+  "template": {
+    "aliases": {
+      "sample_app_events": {}
+    }
+  }
+}
diff --git a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/statements.json b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/statements.json
new file mode 100644
index 0000000..5bf6681
--- /dev/null
+++ b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9000-ForwardAttachmentLifecycle/statements.json
@@ -0,0 +1,16 @@
+{
+  "statements": [
+    {
+      "//": "Shared mappings + settings extracted as a component template. Indices that match the next statement's index_patterns will compose this in.",
+      "statement": "CREATE COMPONENT sample_app_events_mappings WITH BODY @bodies/component.json"
+    },
+    {
+      "//": "Index template with `aliases:` block — the cluster wires the alias automatically when a matching index is created. No runtime ALIAS ADD needed.",
+      "statement": "CREATE TEMPLATE sample_app_events WITH BODY @bodies/template.json"
+    },
+    {
+      "//": "ISM policy with `ism_template.index_patterns` — the cluster attaches this policy to any matching index at creation time. No runtime APPLY POLICY needed.",
+      "statement": "CREATE POLICY sample_app_events_lifecycle WITH BODY @bodies/policy.json"
+    }
+  ]
+}
diff --git a/src/Hyperbee.Migrations.Providers.OpenSearch/Internal/Grammar/OpenSearchStatementParser.cs b/src/Hyperbee.Migrations.Providers.OpenSearch/Internal/Grammar/OpenSearchStatementParser.cs
index 83408a8..69d7570 100644
--- a/src/Hyperbee.Migrations.Providers.OpenSearch/Internal/Grammar/OpenSearchStatementParser.cs
+++ b/src/Hyperbee.Migrations.Providers.OpenSearch/Internal/Grammar/OpenSearchStatementParser.cs
@@ -119,9 +119,14 @@ private static Parser BuildParser()
         // policies, reusable templates).
         //
         // Path validation is parse-time only: we reject leading `/` or `\`
-        // (absolute paths) and any `..` segment (parent-directory traversal)
-        // so each migration's body files stay self-contained — keeps repeatable
-        // dotnet publish boundaries honest.
+        // (Unix-rooted), drive-letter prefixes like `C:` or `c:` (Windows-
+        // rooted), and any `..` segment (parent-directory traversal) so each
+        // migration's body files stay self-contained — keeps repeatable
+        // dotnet publish boundaries honest. Path.IsPathRooted is platform-
+        // dependent ("C:/foo" reads as rooted on Windows but not on Linux),
+        // so the validator checks the rooted shapes explicitly: an author
+        // editing on one host can't produce a path that's silently rooted
+        // on another.

         var dollar = Terms.Char( '$' );
         var at = Terms.Char( '@' );
@@ -129,15 +134,28 @@ private static Parser BuildParser()
         var siblingBodyRef = with.SkipAnd( body ).SkipAnd( dollar ).SkipAnd( identifier )
             .Then( static name => (BodySource) new BodyRef( name ) );

-        // path: letters/digits/_/-/./forward+back-slash. Terminates at whitespace.
+        // path: letters/digits/_/-/./forward+back-slash, plus `:` so a
+        // drive-letter prefix surfaces a clear "absolute path" error
+        // instead of a generic parse failure when an author writes
+        // `@C:/foo`. Terminates at whitespace.
         var bodyPath = Terms.Pattern(
-            static c => char.IsLetterOrDigit( c ) || c is '_' or '-' or '.' or '/' or '\\'
+            static c => char.IsLetterOrDigit( c ) || c is '_' or '-' or '.' or '/' or '\\' or ':'
         ).Then( static buf =>
         {
             var path = buf.ToString()!;
             if ( path.StartsWith( '/' ) || path.StartsWith( '\\' ) )
                 throw new InvalidOperationException(
                     $"WITH BODY `@{path}` is absolute. Body files must live inside the migration's resource folder; use a path relative to it." );
+            // Drive-letter prefix (`C:`, `c:`, `Z:`...). Reject before the
+            // segment scan so the message names the actual shape that
+            // tripped validation. We don't need to allow `:` anywhere
+            // else in body paths — embedded resource names don't use it.
+            if ( path.Length >= 2 && path[1] == ':' && IsDriveLetter( path[0] ) )
+                throw new InvalidOperationException(
+                    $"WITH BODY `@{path}` is absolute (drive-letter prefix). Body files must live inside the migration's resource folder; use a path relative to it." );
+            if ( path.Contains( ':' ) )
+                throw new InvalidOperationException(
+                    $"WITH BODY `@{path}` contains `:`. Body file paths must not contain `:` — embedded resource names don't use it." );
             // `..` segment = parent traversal. Allow `.` (current dir) but not
             // `..` anywhere — split-and-check rather than substring so file
             // names that legitimately contain dots (`.json`) aren't false-
@@ -692,6 +710,8 @@ private static Version ParseVersionLiteral( string literal )
         return version;
     }
+
+    private static bool IsDriveLetter( char c ) => c is >= 'A' and <= 'Z' or >= 'a' and <= 'z';
 }

 public sealed class OpenSearchParseException : Exception
diff --git a/src/Hyperbee.Migrations.Providers.OpenSearch/README.md b/src/Hyperbee.Migrations.Providers.OpenSearch/README.md
index 930e88a..3b35e7b 100644
--- a/src/Hyperbee.Migrations.Providers.OpenSearch/README.md
+++ b/src/Hyperbee.Migrations.Providers.OpenSearch/README.md
@@ -288,7 +288,29 @@ CREATE POLICY [WITH BODY $body]
 APPLY POLICY TO
 ```

-`CREATE POLICY` uploads the policy to `_plugins/_ism/policies`. `APPLY POLICY` attaches it to existing indices matching the pattern via `_plugins/_ism/add` — the dispatcher inspects the response body and surfaces logical failures explicitly: HTTP 200 with `updated_indices: 0` is mapped to `Failed`, not silent OK. For future-only attachment, declare `ism_template.index_patterns` in the policy body (handled at index-creation time by the cluster).
+`CREATE POLICY` uploads the policy to `_plugins/_ism/policies`. `APPLY POLICY` attaches it to existing indices matching the pattern via `_plugins/_ism/add` — the dispatcher inspects the response body and surfaces logical failures explicitly: HTTP 200 with `updated_indices: 0` is mapped to `Failed`, not silent OK.
+
+#### Forward attachment vs runtime apply
+
+There are two ways to wire a policy (and an alias) to an index. Pick by whether the indices exist when the migration runs.
+
+**Forward attachment (preferred for greenfield).** When the index series doesn't exist yet — daily rollover indices for a new pipeline, fresh log streams, anything that the application or daily rollover will create later — let the cluster handle attachment lazily:
+
+- Inside the index template body, declare `template.aliases: { "": {} }`. The cluster wires the alias when a matching index is created.
+- Inside the ISM policy body, declare `ism_template.index_patterns: [""]`. The cluster attaches the policy to any matching index at creation time.
+
+The migration installs only the cluster-level scaffolding (component templates, index templates, ISM policies). No runtime `APPLY POLICY`, no runtime `ALIAS ADD`. Sample 9000 (`ForwardAttachmentLifecycle`) demonstrates the full pattern.
+
+**Runtime apply (required for existing indices).** When indices already exist and need a new policy or alias attached, runtime statements are the only path:
+
+- `APPLY POLICY TO ` for ISM.
+- `ALIAS ADD ON ` for aliases.
+
+Sample 4000 (`IsmPolicyAndApply`) demonstrates this case. The dispatcher's zero-updated-→-Failed escalation makes it loud when the pattern matches nothing.
+
+**Mixed.** If a new policy needs to apply to BOTH existing and future indices, do both: declare `ism_template` in the policy body for the future and run `APPLY POLICY` once for the current set.
+
+Caveat: `ism_template` inside a policy body is the modern endpoint shape. Older AWS-managed clusters served by the legacy `_opendistro/_ism` endpoint may not honor it; if `IsmEndpointDetectStep` resolves to the legacy endpoint, fall back to the runtime `APPLY POLICY` path even for forward attachment. Modern OpenSearch (2.x and the modern AWS endpoint) supports `ism_template` natively.
 ### Cluster waits

diff --git a/tests/Hyperbee.Migrations.Tests/Providers/OpenSearch/Internal/BodySourceParserTests.cs b/tests/Hyperbee.Migrations.Tests/Providers/OpenSearch/Internal/BodySourceParserTests.cs
index 4281f06..bea3e60 100644
--- a/tests/Hyperbee.Migrations.Tests/Providers/OpenSearch/Internal/BodySourceParserTests.cs
+++ b/tests/Hyperbee.Migrations.Tests/Providers/OpenSearch/Internal/BodySourceParserTests.cs
@@ -97,6 +97,34 @@ public void AtPath_AbsoluteWindows_RejectedAtParseTime()
             .Where( e => e.Message.Contains( "absolute" ) || e.Message.Contains( "relative" ) );
     }

+    // Drive-letter prefix rejection (cross-platform asymmetry guard).
+    // `Path.IsPathRooted` is platform-dependent — `C:/foo` reads as rooted on
+    // Windows but not on Linux — so an author editing on one host could
+    // produce a manifest that's silently rooted on another. The validator
+    // checks the rooted shape explicitly.
+    [TestMethod]
+    [DataRow( "C:/foo/bar.json" )]
+    [DataRow( @"C:\foo\bar.json" )]
+    [DataRow( "c:/foo/bar.json" )]
+    [DataRow( @"Z:\bar.json" )]
+    [DataRow( "a:/x.json" )]
+    public void AtPath_DriveLetterPrefix_RejectedAtParseTime( string path )
+    {
+        var act = () => _parser.Parse( $"CREATE INDEX users WITH BODY @{path}" );
+        act.Should().Throw()
+            .Where( e => e.Message.Contains( "absolute" ) || e.Message.Contains( "drive-letter" ) );
+    }
+
+    // Colon anywhere in a body path is rejected — embedded resource names
+    // never use it, and accepting it would muddy the drive-letter check.
+    [TestMethod]
+    public void AtPath_ColonInPath_RejectedAtParseTime()
+    {
+        var act = () => _parser.Parse( "CREATE INDEX users WITH BODY @bodies/foo:bar.json" );
+        act.Should().Throw()
+            .Where( e => e.Message.Contains( ":" ) );
+    }
+
     [TestMethod]
     public void AtPath_ParentTraversal_RejectedAtParseTime()
     {

From 8580e917523561beb37775b58c1788dc10967361 Mon Sep 17 00:00:00 2001
From: Brenton Farmer
Date: Mon, 4 May 2026 11:45:34 -0700
Subject: [PATCH 2/3] Sample 9001 + 'Three temporal scopes for ISM attachment' framing

ISM attachment to an index series is three different problems, not one:

  - Greenfield (future indices auto-attach via ism_template) -> sample 9000
  - One-time backfill (existing indices need a policy)       -> sample 4000
  - Ongoing reconciliation (policy evolves over time)        -> sample 9001 (NEW)

Sample 9001 demonstrates the reconciliation pattern: a
[Migration(N, journal: false)] that re-runs APPLY POLICY against the
wildcard pattern on every startup. ISM's change_policy is idempotent for
already-on-policy indices, so re-running is cheap and convergent. The
wildcard form is correct because the set of indices to reconcile changes
as new ones roll over and old ones are deleted.

Provider README's 'Forward attachment vs runtime apply' subsection
expanded into a 'Three temporal scopes for ISM attachment' table so the
choice between the three patterns is explicit, not implicit. Samples
README adds the same matrix and points at the provider README as the
canonical explainer.

The three are stackable in a mature pipeline (greenfield at install,
backfill when an existing series first adopts the policy, reconciliation
as the policy evolves). Many pipelines never need more than one -- but
the idea is to choose deliberately rather than reach for runtime APPLY
POLICY by default.
---
 ...erbee.Migrations.OpenSearch.Samples.csproj |  2 +
 .../9001-OngoingPolicyReconciliation.cs       | 45 +++++++++++++++++++
 .../README.md                                 | 28 +++++++-----
 .../statements.json                           |  8 ++++
 .../README.md                                 | 26 +++++------
 5 files changed, 83 insertions(+), 26 deletions(-)
 create mode 100644 runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Migrations/9001-OngoingPolicyReconciliation.cs
 create mode 100644 runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9001-OngoingPolicyReconciliation/statements.json

diff --git a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Hyperbee.Migrations.OpenSearch.Samples.csproj b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Hyperbee.Migrations.OpenSearch.Samples.csproj
index 36e533e..6dc0bc6 100644
--- a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Hyperbee.Migrations.OpenSearch.Samples.csproj
+++ b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Hyperbee.Migrations.OpenSearch.Samples.csproj
@@ -19,6 +19,7 @@
+
@@ -36,6 +37,7 @@
+
diff --git a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Migrations/9001-OngoingPolicyReconciliation.cs b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Migrations/9001-OngoingPolicyReconciliation.cs
new file mode 100644
index 0000000..f10e6f8
--- /dev/null
+++ b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Migrations/9001-OngoingPolicyReconciliation.cs
@@ -0,0 +1,45 @@
+using Hyperbee.Migrations.Providers.OpenSearch.Resources;
+
+namespace Hyperbee.Migrations.OpenSearch.Samples.Migrations;
+
+// Sample 9.1: ongoing policy reconciliation.
+//
+// The third of three temporal scopes for ISM attachment. Pair it with
+// sample 4 (one-time backfill via runtime APPLY) and sample 9
+// (greenfield via `ism_template`).
+//
+// Why this exists. Sample 9 installs a policy whose body has an
+// `ism_template.index_patterns` block — new indices in the
+// `sample_app_events-*` series auto-attach at creation. But when the
+// policy DEFINITION later evolves (a new state added, transition
+// criteria adjusted, retention reduced from 90d to 30d), existing
+// indices that are already attached keep running on their cached copy
+// of the policy until something explicitly re-attaches them.
+//
+// This migration runs `APPLY POLICY` against the same wildcard pattern
+// the policy's `ism_template` covers — and it sets `journal: false` so
+// it re-runs on every startup. The ISM `change_policy` API is
+// idempotent: indices already on the current policy are a no-op, so
+// re-running is cheap. The wildcard form is correct because the set of
+// indices to reconcile changes as new ones roll over and old ones are
+// deleted by the policy's own delete state.
+//
+// When NOT to use this pattern.
+//
+// - Greenfield-only series with policies that never change: sample 9
+//   alone is enough. Don't add reconciliation noise on every startup
+//   for a thing that's already convergent.
+// - One-time backfill of indices that exist before the policy:
+//   sample 4 (a normal `[Migration(N)]`) is the right tool. Don't
+//   reach for `journal: false` unless the migration genuinely needs
+//   to run more than once.
+// - Authoring-time-only enumeration of "these specific indices get
+//   this policy": just put the literal set in a normal migration; the
+//   wildcard story is for cluster-state-driven sets.
+
+[Migration( 9001, journal: false )]
+public class OngoingPolicyReconciliation( OpenSearchResourceRunner runner ) : Migration
+{
+    public override Task UpAsync( CancellationToken cancellationToken = default )
+        => runner.StatementsFromAsync( "statements.json", cancellationToken );
+}
diff --git a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/README.md b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/README.md
index 936ff3d..a677545 100644
--- a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/README.md
+++ b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/README.md
@@ -15,21 +15,29 @@ this assembly via `Migrations:FromPaths` and runs them in version order.
 | 6000 | **`MigrateIndexComposite`** | **Featured: `MIGRATE INDEX` composite — the canonical template-propagation pattern (R-30)** | Form 2 |
 | 7000 | `ReversibleAlias` | Opt-in `rollback` per statement; partial-rollback ledger semantics (R-19) | (no bodies — DDL-only rollback) |
 | 8000 | `UnsafeReindex` | `REINDEX UNSAFE("")` — opt-out of `op_type:create` | Form 2 |
-| 9000 | `ForwardAttachmentLifecycle` | Greenfield-only: declarative attachment via `template.aliases` + `ism_template` — **no runtime `APPLY POLICY` or `ALIAS ADD`** | Form 1 — direct `WITH BODY @path` for each body |
+| 9000 | `ForwardAttachmentLifecycle` | Greenfield: declarative attachment via `template.aliases` + `ism_template` — **no runtime `APPLY POLICY` or `ALIAS ADD`** | Form 1 — direct `WITH BODY @path` for each body |
+| 9001 | `OngoingPolicyReconciliation` | `[Migration(N, journal: false)]` + `APPLY POLICY TO ` — re-runs every startup; keeps matching indices on the current policy as it evolves | (no bodies — APPLY-only) |
 
 **Sample 6 is the headline.** Adopters asking "how do I apply a template/mapping
 change to existing data?" should be pointed at `MigrateIndexComposite` first; the
 long-form sample 2 exists to show what the composite expands to.
 
-**Samples 4 and 9 are paired.** Sample 4 demonstrates runtime attachment —
-`APPLY POLICY` on a wildcard pattern, used when the indices already exist.
-Sample 9 demonstrates declarative attachment — the policy's `ism_template`
-block and the template's `aliases:` block — used when the migration runs
-before the indices exist (greenfield rollover series). For most new
-pipelines, sample 9's pattern is preferred: the cluster handles attachment
-lazily, no follow-up migration is needed when the first index is created,
-and the wired-state of the cluster is fully described by the templates and
-policies, not by ad-hoc runtime calls.
+**Samples 4, 9, and 9.1 are the three temporal scopes for ISM attachment.**
+Pick the one that matches *when* the indices that need the policy come into
+existence relative to the migration that owns the policy:
+
+| Scope | Sample | When |
+|---|---|---|
+| Greenfield (future indices auto-attach) | 9000 | Index series doesn't exist yet — daily rollover for a new pipeline, fresh log streams |
+| One-time backfill (existing indices) | 4000 | Indices already exist and need the policy attached once |
+| Ongoing reconciliation (future + existing, policy evolves) | 9001 | Policy definition evolves over time; re-attach every startup so already-attached indices pick up the new version |
+
+The three are stackable in a mature pipeline (greenfield at install,
+backfill when an existing series first adopts a policy, reconciliation as
+the policy evolves). Many pipelines never need more than one — but choose
+deliberately rather than reach for runtime `APPLY POLICY` by default.
+The provider README's "Three temporal scopes for ISM attachment" section
+is the canonical explainer.
 
 **Body-source forms.** ADR-0017 defines three resolution forms for `WITH BODY`
 references. The samples deliberately demonstrate all of them so authors can
diff --git a/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9001-OngoingPolicyReconciliation/statements.json b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9001-OngoingPolicyReconciliation/statements.json
new file mode 100644
index 0000000..dc555ba
--- /dev/null
+++ b/runners/samples/Hyperbee.Migrations.OpenSearch.Samples/Resources/9001-OngoingPolicyReconciliation/statements.json
@@ -0,0 +1,8 @@
+{
+  "statements": [
+    {
+      "//": "Re-apply the lifecycle policy to every matching index. The wildcard adapts to current cluster state — newer indices that the policy already auto-attached are a no-op via ISM's idempotent change_policy semantics. Cheap to re-run on every startup; this migration is `journal: false` so it's expected to re-run.",
+      "statement": "APPLY POLICY sample_app_events_lifecycle TO sample_app_events-*"
+    }
+  ]
+}
diff --git a/src/Hyperbee.Migrations.Providers.OpenSearch/README.md b/src/Hyperbee.Migrations.Providers.OpenSearch/README.md
index 3b35e7b..b4adf9c 100644
--- a/src/Hyperbee.Migrations.Providers.OpenSearch/README.md
+++ b/src/Hyperbee.Migrations.Providers.OpenSearch/README.md
@@ -290,27 +290,21 @@ APPLY POLICY TO
 
 `CREATE POLICY` uploads the policy to `_plugins/_ism/policies`. `APPLY POLICY` attaches it to existing indices matching the pattern via `_plugins/_ism/add` — the dispatcher inspects the response body and surfaces logical failures explicitly: HTTP 200 with `updated_indices: 0` is mapped to `Failed`, not silent OK.
 
-#### Forward attachment vs runtime apply
+#### Three temporal scopes for ISM attachment
 
-There are two ways to wire a policy (and an alias) to an index. Pick by whether the indices exist when the migration runs.
+ISM attachment to an index series isn't one problem with three solutions — it's three different problems, each with its own right tool. Pick by *when* the indices that need the policy come into existence relative to the migration that owns the policy.
 
-**Forward attachment (preferred for greenfield).** When the index series doesn't exist yet — daily rollover indices for a new pipeline, fresh log streams, anything that the application or daily rollover will create later — let the cluster handle attachment lazily:
-
-- Inside the index template body, declare `template.aliases: { "": {} }`. The cluster wires the alias when a matching index is created.
-- Inside the ISM policy body, declare `ism_template.index_patterns: [""]`. The cluster attaches the policy to any matching index at creation time.
-
-The migration installs only the cluster-level scaffolding (component templates, index templates, ISM policies). No runtime `APPLY POLICY`, no runtime `ALIAS ADD`. Sample 9000 (`ForwardAttachmentLifecycle`) demonstrates the full pattern.
-
-**Runtime apply (required for existing indices).** When indices already exist and need a new policy or alias attached, runtime statements are the only path:
-
-- `APPLY POLICY TO ` for ISM.
-- `ALIAS ADD ON ` for aliases.
+| Scope | Right tool | Sample | Notes |
+|---|---|---|---|
+| **Greenfield** — attach to indices that will be created in the future | `ism_template.index_patterns` in the policy body, `template.aliases` in the index template | 9000 — `ForwardAttachmentLifecycle` | Cluster handles it lazily at index-creation time. No migration runtime cost. Won't help with indices that already exist when the migration runs. |
+| **One-time backfill** — attach a policy to a set of indices that already exist at migration run time | Runtime `APPLY POLICY TO ` in a normal `[Migration(N)]` | 4000 — `IsmPolicyAndApply` | Single-shot, journaled. Wildcards adapt to current cluster state at run time. Zero-updated → `Failed` escalation makes it loud when the pattern matches nothing. |
+| **Ongoing reconciliation** — keep all matching existing indices on the current policy as the policy evolves | Runtime `APPLY POLICY TO ` in a `[Migration(N, journal: false)]` | 9001 — `OngoingPolicyReconciliation` | Re-runs on every startup. Idempotent on the wire (ISM's `change_policy` is a no-op for already-on-policy indices). The wildcard form is correct because the set of indices to reconcile changes as new ones roll over and old ones are deleted. |
 
-Sample 4000 (`IsmPolicyAndApply`) demonstrates this case. The dispatcher's zero-updated-→-Failed escalation makes it loud when the pattern matches nothing.
+The three are stackable. A typical mature pipeline uses **greenfield** at install time, **one-time backfill** when an existing series first adopts the policy, and **ongoing reconciliation** as the policy definition evolves over the project's lifetime. Many pipelines never need more than one — but you should choose deliberately rather than reach for runtime `APPLY POLICY` by default.
 
-**Mixed.** If a new policy needs to apply to BOTH existing and future indices, do both: declare `ism_template` in the policy body for the future and run `APPLY POLICY` once for the current set.
+The wildcard form of `APPLY POLICY` is the correct expression of "apply to whatever matches now" — that's exactly what backfill and reconciliation want. Don't try to pin to a literal index list as a substitute for forward-attachment; if the goal is "future indices auto-attach," `ism_template` is the right answer.
 
-Caveat: `ism_template` inside a policy body is the modern endpoint shape. Older AWS-managed clusters served by the legacy `_opendistro/_ism` endpoint may not honor it; if `IsmEndpointDetectStep` resolves to the legacy endpoint, fall back to the runtime `APPLY POLICY` path even for forward attachment. Modern OpenSearch (2.x and the modern AWS endpoint) supports `ism_template` natively.
+Caveat: `ism_template` inside a policy body is the modern endpoint shape. Older AWS-managed clusters served by the legacy `_opendistro/_ism` endpoint may not honor it; if `IsmEndpointDetectStep` resolves to the legacy endpoint, the greenfield row falls back to runtime `APPLY POLICY` (sample 4000's pattern, run once at install time, plus sample 9001's reconciliation pattern for ongoing changes). Modern OpenSearch (2.x and the modern AWS endpoint) supports `ism_template` natively.
 
 ### Cluster waits
 
From 998c355a516514c95007d4ac61645e6db9d55745 Mon Sep 17 00:00:00 2001
From: Brenton Farmer
Date: Mon, 4 May 2026 12:19:04 -0700
Subject: [PATCH 3/3] Docs: align README + site docs with parser drive-letter check and three-scope ISM framing

Two doc gaps surfaced during a documentation review:

1. Path-validation list (provider README + docs/site/opensearch.md)
   didn't mention the drive-letter rejection added in this PR's parser
   change. A Windows author writing @C:/foo would get a clear runtime
   error but no docs telling them what shapes the validator catches.

2. docs/site/opensearch.md ISM section described only runtime APPLY
   POLICY. The provider README and samples README in this PR introduce
   the 'Three temporal scopes for ISM attachment' framing (greenfield
   via ism_template, one-time backfill, ongoing reconciliation) and
   reference samples 4000/9000/9001 -- the site doc was the last surface
   still describing only the backfill case.

Site doc updates kept ASCII-only per site-build constraint.
---
 docs/site/opensearch.md | 18 +++++++++++++++++-
 .../README.md           |  2 ++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/docs/site/opensearch.md b/docs/site/opensearch.md
index 4bc2a8c..235ce04 100644
--- a/docs/site/opensearch.md
+++ b/docs/site/opensearch.md
@@ -175,6 +175,8 @@ JSON bodies attach to a statement via `WITH BODY `. The provider supports t
 The `@`-prefixed path loads an embedded resource relative to the migration's own resource folder. Use this for any body that would otherwise dominate the `statements.json` file -- large mappings, ISM policies, reusable templates. Subfolders are optional. Path validation is parse-time:
 
 - Absolute paths (leading `/` or `\`) are rejected -- body files must stay inside the migration's resource folder.
+- Drive-letter prefixes (`C:`, `c:`, ...) are rejected -- same reason. `Path.IsPathRooted` is platform-dependent (`C:/foo` reads as rooted on Windows but not on Linux); the validator checks the rooted shape explicitly so an author editing on one host can't produce a path that's silently rooted on another.
+- Any other `:` in the path is rejected -- embedded resource names don't use it.
 - `..` segments are rejected -- no parent-directory traversal.
 - Allowed characters: letters, digits, `_`, `-`, `.`, `/`, `\`.
 
@@ -509,12 +511,26 @@ Uploads the policy to `_plugins/_ism/policies` (or `_opendistro/_ism/policies` o
 APPLY POLICY TO [NO WAIT("")]
 ```
 
-Attaches the policy to existing indices matching the pattern via `_plugins/_ism/add`. The dispatcher inspects the response body and surfaces logical failures explicitly: HTTP 200 with `updated_indices: 0` is mapped to `Failed`, not silent OK. For future-only attachment, declare `ism_template.index_patterns` in the policy body (handled at index-creation time by the cluster).
+Attaches the policy to existing indices matching the pattern via `_plugins/_ism/add`. The dispatcher inspects the response body and surfaces logical failures explicitly: HTTP 200 with `updated_indices: 0` is mapped to `Failed`, not silent OK.
 
 ```json
 { "statement": "APPLY POLICY hot-warm-cold TO logs-*" }
 ```
 
+#### Three temporal scopes for ISM attachment
+
+ISM attachment to an index series isn't one problem with three solutions -- it's three different problems, each with its own right tool. Pick by *when* the indices that need the policy come into existence relative to the migration that owns the policy.
+
+| Scope | Right tool | Sample | Notes |
+|---|---|---|---|
+| **Greenfield** -- attach to indices that will be created in the future | `ism_template.index_patterns` in the policy body, `template.aliases` in the index template | 9000 -- `ForwardAttachmentLifecycle` | Cluster handles it lazily at index-creation time. No migration runtime cost. Won't help with indices that already exist when the migration runs. |
+| **One-time backfill** -- attach a policy to a set of indices that already exist at migration run time | Runtime `APPLY POLICY TO ` in a normal `[Migration(N)]` | 4000 -- `IsmPolicyAndApply` | Single-shot, journaled. Wildcards adapt to current cluster state at run time. Zero-updated -> `Failed` escalation makes it loud when the pattern matches nothing. |
+| **Ongoing reconciliation** -- keep all matching existing indices on the current policy as the policy evolves | Runtime `APPLY POLICY TO ` in a `[Migration(N, journal: false)]` | 9001 -- `OngoingPolicyReconciliation` | Re-runs on every startup. Idempotent on the wire (ISM's `change_policy` is a no-op for already-on-policy indices). The wildcard form is correct because the set of indices to reconcile changes as new ones roll over and old ones are deleted. |
+
+The three are stackable. A typical mature pipeline uses **greenfield** at install time, **one-time backfill** when an existing series first adopts the policy, and **ongoing reconciliation** as the policy definition evolves over the project's lifetime. Many pipelines never need more than one -- but you should choose deliberately rather than reach for runtime `APPLY POLICY` by default.
+
+Caveat: `ism_template` inside a policy body is the modern endpoint shape. Older AWS-managed clusters served by the legacy `_opendistro/_ism` endpoint may not honor it; if `IsmEndpointDetectStep` resolves to the legacy endpoint, the greenfield row falls back to runtime `APPLY POLICY` (sample 4000's pattern, run once at install time, plus sample 9001's reconciliation pattern for ongoing changes). Modern OpenSearch (2.x and the modern AWS endpoint) supports `ism_template` natively.
+
 ### WAIT FOR (cluster health)
 
 ```
diff --git a/src/Hyperbee.Migrations.Providers.OpenSearch/README.md b/src/Hyperbee.Migrations.Providers.OpenSearch/README.md
index b4adf9c..0bc3e92 100644
--- a/src/Hyperbee.Migrations.Providers.OpenSearch/README.md
+++ b/src/Hyperbee.Migrations.Providers.OpenSearch/README.md
@@ -119,6 +119,8 @@ Subfolders are optional. The path is just a relative file reference — `@foo.js
 Path validation is parse-time:
 
 - Absolute paths (leading `/` or `\`) are rejected — body files must stay inside the migration's resource folder.
+- Drive-letter prefixes (`C:`, `c:`, ...) are rejected — same reason. `Path.IsPathRooted` is platform-dependent (`C:/foo` reads as rooted on Windows but not on Linux); the validator checks the rooted shape explicitly so an author editing on one host can't produce a path that's silently rooted on another.
+- Any other `:` in the path is rejected — embedded resource names don't use it.
 - `..` segments are rejected — no parent-directory traversal.
 - Allowed characters: letters, digits, `_`, `-`, `.`, `/`, `\`.
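
The path-validation rules listed above can be sketched in C# as follows. This is a minimal illustration under stated assumptions, not the provider's parser: `BodyPathSketch` and `TryValidate` are hypothetical names, and the error messages are invented for the example. The sketch checks the rooted shape by hand instead of calling `Path.IsPathRooted`, so the result does not depend on the host OS.

```csharp
using System;

// Illustrative only: BodyPathSketch and TryValidate are hypothetical names,
// not part of Hyperbee.Migrations. The messages are invented for this sketch.
public static class BodyPathSketch
{
    public static bool TryValidate( string path, out string error )
    {
        error = string.Empty;

        if ( string.IsNullOrEmpty( path ) )
            return Fail( "body path is empty", out error );

        // A leading '/' or '\' means an absolute path.
        if ( path[0] == '/' || path[0] == '\\' )
            return Fail( "absolute path is not allowed; use a relative path", out error );

        // Drive-letter prefix (C:, c:, ...) is matched by shape, so the answer
        // is the same on Windows and Linux.
        if ( path.Length >= 2 && IsAsciiLetter( path[0] ) && path[1] == ':' )
            return Fail( "absolute path (drive-letter prefix) is not allowed", out error );

        // Any other ':' is rejected outright.
        if ( path.IndexOf( ':' ) >= 0 )
            return Fail( "':' is not allowed in a body path", out error );

        // No parent-directory traversal in any segment.
        foreach ( var segment in path.Split( '/', '\\' ) )
        {
            if ( segment == ".." )
                return Fail( "'..' segments are not allowed", out error );
        }

        // Remaining characters must come from the allowed set.
        foreach ( var c in path )
        {
            if ( !IsAllowed( c ) )
                return Fail( $"character '{c}' is not allowed in a body path", out error );
        }

        return true;
    }

    private static bool IsAsciiLetter( char c ) =>
        ( c >= 'a' && c <= 'z' ) || ( c >= 'A' && c <= 'Z' );

    private static bool IsAllowed( char c ) =>
        IsAsciiLetter( c ) || ( c >= '0' && c <= '9' )
        || c == '_' || c == '-' || c == '.' || c == '/' || c == '\\';

    private static bool Fail( string message, out string error )
    {
        error = message;
        return false;
    }
}
```

Under this sketch, `bodies/policy.json` passes, while `C:/foo/bar.json`, `/etc/x.json`, `bodies/foo:bar.json`, and `../x.json` are each rejected with a distinct message. Checking the drive-letter shape before the general `:` check keeps the "absolute path" wording for the case an author on Windows is most likely to hit.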