Skip to content

Idempotent CREATE POLICY (CAS retry on 409 version conflict)#119

Merged
bfarmer67 merged 2 commits intomainfrom
devs/bfarmer/opensearch-create-policy-cas
May 4, 2026
Merged

Idempotent CREATE POLICY (CAS retry on 409 version conflict)#119
bfarmer67 merged 2 commits intomainfrom
devs/bfarmer/opensearch-create-policy-cas

Conversation

@bfarmer67
Copy link
Copy Markdown
Contributor

Summary

Make CREATE POLICY idempotent so authors can use it inside [Migration(N, journal: false)] reconciliation migrations without hitting HTTP 409 every restart after the first.

ISM policies are versioned in OpenSearch — a plain PUT to an existing policy fails with 409 version_conflict_engine_exception. Previously the dispatcher surfaced that as Failed. Now it transparently:

  1. PUT /_plugins/_ism/policies/<id> (or the legacy prefix when IsmEndpointDetectStep resolved that path).
  2. On 409 → GET the existing policy, read _seq_no and _primary_term from the response root (ISM surfaces these at the document root, not under _source — unusual but documented).
  3. Retry the PUT with ?if_seq_no=N&if_primary_term=M query parameters.
  4. On second 409 → hard fail (concurrent writer between GET and retry; the migration lock should make this rare).

No behavior change when the policy doesn't exist — the first PUT succeeds, no GET, no retry. Mirrors LockHandle.RenewLockAsync's optimistic-concurrency pattern.

Why

PR #118 (currently in review) elevates the [Migration(N, journal: false)] + APPLY POLICY reconciliation pattern as one of three first-class temporal scopes for ISM attachment (sample 9001). The natural extension a team takes after copying that sample is to put CREATE POLICY in the same journal: false migration so the policy body itself stays in sync with source-of-truth on every deploy. Without idempotent CREATE POLICY, that pattern fails on the second startup with a confusing 409. This PR closes the gap so the reconciliation pattern is symmetric across attachment AND definition.

Tests

  • Integration: extended OpenSearchTemplatePolicyIntegrationTests with CreatePolicy_TwiceWithMutatedBody_SecondCallSucceedsViaCasRetry — creates a policy, mutates its description, creates again, asserts the second call succeeds and the post-condition GET shows the mutated body. Gated by the INTEGRATIONS define and runs in the nightly Testcontainers workflow.
  • Unit: 350/350 existing unit tests pass on net10.0. No new unit tests for the dispatcher CAS path — the repo has no StatementDispatcher mock infrastructure (all dispatch coverage lives in the integration suite); standing up mock plumbing for one method would be net-new work with marginal incremental value over the integration test.
  • dotnet format --verify-no-changes clean.

Out of scope

  • Surfacing if_seq_no / if_primary_term as a generally-available primitive for other verbs. The doc author flagged this as an optional follow-on; not needed for any current verb. Defer until a second consumer materializes.
  • ADR. This is a strict tightening of an existing verb's behavior (more cases succeed, none change), not an architecture decision.

Test plan

  • Provider builds on net8/9/10 — clean.
  • Unit suite green — 350/350 on net10.
  • Format clean.
  • CI green on this PR.
  • Nightly integration run picks up the new CreatePolicy_TwiceWithMutatedBody_* test.

Sequencing

This branches off main. Lands cleanly after PR #118 merges (no file conflicts — different files). The PR description references samples and README sections from #118; if this lands first, those references will be slightly forward-looking until #118 follows.

bfarmer67 added 2 commits May 4, 2026 12:09
ISM policies are versioned in OpenSearch — a plain PUT to an existing
policy fails with HTTP 409 version_conflict_engine_exception. Authors
adopting the [Migration(N, journal: false)] reconciliation pattern need
CREATE POLICY to be idempotent so the policy body can be upserted from
source-of-truth on every startup; the previous behavior surfaced 409
as Failed and forced authors to back out the pattern.

DispatchCreatePolicyAsync now retries on 409:

  1. PUT _plugins/_ism/policies/<id>
  2. On 409: GET the existing policy, extract _seq_no and _primary_term
     from the response root (ISM surfaces these at the doc root, not
     under _source — unusual but documented OpenSearch behavior).
  3. PUT again with ?if_seq_no=N&if_primary_term=M (CAS query params).
  4. On second 409: hard fail (concurrent writer between GET and retry).

No behavior change when the policy doesn't exist (first PUT succeeds,
no GET, no retry). Mirrors LockHandle.RenewLockAsync's optimistic-
concurrency pattern.

Integration test extends OpenSearchTemplatePolicyIntegrationTests with
a create-twice-with-mutated-body scenario that exercises the CAS retry
path and verifies the post-condition body via GET. Gated by the
INTEGRATIONS define and runs in the nightly Testcontainers workflow.

Provider README ISM section grows a paragraph documenting the
idempotency contract.

Note: dispatch-level unit tests (mocking IOpenSearchLowLevelClient) are
out of scope — the repo has no existing dispatcher unit-test
infrastructure; all dispatch coverage lives in the integration suite.
Adding mock plumbing for one method is net-new work for marginal
incremental value over the integration test.
Match the provider README paragraph added with the dispatcher change.
Site doc kept ASCII-only per site-build constraint.
@bfarmer67 bfarmer67 merged commit 4035678 into main May 4, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant