feat(charts): fold eval-runtime + operator-slo into the operator chart (0.2.0) #34

Merged: stxkxs merged 1 commit into main from feat/fold-eval-runtime-slo-into-operator-chart on Jun 8, 2026
@stxkxs stxkxs commented Jun 8, 2026

What

The chart side of consolidating eks-agent-platform/gitops/ into the single eks-gitops catalog (#33). Brings the operator's own runtime — the eval-runtime and SLO — into charts/operator behind values toggles, so the product ships its own runtime and eks-gitops just deploys the chart.

eval-runtime (evalRuntime.*, default on)

The Argo Workflows runtime the operator submits EvalSuite runs to.

  • templates/eval-runtime/{namespace,serviceaccount,rbac}.yaml — templated with the chart label helpers. SA name/namespace stay byte-pinned to eval-runner/eval-runner (the terraform/components/eval-runtime IRSA trust); role ARN injected per-cluster (empty in chart — embeds the account id).
  • templates/eval-runtime/{workflowtemplate,analysistemplate}.yaml — thin .Files.Get wrappers over files/eval-runtime/*, so the Argo mustache ({{workflow.parameters}}, {{args.*}}) is emitted verbatim; only the bucket / gateway-url / namespace literals are substituted. AnalysisTemplate gated behind evalRuntime.rollouts.enabled (off — needs the Rollouts CRD).
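A minimal sketch of what such a thin wrapper might look like, for readers unfamiliar with the pattern. `.Files.Get` returns the file body as a plain string that Helm does not re-parse, which is why the Argo `{{workflow.parameters}}` expressions pass through untouched. The placeholder tokens (`__EVAL_BUCKET__`, `__GATEWAY_URL__`) and the value keys `evalRuntime.enabled`, `evalRuntime.bucket`, and `evalRuntime.gatewayUrl` are assumptions for illustration, not names confirmed by this PR.

```yaml
{{- /* templates/eval-runtime/workflowtemplate.yaml (illustrative sketch only) */ -}}
{{- /* evalRuntime.enabled is an assumed name for the "default on" toggle */ -}}
{{- if .Values.evalRuntime.enabled }}
{{- /* Raw file body: Helm does not template it, so Argo mustache stays literal */ -}}
{{- $raw := .Files.Get "files/eval-runtime/workflowtemplate.yaml" }}
{{- /* Hypothetical placeholder tokens, swapped for chart values */ -}}
{{- $raw = replace "__EVAL_BUCKET__" .Values.evalRuntime.bucket $raw }}
{{- $raw = replace "__GATEWAY_URL__" .Values.evalRuntime.gatewayUrl $raw }}
{{ $raw }}
{{- end }}
```

The design trade-off is that the wrapped file cannot use Helm expressions at all; anything chart-specific has to go through literal substitution, which keeps the two template languages from colliding.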

operator SLO (slo.*, default on; alerting off)

  • .Files.Get wrappers for PrometheusRule / AlertmanagerConfig / CR-state ConfigMap preserve the Prometheus + Alertmanager mustache. PromQL namespace selectors map to slo.operatorNamespace. Alerting is off by default (slo.alerting.enabled; the receivers need six external Secrets). The CR-state ConfigMap is inert until kube-state-metrics mounts it (eks-gitops KSM addon); this is noted in NOTES + README.
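To make the toggle layout concrete, the defaults described above could be sketched in values.yaml roughly as follows. Only `evalRuntime.rollouts.enabled`, `slo.operatorNamespace`, and `slo.alerting.enabled` are named in this PR; every other key and every value shown is an assumption for illustration.

```yaml
# values.yaml (illustrative sketch; see the notes on which keys are assumed)
evalRuntime:
  enabled: true              # assumed key name for the "default on" toggle
  bucket: ""                 # assumed key; substituted into the WorkflowTemplate
  gatewayUrl: ""             # assumed key
  serviceAccount:
    roleArn: ""              # assumed key; empty in the chart, injected per cluster
  rollouts:
    enabled: false           # named in the PR; off because it needs the Rollouts CRD

slo:
  enabled: true              # assumed key name for the "default on" toggle
  operatorNamespace: operator-system   # key named in the PR; value is hypothetical
  alerting:
    enabled: false           # named in the PR; receivers need six external Secrets
```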

Chart 0.1.1 → 0.2.0 (appVersion unchanged — operator binary untouched).

Verification

  • helm lint clean; default render emits WorkflowTemplate + PrometheusRule + CR-state ConfigMap, omits the gated AnalysisTemplate + AlertmanagerConfig.
  • Full-toggle render: all Argo/Prometheus/Alertmanager mustache survives literally ({{workflow.parameters}} ×12, {{ $labels }} ×8, {{args}} ×3), bucket substituted, REPLACE_BY_APPLICATIONSET gone, kubeconform clean (13 valid, 0 invalid).

Merge order (part of the #33 consolidation)

This is PR-2 of the consolidation. eks-gitops sources charts/operator from git main, so on merge the operator picks up the default-on eval/slo templates. PR-3 (eks-gitops) then enables them per-env + injects the eval IRSA; gitops/ is deleted last (PR-6). Blueprint: .plans/gitops-consolidation.md (local).
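For orientation, the per-env step in PR-3 might amount to a values override along these lines. This is a sketch under assumptions: the key names are hypothetical (this PR does not list them), and the bucket name, account id, and role name are placeholders rather than real resources.

```yaml
# Hypothetical per-environment override supplied from eks-gitops
evalRuntime:
  bucket: example-eval-results-dev     # placeholder bucket name
  serviceAccount:
    # Placeholder ARN; the real one embeds the account id, so it stays out of the chart
    roleArn: arn:aws:iam::111122223333:role/example-eval-runner
```

The ServiceAccount name and namespace are deliberately absent from the override, since they stay pinned to eval-runner/eval-runner to match the IRSA trust.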

Refs #33

…t behind toggles (0.2.0)

Brings the operator's own runtime into its Helm chart so the product ships its own eval +
observability, instead of a separate gitops overlay deploying them. eks-gitops deploys the
chart; this is the chart side of consolidating eks-agent-platform/gitops.

eval-runtime (evalRuntime.*, default on) — the Argo Workflows runtime the operator submits
EvalSuite runs to. templates/eval-runtime/{namespace,serviceaccount,rbac}.yaml are templated
with the chart label helpers; the SA name/namespace stay byte-pinned to eval-runner/eval-runner
(the terraform/components/eval-runtime IRSA trust) and the role ARN is injected per-cluster
(empty in the chart — it embeds the account id). workflowtemplate/analysistemplate are thin
.Files.Get wrappers over files/eval-runtime/* so the Argo mustache ({{workflow.parameters}},
{{inputs.parameters}}, {{args.*}}) is emitted verbatim — only the bucket / gateway-url / namespace
literals are substituted. The AnalysisTemplate is gated behind evalRuntime.rollouts.enabled (off —
needs the Argo Rollouts CRD).

operator SLO (slo.*, default on; alerting off) — namespace templated; prometheusrule/
alertmanagerconfig/customresourcestatemetrics are .Files.Get wrappers preserving the Prometheus
({{ $labels }}, {{ $value | humanize* }}) and Alertmanager ({{ template }}, {{ range .Alerts }})
mustache. The PromQL namespace selectors map to slo.operatorNamespace. AlertmanagerConfig is gated
behind slo.alerting.enabled (off — its receivers need six external Secrets). The CR-state ConfigMap
is inert until kube-state-metrics mounts it (--custom-resource-state-config-file, owned by the
eks-gitops KSM addon) — noted in NOTES + README.

Chart 0.1.1 → 0.2.0 (appVersion unchanged — the operator binary is untouched). values.yaml gains
the evalRuntime + slo blocks; README + NOTES document the toggles, the IRSA injection, and the two
external prereqs. .gitignore ignores the local .plans/ working dir.

Verified: helm lint clean; default render emits the WorkflowTemplate + PrometheusRule + CR-state
ConfigMap and omits the gated AnalysisTemplate + AlertmanagerConfig; full-toggle render keeps all
Argo/Prometheus/Alertmanager mustache literal, substitutes the bucket, and passes kubeconform.

Refs #33
@stxkxs stxkxs merged commit 629a653 into main Jun 8, 2026
15 checks passed
@stxkxs stxkxs deleted the feat/fold-eval-runtime-slo-into-operator-chart branch June 8, 2026 00:46