-
Notifications
You must be signed in to change notification settings - Fork 15
feat: add prometheus verification support #795
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds Prometheus as a new MetricProvider across API schemas, generated Go models, provider factory, a full Prometheus provider implementation with tests, and web UI components + runtime validation for Prometheus metrics. Changes
Sequence DiagramsequenceDiagram
participant Client as Client
participant Factory as ProviderFactory
participant Prov as PrometheusProvider
participant Server as PrometheusServer
participant Ctx as ProviderContext
Client->>Factory: CreateProvider(config)
Factory->>Prov: NewFromOAPI(config.prometheus)
Client->>Prov: Measure(ctx, providerCtx)
Prov->>Ctx: Resolve templates (query, headers, auth)
Ctx-->>Prov: Resolved values
Prov->>Server: HTTP GET (instant or range) with params & headers
Server-->>Prov: JSON response (vector/matrix/scalar)
Prov->>Prov: Parse & normalize response
Prov-->>Client: (timestamp, metrics map, error)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 🧹 Recent nitpick comments
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@cursoragent review |
|
Here’s a concise review of “feat: add prometheus verification support”:
Code notes: func buildQueryURL(config *oapi.PrometheusMetricProvider, now time.Time) (string, error) {
...
if config.Step != nil && *config.Step != "" {
...
start := end.Add(-step * 10) // look back 10 steps by default
...
return address + "/api/v1/query_range?" + params.Encode(), nil
}
return address + "/api/v1/query?" + params.Encode(), nil
}func parsePrometheusDuration(s string) (time.Duration, error) {
...
switch unit {
case 's','m','h','d':
default:
return 0, fmt.Errorf("unknown duration unit: %c", unit)
}
}for _, pair := range m.Values {
ts, _ := parseScalarValue(pair[0])
v, err := parseScalarValue(pair[1])
if err != nil { continue }
values = append(values, map[string]any{"timestamp": ts, "value": v})
}Docs:
If you want, I can add a first pass of the unit tests and the Prometheus docs page in this branch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@apps/workspace-engine/pkg/workspace/releasemanager/verification/metrics/provider/prometheus/prometheus.go`:
- Around line 325-350: The current parsePrometheusDuration function restricts
units to only s/m/h/d and rejects valid Prometheus durations (e.g., ms, us, w);
replace its implementation to delegate parsing to
github.com/prometheus/common/model's model.ParseDuration: update imports to
include that package, call model.ParseDuration on the input string inside
parsePrometheusDuration, convert the returned model.Duration to time.Duration
and return it (handle and wrap parse errors), and remove the manual unit switch
in parsePrometheusDuration so it fully supports Prometheus duration formats.
🧹 Nitpick comments (1)
apps/workspace-engine/pkg/workspace/releasemanager/verification/metrics/provider/prometheus/prometheus.go (1)
151-168: Consider explicitly settingMinVersionfor clarity and defense-in-depth.The custom
tls.Configat line 161 does not setMinVersion. Go 1.18+ defaults to TLS 1.2+ for clients whenMinVersionis unset, so this is not a security vulnerability. However, explicitly pinningMinVersiontotls.VersionTLS12(or higher) documents the security requirement and reduces reliance on version-dependent defaults.Suggested change
if config.InsecureSkipVerify != nil && *config.InsecureSkipVerify { client.Transport = &http.Transport{ - TLSClientConfig: &tls.Config{InsecureSkipVerify: true}, //nolint:gosec + TLSClientConfig: &tls.Config{ + InsecureSkipVerify: true, //nolint:gosec + MinVersion: tls.VersionTLS12, + }, } }
...e-engine/pkg/workspace/releasemanager/verification/metrics/provider/prometheus/prometheus.go
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@apps/workspace-engine/oapi/spec/schemas/verification.jsonnet`:
- Around line 244-329: The Prometheus rangeQuery start/end handling is wrong:
parsePrometheusDuration only accepts simple durations and the code silently
falls back to defaults, and the provided end value is parsed/checked but never
used (always replaced with the now variable). Fix by either implementing proper
expression parsing for "now()" and "now() - duration(X)" in
parsePrometheusDuration (or add a new parsePrometheusTimeExpr helper) so start
and end accept those expressions, or else update the schema/examples to only
show simple durations; additionally ensure the parsed end value is actually
applied instead of always using now (replace the code that assigns end = now
with using the parsed end variable where rangeQuery is built). Reference
parsePrometheusDuration and the rangeQuery start/end handling in the
PrometheusMetricProvider processing logic.
In
`@apps/workspace-engine/pkg/workspace/releasemanager/verification/metrics/provider/prometheus/prometheus.go`:
- Around line 171-207: In buildQueryURL, the RangeQuery start/end overrides are
effectively ignored because an invalid start is silently skipped and the
RangeQuery.End branch is empty so the local end variable is never updated; fix
by validating and applying both config.RangeQuery.Start and
config.RangeQuery.End using parsePrometheusDuration (returning an error on
invalid formats instead of swallowing it), set start and end accordingly (e.g.,
override the computed start/end when parsed durations present), and then use
formatTimestamp to set the "start" and "end" params before returning; reference
buildQueryURL, config.RangeQuery.Start, config.RangeQuery.End,
parsePrometheusDuration, start, end, and params.Set to locate the changes.
- Around line 151-169: resolveAuthentication currently templates oauth2 fields
but the codebase never uses oauth2 tokens (only bearer tokens), so oauth2
configs are ignored and cause failures; either implement OAuth2 token retrieval
or fail fast: add a validation in the Prometheus provider constructor (e.g.,
NewPrometheusProvider / NewPrometheusMetricsProvider) to detect
prometheusAuth.Oauth2 != nil and return an error explaining oauth2 is not
supported, or alternatively implement OAuth2 token acquisition and injection
where bearer tokens are used (update resolveAuthentication usage and the
request-auth path to use the retrieved access token). Ensure you reference
prometheusAuth, resolveAuthentication and the provider constructor to locate
where to add the check or integrate token retrieval.
🧹 Nitpick comments (2)
apps/api/src/types/openapi.ts (1)
1262-1268: Alignheadersshape across metric providers for API consistency.
PrometheusMetricProvider.headersuses an array format{ key: string; value: string }[], whileHTTPMetricProvider.headers(line ~175 in the source schema) uses an object map{ [key: string]: string }. This inconsistency surfaces in all generated SDKs and client libraries, creating confusion for users configuring both provider types.The array format for Prometheus may support duplicate headers, but this rationale isn't documented. Consider standardizing both providers to the same representation (either both as maps or both as arrays) in
apps/api/openapi/schemas/verifications.jsonnetunless duplicate keys are an intentional, documented requirement specific to Prometheus.apps/workspace-engine/pkg/workspace/releasemanager/verification/metrics/provider/prometheus/prometheus.go (1)
57-79: Add Go doc comments for exported symbols.Line 57 onward includes exported
PrometheusProviderand its constructors/methods without doc comments. Please add Go doc comments for the exported type/functions.✍️ Example doc comment pattern
- type PrometheusProvider struct { + // PrometheusProvider implements provider.Provider by querying Prometheus. + type PrometheusProvider struct {As per coding guidelines: Write comments that explain why, document complex logic/algorithms, provide non-obvious context, include TODO/FIXME, and document exported functions/types/methods.
...e-engine/pkg/workspace/releasemanager/verification/metrics/provider/prometheus/prometheus.go
Show resolved
Hide resolved
...e-engine/pkg/workspace/releasemanager/verification/metrics/provider/prometheus/prometheus.go
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In
`@apps/web/app/routes/ws/deployments/_components/release-targets/prometheus/prometheus-metric.ts`:
- Around line 12-25: The schema drops range-query time-series; update
prometheusResultEntry to include an optional "values" field that matches
Prometheus range output (an array of [timestamp, value] tuples). Specifically,
in the prometheusResultEntry z.object add values: z.array(z.tuple([z.number(),
z.string()])).optional() (keep existing metric and value fields unchanged) so
prometheusMeasurementData can accept both instant "value" and range "values".
- Around line 3-8: The schema prometheusProviderConfig currently omits backend
fields auth, headers, rangeQuery, and insecure so the UI strips them; update
prometheusProviderConfig to include these fields with appropriate optional types
that mirror the backend model (e.g., an optional auth object/schema, optional
headers as a record/dictionary of strings, optional rangeQuery (string or
structured type per backend), and optional insecure as boolean) so the UI parser
can surface and preserve those values; ensure names match exactly (auth,
headers, rangeQuery, insecure) and types align with the backend provider model
used elsewhere.
In
`@apps/web/app/routes/ws/deployments/_components/release-targets/prometheus/Prometheus.tsx`:
- Around line 14-119: PrometheusVerificationDisplay, ProviderInfo,
MeasurementTrend, and MeasurementRow are plain functions instead of typed React
components; extract each component's props into a named interface (e.g.,
PrometheusVerificationDisplayProps, ProviderInfoProps, MeasurementTrendProps,
MeasurementRowProps) and convert each to the project pattern using React.FC/FC
(e.g., const PrometheusVerificationDisplay:
FC<PrometheusVerificationDisplayProps> = ({ metric }) => { ... }) so all four
components use the FC<Props> typing; keep the same prop names (metric, query,
address, successCondition, measurements, measurement) and return values
unchanged.
In
`@apps/web/app/routes/ws/deployments/_components/release-targets/prometheus/PrometheusIcon.tsx`:
- Around line 1-12: The PrometheusIcon component should use an explicit props
interface and React.FC typing: create an interface (e.g., PrometheusIconProps)
with className?: string, change the declaration of PrometheusIcon to a typed
React functional component (e.g., export const PrometheusIcon:
React.FC<PrometheusIconProps> = ({ className }) => { ... }), and ensure React is
imported if not already; keep the SVG return unchanged.
In
`@apps/workspace-engine/pkg/workspace/releasemanager/verification/metrics/provider/prometheus/prometheus.go`:
- Around line 407-422: The code in extractMatrixResults (and similarly in
extractVectorResults) only assigns primary when i == 0, so if the first series
is skipped primary stays nil even though later series have valid values; change
the logic to set primary when the first successfully parsed value is encountered
(e.g., after parseScalarValue succeeds) rather than relying on the loop index,
by checking if primary == nil and then assigning primary = &val; update both
extractMatrixResults and extractVectorResults accordingly so data["value"] is
populated from the first valid parsed value.
- Around line 256-298: fetchOAuth2Token currently uses http.DefaultClient (no
timeout/TLS config); change it to accept a *http.Client (or build one from the
existing Prometheus client config with the configured timeout and TLS
InsecureSkipVerify) so token requests respect timeout and TLS settings, update
the caller setAuthHeader to accept/pass that client through to fetchOAuth2Token,
and ensure the new client is used for the token HTTP request and error handling.
🧹 Nitpick comments (4)
apps/web/app/routes/ws/deployments/_components/release-targets/prometheus/Prometheus.tsx (1)
83-85: Prefer a stable key instead of the array index.Using
measuredAtavoids UI glitches when the list reorders.Suggested update
- {measurements.map((m, i) => ( - <MeasurementRow key={i} measurement={m} /> - ))} + {measurements.map((m) => ( + <MeasurementRow key={m.measuredAt} measurement={m} /> + ))}apps/web/app/routes/ws/deployments/_components/release-targets/prometheus/prometheus-metric.ts (1)
10-10: Prefer exported interfaces for public types.As per coding guidelines, "Prefer interfaces for public APIs in TypeScript".♻️ Suggested refactor
-export type PrometheusProviderConfig = z.infer<typeof prometheusProviderConfig>; +export interface PrometheusProviderConfig + extends z.infer<typeof prometheusProviderConfig> {} -export type PrometheusMeasurementData = z.infer< - typeof prometheusMeasurementData ->; +export interface PrometheusMeasurementData + extends z.infer<typeof prometheusMeasurementData> {}Also applies to: 27-29
apps/workspace-engine/pkg/workspace/releasemanager/verification/metrics/provider/prometheus/prometheus.go (2)
308-312: Static analysis: missingMinVersionon TLS config.When
insecureis enabled, the custom transport lacksMinVersion, defaulting to TLS 1.0 for server connections. Since this path is already opted-in as insecure, the risk is limited, but setting a floor is still good practice.♻️ Suggested fix
if config.Insecure != nil && *config.Insecure { client.Transport = &http.Transport{ - TLSClientConfig: &tls.Config{InsecureSkipVerify: true}, //nolint:gosec + TLSClientConfig: &tls.Config{ + InsecureSkipVerify: true, //nolint:gosec + MinVersion: tls.VersionTLS12, + }, } }
221-234: Custom headers can silently override theAuthorizationheader set by auth config.
setHeadersapplies auth first (Line 222), then iterates user-provided headers (Line 230-232). If a user provides anAuthorizationheader, it will overwrite the bearer/OAuth2 token. This might be intentional for flexibility, but it could also be a confusing footgun. Consider at minimum logging a warning if a custom header overwritesAuthorization.
| export const prometheusProviderConfig = z.object({ | ||
| type: z.literal("prometheus"), | ||
| address: z.string(), | ||
| query: z.string(), | ||
| timeout: z.number().optional(), | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prometheus provider schema drops auth/headers/rangeQuery/insecure fields.
These fields exist in the backend provider model, but are currently stripped by the schema, which means the UI parser can’t surface or preserve them.
🧩 Suggested schema expansion
+const prometheusHeader = z.object({
+ key: z.string(),
+ value: z.string(),
+});
+
+const prometheusOauth2 = z.object({
+ clientId: z.string(),
+ clientSecret: z.string(),
+ tokenUrl: z.string(),
+ scopes: z.array(z.string()).optional(),
+});
+
+const prometheusAuth = z.object({
+ bearerToken: z.string().optional(),
+ oauth2: prometheusOauth2.optional(),
+});
+
+const prometheusRangeQuery = z.object({
+ start: z.string().optional(),
+ end: z.string().optional(),
+ step: z.string(),
+});
+
export const prometheusProviderConfig = z.object({
type: z.literal("prometheus"),
address: z.string(),
query: z.string(),
timeout: z.number().optional(),
+ insecure: z.boolean().optional(),
+ headers: z.array(prometheusHeader).optional(),
+ authentication: prometheusAuth.optional(),
+ rangeQuery: prometheusRangeQuery.optional(),
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export const prometheusProviderConfig = z.object({ | |
| type: z.literal("prometheus"), | |
| address: z.string(), | |
| query: z.string(), | |
| timeout: z.number().optional(), | |
| }); | |
| const prometheusHeader = z.object({ | |
| key: z.string(), | |
| value: z.string(), | |
| }); | |
| const prometheusOauth2 = z.object({ | |
| clientId: z.string(), | |
| clientSecret: z.string(), | |
| tokenUrl: z.string(), | |
| scopes: z.array(z.string()).optional(), | |
| }); | |
| const prometheusAuth = z.object({ | |
| bearerToken: z.string().optional(), | |
| oauth2: prometheusOauth2.optional(), | |
| }); | |
| const prometheusRangeQuery = z.object({ | |
| start: z.string().optional(), | |
| end: z.string().optional(), | |
| step: z.string(), | |
| }); | |
| export const prometheusProviderConfig = z.object({ | |
| type: z.literal("prometheus"), | |
| address: z.string(), | |
| query: z.string(), | |
| timeout: z.number().optional(), | |
| insecure: z.boolean().optional(), | |
| headers: z.array(prometheusHeader).optional(), | |
| authentication: prometheusAuth.optional(), | |
| rangeQuery: prometheusRangeQuery.optional(), | |
| }); |
🤖 Prompt for AI Agents
In
`@apps/web/app/routes/ws/deployments/_components/release-targets/prometheus/prometheus-metric.ts`
around lines 3 - 8, The schema prometheusProviderConfig currently omits backend
fields auth, headers, rangeQuery, and insecure so the UI strips them; update
prometheusProviderConfig to include these fields with appropriate optional types
that mirror the backend model (e.g., an optional auth object/schema, optional
headers as a record/dictionary of strings, optional rangeQuery (string or
structured type per backend), and optional insecure as boolean) so the UI parser
can surface and preserve those values; ensure names match exactly (auth,
headers, rangeQuery, insecure) and types align with the backend provider model
used elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we donot want to surface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adityachoudhari26, understood! I'll note that the UI schema intentionally exposes only a minimal subset of the Prometheus provider configuration.
✏️ Learnings added
Learnt from: adityachoudhari26
Repo: ctrlplanedev/ctrlplane PR: 795
File: apps/web/app/routes/ws/deployments/_components/release-targets/prometheus/prometheus-metric.ts:3-8
Timestamp: 2026-02-09T19:53:01.662Z
Learning: In the Prometheus provider UI schema at `apps/web/app/routes/ws/deployments/_components/release-targets/prometheus/prometheus-metric.ts`, the `prometheusProviderConfig` intentionally only includes `type`, `address`, `query`, and `timeout` fields. Backend-only fields like `authentication`, `headers`, `rangeQuery`, and `insecure` are deliberately excluded from the UI surface.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: adityachoudhari26
Repo: ctrlplanedev/ctrlplane PR: 637
File: packages/events/src/kafka/client.ts:10-16
Timestamp: 2025-08-01T04:41:41.345Z
Learning: User adityachoudhari26 prefers not to add null safety checks for required environment variables when they are guaranteed to be present in their deployment configuration, similar to their preference for simplicity over defensive programming in test code.
Learnt from: adityachoudhari26
Repo: ctrlplanedev/ctrlplane PR: 601
File: e2e/tests/api/policies/retry-policy.spec.ts:23-24
Timestamp: 2025-06-24T23:52:50.732Z
Learning: The user adityachoudhari26 prefers not to add null safety checks or defensive programming in test code, particularly in e2e tests, as they prioritize simplicity and focus on the main functionality being tested rather than comprehensive error handling within the test itself.


Summary by CodeRabbit
New Features
Tests