feat(scrapeconfig): add scrape_failure_log_file support #8426

Open
slashpai wants to merge 1 commit into prometheus-operator:main from slashpai:scrape_failure_log_file

Conversation

@slashpai
Contributor

@slashpai slashpai commented Mar 5, 2026

Fixes #8425

Description

Describe the big picture of your changes here to communicate to the maintainers why we should accept this pull request.

Closes: #ISSUE-NUMBER

If you're contributing for the first time, check our contribution guidelines.

Type of change

What type of changes does your code introduce to the Prometheus operator? Put an x in the boxes that apply.

  • CHANGE (fix or feature that would cause existing functionality to not work as expected)
  • FEATURE (non-breaking change which adds functionality)
  • BUGFIX (non-breaking change which fixes an issue)
  • ENHANCEMENT (non-breaking change which improves existing functionality)
  • NONE (if none of the other choices apply. Example, tooling, build system, CI, docs, etc.)

Verification

Please check the Prometheus-Operator testing guidelines for recommendations about automated tests.

Changelog entry

Please put a one-line changelog entry below. This will be copied to the changelog file during the release process.

Add scrape_failure_log_file support in ScrapeConfig CRD

@slashpai slashpai requested a review from a team as a code owner March 5, 2026 06:31
heliapb previously approved these changes Mar 5, 2026
Member

@heliapb heliapb left a comment

lgtm

@slashpai slashpai force-pushed the scrape_failure_log_file branch from 86c8ca2 to c828e1a Compare March 5, 2026 11:06
@simonpasquier
Contributor

What are the potential concerns for environments where Prometheus is provided as-a-service (e.g. a platform team manages the workload resources and application teams manage the configuration resources)?

Is there a risk of app teams tampering with the filesystem? Should there be an option for the workload owner to disallow scrape failure logs? Should the API instead be an on/off toggle, with the operator generating a predictable file path from the resource namespace/name?

@slashpai
Contributor Author

> What are the potential concerns for environments where Prometheus is provided as-a-service (e.g. a platform team manages the workload resources and application teams manage the configuration resources)?
>
> Is there a risk of app teams tampering with the filesystem? Should there be an option for the workload owner to disallow scrape failure logs? Should the API instead be an on/off toggle, with the operator generating a predictable file path from the resource namespace/name?

You are right. The global ScrapeFailureLogFile on PrometheusSpec doesn't have this problem, but the per-job version does. I like the toggle-based approach, where the operator generates the path. Maybe we could also add disableScrapeFailureLogFile to let workload owners opt out of the feature entirely.

@slashpai slashpai force-pushed the scrape_failure_log_file branch from c828e1a to b9fc062 Compare March 12, 2026 10:17
@pull-request-size pull-request-size bot added size/L and removed size/M labels Mar 12, 2026
@slashpai slashpai marked this pull request as draft March 12, 2026 10:41
@slashpai slashpai force-pushed the scrape_failure_log_file branch from b9fc062 to f8118d7 Compare March 13, 2026 07:33
Add scrapeFailureLogFile (*bool) to ScrapeConfigSpec. When enabled, the
operator generates a deterministic per-job log path:
/var/log/prometheus/scrapeconfig-<namespace>-<name>.log

Add disableScrapeFailureLogFile (*bool) to CommonPrometheusFields so
platform teams can prevent application teams from enabling per-job scrape
failure log files in multi-tenant environments.

The /var/log/prometheus emptyDir volume is mounted automatically unless
the feature is disabled at the workload level.

Fixes prometheus-operator#8425
Signed-off-by: Jayapriya Pai <slashpai9@gmail.com>
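
For readers following the design, the behaviour described in the commit message can be sketched roughly as below. This is an illustrative sketch, not the code in this PR: the helper names `scrapeFailureLogPath` and `resolveScrapeFailureLog` are hypothetical, and the precedence shown (the workload-level `disableScrapeFailureLogFile` wins over the per-ScrapeConfig `scrapeFailureLogFile` toggle) is an assumption drawn from the commit message. Only the directory and the file name pattern come from the text above.

```go
package main

import (
	"fmt"
	"path"
)

// logDir is the emptyDir mount point named in the commit message.
const logDir = "/var/log/prometheus"

// scrapeFailureLogPath (hypothetical name) builds the deterministic per-job
// path /var/log/prometheus/scrapeconfig-<namespace>-<name>.log.
func scrapeFailureLogPath(namespace, name string) string {
	return path.Join(logDir, fmt.Sprintf("scrapeconfig-%s-%s.log", namespace, name))
}

// resolveScrapeFailureLog (hypothetical name) decides whether a per-job log
// file should be configured and, if so, which path to use.
//   - disabled mirrors disableScrapeFailureLogFile on the workload resource.
//   - enabled mirrors scrapeFailureLogFile on the ScrapeConfig.
//
// Assumption: the workload-level opt-out takes precedence over the toggle.
func resolveScrapeFailureLog(disabled, enabled *bool, namespace, name string) (string, bool) {
	if disabled != nil && *disabled {
		return "", false // platform team has turned the feature off
	}
	if enabled == nil || !*enabled {
		return "", false // application team did not opt in
	}
	return scrapeFailureLogPath(namespace, name), true
}

func main() {
	on := true

	// Feature allowed by the workload owner and enabled on the ScrapeConfig.
	p, ok := resolveScrapeFailureLog(nil, &on, "team-a", "blackbox")
	fmt.Println(p, ok)

	// Same ScrapeConfig, but the workload owner has disabled the feature.
	p, ok = resolveScrapeFailureLog(&on, &on, "team-a", "blackbox")
	fmt.Printf("%q %v\n", p, ok)
}
```

Because the path is derived only from the resource namespace and name, application teams never supply a filesystem path themselves, which is the concern raised earlier in the review.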
@slashpai slashpai force-pushed the scrape_failure_log_file branch from f8118d7 to a4ccd1e Compare March 16, 2026 04:30
@slashpai slashpai marked this pull request as ready for review March 16, 2026 08:33

Development

Successfully merging this pull request may close these issues.

Support ScrapeFailureLogFile in ScrapeConfig crd

3 participants