feat(cli): add rule scorecard for per-rule keep/tune/retire verdicts #243

Merged
mostafa merged 5 commits into main from feat/rule-scorecard on Jun 22, 2026

@mostafa mostafa commented Jun 22, 2026

Summary

Adds `rsigma rule scorecard`, the fusion-and-verdict layer that turns the toolkit's existing rule-side outputs into the per-rule keep/tune/retire table a detection program reviews on a cadence. It reads JSON the toolkit already emits, so it adds no new collection or evaluation; it is an offline rule-group command with no engine or hot-path involvement.

  • Fuses the required backtest report (precision proxy, recall, the corpus false-positive signal) and coverage report (ATT&CK mapping, per-technique rule count) into a per-rule record keyed by `rule_id`, optionally enriched by a Prometheus production-volume snapshot or endpoint (joined by `rule_title`, colliding titles summed and flagged), a Prometheus query-API range window for last-fired, and a triage disposition feed for the live false-positive ratio and MTTD/MTTR. Each cell records its source; a missing optional input degrades the verdict rather than blocking it.
  • Verdict bands default to the SOC quality-metrics thresholds and are fully configurable. A retire candidate that is the sole coverage for an ATT&CK technique is downgraded to tune with a coverage-risk note, so coverage is never silently dropped.
  • Renders through the global `--output-format` layer (table/json/ndjson/csv/tsv) plus a `--report` Markdown or HTML artifact grouped by verdict, with `--fail-on <none|tune|retire>` for CI and the house exit codes (0/1/2/3).
  • Config. A `scorecard` config section follows the layered-config conventions; the verdict thresholds carry single-source defaults pinned to the clap flags by a drift-guard test, and every input (including the two required reports) plus the report path can come from the config file. `rule coverage` likewise now accepts its rule paths from `coverage.rules`.
  • No new dependencies. The Prometheus exposition-snapshot parser is hand-rolled and std-only (the single new untrusted-input surface); the query-API path reuses the existing ureq client.
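
The verdict banding, the sole-coverage downgrade, and the `--fail-on` gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the type names, the field names, the precision-only banding, and the reading of `--fail-on tune` as "tune or worse" are all assumptions made for the example.

```rust
/// Per-rule verdict, declared from least to most severe so variants compare in that order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Verdict {
    Keep,
    Tune,
    Retire,
}

/// Hypothetical threshold bands (the real defaults come from the scorecard config).
pub struct Bands {
    /// At or above this precision a rule is kept.
    pub keep_min_precision: f64,
    /// Below this precision a rule becomes a retire candidate.
    pub retire_below_precision: f64,
}

/// Hypothetical fused record: one row per rule_id.
pub struct RuleRecord {
    pub rule_id: String,
    pub precision: f64,
    /// Rules (including this one) mapped to the same ATT&CK technique; 0 if unmapped.
    pub technique_rule_count: usize,
}

/// Classify a rule and return the verdict plus an optional coverage-risk note.
pub fn verdict(rec: &RuleRecord, bands: &Bands) -> (Verdict, Option<String>) {
    let raw = if rec.precision >= bands.keep_min_precision {
        Verdict::Keep
    } else if rec.precision < bands.retire_below_precision {
        Verdict::Retire
    } else {
        Verdict::Tune
    };
    // A retire candidate that is the only rule on its technique is downgraded to tune.
    if raw == Verdict::Retire && rec.technique_rule_count == 1 {
        let note = format!(
            "coverage risk: {} is the sole rule for its technique",
            rec.rule_id
        );
        return (Verdict::Tune, Some(note));
    }
    (raw, None)
}

/// Assumed semantics of the CI gate: trip when any verdict is at or above the chosen level.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FailOn {
    None,
    Tune,
    Retire,
}

pub fn gate_trips(fail_on: FailOn, verdicts: &[Verdict]) -> bool {
    let floor = match fail_on {
        FailOn::None => return false,
        FailOn::Tune => Verdict::Tune,
        FailOn::Retire => Verdict::Retire,
    };
    verdicts.iter().any(|v| *v >= floor)
}
```

Returning the note alongside the verdict keeps the downgrade visible in the rendered table rather than hiding it inside the classification.
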

Three commits: a behavior-neutral refactor lifting the backtest/coverage report structs into a shared module (so the producers and the scorecard consumer share one definition), the command itself plus the configurable inputs, and the docs.

Test plan

  • `cargo fmt --all -- --check`
  • `cargo clippy --workspace --all-targets --all-features -- -D warnings`
  • `cargo test -p rsigma` (unit + integration, including the JSON and markdown goldens and the config-layering paths)
  • `cargo +nightly fuzz build fuzz_scorecard_promtext` and a short run of the exposition parser
  • `mkdocs build --strict`
  • Existing `rule backtest` / `rule coverage` goldens still pass (the refactor is behavior-neutral)

mostafa added 5 commits June 22, 2026 12:28
…dule

The `rule backtest` and `rule coverage` commands own the JSON report
documents the rest of the toolkit consumes, but their report structs lived
inside each command module. Lift them into a shared `commands::reports`
module that the producers build and serialize, so a future consumer can
deserialize the very same types and the two cannot drift.

The lifted structs are pure wire shapes. The runtime-only knobs (the
backtest unexpected policy, the coverage fail-on-gaps flag) are no longer
struct fields; they are threaded through the rendering and exit-code methods
instead, so the shared types are exactly the JSON shape. Behavior-neutral and
pinned by the existing backtest/coverage golden tests.
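
A rough sketch of the "pure wire shape" idea, assuming a hypothetical `CoverageReport` with one illustrative field: the runtime knob is passed to the method as a parameter instead of being stored on the struct. The field name and the exit-code values are assumptions for the example, not the real schema or the house exit codes.

```rust
/// Hypothetical wire shape: only fields that appear in the JSON document.
#[derive(Debug, Clone, PartialEq)]
pub struct CoverageReport {
    /// Techniques with no covering rule (illustrative field, not the real schema).
    pub uncovered_techniques: Vec<String>,
}

impl CoverageReport {
    /// The fail-on-gaps knob is a parameter, not a field, so it cannot leak
    /// into the serialized shape. Exit values here are illustrative only.
    pub fn exit_code(&self, fail_on_gaps: bool) -> i32 {
        if fail_on_gaps && !self.uncovered_techniques.is_empty() {
            1
        } else {
            0
        }
    }
}
```

With this layout a consumer that deserializes the report never sees producer-only flags, which is what lets both sides share one type.
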

`rsigma rule scorecard` fuses the rule-side outputs the toolkit already emits
into the per-rule keep/tune/retire verdict table a detection program reviews
on a cadence. It reads JSON the toolkit already produces, so it adds no new
collection or evaluation: an offline fusion-and-verdict layer.

- Joins the required backtest report (precision proxy, recall, corpus
  false-positive signal) and coverage report (ATT&CK mapping, per-technique
  rule count) into a per-rule record keyed by rule_id, optionally enriched by a
  Prometheus production-volume snapshot or endpoint (joined by rule_title, with
  colliding titles summed and flagged), a Prometheus query-API range window for
  last-fired, and a triage disposition feed for the live false-positive ratio
  and MTTD/MTTR. Each cell records its source; a missing optional input
  degrades the verdict rather than blocking it.
- Verdict bands default to the SOC quality-metrics thresholds and are
  configurable. A retire candidate that is the sole coverage for an ATT&CK
  technique is downgraded to tune with a coverage-risk note.
- Renders through the global output-format layer plus a markdown/HTML report,
  with --fail-on for CI and the house exit codes (0/1/2/3).
- The Prometheus exposition-snapshot parser is hand-rolled and std-only (the
  single untrusted-input surface, fuzzed by fuzz_scorecard_promtext); the
  query-API path reuses the existing ureq client. No new dependencies.
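
To make the std-only exposition parsing and the `rule_title` join concrete, here is a simplified sketch. It is not the parser from this PR: the metric name used in the usage note is hypothetical, escaped quotes inside label values are not handled, and only lines of the form `metric{labels} value [timestamp]` are read.

```rust
use std::collections::HashMap;

/// Summed production volume for one rule title.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Volume {
    pub total: f64,
    /// Number of series that carried this title.
    pub series: usize,
}

impl Volume {
    /// True when more than one series shared the title, so the sum should be flagged.
    pub fn collided(&self) -> bool {
        self.series > 1
    }
}

/// Find `key` in a label body such as `a="x",rule_title="y, z"`.
/// Simplification: escaped quotes inside values are not supported.
fn label_value<'a>(labels: &'a str, key: &str) -> Option<&'a str> {
    let mut rest = labels;
    while !rest.is_empty() {
        let (name, after) = rest.split_once("=\"")?;
        let end = after.find('"')?;
        if name.trim() == key {
            return Some(&after[..end]);
        }
        // Skip the closing quote and the separating comma, if any.
        rest = after[end + 1..].trim_start_matches(',');
    }
    None
}

/// Sum samples of `metric` per rule_title, skipping anything malformed.
pub fn parse_volumes(text: &str, metric: &str) -> HashMap<String, Volume> {
    let mut out: HashMap<String, Volume> = HashMap::new();
    for line in text.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue; // comments, HELP and TYPE lines
        }
        let Some(rest) = line.strip_prefix(metric) else { continue };
        let Some(rest) = rest.strip_prefix('{') else { continue };
        let Some(close) = rest.rfind('}') else { continue };
        let Some(value) = rest[close + 1..]
            .split_whitespace()
            .next()
            .and_then(|v| v.parse::<f64>().ok())
            .filter(|v| v.is_finite())
        else {
            continue;
        };
        let Some(title) = label_value(&rest[..close], "rule_title") else { continue };
        let entry = out.entry(title.to_string()).or_default();
        entry.total += value;
        entry.series += 1;
    }
    out
}
```

Calling `parse_volumes(snapshot, "rsigma_rule_matches_total")` (the metric name is a placeholder) yields one `Volume` per title. Malformed lines are dropped rather than treated as errors, which is one plausible way to handle untrusted input.
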

A scorecard config section follows the layered-config conventions: the verdict
thresholds carry single-source defaults pinned to the clap flags, and every
input (including the two required reports) plus the report path can be supplied
from the config file. rule coverage likewise now accepts its rule paths from
coverage.rules.
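
The layering of a threshold can be pictured with a small sketch. The precedence shown (CLI flag, then config file, then the single-source default) is an assumption about the layered-config conventions, and the constant's name and value are hypothetical.

```rust
/// Single source for the default; a drift-guard test would compare the
/// CLI flag's default against this constant. Name and value are hypothetical.
pub const DEFAULT_KEEP_MIN_PRECISION: f64 = 0.8;

/// Resolve one setting, assuming CLI flag > config file > built-in default.
pub fn resolve<T>(cli: Option<T>, file: Option<T>, default: T) -> T {
    cli.or(file).unwrap_or(default)
}
```

The same helper would cover path-like inputs such as the two required reports, since it is generic over the value type.
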

Add the rule scorecard CLI reference and a Detection Scorecard guide page
covering the keep/tune/retire verdict model and the review cadence, wire both
into the nav, document the scorecard config section and the new coverage.rules
key, cross-link the backtest, coverage, and CI/CD pages, refresh both READMEs,
and add the CHANGELOG entry.

The reports refactor renamed `Report` to `BacktestReport` but left the module
doc comment linking to `Report::build`, which `cargo doc -D warnings -D
rustdoc::broken-intra-doc-links` rejects. Point it at `BacktestReport::build`.
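
For illustration, an intra-doc link that resolves after the rename might look like the sketch below. The field and the `build` signature are hypothetical, and the link sits on the struct rather than in a module-level comment.

```rust
/// Shared wire shape for the backtest report.
///
/// Built by [`BacktestReport::build`]. A link written against the old
/// `Report` name would no longer resolve once the type is renamed.
pub struct BacktestReport {
    /// Illustrative field, not the real schema.
    pub rules_evaluated: usize,
}

impl BacktestReport {
    /// Hypothetical constructor signature.
    pub fn build(rules_evaluated: usize) -> Self {
        Self { rules_evaluated }
    }
}
```
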
@mostafa mostafa merged commit 1219d0a into main Jun 22, 2026
16 checks passed
@mostafa mostafa deleted the feat/rule-scorecard branch June 22, 2026 10:53
@mostafa mostafa mentioned this pull request Jun 23, 2026
6 tasks