feat(proto): portable Sigma IR protobuf interchange + conformance vectors #204

Draft
mostafa wants to merge 1 commit into main from feat/portable-ir-proto

mostafa commented Jun 13, 2026

Member

Why

Sigma backends are reimplemented per engine today: a Splunk/KQL/Elastic/etc. backend written for one engine has to be re-ported into RSigma and any future engine. This defines a shared, strongly typed message so a backend can be written once and reused across engines, including as a remote gRPC service. The message sits at the post-pipeline, modifier-resolved, selector-resolved layer that pySigma (after modifier application and condition postprocess) and RSigma (its HIR) independently arrive at, which is what makes a single schema faithful rather than forced. External/dynamic sources are converging across engines too (see SigmaHQ/pySigma#470), so they belong in the shared schema rather than as an engine-specific extension.

Summary

  • Adds proto/sigma_ir.proto, a language-neutral protobuf schema for a post-pipeline, modifier-resolved, selector-resolved Sigma rule: the wire message a remote backend responds to, and a single canonical form an engine's HIR can be generated from or conformance-locked against.
  • Adds proto/sigma_backend.proto, a SigmaBackend gRPC service (Capabilities + Convert, with explicit Unsupported results) as the separable remote-backend transport over the schema.
  • Adds 19 golden (rule YAML -> canonical IR) conformance vectors under proto/conformance/vectors/, plus READMEs documenting the schema, conventions, and vector format.
  • Schema and docs only; no crate code changes. Built by reconciling the RSigma parser AST/HIR with pySigma's resolved types, conditions, modifiers, correlations, and filters, so both engines lower to and consume the same form.
  • Records the intent to extract proto/ into a neutral standalone schema repo so a second implementation can vendor it.
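
To illustrate the "Capabilities + Convert, with explicit Unsupported results" shape described above, here is a minimal Python sketch. Every name in it (`ToyBackend`, `FieldMatch`, `Converted`, `Unsupported`, the matcher labels, and the toy query syntax) is a hypothetical stand-in and not the actual messages or RPCs in sigma_backend.proto. The point is only that an unsupported construct comes back as a typed result rather than as an error or a silently wrong query.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


# All types below are hypothetical illustrations, not the real schema.
@dataclass(frozen=True)
class Capabilities:
    backend: str
    matchers: FrozenSet[str]  # matcher kinds this backend can translate


@dataclass(frozen=True)
class FieldMatch:
    field: str
    matcher: str  # e.g. "exact", "contains", "regex" (assumed labels)
    value: str


@dataclass(frozen=True)
class Converted:
    query: str


@dataclass(frozen=True)
class Unsupported:
    feature: str
    reason: str


ConvertResult = Union[Converted, Unsupported]


class ToyBackend:
    """A toy backend that emits a made-up query syntax."""

    def capabilities(self) -> Capabilities:
        return Capabilities(backend="toy", matchers=frozenset({"exact", "contains"}))

    def convert(self, match: FieldMatch) -> ConvertResult:
        # Report unsupported matchers explicitly instead of raising or guessing.
        if match.matcher not in self.capabilities().matchers:
            return Unsupported(
                feature=match.matcher,
                reason="matcher not implemented by this backend",
            )
        if match.matcher == "exact":
            return Converted(f'{match.field}="{match.value}"')
        return Converted(f'{match.field}="*{match.value}*"')
```

A caller would typically read `capabilities()` once, then branch on the result type of each `convert()` call, so that an `Unsupported` result can be surfaced to the rule author with the offending feature named.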

Test plan

  • protoc (libprotoc 34.1) compiles sigma_ir.proto and sigma_backend.proto cleanly.
  • All 19 conformance vectors validate strictly against the schema (generated Python bindings + json_format.ParseDict, which rejects unknown fields).
  • base64offset / windash expansions in the vectors were computed with pySigma's exact algorithm, not hand-written.
  • Reviewer: confirm the schema home (extract to a neutral repo vs keep in-tree).
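
For readers unfamiliar with why base64offset expansions are computed rather than hand-written, the following is a rough Python sketch of the usual technique: encode the value at each of the three possible byte alignments and keep only the characters that do not depend on neighbouring bytes. The function name and offset tables are assumptions written for illustration; they are meant to mirror the commonly described algorithm and are not copied from pySigma or from the vectors in this PR.

```python
import base64

# Assumed trim tables: characters to skip at the start for each shift, and
# where to cut at the end depending on (len(value) + shift) % 3.
START_OFFSETS = (0, 2, 3)
END_OFFSETS = (None, -3, -2)


def base64_offset_variants(value: bytes) -> list:
    """Return the three alignment-independent base64 fragments of value."""
    variants = []
    for shift in range(3):
        # Pad with `shift` filler bytes to simulate the value starting at
        # byte offset `shift` inside a longer encoded stream.
        encoded = base64.b64encode(b" " * shift + value).decode("ascii")
        end = END_OFFSETS[(len(value) + shift) % 3]
        # Drop leading characters that mix in filler bits and trailing
        # characters that would mix in bits of whatever follows the value.
        variants.append(encoded[START_OFFSETS[shift]:end])
    return variants
```

Each returned fragment should appear verbatim in the base64 encoding of any byte string that contains the value at the matching alignment, which is the property a conformance vector for this modifier relies on.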

… vectors

Define a language-neutral protobuf schema for a post-pipeline,
modifier-resolved, selector-resolved Sigma rule: the wire message a remote
backend responds to and the single source of truth an engine's HIR is
generated from or conformance-locked against. Built by reconciling the RSigma
parser AST / HIR with pySigma's resolved types, conditions, modifiers,
correlations, and filters.

- sigma_ir.proto: values (placeholder-aware SigmaString), all matcher variants,
  detections (incl. ArrayMatch/Conditional extensions), selector-resolved
  conditions, rule/correlation/filter, IrRuleMetadata superset, Pack envelope.
- sigma_backend.proto: SigmaBackend gRPC service (Capabilities + Convert with
  explicit Unsupported), the separable remote-backend transport.
- conformance/vectors: 19 hand-authored golden (rule YAML -> canonical IR)
  vectors covering every matcher kind, value linking, base64offset/windash
  expansions (computed with pySigma's exact algorithm), keywords, selector
  resolution, and/not, an event_count correlation, a filter, and the metadata
  superset. All validated strictly against the schema.

Intended to move to a neutral sigma-ir-schema repo.