# ELSA - Enduring Log Search Archive

Cold-layer archival search for SOC teams. Built on the philosophy that S3 is the only source of truth, with fully stateless compute nodes and zero mandatory external databases.


## Problem Statement

Every SOC team faces the same tension: logs must be kept for years, but storage systems designed for search were never designed for retention, and storage systems designed for retention were never designed for search. The typical architecture that emerges:

- **Wazuh or Splunk** for real-time alerting — fast, expensive, short retention
- **Elasticsearch** for hot search — fast, very expensive, short retention
- **Cold storage (S3, tape, NFS)** for compliance — cheap, but unsearchable

When a SOC analyst needs to investigate an incident from 6 months ago — a compromised IP address, a lateral movement trail, a suspicious username — they face a wall: the hot systems have already purged the data, and the cold archive has no index. The answer to "what did 185.220.101.42 do in February?" requires hours of manual log retrieval, decompression, and grep.

ELSA solves this. It is the missing layer: cheap S3 storage with a searchable index, compliance-grade immutability, and a query model built for SOC workflows.

## What ELSA Is (and Is Not)

ELSA **is**:

- An archival log storage system with entity-centric search (IP, username, hostname, session ID)
- A compliance-grade archive (S3 Object Lock / WORM, GDPR tombstone deletion, audit trail)
- A natural extension of logRotate semantics — data flows in when hot systems are done with it
- A complement to Wazuh/Elastic, not a competitor

ELSA **is not**:

- A real-time alerting or correlation engine (that's Wazuh's job)
- A full-text search engine (that's Elasticsearch's job)
- A replacement for your hot storage layer
- A streaming analytics platform

## The Ecosystem

```
T+0 → T+72h              T+72h → T+30d          T+30d → ∞
┌────────────────────┐   ┌────────────────────┐  ┌────────────────────┐
│  WAZUH + Elastic   │   │  Wazuh hot storage │  │  ELSA              │
│                    │   │  Fast DBs          │  │                    │
│  Real-time alerts  │   │  Fast search       │  │  Archive + Search  │
│  Correlation rules │   │  Last 30 days      │  │  Compliance/WORM   │
│  Active response   │   │                    │  │  Entity lookup     │
└────────────────────┘   └────────────────────┘  └────────────────────┘
                                                          ↑
                                              logRotate feeds data here
```

The boundaries between layers are configurable — logRotate-style rotation can happen every 6 hours for high-security environments, or daily for standard SOC operations.

## Core Architecture

### Philosophical Foundation

ELSA inherits the core principle from DES: **S3 is the only source of truth.** Every other component is either a cache or a compute node, destroyable and reconstructible at any time. The implications:

- Redis (the metadata cache) can be wiped and fully rebuilt from S3 manifests in minutes. No data lives exclusively in Redis.
- No PostgreSQL is required. This is the primary architectural distinction from Quickwit — the closest open-source equivalent — which requires PostgreSQL as a metastore. A PostgreSQL outage in Quickwit means query unavailability. In ELSA, any component failure results in a cold rebuild, not data loss.

### The D+1 Model

The other foundational decision is the D+1 compaction model: data written during the current day lives in a mutable staging zone. Each night (or at a configured interval), a single compaction job processes staging data into immutable, indexed archive splits.

```
CONTINUOUS WRITE (no locking, no conflicts)
  Ingestors → micro-splits → s3://staging/{stream}/{today}/

NIGHTLY COMPACTION (single writer, zero race conditions)
  Nightly Job → merge → index → promote to archive → rebuild Redis

QUERY (stateless, any node can serve any query)
  Redis metastore → identify candidate splits → S3 Range-GET → results
```

This model eliminates the hardest problem in distributed index systems: concurrent write conflicts. Because only one process writes to the archive manifest at any given time, there is no need for optimistic concurrency control, compare-and-swap loops, or distributed locks. The complexity simply does not exist. The tradeoff is explicit and intentional: data ingested today has a slower, index-less query path (staging scan). This is acceptable because SOC analysts investigating incidents are almost always working with historical data — the real-time view is Wazuh's responsibility.
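Because only the nightly job writes, the merge step at its core is plain sequential code. A minimal Python sketch of the k-way merge (illustrative only: ELSA targets Java 21, and the real inputs are `.ldes` micro-splits on S3, not in-memory lists):

```python
import heapq

def compact(micro_splits):
    """K-way merge of already time-sorted micro-splits into one
    globally time-ordered record stream (the nightly job's core step).

    micro_splits: list of lists of (timestamp, record) tuples, each
    sorted by timestamp -- stand-ins for staged micro-split files.
    """
    # heapq.merge streams the k sorted inputs lazily, so the nightly
    # job never needs all staging data in memory at once.
    return list(heapq.merge(*micro_splits, key=lambda rec: rec[0]))

staged = [
    [(1, "login alice"), (5, "logout alice")],
    [(2, "login bob"), (4, "dns query")],
    [(3, "ssh session start")],
]
archive = compact(staged)
assert [ts for ts, _ in archive] == [1, 2, 3, 4, 5]
```

Since exactly one process runs this merge per interval, the output split and its manifest entry can be written without any coordination.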

## Why Not Quickwit?

Quickwit is an excellent system — S3-native, built on Tantivy (Lucene-class search), acquired by Datadog in early 2025. It solves the same class of problem. We studied it carefully before designing ELSA.

| Aspect | Quickwit | ELSA |
|---|---|---|
| Metastore | PostgreSQL (required) | Redis + S3 (Redis = cache only) |
| Disaster recovery | Restore DB backup + S3 | Rebuild Redis from S3 (~5 min) |
| Horizontal scaling | Add node + configure DB | Add node, connect to Redis — done |
| WORM compliance | Not supported | Native (S3 Object Lock per stream) |
| GDPR tombstone deletion | Not supported | Built-in (like DES) |
| logRotate integration | Not designed for this | Primary ingestion model |
| Entity-centric index | General inverted index | Dedicated SOC entity index |
| Race condition handling | Optimistic concurrency (CAS) | Eliminated by design (D+1) |
| Full-text search | Tantivy (excellent) | Not a goal |
| Maturity | Production-ready | Greenfield |

**Use Quickwit if:** you need full-text search, real-time indexing (seconds latency), and have no compliance requirements.

**Build ELSA if:** you need WORM-compliant archival, GDPR-safe deletion, SOC entity-centric workflows, and you want S3 as the genuine — not nominal — source of truth.

## Index Architecture: What We Learned from Databases

Classical database index structures do not map directly to object storage. We analyzed each structure against S3's cost model (pay per request, high latency per round-trip, cheap sequential reads):

| Structure | S3 Suitability | Reason |
|---|---|---|
| B+Tree | ✗ Poor | Each node traversal = one S3 GET; a depth-10 tree = 10 sequential GETs = ~200 ms minimum |
| LSM-Tree / SSTable | ✓ Excellent | Immutable sorted files with embedded Bloom filters — maps naturally to S3 objects |
| Inverted index | ✓ Good | Posting lists = one large sequential file per term |
| Bitmap index | ✓ Good | Compact binary per low-cardinality field, fast bitwise operations after fetch |
| Bloom filter | ✓ Excellent | Small, cacheable locally, eliminates unnecessary S3 GETs with ~1% false positive rate |

The fundamental rule for S3-native indexes: **one large sequential read is always better than many small random reads.** Every architectural decision follows from this.

### Three-Layer Index

```
QUERY: src_ip = 1.2.3.4, time: last 3 weeks

LAYER 0: Redis entity index (0 S3 GETs)
  → "which weekly segments contain this IP?" → {W10, W12} (W11 excluded)

LAYER 1: Bloom filter per split (1 Range-GET per split, hotcache section)
  → "might this split contain this IP?" → probabilistic, ~1% FPR
  → 90%+ of splits eliminated before any data is fetched

LAYER 2: Inverted index posting list (1 Range-GET per qualifying split)
  → exact list of doc_ids for this IP
  → fetch only those records from columnar storage
```

Total S3 GETs for a typical SOC query spanning 3 weeks: 3–15, regardless of total archive size.
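Layer 1 can be made concrete with the standard Bloom-filter sizing formulas (m = -n·ln p / (ln 2)², k = (m/n)·ln 2). A self-contained Python sketch — ELSA's actual implementation uses Guava's `BloomFilter` in Java, and the SHA-256-based hashing here is purely for illustration:

```python
import hashlib
import math

class Bloom:
    """Textbook Bloom filter sized for a target false-positive rate."""

    def __init__(self, n_items, fpr=0.01):
        # m bits and k hash functions from the standard sizing formulas.
        self.m = max(1, int(-n_items * math.log(fpr) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, value):
        # Derive k bit positions from salted SHA-256 digests.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, value):
        for p in self._positions(value):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, value):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(value))

# One filter per split, built over its src_ip column at compaction time:
split_ips = [f"10.0.0.{i}" for i in range(100)]
bf = Bloom(len(split_ips), fpr=0.01)
for ip in split_ips:
    bf.add(ip)

assert bf.might_contain("10.0.0.42")  # no false negatives, ever
# A negative answer means the split is skipped with zero data Range-GETs.
```

A filter for 100 entities at 1% FPR is under 150 bytes, which is why whole per-split filters fit in the hotcache section and can be cached locally on query nodes.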

## The Split Format (.ldes)

Each archive unit is a self-contained binary file stored on S3:

```
Split file (.ldes)
┌───────────────────────────────────────┐
│  MAGIC + VERSION (8B)                 │
│  HEADER: stream, time_range, schema   │
├───────────────────────────────────────┤
│  HOTCACHE SECTION (~50–200KB)         │  ← single Range-GET opens this
│  Bloom filters per entity field       │
│  Sparse index (every 256th record)    │
│  Column min/max statistics            │
├───────────────────────────────────────┤
│  COLUMNAR DATA SECTION                │  ← zstd compressed, per column
│  timestamp (int64, delta-encoded)     │
│  level (bitmap, RLE — 99% INFO)       │
│  entity fields (dictionary + delta)   │
│  message (zstd, largest column)       │
├───────────────────────────────────────┤
│  INVERTED INDEX SECTION               │
│  entity_value → posting list          │
│  (VByte delta-encoded doc_ids)        │
├───────────────────────────────────────┤
│  FOOTER (last 32B)                    │
│  Section offsets + CRC32              │
└───────────────────────────────────────┘
```
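A reader opens a split back-to-front: one ranged read fetches the footer and yields section offsets, then one ranged read fetches the hotcache. A hedged Python sketch with an in-memory byte string standing in for the S3 object; the exact footer field layout used here (four uint32 offsets, CRC32, padding) is an illustrative assumption, not the finalized .ldes spec:

```python
import struct
import zlib

# Hypothetical footer layout: 4 section offsets + CRC32, padded to 32 bytes.
FOOTER = struct.Struct("<IIIII12x")

def write_footer(offsets, body):
    """Pack section offsets and a CRC32 of everything before the footer."""
    return FOOTER.pack(*offsets, zlib.crc32(body))

def read_footer(blob):
    """Equivalent of a Range-GET for the last 32 bytes of the object."""
    hot, col, inv, end, crc = FOOTER.unpack(blob[-32:])
    assert zlib.crc32(blob[:-32]) == crc, "corrupt split"
    return {"hotcache": (hot, col), "columns": (col, inv),
            "inverted": (inv, end)}

body = b"HOTCACHE" + b"COLUMNS!" + b"INVIDX!!"   # toy 8-byte sections
split = body + write_footer((0, 8, 16, 24), body)

sections = read_footer(split)
start, end = sections["hotcache"]
assert split[start:end] == b"HOTCACHE"   # second Range-GET target
```

Two ranged reads suffice to decide, via the hotcache's Bloom filters and statistics, whether the split is worth touching at all.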

Why columnar? The level field in production logs is INFO in over 99% of cases. Columnar storage + RLE compression turns thousands of INFO values into a handful of bytes. A query WHERE level='ERROR' reads only the level column — it never touches the message column (the largest one). This is projection pushdown, and on S3 it translates directly to money saved on data transfer.
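The posting-list codec named in the inverted index section (VByte over delta-encoded doc_ids) stays compact because sorted doc_ids have small gaps. A Python sketch of the textbook technique (ELSA's exact on-disk byte layout may differ):

```python
def vbyte_encode(doc_ids):
    """Delta-encode a sorted doc_id list, then VByte-encode each gap.

    Each gap is split into 7-bit groups, emitted high-order first;
    the high bit is set only on the final (low-order) byte.
    """
    out, prev = bytearray(), 0
    for doc_id in doc_ids:
        gap, chunks = doc_id - prev, []
        prev = doc_id
        while True:
            chunks.append(gap & 0x7F)
            gap >>= 7
            if gap == 0:
                break
        for i, chunk in enumerate(reversed(chunks)):
            out.append(chunk | (0x80 if i == len(chunks) - 1 else 0))
    return bytes(out)

def vbyte_decode(data):
    doc_ids, prev, n = [], 0, 0
    for byte in data:
        n = (n << 7) | (byte & 0x7F)
        if byte & 0x80:          # final byte of this gap
            prev += n
            doc_ids.append(prev)
            n = 0
    return doc_ids

ids = [3, 7, 1000007]
blob = vbyte_encode(ids)
assert vbyte_decode(blob) == ids
assert len(blob) == 5  # gaps 3 and 4 take 1 byte each; gap 1,000,000 takes 3
```

Dense posting lists (an IP appearing in many consecutive records) compress to roughly one byte per occurrence, which keeps Layer 2 reads small.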

## S3 Layout — Self-Describing Storage

The entire system state can be reconstructed from S3 alone:

```
s3://logs-bucket/
├── _catalog/
│   ├── streams.json                         ← stream registry
│   └── {stream}/
│       ├── current                          ← pointer to active snapshot
│       ├── snap_{N}.json                    ← snapshot manifest list
│       └── manifests/
│           └── man_{YYYY-WNN}.json          ← weekly segment metadata
├── splits/
│   └── {stream}/{YYYY}/{WNN}/
│       └── {split_id}.ldes                  ← archive splits (WORM-locked)
├── indexes/
│   └── {stream}/{YYYY-WNN}/
│       └── ip_index.idx                     ← weekly cross-split index
├── staging/
│   └── {stream}/{YYYY}/{MM}/{DD}/
│       └── {micro_split_id}.ldes            ← today's data (mutable)
├── tombstones/
│   └── {stream}/{request_id}.json           ← GDPR deletion requests
└── audit-trail/
    └── {stream}/{YYYY}/{MM}/
        └── audit_{DD}.jsonl                 ← immutable operation log
```

## Compliance Design

### WORM and the logRotate Boundary

The staging/archive boundary is also the compliance boundary:

- **Staging: mutable** — ingestors can correct errors before compaction
- **Archive: immutable** — S3 Object Lock applied immediately after promotion

This maps naturally to how compliance requirements actually work: the "official record" begins when data is finalized, not when it first arrives.

### GDPR vs WORM Tension

WORM says data cannot be modified. GDPR says personal data must be deleted on request. These requirements are fundamentally in conflict. ELSA resolves this through tombstone-based deletion, the same mechanism as DES:

1. GDPR request received → tombstone file written to s3://tombstones/
2. Query engine immediately begins filtering tombstoned doc_ids from results
3. Next compaction cycle: affected splits are repacked without tombstoned records
4. Physical deletion complete; new splits receive the WORM lock; old splits are deleted

The COMPLIANCE mode (strictest S3 Object Lock) makes physical deletion impossible — a deliberate design choice for environments where GDPR and strict WORM are in direct conflict. Stream operators configure which mode applies to each stream.

### Audit Trail

Every system operation is logged to a separate, WORM-protected audit stream:

- INGEST — data received, split written
- COMPACT — micro-splits merged to archive
- WORM_APPLY — Object Lock applied, retention date set
- QUERY — search performed, by whom, result count
- GDPR_REQUEST — deletion requested, doc_ids affected
- REPACK — physical removal of tombstoned records
- LEGAL_HOLD_SET — hold applied for investigation
- EXPORT — data exported with audit proof

Audit trail entries include a chain hash (SHA-256 of the previous entry + the current entry) — a cryptographic proof that the audit log itself has not been tampered with.
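The chain hash is a standard hash-chain construction. A Python sketch (entry fields such as `op` are illustrative; the real audit schema is not specified here) showing how tampering with any historical entry breaks verification:

```python
import hashlib
import json

GENESIS = "0" * 64  # chain value before the first entry

def append_entry(log, entry):
    """Append an audit entry whose chain hash binds it to its predecessor:
    chain = SHA-256(previous chain hash + canonical JSON of this entry)."""
    prev = log[-1]["chain"] if log else GENESIS
    payload = json.dumps(entry, sort_keys=True)
    chain = hashlib.sha256((prev + payload).encode()).hexdigest()
    log.append({"entry": entry, "chain": chain})
    return log

def verify(log):
    """Recompute every chain hash from the genesis value forward."""
    prev = GENESIS
    for rec in log:
        payload = json.dumps(rec["entry"], sort_keys=True)
        if hashlib.sha256((prev + payload).encode()).hexdigest() != rec["chain"]:
            return False
        prev = rec["chain"]
    return True

log = []
append_entry(log, {"op": "INGEST", "split": "s1"})
append_entry(log, {"op": "WORM_APPLY", "split": "s1"})
assert verify(log)

log[0]["entry"]["op"] = "QUERY"   # tamper with history...
assert not verify(log)            # ...and verification fails
```

Because each hash depends on its predecessor, an attacker would have to rewrite every subsequent entry to hide a change — impossible once the tail hash is anchored in WORM-locked storage.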

## Epics

| Epic | Title | Description |
|---|---|---|
| EPIC-01 | Core Architecture & Split Format | Binary .ldes format, S3 layout, self-describing storage |
| EPIC-02 | Ingestor & Schema Normalization | Multi-source ingestion, SOC entity field extraction, micro-split writer |
| EPIC-03 | Nightly Compaction & Index Build | K-way merge, Bloom + inverted index build, WORM promotion |
| EPIC-04 | Redis Metastore | S3-backed cache, atomic rebuild, entity hot index |
| EPIC-05 | Query Engine | Entity-centric + time-range paths, predicate pushdown, parallel S3 fetch |
| EPIC-06 | Compliance Layer | S3 Object Lock, GDPR tombstone, audit trail, chain hash, legal hold |
| EPIC-07 | logRotate Integration | CLI import hook, Wazuh/syslog/CEF parsers, directory watcher |
| EPIC-08 | SOC API & Query Interface | REST API, entity timeline, aggregations, export with audit proof |

Estimated scope: ~340 story points, 2 developers, ~6 months to full production.

MVP (v0.2.0): EPIC-01 through EPIC-04 plus EPIC-07 — import from logRotate, nightly compaction, Redis-backed query, basic entity lookup.

## Technology Stack

| Component | Technology | Rationale |
|---|---|---|
| Runtime | Java 21 + Quarkus | Aligned with DES 2.0; enterprise ecosystem; GraalVM native for CLI |
| Object storage | S3-compatible (AWS, MinIO, Ceph/RGW) | Same as DES; multi-provider compatibility required |
| Metadata cache | Redis 7.x | Atomic Lua scripts; fast sorted sets for time-range; RDB persistence |
| Compression | zstd level 3 | Balance between speed and ratio; columnar data benefits significantly |
| Bloom filter | Guava BloomFilter | Serializable; configurable FPR; well-tested |
| Posting list codec | VByte delta encoding | 1–2 bytes per doc_id typical; standard in search engine literature |
| Secrets management | OpenBao (Vault fork) | Aligned with DES security model |
| Container orchestration | Kubernetes | CronJob for nightly compaction; HPA for ingestors and query nodes |
| Observability | Prometheus + Grafana | Standard; aligned with DES |

## Key Design Decisions — Summary

- **S3 is the only source of truth.** Redis can be wiped and rebuilt at any time. No PostgreSQL.
- **The D+1 model eliminates race conditions.** One writer per manifest, one job per night. Concurrency complexity does not exist.
- **The entity-centric index is first class.** SOC analysts pivot by IP, username, session ID — not by time window. The index is designed for this, not adapted for it.
- **Bloom filters are the gate.** Before any data is fetched, probabilistic pre-filtering eliminates the vast majority of S3 GETs. Cold queries on a year of data touch only a handful of splits.
- **Columnar storage enables projection pushdown.** Reading level='ERROR' should not require fetching message. On S3, every byte fetched costs money.
- **Compliance is a first-class concern.** WORM, GDPR tombstone, audit trail, and chain hash are not afterthoughts — they are designed into the data flow from the beginning.
- **logRotate is the natural ingestion model.** Data comes in when hot systems are done with it. The boundary between hot and cold is a configuration, not an architectural constraint.
- **The staging/archive boundary is the compliance boundary.** Staging is mutable (correct errors, re-ingest). Archive is immutable (official record). This is intentional and explicit.

## Relationship to DES

ELSA is a sibling project to DES (Data Easy Store), developed by the same team at Datavision.pl. DES solves the small-file consolidation problem for general object storage; ELSA solves the archival search problem for log data. They share:

- The same philosophical foundation (S3 as source of truth, stateless compute)
- The same compliance model (WORM + tombstone-based GDPR deletion)
- The same security model (OpenBao/Vault, HMAC tokens)
- The same target deployment environment (Kubernetes, S3-compatible storage)

They can be deployed independently or together. In a combined deployment, ELSA uses S3-compatible storage (potentially backed by Ceph) that DES may also manage.

## License

Apache License 2.0

Datavision.pl — data science consultancy and infrastructure tooling.
Project status: design phase. Implementation begins Q2 2026.
