Append-only storage for logs, traces, and metrics — all three in one process. One binary, one directory, HTTP + gRPC API. Think "SQLite for observability".
- One store for logs, traces, and metrics — counters, gauges, and exponential histograms share the same WAL, flush gate, retention, and compaction
- Append-only segments with zstd compression and per-block min/max stats in segment footer
- Write-Ahead Log for crash recovery (fail-stop writers, CRC over length+seq+payload)
- Bitmap indexes (sorted uint64 sets, not roaring: ULID-derived keys shattered roaring into one container per few IDs) for fast field filtering (service, level, host)
- Full-text search index for log body
- Ribbon filters for high-cardinality fields (trace_id)
- Sparse index for time-based segment pruning (skips 95%+ data without I/O)
- OTLP compatible — gRPC (:4317) and HTTP endpoints for logs, traces, and metrics
- Log-trace correlation — trace viewer with span tree and linked logs
- Retention policies — max age, max bytes, max segments
- Embedded mode — use as a Go library without HTTP server
- amberctl — command-line client and interactive terminal UI (logs, traces, span waterfall, live tail)
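
The time-based pruning mentioned in the list above can be pictured with a small sketch. It assumes each sealed segment records the minimum and maximum timestamp it contains; the `segmentMeta` type and `pruneByTime` function are hypothetical names for illustration, not amber's actual index layout.

```go
package main

import "fmt"

// segmentMeta is a hypothetical sparse-index entry: one per sealed segment,
// holding only the time bounds needed to decide whether to open it.
type segmentMeta struct {
	ID           int
	MinTS, MaxTS int64 // inclusive bounds, Unix nanoseconds
}

// pruneByTime returns the IDs of segments whose [MinTS, MaxTS] range
// overlaps the query window [from, to]. All other segments are skipped
// using index metadata alone, without reading the segment files.
func pruneByTime(index []segmentMeta, from, to int64) []int {
	var keep []int
	for _, s := range index {
		if s.MaxTS < from || s.MinTS > to {
			continue // no overlap: skip without I/O
		}
		keep = append(keep, s.ID)
	}
	return keep
}

func main() {
	const hour int64 = 3_600_000_000_000 // one hour in nanoseconds

	// One segment per hour for a day (assumed layout for the demo).
	var index []segmentMeta
	for i := 0; i < 24; i++ {
		start := int64(i) * hour
		index = append(index, segmentMeta{ID: i, MinTS: start, MaxTS: start + hour - 1})
	}

	// A one-hour query window touches 1 of 24 segments.
	hits := pruneByTime(index, 10*hour, 11*hour-1)
	fmt.Printf("opened %d of %d segments: %v\n", len(hits), len(index), hits)
}
```

With narrow query windows over long retention, most segments fail the overlap test, which is where a high skip ratio would come from in a design like this.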
git clone https://github.com/yaop-labs/amber.git
cd amber
make build
cp config.example.yaml config.yaml # edit as needed
./amber config.yaml

docker build -t amber .
docker run -p 8080:8080 -p 4317:4317 \
-v amber-data:/data \
-v ./config.yaml:/data/config.yaml \
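amber

import "github.com/yaop-labs/amber"

// ctx is a context.Context supplied by the caller.
db, err := amber.Open("./data", nil) // nil = default Options
if err != nil {
	log.Fatal(err) // needs the standard "log" package
}
defer db.Close()

// Log is asynchronous: a nil error means the entry was queued, not persisted.
db.Log(ctx, amber.LogEntry{
	Level:   amber.LevelError,
	Service: "api-gateway",
	Body:    "connection refused",
})

result, err := db.QueryLogs(ctx, &amber.LogQuery{
	Services: []string{"api-gateway"},
	Limit:    100,
})

amberctl is the terminal client. It speaks amber's HTTP read API, so it works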
the same against a local dev server (default http://localhost:8080, no auth)
or a remote one (--addr / --api-key, or AMBER_ADDR / AMBER_API_KEY).
make build # builds ./amber and ./amberctl
# one-shot, scriptable
amberctl logs --service api --level ERROR --since 1h
amberctl logs -q "connection refused" --json | jq .
amberctl logs --service api -f # live tail
amberctl traces --service checkout --since 6h
amberctl trace <trace_id> # span waterfall + linked logs
amberctl services
amberctl stats
# interactive terminal UI
amberctl tui

In the TUI: ↑/↓ move, enter or click opens a trace, space expands a row,
/ searches, t cycles the time range, f toggles live tail, tab switches
logs/traces, q quits.
# JSON
curl -X POST http://localhost:8080/api/v1/logs \
-H "Authorization: Bearer <key>" \
-d '[{"level":"ERROR","service":"api","body":"connection refused"}]'
# OTLP HTTP
curl -X POST http://localhost:8080/v1/logs \
-H "Authorization: Bearer <key>" \
-H "Content-Type: application/json" \
-d @otlp_payload.json

# Logs
curl "http://localhost:8080/api/v1/logs?service=api-gateway&level=ERROR&limit=50" \
-H "Authorization: Bearer <key>"
# Trace
curl "http://localhost:8080/api/v1/traces/<trace_id>" \
-H "Authorization: Bearer <key>"
# Services list
curl "http://localhost:8080/api/v1/services" \
-H "Authorization: Bearer <key>"

GET /api/v1/logs supports:

| Parameter | Description |
|---|---|
| `service` | Comma-separated service names |
| `level` | Comma-separated levels (ERROR,WARN,...) |
| `host` | Comma-separated host names |
| `q` | Full-text search in log body |
| `from` / `to` | RFC3339 time range |
| `limit` | Page size (default 100) |
| `offset` | Result offset |
| `attr.<key>` | Exact match on structured attribute |
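
For clients that build these queries in code rather than by hand, a minimal sketch using Go's standard `net/url` package is shown here. The `logQueryURL` helper and its argument shapes are hypothetical; only the parameter names come from the table above. It assumes the server percent-decodes query values before splitting on commas, which is the usual behaviour but is not confirmed here.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

// logQueryURL builds a GET /api/v1/logs URL from the documented parameters.
// The helper name and signature are illustrative, not part of amber.
func logQueryURL(baseURL string, services, levels []string, attrs map[string]string, limit int) string {
	q := url.Values{}
	if len(services) > 0 {
		q.Set("service", strings.Join(services, ","))
	}
	if len(levels) > 0 {
		q.Set("level", strings.Join(levels, ","))
	}
	for k, v := range attrs {
		q.Set("attr."+k, v) // exact match on a structured attribute
	}
	if limit > 0 {
		q.Set("limit", strconv.Itoa(limit))
	}
	// Encode sorts keys and percent-escapes values (a comma becomes %2C).
	return baseURL + "/api/v1/logs?" + q.Encode()
}

func main() {
	u := logQueryURL("http://localhost:8080",
		[]string{"checkout"}, []string{"ERROR", "WARN"},
		map[string]string{"env": "prod"}, 50)
	fmt.Println(u)
}
```

Letting `url.Values` do the escaping avoids hand-quoting mistakes in full-text queries and attribute values that contain spaces or ampersands.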
Examples:
# Time-bounded search
curl "http://localhost:8080/api/v1/logs?service=checkout&from=2026-05-07T10:00:00Z&to=2026-05-07T11:00:00Z&q=timeout"
# Filter by structured attribute
curl "http://localhost:8080/api/v1/logs?service=checkout&attr.env=prod&attr.region=eu-west-1"
# NDJSON streaming-friendly output
curl "http://localhost:8080/api/v1/logs?service=checkout&limit=100" \
-H "Accept: application/x-ndjson"

# Trace list
curl "http://localhost:8080/api/v1/traces?service=checkout&limit=20&offset=0" \
-H "Authorization: Bearer <key>"
# Single trace with span tree + correlated logs
curl "http://localhost:8080/api/v1/traces/<trace_id>" \
-H "Authorization: Bearer <key>"

GET /api/v1/traces supports:

| Parameter | Description |
|---|---|
| `service` | Comma-separated service names |
| `from` / `to` | RFC3339 time range |
| `limit` | Number of traces to return (default 20) |
| `offset` | Trace offset |

Response fields for each trace summary:

- `trace_id`
- `service`
- `operation`
- `start_time`
- `duration_ms`
- `span_count`
- `has_errors`
Metrics are ingested over OTLP (POST /v1/metrics, HTTP or gRPC) and queried
through a small purpose-built API — counters and gauges as rates, exponential
histograms as quantiles:
# Instant rate by label (counters), evaluated at `time`
curl "http://localhost:8080/api/v1/metrics/rate?metric=http_requests_total&by=service&time=2026-06-16T10:00:00Z" \
-H "Authorization: Bearer <key>"
# Per-step range rate over a window
curl "http://localhost:8080/api/v1/metrics/rate_range?metric=http_requests_total&by=route&from=2026-06-16T09:00:00Z&to=2026-06-16T10:00:00Z&step=60s" \
-H "Authorization: Bearer <key>"
# Quantile from exponential histograms
curl "http://localhost:8080/api/v1/metrics/quantile?metric=http_request_duration&q=0.95&by=service" \
-H "Authorization: Bearer <key>"
# Series list and metrics store stats
curl "http://localhost:8080/api/v1/metrics" -H "Authorization: Bearer <key>"
curl "http://localhost:8080/api/v1/metrics/stats" -H "Authorization: Bearer <key>"

Sample values are stored as int64; see Metrics value model below for the scale/precision trade-off.
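
To make the quantile endpoint above less of a black box, here is a rough sketch of how a quantile can be read off an OTLP exponential histogram. The bucket boundaries follow the OpenTelemetry data model (bucket `i` covers `(base^i, base^(i+1)]` with `base = 2^(2^-scale)`). The `expHistogram` type is hypothetical, only positive buckets are modelled, and amber's own estimator may interpolate differently.

```go
package main

import (
	"fmt"
	"math"
)

// expHistogram holds the positive buckets of an exponential histogram.
// Bucket k covers (base^(Offset+k), base^(Offset+k+1)], base = 2^(2^-Scale).
type expHistogram struct {
	Scale  int32
	Offset int32
	Counts []uint64
}

// upperBound returns the inclusive upper boundary of bucket k.
func (h expHistogram) upperBound(k int) float64 {
	index := float64(int(h.Offset) + k + 1)
	return math.Exp2(index * math.Exp2(-float64(h.Scale)))
}

// quantile returns the upper bound of the bucket containing the q-th
// quantile (0 < q <= 1), so it over-estimates by at most one bucket width.
// ok is false for an empty histogram or an out-of-range q.
func (h expHistogram) quantile(q float64) (value float64, ok bool) {
	var total uint64
	for _, c := range h.Counts {
		total += c
	}
	if total == 0 || math.IsNaN(q) || q <= 0 || q > 1 {
		return 0, false
	}
	rank := q * float64(total)
	var cum uint64
	for k, c := range h.Counts {
		cum += c
		if float64(cum) >= rank {
			return h.upperBound(k), true
		}
	}
	return h.upperBound(len(h.Counts) - 1), true
}

func main() {
	// Scale 0 means base 2: buckets (1,2], (2,4], (4,8].
	h := expHistogram{Scale: 0, Offset: 0, Counts: []uint64{1, 1, 2}}
	p50, _ := h.quantile(0.5)
	p95, _ := h.quantile(0.95)
	fmt.Println("p50 <=", p50, "p95 <=", p95)
}
```

A higher `Scale` narrows the buckets, so the same estimator gets tighter as the histogram resolution increases.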
# Runtime + storage stats
curl "http://localhost:8080/api/v1/admin/stats" \
-H "Authorization: Bearer <key>"
# Segment metadata
curl "http://localhost:8080/api/v1/admin/segments" \
-H "Authorization: Bearer <key>"

/api/v1/admin/stats includes:
- segment counts and total records
- active segment metadata
- sparse index size
- heap usage snapshot
db.Log() and db.Span() are asynchronous: a nil error means the entry
was accepted into the in-process ingest queue (10k entries by default), not
that it reached disk. Entries are batched, written to the WAL with an fsync,
and acknowledged internally — but a crash (kill -9, OOM, power loss) loses
whatever was still in the queue or an unflushed batch.
When you need a durability barrier, call:
if err := db.Flush(ctx); err != nil { /* ctx expired */ }

Flush returns once every Log/Span call that completed before it was
invoked has been written to the WAL and synced (or counted as dropped —
write failures surface via the ingest circuit breaker and the
ingest_dropped_total self-metric, not via Flush). db.Close() drains the
queue the same way on shutdown.
The HTTP ingest API has the same model and makes it explicit with status
202 Accepted.
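
The accept-then-flush model described above can be illustrated with a toy queue. This is a simplified sketch and not amber's implementation: the `store` type is hypothetical, an in-memory slice stands in for the WAL, and batching, fsync, and the circuit breaker are left out. It shows why a nil error from `Log` only means "queued", and how a barrier item in the same queue gives `Flush` its ordering guarantee.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// item is either a log entry or a flush barrier (done != nil).
type item struct {
	entry string
	done  chan struct{}
}

// store models an async ingest queue in front of a durable log.
type store struct {
	queue   chan item
	durable []string // stands in for the WAL; touched only by the writer
	stopped chan struct{}
}

func open(queueSize int) *store {
	s := &store{queue: make(chan item, queueSize), stopped: make(chan struct{})}
	go s.writer()
	return s
}

// writer processes items in FIFO order, so a barrier is reached only
// after every entry queued before it has been written.
func (s *store) writer() {
	defer close(s.stopped)
	for it := range s.queue {
		if it.done != nil {
			close(it.done)
			continue
		}
		s.durable = append(s.durable, it.entry)
	}
}

// Log only enqueues: a nil error means "accepted", not "on disk".
func (s *store) Log(entry string) error {
	select {
	case s.queue <- item{entry: entry}:
		return nil
	default:
		return errors.New("ingest queue full")
	}
}

// Flush blocks until every entry accepted before the call is written,
// or until ctx expires.
func (s *store) Flush(ctx context.Context) error {
	done := make(chan struct{})
	select {
	case s.queue <- item{done: done}:
	case <-ctx.Done():
		return ctx.Err()
	}
	select {
	case <-done:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// Close drains the queue, waits for the writer, and returns what was written.
func (s *store) Close() []string {
	close(s.queue)
	<-s.stopped
	return s.durable
}

func main() {
	s := open(16)
	_ = s.Log("connection refused")
	if err := s.Flush(context.Background()); err != nil {
		fmt.Println("flush:", err)
	}
	fmt.Println("entries written:", len(s.Close()))
}
```

Anything still sitting in `queue` when the process dies is lost, which is the window the durability barrier exists to close.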
The embedded metrics engine stores every sample as an int64. OTLP float
points are converted to round(value × scale):
- the default scale is 1000 (three decimal digits of precision);
- values smaller than `1/scale` collapse to `0`;
- `NaN` (OTLP staleness marker), `±Inf`, and values whose scaled form overflows int64 are dropped at ingest and counted in the `metrics_ingest_rejected{reason="value_unencodable"}` self-metric;
- the scale is recorded in the reserved `__scale__` label, so the same metric sent with different scales produces different series.
If you need more than 3 decimal digits, set an explicit per-point scale. A native float codec is on the roadmap; until then size your scale to the precision your data actually carries.
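
The conversion rules above can be sketched in a few lines of Go. `encodeSample` and `decodeSample` are hypothetical names used for illustration, and the rejection logic is an assumption about how "unencodable" might be detected, not a copy of amber's code.

```go
package main

import (
	"fmt"
	"math"
)

// encodeSample converts a float sample to a scaled int64 as
// round(value * scale). ok is false when the value cannot be encoded
// (NaN, ±Inf, or int64 overflow after scaling).
func encodeSample(value float64, scale int64) (encoded int64, ok bool) {
	if math.IsNaN(value) || math.IsInf(value, 0) {
		return 0, false
	}
	scaled := math.Round(value * float64(scale))
	// float64(1<<63) is exactly 2^63, the first value past MaxInt64.
	const limit = float64(1 << 63)
	if scaled >= limit || scaled < -limit {
		return 0, false
	}
	return int64(scaled), true
}

// decodeSample reverses the scaling; digits below 1/scale are already gone.
func decodeSample(encoded, scale int64) float64 {
	return float64(encoded) / float64(scale)
}

func main() {
	const scale = 1000 // the documented default: three decimal digits
	for _, v := range []float64{1.5, 0.25, 0.0001, 1e17, math.NaN()} {
		enc, ok := encodeSample(v, scale)
		fmt.Printf("value=%v encoded=%d ok=%v\n", v, enc, ok)
	}
}
```

In this sketch `0.0001` survives encoding but becomes `0`, while `1e17` and `NaN` are rejected outright, which mirrors the collapse and drop cases listed above.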
See config.example.yaml for all options. Key settings:
| Setting | Default | Description |
|---|---|---|
| `storage.data_dir` | `./data` | Data directory |
| `storage.segment_max_records` | `1000000` | Records per segment before rotation |
| `storage.index_cache_size` | `32` | Max sealed index readers kept in memory |
| `ingest.batch_size` | `1000` | WAL batch size |
| `ingest.batch_timeout` | `100ms` | Max wait before flushing batch |
| `ingest.queue_size` | `100000` | Buffered ingest queue length |
| `api.http_addr` | `:8080` | HTTP listen address |
| `api.grpc_addr` | `:4317` | gRPC listen address (OTLP) |
| `api.api_key` | (empty) | Bearer token (empty = auth disabled) |
| `retention.max_age` | `0s` | Max segment age (0 = disabled) |
Measured with the in-repo obsbench suite: rate-capped ingest (20k/s, so every
system holds identical data), a fixed 600s settle, sequential runs, pinned
competitor versions, and a result-count equality gate per run (same seeds →
same query instances → systems must agree before any latency is published).
amber runs with an 800 MB soft memory limit (runtime.memory_limit). All
latencies are HTTP end-to-end, client-measured. Numbers are medians across runs.
These are honest, current results — amber is a deliberately experimental engine (sorted-slice bitmaps, no roaring/columnar): it wins logs outright, and on metrics/traces it trades raw scan/aggregate speed for low memory and a compact, unified store. Where a columnar competitor wins, the table says so.
Loki 3.4.2 · VictoriaLogs v1.36.0. Query p50/p95 (ms):
| Scenario | amber | Loki | VictoriaLogs |
|---|---|---|---|
| q1 — point (service + level) | 0.90 / 6.2 | 9.3 / 20 | 12.3 / 16 |
| q2 — range (service sub-window) | 1.89 / 7.0 | 51 / 77 | 15.3 / 21 |
| q3 — full-text (common token) | 4.06 / 5.2 | 57 / 73 | 28.4 / 31 |
| q4 — full-text (rare token) | 0.46 / 0.93 | 1914 / 1987 | 10.3 / 11 |
| Resource | amber | Loki | VictoriaLogs |
|---|---|---|---|
| RSS peak / loaded (MB) | 809 / 582 | 1169 / 688 | 402 / 53 |
| Storage (MB) | 875 | 260 | 226 |
amber wins all four query scenarios. It loses on RSS and on-disk size: VictoriaLogs is far leaner and Loki compacts hard during the settle window.
Mimir 3.1.0 · VictoriaMetrics v1.145.0 · Prometheus v3.12.0. Query p50/p95 (ms):
| Scenario | amber | Mimir | VictoriaMetrics | Prometheus |
|---|---|---|---|---|
| qm1 — instant rate | 2139 / 2625 | 768 / 883 | 187 / 227 | 458 / 490 |
| qm2 — range rate | 6330 / 6881 | 1325 / 1529 | 267 / 337 | 868 / 927 |
| qm3 — histogram quantile | 1297 / 4320 | 495 / 654 | 291 / 609 | 259 / 344 |
| qm4 — group-by rate | 2239 / 2852 | 760 / 922 | 179 / 206 | 423 / 506 |
| Resource | amber | Mimir | VictoriaMetrics | Prometheus |
|---|---|---|---|---|
| RSS peak (MB) | 1110 | 2076 | 576 | 718 |
| Storage (MB) | 540 | 370 | 152 | 255 |
Caveat (important): at campaign time amber was the slowest on all four metric queries — VictoriaMetrics (columnar) dominates. Since then the range-step query path was rewritten (resident block index + series-partitioned parallelism): in the amber-only bench qm2 dropped 6.3s → ~0.68s (~9×), moving amber past Mimir/Prometheus to ~2.5× VictoriaMetrics. A full cross-system re-campaign on the fixed engine is pending, so the table above is the last fully comparable run. amber already beats Mimir on RSS.
Tempo 2.10.1 · VictoriaTraces v0.9.2. Query p50 (ms):
| Scenario | amber | Tempo | VictoriaTraces |
|---|---|---|---|
| QT1 — trace-ID lookup | 108 | 94 | 18 |
| QT2 — service + operation search | 7629 | 108 | 26 |
| QT3 — service + duration search | 6078 | 67 | 40 |
| Resource | amber | Tempo | VictoriaTraces |
|---|---|---|---|
| RSS peak (MB) | 617 | 2249 | 1537 |
| Storage (MB) | 1872 | 5166 | 1639 |
amber wins RSS decisively (≈3.6× lighter than Tempo) and is second on storage, but loses the scan-search scenarios (QT2/QT3): it currently full-scans spans with no tag/duration index, where VictoriaTraces' columnar engine answers in tens of ms. Point lookup (QT1) is on par with Tempo.
Methodology and notes
- Equality gate: each run compares result counts across systems before publishing latency; a mismatch fails the run. Logs use exact set equality; metrics/traces use coarse gates (group cardinality / full-page count) where cross-system value identity isn't well defined.
- Ingest: rate-capped at 20k/s for every system so they all index the same data; partial queue-full rejects are parsed from amber's `503` body for true acked counts.
- `TotalHits` is a lower bound in amber (heap-threshold block skip), so it is never used for verification; admin `segments.total_records` is.
- All OTLP labels are datapoint attributes for metrics (resource attrs are renamed per system and would break cross-system label identity); for traces `service.name` rides as the OTLP resource attribute (backend standard).
- Full methodology and per-run artifacts live with the `obsbench` harness.
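
As an illustration of the equality gate in the notes above, the count comparison could look roughly like this. The `checkCounts` function is a hypothetical stand-in, not code from the `obsbench` harness, and it covers only the plain count check (not the exact set equality used for logs).

```go
package main

import (
	"fmt"
	"sort"
)

// checkCounts fails when the systems under test disagree on how many
// results a query returned; latency is published only if it returns nil.
func checkCounts(counts map[string]int) error {
	if len(counts) == 0 {
		return nil
	}
	names := make([]string, 0, len(counts))
	for name := range counts {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic reference system and error text

	ref := names[0]
	for _, name := range names[1:] {
		if counts[name] != counts[ref] {
			return fmt.Errorf("result count mismatch: %s=%d, %s=%d",
				ref, counts[ref], name, counts[name])
		}
	}
	return nil
}

func main() {
	// Made-up counts for the demo; one system is off by one.
	run := map[string]int{"amber": 120, "loki": 120, "victorialogs": 119}
	if err := checkCounts(run); err != nil {
		fmt.Println("run rejected:", err)
		return
	}
	fmt.Println("counts agree; latency can be published")
}
```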
Apache License 2.0
