Description
Hi Borg maintainers,
I've been exploring the borg create flow in depth (from CLI dispatch → do_create → _rec_walk / _process_any → archive.py processing), and I noticed that while Borg provides rich logging for humans, it currently lacks structured, machine-readable observability during backup operations.
Motivation
During large backup runs:
- Errors and retries are logged as text, which can be hard to analyze programmatically
- Deduplication decisions (reuse vs new chunks) are not externally visible
- There is no structured way to trace the per-file lifecycle (start → processed → skipped → error)
This makes:
- debugging large backups harder
- integration with external tools (dashboards, monitoring, UI) difficult
- automated analysis of backup behavior nearly impossible
Proposal
Introduce an optional structured event reporting system for borg create:
CLI Flag (opt-in)
```
borg create --log-json ...
```
Core Idea
Emit structured events during backup execution, for example (JSON Lines format):
```
{"event": "file_started", "path": "/home/user/file.txt"}
{"event": "chunk_reused", "chunk_id": "...", "size": 4096}
{"event": "file_completed", "path": "/home/user/file.txt", "status": "ok"}
{"event": "file_error", "path": "...", "error": "..."}
```
Possible Design Direction
- Introduce a lightweight event emitter inside the create pipeline
- Hook into key points:
  - before/after file processing (`_process_any`, `process_file`)
  - deduplication decisions (`cache.reuse_chunk`, `add_chunk`)
  - retry / error paths
- Keep default behavior unchanged (text logs remain primary)
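To make the hook points above concrete, here is a minimal sketch of what such an emitter could look like. Everything here is a hypothetical illustration for discussion — `EventEmitter`, the `process_file` wrapper, and the field names are my assumptions, not existing Borg APIs:

```python
import json
import sys
from typing import Any, TextIO


class EventEmitter:
    """Hypothetical sketch: writes one JSON object per line (JSON Lines)."""

    def __init__(self, stream: TextIO = sys.stderr, enabled: bool = False):
        self.stream = stream
        self.enabled = enabled  # off by default: text logs remain primary

    def emit(self, event: str, **fields: Any) -> None:
        if not self.enabled:
            return  # opt-in only; default behavior is unchanged
        record = {"event": event, **fields}
        self.stream.write(json.dumps(record, sort_keys=True) + "\n")


def process_file(path: str, emitter: EventEmitter) -> None:
    """Illustrates hooking events around file processing (not real Borg code)."""
    emitter.emit("file_started", path=path)
    try:
        # ... chunking / dedup work would happen here ...
        emitter.emit("file_completed", path=path, status="ok")
    except OSError as exc:
        emitter.emit("file_error", path=path, error=str(exc))
```

With `enabled=False` the emitter is a no-op, which is one way to keep the cost near zero when the flag is not given.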
Benefits
- Improves observability and debugging
- Enables future tooling:
  - dashboards / visualizations
  - progress trackers
  - integration with external systems
- Keeps Borg backward-compatible (fully opt-in)
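As a sketch of the tooling this would enable, a consumer could summarize a captured event stream. The `summarize` helper and the event names are assumptions based on the format proposed above, not existing code:

```python
import json
from collections import Counter
from typing import Iterable


def summarize(lines: Iterable[str]) -> Counter:
    """Count events by type from a JSON Lines stream (hypothetical format)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        counts[json.loads(line)["event"]] += 1
    return counts
```

A dashboard or progress tracker could run the same loop incrementally as lines arrive.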
Questions / Feedback
- Would such an observability layer align with Borg's design goals?
- Are there existing discussions or constraints I should be aware of before prototyping?
- Preferred direction:
  - integrate with the existing logging system, or
  - introduce a separate structured event pipeline?
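For the first option, one possible shape would be a formatter on top of Python's stdlib logging. This is purely a sketch — `JsonLinesFormatter` and the `extra={"event": ...}` convention are my assumptions, not how Borg's logging is actually wired:

```python
import json
import logging


class JsonLinesFormatter(logging.Formatter):
    """Render log records that carry an 'event' dict as JSON Lines;
    all other records fall back to normal text formatting."""

    def format(self, record: logging.LogRecord) -> str:
        event = getattr(record, "event", None)
        if event is not None:
            return json.dumps(event, sort_keys=True)
        return super().format(record)


# Usage sketch: call sites would attach the event via `extra`, e.g.
#   logger.info("chunk reused", extra={"event": {"event": "chunk_reused", "size": 4096}})
```

This would keep a single output pipeline, at the cost of coupling event emission to logger configuration; a separate event pipeline would avoid that coupling but duplicate some plumbing.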
If this direction makes sense, I'd be happy to:
- prototype a minimal version (e.g., file-level events)
- iterate based on feedback
Thanks for your time and for maintaining such a powerful tool!