feat: add plakar integration by BeArchiTek · Pull Request #107 · opsmill/infrahub-backup

BeArchiTek · 2026-03-19T12:54:05Z

Summary

Adds Plakar (kloset) as an optional backup backend, enabling content-addressed, deduplicated backups alongside the existing tar.gz workflow.

- Introduces `--backend plakar` and `--repo` flags for the `create` and `restore` commands
- Streams backup data directly from containers into kloset snapshots (no local temp files)
- Each backup creates one snapshot per component (Neo4j, PostgreSQL, metadata), grouped by a backup-id
- Supports local filesystem and S3 repository storage via integration backends
- Adds a `snapshots list` command to browse available backups
- Default behavior (tar.gz) remains unchanged

Architecture

Component Design

Each Plakar backup creates 3 snapshots grouped by a timestamp-based backup-id:

| Component | Virtual Path | Source |
| --- | --- | --- |
| `neo4j` | `/neo4j-backup.tar` (enterprise) or `/neo4j.dump` (community) | `neo4j-admin database backup` / `dump` streamed from container |
| `postgres` | `/prefect.dump` | `pg_dump` streamed from task-manager-db container |
| `metadata` | `/backup_information.json` | In-memory JSON with version info, components, edition |

Snapshot Tags

All metadata is stored as kloset snapshot tags for query-time filtering:

infrahub.backup-id=20260318_143000
infrahub.component=neo4j
infrahub.backup-status=complete
infrahub.version=1.2.0
infrahub.backup-tool-version=0.5.0
infrahub.neo4j-edition=enterprise
infrahub.components=neo4j,postgres,metadata
infrahub.redacted=true  (optional)
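
To illustrate how tags in this `infrahub.key=value` form can be consumed at query time, here is a minimal sketch. The `parseTags` helper and the `tagPrefix` constant are hypothetical names for this example; the actual tag parsing in `snapshots.go` may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// tagPrefix is the namespace shared by the tags listed above.
const tagPrefix = "infrahub."

// parseTags turns "infrahub.key=value" strings into a map keyed by the part
// after the prefix. Tags without the prefix or without "=" are skipped.
func parseTags(tags []string) map[string]string {
	out := make(map[string]string)
	for _, tag := range tags {
		if !strings.HasPrefix(tag, tagPrefix) {
			continue
		}
		key, value, ok := strings.Cut(strings.TrimPrefix(tag, tagPrefix), "=")
		if !ok {
			continue
		}
		out[key] = value
	}
	return out
}

func main() {
	parsed := parseTags([]string{
		"infrahub.backup-id=20260318_143000",
		"infrahub.component=neo4j",
		"infrahub.components=neo4j,postgres,metadata",
		"unrelated=tag",
	})
	fmt.Println(parsed["backup-id"], parsed["component"])
	fmt.Println(strings.Split(parsed["components"], ","))
}
```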

Streaming Pipeline

Data flows directly from container exec → kloset chunker → repository, avoiding local temp files:

- `StreamingImporter` implements `importer.Importer` with a lazy `dataFunc` factory
- `dataFunc` calls `CommandExecutor.ExecStreamStdout()`, which returns an `io.ReadCloser` pipe from `docker compose exec`
- kloset's `snapshot.Backup()` reads from the pipe, content-addresses chunks, and deduplicates against existing data

Backup Group Integrity

- Group completeness is derived at query time by comparing present component snapshots against the `infrahub.components` tag
- Incomplete groups (partial failures) are logged with a warning and listed with `incomplete` status
- `restore` refuses incomplete groups unless `--force` is used
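
A minimal sketch of the completeness check described above, assuming the expected components arrive as the comma-separated value of the `infrahub.components` tag. The `groupStatus` function is a hypothetical name; the PR's own `determineGroupStatus` may have a different signature.

```go
package main

import (
	"fmt"
	"strings"
)

// groupStatus compares the components that have snapshots with the
// comma-separated list recorded in the infrahub.components tag.
// It returns the status and any missing component names.
func groupStatus(expectedTag string, present []string) (string, []string) {
	have := make(map[string]bool, len(present))
	for _, c := range present {
		have[c] = true
	}
	var missing []string
	for _, c := range strings.Split(expectedTag, ",") {
		c = strings.TrimSpace(c)
		if c != "" && !have[c] {
			missing = append(missing, c)
		}
	}
	if len(missing) > 0 {
		return "incomplete", missing
	}
	return "complete", nil
}

func main() {
	expected := "neo4j,postgres,metadata"
	fmt.Println(groupStatus(expected, []string{"neo4j", "postgres", "metadata"}))
	// A backup that failed after the Neo4j snapshot was written.
	fmt.Println(groupStatus(expected, []string{"neo4j"}))
}
```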

Restore Flow

Restore extracts snapshots to a temp directory, then restructures files to match the existing restore functions' expected layout:

temp/backup/neo4j/neo4j-backup.tar  →  temp/backup/database/neo4j-backup.tar
temp/backup/postgres/prefect.dump   →  temp/backup/prefect.dump

This reuses all existing restoreNeo4j() and restorePostgreSQL() logic without modification.

New Files

| File | Purpose |
| --- | --- |
| `src/internal/app/plakar.go` | Context init, repo open/create, cache management |
| `src/internal/app/plakar_backup.go` | `CreatePlakarBackup()`: orchestrates streaming backup |
| `src/internal/app/plakar_restore.go` | `RestorePlakarBackup()`: group restore, single snapshot restore, snapshot ID resolution |
| `src/internal/app/snapshots.go` | `ListSnapshots()`, backup group collection, status determination, tag parsing |
| `src/internal/app/importer.go` | `StreamingImporter`: kloset importer for pipe-based data |

Modified Files

| File | Changes |
| --- | --- |
| `src/cmd/infrahub-backup/main.go` | Added `--backend`, `--repo`, `--backup-id`, `--snapshot` flags; `snapshots list` command; backend validation |
| `src/internal/app/app.go` | Added `BackendType`, `PlakarConfig`, routing in `CreateBackup()`/`RestoreBackup()` |
| `src/internal/app/cli.go` | Added `--backend`, `--repo`, `--backup-id`, `--snapshot` to shared flag configuration |
| `src/internal/app/backup_neo4j.go` | Added `backupNeo4jEnterpriseStream()` and `backupNeo4jCommunityStream()` for pipe-based dumps |
| `src/internal/app/backup_taskmanager.go` | Added `backupTaskManagerDBStream()` for pipe-based pg_dump |
| `src/internal/app/command_executor.go` | Added `ExecStreamStdout()` for streaming container exec |

Usage

Create a backup

# Streams directly from containers into kloset (no local temp files)
infrahub-backup create --backend plakar --repo /var/backups/infrahub

# Subsequent backups are deduplicated against existing snapshots
infrahub-backup create --backend plakar --repo /var/backups/infrahub

# Backup to S3
infrahub-backup create --backend plakar --repo s3://my-bucket/infrahub-backups

List available backups

infrahub-backup snapshots list --repo /var/backups/infrahub

BACKUP ID            DATE                       STATUS      INFRAHUB VERSION  NEO4J EDITION  COMPONENTS
20260318_143000      2026-03-18T14:30:00Z       complete    1.18.1            enterprise     neo4j, postgres, metadata
20260317_020000      2026-03-17T02:00:00Z       complete    1.18.1            enterprise     neo4j, postgres, metadata

Restore from a backup

# Restore the latest complete backup group
infrahub-backup restore --backend plakar --repo /var/backups/infrahub

# Restore a specific backup group
infrahub-backup restore --backend plakar --repo /var/backups/infrahub --backup-id 20260318_143000

# Restore a single component (partial recovery)
infrahub-backup restore --backend plakar --repo /var/backups/infrahub --snapshot a3f2b1c8

# Force restore of incomplete backup group
infrahub-backup restore --backend plakar --repo /var/backups/infrahub --backup-id 20260318_143000 --force

Default behavior (unchanged)

# Still works exactly as before — produces tar.gz
infrahub-backup create
infrahub-backup restore backup-20260318.tar.gz

Dependencies Added

- `github.com/PlakarKorp/kloset` v1.0.13: core content-addressed storage library
- `github.com/PlakarKorp/integration-fs`: filesystem storage/importer/exporter backend
- `github.com/PlakarKorp/integration-s3`: S3 storage backend

Test plan

🤖 Generated with Claude Code

cloudflare-workers-and-pages · 2026-03-20T13:45:09Z

Deploying infrahub-ops-cli with Cloudflare Pages

Latest commit:	`37a4ef0`
Status:	✅ Deploy successful!
Preview URL:	https://0843f101.infrahub-ops-cli.pages.dev
Branch Preview URL:	https://002-plakar-integration.infrahub-ops-cli.pages.dev


BeArchiTek

Detailed review comments on the Plakar integration architecture and key design decisions.

BeArchiTek · 2026-03-23T14:34:13Z

src/internal/app/importer.go

+}
+
+// Scan returns a channel with a root directory record and a single file record.
+func (imp *StreamingImporter) Scan(_ context.Context) (<-chan *importer.ScanResult, error) {


Streaming Pipeline: The StreamingImporter implements kloset's importer.Importer interface using a lazy dataFunc factory. When kloset calls Scan(), it receives a channel with a root directory record and a single file record. The dataFunc is only invoked when kloset reads the file — this is when the container exec pipe is actually established, avoiding any upfront data buffering.

BeArchiTek · 2026-03-23T14:34:13Z

src/internal/app/plakar.go

+}
+
+// openOrCreateRepo opens an existing Plakar repository, or creates a new one if it doesn't exist.
+func openOrCreateRepo(kctx *kcontext.KContext, cfg *PlakarConfig) (*repository.Repository, error) {


Repository Lifecycle: openOrCreateRepo tries storage.Open() first. If it fails (repo doesn't exist), it creates a new repository with plaintext config (no encryption), closes it, then re-opens to get the proper config bytes for repository.New(). The create-close-reopen pattern is required by kloset's storage layer.

BeArchiTek · 2026-03-23T14:34:13Z

src/internal/app/plakar_backup.go

+	}
+
+	// Generate backup-id timestamp
+	backupID := time.Now().Format("20060102_150405")


Backup Group Model: Each CreatePlakarBackup call generates a timestamp-based backupID and creates one snapshot per component (neo4j, postgres, metadata). All snapshots in a group share the same infrahub.backup-id tag, which is used at query time to reconstruct groups. Partial failure tracking logs completed components but does not attempt rollback — incomplete groups are flagged at list/restore time.

BeArchiTek · 2026-03-23T14:34:13Z

src/internal/app/plakar_restore.go

+}
+
+// restoreBackupGroup exports each component snapshot to a temp directory and restores.
+func (iops *InfrahubOps) restoreBackupGroup(kctx *kcontext.KContext, repo *repository.Repository, group *BackupGroupInfo, excludeTaskManager bool, restoreMigrateFormat bool) error {


Restore Restructuring: The restore extracts each component snapshot to its own subdirectory (backup/neo4j/, backup/postgres/, backup/metadata/), then moves the extracted files into the layout expected by the existing restoreNeo4j() and restorePostgreSQL() functions. This avoids duplicating any restore logic.

BeArchiTek · 2026-03-23T14:34:13Z

src/internal/app/plakar_restore.go

+}
+
+// resolveSnapshotID resolves a snapshot identifier (partial hex or empty for latest).
+func resolveSnapshotID(repo snapshotLister, snapshotID string) (objects.MAC, error) {


Snapshot ID Resolution: resolveSnapshotID supports hex prefix matching — users can provide partial IDs (e.g., a3f2b1c8) and the function finds the matching snapshot. Ambiguous prefixes (matching multiple snapshots) return a clear error asking for a longer prefix. Empty ID returns the latest snapshot.
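
The prefix-matching behavior can be sketched over plain hex strings. This is an illustration only: the real `resolveSnapshotID` works with a `snapshotLister` and returns an `objects.MAC`, while the hypothetical `resolvePrefix` below takes a slice of lowercase hex IDs that is assumed to be ordered oldest to newest.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errNoSnapshots = errors.New("repository has no snapshots")

// resolvePrefix returns the full ID matching prefix. ids is assumed to hold
// lowercase hex strings ordered oldest to newest, so an empty prefix selects
// the last element.
func resolvePrefix(ids []string, prefix string) (string, error) {
	if len(ids) == 0 {
		return "", errNoSnapshots
	}
	if prefix == "" {
		return ids[len(ids)-1], nil
	}
	prefix = strings.ToLower(prefix)
	var matches []string
	for _, id := range ids {
		if strings.HasPrefix(id, prefix) {
			matches = append(matches, id)
		}
	}
	switch len(matches) {
	case 0:
		return "", fmt.Errorf("no snapshot matches prefix %q", prefix)
	case 1:
		return matches[0], nil
	default:
		return "", fmt.Errorf("prefix %q is ambiguous (%d matches), use a longer prefix", prefix, len(matches))
	}
}

func main() {
	ids := []string{"a3f2b1c8d4e5", "a3f2ffff0000", "9b54c5f00000"}
	fmt.Println(resolvePrefix(ids, "a3f2b1c8"))
	fmt.Println(resolvePrefix(ids, "a3f2"))
	fmt.Println(resolvePrefix(ids, ""))
}
```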

BeArchiTek · 2026-03-23T14:34:13Z

src/internal/app/snapshots.go

+	return groups, nil
+}
+
+// determineGroupStatus checks if all expected components are present.


Group Status Derivation: Completeness is not stored — it's computed at query time by comparing present component snapshots against the infrahub.components tag. This means kloset's immutable snapshot model works naturally: no need to update tags after creation. If a backup fails midway, the group simply has fewer snapshots than expected.

BeArchiTek · 2026-03-23T14:34:13Z

src/cmd/infrahub-backup/main.go

 )

+// validateBackendFlags checks for invalid flag combinations related to the --backend flag.
+func validateBackendFlags(iops *app.InfrahubOps) error {


Backend Validation: The tar.gz workflow's S3 flags (--s3-upload, --s3-bucket, etc.) are explicitly rejected when the plakar backend is selected. To store a plakar repository in S3, users should pass --repo s3://bucket/prefix instead, which routes through kloset's integration-s3 storage backend.
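
A rough sketch of this kind of flag check follows. The `options` struct and its field names are assumptions made for the example and do not reflect the actual `app.InfrahubOps` fields; only two of the S3 flags are modeled.

```go
package main

import (
	"errors"
	"fmt"
)

// options is a hypothetical, trimmed-down view of the parsed CLI flags.
type options struct {
	Backend  string // value of --backend
	Repo     string // value of --repo
	S3Upload bool   // --s3-upload
	S3Bucket string // --s3-bucket
}

// validateBackend rejects S3 upload flags when the plakar backend is
// selected, pointing the user at --repo s3://... instead.
func validateBackend(o options) error {
	if o.Backend != "plakar" {
		return nil
	}
	if o.S3Upload || o.S3Bucket != "" {
		return errors.New("S3 upload flags cannot be combined with --backend plakar; use --repo s3://bucket/prefix")
	}
	return nil
}

func main() {
	fmt.Println(validateBackend(options{Backend: "plakar", Repo: "s3://my-bucket/infrahub-backups"}))
	fmt.Println(validateBackend(options{Backend: "plakar", Repo: "/var/backups/infrahub", S3Upload: true}))
}
```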

- Fix CLAUDE.md markdown lint errors (blank lines around headings/lists)
- Deduplicate Active Technologies entries in CLAUDE.md
- Update vendorHash in flake.nix after go.mod dependency changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

BeArchiTek requested a review from a team as a code owner March 19, 2026 12:54

BeArchiTek commented Mar 23, 2026


Benoit Kohler and others added 2 commits March 23, 2026 15:40

feat: add plakar integration

9c3fa8c

feat: improve integration

9b54c5f

BeArchiTek force-pushed the 002-plakar-integration branch from cf793d2 to 9b54c5f Compare March 23, 2026 14:48

BeArchiTek wants to merge 3 commits into main from 002-plakar-integration
