feat: add plakar integration #107

Open
BeArchiTek wants to merge 3 commits into main from 002-plakar-integration

@BeArchiTek BeArchiTek commented Mar 19, 2026

Summary

Adds Plakar (kloset) as an optional backup backend, enabling content-addressed, deduplicated backups alongside the existing tar.gz workflow.

  • Introduces --backend plakar and --repo flags for create and restore commands
  • Streams backup data directly from containers into kloset snapshots (no local temp files)
  • Each backup creates one snapshot per component (Neo4j, PostgreSQL, metadata), grouped by a backup-id
  • Supports local filesystem and S3 repository storage via the integration-fs and integration-s3 backends
  • Adds snapshots list command to browse available backups
  • Default behavior (tar.gz) remains unchanged

Architecture

Component Design

Each Plakar backup creates 3 snapshots grouped by a timestamp-based backup-id:

| Component | Virtual Path | Source |
|---|---|---|
| neo4j | /neo4j-backup.tar (enterprise) or /neo4j.dump (community) | neo4j-admin database backup / dump streamed from container |
| postgres | /prefect.dump | pg_dump streamed from task-manager-db container |
| metadata | /backup_information.json | In-memory JSON with version info, components, edition |

Snapshot Tags

All metadata is stored as kloset snapshot tags for query-time filtering:

infrahub.backup-id=20260318_143000
infrahub.component=neo4j
infrahub.backup-status=complete
infrahub.version=1.2.0
infrahub.backup-tool-version=0.5.0
infrahub.neo4j-edition=enterprise
infrahub.components=neo4j,postgres,metadata
infrahub.redacted=true  (optional)
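
To illustrate how tags in this `key=value` form could be read back at query time, here is a minimal sketch. `parseTags` and `expectedComponents` are hypothetical helpers, not the functions in this PR; the only things taken from the description are the `infrahub.` prefix and the comma-separated `infrahub.components` value.

```go
package main

import "strings"

// parseTags turns "key=value" tag strings into a map, keeping only keys
// in the "infrahub." namespace. Hypothetical helper, not the PR's code.
func parseTags(tags []string) map[string]string {
	out := make(map[string]string)
	for _, t := range tags {
		key, value, ok := strings.Cut(t, "=")
		if !ok || !strings.HasPrefix(key, "infrahub.") {
			continue // skip malformed tags and tags from other tools
		}
		out[key] = value
	}
	return out
}

// expectedComponents splits the comma-separated infrahub.components tag.
// It returns nil when the tag is absent or empty.
func expectedComponents(tags map[string]string) []string {
	raw := tags["infrahub.components"]
	if raw == "" {
		return nil
	}
	return strings.Split(raw, ",")
}
```

Keeping the parsed tags in a plain map makes filtering by `infrahub.backup-id` or `infrahub.component` a simple lookup.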

Streaming Pipeline

Data flows directly from container exec → kloset chunker → repository, avoiding local temp files:

  1. StreamingImporter implements importer.Importer with a lazy dataFunc factory
  2. dataFunc calls CommandExecutor.ExecStreamStdout() which returns an io.ReadCloser pipe from docker compose exec
  3. kloset's snapshot.Backup() reads from the pipe, content-addresses chunks, and deduplicates against existing data

Backup Group Integrity

  • Group completeness is derived at query time by comparing present component snapshots against the infrahub.components tag
  • Incomplete groups (partial failures) are logged with a warning and listed with incomplete status
  • restore refuses incomplete groups unless --force is used
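
A minimal sketch of the completeness check described above, assuming the expected components come from the `infrahub.components` tag as a comma-separated string. `groupStatus` is a hypothetical name and signature; the PR's `determineGroupStatus` may differ. Treating an empty tag as incomplete is an assumption of this sketch.

```go
package main

import "strings"

// groupStatus compares the components actually present in a backup group
// with those listed in the infrahub.components tag. It returns "complete"
// or "incomplete", plus the missing component names in tag order.
func groupStatus(componentsTag string, present []string) (string, []string) {
	have := make(map[string]bool, len(present))
	for _, c := range present {
		have[c] = true
	}
	expected := 0
	var missing []string
	for _, want := range strings.Split(componentsTag, ",") {
		want = strings.TrimSpace(want)
		if want == "" {
			continue
		}
		expected++
		if !have[want] {
			missing = append(missing, want)
		}
	}
	// Assumption: a group with no declared components cannot be verified,
	// so it is reported as incomplete rather than complete.
	if expected == 0 || len(missing) > 0 {
		return "incomplete", missing
	}
	return "complete", nil
}
```

A restore path could then refuse to proceed when the status is `incomplete` unless `--force` was passed.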

Restore Flow

Restore extracts snapshots to a temp directory, then restructures files to match the existing restore functions' expected layout:

temp/backup/neo4j/neo4j-backup.tar  →  temp/backup/database/neo4j-backup.tar
temp/backup/postgres/prefect.dump   →  temp/backup/prefect.dump

This reuses all existing restoreNeo4j() and restorePostgreSQL() logic without modification.

New Files

| File | Purpose |
|---|---|
| src/internal/app/plakar.go | Context init, repo open/create, cache management |
| src/internal/app/plakar_backup.go | CreatePlakarBackup() — orchestrates streaming backup |
| src/internal/app/plakar_restore.go | RestorePlakarBackup() — group restore, single snapshot restore, snapshot ID resolution |
| src/internal/app/snapshots.go | ListSnapshots(), backup group collection, status determination, tag parsing |
| src/internal/app/importer.go | StreamingImporter — kloset importer for pipe-based data |

Modified Files

| File | Changes |
|---|---|
| src/cmd/infrahub-backup/main.go | Added --backend, --repo, --backup-id, --snapshot flags; snapshots list command; backend validation |
| src/internal/app/app.go | Added BackendType, PlakarConfig, routing in CreateBackup()/RestoreBackup() |
| src/internal/app/cli.go | Added --backend, --repo, --backup-id, --snapshot to shared flag configuration |
| src/internal/app/backup_neo4j.go | Added backupNeo4jEnterpriseStream() and backupNeo4jCommunityStream() for pipe-based dumps |
| src/internal/app/backup_taskmanager.go | Added backupTaskManagerDBStream() for pipe-based pg_dump |
| src/internal/app/command_executor.go | Added ExecStreamStdout() for streaming container exec |

Usage

Create a backup

# Streams directly from containers into kloset (no local temp files)
infrahub-backup create --backend plakar --repo /var/backups/infrahub

# Subsequent backups are deduplicated against existing snapshots
infrahub-backup create --backend plakar --repo /var/backups/infrahub

# Backup to S3
infrahub-backup create --backend plakar --repo s3://my-bucket/infrahub-backups

List available backups

infrahub-backup snapshots list --repo /var/backups/infrahub
BACKUP ID            DATE                       STATUS      INFRAHUB VERSION  NEO4J EDITION  COMPONENTS
20260318_143000      2026-03-18T14:30:00Z       complete    1.18.1            enterprise     neo4j, postgres, metadata
20260317_020000      2026-03-17T02:00:00Z       complete    1.18.1            enterprise     neo4j, postgres, metadata

Restore from a backup

# Restore the latest complete backup group
infrahub-backup restore --backend plakar --repo /var/backups/infrahub

# Restore a specific backup group
infrahub-backup restore --backend plakar --repo /var/backups/infrahub --backup-id 20260318_143000

# Restore a single component (partial recovery)
infrahub-backup restore --backend plakar --repo /var/backups/infrahub --snapshot a3f2b1c8

# Force restore of incomplete backup group
infrahub-backup restore --backend plakar --repo /var/backups/infrahub --backup-id 20260318_143000 --force

Default behavior (unchanged)

# Still works exactly as before — produces tar.gz
infrahub-backup create
infrahub-backup restore backup-20260318.tar.gz

Dependencies Added

  • github.com/PlakarKorp/kloset v1.0.13 — Core content-addressed storage library
  • github.com/PlakarKorp/integration-fs — Filesystem storage/importer/exporter backend
  • github.com/PlakarKorp/integration-s3 — S3 storage backend

Test plan

  • Verify create --backend plakar produces snapshots in a local repo
  • Verify snapshots list displays grouped backup info correctly
  • Verify restore --backend plakar restores from latest complete snapshot group
  • Verify restore --backup-id targets a specific backup group
  • Verify restore --snapshot restores a single component
  • Verify restore --force allows restoring incomplete groups
  • Verify default tar.gz workflow is unaffected (no regressions)
  • Verify --backend plakar without --repo returns a clear error
  • Verify S3 upload flags (e.g. --s3-upload) are rejected when combined with the plakar backend
  • Verify Neo4j Community Edition handling (container stop/start, dump format)
  • Verify S3 backend connectivity (if configured)

🤖 Generated with Claude Code

@BeArchiTek BeArchiTek requested a review from a team as a code owner March 19, 2026 12:54

cloudflare-workers-and-pages bot commented Mar 20, 2026

Deploying infrahub-ops-cli with Cloudflare Pages

Latest commit: 37a4ef0
Status: ✅  Deploy successful!
Preview URL: https://0843f101.infrahub-ops-cli.pages.dev
Branch Preview URL: https://002-plakar-integration.infrahub-ops-cli.pages.dev

@BeArchiTek BeArchiTek left a comment

Detailed review comments on the Plakar integration architecture and key design decisions.

}

// Scan returns a channel with a root directory record and a single file record.
func (imp *StreamingImporter) Scan(_ context.Context) (<-chan *importer.ScanResult, error) {

Streaming Pipeline: The StreamingImporter implements kloset's importer.Importer interface using a lazy dataFunc factory. When kloset calls Scan(), it receives a channel with a root directory record and a single file record. The dataFunc is only invoked when kloset reads the file — this is when the container exec pipe is actually established, avoiding any upfront data buffering.

}

// openOrCreateRepo opens an existing Plakar repository, or creates a new one if it doesn't exist.
func openOrCreateRepo(kctx *kcontext.KContext, cfg *PlakarConfig) (*repository.Repository, error) {

Repository Lifecycle: openOrCreateRepo tries storage.Open() first. If it fails (repo doesn't exist), it creates a new repository with plaintext config (no encryption), closes it, then re-opens to get the proper config bytes for repository.New(). The create-close-reopen pattern is required by kloset's storage layer.

}

// Generate backup-id timestamp
backupID := time.Now().Format("20060102_150405")

Backup Group Model: Each CreatePlakarBackup call generates a timestamp-based backupID and creates one snapshot per component (neo4j, postgres, metadata). All snapshots in a group share the same infrahub.backup-id tag, which is used at query time to reconstruct groups. On partial failure, the components that completed are logged but no rollback is attempted; incomplete groups are flagged at list/restore time.

}

// restoreBackupGroup exports each component snapshot to a temp directory and restores.
func (iops *InfrahubOps) restoreBackupGroup(kctx *kcontext.KContext, repo *repository.Repository, group *BackupGroupInfo, excludeTaskManager bool, restoreMigrateFormat bool) error {

Restore Restructuring: The restore extracts each component snapshot to its own subdirectory (backup/neo4j/, backup/postgres/, backup/metadata/), then renames directories to match the layout expected by the existing restoreNeo4j() and restorePostgreSQL() functions. This avoids duplicating any restore logic.

}

// resolveSnapshotID resolves a snapshot identifier (partial hex or empty for latest).
func resolveSnapshotID(repo snapshotLister, snapshotID string) (objects.MAC, error) {

Snapshot ID Resolution: resolveSnapshotID supports hex prefix matching — users can provide partial IDs (e.g., a3f2b1c8) and the function finds the matching snapshot. Ambiguous prefixes (matching multiple snapshots) return a clear error asking for a longer prefix. Empty ID returns the latest snapshot.
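
The prefix-matching behaviour can be sketched over plain hex strings. This is an illustration only: `resolvePrefix` is a hypothetical stand-in that works on `[]string` rather than kloset's `objects.MAC`, and it assumes the IDs are lowercase hex and ordered oldest to newest.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errNoSnapshots = errors.New("no snapshots in repository")

// resolvePrefix returns the single ID that starts with prefix.
// An empty prefix selects the latest snapshot (the last element of ids).
func resolvePrefix(ids []string, prefix string) (string, error) {
	if len(ids) == 0 {
		return "", errNoSnapshots
	}
	if prefix == "" {
		return ids[len(ids)-1], nil
	}
	prefix = strings.ToLower(prefix)
	var matches []string
	for _, id := range ids {
		if strings.HasPrefix(id, prefix) {
			matches = append(matches, id)
		}
	}
	switch len(matches) {
	case 0:
		return "", fmt.Errorf("no snapshot matches prefix %q", prefix)
	case 1:
		return matches[0], nil
	default:
		return "", fmt.Errorf("prefix %q is ambiguous (%d matches); use a longer prefix", prefix, len(matches))
	}
}
```

Collecting all matches before deciding, rather than returning the first hit, is what makes the ambiguity error possible.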

return groups, nil
}

// determineGroupStatus checks if all expected components are present.

Group Status Derivation: Completeness is not stored — it's computed at query time by comparing present component snapshots against the infrahub.components tag. This means kloset's immutable snapshot model works naturally: no need to update tags after creation. If a backup fails midway, the group simply has fewer snapshots than expected.

)

// validateBackendFlags checks for invalid flag combinations related to the --backend flag.
func validateBackendFlags(iops *app.InfrahubOps) error {

Backend Validation: S3 flags (--s3-upload, --s3-bucket, etc.) are explicitly rejected when using plakar backend. For S3 with plakar, users should use --repo s3://bucket/prefix instead, which routes through kloset's integration-s3 storage backend.
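
A minimal sketch of this kind of flag check follows. The `backendFlags` struct, its fields, and the error wording are hypothetical; the real `validateBackendFlags` takes an `*app.InfrahubOps` and may check more flags than shown here.

```go
package main

import "errors"

// backendFlags is a hypothetical, reduced view of the CLI flags involved.
type backendFlags struct {
	Backend  string // "plakar" selects the new backend; anything else is treated as the default here
	Repo     string // value of --repo
	S3Upload bool   // true when --s3-upload was passed
}

// validateFlags rejects flag combinations that make no sense with plakar.
func validateFlags(f backendFlags) error {
	if f.Backend != "plakar" {
		return nil // default tar.gz workflow: nothing plakar-specific to check
	}
	if f.Repo == "" {
		return errors.New("--backend plakar requires --repo")
	}
	if f.S3Upload {
		return errors.New("--s3-upload cannot be combined with --backend plakar; use --repo s3://bucket/prefix instead")
	}
	return nil
}
```

Pointing the error message at the `--repo s3://...` form tells the user what to do instead, which matches the "clear error" item in the test plan.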

@BeArchiTek BeArchiTek force-pushed the 002-plakar-integration branch from cf793d2 to 9b54c5f Compare March 23, 2026 14:48
- Fix CLAUDE.md markdown lint errors (blank lines around headings/lists)
- Deduplicate Active Technologies entries in CLAUDE.md
- Update vendorHash in flake.nix after go.mod dependency changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>