-
Notifications
You must be signed in to change notification settings - Fork 0
Problem Statement
Every team that maintains a production database eventually faces this scenario:
- A developer restores a production backup to their local environment
- They spend days or weeks making changes — new tables, schema tweaks, reference data updates, config changes
- Now they need to push those changes back to production
This is where things go wrong.
Without tooling, the options are:
- Manually diff schemas and write ALTER statements — error-prone and time-consuming
- Export full tables and re-import — risks data loss and overwrites legitimate production changes
- Run raw SQL diffs — no safety checks, no review step, no rollback
- Hope that the ORM migration history is complete and accurate — it rarely is
The result is data loss, production incidents, and hours of manual reconciliation.
| Tool | What it does | What it misses |
|---|---|---|
mysqldiff / pgdiff
|
Schema-only comparison | No row-level diffing |
| Flyway / Liquibase | Migration history tracking | Doesn't compare live databases |
pt-table-checksum |
Row checksum comparison | No migration generation, MySQL only |
| Manual SQL export | Full data copy | Overwrites production changes (conflicts) |
| ORM migrations | Schema versioning | Doesn't capture data changes |
None of these tools answer the question: "Given two live databases that have independently diverged, what is the exact, safe, reviewable set of changes needed to bring production in sync with dev?"
DeepDiff DB treats database synchronization as a diff-review-apply workflow, similar to how Git handles code:
Prod DB ──┐
├── deepdiff-db diff ──► Reviewable diff + migration pack ──► deepdiff-db apply ──► Prod DB (updated)
Dev DB ──┘
1. Schema drift detection Detects every structural difference between two databases: added/removed tables, column type changes, nullable mismatches, added/removed indexes and foreign keys. Generates a safe, ordered SQL migration script respecting FK dependencies.
2. Row-level change detection without full export Hashes every row using SHA-256. Compares hash maps between prod and dev to find:
- Rows that exist in dev but not prod (to be inserted)
- Rows that exist in prod but not dev (to be deleted)
- Rows that exist in both but differ (conflicts)
Memory usage is bounded regardless of table size through keyset-paginated streaming.
3. Conflict identification and resolution
When the same row (same primary key) exists in both databases with different values, it's a conflict — neither "insert" nor "delete" is correct. DeepDiff DB surfaces all conflicts explicitly and lets you choose a resolution strategy per table (ours, theirs, manual).
4. Safe, transactional application Generated migration packs execute inside a single database transaction. Any error triggers automatic rollback — production is never left in a partially-migrated state.
5. Resilient long-running operations For large databases, hashing millions of rows takes time. DeepDiff DB checkpoints progress after every batch so an interrupted operation can resume exactly where it left off.
6. Destructive operation safety DROP statements (tables, columns, indexes, foreign keys) are gated behind explicit config flags and are commented out by default. You must opt in to destructive changes.
| Use Case | Description |
|---|---|
| Dev → Prod sync | Push reference data, config table changes, or feature data from dev to production |
| Prod backup reconciliation | After restoring a prod backup to dev, identify what changed in both databases |
| Environment promotion | Promote changes from staging to production with full diff visibility |
| Audit and compliance | Generate a complete record of what data changed, when, and what the migration looks like |
| Schema drift monitoring | Run schema-diff in CI to detect unintended production schema changes |
- Review before apply — Every migration is a human-readable SQL file you inspect before running
- Safe by default — Destructive operations require explicit opt-in
- Transactional — All-or-nothing; never leave production in a partial state
- Memory-efficient — Keyset pagination means table size doesn't determine memory usage
- Database-agnostic — Works across MySQL, PostgreSQL, SQLite, MSSQL, and Oracle with a single config
Home · Problem Statement · Architecture · Data Flow · CLI Reference · Configuration · Contributing
DeepDiff DB — safe, deterministic database synchronization