Skip to content

refactor: centralize schema migration planning#524

Merged
iamgp merged 2 commits into
mainfrom
GWP/schema-migration-planning
May 23, 2026
Merged

refactor: centralize schema migration planning#524
iamgp merged 2 commits into
mainfrom
GWP/schema-migration-planning

Conversation

@iamgp
Copy link
Copy Markdown
Collaborator

@iamgp iamgp commented May 23, 2026

Summary

Centralizes schema migration planning behind the new phlo.schema_migration package and removes duplicated provider diff logic from the Iceberg and Delta adapters.

What changed

  • Added shared schema migration planning primitives for neutral schema diffs, explicit rename instructions, provider policy overrides, and recommendation generation.
  • Added domain-level migration instruction loading for YAML files and repeatable CLI --rename old=new flags.
  • Updated schema-migrate diff, plan, and apply to use one planning path instead of duplicating instruction resolution in each command.
  • Made apply --dry-run validate provider support before printing dry-run output.
  • Updated Iceberg and Delta migrators to delegate diff planning and classification through the shared policy-based planner.
  • Removed older duplicate compatibility helpers from schema registry/helper surfaces.

Validation

  • uv run pytest tests/data/test_schema_migration_planning.py tests/data/test_schema_migration.py tests/data/test_schema_registry.py tests/data/test_schema_migrate_contracts.py tests/cli/test_cli_schema_migrate.py tests/unit/helpers/test_helpers_core.py packages/phlo-iceberg/tests/test_schema_migrator.py packages/phlo-delta/tests -q -> 128 passed, 1 deselected
  • uv run ruff check src/phlo/schema_migration src/phlo/cli/commands/schema_migrate.py src/phlo/cli/commands/schema_migrate_contracts.py packages/phlo-iceberg/src/phlo_iceberg/schema_migrator.py packages/phlo-delta/src/phlo_delta/schema_migrator.py tests/data/test_schema_migration_planning.py tests/cli/test_cli_schema_migrate.py
  • git diff --check
  • uv run ty check ... exited 0 with existing warnings in tests around dynamic module attributes, monkeypatched methods, and possible None access.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 23, 2026

Review Change Stack

Warning

Review limit reached

@iamgp, we couldn't start this review because you've used your available PR reviews for now.

Your plan currently allows 1 review/hour. Refill in 47 minutes and 58 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: f3864f20-2883-4068-8fae-4ef244e91380

📥 Commits

Reviewing files that changed from the base of the PR and between 0fdd575 and c84e7bf.

📒 Files selected for processing (15)
  • packages/phlo-delta/src/phlo_delta/schema_migrator.py
  • packages/phlo-iceberg/src/phlo_iceberg/schema_migrator.py
  • src/phlo/cli/commands/schema_migrate.py
  • src/phlo/cli/commands/schema_migrate_contracts.py
  • src/phlo/cli/commands/schema_registry_cli.py
  • src/phlo/helpers/__init__.py
  • src/phlo/helpers/schema.py
  • src/phlo/schema_migration/__init__.py
  • src/phlo/schema_migration/instructions.py
  • src/phlo/schema_migration/planning.py
  • src/phlo/schema_registry.py
  • tests/cli/test_cli_schema_migrate.py
  • tests/data/test_schema_migration_planning.py
  • tests/data/test_schema_registry.py
  • tests/unit/helpers/test_helpers_core.py
📝 Walkthrough

Walkthrough

This PR consolidates schema migration planning into a centralised shared module. The new phlo.schema_migration package provides configurable migration planning, instruction loading, and change classification. Migrators (Delta, Iceberg) and CLI commands are refactored to use this shared layer, replacing inline schema-diffing logic. Legacy comparison functions are removed.

Changes

Schema migration planning refactor

Layer / File(s) Summary
Shared migration planning module
src/phlo/schema_migration/__init__.py, src/phlo/schema_migration/instructions.py, src/phlo/schema_migration/planning.py
New package introduces plan_schema_migration() to compute SchemaMigrationPlan with change classification, recommendations, and approval requirements. Supports explicit field renames via SchemaMigrationInstructions and customisable change classifications via SchemaPlanningPolicy. Instruction loading handles YAML files and CLI --rename flags with conflict detection.
Delta migrator integration
packages/phlo-delta/src/phlo_delta/schema_migrator.py
DeltaSchemaMigrator.diff_schema() refactored to use plan_schema_migration() instead of inline field diffing. Introduces DELTA_SCHEMA_POLICY to customise "drop" and "rename" classifications. Signature updated to accept optional instructions parameter.
Iceberg migrator integration
packages/phlo-iceberg/src/phlo_iceberg/schema_migrator.py
IcebergSchemaMigrator.diff_schema() refactored to use plan_schema_migration(). Introduces ICEBERG_SCHEMA_POLICY to customise drop classifications. Removes internal _WIDEN_PAIRS constant. Signature updated to accept optional instructions and converts Iceberg schema to FieldSpec entries (excluding system metadata).
CLI command integration
src/phlo/cli/commands/schema_migrate.py, src/phlo/cli/commands/schema_migrate_contracts.py, src/phlo/cli/commands/schema_registry_cli.py
schema-migrate diff/plan/apply commands gain --migration-file and repeatable --rename options. New _build_migration_plan() helper resolves instructions and invokes migrators; new _ensure_plan_supported() validates plan change types. Scaffold generation updated to include "renames": {} and use standardised instruction paths. Schema registry check command switched to plan_schema_migration().
Legacy code removal
src/phlo/helpers/schema.py, src/phlo/helpers/__init__.py, src/phlo/schema_registry.py
Removes deprecated schema comparison/compatibility functions (compare_schemas, assert_schema_compatible, check_compatibility). Narrows phlo.helpers exports to normalisation and conversion utilities only.
Test coverage – planning
tests/data/test_schema_migration_planning.py
Comprehensive suite testing plan_schema_migration() classification behaviour, explicit rename handling, rename validation, and resolve_migration_instructions() YAML/CLI merging and conflict detection.
Test coverage – CLI
tests/cli/test_cli_schema_migrate.py
Regression tests validate --migration-file and --rename option handling, YAML/CLI rename merging, conflicting rename rejection, and refusal of unsupported change types in apply mode. Scaffold generation assertions updated.
Test cleanup
tests/data/test_schema_registry.py, tests/unit/helpers/test_helpers_core.py
Removes TestCheckCompatibility test class and legacy compare_schemas coverage. Updates one test to use plan_schema_migration() instead of deprecated helper.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 52.46% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main objective: centralizing schema migration planning, which is the core refactoring effort across all changed files.
Description check ✅ Passed The description directly relates to the changeset, detailing what was centralised, what was added, and what was removed across the schema migration infrastructure.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch GWP/schema-migration-planning

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@iamgp iamgp marked this pull request as ready for review May 23, 2026 10:43
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 23, 2026

Cairn Quality Report

Commit: 86a0104 · View full report

Checker Status ✅ Passed ❌ Failed Items
ruff passed 0 0 0
ruff-format pass 0 0 1
pytest-3.11 passed 992 0 993
pytest-3.12 passed 992 0 993

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/phlo/schema_registry.py (1)

1-1: ⚡ Quick win

Refresh the module docstring after removing compatibility checks.

This module now handles snapshotting/registry concerns only; “detecting breaking changes” suggests functionality that was removed with check_compatibility(...).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/phlo/schema_registry.py` at line 1, Update the module docstring to remove
the phrase about “detecting breaking changes” and describe the module's current
responsibility: snapshotting and registry management of schemas only; reference
the removed function check_compatibility(...) in the updated text to clarify
that compatibility checking was removed and is no longer provided by this
module.
src/phlo/helpers/schema.py (1)

1-1: ⚡ Quick win

Update the module docstring to match the trimmed API.

compare_schemas, suggest_schema_migration, and assert_schema_compatible are gone, so “comparison helpers” now overstates what this module provides.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/phlo/helpers/schema.py` at line 1, Update the module docstring at the top
of the schema module to remove the reference to comparison helpers (the
functions compare_schemas, suggest_schema_migration, and
assert_schema_compatible have been removed); replace the current string
"Normalized schema construction and comparison helpers." with wording that
reflects the trimmed API such as "Normalized schema construction helpers." so
the docstring accurately describes the remaining functionality.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/phlo/schema_migration/planning.py`:
- Around line 217-244: The _validate_renames logic currently allows
chained/swapped renames by permitting a rename whose new_name already exists in
current_fields when that existing name is also a rename source; change this to
reject such cases and also reject no-op renames where old_name == new_name.
Concretely: inside _validate_renames (symbols: renames, current_fields,
desired_fields, seen_targets, renamed_sources) add an explicit check that raises
SchemaMigrationPlanningError if old_name == new_name, and change the later
validation so that any new_name present in current_fields is invalid (remove the
exception for renamed_sources) — keep the existing checks for empty names,
missing source/target in current/desired, and duplicate targets (seen_targets)
unchanged.

---

Nitpick comments:
In `@src/phlo/helpers/schema.py`:
- Line 1: Update the module docstring at the top of the schema module to remove
the reference to comparison helpers (the functions compare_schemas,
suggest_schema_migration, and assert_schema_compatible have been removed);
replace the current string "Normalized schema construction and comparison
helpers." with wording that reflects the trimmed API such as "Normalized schema
construction helpers." so the docstring accurately describes the remaining
functionality.

In `@src/phlo/schema_registry.py`:
- Line 1: Update the module docstring to remove the phrase about “detecting
breaking changes” and describe the module's current responsibility: snapshotting
and registry management of schemas only; reference the removed function
check_compatibility(...) in the updated text to clarify that compatibility
checking was removed and is no longer provided by this module.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ab6b3cf2-b5d3-4758-a59e-4e672cbc4eb3

📥 Commits

Reviewing files that changed from the base of the PR and between fb84fbc and 0fdd575.

📒 Files selected for processing (15)
  • packages/phlo-delta/src/phlo_delta/schema_migrator.py
  • packages/phlo-iceberg/src/phlo_iceberg/schema_migrator.py
  • src/phlo/cli/commands/schema_migrate.py
  • src/phlo/cli/commands/schema_migrate_contracts.py
  • src/phlo/cli/commands/schema_registry_cli.py
  • src/phlo/helpers/__init__.py
  • src/phlo/helpers/schema.py
  • src/phlo/schema_migration/__init__.py
  • src/phlo/schema_migration/instructions.py
  • src/phlo/schema_migration/planning.py
  • src/phlo/schema_registry.py
  • tests/cli/test_cli_schema_migrate.py
  • tests/data/test_schema_migration_planning.py
  • tests/data/test_schema_registry.py
  • tests/unit/helpers/test_helpers_core.py
💤 Files with no reviewable changes (2)
  • src/phlo/helpers/init.py
  • tests/data/test_schema_registry.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: cairn / report
🔇 Additional comments (3)
tests/data/test_schema_migration_planning.py (1)

1-198: LGTM!

tests/cli/test_cli_schema_migrate.py (1)

23-23: LGTM!

Also applies to: 287-287, 292-526

tests/unit/helpers/test_helpers_core.py (1)

49-49: LGTM!

Also applies to: 101-103

Comment thread src/phlo/schema_migration/planning.py
@iamgp iamgp force-pushed the GWP/schema-migration-planning branch from 0fdd575 to 5da7d5e Compare May 23, 2026 11:38
@iamgp iamgp merged commit 8e46600 into main May 23, 2026
16 checks passed
@iamgp iamgp deleted the GWP/schema-migration-planning branch May 23, 2026 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant