Skip to content

Sync measures.yaml after tidy review apply#68

Merged
daniel-thom merged 4 commits into
mainfrom
fix/tidy-measures-sync
May 10, 2026
Merged

Sync measures.yaml after tidy review apply#68
daniel-thom merged 4 commits into
mainfrom
fix/tidy-measures-sync

Conversation

@daniel-thom
Copy link
Copy Markdown
Contributor

Summary

The tidy review apply pipeline previously only synced schema.yaml, leaving measures.yaml pointing at columns that no longer exist after a wide-to-long reshape. Reproed against dsgrid-test-data/.../Commercial_End_Uses/load_data.csv: after --replace-source, measures.yaml still referenced elec_heating, elec_cooling, etc. which had been pivoted into a single value column.

Adds update_measures_yaml_for_apply in src/datasight/tidy_review.py, mirroring the existing schema sync. Wired into both the CLI apply path (cli_commands/tidy.py:865) and the web Apply endpoint (web/app.py:3387).

Behavior per disposition mode

  • keep: existing source entries stay valid; append stub for the long form's value column.
  • rename: rewrite table field on entries from the source name to rename_to (columns themselves unchanged); append value stub on the new long form.
  • replace / drop: drop entries whose columns no longer exist on the source table; append value stub on whichever table ends up holding the long form.

Seeded value entries are intentionally minimal (table + column only) so missing fields fall back to inference defaults at read time. Re-applying the same proposal is idempotent.

CLI behavior matches schema.yaml: no-op if the file is absent. Web Apply uses create_if_absent=True so an explicit user click persists across restarts.

Test plan

  • 7 new unit tests covering all 4 disposition modes + no-file no-op + create-if-absent + idempotent reseed
  • pytest -m "not integration" passes (1488 tests, +7 new)
  • prek run --all-files passes
  • Manual reseed of the dsgrid load_data reshape to verify stale entries clear

🤖 Generated with Claude Code

daniel-thom and others added 2 commits May 10, 2026 10:05
Wide-form measure columns are pivoted into a single value column on the
long form, so any pre-existing measure overrides that target those
source columns become stale (the columns no longer exist on the
relevant table). The apply pipeline previously only synced schema.yaml,
leaving measures.yaml pointing at columns that no longer exist.

Adds update_measures_yaml_for_apply, mirroring the existing schema
sync. Per disposition mode:

- keep: existing entries on source stay valid; append a stub for the
  long form's value column.
- rename: rewrite `table` field on entries from the source name to the
  rename_to name (the columns themselves are unchanged); append value
  stub on the new long-form table.
- replace / drop: drop entries whose columns no longer exist on the
  source table after the reshape; append value stub on whichever table
  ends up holding the long form.

Seeded value entries are intentionally minimal (table + column only) so
missing fields fall back to inference at read time. Re-applying the
same proposal is idempotent.

Wired into both the CLI apply path (cli_commands/tidy.py) and the web
Apply endpoint (web/app.py), matching the create_if_absent semantics
already in place for schema.yaml.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The new update_measures_yaml_for_apply is at complexity 11, and the
existing _apply_review_proposals went from 11 to 12 once the new
measures.yaml sync block was added. Both follow the established
convention from the C901 enforcement PR: existing complex functions
get tagged so the rule applies to new code only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the tidy review apply pipeline to keep measures.yaml consistent with wide-to-long reshapes, mirroring the existing schema.yaml sync so measure overrides don’t keep pointing at pivoted-away wide columns.

Changes:

  • Added update_measures_yaml_for_apply in src/datasight/tidy_review.py to prune stale measure override entries and seed a stub for the new long-form value column.
  • Wired measures syncing into both the CLI apply path and the web /api/tidy/apply endpoint (with create_if_absent=True for the web path).
  • Added 7 unit tests covering disposition modes, idempotency, and create-if-absent behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
src/datasight/tidy_review.py Adds the measures.yaml sync helper and exports it.
src/datasight/cli_commands/tidy.py Calls the new helper after successful applies (CLI semantics: no-op if missing).
src/datasight/web/app.py Calls the new helper in the web Apply endpoint (web semantics: create file if absent).
tests/test_tidy_review.py Adds unit tests for update_measures_yaml_for_apply.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/datasight/tidy_review.py Outdated
Comment thread src/datasight/tidy_review.py Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented May 10, 2026

Codecov Report

❌ Patch coverage is 82.05128% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.87%. Comparing base (3282c3b) to head (8a7f23b).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/datasight/tidy_review.py 86.56% 9 Missing ⚠️
src/datasight/cli_commands/tidy.py 57.14% 3 Missing ⚠️
src/datasight/web/app.py 50.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #68      +/-   ##
==========================================
- Coverage   86.90%   86.87%   -0.04%     
==========================================
  Files          61       61              
  Lines       11462    11539      +77     
==========================================
+ Hits         9961    10024      +63     
- Misses       1501     1515      +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

daniel-thom and others added 2 commits May 10, 2026 10:48
Mirrors what `datasight generate` writes: run `_infer_measure_semantics`
against the value column with the long form's id + dimension columns as
siblings, then emit all fields (role, default_aggregation,
allowed_aggregations, etc.). Functionally equivalent at read time but
gives the user visible fields to edit inline.

The dtype assumption (DOUBLE) is safe because DuckDB UNPIVOT widens
the value column to the broadest numeric type of the mapped columns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace mode now also rewrites pre-existing entries that point at the
long form's intermediate target_object_name, since that name disappears
when the long form is renamed to take over the source's name. Drop mode
now removes every entry referencing the dropped source table, not just
entries on mapped pivot columns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

@daniel-thom daniel-thom merged commit edeaafd into main May 10, 2026
12 checks passed
github-actions Bot pushed a commit that referenced this pull request May 10, 2026
Sync measures.yaml after tidy review apply
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants