Open
Validates that the ECMWF IFS ENS MARS backfill data hosted on source.coop can be downloaded via byte-range requests and read correctly through rasterio. Tests all 20 template variables (14 sfc + 6 pl) for correct shape and finite values. The test handles both the old JSON array and new JSON-lines index formats, normalizing field names to the open data convention.
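As a rough illustration of the index handling described above, here is a minimal sketch that accepts either a single JSON array or JSON lines, renames fields to open-data-style keys, and builds the byte-range header for one GRIB message. The names `parse_grib_index`, `range_header`, and `FIELD_ALIASES` are hypothetical, and the MARS-side key names in the alias table are assumptions, since the PR does not show the actual index contents.

```python
import json

# Hypothetical mapping from MARS-style index keys to open-data-style keys.
# The real key names used in the staging indexes are not shown in this PR.
FIELD_ALIASES = {"offset": "_offset", "length": "_length", "shortName": "param"}


def parse_grib_index(text: str) -> list[dict]:
    """Parse an index given either as one JSON array or as JSON lines."""
    stripped = text.strip()
    if not stripped:
        return []
    if stripped.startswith("["):
        # Old format: the whole file is a single JSON array of records.
        records = json.loads(stripped)
    else:
        # New format: one JSON object per non-empty line.
        records = [json.loads(line) for line in stripped.splitlines() if line.strip()]
    # Rename known aliases; leave every other key untouched.
    return [{FIELD_ALIASES.get(k, k): v for k, v in rec.items()} for rec in records]


def range_header(record: dict) -> dict[str, str]:
    """Build an HTTP Range header covering exactly one GRIB message."""
    start = record["_offset"]
    end = start + record["_length"] - 1  # the end of an HTTP byte range is inclusive
    return {"Range": f"bytes={start}-{end}"}
```

The resulting header can be passed to any HTTP client that supports range requests; the bytes returned are a single GRIB message that can then be opened with rasterio.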
Enable the reformatter to fetch historical data from the MARS staging bucket (source.coop) for init times before 2024-04-01, routing to ECMWF open data for later dates.

Key changes:
- Add mars_grib_index_param and mars_read_scale_factor to EcmwfInternalAttrs for z→gh conversion (geopotential to geopotential height)
- Extend grib index parsing with step filtering and missing column handling for MARS indexes (all steps in one file, cf-only indexes lack number column)
- Add MARS download path in region_job using S3 byte-range downloads
- Skip GRIB metadata assertions for MARS source (different field descriptions)
- Add test_backfill_local_mars_source integration test running the full pipeline on 2016-03-08 MARS data, verifying temperature and precipitation output including deaccumulation
mrshll commented Mar 12, 2026
    grib_index_level_type="pl",
    grib_index_level_value=925,
    keep_mantissa_bits=11,
    mars_grib_index_param="z",
I don't like this pattern -- is there a better way to declare these changes over time as internal conventions change?
mrshll commented Mar 12, 2026
    f"{grib_comment=} != {data_var.internal_attrs.grib_comment=}"
    # MARS GRIBs have different comment/description metadata than open data,
    # so we only validate these fields for open data sources.
    if not coord.is_mars_source():
I don't love this branching either -- is there a more "native" way to declare changes over the lifetime of the archive, or is it just internal logic like this?
Summary
- Add mars_grib_index_param and mars_read_scale_factor to EcmwfInternalAttrs for handling MARS-specific differences (e.g. geopotential z → geopotential height gh conversion)

Test plan
- test_read_mars_staging_data: validates all 20 variables are readable from source.coop at a single step/member
- test_backfill_local_mars_source: runs the full reformatter pipeline (backfill_local) on MARS data, verifying temperature and precipitation output including deaccumulation
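
For reference, the z → gh conversion named in the summary is a division by standard gravity. The sketch below assumes mars_read_scale_factor is the multiplicative factor 1/g0; the PR does not show the value it actually uses, and `geopotential_to_height` is a hypothetical helper name.

```python
STANDARD_GRAVITY = 9.80665  # m s^-2, standard acceleration of gravity

# Assumed value: the PR does not show the constant it actually uses.
MARS_READ_SCALE_FACTOR = 1.0 / STANDARD_GRAVITY


def geopotential_to_height(z: float) -> float:
    """Convert geopotential (m^2 s^-2) to geopotential height (m)."""
    return z * MARS_READ_SCALE_FACTOR
```

Under this assumption, a geopotential of 9806.65 m^2 s^-2 corresponds to a geopotential height of 1000 m.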