
Phase 25: GOES ABI L2 satellite ingest (mostlyrightmd-weather[satellite], free local tier)#78

Open
helloiamvu wants to merge 51 commits into main from phase25/integration


Conversation

helloiamvu (Member) commented Jun 18, 2026

Phase 25 — GOES ABI L2 Satellite Ingest (free local tier)

Ports the battle-tested GOES-16/19 ABI Level-2 single-pixel extractor from the old monorepo (Tarabcak/mostlyright@sprint2/goes-satellite, the "2i" generation) into the SDK as a new optional extra mostlyrightmd-weather[satellite], mirroring the [nwp] extra and the forecast_nwp() pipeline shape. Python-only; ships the FREE local tier only (no hosted backend, no paid adapter — reserved via the schema's delivery lineage column).

34 files changed, +11,260 / −15. Base: origin/main (v1.7.0, 98028a9).

What's in it (5 waves)

  • W0 — schema/exceptions/packaging: SatelliteSchema (schema.satellite.v1, source identity noaa_goes, per-row source overlay, delivery {live,hosted} lineage, qc_status, nullable as_of_time, ICAO ^[A-Z]{4}$ hook), SatelliteError + 5 rehomed Goes* exceptions, [satellite] extra (boto3/s3fs/gcsfs/h5netcdf/xarray/numpy), codegen wired with the standard .dev $id.
  • W1 — extractor engine (_fetchers/_goes_extract.py): byte-faithful port of the PRODUCTS registry + _apply_scale_offset/_apply_valid_range + ABI projection inversion + dual-projection DSRF branch. All 5 load-bearing NOAA quirks preserved (LST [100,350]K floor, Cloud_Probabilities valid_range_filter=False, _Unsigned DSR int16→uint16, ACM units='', per-satellite DSRF grid split). Only intended edits: station.icao identity, delivery/qc_status/as_of_time stamping, exception rehome, and the D5 units-mismatch suspect-and-continue (annotate-never-drop).
  • W2 — transport + cache (_fetchers/_goes_s3.py, cache.py): AWS (anon s3fs) / GCP (anon gcsfs) mirror switch, single full-object read into BytesIO (D3 — not lazy fs.open, no byte-range), size cap + shape validation, parquet cache tier (~/.mostlyright/cache/satellite/..., filelock, atomic write), _dedup_satellite_rows first-seen-wins.
  • W3 — public satellite() fetcher: mirror enum validated pre-I/O, lazy-import guard (s3fs + gcsfs) with SourceUnavailableError hint, leakage wiring (event-time scan_start/end vs knowledge-time as_of via KnowledgeView), df.attrs['source']='noaa_goes' + per-row overlay, DSRF one-time gating warning.
  • W4 — fleet backfill CLI + throughput probe (satellite/_backfill.py, _probe.py, __main__.py): array-job-friendly (satellite,year,month) slices, full-identity crash-safe resume, Thread/Process executor split, --mirror, named probe-derived concurrency constants. The live throughput probe is @pytest.mark.live (CI-excluded) and was not run.

Tests & review

  • Full non-live Python suite green (pytest -m "not live"), ≥80% line coverage on the new modules (measurable lane in scripts/satellite_coverage.sh works around the numpy/C-tracer reload conflict). TS pre-push gate (typecheck + vitest) green.
  • Reviewed by an internal multi-agent loop (3 iterations × 4 dimensions) plus 3 rounds of codex review. Fixed across rounds: 2 P1 (unpicklable ProcessPool worker; resume key dropping product/station → silent data loss) + 5 P2 (--out ignored; 3D-profile shape-gate axis order; event-time window filtering; current/future slice over-completion; retrieved_at stamping). Gate currently PASS (no P1).

Known follow-ups (codex P2 — backfill↔cache integration depth, not blockers)

  1. Backfill caches raw rows, not finalized rows (_backfill.py) — cached partitions miss source/event_time/knowledge_time/retrieved_at/qc_status, so backfill output shape ≠ live satellite() shape. Most important — finalize backfill rows through the same path as live before this is used for training.
  2. satellite(cache=True) does not read cached partitions (satellite/__init__.py) — the public read path always re-fetches; wire cache reads for elapsed months.
  3. Progress lock not crash-recoverable (_backfill.py) — a killed run strands later resumes with ProgressLockBusy; add dead-PID / stale-lock recovery.
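
For follow-up 3, one possible shape for dead-PID recovery is sketched below. This is not code from the PR: it assumes a POSIX host and a lock file whose content is `<pid> <host>` (the PR only says the lock records PID + host, not the format), and the helper names `lock_is_stale` and `acquire_lock` are hypothetical.

```python
import os
import socket


def lock_is_stale(lock_path: str) -> bool:
    """True only when the lock was taken on this host by a PID that no longer exists."""
    try:
        with open(lock_path, encoding="utf-8") as fh:
            pid_text, host = fh.read().split(maxsplit=1)
        pid = int(pid_text)
    except (OSError, ValueError):
        return False  # unreadable or malformed lock: do not guess, leave it in place
    if pid <= 0 or host.strip() != socket.gethostname():
        return False  # a PID on another machine cannot be probed from here
    try:
        os.kill(pid, 0)  # signal 0 checks existence without delivering a signal (POSIX)
    except ProcessLookupError:
        return True
    except PermissionError:
        return False  # the process exists but belongs to another user
    return False


def acquire_lock(lock_path: str) -> bool:
    """Single-writer lock via O_CREAT|O_EXCL, with one stale-lock retry."""
    for _ in range(2):
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if lock_is_stale(lock_path):
                os.unlink(lock_path)  # still racy if two resumers recover at once
                continue
            return False
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(f"{os.getpid()} {socket.gethostname()}")
        return True
    return False
```

The unlink-then-retry step is the weak point: a production version would likely need a second guard (for example a rename to a unique name) so that two concurrent resumers cannot both decide the lock is stale.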

Notes

  • Commit history is verbose (49 commits, full TDD RED/GREEN). Wave 0's original commits landed on a stray branch and the surface was rebuilt + recovered by the review loop, so history isn't perfectly linear. Recommend squash-merge.
  • TS parity: the TS satellite reader is an explicitly deferred parity ticket (future h5wasm impl) per CROSS-SDK-SYNC.md — no TS changes here.

python_only: true — satellite ingest ships Python-first; the TS satellite reader is a deferred h5wasm parity ticket per CROSS-SDK-SYNC.md, so this PR carries no TypeScript surface change.

  • Scope deferred: paid/hosted adapter, the 28 TB fleet backfill run, DSRF-at-scale productionization.

🤖 Generated with Claude Code

minereda and others added 30 commits June 18, 2026 12:19
…ptions + ICAO hook

- Add [satellite] optional extra to packages/weather (boto3/s3fs/gcsfs/h5netcdf/xarray/numpy/pandas)
- Rehome 2i GOES typed exceptions into core.exceptions (SatelliteError base +
  GoesS3Error/GoesDataCorruptError/StationOutOfGridError/ProductNotRegisteredError/UnitsContractError)
- Add core.schemas.satellite.validate_satellite_station (ICAO ^[A-Z]{4}$ hook)

[Rule 3 - Blocking] 25-01 dependency + Wave 0 [satellite] extra never landed in this
repo (stale 25-01-SUMMARY describes a prior run whose commits do not exist); created the
minimal 25-01 surface this Wave 1 port consumes so the verbatim extractor port can import
its exceptions + ICAO hook.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…y quirk)

Task 1: 6 quirk fixtures in conftest.py (LST [100,350]K floor, Cloud_Probabilities
physical-units valid_range, _Unsigned DSR int16->uint16, ACM no-units, DSRF
dual-projection goes16 lat/lon + goes19 ABI, multi-var units-mismatch) + fixture
smoke tests. All in-memory, zero network, zero checked-in binary NetCDF. Mirrors
test_forecast_nwp.py's in-memory dataset stubbing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…_scan_times

Task 2 RED gate: failing tests for the registry (all products + grid_shape_expected +
DSRF 5424x5424 split + Cloud_Probabilities valid_range_filter=False + LST [100,350]K +
ACM units=''), the ABI scan-angle projection (exactness vs independent forward + out-of-grid
StationOutOfGridError), the regular lat/lon DSRF branch, and stdlib-only parse_scan_times.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…scan_times (byte-faithful)

Task 2 GREEN: create _fetchers/_goes_extract.py with the verbatim 2i extraction
engine — ProductVariable + full PRODUCTS registry (incl grid_shape_expected per
product, every load-bearing NOAA comment preserved), get_product_variable/
products_in_tier/variables_for_product, the ABI scan-angle projection
(_read_projection_params/_read_grid_params/latlon_to_abi_xy/compute_pixel_indices),
the regular lat/lon DSRF branch (_read_lat_lon_*/latlon_to_ll_pixel), and
stdlib-only _parse_goes_ts/parse_scan_times. _KNOWN_PRODUCTS derived for the
public fetcher. Exceptions import from mostlyright.core.exceptions (the ONLY
coupling severed). The decode + record-build functions ship in the same module
file (exercised by Task 3).

Byte-faithful: per-file ruff ignore (RUF001/002/003/046) preserves the verbatim
NOAA comments' Unicode and the verbatim int(round()) projection arithmetic
without altering the port.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…spect-continue

Task 3: cover _apply_scale_offset (_Unsigned DSR positive + signed-negative sanity +
_FillValue->NaN), _apply_valid_range (Cloud_Probabilities filter=False pass-through +
LST [100,350]K floor survives), _read_pixel_dqf (None path + declared-but-missing raises),
and _extract_from_dataset: ACM units='' quirk, DSRF dual-projection routing (lat/lon vs
ABI), _FillValue->pixel_value None, ICAO build (station=KNYC, delivery/qc_status/as_of_time
present, no source column) + non-ICAO loud reject, the P2-c units-mismatch
suspect-and-continue (both vars emit rows, no UnitsContractError, none dropped),
3D-profile pressure-loop, missing-variable skip, no-projection raise, and the registry
helpers. Module coverage 90% (>=80% gate).

conftest/test use plain optional numpy/xarray imports + pytestmark skip (not top-level
importorskip) to avoid a coverage+importorskip double-import skip on Python 3.14.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
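
The `_Unsigned` DSR quirk exercised by the tests above comes down to reinterpreting a stored int16 bit pattern as uint16 before applying scale and offset. A minimal pure-Python sketch of that decode for a single sample follows; the function name `decode_packed` and the sample attribute values are illustrative only, and the real port operates on NetCDF arrays rather than scalars.

```python
import math
from typing import Optional


def decode_packed(
    raw: int,
    scale_factor: float,
    add_offset: float,
    fill_value: Optional[int] = None,
    unsigned: bool = False,
) -> float:
    """Decode one packed int16 sample into physical units."""
    # Assumption: _FillValue is compared against the stored (signed) value.
    if fill_value is not None and raw == fill_value:
        return math.nan
    if unsigned:
        raw &= 0xFFFF  # reinterpret the int16 bit pattern as uint16
    return raw * scale_factor + add_offset
```

Without the reinterpretation, a stored `-1` decodes to a small negative number instead of the intended large positive one, which is why the tests assert a positive DSR value alongside a signed-negative sanity case.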
…-cap/shape gates

- assert _get_s3_client UNSIGNED us-east-1, thread-local anon s3fs
- assert D9 gcp branch selects gcsfs(token='anon') + gcp-public-data buckets
- assert single full-object read into BytesIO (cat_file once, no lazy fs.open to xarray)
- assert size-cap rejects pre-read + grid_shape_expected validation post-open (both mirrors, DSRF split)
- assert available_since clamp + list_product_keys captures Size on both mirrors

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e full-object read

- AWS (boto3 UNSIGNED + thread-local anon s3fs) + GCP (anon gcsfs token=anon) provider switch
- _get_fs/_get_buckets threaded through list_product_keys + extract_pixel; ValueError on unknown mirror
- D3: _read_full_object reads the ENTIRE object via fs.cat_file (fallback .read()) into BytesIO;
  xr.open_dataset(io.BytesIO, engine=h5netcdf, mask_and_scale=False, decode_times=False) — never lazy fs.open
- P2-d: per-product size cap rejects pre-read; _validate_dataset_shape rejects post-open (DSRF goes16/goes19 split)
- available_since clamp (goes16 2017-05-24 / goes19 2024-11-15) on both mirrors; list_product_keys captures Size
- _open_and_extract is the SHARED mirror-agnostic pipeline; only fs+bucket differ

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ositions

- _dedup_satellite_rows first-seen-wins on the 6-tuple; mirror-invariant collapse
- _validate_satellite_record returns dispositions (clean vs findings), never quarantine
- units/3D-pressure/physics-bounds/dqf-nullability/source-key checks + M4 scan-time + _FillValue-clean

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…date dispositions

- _dedup_satellite_rows: first-seen-wins on (station,satellite,product,variable,pressure_level_hpa,scan_start_utc)
- key has no mirror component (D9) — same scan from AWS/GCS collapses to one row
- _validate_satellite_record: returns list[Finding] (annotate-never-drop, D5), never quarantine
- ports 2i semantic checks: units/3D-pressure/physics-bounds/dqf-nullability/source-key + M4 scan-time
- pixel_value=None (_FillValue) is a clean data condition; registry via lazy import (no core->weather cycle)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
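
The first-seen-wins rule on the 6-tuple named in this commit can be sketched over plain dict rows as below. The key fields are taken from the commit message; the function name `dedup_first_seen` and the dict-row representation are assumptions for illustration.

```python
from typing import Any

# Dedup key as listed in the commit message; note there is no mirror component.
DEDUP_KEY = (
    "station",
    "satellite",
    "product",
    "variable",
    "pressure_level_hpa",
    "scan_start_utc",
)


def dedup_first_seen(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep the first row for each dedup key, preserving input order."""
    seen: set[tuple] = set()
    kept: list[dict[str, Any]] = []
    for row in rows:
        key = tuple(row.get(name) for name in DEDUP_KEY)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept
```

Because the key carries no mirror field, the same scan fetched from AWS and from GCS produces identical keys and collapses to one row.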
…riant + roundtrip)

- satellite_cache_path layout v1/satellite/{sat}/{product}/{station}/{YYYY}/{MM}.parquet, no mirror segment
- P2-e: reject bad station (ICAO), bad satellite (enum + no sep), bad product (_KNOWN_PRODUCTS + no sep)
- write empty no-op, write->read roundtrip, merge-dedup-on-existing-partition, invalidate, A6 current-month skip

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ite + dedup-on-merge

- satellite_cache_path: v1/satellite/{sat}/{product}/{station}/{YYYY}/{MM}.parquet, NO mirror segment (D9)
- P2-e: validate station (ICAO) + satellite (enum, no sep) + product (_KNOWN_PRODUCTS, no sep) + assert_path_under
- read/write/invalidate_satellite mirroring the forecast tier; current-UTC-month skip (A6)
- write reads-existing -> concat -> _dedup_satellite_rows -> _atomic_write (single chokepoint); empty no-op
- collapses the 2i staging->merge->R2 dance to one direct per-partition write; no staging, no R2

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
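
The partition layout and the P2-e component checks described above could look roughly like the sketch below. The layout string and the ICAO pattern come from the commit message; the satellite set, the `known_products` argument (a stand-in for `_KNOWN_PRODUCTS`), and the function name `satellite_partition_path` are assumptions, and the real code additionally runs `assert_path_under`.

```python
import re
from pathlib import PurePosixPath

_ICAO_RE = re.compile(r"^[A-Z]{4}$")
_SATELLITES = frozenset({"goes16", "goes19"})  # assumed from the satellites named in the PR


def satellite_partition_path(
    root: str,
    satellite: str,
    product: str,
    station: str,
    year: int,
    month: int,
    known_products: frozenset,
) -> PurePosixPath:
    """Build v1/satellite/{sat}/{product}/{station}/{YYYY}/{MM}.parquet under root."""
    if satellite not in _SATELLITES:
        raise ValueError(f"unknown satellite: {satellite!r}")
    if product not in known_products or "/" in product or "\\" in product:
        raise ValueError(f"unknown or unsafe product: {product!r}")
    if not _ICAO_RE.fullmatch(station):
        raise ValueError(f"station is not a 4-letter ICAO code: {station!r}")
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month: {month!r}")
    # No mirror segment: the partition is the same regardless of AWS or GCP.
    return (
        PurePosixPath(root) / "v1" / "satellite" / satellite / product / station
        / f"{year:04d}" / f"{month:02d}.parquet"
    )
```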
…cer + DSRF gating

- cheap-validation (product/satellite/date/mirror) before I/O + lazy-guard
- D9 mirror threaded to transport; gcsfs covered by SourceUnavailableError guard
- ICAO resolve via _resolve_station_infos (alias dedup, skip-unknown)
- qc_status worst-wins reducer + units-contract->suspect + annotate-never-drop
- DSRF one-time gating warning (mirror-agnostic)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…qc reducer + DSRF gating

- satellite/ is a PACKAGE re-exporting satellite(); weather pkg re-exports it
- cheap validation (product/satellite/date/mirror) raises ValueError before any
  I/O and before the lazy-import guard (forecast_nwp.py:710 parity, D9)
- lazy-import guard covers boto3/s3fs/gcsfs/h5netcdf/xarray ->
  SourceUnavailableError with [satellite] install hint (gcsfs for the GCP mirror)
- mirror threaded transport-only to list_product_keys/extract_pixel (D9);
  source identity stays noaa_goes, no mirror row column
- ICAO resolve via _resolve_station_infos (alias dedup, skip-unknown)
- qc_status worst-wins reducer (severity inversion) + units-contract->suspect
  defensive boundary + annotate-never-drop (no quarantine)
- DSRF one-time gating warning (mirror-agnostic), steers to backfill CLI
- backend/return_type validated up-front, _maybe_wrap_satellite mirrors NWP wrap

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
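
A worst-wins reducer over `qc_status` values can be sketched as follows. The three status names come from the schema enum in this PR; the severity ordering (clean < flagged < suspect) and the function name `reduce_qc_status` are assumptions, since the commit message does not spell the ranking out.

```python
# Assumed severity ranking; higher means worse.
_SEVERITY = {"clean": 0, "flagged": 1, "suspect": 2}


def reduce_qc_status(statuses) -> str:
    """Return the most severe status seen; an empty input reduces to 'clean'."""
    worst = "clean"
    for status in statuses:
        if status not in _SEVERITY:
            raise ValueError(f"unknown qc_status: {status!r}")
        if _SEVERITY[status] > _SEVERITY[worst]:
            worst = status
    return worst
```

The row is annotated with the reduced status and still emitted, which matches the annotate-never-drop rule: a `suspect` finding changes the label, never the row count.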
…KnowledgeView

- event_time=scan_start (tz-aware UTC); knowledge_time=ingested_at else fetch ts
- per-row source='noaa_goes' + df.attrs['source']='noaa_goes' (validator needs both)
- D9 mirror-invariant identity (no mirror column/attr on either frame)
- typed as_of filtering via KnowledgeView (TimePoint|datetime|None); naive rejected
- assert_no_leakage accepts the frame; LeakageDetector raises on a leaking row

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ve, variable filter, wrapper wrap

- guarded pandas import (Py3.14 + coverage numpy double-load avoidance)
- module __getattr__ rejects unknown attrs; transport names resolvable
- empty station list -> empty frame, no I/O, source attr still stamped
- variable= filter keeps only the requested variable
- return_type=wrapper routes through _maybe_wrap_satellite non-default path

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…, executor split, mirror, probe-derived constants)

- Per-(satellite,product,station,year,month) slice -> write_satellite_cache (no staging/R2)
- D9 mirror thread-through (transport-only, cache partition mirror-invariant)
- available_since clamp skip; ICAO resolve; Thread/Process executor split
- FIX-2 provenance lock: _GOES_S3_RATE_HZ + _DEFAULT_MAX_WORKERS vs SOURCE-LIMITS.md

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ead/Process split, mirror, probe-derived constants

- backfill_goes_satellite: per-(sat,product,station,year,month) slice -> direct write_satellite_cache (D8, no staging/upload)
- bulk_backfill: ICAO resolve + slice enumeration + Thread/Process executor split (D7)
- D9 mirror threaded into every list_product_keys/extract_pixel; cache partition mirror-invariant
- available_since clamp skips pre-availability slices with no I/O
- FIX-2: _GOES_S3_RATE_HZ + _DEFAULT_MAX_WORKERS named constants with SOURCE-LIMITS.md provenance comments (NOMADS_CONCURRENCY_CAP-style); package-co-located conservative-pending seed
- ProgressLockBusy/ProgressCorrupt SatelliteError subclasses; result dataclasses

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…LI (--mirror, path hardening)

- resume skip / --no-resume (locks but no read/write) / failed-not-marked
- atomic-barrier (os.sync before mark, tmp+parent fsync, os.replace) / .bak fallback / both-torn-loud
- key+value validation (malformed key, bad value, invalid month rejected)
- single-writer O_CREAT|O_EXCL lock (double-start ProgressLockBusy, PID+host, released in finally)
- CLI backfill dispatch + --mirror aws|gcp (default aws, invalid rejected) + --no-resume + --executor + P2-e malicious satellite/product rejection

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
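
The durable-write pattern these tests describe (temp file, fsync, `os.replace`, parent-directory fsync, `.bak` fallback) can be sketched with the standard library as below. This is an illustration on a POSIX host, not the PR's implementation: the function names, the JSON format, and the choice to treat a missing file as empty progress are all assumptions.

```python
import json
import os
import shutil


def write_progress_atomic(path: str, progress: dict) -> None:
    """Write progress so a crash leaves either the old or the new file, never a torn one."""
    if os.path.exists(path):
        shutil.copy2(path, path + ".bak")  # keep the last good copy for fallback
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(progress, fh, sort_keys=True)
        fh.flush()
        os.fsync(fh.fileno())  # file contents reach disk before the rename
    os.replace(tmp, path)  # atomic swap on the same filesystem
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)  # make the rename itself durable
    finally:
        os.close(dir_fd)


def load_progress(path: str) -> dict:
    """Read progress, falling back to .bak; fail loudly if both are unreadable."""
    if not os.path.exists(path) and not os.path.exists(path + ".bak"):
        return {}  # assumption: first run starts with empty progress
    errors = []
    for candidate in (path, path + ".bak"):
        try:
            with open(candidate, encoding="utf-8") as fh:
                return json.load(fh)
        except (OSError, ValueError) as exc:
            errors.append(f"{candidate}: {exc}")
    raise RuntimeError("progress file and backup both unreadable: " + "; ".join(errors))
```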
…or, P2-e path hardening)

- __main__.py argparse CLI: backfill subcommand (--satellites/--products/--stations/--year-start/--year-end/--out/--max-workers/--resume/--no-resume/--executor/--mirror) + probe subcommand stub dispatch
- --mirror aws|gcp via argparse choices (default aws; unknown rejected pre-run); threaded to bulk_backfill(mirror=...)
- P2-e: _validate_partition_components rejects malicious --satellites/--products at the boundary (reuses cache enum + _KNOWN_PRODUCTS + no-path-separator) before any mkdir/write
- resume skip/--no-resume(lock-only)/failed-not-marked + fsync-durable .bak progress + O_CREAT|O_EXCL lock all GREEN

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- run_probe mockable measurement loop (synthetic per-N sweep, network-free in CI)
- derive_max_workers (knee=8 on a flattening curve) + derive_rate_cap (finite, deterministic)
- FIX-2 BOTH artifacts: findings (per-N table + Summary + Scope + derived recs) AND SOURCE-LIMITS satellite section
- provenance round-trip: read_source_limits_satellite returns the written values; parses the package seed
- probe CLI dispatch (network-free) + @pytest.mark.live real-NOAA probe (CI-excluded)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…derives shipped constants

- run_probe: injectable measurement loop (ListObjectsV2 latency + single-file throughput + 1/4/8/16/32 concurrency sweep); live funcs hit NOAA, CI injects synthetic inputs (network-free)
- derive_max_workers: diminishing-returns/throttle knee; derive_rate_cap: knee_N/p50 floored at conservative-pending — both deterministic
- FIX-2 BOTH artifacts: findings (Summary + per-N table + Scope + derived recs + re-run cmd) AND idempotent SOURCE-LIMITS.md satellite section
- read_source_limits_satellite reader (table + bullet-seed forms) closes the provenance round-trip the Task-1 lock depends on
- live real-NOAA probe stays @pytest.mark.live (CI-excluded)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…tions

- satellite.md: cheap-CONUS (ACMC) steering, DSRF gating, 28TB/whole-file/near-data reality
- --mirror aws|gcp documented
- max_workers + S3 rate cap as probe-DERIVED constants (probe pointer + SOURCE-LIMITS; no bare UNTUNED)
- README satellite section pointer + [satellite] install line

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- docs/satellite.md: cheap-CONUS (ACMC) steering, DSRF gating, leakage/qc, cache, --mirror aws|gcp
- 28TB/whole-file-not-byte-range/near-data-in-region reality; deferred-paid-adapter note (shared noaa_goes identity + delivery lineage)
- max_workers + S3 rate cap documented as probe-DERIVED constants (probe pointer + SOURCE-LIMITS provenance) — bare UNTUNED caveat replaced
- README GOES satellite (Phase 25) section: doc pointer + [satellite] install line

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Apply _dedup_satellite_rows to the assembled row list before
_assemble_dataframe, mirroring cache.write_satellite_cache. NOAA reprocesses a
scan under a new creation-time token with identical scan_start; both keys list
+ extract to rows sharing one 6-tuple dedup key. The live path now collapses
them, honoring the documented deduped first-seen-wins invariant instead of
silently double-counting a scan.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…(P2-b)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add the full schema.satellite.v1 Schema subclass (singular
_registered_source='noaa_goes' per D2; delivery {live,hosted} + qc_status
{clean,flagged,suspect} enums; nullable as_of_time; the 18 ported 2i fields +
the per-row source overlay the validator requires). Register it in
core/schemas/__init__.py and add schema.satellite.v1 to the codegen
_GROUP_A_SCHEMA_IDS so schemas/json/schema.satellite.v1.json is emitted
deterministically (.dev namespace). The source-identity validator now
reconciles df.attrs['source'] AND the per-row source column for satellite rows.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…2-b)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ion (P2-b)

Wire validate_dataframe(df, 'schema.satellite.v1') into the live path via
_validate_against_schema, run on a typed projection copy (RFC3339-Z string scan
times coerced to tz-aware datetimes; all-null nullable-float columns coerced to
float64) so the returned frame stays byte-faithful. The validator's
source-identity invariant (df.attrs['source'] AND the per-row source column both
'noaa_goes') now actually executes — a tampered source raises loudly.
Annotate-never-drop suspect sentinel rows (no parseable scan_start) are excluded
from the strict dtype/null check but still ship (D5).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…dges (P2)

Adds the missing-coverage tests flagged in review: (1) run_probe twice into a
SOURCE-LIMITS.md that already has a satellite section AND an unrelated '## AWC'
section — asserts the satellite section is replaced (not duplicated) and the AWC
section survives (the module's headline don't-clobber safety claim); (2)
derive_max_workers empty-sweep floor + errors-before-flatten break, derive_rate_cap
degenerate p50<=0 floor; (3) run_probe(mirror='azure') ValueError; (4)
read_source_limits_satellite None branches. _probe.py coverage 74% -> 100%.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
minereda and others added 19 commits June 18, 2026 15:15
…tion (P2)

Network-free tests for the loud-failure branches carrying the GoesS3Error
contract: _list_aws fail-fast (_S3_FAIL_FAST_CODES -> immediate raise, no sleep)
+ retryable ClientError/EndpointConnectionError (retry _MAX_S3_RETRIES then
raise, backoff asserted); _list_gcp FileNotFoundError (empty-hour -> []) +
OSError retry; extract_pixel transient retry-then-success + retry-exhaustion. A
zero-interval _NOOP_LIMITER isolates backoff sleeps from limiter pacing.
_goes_s3 coverage 81% -> 94%.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…FileLock

P2 review fix: the existing-partition read (pq.read_table) ran OUTSIDE the
FileLock that _atomic_write acquired, so the lock serialized only the write
half. Two writers targeting the same (satellite, product, station, year,
month) partition could both read the same existing rows and the second
os.replace would clobber the first writer's rows (lost update) — the exact
hazard that distinguishes the read-modify-write satellite tier from the
overwrite-only forecast tier.

Factor the inner write out of _atomic_write into _write_table_unlocked
(assumes the caller holds the lock) and wrap the whole
read->concat->_dedup_satellite_rows->write sequence in write_satellite_cache
under a single FileLock acquisition. Correct the docstring that previously
claimed the merge was a single atomic chokepoint.

Add test_write_satellite_cache_serializes_read_under_lock: with an external
holder owning the partition lock, the write blocks on acquisition before the
read fires (asserts the read tripwire never executes and the acquire times
out).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…readers

P2 review fix: the 'malformed NetCDF attrs -> GoesDataCorruptError' paths
were only half-covered. Tests exercised the no-projection-variable case but
never constructed a dataset WITH the projection/grid/latlon variable present
but MISSING a single required attr, so the per-attr raise branches in
_read_projection_params (line 421/432), _read_grid_params (line 443/448), and
_read_lat_lon_grid (line 541/547) were unexercised.

Add TestMalformedProjectionAttrs: drop semi_major_axis from
goes_imager_projection, drop x.scale_factor from the grid, and drop
lat.add_offset from the lat/lon grid — each asserts GoesDataCorruptError with
the expected message. Attrs are copied before mutation so the shared fixture
dicts are not corrupted across tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
P2 review fix: coverage could not be MEASURED for _goes_extract, _goes_s3,
_internal/merge/satellite, and core/schemas/satellite. The default
coverage-gate runs --cov-branch, which forces coverage's C tracer; any
subprocess importing numpy/pandas under the C tracer raises
'numpy: cannot load module more than once per process', so these four modules
were skipped and the >=80% gate on them was inferred, not proven.

Add a dedicated lane that measures LINE coverage under the sys.monitoring
(sysmon) backend (branch=false in .coveragerc-satellite, COVERAGE_CORE=sysmon),
which does not trip the numpy reload. Uses path-based include globs (the
modules import lazily inside test bodies; dotted source does not attach
reliably under sysmon) and runs the four satellite test files in a single
process so numpy loads exactly once. scripts/satellite_coverage.sh makes it
reproducible locally; a new satellite-coverage CI job runs it.

Measured: _goes_extract 95%, _goes_s3 96%, merge/satellite 86%,
schemas/satellite 100% (TOTAL 94%) — all above the 80% floor, now provable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Reproduce all four confirmed review findings with failing tests before fixing:

- P1-1: assert the ProcessPool worker + its submitted item are picklable, and
  that a real ProcessPoolExecutor run completes without PicklingError.
- P1-2: assert the resume progress key distinguishes product AND station, and
  that a resume run does not over-skip sibling product/station slices.
- P2-1: assert backfill --out is honored for the parquet write (partition lands
  under --out, not the home/env cache root).
- P2-2: assert a 3D profile var (pressure x lat x lon) for ABI-L2-LVMPC passes
  the shape check while a wrong spatial grid still raises GoesDataCorruptError.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
P1-1: move the bulk_backfill pool worker from a nested closure to a module-level
_run_slice that takes a fully-picklable item tuple
(station, satellite, product, year, month, out, mirror, max_workers). The 2i
nested _run captured out/mirror/max_workers and raised PicklingError on every
pool.submit under executor="process", so the documented DSRF process-pool path
never ran. Run-wide params now travel inside the picklable item.

P1-2: _progress_key now encodes the FULL slice identity
({satellite}_{product}_{station}_{YYYY}_{MM}); both callers (skip-check and
completion-mark) pass product + station.icao. The 2i key dropped product AND
station, so completing one (product, station) slice silently skipped every other
(product, station) in the same satellite-month on resume though their partitions
were never written. _PROGRESS_KEY_RE + _validate_progress updated to the new
5-component schema; existing resume tests reseeded with full-identity keys.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
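
The full-identity key from P1-2 can be sketched as below. The `{satellite}_{product}_{station}_{YYYY}_{MM}` shape is from the commit message; the regular expression is a guess at a compatible pattern (it assumes product names contain hyphens but no underscores), and the names `PROGRESS_KEY_PATTERN` and `progress_key` are illustrative rather than the PR's `_PROGRESS_KEY_RE` and `_progress_key`.

```python
import re

# Hypothetical pattern for the 5-component key; not copied from the PR.
PROGRESS_KEY_PATTERN = re.compile(
    r"(?P<satellite>[a-z0-9]+)_"
    r"(?P<product>[A-Za-z0-9-]+)_"
    r"(?P<station>[A-Z]{4})_"
    r"(?P<year>\d{4})_"
    r"(?P<month>0[1-9]|1[0-2])"
)


def progress_key(satellite: str, product: str, station: str, year: int, month: int) -> str:
    """Encode the whole slice identity so sibling slices never share a key."""
    key = f"{satellite}_{product}_{station}_{year:04d}_{month:02d}"
    if PROGRESS_KEY_PATTERN.fullmatch(key) is None:
        raise ValueError(f"malformed progress key: {key!r}")
    return key
```

With product and station in the key, completing `goes16_ABI-L2-LVMPC_KNYC_2024_03` no longer causes a resume run to skip a different product or station in the same satellite-month.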
P2-1: write_satellite_cache (and satellite_cache_path / read_satellite_cache /
invalidate_satellite for symmetry) gain an optional cache_root override.
backfill_goes_satellite threads its out= directory through as cache_root, so the
parquet partition lands under --out (the CLI-advertised output dir) instead of
the home/env cache root — previously --out only received the progress files
while the parquet went elsewhere. When cache_root is None every existing caller
and the forecast/observation/climate tiers resolve _cache_root() byte-for-byte
unchanged; the assert_path_under backstop validates against the active root.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…gate

P2-2: _validate_dataset_shape rejected valid ABI-L2-LVMPC / ABI-L2-LVTPC files.
Those registered 3D profile products carry a leading pressure axis, so the data
variable is 3D (pressure, lat/y, lon/x) while grid_shape_expected is the 2D
spatial grid — the gate raised GoesDataCorruptError before the extractor's
3D-profile pressure loop could emit one row per level. For is_3d_profile
products the gate now validates only the trailing two (spatial) dims against the
registry: a genuine profile file passes, while a wrong spatial grid (or a
non-3D shape) still fails loudly. 2D products and the DSRF goes16/goes19 split
are untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ntity

ruff format reflow of the P1-2/P2-1 edits (regex one-liner, write_satellite_cache
call, reseeded progress-key test literals) and update the module docstring's
resume-key description from the old {satellite}_{year}_{MM} to the full-identity
{satellite}_{product}_{station}_{YYYY}_{MM} (P1-2). No behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…axis (y,x,pressure) (P2-3)

The byte-faithful extractor (_goes_extract.py expected_3d = ("y","x","pressure"))
and the real ABI-L2-LVMPC/LVTPC files carry the pressure axis TRAILING, but the
_goes_s3.py shape gate wrongly assumed a LEADING pressure axis (pressure,y,x).
Correct the two prior 3D-profile gate tests to the real (y,x,pressure) layout and
add an end-to-end test that flows a synthetic LVMPC dataset through BOTH the gate
and the real _extract_from_dataset 3D-profile loop (one row per level).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ING pressure (P2-3)

A prior fix wrongly assumed a LEADING pressure axis (pressure, y, x) in the
_goes_s3.py transport shape gate and validated spatial = actual[-2:], so the gate
accepted (pressure, y, x) while the byte-faithful extractor demands
expected_3d = ("y", "x", "pressure") (pressure TRAILING). Real ABI-L2-LVMPC/LVTPC
files use the trailing layout, so they failed end-to-end regardless.

Validate spatial = actual[:2] (LEADING two dims) against grid_shape_expected and
require len(actual) == 3; update the comment + error message to state the pressure
axis is TRAILING (layout (y, x, pressure)). The extractor is unchanged (correct).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
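
The corrected gate logic (pressure axis trailing, leading two dims checked against the registry) can be sketched like this. The local `GoesDataCorruptError` class is a stand-in for the one in `core.exceptions`, the function name `validate_shape` is illustrative, and the grid sizes in the usage note are made up for the example.

```python
class GoesDataCorruptError(Exception):
    """Stand-in for the PR's exception of the same name."""


def validate_shape(actual: tuple, grid_shape_expected: tuple, is_3d_profile: bool) -> None:
    """Check a variable's shape against the registered 2D spatial grid."""
    if is_3d_profile:
        # Layout is (y, x, pressure): exactly three dims, pressure TRAILING.
        if len(actual) != 3:
            raise GoesDataCorruptError(f"expected (y, x, pressure), got shape {actual}")
        spatial = tuple(actual[:2])
    else:
        spatial = tuple(actual)
    if spatial != tuple(grid_shape_expected):
        raise GoesDataCorruptError(
            f"spatial shape {spatial} does not match expected {tuple(grid_shape_expected)}"
        )
```

For a hypothetical 10 x 12 grid with 5 pressure levels, `(10, 12, 5)` passes while `(5, 10, 12)` fails, which is exactly the case the earlier `actual[-2:]` check got backwards.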
…ndow (P2-1)

satellite() fetches whole UTC days via _days_in_range but never filters the
emitted rows back to [start, end], so a sub-day window (e.g. 12:00-13:00) returns
every scan on that date. Add a sub-day-window test (only in-window scans), an
inclusive-boundary test, and a date-granular (midnight) whole-day regression guard.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ow (P2-1)

satellite() widens the FETCH to whole UTC days via _days_in_range (scans list at
the day grain) but never filtered rows back to [start, end], so a sub-day window
returned every scan on the boundary days. Add _event_time_window: a midnight
(date-granular) end extends to 23:59:59.999999 of that UTC day (preserving the
documented whole-day behavior), an end carrying a sub-day time filters precisely.
Drop rows whose event_time (scan_start_utc) falls outside the inclusive bounds;
keep degenerate suspect SENTINEL rows (event_time None) per annotate-never-drop.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
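
The window rule in this commit (a midnight end covers the whole UTC day, a sub-day end filters precisely, sentinel rows with no event time are kept) can be sketched as below. The names `event_time_window` and `in_window` are illustrative and the signatures are assumptions, not the PR's `_event_time_window`.

```python
from datetime import datetime, time, timezone
from typing import Optional


def event_time_window(start: datetime, end: datetime) -> tuple:
    """Return inclusive UTC bounds; a date-granular (midnight) end covers that whole day."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    start = start.astimezone(timezone.utc)
    end = end.astimezone(timezone.utc)
    if end.time() == time(0, 0):
        end = end.replace(hour=23, minute=59, second=59, microsecond=999999)
    return start, end


def in_window(event_time: Optional[datetime], start: datetime, end: datetime) -> bool:
    """Inclusive bounds check; rows without an event time are kept, never dropped."""
    if event_time is None:
        return True
    return start <= event_time <= end
```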
…ompleted (P2-2)

A current/future-month slice writes nothing (the cache skips the incomplete
current month; future listings return no rows) yet also produces no errors, so
the prior 'if resume and not res.errors' wrongly marked it completed and a later
resume PERMANENTLY skipped it. Add tests asserting future months are NOT marked
completed (and re-attempted on resume) while a fully-elapsed past month IS, plus a
_is_month_fully_elapsed predicate test.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…eted (P2-2)

The prior gate marked a slice completed whenever 'resume and not res.errors', but
a current/future month writes nothing (the cache skips the incomplete current
month; future listings return no rows) and produces no errors — so it was marked
completed and a later resume PERMANENTLY skipped it even once data existed.

Add _is_month_fully_elapsed (the first instant of the next UTC month <= now) and
_is_current_utc_month (mirrors the cache tier's current-UTC-month skip). Mark a
slice completed only when it is genuinely terminal: the month is fully elapsed,
or it persisted rows AND is not the current month (the rows clause is guarded so
a current-month no-op write cannot re-introduce the bug). Full-identity key
unchanged (P1-2).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
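
The terminal gate from the P2-2 fix above could look something like the sketch below. The names `is_month_fully_elapsed`, `is_current_utc_month`, and `should_mark_completed`, their signatures, and the `had_errors` / `rows_written` parameters are assumptions for illustration; the sketch also assumes a slice that reported errors is never marked completed, carried over from the prior `not res.errors` check.

```python
from datetime import datetime, timezone


def is_month_fully_elapsed(year: int, month: int, now: datetime) -> bool:
    """True once the first instant of the next UTC month is <= now."""
    next_year = year + 1 if month == 12 else year
    next_month = 1 if month == 12 else month + 1
    next_month_start = datetime(next_year, next_month, 1, tzinfo=timezone.utc)
    return next_month_start <= now


def is_current_utc_month(year: int, month: int, now: datetime) -> bool:
    """True when (year, month) is the UTC month that contains now."""
    now_utc = now.astimezone(timezone.utc)
    return (now_utc.year, now_utc.month) == (year, month)


def should_mark_completed(
    year: int, month: int, *, rows_written: int, had_errors: bool, now: datetime
) -> bool:
    """Mark a slice completed only when it is genuinely terminal."""
    if had_errors:
        return False  # assumption: errored slices are always retried
    if is_month_fully_elapsed(year, month, now):
        return True
    # The rows clause is guarded so a current-month write cannot count.
    return rows_written > 0 and not is_current_utc_month(year, month, now)
```

Under this gate a future month with no rows and no errors stays unmarked, so a later resume attempts it again.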
_assemble_dataframe sets df.attrs['source'] but not df.attrs['retrieved_at'], so
return_type='wrapper' falls back to a synthetic datetime.now() instead of the real
fetch time. Add a test asserting the returned frame carries a tz-aware UTC
df.attrs['retrieved_at'] equal to the per-row fetch timestamp, and a wrapper test
asserting wrap_result receives that real value (not None).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…me (P2-4)

_assemble_dataframe set df.attrs['source'] but not df.attrs['retrieved_at'], so
return_type='wrapper' fell back to a synthetic datetime.now() (via wrap_result)
instead of the real fetch timestamp already computed in satellite(). Thread the
existing retrieved_at into _assemble_dataframe and stamp the attr, mirroring
forecast_nwp.py:1076. The per-row retrieved_at is unchanged. Update the
leakage-test _tamper stub to the new _assemble_dataframe signature.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
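
As a rough illustration of the P2-4 change above, the sketch below threads a single fetch timestamp into the frame's `attrs`. The function name `assemble_dataframe` and its signature are hypothetical stand-ins, not the PR's private `_assemble_dataframe`; only `pandas.DataFrame.attrs` is a real API here.

```python
from datetime import datetime, timezone

import pandas as pd


def assemble_dataframe(rows: list[dict], *, source: str, retrieved_at: datetime) -> pd.DataFrame:
    """Build the frame and stamp frame-level lineage (hypothetical signature)."""
    if retrieved_at.tzinfo is None or retrieved_at.utcoffset() is None:
        raise ValueError("retrieved_at must be tz-aware")
    df = pd.DataFrame(rows)
    df.attrs["source"] = source
    # Stamp the real fetch time so a wrapper need not fall back to now().
    df.attrs["retrieved_at"] = retrieved_at.astimezone(timezone.utc)
    return df


# Usage: compute the timestamp once and reuse it per row and on the frame.
fetched = datetime(2026, 6, 18, 18, 0, tzinfo=timezone.utc)
frame = assemble_dataframe(
    [{"value": 291.4, "retrieved_at": fetched}],
    source="noaa_goes",
    retrieved_at=fetched,
)
```

Computing the timestamp once and passing it down keeps the frame-level attribute and the per-row column in agreement.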
Apply ruff format line-wrapping to the P2-2 terminal-gate expression and the
P2-2 test comprehensions, and replace an ambiguous EN DASH (RUF002) with a
hyphen in the P2-1 _event_time_window docstring. No behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
github-actions Bot commented Jun 18, 2026

Docs-required check: PASS

API-surface change includes docs updates — no reminder needed.

API-surface files changed:

packages/core/src/mostlyright/core/exceptions.py
packages/core/src/mostlyright/core/schemas/__init__.py
packages/core/src/mostlyright/core/schemas/satellite.py
packages/weather/src/mostlyright/weather/__init__.py
packages/weather/src/mostlyright/weather/_fetchers/_goes_extract.py
packages/weather/src/mostlyright/weather/_fetchers/_goes_s3.py
packages/weather/src/mostlyright/weather/cache.py
packages/weather/src/mostlyright/weather/satellite/SOURCE-LIMITS-satellite.md
packages/weather/src/mostlyright/weather/satellite/__init__.py
packages/weather/src/mostlyright/weather/satellite/__main__.py
packages/weather/src/mostlyright/weather/satellite/_backfill.py
packages/weather/src/mostlyright/weather/satellite/_probe.py
packages/weather/src/mostlyright/weather/satellite/_resolve.py

Docs files changed:

README.md
docs/satellite.md
packages/weather/src/mostlyright/weather/satellite/SOURCE-LIMITS-satellite.md

github-actions Bot commented Jun 18, 2026

Parity ticket gate: PASSED

parity-ticket-check: Python-side trigger surface touched; opt-out satisfied (parity ticket, python_only flag, or label).

See CROSS-SDK-SYNC.md §2 for the workflow.

minereda and others added 2 commits June 18, 2026 18:01
…suite skips them

The base CI fast-suite / pandas-3 / polars / coverage-gate run
`uv sync --all-packages` WITHOUT the [satellite] optional extra; only the
dedicated satellite-coverage lane installs it. The satellite test modules that
exercise the transport (boto3/s3fs/xarray) imported it at collection or runtime
and crashed the no-extra suite (ModuleNotFoundError: boto3).

Guard them behind the extra (matching the existing test_satellite_extract.py
pattern): a module-level pytestmark skipif on _s3/_backfill/_probe/_leakage/_cache, and
fixture/per-test importorskip on the transport-exercising tests in test_satellite.py.
The import-cleanliness contract + validation tests still run in the base suite;
the satellite-coverage lane (extra installed) runs the full set.

Verified by reproducing CI's no-extra env locally (0 failures, satellite tests
skip) and confirming with-extra still runs+passes all satellite tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
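
The guard described above depends on detecting whether the extra's modules are importable without actually importing them. A minimal stdlib-only sketch of that check follows; the helper name `missing_extra_modules` and the module tuple are assumptions for illustration (the tuple lists only the transport dependencies named in the commit message), and the pytest wiring is shown as a comment so the snippet runs without pytest installed.

```python
import importlib.util

# Transport dependencies named in the commit message (illustrative subset).
SATELLITE_EXTRA_MODULES = ("boto3", "s3fs", "xarray")


def missing_extra_modules(modules: tuple[str, ...] = SATELLITE_EXTRA_MODULES) -> list[str]:
    """Return the top-level modules that cannot be found, without importing them."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# In a test module, the result could feed a module-level skip, e.g.:
#
#   import pytest
#   pytestmark = pytest.mark.skipif(
#       bool(missing_extra_modules()),
#       reason="requires the [satellite] extra",
#   )
```

`pytest.importorskip("boto3")` covers the per-test or per-fixture case in a similar way: it skips the test when the import fails.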
Wraps the long pytest.importorskip(...) lines added in the prior commit so
`ruff format --check` (run by the CI fast-suite before pytest) passes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>