Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
359 changes: 359 additions & 0 deletions .briefs/cloud-cover-deep-research.md

Large diffs are not rendered by default.

277 changes: 277 additions & 0 deletions .briefs/github-issue-63-nwp-fields-review.md

Large diffs are not rendered by default.

53 changes: 53 additions & 0 deletions .briefs/github-issue-pairs-source-misclassification.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
# `_pairs.py` source column incorrectly set for Open-Meteo rows

## How Discovered
Found by Gemini 2.5 Pro during adversarial review of PR #65 (Open-Meteo rate limiting). The review scope was cache wiring + throttling, but the reviewer traced the data flow downstream and identified a pre-existing bug in the pairs join.

## Problem

In `packages/core/src/mostlyright/_internal/_pairs.py`, `build_pairs_row()` separates IEM MOS and Open-Meteo forecast records using the **presence of `issued_at`**:

```python
iem_records = [r for r in forecasts if r.get("issued_at")]
om_records = [r for r in forecasts if not r.get("issued_at")]
```

This split is incorrect. **Phase 20 Open-Meteo Previous Runs records carry a derived `issued_at`** (cycle math: `valid_at - publish_lag`, floored to model cycle hours). Open-Meteo records with `issued_at` set get classified as IEM records.

### Impact

When both sources are requested (`forecast_source=["iem_mos", "open_meteo"]`):

1. Open-Meteo records are mixed into the IEM MOS pool
2. Run selection may pick an Open-Meteo cycle as the "best" IEM run
3. IEM-specific aggregation processes Open-Meteo rows (different column names)
4. **Data corruption:** incorrect temperature/precipitation values in output pairs

Bug is masked when only `forecast_source="open_meteo"` is used (all records end up in `iem_records` but `_select_best_run` still picks the only available run).

## Proposed Fix

Replace the `issued_at` presence check with explicit source field inspection:

```python
iem_records = [
r for r in forecasts
if not r.get("source", "").startswith("open_meteo")
]
om_records = [
r for r in forecasts
if r.get("source", "").startswith("open_meteo")
]
```

Every record carries a `source` field (`"iem_mos"` for IEM, `"open_meteo.previous_runs"` / etc. for Open-Meteo) — unambiguous.

## Secondary Issue

The fallback block uses IEM column names. OM records from `_fetch_open_meteo_range` carry `temperature_f` / `pop_6hr_pct` / `qpf_6hr_in` (converted from Celsius), but the fallback looks for `precipitation_probability_pct`. Needs column name compatibility handling.

## Test Cases Needed

1. **Mixed source classification** — both IEM MOS and OM records; verify OM records (with `issued_at`) are NOT placed in `iem_records`
2. **Column name compatibility** — OM records from research path produce correct `fcst_high`/`fcst_low`/`fcst_pop`/`fcst_qpf`
3. **Single source regression** — `iem_mos` only and `open_meteo` only still correct
59 changes: 59 additions & 0 deletions .briefs/implementation_plan.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
# Implementation Plan: NWP Fields & Cloud Cover (Issue #63)

Fix the latent GFS precipitation duplicate-record crash and implement three new weather forecast columns (`cloud_cover_pct`, `visibility_m`, and `cloud_ceiling_m`) for HRRR and GFS models.

## Proposed Changes

### Core component (schema)

#### [MODIFY] [forecast_nwp.py](file:///Users/zach/.openclaw/workspace-chad/mostlyright-sdk/packages/core/src/mostlyright/core/schemas/forecast_nwp.py)
- Add columns:
- `cloud_cover_pct` (float64, %, nullable)
- `visibility_m` (float64, meters, nullable)
- `cloud_ceiling_m` (float64, meters, nullable)

### Weather component (fetchers & models)

#### [MODIFY] [forecast_nwp.py](file:///Users/zach/.openclaw/workspace-chad/mostlyright-sdk/packages/weather/src/mostlyright/weather/forecast_nwp.py)
- Implement `_pick_record(group)` helper to filter duplicate records (prioritizing instantaneous over window-aggregated and breaking ties by `record_no`).
- Update `_extract_records` to call `_pick_record` and log a warning instead of raising `GribIntegrityError` when `len(group) > 1`.
- Add short-name lookups directly to `_GRIB_VAR_TO_CFGRIB_NAME`:
- `("TCDC", "entire atmosphere"): "tcc"`
- `("VIS", "surface"): "vis"`
- `("HGT", "cloud ceiling"): "gh"`
- Register new columns in `nullable_numeric_cols` and `_empty_dataframe`.

#### [MODIFY] [gfs.py](file:///Users/zach/.openclaw/workspace-chad/mostlyright-sdk/packages/weather/src/mostlyright/weather/_fetchers/_nwp_grids/gfs.py)
- Add to `VARIABLE_MAP`:
- `"cloud_cover_pct": ("TCDC", "entire atmosphere")`
- `"visibility_m": ("VIS", "surface")`
- `"cloud_ceiling_m": ("HGT", "cloud ceiling")`

#### [MODIFY] [hrrr.py](file:///Users/zach/.openclaw/workspace-chad/mostlyright-sdk/packages/weather/src/mostlyright/weather/_fetchers/_nwp_grids/hrrr.py)
- Add to `VARIABLE_MAP`:
- `"cloud_cover_pct": ("TCDC", "entire atmosphere")`
- `"visibility_m": ("VIS", "surface")`
- `"cloud_ceiling_m": ("HGT", "cloud ceiling")`

#### [MODIFY] [rules_nwp.py](file:///Users/zach/.openclaw/workspace-chad/mostlyright-sdk/packages/weather/src/mostlyright/weather/qc/rules_nwp.py)
- Add QC rules to `RULES_NWP_NCEP`:
- `cloud_cover_pct` $\in [0, 100]$
- `visibility_m` $\ge 0$
- `cloud_ceiling_m` $\ge 0$

### Test component

#### [MODIFY] [test_forecast_nwp.py](file:///Users/zach/.openclaw/workspace-chad/mostlyright-sdk/packages/weather/tests/test_forecast_nwp.py)
- Add `TestDisambiguationHeuristics` to test `_pick_record` logic with synthetic indices.
- Add GFS live smoke test to ensure no crashes.
- Add test coverage for new fields.

## Verification Plan

### Automated Tests
- `uv run pytest -m "not live" -q`
- `uv run pytest -k "test_forecast_nwp_live" -q` (smoke live check for HRRR + GFS)
- `uv run ruff check --fix . && uv run ruff format .`

### Manual Verification
- Verify generated schemas JSON files under `schemas/json/`.
Loading
Loading