perf(weather): Open-Meteo forecast path uncached + weight-blind throttle → users hit free-tier rate limits in backtests

## Problem

The Open-Meteo forecast path (`research(..., forecast_source="open_meteo")` and the standalone `fetch_open_meteo()`) is built in a way that drives users into Open-Meteo's **free-tier rate limits** quickly at backtesting scale. Three issues compound, and the first dominates.

Open-Meteo's free tier allows **600 calls/min, 5,000/hr, 10,000/day**, and critically it **meters by *weighted* call cost, not request count**: a request counts as more than one call when it exceeds **10 variables** *or* **14 days**, and the two factors **multiply**: `weight ≈ max(vars/10, 1) × max(days/14, 1) × locations` ([Open-Meteo pricing](https://open-meteo.com/en/pricing); their examples: 15 vars × 14 days = 1.5 calls, 15 vars × 28 days = 3.0 calls).

Each call we issue requests **18 hourly variables** (`_OM_VARIABLES_TO_FETCH`, [`_open_meteo.py:75`](packages/weather/src/mostlyright/weather/_fetchers/_open_meteo.py#L75)) over the **full `from_date..to_date`** with no chunking ([`research.py:1404`](packages/core/src/mostlyright/research.py#L1404)). So the weighted cost of a single `research()` forecast call is `1.8 × max(days/14, 1)`:

| Window | Weighted cost of 1 call | Calls until 600/min ceiling |
|---|---|---|
| 7–14 days | 1.8 | ~330 |
| 30 days | ~3.9 | ~150 |
| 90 days | ~11.6 | ~50 |
| 1 year | **~47** | **~13** |
| 2 years | ~94 | ~6 |
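
The table can be reproduced with a few lines of arithmetic. This is a minimal sketch of the weight formula quoted above; `estimate_call_weight` and `window_days` are hypothetical helper names, not existing SDK functions, and the formula is the approximation from the pricing page rather than a guaranteed server-side computation.

```python
from datetime import date

FREE_TIER_PER_MINUTE = 600  # weighted calls per minute on the free tier


def estimate_call_weight(n_variables: int, n_days: int, n_locations: int = 1) -> float:
    """Approximate weighted cost of one request (hypothetical helper)."""
    return max(n_variables / 10, 1.0) * max(n_days / 14, 1.0) * n_locations


def window_days(from_date: str, to_date: str) -> int:
    """Inclusive number of days covered by an ISO-date window."""
    return (date.fromisoformat(to_date) - date.fromisoformat(from_date)).days + 1


# Reproduce the table rows for the current 18-variable request.
for label, days in [("14 days", 14), ("30 days", 30), ("90 days", 90),
                    ("1 year", 365), ("2 years", 730)]:
    weight = estimate_call_weight(18, days)
    print(f"{label:>8}: weight ~{weight:.1f}, "
          f"~{FREE_TIER_PER_MINUTE / weight:.0f} calls until the 600/min ceiling")
```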

### 1. The forecast cache exists but is never wired in — every run re-fetches (dominant amplifier)

`read_forecast_cache` / `write_forecast_cache` / `forecast_cache_path` ([`cache.py:542`](packages/weather/src/mostlyright/weather/cache.py#L542), [`:571`](packages/weather/src/mostlyright/weather/cache.py#L571)) were built in Phase 20 (OM-06) but are referenced **only in `test_cache_forecasts.py`** — there is no production caller. `_fetch_open_meteo_range` ([`research.py:1384`](packages/core/src/mostlyright/research.py#L1384)) calls the network directly with no cache read/write.

Previous-runs / single-runs / seamless data is **immutable** (historical forecast cycles never change), yet a quant iterating on a model re-fetches identical data on every run. Over 5–50 iterations, one legitimate fetch becomes 5–50 identical ones.
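
The missing wiring is a plain read-through cache. The sketch below shows the shape only: it does **not** use the real `read_forecast_cache` / `write_forecast_cache` signatures (not reproduced in this issue), so `cached_fetch`, the JSON file layout, and the `fetch` callback are all assumptions for illustration.

```python
import json
from pathlib import Path
from typing import Callable

# Modes whose data never changes once published; "live" is a rolling cycle.
CACHEABLE_MODES = {"previous_runs", "single_run", "seamless"}


def cached_fetch(cache_dir: Path, station: str, model: str, mode: str,
                 from_date: str, to_date: str,
                 fetch: Callable[[], list]) -> list:
    """Return cached rows if present, else call `fetch()` and store the result.

    Hypothetical helper: the key layout and JSON storage are illustrative only.
    """
    if mode not in CACHEABLE_MODES:
        return fetch()  # never cache live data

    path = cache_dir / f"{station}_{model}_{mode}_{from_date}_{to_date}.json"
    if path.exists():
        return json.loads(path.read_text())  # cache hit: no network call

    rows = fetch()  # cache miss: one network call, then persist
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(rows))
    return rows
```

With something of this shape inside `_fetch_open_meteo_range`, the second and later runs of an unchanged backtest would cost zero weighted calls.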

### 2. The politeness throttle counts requests, not weight — false safety

`_OM_POLITE_DELAY_S = 0.2` ([`_open_meteo.py:63`](packages/weather/src/mostlyright/weather/_fetchers/_open_meteo.py#L63)) caps a single worker at ~300 req/min, nominally under 600. But because each request is *weighted*, a 1-year window weighs ~47, so the **600/min budget is exhausted ~13 stations into a loop, roughly 2.6 seconds in** (13 × 0.2 s of sleep, ignoring request latency), and the 0.2s sleep does nothing to prevent it. The delay also lives *inside* `fetch_open_meteo`, so a user threading their own station loop loses even count-based bounding.
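
For contrast, here is a minimal sketch of a limiter that paces by weight instead of request count. `WeightedRateLimiter` is a hypothetical class, not part of the SDK; it is single-threaded (a shared instance would need a lock), and the injectable `clock` / `sleep` arguments exist only to make it testable.

```python
import time
from collections import deque


class WeightedRateLimiter:
    """Sliding-window limiter that budgets weighted cost (hypothetical sketch)."""

    def __init__(self, budget: float = 600.0, window_s: float = 60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.budget = budget
        self.window_s = window_s
        self._clock = clock
        self._sleep = sleep
        self._events = deque()  # (timestamp, weight) of calls still in the window
        self._spent = 0.0

    def _expire(self, now: float) -> None:
        # Drop calls that have aged out of the sliding window.
        while self._events and now - self._events[0][0] >= self.window_s:
            _, weight = self._events.popleft()
            self._spent -= weight
        if not self._events:
            self._spent = 0.0  # clear float drift once the window is empty

    def acquire(self, weight: float) -> None:
        """Block until `weight` fits in the current window, then record it."""
        if weight > self.budget:
            raise ValueError("call is heavier than the whole budget; chunk it first")
        while True:
            now = self._clock()
            self._expire(now)
            if self._spent + weight <= self.budget:
                self._events.append((now, weight))
                self._spent += weight
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self._events[0][0] + self.window_s - now)
```

Calling `limiter.acquire(47)` before each 1-year request would let 12 calls through and make the 13th wait for the window to roll over, instead of drawing a 429.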

### 3. No client-side chunking + 18-variable over-fetch

The fetcher's own docstring warns *"14-day Open-Meteo per-call cap; longer windows must chunk client-side"* ([`_open_meteo.py:32`](packages/weather/src/mostlyright/weather/_fetchers/_open_meteo.py#L32)), but the caller chunks nothing, so long windows become single unbounded-weight calls. And the `research()` pairs join consumes only temp / precip-probability / precip, yet we always request 18 variables, paying ~1.8× weight for data that is then discarded.
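
Client-side chunking is a small date loop. A minimal sketch, assuming inclusive ISO-date windows; `chunk_window` is a hypothetical helper name.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def chunk_window(from_date: str, to_date: str,
                 max_days: int = 14) -> Iterator[Tuple[str, str]]:
    """Yield inclusive (start, end) ISO-date pairs of at most `max_days` days."""
    start = date.fromisoformat(from_date)
    end = date.fromisoformat(to_date)
    while start <= end:
        chunk_end = min(start + timedelta(days=max_days - 1), end)
        yield start.isoformat(), chunk_end.isoformat()
        start = chunk_end + timedelta(days=1)


chunks = list(chunk_window("2025-01-01", "2025-12-31"))
print(len(chunks), chunks[0], chunks[-1])
```

Note that chunking does not lower the total weight of a window (27 chunks × 1.8 is slightly more than one ~47-weight call). What it buys is a bounded per-call weight that a throttle can pace, plus chunk-sized units that can be cached independently.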

**Secondary:** 429 backoff is shallow/linear (`max(Retry-After, 0.2×(attempt+1))`, 3 retries → ~1.2s total absent a `Retry-After` header; [`_open_meteo.py:553`](packages/weather/src/mostlyright/weather/_fetchers/_open_meteo.py#L553)). It honors `Retry-After` (good) but gives up fast otherwise. No `apikey` / base-URL plumbing, so a user who needs headroom can't move to a paid tier — and the free tier is **non-commercial-only**, a ToS flag for Kalshi traders.
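
A deeper schedule could look like the sketch below, which keeps the existing "honor `Retry-After`" behavior but replaces the linear fallback with a capped exponential one. `backoff_delay` and its defaults are assumptions for illustration, not the current implementation.

```python
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base_s: float = 1.0, cap_s: float = 8.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Hypothetical helper: exponential 1 -> 2 -> 4 -> 8 s, never shorter than
    a server-supplied Retry-After value.
    """
    exponential = min(base_s * 2 ** attempt, cap_s)
    if retry_after is None:
        return exponential
    return max(retry_after, exponential)


# Four retries without a Retry-After header wait 15 s in total,
# versus ~1.2 s for the current linear schedule.
print([backoff_delay(a) for a in range(4)])
```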

## Worst case

Backtesting 1 model × 60 US stations over a 1-year window = `60 × 47 ≈ 2,800` weighted calls per run. With **no caching**, iterating that backtest just **4 times exhausts the 10,000/day budget**, and two runs inside the same hour already exceed the 5,000/hr limit. With a fast loop the **600/min** ceiling trips after ~13 stations (~2.6s), well before the politeness delay is relevant: the per-minute lockout, not the daily budget, is what users will hit first.

## Reproduction (conceptual)

```python
import mostlyright as mr

# 60-station, 1-year backtest, single model
stations = [...]  # 60 ICAO codes
for s in stations:
    df = mr.research(s, "2025-01-01", "2025-12-31",
                     forecast_source="open_meteo", forecast_model="gfs_global")
# 429s from *-api.open-meteo.com begin ~13 stations in;
# re-running the loop re-fetches everything (no forecast cache).
```

## Suggested fix (by impact)

1. **Wire the forecast cache into `_fetch_open_meteo_range`** — highest impact, lowest risk; the read/write/path functions already exist and are tested. Cache `previous_runs` / `single_run` / `seamless`; never cache `live` (rolling cycle, already flagged in `write_forecast_cache`).
2. **Throttle by weight, not request count** — estimate `max(vars/10,1) × max(days/14,1)` per call and pace against the 600/min budget, and/or chunk windows to ≤14 days so per-call weight stays ~1.8.
3. **Trim variables on the `research()` path** to what the pairs join uses (~3); keep the full 18 only for the standalone `fetch_open_meteo()` DataFrame API, ideally behind a `variables=` param.
4. **Deeper 429 backoff** (exponential 1→2→4→8s) + optional `apikey` / base-URL override for paid/commercial tiers.

## Parity note (TS twin)

Only the Python side was audited. The TS twin (`weather-ts`) almost certainly mirrors items 1–3 and should get a parity ticket or be fixed in the same phase (cf. the IEM-MOS perf parity in #57/#58).

## Related

- #51 — IEM ASOS 429 rate-limiting (sibling: same "always re-downloads, no cache-skip" root cause, different source)
- #40 — Single-Runs API 400s (same fetcher)
- #55 — Open-Meteo int64 fractional-drop (same fetcher)

---
_Filed from a source audit of the Open-Meteo fetch path (mostlyright-sdk @ v1.5.2, `16d62de`)._

