
Map forecast to truth #109

Open
dnerini wants to merge 14 commits into main from feat/map-fcst-to-truth

Conversation


@dnerini dnerini commented Feb 12, 2026

This PR generalizes verification to use a generic truth source instead of analysis-only inputs, so forecasts can be compared against either analysis zarr data or observations. It also updates the workflow, the plotting and verification scripts, and the config schema/docs to use the new truth interface consistently.

Missing features:

  • meteograms: support for multiple baselines
  • meteograms: dynamic labels
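As a rough sketch of the idea described above (all names here are invented for illustration and are not the actual evalml API or config schema), the generic truth interface amounts to dispatching on a configured truth source instead of hardcoding analysis input:

```python
# Hypothetical sketch: a registry mapping a configured truth "source"
# to a loader, so verification no longer assumes analysis-only input.
# The string results stand in for the real zarr-opening calls.
TRUTH_LOADERS = {
    "analysis": lambda cfg: f"open analysis zarr at {cfg['path']}",
    "observations": lambda cfg: f"open observation store at {cfg['path']}",
}


def load_truth(cfg):
    """Pick the truth loader from the config instead of hardcoding one."""
    try:
        loader = TRUTH_LOADERS[cfg["source"]]
    except KeyError:
        raise ValueError(f"unknown truth source: {cfg['source']!r}")
    return loader(cfg)
```

The point of the registry shape is that downstream scripts (workflow, plotting, verification) can stay agnostic about where the truth comes from.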

@dnerini dnerini marked this pull request as ready for review March 2, 2026 13:31
@dnerini dnerini requested review from Louis-Frey and frazane March 2, 2026 13:32
@Louis-Frey

Hi Daniele, nice work!

I had a rough look at everything; it seems fine to me. Should I test the configs? (Hopefully tomorrow afternoon, between Balfrin blockages.) I would prioritize forecasters-ich1.yaml and forecasters-ich1-oper.yaml.

@Louis-Frey

Quick update: I tried to run the configs, but am getting some errors. More to follow...

@Louis-Frey

Ok, it could be that the errors I saw at first were related to Balfrin still being down when I launched the jobs. So I made a new clone, but got some errors there too. Claude Code suggested a small change, after which both forecasters-ich1-oper.yaml and forecasters-ich1.yaml ran fine. I pushed the change as a bug fix.

@Louis-Frey

I am testing the remaining configs but am getting some errors. I will keep you posted.

@Louis-Frey

Louis-Frey commented Mar 9, 2026

Ok, for forecasters-co1e.yaml the config requests initialization times from January 2020, but the baseline dataset /store_new/mch/msopr/ml/COSMO-1E/FCST20.zarr only has reference times from 1 August 2020 onward. Should we change the dates in the config?

The other configs, apart from forecasters-ich1.yaml and forecasters-ich1-oper.yaml, produced some errors as well that I could investigate tomorrow.
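A quick sanity check for the date mismatch described above could look like this (the dates are stand-ins taken from the comment, not read from the store; in practice `available_from` would come from the zarr's `ref_time` coordinate):

```python
from datetime import date, timedelta

# Compare the init times a config requests against the first reference
# time the baseline zarr covers, before launching any jobs.
requested = [date(2020, 1, 1) + timedelta(days=i) for i in range(31)]  # January 2020
available_from = date(2020, 8, 1)  # FCST20.zarr starts here per the comment above
missing = sorted(d for d in requested if d < available_from)
print(f"{len(missing)} requested init times precede the store's first ref_time")
```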

@Louis-Frey

For forecasters-co2.yaml I get the following log file of rule verif_metrics_baseline (with some print statements inserted as suggested by Claude Code):

2026-03-10 10:02:05,507 - data_input - INFO - Loading baseline forecasts from zarr dataset...
2026-03-10 10:02:22,281 - __main__ - INFO - Loaded forecast data in 16.775267 seconds: 
<xarray.Dataset> Size: 271MB
Dimensions:    (lead_time: 21, y: 390, x: 582)
Coordinates:
    ref_time   datetime64[ns] 8B 2020-02-03
  * lead_time  (lead_time) timedelta64[ns] 168B 0 days 00:00:00 ... 5 days 00...
    lat        (y, x) float64 2MB dask.array<chunksize=(390, 582), meta=np.ndarray>
    lon        (y, x) float64 2MB dask.array<chunksize=(390, 582), meta=np.ndarray>
    time       (lead_time) datetime64[ns] 168B 2020-02-03 ... 2020-02-08
Dimensions without coordinates: y, x
Data variables:
    T_2M       (lead_time, y, x) float64 38MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    TD_2M      (lead_time, y, x) float64 38MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    U_10M      (lead_time, y, x) float64 38MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    V_10M      (lead_time, y, x) float64 38MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    PS         (lead_time, y, x) float64 38MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    PMSL       (lead_time, y, x) float64 38MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    TOT_PREC   (lead_time, y, x) float64 38MB dask.array<chunksize=(10, 390, 582), meta=np.ndarray>
Attributes:
    Conventions:  CF-1.8
    institution:  MeteoSwiss
2026-03-10 10:02:22,293 - data_input - INFO - Loading ground truth from an analysis zarr dataset...
/scratch/mch/lfrey/2025_SEN_eval_ML/software/evalml/2026_03_09_test_map-fcst-to-truth_forecasters-co2/evalml/src/data_input/__init__.py:88: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
  print("dims:", dict(ds.dims))
  (the same FutureWarning is repeated for the print calls at src/data_input/__init__.py lines 95, 104, 122, and 140)
2026-03-10 10:02:23,892 - __main__ - INFO - Loaded truth data in 1.598991 seconds: 
<xarray.Dataset> Size: 137MB
Dimensions:   (time: 21, y: 390, x: 582)
Coordinates:
  * y         (y) int64 3kB 0 1 2 3 4 5 6 7 ... 382 383 384 385 386 387 388 389
  * x         (x) int64 5kB 0 1 2 3 4 5 6 7 ... 574 575 576 577 578 579 580 581
    lat       (y, x) float64 2MB dask.array<chunksize=(390, 582), meta=np.ndarray>
    lon       (y, x) float64 2MB dask.array<chunksize=(390, 582), meta=np.ndarray>
  * time      (time) datetime64[s] 168B 2020-02-03 ... 2020-02-08
Data variables:
    T_2M      (time, y, x) float32 19MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    TD_2M     (time, y, x) float32 19MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    U_10M     (time, y, x) float32 19MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    V_10M     (time, y, x) float32 19MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    PS        (time, y, x) float32 19MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    PMSL      (time, y, x) float32 19MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
    TOT_PREC  (time, y, x) float32 19MB dask.array<chunksize=(21, 390, 582), meta=np.ndarray>
=== [DEBUG 1] Before set_index ===
dims: {'variable': 7, 'time': 7184, 'cell': 226980}
coords: ['time', 'variable', 'y', 'x']
=== [DEBUG 2] After set_index ===
dims: {'variable': 7, 'time': 7184, 'cell': 226980}
coords: ['time', 'variable', 'cell', 'y', 'x']
cell index type: <class 'pandas.core.indexes.multi.MultiIndex'>
=== [DEBUG 3] After unstack ===
dims: {'y': 390, 'x': 582, 'variable': 7, 'time': 7184}
coords: ['y', 'x', 'time', 'variable']
indexes: {'time': <class 'pandas.core.indexes.datetimes.DatetimeIndex'>, 'variable': <class 'pandas.core.indexes.base.Index'>, 'y': <class 'pandas.core.indexes.base.Index'>, 'x': <class 'pandas.core.indexes.base.Index'>}
=== [DEBUG 4] After to_dataset ===
dims: {'time': 7184, 'y': 390, 'x': 582}
coords: ['y', 'x', 'lat', 'lon', 'time']
indexes: {'time': <class 'pandas.core.indexes.datetimes.DatetimeIndex'>, 'y': <class 'pandas.core.indexes.base.Index'>, 'x': <class 'pandas.core.indexes.base.Index'>}
=== [DEBUG 5b] 'cell' is NOT in dims — rename skipped ===
  Non-dim coords that might be stale: ['lat', 'lon']
=== [DEBUG 6] Final dataset ===
dims: {'time': 7184, 'y': 390, 'x': 582}
coords: ['y', 'x', 'lat', 'lon', 'time']
indexes: {'time': <class 'pandas.core.indexes.datetimes.DatetimeIndex'>, 'y': <class 'pandas.core.indexes.base.Index'>, 'x': <class 'pandas.core.indexes.base.Index'>}
Traceback (most recent call last):
  File "/scratch/mch/lfrey/2025_SEN_eval_ML/software/evalml/2026_03_09_test_map-fcst-to-truth_forecasters-co2/evalml/workflow/scripts/verif_single_init.py", line 144, in <module>
    main(args)
  File "/scratch/mch/lfrey/2025_SEN_eval_ML/software/evalml/2026_03_09_test_map-fcst-to-truth_forecasters-co2/evalml/workflow/scripts/verif_single_init.py", line 75, in main
    results = verify(fcst, truth, args.label, args.truth_label, args.regions)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/mch/lfrey/2025_SEN_eval_ML/software/evalml/2026_03_09_test_map-fcst-to-truth_forecasters-co2/evalml/src/verification/__init__.py", line 169, in verify
    fcst_aligned, obs_aligned = xr.align(fcst, obs, join="inner", copy=False)
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/scratch/mch/lfrey/2025_SEN_eval_ML/software/evalml/2026_03_09_test_map-fcst-to-truth_forecasters-co2/evalml/.venv/lib/python3.12/site-packages/xarray/structure/alignment.py", line 968, in align
    aligner.align()
  File "/scratch/mch/lfrey/2025_SEN_eval_ML/software/evalml/2026_03_09_test_map-fcst-to-truth_forecasters-co2/evalml/.venv/lib/python3.12/site-packages/xarray/structure/alignment.py", line 660, in align
    self.align_indexes()
  File "/scratch/mch/lfrey/2025_SEN_eval_ML/software/evalml/2026_03_09_test_map-fcst-to-truth_forecasters-co2/evalml/.venv/lib/python3.12/site-packages/xarray/structure/alignment.py", line 497, in align_indexes
    update_dicts(key, joined_index, joined_index_vars, need_reindex)
  File "/scratch/mch/lfrey/2025_SEN_eval_ML/software/evalml/2026_03_09_test_map-fcst-to-truth_forecasters-co2/evalml/.venv/lib/python3.12/site-packages/xarray/structure/alignment.py", line 419, in update_dicts
    raise AlignmentError(
xarray.structure.alignment.AlignmentError: cannot align objects on coordinate 'y' because of conflicting indexes
first index: PandasIndex(Index([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,
       ...
       380, 381, 382, 383, 384, 385, 386, 387, 388, 389],
      dtype='int64', name='y', length=390))
second index: PandasIndex(MultiIndex([(  0,   0),
            (  0,   1),
            (  0,   2),
            (  0,   3),
            (  0,   4),
            (  0,   5),
            (  0,   6),
            (  0,   7),
            (  0,   8),
            (  0,   9),
            ...
            (389, 572),
            (389, 573),
            (389, 574),
            (389, 575),
            (389, 576),
            (389, 577),
            (389, 578),
            (389, 579),
            (389, 580),
            (389, 581)],
           name='values', length=226980))
first variable: <xarray.IndexVariable 'y' (y: 390)> Size: 3kB
array([  0,   1,   2, ..., 387, 388, 389], shape=(390,))
second variable: <xarray.IndexVariable 'values' (values: 226980)> Size: 2MB
[226980 values with dtype=int64]

Apparently something is wrong with the indexing of forecast vs. truth: the forecast carries a plain integer index on y, while the truth dataset still carries a leftover MultiIndex over (y, x). Error messages from the other configs to follow.
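The conflict in the traceback can be reproduced on a toy grid (invented 3x4 data; the real code builds its datasets differently, but the failure mode is the same): aligning a dataset with plain y/x indexes against one whose y/x live inside a MultiIndex raises the same "conflicting indexes" error, and unstacking the MultiIndex first is one possible fix.

```python
import numpy as np
import xarray as xr

# Forecast-like dataset: plain integer indexes on y and x.
fcst = xr.Dataset(
    {"T_2M": (("y", "x"), np.zeros((3, 4)))},
    coords={"y": np.arange(3), "x": np.arange(4)},
)

# Truth-like dataset: y and x are levels of a MultiIndex over "cell",
# mimicking the stacked layout left over in the truth pipeline.
truth = xr.Dataset(
    {"T_2M": (("cell",), np.zeros(12))},
    coords={
        "y": ("cell", np.repeat(np.arange(3), 4)),
        "x": ("cell", np.tile(np.arange(4), 3)),
    },
).set_index(cell=["y", "x"])

try:
    xr.align(fcst, truth, join="inner")
except Exception as err:
    print(type(err).__name__)  # conflicting indexes on 'y'

# Possible fix: unstack the MultiIndex so y/x become ordinary
# dimension indexes again before aligning.
fcst_a, truth_a = xr.align(fcst, truth.unstack("cell"), join="inner")
print(dict(truth_a["T_2M"].sizes))
```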

@Louis-Frey

Ok, the configs forecasters-co2-disentangled.yaml and interpolators-co2.yaml suffer from the same problem, likely an indexing issue on the COSMO grid, which would explain why the two ICON configs are unaffected.
