Summary
The ensemble analysis sometimes discards good fits, and the median-fit heuristic can be a poor central estimate. Produce a written, step-by-step review of the current pipeline — filter_by_rmse, filter_by_r_squared, filter_fits, compute_median_params, compute_mad, aggregate_fits, and the per-replica pooling in fit_measurement_set_per_replica — identifying where acceptable fits get filtered out and assessing the validity/robustness of the current metrics (RMSE-factor threshold, minimum R², median + MAD). Propose concrete, ranked redesign options. Explicitly evaluate offering mean-based aggregation alongside median (folds in the separate "average statistics option for the ensemble" request). Deliverable is a design/analysis document — no code changes this round.
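To make the metrics named above concrete, here is a minimal sketch of how an RMSE-factor threshold and median + MAD aggregation are commonly defined, with a mean-based alternative next to them. It does not reproduce the code in core/optimizer/filters.py. The function names, the `factor` parameter, the use of the best RMSE as the reference, and all sample numbers are assumptions for illustration only.

```python
import numpy as np


def keep_by_rmse_factor(rmse, factor=2.0):
    """Hypothetical RMSE-factor filter: keep fits with RMSE <= factor * best RMSE.

    Assumption: the reference is the minimum RMSE in the ensemble. The real
    filter_by_rmse may use a different reference or threshold rule.
    """
    rmse = np.asarray(rmse, dtype=float)
    return rmse <= factor * rmse.min()


def median_and_mad(values):
    """Median and raw (unscaled) median absolute deviation along axis 0."""
    values = np.asarray(values, dtype=float)
    med = np.median(values, axis=0)
    mad = np.median(np.abs(values - med), axis=0)
    return med, mad


def mean_and_std(values):
    """Mean and sample standard deviation (ddof=1) along axis 0."""
    values = np.asarray(values, dtype=float)
    return values.mean(axis=0), values.std(axis=0, ddof=1)


# Made-up RMSE values: one unusually good fit next to three similar ones.
rmse = [0.010, 0.025, 0.027, 0.030]
mask = keep_by_rmse_factor(rmse, factor=2.0)
# The threshold is 2.0 * 0.010 = 0.020, so only the first fit passes.
print("kept:", mask.tolist())

# Made-up values of a single fitted parameter, with one outlying fit.
values = [1.0, 1.2, 1.1, 0.9, 9.0]
med, mad = median_and_mad(values)
mean, std = mean_and_std(values)
print("median/MAD:", med, mad)
print("mean/std:  ", mean, std)
```

In this made-up case a single very low RMSE sets a threshold that rejects three fits that may be perfectly usable, and the mean is pulled toward the outlier while the median is not. Whether the actual pipeline behaves this way is what the review should establish.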
Source
Reported via the fitting-app feedback channel.
Acceptance criteria
docs/notes/ensemble-analysis-review.md containing: current-pipeline walkthrough, failure modes (good fits dropped), metric validity/robustness assessment, and ranked redesign proposals.
Scope & file boundaries
Package B (independent). Create only:
docs/notes/ensemble-analysis-review.md
May read core/optimizer/filters.py and core/pipeline/fit_pipeline.py but must not edit any code file.
Parallelization
Independent: runs in parallel with packages A, C, D. Folds in the "average statistics option for the ensemble" request.