fix: populate automated_production_score in controlled_solver by rasdani · Pull Request #368 · JackHopkins/factorio-learning-environment

rasdani · 2026-05-10T17:37:57Z

The unbounded_solver tracks ``automated_production_score`` per step and writes it into ``TrajectoryData`` (solver.py:1023, :1167, :1225), but the controlled_solver (used by all throughput tasks via ``fle inspect-eval --solver controlled``) doesn't. Result: every saved ``.eval`` log for a throughput task reports ``automated_production_score = 0.0`` regardless of the actual factory output.

Mirror the unbounded_solver's pattern in the controlled loop:

- Track an ``automated_production_scores`` list alongside ``production_scores`` near the trajectory init (solver.py:319).
- Read ``info.get("automated_production_score", 0)`` after each ``gym_env.step`` and append it.
- Update ``trajectory_data.{automated_production_score, automated_scores}`` at every per-step store and at final results, plus reset on the exception path.
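The three steps above can be sketched as a minimal, self-contained loop. This is an illustration under stated assumptions, not the code in solver.py: the ``TrajectoryData`` stand-in models only the fields named in this PR plus a hypothetical ``production_score``, ``ReplayEnv`` is a made-up environment that replays fixed ``info`` dicts, the five-value ``step`` return and the ``"production_score"`` info key are assumed, and "reset" on the exception path is read here as zeroing the score and clearing the series.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Tuple


@dataclass
class TrajectoryData:
    """Hypothetical stand-in; the real class in solver.py has more fields."""
    production_score: float = 0.0
    automated_production_score: float = 0.0
    automated_scores: List[float] = field(default_factory=list)


class ReplayEnv:
    """Hypothetical environment that replays a fixed list of info dicts."""

    def __init__(self, infos: Iterable[Dict[str, Any]]) -> None:
        self._infos = list(infos)

    def step(self, action: Any) -> Tuple[None, float, bool, bool, Dict[str, Any]]:
        info = self._infos.pop(0)
        terminated = not self._infos  # finish once every info dict is consumed
        return None, 0.0, terminated, False, info


def run_controlled_loop(gym_env: Any, actions: Iterable[Any]) -> TrajectoryData:
    trajectory_data = TrajectoryData()
    production_scores: List[float] = []
    # Step 1: track the automated scores alongside the production scores.
    automated_production_scores: List[float] = []
    try:
        for action in actions:
            _obs, _reward, terminated, truncated, info = gym_env.step(action)
            production_scores.append(info.get("production_score", 0))
            # Step 2: read the automated score from info, defaulting to 0.
            automated_production_scores.append(
                info.get("automated_production_score", 0)
            )
            # Step 3a: per-step store.
            trajectory_data.production_score = production_scores[-1]
            trajectory_data.automated_production_score = automated_production_scores[-1]
            trajectory_data.automated_scores = list(automated_production_scores)
            if terminated or truncated:
                break
        # Step 3b: final results mirror the last recorded step, if any.
        if automated_production_scores:
            trajectory_data.automated_production_score = automated_production_scores[-1]
            trajectory_data.automated_scores = list(automated_production_scores)
    except Exception:
        # Step 3c (assumed meaning of "reset"): zero the score, clear the series.
        trajectory_data.automated_production_score = 0.0
        trajectory_data.automated_scores = []
    return trajectory_data
```

Replaying the timeseries reported in this PR through ``ReplayEnv`` leaves ``automated_production_score`` at 460 and ``automated_scores`` equal to the full series, whereas a loop that never reads the info key would leave both at their defaults.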

Verified with::

    fle inspect-eval --tasks iron_gear_wheel_throughput \
        --model anthropic/claude-sonnet-4-5 --solver controlled \
        --pass-n 1 --max-connections 1

Before: ``auto=0`` despite ``prod=21``.
After: ``auto=460`` with timeseries ``[-30, 42, 123, 215, 460]``.


rasdani wants to merge 1 commit into JackHopkins:main from rasdani:fix/auto-score-controlled-solver