fix: retry transient init failures in make_factorio_env by rasdani · Pull Request #367 · JackHopkins/factorio-learning-environment

rasdani · 2026-05-10T17:36:40Z

When a Factorio container is recycled across samples (e.g. between
Inspect-AI rollouts of an eval-set, or between epochs of a pass@N
solver run), the first ``gym_env.reset`` on the recycled slot
occasionally races with FLE's internal Lua script cache, and the RCON
server returns malformed data. The failure surfaces as::

    RuntimeError: Failed to create Factorio environment: Could not save research state: { ["technologies"] = { ... } }

The error is transient: a 2-4-second wait followed by a fresh
``FactorioInstance`` build clears it. Wrap the ``FactorioInstance`` +
``task.setup`` block in a 3-attempt retry with backoff and best-effort
cleanup of the partially built instance between tries. The
configuration-error path (no containers / invalid ``run_idx``) is moved
out of the retry loop, since those errors are not transient.

Reproduced with::

    fle inspect-eval --tasks plastic_bar_throughput,automation_science_pack_throughput \
        --model anthropic/claude-sonnet-4-5 --solver controlled \
        --pass-n 1 --max-connections 2

Without the retry, both samples fail with 0 model calls and score 0;
with the retry they reach the model and execute normally.

The number of retries is tunable via ``FLE_INIT_RETRIES`` (default 3).
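The retry described above could look roughly like the following. This is a minimal sketch and not the code in this PR: ``retry_init``, ``build``, and ``cleanup`` are hypothetical names, the linear 2s/4s delay is an assumption, and only ``FLE_INIT_RETRIES`` and its default of 3 come from the description. Configuration errors are assumed to be raised before this helper is called, so they never enter the loop.

```python
import os
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_init(
    build: Callable[[], T],
    cleanup: Callable[[], None],
    retries: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call build() up to `retries` times, cleaning up and pausing after each failure."""
    if retries is None:
        # FLE_INIT_RETRIES and the default of 3 are taken from the PR description.
        retries = int(os.environ.get("FLE_INIT_RETRIES", "3"))
    retries = max(1, retries)  # always try at least once

    last_error: Optional[Exception] = None
    for attempt in range(1, retries + 1):
        try:
            return build()
        except Exception as exc:  # treated as transient in this sketch
            last_error = exc
            try:
                cleanup()  # best effort: a failing cleanup must not hide the real error
            except Exception:
                pass
            if attempt < retries:
                sleep(2.0 * attempt)  # assumed linear backoff: 2s, then 4s

    raise RuntimeError(
        f"Failed to create Factorio environment: {last_error}"
    ) from last_error
```

In this sketch, ``build`` would construct the instance and run the task setup, and ``cleanup`` would tear down whatever was partially built. Passing ``sleep`` as a parameter keeps the helper testable without real waits.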


The root cause of "Could not save research state" was deeper than the previous patch handled. ``cleanup()`` only closes the RCON socket; it does NOT reset the Factorio server's Lua VM. With ``cache_scripts=True`` (the default in ``make_factorio_env``), the LuaScriptManager checksums match between rebuilds, so it skips re-uploading, and the new instance ends up running against stale, corrupted Lua state from the previous run.

Real fix: on every retry attempt after the first, force ``cache_scripts=False`` so the Lua scripts are freshly re-uploaded, re-initializing all server-side script state. This is what closes the actual race.

Also bump the retry budget: 5 attempts (was 3) with 2/5/10/20/30s backoff (was 2/4/6s). Under heavy parallel load, the corruption takes longer than 6s to clear. The budget is configurable via ``FLE_INIT_RETRIES`` / ``FLE_INIT_BACKOFF``.

Verified: ``crude_oil_throughput`` (gpt-4o), which previously failed at 0 steps (setup race), now succeeds with 6 steps, prod=1438, auto=329.
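To make the per-attempt behavior concrete, here is a small sketch of how the schedule could be planned. It is illustrative only: ``Attempt``, ``parse_backoff``, and ``plan_attempts`` are hypothetical names, and the comma-separated format for ``FLE_INIT_BACKOFF`` is an assumption, since the description does not state how that variable is parsed. The description also does not say whether a delay follows the final attempt, so the sketch simply records a delay for each attempt.

```python
import os
from typing import List, NamedTuple, Optional, Sequence, Tuple

# 2/5/10/20/30s is the schedule quoted in the PR description.
DEFAULT_BACKOFF: Tuple[float, ...] = (2.0, 5.0, 10.0, 20.0, 30.0)


class Attempt(NamedTuple):
    number: int          # 1-based attempt index
    cache_scripts: bool  # True only on the first attempt
    delay: float         # seconds to wait if this attempt fails


def parse_backoff(raw: Optional[str]) -> Tuple[float, ...]:
    """Parse an assumed comma-separated list of seconds, e.g. "2,5,10"."""
    if not raw or not raw.strip():
        return DEFAULT_BACKOFF
    values = tuple(float(part) for part in raw.split(",") if part.strip())
    return values or DEFAULT_BACKOFF


def plan_attempts(retries: int, backoff: Sequence[float]) -> List[Attempt]:
    """Describe each attempt: retries bypass the script cache to force a re-upload."""
    plan = []
    for number in range(1, retries + 1):
        # Reuse the last delay if there are more attempts than backoff entries.
        delay = backoff[min(number - 1, len(backoff) - 1)]
        plan.append(Attempt(number, cache_scripts=(number == 1), delay=delay))
    return plan


if __name__ == "__main__":
    retries = int(os.environ.get("FLE_INIT_RETRIES", "5"))
    schedule = parse_backoff(os.environ.get("FLE_INIT_BACKOFF"))
    for step in plan_attempts(retries, schedule):
        print(step)
```

The key point is the ``cache_scripts`` column: only the first attempt trusts the cached checksums, and every later attempt forces a fresh upload so the server-side Lua state is rebuilt.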

rasdani added 2 commits May 10, 2026 17:26

rasdani wants to merge 2 commits into JackHopkins:main from rasdani:fix/init-retry-clean
