Add retry logic for failed BEAST2 simulations with cleanup by Copilot · Pull Request #51 · aezarebski/derp-simulation

Copilot · 2026-01-05T13:49:26Z

Failed simulations were not handled gracefully: they left intermediate files behind and were never retried. This PR adds automatic retries with an incremented seed and cleans up the intermediate files of failed attempts.

Changes

Retry loop: Failed simulations (timeout or error) automatically retry up to max_retries times (default: 3) with an incremented seed
Cleanup on failure: Removes .tree and .traj files before each retry; cleans up .time file only on final failure
Seed validation: Validates beast_seed_str is a valid integer upfront
Time tracking: .time file now records total elapsed time across all attempts (only written on success)
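
As a rough illustration of how the retry loop, cleanup, and time tracking listed above could fit together, here is a minimal sketch. It is not the code in this PR: run_with_retries, _remove_if_exists, run_once, and output_stem are hypothetical names, and run_once stands in for the actual BEAST2 invocation (assumed to return True on success and to return False or raise on failure or timeout).

```python
import os
import time


def _remove_if_exists(path):
    """Delete a file, ignoring the case where it was never created."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def run_with_retries(run_once, output_stem, base_seed, max_retries=3):
    """Hypothetical helper: call run_once(seed) until it succeeds.

    Returns the seed of the successful attempt, or None if every
    attempt failed.
    """
    start = time.monotonic()
    for attempt in range(1, max_retries + 1):
        seed = base_seed + attempt - 1  # attempt 1 reuses the base seed
        try:
            ok = run_once(seed)
        except Exception:
            # Treat any error (including a timeout) as a failed attempt.
            ok = False
        if ok:
            # Written only on success; covers all attempts so far.
            with open(output_stem + ".time", "w") as fh:
                fh.write(f"{time.monotonic() - start:.3f}\n")
            return seed
        # Remove intermediate outputs left by the failed attempt.
        for ext in (".tree", ".traj"):
            _remove_if_exists(output_stem + ext)
    # Final failure: also drop any stale .time file.
    _remove_if_exists(output_stem + ".time")
    return None
```

Passing the runner in as a callable keeps the sketch independent of how BEAST2 is launched; in the real pipeline that step would presumably be a subprocess call with a timeout.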

Usage

The function signature adds an optional max_retries parameter:

run_beast2_simulations_parallel(
    simulation_xml_list, 
    num_jobs=NUM_WORKERS, 
    beast_seed_str=str(CONFIG["seed"]),
    max_retries=3  # optional, defaults to 3
)

Each attempt uses seed base_seed + attempt - 1, where attempt counts from 1, so with a base seed of 42 the three attempts use seeds 42, 43, and 44.
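
The seed validation and per-attempt seed calculation described above can be sketched as follows. seeds_for_attempts is a hypothetical helper written for illustration, not a function from this PR; only beast_seed_str and max_retries are names taken from the description.

```python
def seeds_for_attempts(beast_seed_str, max_retries=3):
    """Hypothetical helper: validate the seed string, then list one seed per attempt."""
    try:
        base_seed = int(beast_seed_str)
    except (TypeError, ValueError) as exc:
        # Fail early, before any simulation is launched.
        raise ValueError(
            f"beast_seed_str must be an integer string, got {beast_seed_str!r}"
        ) from exc
    # Attempt numbers start at 1, so the first attempt reuses the base seed.
    return [base_seed + attempt - 1 for attempt in range(1, max_retries + 1)]


print(seeds_for_attempts("42"))  # [42, 43, 44]
```

Validating once up front means a malformed seed is reported immediately rather than surfacing as a failure inside a worker.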

Original prompt

This section details the original issue you should resolve

<issue_title>What happens to the failed simulations?</issue_title>
<issue_description>At the moment, we simulate more epidemics than needed so that even if a couple fail we still have enough to form the desired dataset. It would be better to have some safer code that checked whether a simulation had worked and then either cleaned up and retried, or moved on, depending upon the results.

The following subfunction, run_beast2(simulation_xml), probably needs to be expanded to check if a simulation has failed and if so, then to re-run it with a different seed (resampling the parameters could get very tricky so seems too risky). But this is made much harder because there is a seed parameter that is shared across all the subruns and would need to be handled in a different way. Potentially creating a local copy and then calling itself if it fails but with the local copy incremented or something. Intermediate files from the failed run would also need to be cleaned up.

derp-simulation/main.py

Line 364 in 029f11a

def run_beast2(simulation_xml):

</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes What happens to the failed simulations? #13

Initial plan

ad93bc9

Copilot AI assigned Copilot and aezarebski Jan 5, 2026

Copilot started work on behalf of aezarebski January 5, 2026 13:49 View session

Copilot AI and others added 4 commits January 5, 2026 13:53

Add retry logic for failed BEAST2 simulations with cleanup

8bfb72e

Co-authored-by: aezarebski <36244258+aezarebski@users.noreply.github.com>

Add __pycache__ to gitignore and remove from tracking

0eb8375

Co-authored-by: aezarebski <36244258+aezarebski@users.noreply.github.com>

Refactor retry logic to use loop-based approach and fix code review i…

4633277

…ssues Co-authored-by: aezarebski <36244258+aezarebski@users.noreply.github.com>

Add documentation for time log behavior and seed calculation

78f9ace

Co-authored-by: aezarebski <36244258+aezarebski@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Improve error handling for failed simulations~~ Add retry logic for failed BEAST2 simulations with cleanup Jan 5, 2026

Copilot AI requested a review from aezarebski January 5, 2026 13:58

Copilot finished work on behalf of aezarebski January 5, 2026 13:58

Copilot wants to merge 5 commits into main from copilot/improve-simulation-error-handling
