
Harden operator session recovery and fallback #5

Open
Zefan-Cai wants to merge 1 commit into zefan-todo-2 from zefan-todo-3

Conversation

@Zefan-Cai
Collaborator

Summary

This PR isolates TODO 3 from the larger CLI branch: harden operator session recovery, resume fallback, and attempt/session bookkeeping on top of zefan-todo-2.

What Changed

  • add per-session state files under operator_state/
  • add per-attempt state files under operator_state/
  • record stream metadata for operator attempts
  • mark broken sessions and avoid reusing them
  • fall back to a fresh session when --resume fails
  • add focused recovery regression tests in tests/test_operator_recovery.py
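
For reviewers who want a picture of the intended behavior, here is a minimal sketch of the resume-then-fallback flow described above. It is not the code in this PR: the file naming (`session_<id>.json`), the `broken` / `broken_reason` fields, the `ResumeError` exception, and the `runner(session_id, resume)` callback are all hypothetical names chosen for illustration. Only the `operator_state/` directory and the overall behavior (mark broken sessions, do not reuse them, start fresh when resume fails) come from the description.

```python
import json
import uuid
from pathlib import Path
from typing import Callable, Optional

# Only the directory name comes from the PR; the file layout below is assumed.
STATE_DIR_NAME = "operator_state"


class ResumeError(RuntimeError):
    """Hypothetical error raised by the runner when resuming a session fails."""


def _session_path(root: Path, session_id: str) -> Path:
    # Assumed naming scheme: one JSON file per session.
    return root / STATE_DIR_NAME / f"session_{session_id}.json"


def load_session(root: Path, session_id: str) -> Optional[dict]:
    """Return the stored session state, or None if no state file exists."""
    path = _session_path(root, session_id)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


def save_session(root: Path, state: dict) -> None:
    """Write the session state, creating operator_state/ if needed."""
    path = _session_path(root, state["session_id"])
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")


def mark_broken(root: Path, session_id: str, reason: str) -> None:
    """Flag a session as broken so later runs do not try to reuse it."""
    state = load_session(root, session_id) or {"session_id": session_id}
    state["broken"] = True
    state["broken_reason"] = reason
    save_session(root, state)


def run_with_fallback(
    root: Path,
    runner: Callable[[str, bool], None],
    resume_id: Optional[str] = None,
) -> dict:
    """Try to resume resume_id; on failure, mark it broken and start fresh."""
    if resume_id is not None:
        state = load_session(root, resume_id)
        # Skip unknown sessions and sessions already marked broken.
        if state is not None and not state.get("broken"):
            try:
                runner(resume_id, True)
                return {"session_id": resume_id, "resumed": True}
            except ResumeError as exc:
                mark_broken(root, resume_id, str(exc))

    # Fallback path: a brand-new session with its own state file.
    fresh_id = uuid.uuid4().hex
    save_session(root, {"session_id": fresh_id, "broken": False})
    runner(fresh_id, False)
    return {"session_id": fresh_id, "resumed": False}
```

In this sketch a failed resume is recorded on disk before the fresh session starts, so a later call with the same `resume_id` goes straight to a new session instead of retrying the broken one.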

Base

  • base branch: zefan-todo-2

Scope

This PR is intentionally limited to TODO 3 only.

Not included here:

  • stage handoff / context compression
  • paper/release package generation

Validation

  • python -m py_compile main.py src/*.py tests/*.py
  • python -m unittest tests.test_operator_recovery tests.test_run_manifest tests.test_stage_rollback tests.test_writing_pipeline -v
