Implement pure CLI AutoR workflow and publication packages#1

Open
Zefan-Cai wants to merge 3 commits into main from zefan-dev
@Zefan-Cai
Collaborator

Summary

This PR turns the branch into a pure CLI-first AutoR workflow runner with stronger workflow state management, richer platform-alignment modules, and production-oriented stage 07/08 packaging.

The branch keeps main.py as the run entrypoint, src/manager.py as the 8-stage orchestrator with human approval gates, src/operator.py as the Claude Code executor, and src/utils.py as the run-layout/prompt/validation layer. Runs still live under runs/<run_id>/, with stage drafts written to stages/*.tmp.md before validation and promotion.
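
For reviewers, a minimal sketch of the draft-then-promote step described above. This is not the code in src/utils.py: the function name `promote_draft`, the `validate` callback, and the promoted filename `stages/<stage>.md` are assumptions made for illustration only.

```python
from pathlib import Path
from typing import Callable


def promote_draft(stages_dir: Path, stage: str,
                  validate: Callable[[str], bool]) -> Path:
    """Promote stages/<stage>.tmp.md to stages/<stage>.md if it validates.

    Hypothetical helper: the final filename and the validate() contract
    are assumptions, not the branch's actual layout.
    """
    draft = stages_dir / f"{stage}.tmp.md"
    final = stages_dir / f"{stage}.md"

    text = draft.read_text(encoding="utf-8")
    if not validate(text):
        # Leave the draft in place so it can be inspected or retried.
        raise ValueError(f"draft for stage {stage} failed validation")

    # Path.replace overwrites any existing target and removes the draft.
    draft.replace(final)
    return final
```

Keeping the draft on a separate path means a failed validation never overwrites a previously approved stage file.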

TODO Status

  1. Cross-stage rollback and downstream invalidation
     Status: Done
     What landed:
      • --rollback-stage CLI support
      • downstream stages marked stale
      • rollback target marked pending/dirty
      • approved memory rebuilt from manifest after rollback
  2. Run manifest and stage status file
     Status: Done
     What landed:
      • run_manifest.json as the primary machine-readable state source
      • per-stage status, approval, stale/dirty flags, session id, attempt count, artifact pointers, handoff pointer, compressed summary
  3. Operator session recovery and failure hardening
     Status: Done
     What landed:
      • per-session state files under operator_state/
      • per-attempt state files under operator_state/
      • broken sessions are no longer reused
      • resume failure falls back to a fresh session and records attempt metadata
      • missing stage draft fallback materialization retained and integrated
  4. Stage context compression and handoff
     Status: Done
     What landed:
      • handoff/<stage>.md summaries for approved stages
      • routed orchestration context, manifest context, and handoff context injected into prompts
  5. TODO item 5
     Status: Not done
     Note:
      • The original task text provided in the thread was truncated after item 4 and before item 7, so this item was not fully visible. I did not guess and implement a partially specified requirement.
  6. TODO item 6
     Status: Not done
     Note:
      • Same reason as item 5: the original task text was truncated and the item was not fully visible.
  7. Submission-grade paper package
     Status: Done
     What landed:
      • stronger stage 07 paper package generation
      • manuscript .tex, bibliography, tables, figure manifest, build script, submission checklist, and compiled PDF placeholder artifacts
  8. Review / dissemination package
     Status: Done
     What landed:
      • stronger stage 08 release/review package generation
      • readiness checklist, threats-to-validity notes, artifact bundle manifest, release notes, external summary, and dissemination collateral generation hooks
  9. Frontend run dashboard
     Status: Not done by design
     Note:
      • A frontend/dashboard iteration was explored earlier, but the final direction for this branch was explicitly changed to pure CLI. The web stack was removed accordingly.
  10. Tests and CI
     Status: Partially done
     What landed:
      • expanded regression coverage around prompt context, KB search, rollback/stale handling, operator recovery, literature workflow, debate workflow, playbook workflow, router execution, foundry generation, and manifest consistency
     What is still missing:
      • CI wiring in GitHub Actions or equivalent
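
To make the rollback semantics in the first TODO item concrete, here is a minimal sketch under stated assumptions. The manifest layout (a `stages` mapping with `status`, `approved`, `stale`, and `dirty` keys), the stage ids `"01"` to `"08"`, and every function name below are hypothetical; the actual schema lives in run_manifest.json and may differ.

```python
from typing import Dict, List

# Hypothetical stage ids for the 8-stage workflow.
STAGE_ORDER: List[str] = [f"{n:02d}" for n in range(1, 9)]


def new_manifest() -> Dict:
    """Build an assumed manifest in which every stage is approved."""
    return {
        "stages": {
            s: {"status": "approved", "approved": True,
                "stale": False, "dirty": False}
            for s in STAGE_ORDER
        }
    }


def rollback_to(manifest: Dict, target: str) -> Dict:
    """Mark the target pending/dirty and every later stage stale."""
    if target not in STAGE_ORDER:
        raise ValueError(f"unknown stage: {target}")
    cut = STAGE_ORDER.index(target)
    stages = manifest["stages"]
    stages[target].update(status="pending", approved=False, dirty=True)
    for stage in STAGE_ORDER[cut + 1:]:
        stages[stage]["stale"] = True  # downstream invalidation
    return manifest


def rebuild_approved_memory(manifest: Dict) -> List[str]:
    """Return the stages whose output may still be fed into prompts."""
    stages = manifest["stages"]
    return [
        s for s in STAGE_ORDER
        if stages[s]["approved"]
        and not stages[s]["stale"]
        and not stages[s]["dirty"]
    ]
```

In this sketch, rolling back to stage `"03"` leaves only `"01"` and `"02"` in the rebuilt approved memory, because the target is dirty and everything after it is stale.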

Additional Notes

  • The run_state.json file dependency has been removed; run_manifest.json is now the sole persisted workflow state source.
  • src/run_state.py remains only as an in-memory compatibility formatter/adapter derived from the manifest.
  • The branch is intentionally scoped to a pure CLI main workflow rather than a web control plane.

Validation

  • python -m py_compile main.py src/*.py src/platform/*.py tests/*.py
  • python -m unittest discover -s tests -v

@black-yt
Collaborator

Thanks for the work here. There is useful progress in this branch, but this PR should be split before merge.

Right now it is too broad to review safely: 28 changed files, ~4k additions, and several distinct concerns bundled together. In particular, it mixes:

  1. Core workflow state changes (run_manifest, rollback, stale/dirty stage tracking, CLI flags)
  2. Operator/session recovery and stage handoff/compression logic
  3. Stage 07/08 publication-package changes
  4. A large new src/platform/* stack plus knowledge_base.py / inspection.py
  5. README + test expansion

These are not one review unit. Some are core workflow changes, some are reliability improvements, some are writing-package features, and some are a substantial architectural expansion. Reviewing them together makes it hard to reason about regressions, approve only the good parts, or maintain a clear project direction.

Suggested split:

  • PR A: workflow-state layer only

    • main.py, src/manager.py, src/manifest.py, src/run_state.py, src/utils.py
    • focus on run_manifest, rollback, stale downstream invalidation, and state transitions
  • PR B: operator recovery / continuation only

    • src/operator.py + the minimal related manager changes + targeted tests
    • focus on session recovery, failed resume fallback, attempt metadata, handoff/compression if still needed
  • PR C: Stage 07/08 packaging only

    • publication package, review/dissemination artifacts, README updates relevant to that scope, and tests for that slice
  • PR D: platform modules only, if they are still desired

    • src/platform/*, src/knowledge_base.py, src/inspection.py
    • this is a major architectural addition by itself and should be reviewed independently from the CLI workflow changes

Please keep each split PR narrowly scoped, with its own motivation, tests, and validation. In the current form, this is too much surface area for one merge.

2 participants