Skip to content

Rollout-host-only: 3.10 toolchain, ROLLOUT_INSTALL, rollout-requirements#402

Open
rl2aloha wants to merge 2 commits into
elmo/pi-rollout-fixfrom
elmo/pi-rollout-local
Open

Rollout-host-only: 3.10 toolchain, ROLLOUT_INSTALL, rollout-requirements#402
rl2aloha wants to merge 2 commits into
elmo/pi-rollout-fixfrom
elmo/pi-rollout-local

Conversation

@rl2aloha

@rl2aloha rl2aloha commented May 8, 2026

Copy link
Copy Markdown
Contributor

Rollout-host-only: 3.10 toolchain, ROLLOUT_INSTALL, rollout-requirements

Local-only branch that should NOT be merged. Carries the patches the
rollout host needs to bring up the policy:

  • pyproject.toml: relax requires-python to >=3.10 (ARX arx5_interface
    ships as a cpython-310 .so, so the rollout host can't be 3.11) and drop
    zarr==3.1.5 from the runtime deps (training-only).
  • pull_models.sh: REMOTE_PATH bumped to the EgoVerse4 logs dir.
  • rollout-requirements.txt: trimmed sibling of requirements.txt with
    zarr removed and 3.10-friendly pins.
  • uv.lock: 3.10-resolved lockfile (used to seed openpi's pip install
    via uv export -c).
  • ROLLOUT_INSTALL.md: end-to-end setup guide for the rollout host
    (Python 3.10, openpi fork, ARX bindings, HF auth, checkpoint conversion).

External submodule pointer for external/openpi is intentionally left
uncommitted here; commit inside the submodule first if you want to
freeze the openpi rollout-fork rev.

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

ROLLOUT_INSTALL.md: document actual openpi state, not aspirational

The "Branches" section claimed external/openpi is on a fork branch with
the rollout patches "already committed" — it isn't. The submodule is
detached at upstream 981483d with the patches as uncommitted
working-tree edits, plus an untracked scripts/patch_transformers.py.

The fork has a pi-rollout-changes branch with overlapping patches, but
it diverges from this host's working tree (adds chex, loosens numpy /
opencv-python, patches pi0_pytorch.py; missing .python-version,
packages/openpi-client/pyproject.toml, uv.lock, patch_transformers.py).
Calling out the divergence so future-us doesn't merge them blindly.

Replaces the prior "checked out on a rollout branch" wording with an
explicit "openpi patches to apply on top of 981483d" checklist —
file-by-file, what to change and why — so the setup is reproducible
even without a clean fork branch.

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

rl2aloha commented May 8, 2026

Copy link
Copy Markdown
Contributor Author

rl2aloha and others added 2 commits May 29, 2026 18:25
Local-only branch that should NOT be merged. Carries the patches the
rollout host needs to bring up the policy:

- pyproject.toml: relax `requires-python` to `>=3.10` (ARX `arx5_interface`
  ships as a cpython-310 .so, so the rollout host can't be 3.11) and drop
  `zarr==3.1.5` from the runtime deps (training-only).
- pull_models.sh: REMOTE_PATH bumped to the EgoVerse4 logs dir.
- rollout-requirements.txt: trimmed sibling of requirements.txt with
  zarr removed and 3.10-friendly pins.
- uv.lock: 3.10-resolved lockfile (used to seed openpi's pip install
  via `uv export -c`).
- ROLLOUT_INSTALL.md: end-to-end setup guide for the rollout host
  (Python 3.10, openpi fork, ARX bindings, HF auth, checkpoint conversion).

External submodule pointer for `external/openpi` is intentionally left
uncommitted here; commit inside the submodule first if you want to
freeze the openpi rollout-fork rev.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "Branches" section claimed external/openpi is on a fork branch with
the rollout patches "already committed" — it isn't. The submodule is
detached at upstream 981483d with the patches as uncommitted
working-tree edits, plus an untracked scripts/patch_transformers.py.

The fork has a pi-rollout-changes branch with overlapping patches, but
it diverges from this host's working tree (adds chex, loosens numpy /
opencv-python, patches pi0_pytorch.py; missing .python-version,
packages/openpi-client/pyproject.toml, uv.lock, patch_transformers.py).
Calling out the divergence so future-us doesn't merge them blindly.

Replaces the prior "checked out on a rollout branch" wording with an
explicit "openpi patches to apply on top of 981483d" checklist —
file-by-file, what to change and why — so the setup is reproducible
even without a clean fork branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown

Claude Code Review

Review

Summary

Local-only branch carrying rollout-host setup (Python 3.10, ARX bindings, trimmed deps, install guide). The PR description explicitly states this should not be merged — it's meant to live as a host-local branch.

Key concerns

  1. Should not be merged, per author's own description. The PR title and body both flag this as a local-only branch. Treat this PR as "comment / do not merge" — if it lands on main, it will:

    • Relax requires-python to >=3.10, which contradicts the team's stated 3.11 + uv "emimic" environment convention.
    • Remove zarr==3.1.5 from runtime deps. The training stack (ZarrWriter, rldb data loading) imports zarr at module load — every training run on a fresh env would break.
    • Replace uv.lock with a 3.10-resolved lockfile, dragging in async-timeout, tomli, py3.10 wheels, and dual-resolved numpy/scipy/pandas/contourpy markers. Re-resolving on 3.11 after a merge is non-trivial and risks subtle pin drift.
  2. pull_models.sh REMOTE_PATH bump (EgoVerseEgoVerse4) is the one change here that looks generally useful, but it's host-specific (paphiwetsa3's PACE home) and tied to whatever "EgoVerse4" project dir is current. If we ever want this on main, it should be parameterized via env var, not hardcoded.

  3. Missing trailing newline in pull_models.sh (\ No newline at end of file). Minor, but pre-commit/ruff will flag it.

  4. rollout-requirements.txt diverges from pyproject.toml in ways that look unintentional even for a rollout file:

    • Pins torch==2.6.0 / torchvision==0.21.0, but pyproject.toml (and ROLLOUT_INSTALL Step 5) pin torch==2.7.1 / torchvision==0.22.1. Step 3 also installs openpi which pins torch 2.7.1, so pip will either upgrade torch (invalidating the requirements file) or conflict.
    • transformers==4.57.3 here vs. transformers==4.53.2 in pyproject.toml. The patched-transformers step (Step 4) is sensitive to transformers minor versions — this mismatch is exactly the kind of thing that silently invalidates transformers_replace/.
    • Duplicate entries (sqlalchemy, boto3 listed twice).
  5. ROLLOUT_INSTALL.md is honest about the openpi divergence (good — the revised commit message explicitly calls this out), but the resulting setup is not reproducible from version control: it relies on uncommitted working-tree edits on a detached HEAD plus an untracked scripts/patch_transformers.py. If the rollout host disk is lost, this guide is not sufficient to rebuild — the patch_transformers.py script content isn't captured anywhere.

Suggestions

  • Keep this branch local. Add a [DO NOT MERGE] prefix to the PR title, or convert to draft, so it doesn't get accidentally squashed in. Consider closing the PR entirely and just keeping the branch.
  • Fix the torch/transformers version mismatches in rollout-requirements.txt so it's actually consistent with pyproject.toml and the install guide. Right now Step 2 → Step 3 → Step 5 each undo each other's pins.
  • Dedupe rollout-requirements.txt (sqlalchemy, boto3).
  • Commit scripts/patch_transformers.py somewhere in egomimic (e.g. egomimic/rollout/patch_transformers.py) so the rollout setup is reproducible without "copy this from another host."
  • Push the openpi patches as a real branch on the GaTech-RL2 fork (pi-rollout-host) and pin the submodule to that commit. The guide's own "If you finish a rollout setup..." note acknowledges this; doing it now would save the next person hours.
  • Parameterize pull_models.sh via REMOTE_PATH="${REMOTE_PATH:-/storage/.../EgoVerse4/logs/pick_place/}" so updating the project dir doesn't require a code change.
  • For the requires-python relaxation specifically: if there's any future intent to land 3.10 support on main, audit egomimic/ for 3.11-only syntax (Self, datetime.UTC, PEP 695 type X = ..., tomllib) — the openpi patch list in the guide shows these do appear in adjacent codebases.

Verdict: Comment (do not merge)

Branch is correctly scoped as local-only and the ROLLOUT_INSTALL writeup is genuinely useful documentation. The concerns above are about making the rollout setup more reproducible, not about gating the branch — it shouldn't land on main in its current form regardless.


Reviewed by Claude · Review workflow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant