feat(policies): WBCPolicy provider for GR00T Whole-Body-Control (SONIC, ONNX) by rmncardoso · Pull Request #483 · strands-labs/robots

rmncardoso · 2026-06-16T22:56:45Z

Summary

Adds WBCPolicy — a new non-VLA policy provider wrapping NVIDIA's GR00T
Whole-Body-Control (SONIC / decoupled-WBC) ONNX controllers for deploy-grade
Unitree G1 locomotion. Clean-room implementation against the upstream
reference (NVlabs/GR00T-WholeBodyControl,
decoupled_wbc/sim2mujoco/scripts/run_mujoco_gear_wbc.py +
resources/robots/g1/g1_gear_wbc.yaml). Closes #466.

Validated end-to-end against the real shipped weights
(GR00T-WholeBodyControl-{Balance,Walk}.onnx): through the torque-control deploy
harness the G1 produces a stable forward walk (~0.38 m/s for a 0.5 m/s
command, height held, no fall), and holds balance in place on a zero command.

What's included

strands_robots/policies/wbc/ — WBCPolicy (provider wbc, shorthand
sonic), WBCConfig, the observation builder, and pure-NumPy control/quat
helpers.
- requires_images = False; reads the goal from the well-known kwargs
  (target_velocity = [vx, vy, omega], optional target_orientation for base
  RPY, optional height). Drives the 15 leg+waist DOFs; the 14 arm joints
  are held at defaults (upper-body composition is left to a future
  CompositePolicy, feat(policies): CompositePolicy - stack policies across DOF groups #468).
- Faithful to the reference contract: 86-dim observation
  (command[7] + base_ang_vel[3] + projected_gravity[3] + qj[29] + dqj[29] + prev_action[15]) stacked over obs_history_len=6 → 516-wide network input;
  whole-body qj/dqj (all 29 joints, not just the 15 controlled); two ONNX
  sessions (policy + walk_policy) selected by norm(raw velocity) <= 0.05;
  PD→torque law via compute_torques(...); zero-warm-started history deque;
  quaternion math numerically identical to upstream (verified to 1e-10).
- WBCConfig loads the upstream g1_gear_wbc.yaml directly (JSON or YAML;
  flat *_scale keys normalised). Joints resolved by name (handles the
  G1's leading floating_base_joint + interleaved arms).
- No weights bundled — fetched/pointed-at at runtime under the NVIDIA Open
  Model License.
Precursor (feat(sim)): a per-call policy_kwargs channel on
SimEngine.run_policy / start_policy (+ MuJoCo overrides), forwarded
verbatim to policy.get_actions, so the docs(policies): clarify Policy ABC supports VLA + classical planners (#299) #300 well-known goal kwargs reach
non-VLA providers (WBC, cuRobo, MoveIt2) through the local sim path — not just
the mesh. Pinned by tests/simulation/test_policy_kwargs_forwarding.py.
Integration: registered in registry/policies.json; wbc/sonic added
to the mesh + Device Connect policy-provider allowlist so WBC can be driven
over tell() / Device Connect; new [wbc] extra
(onnxruntime + pyyaml + huggingface_hub); NOTICE attribution.
Docs & example: docs/policies/wbc.md (+ mkdocs nav, README rows);
examples/wbc_g1_torque_deploy.py — a torque-control deploy harness with an
importable simulate_rollout (the real weights walk the G1).

Testing

82 WBC unit tests (stub ONNX, no GPU): observation layout, PD/quat
hand-values, walk-selection, history determinism, config/YAML loading, factory
resolution, mesh allowlist, error paths — plus TestUpstreamFidelity pins
transcribed from the upstream YAML/runner.
No-weights harness smoke test (tests/policies/wbc/test_torque_harness.py,
runs in CI with mujoco): ground plane, torque actuators, observation shape,
no fall-through.
Gated real-weights integration (tests_integ/policies/wbc/,
WBC_LIVE=1 + checkpoint): ONNX I/O-dims match config, action shape, load.
The gait-quality + standing-balance checks run the torque-deploy loop behind
WBC_GAIT_CHECK=1 and pass with real weights.
format / lint / mypy clean; zero regressions vs main.

Reviewer notes / scope

Gait-quality tests are opt-in (WBC_GAIT_CHECK=1 + real checkpoint): they
need the torque-control deploy environment, not sim.run_policy's
position-servo path. The harness + the I/O-contract tests are the runnable
evidence; a reviewer with the real weights can flip the flag to watch it walk.
policy_kwargs is scoped to the control/deploy path (run_policy /
start_policy / mesh), not eval_policy / evaluate_benchmark (which are
instruction-driven benchmark entry points). Static-velocity WBC eval works via
the constructor target_velocity. Documented in docs/policies/wbc.md.
Pre-existing observation: the already-merged cosmos3 / curobo / moveit2
providers are also absent from the mesh policy-provider allowlist; this PR only
adds wbc/sonic (the providers it introduces) and leaves the broader gap for
a follow-up.

Follow-up (not in this PR)

An optional SonicPlannerPolicy wrapping the GR00T-WBC kinematic planner
(planner_sonic.onnx: mode/target_vel → qpos[T,36]) — a different SONIC
component, complementary to this tracking controller. Tracked separately.

Review changelog

Round 1 — CodeQL code-scanning findings (commit `d92904f`)

Resolved both code-scanning alerts; no behavioral change, no API change.

Empty except without comment (strands_robots/policies/wbc/policy.py,
per-joint observation reader). Added an explanatory comment: a non-numeric
observation entry intentionally degrades to its pre-zeroed neutral value
(the full qj/dqj block is validated downstream), rather than aborting the
observation build.
Module imported with both import and import from
(tests/policies/wbc/test_policy.py). Consolidated to a single top-level
from strands_robots.policies.wbc import policy as wbc_policy; the
monkeypatch (require_optional) and _looks_like_hf_repo_id heuristic
checks now reference the module object consistently.

Validation: ruff check + ruff format --check clean; WBC unit suite
73 passed / 1 skipped.

cagataycali · 2026-06-17T06:48:28Z

Test report (aarch64 / MuJoCo EGL headless)

Functionally this looks solid, but the lint gate (mypy strands_robots tests tests_integ from hatch run lint) is not clean — test-lint.yml will fail on 6 type errors in the PR's own test files. The PR description says "format / lint / mypy clean"; that does not reproduce here (mypy 1.20.2).

What passes

tests/policies/wbc/ + tests/simulation/test_policy_kwargs_forwarding.py: 82 passed (incl. the mujoco torque-harness smoke tests and the 4 kwargs-forwarding tests — all run, none skipped).
Provider wiring: wbc is in list_providers(); the sonic shorthand resolves to wbc via resolve_policy.
No regressions in tests/simulation/test_policy_runner.py (30 passed, 2 skipped) from the policy_runner.py / base.py / mujoco/simulation.py changes.
ruff check / ruff format --check clean on strands_robots/policies/wbc.

mypy failures (block CI)

tests_integ/policies/wbc/test_wbc_live.py:156: error: "Policy" has no attribute "policy_session"  [attr-defined]
tests_integ/policies/wbc/test_wbc_live.py:159,160,164,165: error: "Policy" has no attribute "config"  [attr-defined]
tests/policies/wbc/test_policy.py:661: error: Incompatible types in assignment (expression has type "list[float]", target has type "float")  [assignment]

Root causes:

create_policy("wbc", ...) is typed to return the Policy base, so .policy_session / .config (WBCPolicy-specific) are invisible to mypy in test_real_onnx_io_dims_match_config.
obs = {k: 0.0 for k in _g1_keys()} infers dict[str, float], then obs["base_ang_vel"] = [0.0, 0.0, 0.1] assigns a list into a float-valued dict.

Minimal fix (verified — `Success: no issues found in 363 source files`, and `tests/policies/wbc/test_policy.py` still 74 passed)

In test_real_onnx_io_dims_match_config, narrow after construction:

policy = create_policy("wbc", checkpoint=CHECKPOINT, walk=True)
assert isinstance(policy, WBCPolicy)   # add
sess = policy.policy_session

(WBCPolicy is already imported in this file.)

At test_policy.py:660, widen the annotation:

obs: dict[str, object] = {k: 0.0 for k in _g1_keys()}

Other

mergeable: CONFLICTING — the only conflict is CHANGELOG.md (upstream collapsed [Unreleased] into a dated v0.4.0 heading in chore(release): collapse Unreleased CHANGELOG sections into dated v0.4.0 heading #484). A rebase + re-anchoring the WBC entry resolves it; no code conflicts.

Real-weights gait checks (WBC_LIVE=1 / WBC_GAIT_CHECK=1) were not exercised here — no GR00T-WBC ONNX checkpoint available — so this report covers the no-weights path only.

rmncardoso · 2026-06-17T10:23:19Z

Rebased on `main` + fixed the CI dependency conflict

1. call-test-lint (fixed): failure was at pip install .[all,dev], not tests/lint — the [wbc] extra pinned huggingface_hub<1.0.0, conflicting with lerobot's >=1.0.0. Widened to >=0.20.0,<2.0.0 (matches lerobot); the full [all,dev] surface now resolves.

2. Merge conflict (resolved): rebased onto current main (incl. the #484 CHANGELOG restructure); WBC entry reconciled under ## [Unreleased]. Now MERGEABLE, 12 linear commits.

Re-verified locally: WBC + integration tests pass, ruff/mypy clean, zero new failures vs main. CI is held on the fork-PR workflow-approval gate — ready for a maintainer to approve + review.

…ctions Non-VLA providers (cuRobo, MoveIt2, and the incoming WBCPolicy) read their goal from the well-known strands-labs#300 kwargs (target_pose / target_joints / target_velocity / world_update). The mesh tell() path already forwards these, but the local-sim path (run_policy/start_policy -> PolicyRunner.run -> get_actions) dropped them: get_actions was called positionally with no kwargs. Add an explicit policy_kwargs dict (default None) threaded through SimEngine.run_policy/start_policy, the MuJoCo overrides, and PolicyRunner.run, forwarded verbatim to every policy.get_actions(obs, instruction, **policy_kwargs) call. Explicit dict over **kwargs per the project's 'reject silently-dropped kwargs' convention. VLA providers ignore unknown kwargs per strands-labs#300 so forwarding is always safe; None reproduces the historical no-kwargs behaviour exactly. Unblocks dynamic per-call goals for WBCPolicy (strands-labs#466) and fixes the same latent gap for cuRobo/MoveIt2 driven through sim.run_policy. Pinned by tests/simulation/test_policy_kwargs_forwarding.py.

…C, ONNX) (closes strands-labs#466) Clean-room non-VLA locomotion provider wrapping NVIDIA's GR00T Whole-Body-Control (SONIC / decoupled-WBC) ONNX controllers for deploy-grade Unitree G1 locomotion. Drives the 15 leg+waist DOFs; arm joints held at defaults (composition with an upper-body policy is left to a future CompositePolicy, strands-labs#468). New package strands_robots/policies/wbc/: - config.py WBCConfig dataclass (paths, gains, dims) + JSON loader, fail-fast on dimension mismatch. - control.py pure-NumPy pd_control / compute_targets / quat_rotate_inverse / projected_gravity (hand-verified, no torch/onnx). - observation.py build_single_frame (exact 86-dim layout) + ObservationHistory deque (warm-started, oldest-first stacking). - policy.py WBCPolicy(Policy): dual ONNX sessions loaded in __init__ (centralised dep check; RuntimeError on missing onnxruntime / checkpoint - never silent zero torques), requires_images=False, reads target_velocity/target_orientation from strands-labs#300 kwargs, explicit WBC_G1_LEG_WAIST_JOINTS actuator<->15-dim mapping with validation in set_robot_state_keys (no positional guessing), compute_torques() public PD helper, reset() clears history. Checkpoint resolution: local path | HF download | cache. Wiring: - registry/policies.json: canonical 'wbc', shorthands ['wbc','sonic']. - pyproject.toml: [wbc] extra = onnxruntime (light, no torch); added to [all]; onnxruntime in mypy ignore; 'wbc' pytest marker. - NOTICE: attributes GR00T-WBC (Apache-2.0) + ONNX Runtime (MIT). - docs/policies/wbc.md + mkdocs nav + README extras/provider rows. Tests: - tests/policies/wbc/test_policy.py: 45 unit tests with a stub ONNX session (no GPU/onnxruntime) - obs layout, PD/quat hand values, action shape, history deque, factory wbc/sonic resolution, missing-dep+missing-model RuntimeError, reset, mapping-table validation, prev-action feedback, checkpoint resolution. - tests_integ/policies/wbc/test_wbc_live.py: gated (WBC_LIVE=1 + checkpoint) real-ONNX G1 forward-walk integration test; skips cleanly otherwise. Builds on the policy_kwargs forwarding fix (ff5dbee) so per-call target_velocity reaches get_actions through sim.run_policy.

…s-labs#466) Tested against the REAL MuJoCo G1 (robot_descriptions) + REAL onnxruntime sessions (not stubs), plus an adversarial code review. Fixes: SHIP-BLOCKER - name-based joint resolution: The real sim's robot_joint_names('unitree_g1') leads with 'floating_base_joint' (the free joint), so the leg+waist joints are at [1:16], not [0:15]. The old set_robot_state_keys validated keys[:15] == WBC_G1_LEG_WAIST_JOINTS and RAISED against the real G1 - run_policy would have crashed. Now resolve the 15 WBC joints BY NAME within the robot's key list (any surrounding free-base/arm joints ignored), and read/emit by those resolved names. Also fixes the flat observation.state path, which now indexes by name instead of a positional [:n] slice. (was hidden because the unit-test fixture hard-coded leg+waist-first; fixture now uses the real layout.) velocity feedback (adversarial strands-labs#4): The MuJoCo backend's unified observation is positions-only (no '<name>.vel', no observation.velocity), so WBC ran open-loop on velocity while a comment falsely claimed it 'converges'. Now read .vel by resolved name when present, and warn ONCE when no velocity signal exists (a dead channel can destabilise the gait) instead of silently pretending otherwise. target_orientation overflow (adversarial strands-labs#1): A target_orientation longer than command_dim-3 overflowed the command block and raised on every tick - a documented kwarg that deterministically crashed. The command is now fitted to command_dim (truncate-with-debug-log). num_actions guard (adversarial strands-labs#8): num_actions > len(WBC_G1_LEG_WAIST_JOINTS) silently truncated the 15-entry mapping table everywhere and failed late with a confusing zip error. Now rejected at construction with a clear message. HF-id heuristic (adversarial strands-labs#6/strands-labs#7): Tightened to _looks_like_hf_repo_id (rejects ./ ../ absolute, backslash, .onnx, >1 slash, non-HF charset) so a non-existent relative dir isn't mistaken for a repo; log BEFORE the network download so an unexpected fetch is visible. Integration test fixes (found by running it for real): - precondition asserted joint_names[:15] positionally (same floating_base bug) -> now checks WBC joints present by name. - get_body_state json block is content[1], not content[0] -> find it robustly. Tests: 59 unit (was 49; +10 regressions incl. the floating_base ordering, the velocity warning, target_orientation no-crash, num_actions guard, HF heuristic), all green. 2 deterministic integration tests pass against real MuJoCo G1 + real ONNX; gait-quality test correctly requires real SONIC weights. Zero regressions vs. pristine baseline (verified: same 20 pre-existing macOS-GL failures on main). format + lint + mypy clean.

…; dup-name first-wins Second adversarial-review pass found two more issues in _read_joint_vector: - strands-labs#1 (BUG): the flat observation.state / observation.velocity branch was gated on 'self._robot_state_keys' being set, so a direct-API / replay caller that passed observation.state but never called set_robot_state_keys got all-zero qj/dqj silently - diverging from cuRobo/MoveIt2, which consume observation.state unconditionally. Now: with keys -> name-resolved indexing (free-base/arm safe); without keys -> consume the flat vector positionally in WBC leg+waist order (the same contract the sibling planners use), zero-padding a short vector. - strands-labs#8 (minor): name-resolved indexing used a dict comprehension (last-occurrence wins), so a duplicated joint name could shift the resolved slot. Now first occurrence wins via setdefault. +2 regression tests (61 total in wbc+forwarding). format/lint/mypy clean.

…y_config static-walk path Two edge cases verified during the second testing round (real onnxruntime + stub introspection): - history_len=3: each tick feeds a 258-wide stack whose blocks are [oldest..newest] over a rolling 3-frame window, warm-filled at tick 0. - mesh tell()/policy_config path: create_policy('wbc', target_velocity=...) sets the static-walk default command (how a command reaches WBC without per-call policy_kwargs). 63 wbc+forwarding tests pass; format/lint/mypy clean.

…s#466 Reviewed against the ACTUAL source strands-labs#466 cites - NVlabs/GR00T-WholeBodyControl decoupled_wbc/sim2mujoco/scripts/run_mujoco_gear_wbc.py + resources/robots/g1/ g1_gear_wbc.yaml (fetched from upstream main) - rather than the paraphrase in the issue body or the SageMaker endpoint (which serves a different SONIC component, the kinematic planner). The reference VINDICATES the core design (single_obs_dim=86, two ONNX sessions, PD law, action_scale, deque history, name-based G1 joints, quaternion math) but revealed concrete contract details my first pass got wrong: - Command block layout (the big one). compute_observation builds the 7-dim command as [vx*cmd_scale, vy*cmd_scale, omega*cmd_scale, height_cmd, roll, pitch, yaw] - NOT zero-padded velocity. Rewrote _resolve_command to produce exactly this; target_orientation now means base RPY (slots [4:7]) and a new 'height' kwarg drives slot [3]. With real weights the old zero-padded command would have been wrong in 6 of 7 slots. - Defaults corrected to g1_gear_wbc.yaml: obs_history_len 1 -> 6 (num_obs 86 -> 516), ang_vel_scale 0.25 -> 0.5. Added cmd_scale [2.0,2.0,0.5] and height_cmd 0.74 config fields. - Walk-vs-main selection: upstream uses norm(RAW loco_cmd) <= 0.05 -> main (standing) policy. Was norm(scaled command) > 1e-6. _run_session now takes the raw velocity and uses the 0.05 threshold (constant _WALK_VELOCITY_THRESHOLD). - Quaternion math verified numerically IDENTICAL to upstream's conjugate-based quat_rotate_inverse (max diff 1.8e-15 over 2000 random quats). Tests: +6 TestUpstreamFidelity pins (config matches yaml, command layout, height default, 0.05 walk threshold on raw velocity, raw-not-scaled selection, quat-matches-upstream-formula). 69 wbc+forwarding pass; 2 integration tests pass against real onnxruntime+MuJoCo G1 with the corrected 516-wide input. Docs updated with the exact observation layout + new kwargs. format/lint/mypy clean.

…onfig Round-4 review (adversarial pass + validation against the REAL shipped weights, which are in the upstream git: GR00T-WholeBodyControl-Balance.onnx /-Walk.onnx, ONNX input [batch,516] output [batch,15] - confirming the corrected 516 default). Observation warm-start (BUG, behavioral on the first obs_history_len-1 ticks): ObservationHistory pre-filled the deque with COPIES of the first frame; the upstream reference (run_mujoco_gear_wbc.py:47-50) pre-fills with ZERO frames and appends the live frame each tick. Since frame 0 carries gravity ~ -1 and height_cmd 0.74, the copies-vs-zeros difference was O(1) in the older history blocks during the first ~5 control steps of every episode (and post-reset). Now zero-warm-started to match upstream exactly; reset() re-seeds zeros. The old unit test + docstring asserted the WRONG behavior - both corrected. Real-weights readiness - YAML config: WBCConfig.from_file now reads .yaml/.yml (via pyyaml, added to [wbc] extra) in addition to .json, so the upstream g1_gear_wbc.yaml loads directly. from_dict normalizes the upstream FLAT scale keys (ang_vel_scale/dof_pos_scale/ dof_vel_scale) into obs_scales; an explicit obs_scales map still wins. Verified by loading the real upstream YAML (516/6, scales, cmd_scale, 15-len vectors). Config validation: cmd_scale must be length 3 (the one previously-unguarded field), matching the per-joint-vector guards. Scope: documented that this targets the NON-gait reference (single_obs_dim 86, 7-wide command, two policies). The upstream gait variant (run_mujoco_gear_wbc_gait.py: single_obs_dim 95, 8-wide command + clock + torso) is a separate embodiment, explicitly out of scope. Integration: added test_real_onnx_io_dims_match_config (pins ONNX input==num_obs, output==num_actions). Reframed the gait-quality test as a deploy-fidelity check gated behind WBC_GAIT_CHECK=1 - WBCPolicy emits POSITION targets; a real gait needs the upstream TORQUE-PD loop (compute_torques) on a torque-actuator model, which is a deploy concern beyond the policy. The 3 deterministic real-weights integration tests pass; gait check skips with a clear explanation. Tests: 73 wbc+forwarding pass (+4 config/YAML); zero regressions vs pristine main (verified by failure-set diff). format/lint/mypy clean.

…d (critical) Found while building the torque-deploy harness, by confronting the decisive upstream math: single_obs_dim is hardcoded 86, but 7+3+3+15+15+15 = 58. The only way to reach 86 is qj/dqj observing ALL joints (upstream n_joints = nq-7 = 29): 7+3+3+29+29+15 = 86. Re-reading compute_observation confirms qj = qpos[7:7+n_joints], dqj = qvel[6:6+n_joints] with n_joints=29, while the action block is the 15 leg+waist outputs. My implementation used num_actions (15) for the qj/dqj blocks, so it populated only 58 of 86 values and put dqj/prev_action in the WRONG slots - the arm-joint observations (28 values) were ZEROS. The real ONNX accepts the 516 width so earlier 'it runs' tests passed, but the network saw a malformed observation (verified: frame[28], a real arm-joint reading, was 0 before and is 0.16 now). Fix - separate the two joint sets: - New WBCConfig.n_obs_joints (default 29) = observed whole-body joints; distinct from num_actions (15) = controlled leg+waist. Validates n_obs_joints>=num_actions. - build_single_frame writes qj/dqj at n_obs_joints width, action at num_actions: [0:c] cmd, [c:c+3] omega, [c+3:c+6] grav, [c+6:c+6+no] qj, [c+6+no:c+6+2no] dqj, [c+6+2no:c+6+2no+na] action. default_angles (15) zero-padded to no for the qj subtraction, exactly as upstream padded_defaults. - New WBC_G1_ALL_JOINTS (29, legs+waist+arms) = the observed-joint order; its first 15 are exactly WBC_G1_LEG_WAIST_JOINTS. set_robot_state_keys now resolves BOTH the controlled (15) and observed (29) sets by name; _extract_state reads all 29 via _read_joint_vector(names=...). Re-validated end-to-end against the REAL GR00T-WholeBodyControl-Balance/Walk.onnx (input 516) - arm-joint observations now populated; the upstream g1_gear_wbc.yaml still loads (n_obs_joints defaults to 29). 76 unit tests pass (+ pins for the 29-vs-15 layout and n_obs_joints>=num_actions); 3 real-weights integration tests pass. format/lint/mypy clean.

Adds examples/wbc_g1_torque_deploy.py: a standalone harness that reproduces the upstream GearWbcController loop closely enough to drive the real GR00T-WholeBodyControl-{Balance,Walk}.onnx weights and watch the G1 locomote - the deploy-fidelity path that sim.run_policy (position-servo) can't exercise. What it does: - Loads the robot_descriptions G1 and converts its position-servo actuators to pure-torque motors in-process (+ adds a ground plane the bare robot MJCF lacks). - Per physics tick: tau = policy.compute_torques(target, q, dq) on the 15 leg+ waist motors; arms held with a stiff PD; mj_step. - Every control_decimation (4) ticks: builds the whole-body 86-dim observation (29-joint qj/dqj + real joint velocities + base IMU) and queries WBCPolicy (Balance when standing, Walk when moving); the returned targets become the PD setpoints. - Reports base x/z over the rollout, a verdict, and an optional MP4. Validated with the real SONIC weights (all errors found were harness setup, not WBCPolicy - the layout fix in the prior commit was the real bug): - vx=0.5, 5s -> base advances +1.89 m, height holds ~0.74 m (WALKED FORWARD) - vx=1.0, 4s -> +3.07 m (faster command, faster gait) - vx=0 stand -> holds balance in place (static-hold PD collapses by ~1s; the policy actively balances) - proving the controller is doing real work - omega / vy -> stays upright Rendered a 167-frame MP4 of the forward walk (real motion, not static). Docs: WBC page gains a 'Watching it walk' section pointing at the harness and explaining the position-servo vs torque-deploy distinction; examples README row.

… + no-weights harness smoke test The gait check was a permanently-skipped placeholder that asserted something the position-servo sim.run_policy path cannot achieve. Replace it with tests that actually validate the torque-deploy gait through the harness loop. Harness refactor: extract a pure, importable simulate_rollout(policy, vx, ..., duration) from the CLI run() - returns {x0,z0,x1,z1,forward,fell,steps,frames}. The CLI and both tests now drive the IDENTICAL loop (no duplicated control code). No-weights smoke test (tests/policies/wbc/test_torque_harness.py, runs in CI with mujoco, stub ONNX): - _build_torque_g1 adds a ground plane (REGRESSION: the bare robot_descriptions G1 MJCF has none -> the robot fell through space, z -> -18) and converts all actuators to pure-torque motors. - _model_observation yields whole-body per-joint positions + .vel + base IMU. - a short static-hold rollout (zero-action stub) stays above the floor and does not fall through the world; rollout returns the documented metric keys. Real-weights gait test (gated WBC_GAIT_CHECK=1 + WBC_CHECKPOINT): now calls simulate_rollout with the real Balance/Walk weights and asserts forward progress + no fall - and PASSES (verified: +1.5-1.9 m forward, height held). Added a companion standing-balance test (zero command holds height). Both skip cleanly without the env flag. 80 WBC unit tests pass (+4 harness smoke); gated gait+balance pass with real weights; integration suite skips cleanly in CI; zero regressions; format/lint clean.

…, docs) Final pre-PR review pass (whole changeset + adversarial review) found and fixed: - PACKAGING (ship-blocker): the [wbc] extra was missing huggingface_hub, but the default create_policy('wbc') downloads nvidia/GEAR-SONIC via it - so a clean 'pip install strands-robots[wbc]' + zero-arg construct raised 'huggingface_hub not installed (the [wbc] extra)' while [wbc] didn't carry it. Added huggingface_hub>=0.20,<1.0 to the extra. - SECURITY/INTEGRATION (real gap): 'wbc'/'sonic' were absent from the mesh + Device Connect policy-provider allowlist (_DEFAULT_POLICY_TYPES), so tell(..., policy_provider='wbc') and the DC drivers rejected WBC at validate_command / is_safe_policy_provider. Added them + 2 regression tests (allowlist membership + validate_command accepts a wbc execute). - DOCS (stale, safety-relevant): observation.py module docstring still described the OLD 15-wide qj/dqj / 58-padded layout; corrected to the real 29-joint whole-body layout (7+3+3+29+29+15=86, no padding) matching the function docstring and code. Reserved-tail comment clarified (no tail by default). - CHANGELOG: added a WBCPolicy 'Unreleased' entry (matching the existing format). - Scope note in docs/policies/wbc.md: policy_kwargs is on the control/deploy path (run_policy/start_policy/mesh), not eval_policy/evaluate_benchmark (instruction-driven); static-velocity WBC eval uses the constructor. This is a deliberate scope decision, now documented rather than silent. - Cosmetics: WBC_G1_ALL_JOINTS added to policy.py __all__; _build_torque_g1 docstring corrected (4-tuple, not a phantom 5-tuple). Verified: 82 WBC unit tests pass; mesh security suite 231 pass; real-weights integration (incl. gait + balance) 5 pass; full-repo regression diff vs pristine main is EMPTY (zero new failures); format/lint/mypy clean.

CI dependency-resolution failure: the [wbc] extra pinned huggingface_hub <1.0.0, but lerobot 0.5.x requires huggingface-hub>=1.0.0. Since [wbc] is pulled into [all], 'pip install .[all,dev]' (what CI installs) hit ResolutionImpossible. Widen to >=0.20.0,<2.0.0 (cap the major per AGENTS.md dependency-bounds rule), matching lerobot's own <2.0.0 bound so the two co-resolve (verified: resolves to huggingface-hub 1.19.0 + lerobot 0.5.1). The <1.0.0 cap was an over-tightening introduced when adding huggingface_hub to the extra.

…t style) Address two CodeQL code-scanning findings on the WBC policy PR: - policy.py: the empty except (TypeError, ValueError) in the per-joint observation reader now carries an explanatory comment documenting that a non-numeric entry intentionally degrades to its pre-zeroed neutral value (the full qj/dqj block is validated downstream). - test_policy.py: the wbc.policy module was imported both with 'import ... as' and 'from ... import' across test methods. Consolidate to a single top-level 'from strands_robots.policies.wbc import policy as wbc_policy' and reference require_optional / _looks_like_hf_repo_id through the module object. No behavioral change. 73 passed / 1 skipped; ruff check + format clean.

CI runs 'mypy strands_robots tests tests_integ' over the whole tree; my local mypy was scoped to the wbc source files and missed 6 errors in the tests: - tests_integ/policies/wbc/test_wbc_live.py: create_policy() is typed to return the base Policy, so .policy_session / .config (WBCPolicy attrs) tripped [attr-defined] (x5). Added the isinstance(policy, WBCPolicy) narrowing the other tests already use - also a genuinely correct assertion. - tests/policies/wbc/test_policy.py: obs inferred as dict[str, float] from the comprehension, then assigned a list base_ang_vel -> [assignment]. Annotated that obs as dict[str, Any]. Verified with the CI-equivalent invocation: mypy clean across all wbc source + test + integ files; ruff + format clean; 82 WBC tests still pass.

rmncardoso · 2026-06-17T23:36:58Z

Thanks — spot-on, all three addressed. My miss: I ran mypy scoped to wbc/ locally instead of the whole-tree check CI uses.

mypy: fixed as you suggested — assert isinstance(policy, WBCPolicy) in test_real_onnx_io_dims_match_config, and widened the obs annotation (used dict[str, Any] to match Policy.get_actions's signature; either works).
CHANGELOG conflict: rebased onto current main (incl. chore(release): collapse Unreleased CHANGELOG sections into dated v0.4.0 heading #484) and re-anchored the entry under [Unreleased]. The huggingface_hub<1.0.0↔lerobot conflict is also fixed (widened to <2.0.0).

Re-verified CI-style: mypy/ruff/format clean, 82 tests pass. FWIW the WBC_GAIT_CHECK=1 path against the real GR00T-WBC weights also passes (G1 walks ~1.5–1.9 m, holds balance) — just opt-in since it needs a checkpoint. Ready for a re-run.

github-advanced-security AI found potential problems Jun 17, 2026

View reviewed changes

Comment thread strands_robots/policies/wbc/policy.py

Comment thread tests/policies/wbc/test_policy.py Outdated

rmncardoso force-pushed the feat/wbc-policy branch from 15492df to 2c8bbbb Compare June 17, 2026 08:49

cagataycali added this to Strands Labs - Robots Jun 17, 2026

github-project-automation Bot moved this to Backlog in Strands Labs - Robots Jun 17, 2026

cagataycali moved this from Backlog to In review in Strands Labs - Robots Jun 17, 2026

rmncardoso and others added 14 commits June 18, 2026 00:30

rmncardoso force-pushed the feat/wbc-policy branch from 8fc2ddc to 53ee9ab Compare June 17, 2026 23:30

cagataycali approved these changes Jun 18, 2026

View reviewed changes

cagataycali merged commit 7ec9e3c into strands-labs:main Jun 18, 2026
9 checks passed

github-project-automation Bot moved this from In review to Done in Strands Labs - Robots Jun 18, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(policies): WBCPolicy provider for GR00T Whole-Body-Control (SONIC, ONNX)#483

feat(policies): WBCPolicy provider for GR00T Whole-Body-Control (SONIC, ONNX)#483
cagataycali merged 14 commits into
strands-labs:mainfrom
rmncardoso:feat/wbc-policy

rmncardoso commented Jun 16, 2026 •

edited by cagataycali

Loading

Uh oh!

cagataycali commented Jun 17, 2026

Uh oh!

Uh oh!

Uh oh!

rmncardoso commented Jun 17, 2026

Uh oh!

rmncardoso commented Jun 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

rmncardoso commented Jun 16, 2026 • edited by cagataycali Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's included

Testing

Reviewer notes / scope

Follow-up (not in this PR)

Review changelog

Round 1 — CodeQL code-scanning findings (commit d92904f)

Uh oh!

cagataycali commented Jun 17, 2026

Test report (aarch64 / MuJoCo EGL headless)

What passes

mypy failures (block CI)

Minimal fix (verified — Success: no issues found in 363 source files, and tests/policies/wbc/test_policy.py still 74 passed)

Other

Uh oh!

Uh oh!

Uh oh!

rmncardoso commented Jun 17, 2026

Rebased on main + fixed the CI dependency conflict

Uh oh!

rmncardoso commented Jun 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

rmncardoso commented Jun 16, 2026 •

edited by cagataycali

Loading

Round 1 — CodeQL code-scanning findings (commit `d92904f`)

Minimal fix (verified — `Success: no issues found in 363 source files`, and `tests/policies/wbc/test_policy.py` still 74 passed)

Rebased on `main` + fixed the CI dependency conflict