Add MicroAGI embodiment with data configs; fix viz_language image-key resolution #495

Draft
marcopepunkt wants to merge 2 commits into GaTech-RL2:main from MicroAGI-Labs:microagi-embodiment

What

Registers MicroAGI egocentric capture as a new Human embodiment and
fixes a key-naming mismatch that broke viz_language.py with the existing
keypoint viz configs.

New embodiment: microagi_* (IDs 15–17)

  • Embodiment class with keymaps
    (`images.front_1`, `obs_head_pose`, `left/right.obs_{ee_pose,wrist_pose,keypoints}`
    in SLAM world frame), per-episode intrinsics read from
    `zarr.attrs["intrinsics"]` (with constant fallback), and standard MediaPipe/MANO
    21-keypoint ordering inherited from the `Human` base
  • Hydra data config (microagi_keypoints.yaml, mirrors aria_keypoints.yaml)
    and viz entries for evaluator/viz/keypoints*.yaml
  • CONTRIBUTING_DATA.md: embodiment ID table
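
The per-episode intrinsics lookup with a constant fallback can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name `load_intrinsics`, the flat `[fx, fy, cx, cy]` layout, and the fallback values are all assumptions, and a plain dict stands in for the zarr attributes mapping.

```python
from typing import Any, Mapping, Tuple

# Hypothetical fallback (fx, fy, cx, cy); the constants used in the PR are not shown here.
DEFAULT_INTRINSICS: Tuple[float, float, float, float] = (500.0, 500.0, 320.0, 240.0)


def load_intrinsics(
    attrs: Mapping[str, Any],
    default: Tuple[float, float, float, float] = DEFAULT_INTRINSICS,
) -> Tuple[float, ...]:
    """Return (fx, fy, cx, cy) from episode attributes, or the constant default."""
    raw = attrs.get("intrinsics")
    if raw is None:
        # Episode carries no intrinsics: fall back to the constant.
        return tuple(default)
    values = tuple(float(v) for v in raw)
    if len(values) != 4:
        # Assumed layout is a flat 4-vector; anything else is rejected loudly.
        raise ValueError(f"expected 4 intrinsics values, got {len(values)}")
    return values


# A dict stands in for the attributes of one episode.
episode_attrs = {"intrinsics": [610.0, 611.5, 318.2, 242.7]}
per_episode = load_intrinsics(episode_attrs)
fallback = load_intrinsics({})
```

Rejecting a malformed entry instead of silently falling back is a design choice of this sketch; it keeps a bad episode from being rendered with the wrong camera model.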

Fix: viz_language.py resolved zero videos with keypoints*.yaml

Those viz configs use flat image keys (front_img_1) written for the trainer
path, where HPT flattens dotted batch keys. viz_language.py reads raw
dataloader batches (observations.images.front_img_1), so every batch raised
KeyError and runs silently wrote 0 videos. Missing image keys are now matched
by their last dotted segment — the inverse of HPT's flattening, scoped to the
image key only so dotted-key configs (cotrain_lang.yaml) are unchanged.
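
A minimal sketch of the last-segment matching described above, assuming the batch is a flat mapping keyed by dotted strings. The helper name `resolve_image_key` is hypothetical, and raising on an ambiguous match is an assumption of this sketch rather than documented behavior of viz_language.py.

```python
from typing import Any, Mapping


def resolve_image_key(batch: Mapping[str, Any], image_key: str) -> str:
    """Return the batch key to read the image from.

    An exact match wins, so configs with full dotted keys are unchanged.
    Otherwise, match batch keys whose last dotted segment equals image_key.
    """
    if image_key in batch:
        return image_key
    matches = [k for k in batch if k.rsplit(".", 1)[-1] == image_key]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise KeyError(f"image key {image_key!r} not found in batch")
    # Assumption: more than one candidate is treated as an error.
    raise KeyError(f"image key {image_key!r} is ambiguous: {sorted(matches)}")


# Raw dataloader batch with dotted keys (values elided for brevity).
batch = {
    "observations.images.front_img_1": "<image tensor>",
    "observations.left.obs_keypoints": "<keypoints>",
}
flat = resolve_image_key(batch, "front_img_1")
dotted = resolve_image_key(batch, "observations.images.front_img_1")
```

Both calls resolve to the same dotted key, which is why flat-key configs and dotted-key configs can share the one code path.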

Validation

  • Episodes pass the full CONTRIBUTING_DATA.md spec check (arrays, shapes,
    unit-norm quaternions, 30 fps timestamps, JPEG q85, annotations)
  • World-frame correctness verified by projecting stored keypoints through
    inverse(obs_head_pose) + intrinsics: skeletons land on the hands
  • Rendered side by side with aria/mecka/scale sample episodes through the same
    pipeline — identical overlay behavior
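
The world-frame check in the second bullet amounts to a pinhole reprojection. The sketch below shows the arithmetic under several assumptions that the PR does not state: the head pose is already available as a camera-to-world rotation matrix `R_wc` and translation `t_wc`, the head frame coincides with the camera frame, the camera looks along +z, and intrinsics are `(fx, fy, cx, cy)`. The function name is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def project_world_point(
    p_world: Sequence[float],
    R_wc: Sequence[Sequence[float]],
    t_wc: Sequence[float],
    intrinsics: Tuple[float, float, float, float],
) -> Optional[Tuple[float, float]]:
    """Project a world-frame 3D point to pixel coordinates.

    Applies the inverse rigid transform p_cam = R_wc^T (p_world - t_wc),
    then the pinhole model u = fx * x / z + cx, v = fy * y / z + cy.
    Returns None for points at or behind the camera plane.
    """
    fx, fy, cx, cy = intrinsics
    d = [p_world[i] - t_wc[i] for i in range(3)]
    # Multiply by the transpose: column j of R_wc dotted with d.
    x, y, z = (sum(R_wc[i][j] * d[i] for i in range(3)) for j in range(3))
    if z <= 0.0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
K = (500.0, 500.0, 320.0, 240.0)  # illustrative intrinsics only

# A keypoint 2 m straight ahead of the camera lands on the principal point.
center = project_world_point((0.0, 0.0, 2.0), IDENTITY, (0.0, 0.0, 0.0), K)
```

If the stored keypoints were in the wrong frame, the projected skeleton would drift off the hands as the head moves, which is what makes this a useful visual check.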

marcopepunkt added 2 commits June 11, 2026 21:08
  - Microagi Human-sibling embodiment with keymaps, transform list,
    intrinsics fallback, MediaPipe / MANO keypoint connectivity
  - microagi_bimanual / right_arm / left_arm embodiment IDs + Hydra data/viz configs
  - CONTRIBUTING_DATA.md embodiment table updates

  Co-authored-by: Aristotelis-Sib <aristotelis98@gmail.com>
The evaluator/viz configs (keypoints.yaml, keypoints_wrist.yaml) specify
image_key as a flat name like 'front_img_1' because they were written for
the trainer's EvalVideo path, where HPT flattens dotted batch keys to
their last segment before viz. viz_language operates on raw dataloader
batches, where the same image is keyed 'observations.images.front_img_1',
so those configs raised KeyError on every batch and produced zero videos.

Resolve a missing image_key by matching batch keys on their last dotted
segment — the inverse of HPT's flattening. Only the image key is resolved
(wholesale flattening would collide, e.g. left/right.obs_keypoints), and
only when the exact key is absent, so configs using full dotted keys
(cotrain_lang.yaml) are unaffected.
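
To illustrate why wholesale flattening is avoided, the sketch below groups batch keys by their last dotted segment and reports the segments shared by more than one key. The key names are illustrative (the exact dotted prefixes are an assumption) and the helper name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def last_segment_collisions(keys: Iterable[str]) -> Dict[str, List[str]]:
    """Map each last dotted segment shared by several keys to those keys."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for key in keys:
        groups[key.rsplit(".", 1)[-1]].append(key)
    return {seg: full for seg, full in groups.items() if len(full) > 1}


# Illustrative raw batch keys; the real prefixes may differ.
batch_keys = [
    "observations.images.front_img_1",
    "observations.left.obs_keypoints",
    "observations.right.obs_keypoints",
]
collisions = last_segment_collisions(batch_keys)
```

Here `obs_keypoints` maps to both the left and right keys, so flattening every key would let one hand overwrite the other, while the image key stays unique and can be resolved safely.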