
[codex] Restore bilinear feature-only policy defaults #14

Merged: fei-yang-wu merged 3 commits into main from feature/ipmd-offline-spectral-pretrain on May 10, 2026


@fei-yang-wu
Owner

Summary

  • Restore the bilinear policy default to feature-only input: F(s)z, without concatenating the raw state.
  • Keep feature values detached for the policy path while allowing the posterior/latent command path to receive policy gradients.
  • Add and update focused coverage for bilinear policy representation, command feature width, and offline SR pretraining behavior.
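
To make the difference between the two input modes concrete, here is a minimal pure-Python sketch. It is not the RLOpt implementation: `feature_matrix`, `bilinear_policy_input`, and the `concat_state` flag are hypothetical names, and it assumes F(s) is a (feature_dim x latent_dim) matrix applied to a latent vector z. Gradient detachment is not modeled here; the sketch only shows how the policy input width changes when the raw state is no longer concatenated.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def feature_matrix(state: Sequence[float], latent_dim: int, feature_dim: int) -> Matrix:
    """Hypothetical stand-in for a learned feature map F(s).

    Returns a (feature_dim x latent_dim) matrix that depends on the state.
    """
    total = sum(state)
    return [
        [total * (row + 1) + col for col in range(latent_dim)]
        for row in range(feature_dim)
    ]


def bilinear_policy_input(
    state: Sequence[float],
    z: Sequence[float],
    feature_dim: int,
    concat_state: bool = False,
) -> List[float]:
    """Build the policy input as F(s)z, optionally prefixed with the raw state."""
    F = feature_matrix(state, len(z), feature_dim)
    # Matrix-vector product F(s) z: one output entry per feature row.
    fz = [sum(f * zi for f, zi in zip(row, z)) for row in F]
    if concat_state:
        # Assumed non-default behavior: raw state concatenated with F(s)z.
        return list(state) + fz
    # Feature-only input: just F(s)z.
    return fz


state = [1.0, 2.0, 3.0]
z = [1.0, 0.0]
print(len(bilinear_policy_input(state, z, feature_dim=4)))                     # 4
print(len(bilinear_policy_input(state, z, feature_dim=4, concat_state=True)))  # 7
```

Under these assumptions the feature-only input has width `feature_dim`, while the concatenated variant has width `len(state) + feature_dim`. That width difference is the kind of thing a policy input shape check would catch.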

Validation

  • Focused RLOpt bilinear tests: 8 passed.
  • 1024-env Dance102 smoke test matched the intended latent v0 policy input shape.
  • Skynet 4096-env Dance102 job 3121342 completed successfully.

@fei-yang-wu fei-yang-wu marked this pull request as ready for review May 10, 2026 19:44
@fei-yang-wu fei-yang-wu merged commit 9c47bfa into main May 10, 2026
0 of 2 checks passed
@fei-yang-wu fei-yang-wu deleted the feature/ipmd-offline-spectral-pretrain branch May 10, 2026 19:44

1 participant