motional/SpanVLA

SpanVLA

This is the official implementation of the paper:

SpanVLA: Efficient Action Bridging and Learning from Negative-Recovery Samples for Vision-Language-Action Model

Zewei Zhou1,2*, Ruining Yang2,3*, Xuewei (Tony) Qi2†, Yiluan Guo2, Sherry X. Chen2, Tao Feng2, Kateryna Pistunova2, Yishan Shen2, Lili Su3, Jiaqi Ma1

1 University of California, Los Angeles, USA | 2 Motional, USA | 3 Northeastern University, USA

* Equal contribution. † Corresponding author.

SpanVLA introduces efficient action bridging with a sparse KV-cache and history initialization, and learns from negative-recovery samples to improve robustness and performance.

News

  • 2026/04: SpanVLA paper is now released.

Release Plan

  • 2026/04: ✅ SpanVLA paper.
  • 2026/09: SpanVLA codebase.
  • 2026/12: mReasoning dataset.

Acknowledgements

We would like to express our gratitude to Qian Zhu, Haram Kim, and Baoshu Qi for their extensive efforts in data preparation and annotation of the mReasoning dataset. Special thanks also go to Muhammad Taufik Tirtosudiro and Jiong Yang for their support in developing the evaluation pipeline. We also thank Nitin Kapania, Sourabh Vora, and Balajee Kannan for their strong support of the project.

Citation

If you find this repository useful for your research, please consider giving us a star 🌟 and citing our paper.

@article{zhou2026spanvla,
 author = {Zhou, Zewei and Yang, Ruining and Qi, Xuewei and Guo, Yiluan and Chen, Sherry X. and Feng, Tao and Pistunova, Kateryna and Shen, Yishan and Su, Lili and Ma, Jiaqi},
 title = {SpanVLA: Efficient Action Bridging and Learning from Negative-Recovery Samples for Vision-Language-Action Model},
 journal = {arXiv preprint arXiv:2604.19710},
 year = {2026},
}
