Multimodal Language Grounding | PEARLS Lab @ UCSD

How can we effectively combine learning from language with grounding in other modalities such as vision and motor control?

References

2025

Collaborating Action by Action: A Multi-agent LLM Framework for Embodied Reasoning

Isadora White, Kolby Nottingham, Ayush Maniar, Max Robinson, Hansen Lillemark, Mehul Maheshwari, Lianhui Qin, and Prithviraj Ammanabrolu

arXiv preprint arXiv:2504.17950, 2025

@article{white2025collaborating,
  title = {Collaborating Action by Action: A Multi-agent LLM Framework for Embodied Reasoning},
  author = {White, Isadora and Nottingham, Kolby and Maniar, Ayush and Robinson, Max and Lillemark, Hansen and Maheshwari, Mehul and Qin, Lianhui and Ammanabrolu, Prithviraj},
  journal = {arXiv preprint arXiv:2504.17950},
  year = {2025},
  url = {https://arxiv.org/abs/2504.17950},
}

2023

Multimodal Knowledge Alignment with Reinforcement Learning

Youngjae Yu, Jiwan Chung, Heeseung Yun, Jack Hessel, JaeSung Park, Ximing Lu, Rowan Zellers, Prithviraj Ammanabrolu, Ronan Le Bras, Gunhee Kim, and Yejin Choi

In Conference on Computer Vision and Pattern Recognition (CVPR), 2023

@inproceedings{yu2022esper,
  title = {Multimodal Knowledge Alignment with Reinforcement Learning},
  author = {Yu, Youngjae and Chung, Jiwan and Yun, Heeseung and Hessel, Jack and Park, JaeSung and Lu, Ximing and Zellers, Rowan and Ammanabrolu, Prithviraj and Le Bras, Ronan and Kim, Gunhee and Choi, Yejin},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  url = {https://arxiv.org/abs/2205.12630},
  year = {2023},
}

Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling

Kolby Nottingham, Prithviraj Ammanabrolu, Alane Suhr, Yejin Choi, Hannaneh Hajishirzi, Sameer Singh, and Roy Fox

In International Conference on Machine Learning (ICML), 2023

@inproceedings{Nottingham2023Embodied,
  title = {Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling},
  author = {Nottingham, Kolby and Ammanabrolu, Prithviraj and Suhr, Alane and Choi, Yejin and Hajishirzi, Hannaneh and Singh, Sameer and Fox, Roy},
  booktitle = {International Conference on Machine Learning (ICML)},
  url = {https://arxiv.org/abs/2301.12050},
  year = {2023},
}