How do we effectively synergize learning from language by grounding it in other modalities such as vision and motor control?
References
2023
Multimodal Knowledge Alignment with Reinforcement Learning
Youngjae Yu, Jiwan Chung, Heeseung Yun, Jack Hessel, JaeSung Park, Ximing Lu, Rowan Zellers, Prithviraj Ammanabrolu, Ronan Le Bras, Gunhee Kim, and Yejin Choi
In Conference on Computer Vision and Pattern Recognition (CVPR), 2023
@inproceedings{yu2022esper,
  title     = {Multimodal Knowledge Alignment with Reinforcement Learning},
  author    = {Yu, Youngjae and Chung, Jiwan and Yun, Heeseung and Hessel, Jack and Park, JaeSung and Lu, Ximing and Zellers, Rowan and Ammanabrolu, Prithviraj and Bras, Ronan Le and Kim, Gunhee and Choi, Yejin},
  booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
  url       = {https://arxiv.org/abs/2205.12630},
  year      = {2023},
}
Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling
Kolby Nottingham, Prithviraj Ammanabrolu, Alane Suhr, Yejin Choi, Hannaneh Hajishirzi, Sameer Singh, and Roy Fox
In International Conference on Machine Learning (ICML), 2023
@inproceedings{Nottingham2023Embodied,
  title     = {Do Embodied Agents Dream of Pixelated Sheep: Embodied Decision Making using Language Guided World Modelling},
  author    = {Nottingham, Kolby and Ammanabrolu, Prithviraj and Suhr, Alane and Choi, Yejin and Hajishirzi, Hannaneh and Singh, Sameer and Fox, Roy},
  booktitle = {International Conference on Machine Learning (ICML)},
  url       = {https://arxiv.org/abs/2301.12050},
  year      = {2023},
}