Social Value Alignment

Aligning AI agents to social commonsense norms and values.

Social value alignment refers to creating agents whose behaviors conform to the moral and social norms expected in a given context and by a given group of people. In our case, it means agents that behave in ways that are less harmful and more beneficial to themselves and others.

References

2022

Aligning to Social Norms and Values in Interactive Narratives

Prithviraj Ammanabrolu, Liwei Jiang, Maarten Sap, Hannaneh Hajishirzi, and Yejin Choi

In North American Chapter of the Association for Computational Linguistics (NAACL), Jun 2022

@inproceedings{ammanabrolu2022aligning,
  title = {Aligning to Social Norms and Values in Interactive Narratives},
  author = {Ammanabrolu, Prithviraj and Jiang, Liwei and Sap, Maarten and Hajishirzi, Hannaneh and Choi, Yejin},
  booktitle = {North American Chapter of the Association for Computational Linguistics (NAACL)},
  url = {https://arxiv.org/abs/2205.01975},
  year = {2022},
}

Quark: Controllable Text Generation with Reinforced Unlearning

Ximing Lu, Sean Welleck, Liwei Jiang, Jack Hessel, Lianhui Qin, Peter West, Prithviraj Ammanabrolu, and Yejin Choi

In Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS), 2022

@inproceedings{lu2022quark,
  title = {Quark: Controllable Text Generation with Reinforced Unlearning},
  author = {Lu, Ximing and Welleck, Sean and Jiang, Liwei and Hessel, Jack and Qin, Lianhui and West, Peter and Ammanabrolu, Prithviraj and Choi, Yejin},
  booktitle = {Thirty-sixth Conference on Neural Information Processing Systems (NeurIPS)},
  url = {https://arxiv.org/abs/2205.13636},
  year = {2022},
}