https://arxiv.org/abs/2307.04657
BeaverTails: Towards Improved Safety Alignment of LLM via a Human-Preference Dataset (Jiaming Ji, Mickel Liu, Juntao Dai, Xuehai Pan, Chi Zhang, Ce Bian, Chi Zhang, Ruiyang Sun, Yizhou Wang, Yaodong Yang)