[05/14/2024] LLM as an Evaluator
Last updated
1 year ago
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena (arXiv.org)
Perils of Self-Feedback: Self-Bias Amplifies in Large Language Models (arXiv.org)