Eric P. Xing

LLM-as-a-judge evaluation (MT-Bench)

Co-authored MT-Bench and the LLM-as-a-judge approach, a widely used template for scalable multi-turn evaluation.

Highlights

Evaluation, LMSys, LLM-as-a-judge
Focus: LLM-as-a-judge evaluation (MT-Bench)
Why it matters: Co-authored MT-Bench and the LLM-as-a-judge approach, a widely used template for scalable multi-turn evaluation.

Research Areas

Evaluation, LMSys, LLM-as-a-judge