Dan Hendrycks

Broad capability evaluation (MMLU)

Co-authored MMLU (Massive Multitask Language Understanding), a widely used benchmark that measures general LLM capability across 57 subjects.

Highlights

Evaluation, Benchmarks, LLMs
Focus: Broad capability evaluation (MMLU)
Why it matters: Co-authored MMLU: a widely used benchmark for general LLM capability across many subjects.

Research Areas

Evaluation, Benchmarks, LLMs