Daniela Amodei

Model-written evaluations for LM behavior

Co-authored work on model-written evaluations, a practical technique for discovering and measuring language model (LM) behaviors.

Highlights

Anthropic, Evaluation, Safety, Alignment
Focus: Model-written evaluations for LM behavior
Why it matters: Model-written evaluations are a practical technique for discovering and measuring LM behaviors.

Research Areas

Anthropic, Evaluation, Safety, Alignment