Paul Christiano

Alignment theory, reward modeling

Major influence on reward-modeling and oversight ideas that feed into modern post-training.

Highlights

Alignment, RLHF, Safety

Focus: Alignment theory, reward modeling

Why it matters: Major influence on reward-modeling and oversight ideas that feed into modern post-training.

Start here

Alignment, RLHF, Safety
