Shane Legg

Practical RL from human feedback

Co-authored "Deep Reinforcement Learning from Human Preferences" (2017), an early anchor for RLHF-style post-training.

Highlights

RLHF, Alignment, Preference learning
Focus: Practical RL from human feedback
Why it matters: Co-authored "Deep Reinforcement Learning from Human Preferences" (2017), an early anchor for RLHF-style post-training.

Research Areas

RLHF, Alignment, Preference learning