
Tom B. Brown

Practical RL from human feedback

Co-authored "Deep Reinforcement Learning from Human Preferences" (2017), an early anchor for RLHF-style post-training.

Highlights

RLHF, Alignment, Preference learning
Focus: Practical RL from human feedback
Why it matters: Co-authored "Deep Reinforcement Learning from Human Preferences" (2017), an early anchor for RLHF-style post-training.

Research Areas

RLHF, Alignment, Preference learning