Alex Ray

Instruction-following via RLHF (InstructGPT)

Co-authored the InstructGPT paper, which established the standard recipe of instruction tuning followed by RLHF.

Highlights

OpenAI, InstructGPT, RLHF, Post-training
Focus: Instruction-following via RLHF (InstructGPT)
Why it matters: Co-authored the InstructGPT paper, which established the standard recipe of instruction tuning followed by RLHF.

Research Areas

OpenAI, InstructGPT, RLHF, Post-training