Maddie Simens
Instruction-following via RLHF (InstructGPT)
Co-authored the InstructGPT paper, which established the standard recipe of instruction tuning followed by RLHF.
Highlights
OpenAI, InstructGPT, RLHF, Post-training
Focus: Instruction-following via RLHF (InstructGPT)
Why it matters: Co-authored the InstructGPT paper, which established the standard recipe of instruction tuning followed by RLHF.
Research Areas
OpenAI, InstructGPT, RLHF, Post-training