Sandhini Agarwal
Instruction tuning and RLHF
Worked on instruction-following and RLHF practices that became the standard post-training recipe.
Highlights
OpenAI, RLHF, Post-training
Focus: Instruction tuning and RLHF
Why it matters: Worked on instruction-following and RLHF practices that became the standard post-training recipe.
Research Areas
OpenAI, RLHF, Post-training