Xian Li
Self-rewarding post-training
Co-authored "Self-Rewarding Language Models," which explores self-improvement via internal reward modeling.
Highlights
Post-training, Alignment, Preferences
Focus: Self-rewarding post-training
Why it matters: Co-authored "Self-Rewarding Language Models," which explores self-improvement via internal reward modeling.
Research Areas
Post-training, Alignment, Preferences