William Fedus
Trillion-parameter scaling with sparsity (Switch Transformers)
Co-authored Switch Transformers, a core reference for practical mixture-of-experts (MoE) scaling.
Highlights
MoE, Scaling, Systems
Focus: Trillion-parameter scaling with sparsity (Switch Transformers)
Why it matters: Co-authored Switch Transformers, a core reference for practical MoE scaling.
Research Areas
MoE, Scaling, Systems