
Mostofa Patwary

Model-parallel training at scale (Megatron-LM)

Co-authored Megatron-LM, a core reference for scaling transformer training via model parallelism.
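Megatron-LM's central technique is tensor model parallelism: the weight matrices of each transformer layer are sharded across devices so that a single layer's matmuls run in parallel. A minimal single-process sketch is below, using NumPy to simulate the shards of a two-layer MLP (a real implementation places each shard on its own GPU and uses an all-reduce where this sketch sums partial results); the shard count and layer sizes are illustrative assumptions, not values from Megatron-LM.

```python
import numpy as np

# Sketch of Megatron-style tensor parallelism for an MLP block:
# Y = GeLU(X A) B. NumPy arrays stand in for per-GPU shards.

rng = np.random.default_rng(0)
n_shards = 4                       # simulated "GPUs" (illustrative)
batch, d_model, d_ff = 8, 16, 64   # illustrative sizes

X = rng.standard_normal((batch, d_model))
A = rng.standard_normal((d_model, d_ff))   # first linear layer
B = rng.standard_normal((d_ff, d_model))   # second linear layer

def gelu(x):
    # tanh approximation of GeLU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

# Reference: unsharded forward pass.
Y_ref = gelu(X @ A) @ B

# Column-parallel first layer: split A by columns, so each shard
# computes an independent slice of the hidden activation and the
# elementwise nonlinearity needs no communication.
A_shards = np.split(A, n_shards, axis=1)
# Row-parallel second layer: split B by rows to match each hidden slice.
B_shards = np.split(B, n_shards, axis=0)

# Each "device" computes a partial output; summing the partials
# plays the role of the all-reduce across devices.
partials = [gelu(X @ A_i) @ B_i for A_i, B_i in zip(A_shards, B_shards)]
Y = sum(partials)

assert np.allclose(Y, Y_ref)  # sharded result matches the full matmul
```

The column-then-row split is what keeps communication to a single all-reduce per MLP block: the GeLU is applied shard-locally, so no synchronization is needed between the two matmuls.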

Highlights

Systems, Training, Scaling
Focus: Model-parallel training at scale (Megatron-LM)
Why it matters: Co-authored Megatron-LM, a core reference for scaling transformer training via model parallelism.

Research Areas

Systems, Training, Scaling
Mostofa Patwary - AI Researcher Profile | 500AI