Min Xu
Fully Sharded Data Parallel training (FSDP)
Co-authored the PyTorch FSDP paper, which shares practical lessons for scaling fully sharded training workloads.
Research Areas
Systems, Training, PyTorch, Scaling