
Andrew Gu

Fully Sharded Data Parallel training (FSDP)

Co-authored PyTorch FSDP, with practical lessons for scaling fully sharded training workloads.

Highlights

Focus: Fully Sharded Data Parallel training (FSDP)
Why it matters: Co-authored PyTorch FSDP, with practical lessons for scaling fully sharded training workloads.
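
For context on the focus area above, here is a minimal sketch of wrapping a model with PyTorch's FullyShardedDataParallel API. The toy model, hyperparameters, and launch setup are illustrative assumptions, not taken from this profile.

```python
# Minimal FSDP sketch. Assumes torchrun launches one process per GPU;
# the model and hyperparameters are illustrative, not from this profile.
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    dist.init_process_group("nccl")  # reads rank/world size from torchrun env vars
    torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

    model = nn.Sequential(
        nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024)
    ).cuda()

    # FSDP shards parameters, gradients, and optimizer state across ranks,
    # gathering full parameters only around each forward/backward pass.
    model = FSDP(model)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    x = torch.randn(8, 1024, device="cuda")
    loss = model(x).sum()
    loss.backward()
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

This would be launched with, e.g., torchrun --nproc_per_node=8 train.py (filename hypothetical); each process holds only its shard of the model state between layer computations.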

Research Areas

Systems, Training, PyTorch, Scaling