Minjia Zhang

Large-scale transformer inference (DeepSpeed)

Co-authored DeepSpeed Inference: practical inference optimizations for serving large transformer models.

Highlights

Systems, Inference, Serving
Focus: Large-scale transformer inference (DeepSpeed)
Why it matters: Co-authored DeepSpeed Inference: practical inference optimizations for serving large transformer models.

Research Areas

Systems, Inference, Serving