Ammar Ahmad Awan
Large-scale transformer inference (DeepSpeed)
Co-authored DeepSpeed Inference: practical inference optimizations for serving large transformer models.
Highlights
Systems, Inference, Serving
Focus: Large-scale transformer inference (DeepSpeed)
Why it matters: Co-authored DeepSpeed Inference, a set of practical inference optimizations for serving large transformer models.
Research Areas
Systems, Inference, Serving