Researchers — page 24

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2776

Shane Arora

Open, fully-documented language models (OLMo)

AI2

Co-authored OLMo: Accelerating the Science of Language Models.

Open Models OLMo: Accelerating the Science of Language Models

2777

Shane Gu

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2778

Shane Legg

Practical RL from human feedback

Co-authored Deep RL from Human Preferences: an early anchor for RLHF-style post-training.

Post-Training & Alignment Reinforcement Learning Deep Reinforcement Learning from Human Preferences

2779

Shanghao Lu

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2780

Shangyan Zhou

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2781

Shanhuang Chen

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2782

Shantanu Jain

Code-focused LLMs and evaluation (Codex)

Evaluation & Benchmarks Code Models Evaluating Large Language Models Trained on Code

Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.

2783

Shantanu Thakoor

Open language models (Gemma 2)

Open Models Gemma 2: Improving Open Language Models at a Practical Size

Co-authored Gemma 2: improving open language models at a practical size.

2784

Shaobo Hou

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2785

Shaoliang Nie

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2786

Shaoqing Wu

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2787

Sharad Vikram

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2788

Sharadh Ramaswamy

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2789

Sharan Narang

Pathways-scale language modeling (PaLM)

PaLM: Scaling Language Modeling with Pathways

Co-authored PaLM: Scaling Language Modeling with Pathways.

2790

Sharat Chikkerur

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2791

Sharath Raparthy

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2792

Shariq Iqbal

Open multimodal models (Gemma 3)

Co-authored the Gemma 3 Technical Report.

2793

Shashi Narayan

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2794

Shashir Reddy

Open multimodal models (Gemma 3)

Co-authored the Gemma 3 Technical Report.

2795

Shaun Lindsay

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2796

Shauna Kravec

Alignment via AI feedback (Constitutional AI)

Anthropic

A useful profile for the operational side of alignment work, especially where RL systems and evaluation loops have to be robust enough to support day-to-day model development.

Post-Training & Alignment Evaluation & Benchmarks Systems & Infrastructure Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

2797

Shawn Jain

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2798

Shawn Presser

Open-source LLMs (EleutherAI)

EleutherAI

Worth knowing in the open-model ecosystem because his profile combines authorship on The Pile with a large body of public code and notes rather than only one flagship paper.

Open Models Shawn Presser

2799

Shean Wang

Parameter-efficient finetuning

Co-authored LoRA: one of the core techniques behind modern fine-tuning pipelines.

Systems & Infrastructure LoRA: Low-Rank Adaptation of Large Language Models

2800

Sheer El Showk

Alignment via AI feedback (Constitutional AI)

Anthropic

Worth following for the thread inside Anthropic that connects assistant training to more explicit work on reasoning faithfulness and evaluation.

Post-Training & Alignment Evaluation & Benchmarks Agents & Reasoning Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

2801

Sheila Dunning

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2802

Sheleem Kashem

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2803

Shen Li

Fully Sharded Data Parallel training (FSDP)

Co-authored PyTorch FSDP: practical lessons for scaling fully-sharded training workloads.

Systems & Infrastructure PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel

2804

Sheng Feng

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2805

Sheng Shen

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2806

Shengfeng Pan

Rotary position embeddings (RoPE)

A useful architecture page because he appears on the small original RoFormer author list, making him one of the identifiable contributors behind the RoPE design that later spread across modern LLM stacks.

RoFormer: Enhanced Transformer with Rotary Position Embedding

2807

Shengfeng Ye

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2808

Shengguang Wu

Open-weight LLMs (Qwen)

Co-authored the Qwen Technical Report.

2809

Shenghao Lin

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2810

Shengjia Zhao

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2811

Shengli Hu

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2812

Shengxin Cindy Zha

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2813

Shengyang Dai

Open language models (Gemma 2)

Open Models Gemma 2: Improving Open Language Models at a Practical Size

Co-authored Gemma 2: improving open language models at a practical size.

2814

Shengye Wan

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2815

Shereen Ashraf

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2816

Sherjil Ozair

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2817

Sherwin Wu

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2818

Shibani Santurkar

Holistic evaluation of language models (HELM)

Co-authored HELM: a framework for evaluating language models across many axes beyond raw accuracy.

Evaluation & Benchmarks Holistic Evaluation of Language Models

2819

Shibo Wang

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2820

Shijie Wang

Open-weight LLMs (Qwen)

Co-authored the Qwen Technical Report.

2821

Shimu Wu

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2822

Shino Jomoto

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2823

Shipra Banga

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2824

Shirin Badiezadegan

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2825

Shirley Chung

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2826

Shirong Ma

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2827

Shishir Patil

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2828

Shital Shah

Small, capable models (Phi-3)

Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).

Open Models Systems & Infrastructure Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

2829

Shiva Shankar

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2830

Shivani Agrawal

Pathways-scale language modeling (PaLM)

PaLM: Scaling Language Modeling with Pathways

Co-authored PaLM: Scaling Language Modeling with Pathways.

2831

Shivanshu Purohit

Open-source LLMs (EleutherAI)

EleutherAI

One of the GPT-NeoX contributors at EleutherAI; a useful starting point for the open-model long tail because the profile ties that work to his current public ML interests.

Open Models Shivanshu Purohit

2832

Shixiang Shane Gu

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2833

Shiyu Wang

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2834

Shiyuan Chen

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2835

Sho Arora

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2836

Sholto Douglas

Open language models from Google (Gemma)

Open Models Gemma: Open Models Based on Gemini Research and Technology

Co-authored Gemma: open models based on Gemini research and technology.

2837

Shree Pandya

Open language models from Google (Gemma)

Open Models Gemma: Open Models Based on Gemini Research and Technology

Co-authored Gemma: open models based on Gemini research and technology.

2838

Shreya Pathak

Open language models from Google (Gemma)

Open Models Gemma: Open Models Based on Gemini Research and Technology

Co-authored Gemma: open models based on Gemini research and technology.

2839

Shreya Singh

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2840

Shreyas Rammohan Belle

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2841

Shruti Bhosale

Open-weight chat and foundation models (Llama 2)

Open Models Llama 2: Open Foundation and Fine-Tuned Chat Models

Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.

2842

Shruti Garg

Open language models (Gemma 2)

Open Models Gemma 2: Improving Open Language Models at a Practical Size

Co-authored Gemma 2: improving open language models at a practical size.

2843

Shruti Rijhwani

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2844

Shruti Sheth

Open language models (Gemma 2)

Open Models Gemma 2: Improving Open Language Models at a Practical Size

Co-authored Gemma 2: improving open language models at a practical size.

2845

Shuai Bai

Open-weight LLMs (Qwen)

Co-authored the Qwen Technical Report.

2846

Shuang Zhou

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2847

Shuangfeng Li

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2848

Shubham Agrawal

Open code models (CodeGemma)

Open Models Code Models CodeGemma: Open Code Models Based on Gemma

Co-authored CodeGemma: open code models based on Gemma.

2849

Shuiping Yu

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2850

Shun Zhang

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2851

Shunfeng Zhou

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2852

Shuntong Lei

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2853

Shunyu Yao

Reasoning + acting for LLM agents (ReAct)

Co-authored ReAct: a simple, high-leverage template for tool-using LLM agents.

Agents & Reasoning ReAct: Synergizing Reasoning and Acting in Language Models

2854

Shuo-yiin Chang

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2855

Shuohang Wang

Small, capable models (Phi-3)

Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).

Open Models Systems & Infrastructure Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

2856

Shuqiang Zhang

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2857

Shusheng Yang

Open-weight LLMs (Qwen)

Co-authored the Qwen Technical Report.

2858

Shuting Pan

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2859

Shuyuan Zhang

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2860

Shyam Upadhyay

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2861

Shyamal Anadkat

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2862

Siamak Shakeri

Open language models from Google (Gemma)

Open Models Gemma: Open Models Based on Gemini Research and Technology

Co-authored Gemma: open models based on Gemini research and technology.

2863

Sid Black

Open-source LLMs, training

EleutherAI

A useful anchor for the open-model ecosystem because his path runs from EleutherAI’s training efforts into a more explicit alignment and interpretability agenda at Conjecture.

Open Models Post-Training & Alignment Interpretability Conjecture

2864

Sid Lall

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2865

Sid Mittal

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2866

Siddharth Gopal

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2867

Siddharth Goyal

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2868

Siddhartha Brahma

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2869

Siddhinita Wandekar

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2870

Sidharth Mudgal

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2871

Siim Põder

Open multimodal models (Gemma 3)

Co-authored the Gemma 3 Technical Report.

2872

Sijal Bhatnagar

Open multimodal models (Gemma 3)

Co-authored the Gemma 3 Technical Report.

2873

Silvio Savarese

BLIP-2 and frozen-encoder multimodal LLMs

Co-authored BLIP-2: a key step toward efficient vision-language models built around LLM backbones.

Multimodal Systems & Infrastructure BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

2874

Simon Osindero

Compute-optimal scaling for LLM training

Multimodal Systems & Infrastructure Diffusion & Generative Media WaveNet: A Generative Model for Raw Audio

Important because his work spans several major eras of modern deep learning, from early generative modeling and sequence systems to the DeepMind large-model stack that culminated in Gemini.

2875

Simón Posada Fishman

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2876

Simon Schmitt

Planning with learned dynamics (MuZero)