Researchers — page 30

A strong researcher to follow if you care about reasoning-heavy language models, especially the line connecting chain-of-thought style methods, evaluation frameworks, and more agentic prompting patterns.

Evaluation & Benchmarks Agents & Reasoning Reinforcement Learning Stanford CRFM people page

3492

Yuhui Zhang

Holistic evaluation of language models (HELM)

Co-authored HELM: a framework for evaluating language models across many axes beyond raw accuracy.

Evaluation & Benchmarks Holistic Evaluation of Language Models

3493

Yujia He

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3494

Yujia Li

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3495

Yujing Zhang

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3496

Yukun Zha

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3497

Yuling Gu

Open, fully-documented language models (OLMo)

AI2

Co-authored OLMo: Accelerating the Science of Language Models.

Open Models OLMo: Accelerating the Science of Language Models

3498

Yunan Zhang

Small, capable models (Phi-3)

Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).

Open Models Systems & Infrastructure Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

3499

Yundi Qian

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3500

Yunfan Xiong

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3501

Yunfei Chu

Open-weight LLMs (Qwen)

Qwen

Co-authored the Qwen Technical Report.

Open Models Qwen Technical Report

3502

Yunfeng Liu

Rotary position embeddings (RoPE)

A named member of the original RoFormer author group, which makes his page part of the historical trail for how RoPE entered the mainstream transformer toolkit.

RoFormer: Enhanced Transformer with Rotary Position Embedding

3503

Yunhan Xu

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3504

Yunhao Tang

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3505

Yuning Mao

Open-weight chat and foundation models (Llama 2)

Meta

Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.

Open Models Llama 2: Open Foundation and Fine-Tuned Chat Models

3506

Yunjie Li

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3507

Yunlu Li

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3508

Yunsheng Li

Small, capable models (Phi-3)

Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).

Open Models Systems & Infrastructure Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

3509

Yuntao Bai

Alignment via AI feedback (Constitutional AI)

Anthropic

Important for understanding how Anthropic’s assistant-training stack evolved from early RLHF into Constitutional AI and later robustness work around jailbreaks and behavior control.

Post-Training & Alignment Reinforcement Learning Security & Robustness Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

3510

Yunxian Ma

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3511

Yunxing Dai

Frontier model development (GPT-4)

OpenAI

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

GPT-4 Technical Report

3512

Yunxuan Li

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3513

Yuqiong Liu

Open-weight LLMs (Qwen2)

Qwen

Co-authored the Qwen2 Technical Report.

Open Models Qwen2 Technical Report

3514

Yuri Burda

Code-focused LLMs and evaluation (Codex)

OpenAI

Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.

Evaluation & Benchmarks Code Models Evaluating Large Language Models Trained on Code

3515

Yury Sulsky

Generalist agents (Gato)

Google DeepMind

Co-authored Gato: a key reference for generalist, multi-task agents.

Multimodal Agents & Reasoning A Generalist Agent

3516

Yuta Koreeda

Holistic evaluation of language models (HELM)

Co-authored HELM: a framework for evaluating language models across many axes beyond raw accuracy.

Evaluation & Benchmarks Holistic Evaluation of Language Models

3517

Yutian Chen

Generalist agents (Gato)

Google DeepMind

Co-authored Gato: a key reference for generalist, multi-task agents.

Multimodal Agents & Reasoning A Generalist Agent

3518

Yuting Sun

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3519

Yuting Yan

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3520

Yuval Globerson

Hybrid Transformer–Mamba language models (Jamba)

AI21

Turns one of the thinner Jamba-1.5 coauthor profiles into an actual AI21 systems profile with a concrete role, a paper trail, and a clearer place in the hybrid-model program.

Systems & Infrastructure Yuval Globerson

3521

Yuval Peleg Levy

Hybrid Transformer–Mamba language models (Jamba)

AI21

Worth including because it adds another named algorithm engineer to the public picture of how AI21’s model work gets built, rather than leaving the page as an anonymous Jamba coauthor entry.

Systems & Infrastructure Yuval Peleg Levy

3522

Yuvein Zhu

Open multimodal models (Gemma 3)

Google

Co-authored the Gemma 3 Technical Report.

Open Models Multimodal Gemma 3 Technical Report

3523

Yuxiang Luo

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3524

Yuxiang You

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3525

Yuxiong He

Memory-efficient distributed training (ZeRO)

Co-authored ZeRO: foundational memory optimizations for training very large models.

Systems & Infrastructure ZeRO: Memory Optimizations Toward Training Trillion Parameter Models

3526

Yuxuan Liu

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3527

Yuyang Zhou

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3528

Yuzi He

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3529

Z. F. Wu

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3530

Z. Z. Ren

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3531

Zac Hatfield-Dodds

Alignment via AI feedback (Constitutional AI)

Anthropic

Worth tracking for the practical evaluation layer around frontier models, especially where safety claims have to survive contact with real tests and faithful-reasoning checks.

Post-Training & Alignment Evaluation & Benchmarks Agents & Reasoning Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

3532

Zach DeVito

Deep learning infrastructure (PyTorch)

Co-authored the PyTorch paper describing the imperative-style deep learning framework.

Open Models Systems & Infrastructure PyTorch: An Imperative Style, High-Performance Deep Learning Library

3533

Zach Gleicher

Open multimodal models (Gemma 3)

Google

Co-authored the Gemma 3 Technical Report.

Open Models Multimodal Gemma 3 Technical Report

3534

Zach Irving

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3535

Zach Rait

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3536

Zacharie Delpierre Coudert

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3537

Zachary DeVito

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3538

Zachary Nado

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3539

Zack Ontiveros

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3540

Zafarali Ahmed

Open language models from Google (Gemma)

Google

Co-authored Gemma: open models based on Gemini research and technology.

Open Models Gemma: Open Models Based on Gemini Research and Technology

3541

Zaheer Abbas

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3542

Zef Rosnbrick

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3543

Zehui Ren

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3544

Zeqi Lin

Small, capable models (Phi-3)

Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).

Open Models Systems & Infrastructure Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

3545

Zeynep Cankara

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3546

Zeyu Cui

Open-weight LLMs (Qwen)

Qwen

Co-authored the Qwen Technical Report.

Open Models Qwen Technical Report

3547

Zeyu Zheng

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3548

Zeyuan Allen-Zhu

Parameter-efficient finetuning

One of the clearer people to follow if you want the bridge between deep-learning theory, practical adaptation methods like LoRA, and broader attempts to explain how large language models actually work.

Systems & Infrastructure Zeyuan Allen-Zhu's Home Page

3549

Zhang

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3550

Zhanghao Wu

LLM-as-a-judge evaluation (MT-Bench)

Co-authored MT-Bench / LLM-as-a-judge: a widely used template for scalable multi-turn evaluation.

Evaluation & Benchmarks Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

3551

Zhangli Sha

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3552

Zhaoduo Wen

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3553

Zhe Chen

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3554

Zhe Dong

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3555

Zhe Fu

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3556

Zhe Shen

Open language models (Gemma 2)

Google

Co-authored Gemma 2: improving open language models at a practical size.

Open Models Gemma 2: Improving Open Language Models at a Practical Size

3557

Zhean Xu

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3558

Zhen Huang

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3559

Zhen Yang

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3560

Zhen Zhang

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3561

Zhenda Xie

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3562

Zheng Yan

Open-weight chat and foundation models (Llama 2)

Meta

Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.

Open Models Llama 2: Open Foundation and Fine-Tuned Chat Models

3563

Zheng Yuan

Open-weight LLMs (Qwen)

Qwen

Co-authored the Qwen Technical Report.

Open Models Qwen Technical Report

3564

Zhengxing Chen

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3565

Zhengyan Zhang

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3566

Zhenkai Zhu

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3567

Zhenru Zhang

Open-weight LLMs (Qwen)

Qwen

Co-authored the Qwen Technical Report.

Open Models Qwen Technical Report

3568

Zhenyu Yang

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3569

Zhenyuan Zhang

RWKV and efficient sequence modeling

Co-authored RWKV: Reinventing RNNs for the Transformer Era.

Open Models Systems & Infrastructure RWKV: Reinventing RNNs for the Transformer Era

3570

Zhewen Hao

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3571

Zhibin Gou

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3572

Zhicheng Ma

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3573

Zhichun Wu

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3574

Zhifang Guo

Open-weight LLMs (Qwen2)

Qwen

Co-authored the Qwen2 Technical Report.

Open Models Qwen2 Technical Report

3575

Zhifeng Chen

Efficient MoE scaling (GLaM)

Co-authored GLaM: an influential MoE scaling reference in large language modeling.

Systems & Infrastructure GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

3576

Zhigang Yan

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3577

Zhihan Zhang

Open code LLMs (StarCoder)

Co-authored StarCoder: a foundational open code model effort (BigCode).

Open Models Code Models StarCoder: may the source be with you!

3578

Zhihao Fan

Open-weight LLMs (Qwen2)

Qwen

Co-authored the Qwen2 Technical Report.

Open Models Qwen2 Technical Report

3579

Zhihong Shao

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3580

Zhipeng Xu

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3581

Zhiruo Wang

Open code LLMs (StarCoder)

Co-authored StarCoder: a foundational open code model effort (BigCode).

Open Models Code Models StarCoder: may the source be with you!

3582

Zhishuai Zhang

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3583

Zhitao Gong

Few-shot vision-language models (Flamingo)

Google DeepMind

Co-authored Flamingo: an influential multimodal model for few-shot vision-language tasks.

Multimodal Flamingo: a Visual Language Model for Few-Shot Learning

3584

Zhiwei Zhao

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3585

Zhiyu Liu

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3586

Zhiyu Ma

Open-weight frontier models (Llama 3)

Meta

Co-authored “The Llama 3 Herd of Models”.

Open Models The Llama 3 Herd of Models

3587

Zhiyu Wu

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3588

Zhongyu Zhang

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3589

Zhuohan Li

Fast, cheap LLM serving (PagedAttention)

Co-authored vLLM: a widely used serving stack for efficient LLM inference.

Systems & Infrastructure vLLM: Easy, Fast, and Cheap LLM Serving with PagedAttention

3590

Zhuoshu Li

Open-model frontier reports (DeepSeek-V3)

DeepSeek

Co-authored the DeepSeek-V3 Technical Report.

Open Models DeepSeek-V3 Technical Report

3591

Zhuyun Dai

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3592

Zi Lin

LLM-as-a-judge evaluation (MT-Bench)

Co-authored MT-Bench / LLM-as-a-judge: a widely used template for scalable multi-turn evaluation.

Evaluation & Benchmarks Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena

3593

Zichuan Wei

Open language models (Gemma 2)

Google

Co-authored Gemma 2: improving open language models at a practical size.

Open Models Gemma 2: Improving Open Language Models at a Practical Size

3594

Zifan Lin

Multimodal frontier models (Gemini)

Google DeepMind

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

Multimodal Gemini: A Family of Highly Capable Multimodal Models

3595

Zifan Wang

Universal jailbreak-style attacks on aligned LMs

Co-authored universal and transferable adversarial attacks on aligned language models.

Security & Robustness Universal and Transferable Adversarial Attacks on Aligned Language Models