Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Hybrid Transformer–Mamba language models (Jamba)
A useful page because his public trail is broader than the generic Jamba author stub: it runs from earlier language grounding and text-similarity work into Jamba-1.5 and later multimodal hallucination mitigation.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open multimodal models (Gemma 3)
Co-authored the Gemma 3 Technical Report.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Alignment via AI feedback (Constitutional AI)
High-signal for the seam between machine learning and hardware systems, especially where learned optimization methods begin affecting the actual compute infrastructure underneath frontier models.
Linear transformers via the delta rule
A good page to have because he is one of the recurring names in the recent MIT line of work on linear-attention alternatives, especially where hardware-efficient training meets practical long-context sequence modeling.
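The delta-rule line of work replaces softmax attention with a fast-weight memory that is overwritten, not merely accumulated. A minimal pure-Python sketch of one step (the function name and shapes are illustrative, not from any specific paper's code):

```python
def delta_rule_step(W, k, v, q, beta=1.0):
    """One step of delta-rule linear attention in the fast-weight view:
    update W <- W + beta * (v - W k) k^T, then read out o = W q."""
    d_k, d_v = len(k), len(v)
    # Current prediction of the memory for key k.
    Wk = [sum(W[i][j] * k[j] for j in range(d_k)) for i in range(d_v)]
    # Correct the memory toward v along the direction of k.
    for i in range(d_v):
        err = beta * (v[i] - Wk[i])
        for j in range(d_k):
            W[i][j] += err * k[j]
    # Read out with the query.
    return [sum(W[i][j] * q[j] for j in range(d_k)) for i in range(d_v)]

# Writing (k, v) into an empty memory and querying with q = k (unit-norm k)
# retrieves v exactly; a later write to the same key overwrites it, which is
# the behavioral difference from plain additive linear attention.
W = [[0.0, 0.0], [0.0, 0.0]]
out = delta_rule_step(W, k=[1.0, 0.0], v=[2.0, 3.0], q=[1.0, 0.0])
```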
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Human preference evaluation at scale (Chatbot Arena)
Co-authored Chatbot Arena: a high-impact human-preference evaluation platform for LLMs.
Open-weight LLMs (Qwen2)
Co-authored the Qwen2 Technical Report.
Open-weight foundation models (LLaMA)
Important for the code-model side of the open-weight ecosystem, especially where general-purpose LLaMA work turns into stronger coding systems.
Hybrid Transformer–Mamba language models (Jamba)
One of the higher-signal people to know in the hybrid-LLM line because he sits at the point where AI21’s research architecture, long-context systems work, and real product deployment meet.
Hybrid Transformer–Mamba language models (Jamba)
Worth tracking on the architecture side of AI21 because his profile sits where infrastructure leadership, hybrid-model design, and the mechanics of shipping long-context systems overlap.
Trillion-parameter scaling with sparsity (Switch Transformers)
Co-authored Switch Transformers: a core reference for practical MoE scaling.
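The core Switch idea is top-1 routing: each token is dispatched to exactly one expert, gated by that expert's softmax probability. A minimal sketch of the routing decision for a single token (function name is illustrative; load-balancing losses and capacity limits are omitted):

```python
import math

def switch_route(router_logits):
    # Switch-style top-1 routing: pick the single expert with the highest
    # router logit; its softmax probability scales the expert's output.
    m = max(router_logits)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in router_logits]
    z = sum(exps)
    expert = max(range(len(exps)), key=exps.__getitem__)
    return expert, exps[expert] / z

# Token with router logits over 3 experts goes to expert 1.
expert, gate = switch_route([0.1, 2.0, -1.0])
```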
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
RWKV and efficient sequence modeling
A strong page to keep because he links the early RWKV work to the later Wrocław-centered PLLuM effort, which makes him one of the clearer continuity threads between open sequence models and Polish-language LLM development.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Scaled multilingual vision-language models (PaLI)
Co-authored PaLI: a key reference for scaling multilingual vision-language models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Open-model frontier reports (DeepSeek-V3)
Co-authored the DeepSeek-V3 Technical Report.
Streaming + long-context stability (attention sinks)
A strong researcher to follow for efficient and long-context LLM systems, especially where inference tricks and memory management make large models practical to run.
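The attention-sink observation leads to a simple KV-cache eviction policy: keep the first few tokens (the "sinks") plus a sliding window of recent tokens, and drop everything in between, so cache size stays bounded during streaming generation. A minimal sketch of which positions survive (function and parameter names are illustrative):

```python
def sink_cache_keep(seq_len, n_sink=4, window=8):
    # Attention-sink eviction policy: always retain the first n_sink token
    # positions plus the most recent `window` positions; evict the middle.
    kept = list(range(min(n_sink, seq_len)))
    kept.extend(range(max(n_sink, seq_len - window), seq_len))
    return kept

# After 20 generated tokens, only 4 sinks + 8 recent positions remain cached.
positions = sink_cache_keep(seq_len=20)
```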
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Hybrid Transformer–Mamba language models (Jamba)
A better page than the default Jamba stub because it gives one of the quieter AI21 researchers a real place in the company’s hybrid-model program instead of treating him as just another author in a long list.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Pathways-scale language modeling (PaLM)
Co-authored PaLM: Scaling Language Modeling with Pathways.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Human-feedback and evaluation infrastructure (Anthropic)
A strong profile for the engineering and product layer underneath early Anthropic alignment work, especially where human-feedback collection and evaluation infrastructure had to become real systems.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Neural radiance fields (NeRF)
Co-authored NeRF: a foundational paper for neural rendering and 3D scene representations.
Score-based diffusion modeling via SDEs
Co-authored the score-based diffusion SDE paper: a key theoretical view of diffusion models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-source LLMs (EleutherAI)
Important for the bridge between early open-model scaling work and later frontier closed-model systems, especially around architecture and training-stack choices that ended up mattering at both ends of the field.
Open-weight LLMs (Qwen)
Co-authored the Qwen Technical Report.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Large-scale language modeling (GPT-3)
Co-authored GPT-3: Language Models are Few-Shot Learners.
Open multimodal models (Gemma 3)
Co-authored the Gemma 3 Technical Report.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Holistic evaluation of language models (HELM)
Co-authored HELM: a framework for evaluating language models across many axes beyond raw accuracy.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Fully Sharded Data Parallel training (FSDP)
Co-authored PyTorch FSDP: practical lessons for scaling fully-sharded training workloads.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open language models from Google (Gemma)
Co-authored Gemma: open models based on Gemini research and technology.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Open code models (CodeGemma)
Co-authored CodeGemma: open code models based on Gemma.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open-model frontier reports (DeepSeek-V3)
Co-authored the DeepSeek-V3 Technical Report.
Open-model frontier reports (DeepSeek-V3)
Co-authored the DeepSeek-V3 Technical Report.
Open-weight chat and foundation models (Llama 2)
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Holistic evaluation of language models (HELM)
Co-authored HELM: a framework for evaluating language models across many axes beyond raw accuracy.
Open-weight LLMs (Qwen)
Co-authored the Qwen Technical Report.
Latent diffusion for high-res generation
Co-authored Latent Diffusion Models: the foundation behind Stable Diffusion-style pipelines.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Mixture-of-experts LLMs
Co-authored Mixtral of Experts: a key MoE reference in the open-weights frontier.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
RWKV and efficient sequence modeling
Worth tracking if you care about alternatives to the standard transformer playbook, especially the line of work trying to keep strong language-model performance while making inference and memory use much cheaper.
Rotary position embeddings (RoPE)
A better page than the generated stub because it places him in the original RoFormer team at Zhuiyi, tied to the positional-embedding design that became standard in later open-weight model families.
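RoPE's key property is that rotating query and key vectors by position-dependent angles makes their dot product depend only on the relative offset between positions. A minimal pure-Python sketch of that property (function names and the toy vectors are illustrative):

```python
import math

def rope(x, pos, base=10000.0):
    # Rotary position embedding: rotate each pair (x[2i], x[2i+1]) by the
    # angle pos * base**(-2i/d), as in the RoFormer formulation.
    d = len(x)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x[i] * c - x[i + 1] * s, x[i] * s + x[i + 1] * c])
    return out

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

q = [0.3, -1.2, 0.8, 0.5]
k = [1.0, 0.4, -0.7, 0.9]
# Attention scores depend only on the relative offset (3 in both cases):
s1 = dot(rope(q, 5), rope(k, 2))
s2 = dot(rope(q, 9), rope(k, 6))
assert abs(s1 - s2) < 1e-9
```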
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Open-weight LLMs (Qwen2)
Co-authored the Qwen2 Technical Report.
Code-focused LLMs and evaluation (Codex)
Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Open language models from Google (Gemma)
Co-authored Gemma: open models based on Gemini research and technology.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Holistic evaluation of language models (HELM)
Co-authored HELM: a framework for evaluating language models across many axes beyond raw accuracy.
Open-model frontier reports (DeepSeek-V3)
Co-authored the DeepSeek-V3 Technical Report.
Compute-optimal scaling for LLM training
A useful page for one of the less public but still important DeepMind contributors behind frontier language-model scaling and Gemini.
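Two rules of thumb anchor the compute-optimal scaling literature (both are widely used approximations, stated here as assumptions rather than taken from this document): training cost is roughly 6 * N * D FLOPs for N parameters and D tokens, and the Chinchilla-style optimum trains on about 20 tokens per parameter.

```python
def training_flops(n_params, n_tokens):
    # Common estimate: forward + backward pass costs ~6 FLOPs per
    # parameter per token, so total training cost ~ 6 * N * D.
    return 6 * n_params * n_tokens

def compute_optimal_tokens(n_params, tokens_per_param=20):
    # Chinchilla-style heuristic: ~20 training tokens per parameter
    # (a rough approximation, not a figure from this document).
    return n_params * tokens_per_param

n = 70 * 10**9                   # a hypothetical 70B-parameter model
d = compute_optimal_tokens(n)    # ~1.4T tokens under the heuristic
flops = training_flops(n, d)
```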
RWKV and efficient sequence modeling
Important within the RWKV cluster because his name carries from the original RWKV paper into Gated Slot Attention, making him part of the small set of contributors who reappear as this sequence-model thread evolves.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight LLMs (Qwen)
Co-authored the Qwen Technical Report.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Adversarial robustness and feature learning
Co-authored “Adversarial Examples Are Not Bugs, They Are Features”.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Pathways-scale language modeling (PaLM)
Co-authored PaLM: Scaling Language Modeling with Pathways.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight chat and foundation models (Llama 2)
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Chain-of-thought prompting and reasoning
Co-authored the chain-of-thought prompting paper; foundational for modern reasoning prompting.
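The chain-of-thought technique is, mechanically, a prompt format: each few-shot exemplar shows worked reasoning before its final answer, nudging the model to reason step by step on the new question. A minimal sketch of that format (function name and exemplar text are illustrative):

```python
def cot_prompt(question, exemplars):
    # Few-shot chain-of-thought prompt: each exemplar includes its
    # reasoning steps before the answer; the new question ends at "A:"
    # so the model continues with its own reasoning.
    blocks = [
        f"Q: {q}\nA: {steps} The answer is {answer}."
        for q, steps, answer in exemplars
    ]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)

prompt = cot_prompt(
    "What is 3 + 4 * 2?",
    [("What is 2 + 2 * 3?",
      "Multiplication comes first: 2 * 3 = 6. Then 2 + 6 = 8.",
      "8")],
)
```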
Model-written evaluations for LM behavior
Co-authored model-written evals: a practical technique for discovering and measuring LM behaviors.
Instruction tuning for better zero-shot behavior
Co-authored FLAN: a practical anchor for instruction tuning and zero-shot transfer.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Code-focused LLMs and evaluation (Codex)
Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.