Researchers — page 18

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2044

Michael Jordan

Human preference evaluation at scale (Chatbot Arena)

Co-authored Chatbot Arena: a high-impact human-preference evaluation platform for LLMs.

Evaluation & Benchmarks Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference

2045

Michael Kucharski

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2046

Michael Kwong

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2047

Michael L. Seltzer

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2048

Michael Lampe

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2049

Michael Laskin

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2050

Michael Mandl

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2051

Michael Matena

Text-to-text transfer and pretraining (T5)

Co-authored T5: a practical template for unified NLP training and evaluation.

Evaluation & Benchmarks Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

2052

Michael Moynihan

Open language models (Gemma 2)

Open Models Gemma 2: Improving Open Language Models at a Practical Size

Co-authored Gemma 2: improving open language models at a practical size.

2053

Michael Petrov

Code-focused LLMs and evaluation (Codex)

Evaluation & Benchmarks Code Models Evaluating Large Language Models Trained on Code

Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.

2054

Michael Pieler

Open-source LLMs (EleutherAI)

EleutherAI

Useful for the applied side of open-model work because his profile bridges EleutherAI-era public model training and production radiology AI inside a real clinical-imaging company.

Open Models Michael Pieler

2055

Michael Pokorny

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2056

Michael Santacroce

Small, capable models (Phi-3)

Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).

Open Models Systems & Infrastructure Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

2057

Michael Sellitto

Alignment via AI feedback (Constitutional AI)

Anthropic

A useful profile for the segment of Anthropic that turns alignment concerns into concrete evals, behavior probes, and red-team-style measurement.

Post-Training & Alignment Evaluation & Benchmarks Systems & Infrastructure Constitutional AI: Harmlessness from AI Feedback

2058

Michael Sharman

Open language models from Google (Gemma)

Open Models Gemma: Open Models Based on Gemini Research and Technology

Co-authored Gemma: open models based on Gemini research and technology.

2059

Michael Wu

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2060

Michael Wyatt

Small, capable models (Phi-3)

Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).

Open Models Systems & Infrastructure Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

2061

Michal Valko

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2062

Michela Paganini

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2063

Michele Catasta

Pathways-scale language modeling (PaLM)

PaLM: Scaling Language Modeling with Pathways

Co-authored PaLM: Scaling Language Modeling with Pathways.

2064

Michelle Casbon

Open language models (Gemma 2)

Open Models Gemma 2: Improving Open Language Models at a Practical Size

Co-authored Gemma 2: improving open language models at a practical size.

2065

Michelle Pokrass

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2066

Michelle Restrepo

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2067

Michihiro Yasunaga

Holistic evaluation of language models (HELM)

Co-authored HELM: a framework for evaluating language models across many axes beyond raw accuracy.

Evaluation & Benchmarks Holistic Evaluation of Language Models

2068

Mihir Patel

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2069

Mihir Sanjay Kale

Open language models from Google (Gemma)

Open Models Gemma: Open Models Based on Gemini Research and Technology

Co-authored Gemma: open models based on Gemini research and technology.

2070

Mik Vyatskov

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2071

Mikayel Samvelyan

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2072

Mike Clark

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2073

Mike Dusenberry

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2074

Mike Heaton

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2075

Mike Lewis

Streaming + long-context stability (attention sinks)

A strong person to study for the modern NLP stack because his work spans denoising pretraining, retrieval-augmented generation, and later long-context inference tricks rather than only one phase of the language-model pipeline.

Evaluation & Benchmarks Systems & Infrastructure Diffusion & Generative Media BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

2076

Mike Macey

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2077

Mike Wang

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2078

Mikel Rodriguez

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2079

Mikhail Dektiarev

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2080

Mikhail Pavlov

Text-to-image generation (DALL·E)

Multimodal Diffusion & Generative Media Zero-Shot Text-to-Image Generation

Co-authored the original DALL·E paper: zero-shot text-to-image generation.

2081

Mikolaj Binkowski

Few-shot vision-language models (Flamingo)

Multimodal Flamingo: a Visual Language Model for Few-Shot Learning

Co-authored Flamingo: an influential multimodal model for few-shot vision-language tasks.

2082

Mikołaj Rybiński

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2083

Milad Gholami

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2084

Milad Nasr

Universal jailbreak-style attacks on aligned LMs

Co-authored universal and transferable adversarial attacks on aligned language models.

Security & Robustness Universal and Transferable Adversarial Attacks on Aligned Language Models

2085

Milan Someswar

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2086

Miles Brundage

Code-focused LLMs and evaluation (Codex)

Evaluation & Benchmarks Code Models Evaluating Large Language Models Trained on Code

Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.

2087

Miljan Martic

Practical RL from human feedback

Co-authored Deep RL from Human Preferences: an early anchor for RLHF-style post-training.

Post-Training & Alignment Reinforcement Learning Deep Reinforcement Learning from Human Preferences

2088

Milos Besta

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2089

Miltiadis Allamanis

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2090

Mimi Jasarevic

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2091

Min Gao

Small, capable models (Phi-3)

Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).

Open Models Systems & Infrastructure Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

2092

Min Ma

Open multimodal models (Gemma 3)

Open Models Multimodal Gemma 3 Technical Report

Co-authored the Gemma 3 Technical Report.

2093

Min Si

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2094

Min Xu

Fully Sharded Data Parallel training (FSDP)

Co-authored PyTorch FSDP: practical lessons for scaling fully-sharded training workloads.

Systems & Infrastructure PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel

2095

Mina Khan

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2096

Ming Zhang

Open language models (Gemma 2)

Open Models Gemma 2: Improving Open Language Models at a Practical Size

Co-authored Gemma 2: improving open language models at a practical size.

2097

Ming-Ho Yee

Open code LLMs (StarCoder)

Co-authored StarCoder: a foundational open code model effort (BigCode).

Open Models Code Models StarCoder: may the source be with you!

2098

Ming-Wei Chang

Bidirectional transformer pretraining (BERT)

Co-authored BERT: a turning point for transfer learning in NLP.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

2099

Mingchuan Zhang

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2100

Mingfeng Xue

Open-weight LLMs (Qwen2)

Qwen

Co-authored the Qwen2 Technical Report.

Open Models Qwen2 Technical Report

2101

Minghua Zhang

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2102

Minghui Tang

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2103

Mingming Li

Open-model frontier reports (DeepSeek-V3)

Co-authored the DeepSeek-V3 Technical Report.

2104

Mingqiu Wang

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2105

Mingyang Zhang

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2106

Minh Giang

Open language models from Google (Gemma)

Open Models Gemma: Open Models Based on Gemini Research and Technology

Co-authored Gemma: open models based on Gemini research and technology.

2107

Minjia Zhang

Large-scale transformer inference (DeepSpeed)

Co-authored DeepSpeed Inference: practical inference optimizations for serving large transformer models.

Systems & Infrastructure DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale

2108

Minjie Lu

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2109

Minnie Lui

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2110

Minsuk Kahng

Open language models (Gemma 2)

Open Models Gemma 2: Improving Open Language Models at a Practical Size

Co-authored Gemma 2: improving open language models at a practical size.

2111

Minwoo Park

Open language models (Gemma 2)

Open Models Gemma 2: Improving Open Language Models at a Practical Size

Co-authored Gemma 2: improving open language models at a practical size.

2112

Miquel Jubert Hermoso

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2113

Mira Murati

Code-focused LLMs and evaluation (Codex)

Evaluation & Benchmarks Code Models Evaluating Large Language Models Trained on Code

Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.

2114

Mirac Suzgun

Challenging BIG-bench tasks (BBH)

Co-authored BBH: a popular set of hard reasoning tasks used for evaluating chain-of-thought prompting.

Evaluation & Benchmarks Agents & Reasoning Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them (BBH)

2115

Miranda Zhang

Model-written evaluations for LM behavior

Anthropic

Co-authored model-written evals: a practical technique for discovering and measuring LM behaviors.

Post-Training & Alignment Evaluation & Benchmarks Discovering Language Model Behaviors with Model-Written Evaluations

2116

Misha Bilenko

Small, capable models (Phi-3)

Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).

Open Models Systems & Infrastructure Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

2117

Misha Khalman

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2118

Mishig Davaadorj

Open code LLMs (StarCoder)

Co-authored StarCoder: a foundational open code model effort (BigCode).

Open Models Code Models StarCoder: may the source be with you!

2119

Mitch Rudominer

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2120

Mitchell Wortsman

Open large-scale image-text data (LAION-5B)

Co-authored LAION-5B: a widely used open dataset for vision-language foundation models.

Multimodal LAION-5B: An open large-scale dataset for training next generation image-text models

2121

Mitesh Kumar Singh

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2122

Miteyan Patel

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2123

MK Blake

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2124

Mo Metanat

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2125

Mofi Rahman

Open language models (Gemma 2)

Open Models Gemma 2: Improving Open Language Models at a Practical Size

Co-authored Gemma 2: improving open language models at a practical size.

2126

Mohak Patel

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2127

Mohamed Elhawaty

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2128

Mohammad Bavarian

Code-focused LLMs and evaluation (Codex)

Evaluation & Benchmarks Code Models Evaluating Large Language Models Trained on Code

Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.

2129

Mohammad Norouzi

Text-to-image diffusion with strong language understanding (Imagen)

Diffusion & Generative Media Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Co-authored Imagen: a milestone for photorealistic text-to-image diffusion models.

2130

Mohammad Rastegari

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2131

Mohammad Saleh

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2132

Mohammad Shoeybi

Model-parallel training at scale (Megatron-LM)

Co-authored Megatron-LM: a core reference for scaling transformer training via model parallelism.

Systems & Infrastructure Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism

2133

MohammadHossein Bateni

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2134

Mohit Khatwani

Open language models (Gemma 2)

Open Models Gemma 2: Improving Open Language Models at a Practical Size

Co-authored Gemma 2: improving open language models at a practical size.

2135

Mohsen Jafari

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2136

Mojan Javaheripi

Small, capable models (Phi-3)

Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).

Open Models Systems & Infrastructure Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

2137

Mojtaba Seyedhosseini

Scaled multilingual vision-language models (PaLI)

Multimodal PaLI: A Jointly-Scaled Multilingual Language-Image Model

Co-authored PaLI: a key reference for scaling multilingual vision-language models.

2138

Molly Lin

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2139

Mona Hassan

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2140

Mor Zusman

Hybrid Transformer–Mamba language models (Jamba)

AI21

Helpful because it adds contributor-level detail to the original Jamba release, which is exactly the kind of context these long-tail pages need to be useful rather than decorative.

Systems & Infrastructure Jamba: A Hybrid Transformer-Mamba Language Model

2141

Morgan Funtowicz

Open-source tooling for modern NLP (Transformers library)

Hugging Face

Co-authored the Hugging Face Transformers paper that helped standardize modern NLP workflows.

Open Models Transformers: State-of-the-Art Natural Language Processing

2142

Morgan Grafstein

Frontier model development (GPT-4)

Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.

2143

Morgan Redshaw

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2144

Morgane Rivière

Open language models from Google (Gemma)

Open Models Gemma: Open Models Based on Gemini Research and Technology

Co-authored Gemma: open models based on Gemini research and technology.

2145

Mostafa Dehghani

Vision Transformers (ViT)

Co-authored ViT: a turning point for transformers in vision.

Vision & Robotics An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

2146

Mostofa Patwary

Model-parallel training at scale (Megatron-LM)

Co-authored Megatron-LM: a core reference for scaling transformer training via model parallelism.

Systems & Infrastructure Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism

2147

Motoki Sano

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2148

Moya Chen

Open-weight chat and foundation models (Llama 2)

Open Models Llama 2: Open Foundation and Fine-Tuned Chat Models

Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.

2149

Mudit Bansal

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2150

Mudit Jain

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2151

Muhtasham Oblokulov

Open code LLMs (StarCoder)

Co-authored StarCoder: a foundational open code model effort (BigCode).

Open Models Code Models StarCoder: may the source be with you!

2152

Mukarram Tariq

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2153

Mukund Sridhar

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2154

Mukund Sundararajan

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2155

Munish Bansal

Open-weight frontier models (Llama 3)

Co-authored “The Llama 3 Herd of Models”.

2156

Music Li

Multimodal frontier models (Gemini)

Multimodal Gemini: A Family of Highly Capable Multimodal Models

Co-authored Gemini: A Family of Highly Capable Multimodal Models.

2157

Myle Ott

Fully Sharded Data Parallel training (FSDP)

Co-authored PyTorch FSDP: practical lessons for scaling fully-sharded training workloads.

Systems & Infrastructure PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel

2158

Na Ni

Open-weight LLMs (Qwen2)

Qwen

Co-authored the Qwen2 Technical Report.

Open Models Qwen2 Technical Report

2159

Naama Gidron

Hybrid Transformer–Mamba language models (Jamba)

AI21

A useful page because evaluation work is easy to flatten into leaderboard noise, and her profile anchors the people inside AI21 who were responsible for turning Jamba performance claims into something measurable.

Evaluation & Benchmarks Systems & Infrastructure JAMBA: Hybrid Transformer-Mamba Language Models

2160

Nabila Babar

Open multimodal models (Gemma 3)