Open-weight chat and foundation models (Llama 2)
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Open-weight chat and foundation models (Llama 2)
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Text-to-image diffusion with strong language understanding (Imagen)
Co-authored Imagen: a milestone for photorealistic text-to-image diffusion models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight chat and foundation models (Llama 2)
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Neural radiance fields (NeRF)
Co-authored NeRF: a foundational paper for neural rendering and 3D scene representations.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
A useful engineering profile because it points to the people operating in the middle layer between frontier architecture research and the hard work of making models run well enough to ship.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Open language models from Google (Gemma)
Co-authored Gemma: open models based on Gemini research and technology.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Grade-school math reasoning (GSM8K)
Co-authored GSM8K: a core benchmark/dataset for math word problems and verification.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Pathways-scale language modeling (PaLM)
Co-authored PaLM: Scaling Language Modeling with Pathways.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-source tooling for modern NLP (Transformers library)
Co-authored the Hugging Face Transformers paper that helped standardize modern NLP workflows.
Neural radiance fields (NeRF)
Co-authored NeRF: a foundational paper for neural rendering and 3D scene representations.
Co-authored the Gemma 3 Technical Report.
Co-authored the Gemma 3 Technical Report.
Large-scale language modeling (GPT-3)
Co-authored GPT-3: Language Models are Few-Shot Learners.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Open multimodal models (Gemma 3)
Co-authored the Gemma 3 Technical Report.
Large-scale transformer inference (DeepSpeed)
Co-authored DeepSpeed Inference: practical inference optimizations for serving large transformer models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Few-shot vision-language models (Flamingo)
Co-authored Flamingo: an influential multimodal model for few-shot vision-language tasks.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open large-scale image-text data (LAION-5B)
Co-authored LAION-5B: a widely used open dataset for vision-language foundation models.
Self-rewarding post-training
Co-authored Self-Rewarding Language Models: explores self-improvement via internal reward modeling.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Holistic evaluation of language models (HELM)
Co-authored HELM: a framework for evaluating language models across many axes beyond raw accuracy.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Hybrid Transformer–Mamba language models (Jamba)
A useful page for the data side of AI21 because it gives one of the quieter contributors on the Jamba-1.5 line a real place in the stack instead of flattening the work into a generic hybrid-model author list.
Open multimodal models (Gemma 3)
Co-authored the Gemma 3 Technical Report.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open multimodal models (Gemma 3)
Co-authored the Gemma 3 Technical Report.
Open language models from Google (Gemma)
Co-authored Gemma: open models based on Gemini research and technology.
Open large-scale image-text data (LAION-5B)
Co-authored LAION-5B: a widely used open dataset for vision-language foundation models.
Alignment via AI feedback (Constitutional AI)
Co-authored Constitutional AI: Harmlessness from AI Feedback.
Open-weight chat and foundation models (Llama 2)
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Teaching LMs to use tools (Toolformer)
Co-authored Toolformer: an influential approach to tool use via self-supervision.
Teaching LMs to use tools (Toolformer)
Co-authored Toolformer: an influential approach to tool use via self-supervision.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Alignment via AI feedback (Constitutional AI)
A useful page for Anthropic’s evaluation stack, especially where new model behaviors are surfaced through generated tests rather than only hand-authored benchmarks.
Latent diffusion for high-res generation
Co-authored Latent Diffusion Models: the foundation behind Stable Diffusion-style pipelines.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open, fully-documented language models (OLMo)
Co-authored OLMo: Accelerating the Science of Language Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Model-written evaluations for LM behavior
Co-authored model-written evals: a practical technique for discovering and measuring LM behaviors.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Gemini (multimodal foundation models)
One of the more useful people to study for the Gemini era because his work spans both the text core of multimodal frontier models and the optimization tricks that make those systems cheaper and more stable to train.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Fully Sharded Data Parallel training (FSDP)
Co-authored PyTorch FSDP: practical lessons for scaling fully-sharded training workloads.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open multimodal models (Gemma 3)
Co-authored the Gemma 3 Technical Report.
A good page to keep because it makes the project-and-product layer of a frontier-model launch visible: Jamba-1.5 was not only a research effort but also needed people coordinating what got built, packaged, and released.
A helpful page because it reminds users that frontier AI products are not only research artifacts; someone still has to build the actual interfaces, application surfaces, and product plumbing around the models.
Open large-scale image-text data (LAION-5B)
Co-authored LAION-5B: a widely used open dataset for vision-language foundation models.
Open foundation models for code (Code Llama)
Co-authored Code Llama: a key open-model reference for code generation and coding assistants.
Hybrid Transformer–Mamba language models (Jamba)
Useful because his name sits inside the original Jamba author list, which gives this page a concrete place in the AI21 hybrid-model lineage instead of leaving it as another anonymous seed profile.
Few-shot vision-language models (Flamingo)
Co-authored Flamingo: an influential multimodal model for few-shot vision-language tasks.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Masked autoencoders for vision (MAE)
Co-authored MAE: a strong template for scalable self-supervised vision pretraining.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models from Google (Gemma)
Co-authored Gemma: open models based on Gemini research and technology.
Frontier-scale training infrastructure
Builds core infrastructure for xAI’s frontier models.
Open-weight chat and foundation models (Llama 2)
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Open large-scale image-text data (LAION-5B)
Co-authored LAION-5B: a widely used open dataset for vision-language foundation models.
Commonsense reasoning evaluation (HellaSwag)
Co-authored HellaSwag: a widely used commonsense benchmark for language understanding.
Co-authored the Qwen2 Technical Report.
Open-weight chat and foundation models (Llama 2)
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Co-authored the Qwen Technical Report.
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
RWKV and efficient sequence modeling
Probably the strongest page in this batch because he spans the original RWKV paper, Eagle/Finch-adjacent work, and later efficient-language-model papers like SpikeGPT and Gated Slot Attention, instead of ending at a single co-author credit.
Co-authored Gemma: open models based on Gemini research and technology.
RWKV and efficient sequence modeling
A strong long-tail RWKV page because he is present on the original paper, Eagle/Finch, and RWKV-7, which makes him part of the smaller recurring contributor set that carried the architecture through several major revisions.
Co-authored the DeepSeek-V3 Technical Report.
Open-model frontier reports (DeepSeek-V3)
Co-authored the DeepSeek-V3 Technical Report.
Co-authored the Qwen2 Technical Report.
Open-model frontier reports (DeepSeek-V3)
Co-authored the DeepSeek-V3 Technical Report.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Co-authored the Qwen Technical Report.
Open-model frontier reports (DeepSeek-V3)
Co-authored the DeepSeek-V3 Technical Report.