Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Open-weight chat and foundation models (Llama 2)
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Holistic evaluation of language models (HELM)
Co-authored HELM: a framework for evaluating language models across many axes beyond raw accuracy.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Gemini (multimodal foundation models)
One of the clearest DeepMind names for understanding planning-heavy reinforcement learning, from AlphaZero and MuZero to newer reasoning and tool-use work around Gemini.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Hybrid Transformer–Mamba language models (Jamba)
An especially valuable page for understanding how AI systems get judged in practice, because it puts human evaluation and rubric design at the center rather than treating them as an afterthought to model building.
Open-source tooling for modern NLP (Transformers library)
Co-authored the Hugging Face Transformers paper that helped standardize modern NLP workflows.
Self-supervised vision transformers (DINO)
Co-authored DINO: influential self-supervised representation learning for vision transformers.
Open-source tooling for modern NLP (Transformers library)
Co-authored the Hugging Face Transformers paper that helped standardize modern NLP workflows.
Open language models from Google (Gemma)
Co-authored Gemma: open models based on Gemini research and technology.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-model frontier reports (DeepSeek-V3)
Co-authored the DeepSeek-V3 Technical Report.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open-model frontier reports (DeepSeek-V3)
Co-authored the DeepSeek-V3 Technical Report.
Bootstrapped vision-language pretraining (BLIP)
Co-authored BLIP: a high-impact recipe for unified vision-language understanding and generation.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-model frontier reports (DeepSeek-V3)
Co-authored the DeepSeek-V3 Technical Report.
Open-weight LLMs (Qwen)
Co-authored the Qwen Technical Report.
Open language models from Google (Gemma)
Co-authored Gemma: open models based on Gemini research and technology.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Open language models from Google (Gemma)
Co-authored Gemma: open models based on Gemini research and technology.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Open multimodal models (Gemma 3)
Co-authored the Gemma 3 Technical Report.
Co-authored the Qwen Technical Report.
Co-authored the DeepSeek-V3 Technical Report.
Co-authored the DeepSeek-V3 Technical Report.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Co-authored “The Llama 3 Herd of Models”.
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Co-authored the DeepSeek-V3 Technical Report.
Computer vision, representation learning
A foundational computer-vision researcher whose work on representations and architectures still shapes modern pretraining and perception systems.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight chat and foundation models (Llama 2)
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Co-authored “The Llama 3 Herd of Models”.
Alignment via AI feedback (Constitutional AI)
Worth following for the evaluation side of Anthropic’s alignment program, especially where model-written tests and public-input methods become practical tooling rather than just ideas.
Model-written evaluations for LM behavior
Co-authored model-written evals: a practical technique for discovering and measuring LM behaviors.
Alignment via AI feedback (Constitutional AI)
Worth following for the evaluation side of alignment work, especially where model-written tests and more faithful reasoning traces are used to make model behavior easier to inspect.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Co-authored the DeepSeek-V3 Technical Report.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Few-shot vision-language models (Flamingo)
Co-authored Flamingo: an influential multimodal model for few-shot vision-language tasks.
Compute-optimal scaling for LLM training
A foundational vision researcher who also matters for the more recent DeepMind language-model lineage, making him a good bridge between classic deep-learning milestones and the Gemini era.
Model-written evaluations for LM behavior
Co-authored model-written evals: a practical technique for discovering and measuring LM behaviors.
Grade-school math reasoning (GSM8K)
Co-authored GSM8K: a core benchmark/dataset for math word problems and verification.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Reasoning + acting for LLM agents (ReAct)
Co-authored ReAct: a simple, high-leverage template for tool-using LLM agents.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Co-authored Gemma 2: improving open language models at a practical size.
Instruction-following via RLHF (InstructGPT)
A useful page for the OpenAI preference-learning line, especially if you want to understand how the field moved from InstructGPT-era RLHF into later work on whether stated preferences actually predict model behavior.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open large-scale image-text data (LAION-5B)
Co-authored LAION-5B: a widely used open dataset for vision-language foundation models.
Pathways-scale language modeling (PaLM)
Co-authored PaLM: Scaling Language Modeling with Pathways.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models from Google (Gemma)
Co-authored Gemma: open models based on Gemini research and technology.
Efficient MoE scaling (GLaM)
Co-authored GLaM: an influential MoE scaling reference in large language modeling.
Open code models (CodeGemma)
Co-authored CodeGemma: open code models based on Gemma.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Pathways-scale language modeling (PaLM)
Co-authored PaLM: Scaling Language Modeling with Pathways.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Co-authored Gemma: open models based on Gemini research and technology.
Code-focused LLMs and evaluation (Codex)
Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.
Gemini (multimodal foundation models)
Worth tracking for the data side of multimodal frontier models, where the quality and shape of training mixtures strongly determine what large systems can actually do.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Efficient MoE scaling (GLaM)
Co-authored GLaM: an influential MoE scaling reference in large language modeling.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open code models (CodeGemma)
Co-authored CodeGemma: open code models based on Gemma.
Instruction tuning for better zero-shot behavior
Co-authored FLAN: a practical anchor for instruction tuning and zero-shot transfer.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Co-authored the Qwen Technical Report.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Pathways-scale language modeling (PaLM)
Co-authored PaLM: Scaling Language Modeling with Pathways.
NLP systems and evaluation
A strong person to follow for practical language systems because his work sits right at the intersection of pretraining, retrieval, and question answering, where product-grade NLP systems either become robust or fall apart.
Co-authored “The Llama 3 Herd of Models”.
Open-weight LLMs (Qwen2)
Co-authored the Qwen2 Technical Report.
Scaled multilingual vision-language models (PaLI)
Co-authored PaLI: a key reference for scaling multilingual vision-language models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Holistic evaluation of language models (HELM)
Co-authored HELM: a framework for evaluating language models across many axes beyond raw accuracy.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.