Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Large-scale open code data (The Stack)
Co-authored The Stack: a major permissively-licensed dataset used for open code models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Teaching LMs to use tools (Toolformer)
Co-authored Toolformer: an influential approach to tool use via self-supervision.
Hybrid Transformer–Mamba language models (Jamba)
A useful profile for the data-and-evaluation side of AI work because it points to the people shaping language data quality, annotation, and analysis inside a frontier-model organization.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open-source tooling for modern NLP (Transformers library)
Co-authored the Hugging Face Transformers paper that helped standardize modern NLP workflows.
Few-shot vision-language models (Flamingo)
Co-authored Flamingo: an influential multimodal model for few-shot vision-language tasks.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Pathways-scale language modeling (PaLM)
Co-authored PaLM: Scaling Language Modeling with Pathways.
Open-weight foundation models (LLaMA)
Important for the open-weight frontier-model story because her paper trail runs through both the original LLaMA releases and the early Mistral efficiency push.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open multimodal models (Gemma 3)
Co-authored the Gemma 3 Technical Report.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Co-authored GPT-3: Language Models are Few-Shot Learners.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Pathways-scale language modeling (PaLM)
Co-authored PaLM: Scaling Language Modeling with Pathways.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Model-written evaluations for LM behavior
Co-authored model-written evals: a practical technique for discovering and measuring LM behaviors.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Deep Q-Networks (DQN)
Co-authored the original DQN preprint: a core reference for deep reinforcement learning.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open multimodal models (Gemma 3)
Co-authored the Gemma 3 Technical Report.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Hybrid Transformer–Mamba language models (Jamba)
Useful because it captures one of the less-visible people behind AI21’s training stack, where hybrid-model quality depends as much on pre- and post-training choices as on the architectural headline.
Open multimodal models (Gemma 3)
Co-authored the Gemma 3 Technical Report.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Faster LLM inference via speculative decoding
An important systems page because he is one of the named authors on speculative decoding, a technique that became part of the mainstream conversation about making large-model inference materially faster without changing outputs.
Open language models from Google (Gemma)
Co-authored Gemma: open models based on Gemini research and technology.
Large-scale language modeling (GPT-3)
Co-authored GPT-3: Language Models are Few-Shot Learners.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Self-supervised vision transformers (DINO)
Co-authored DINO: influential self-supervised representation learning for vision transformers.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Universal jailbreak-style attacks on aligned LMs
Co-authored universal and transferable adversarial attacks on aligned language models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Frontier model development (GPT-4)
Co-authored the GPT-4 Technical Report: a key reference for the GPT-4-era frontier.
RWKV and efficient sequence modeling
Worth keeping because he is one of the original RWKV coauthors who clearly did not stop there: his public work moves into production AI for crisis intelligence, security-aware infrastructure tooling, and later open-LLM experimentation.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Open, fully-documented language models (OLMo)
Co-authored OLMo: Accelerating the Science of Language Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Training-data extraction and privacy risks
Co-authored Extracting Training Data from Large Language Models: a core paper on memorization and extraction risk.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Code-focused LLMs and evaluation (Codex)
Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.
Self-play RL with search (AlphaZero)
Co-authored AlphaZero: a canonical reference for self-play + search in RL.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Neural radiance fields (NeRF)
Co-authored NeRF: a foundational paper for neural rendering and 3D scene representations.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Vision Transformers (ViT)
Co-authored ViT: a turning point for transformers in vision.
Code-focused LLMs and evaluation (Codex)
Co-authored the Codex evaluation paper: an early anchor for code LLM capability measurement.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Efficient MoE scaling (GLaM)
Co-authored GLaM: an influential MoE scaling reference in large language modeling.
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Open multimodal models (Gemma 3)
Co-authored the Gemma 3 Technical Report.
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open-weight frontier models (Llama 3)
Co-authored “The Llama 3 Herd of Models”.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Open large-scale image-text data (LAION-5B)
Co-authored LAION-5B: a widely used open dataset for vision-language foundation models.
Open language models (Gemma 2)
Co-authored Gemma 2: improving open language models at a practical size.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Co-authored the Qwen Technical Report.
Open-weight chat and foundation models (Llama 2)
Co-authored Llama 2: Open Foundation and Fine-Tuned Chat Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Large-scale language modeling (GPT-3)
Co-authored GPT-3: Language Models are Few-Shot Learners.
Gemini (multimodal foundation models)
A strong person to study for multilingual systems and instruction tuning, especially where translation, speech, and large-model post-training intersect.
Co-authored the DeepSeek-V3 Technical Report.
Small, capable models (Phi-3)
Co-authored the Phi-3 Technical Report (capable models designed for smaller footprints).
Holistic evaluation of language models (HELM)
Co-authored HELM: a framework for evaluating language models across many axes beyond raw accuracy.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Co-authored “The Llama 3 Herd of Models”.
Open-model frontier reports (DeepSeek-V3)
Co-authored the DeepSeek-V3 Technical Report.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.
RWKV and efficient sequence modeling
A useful page because it turns an otherwise stray RWKV byline into a visible builder profile: his public work is less about academic publishing and more about making efficient models, AI agents, and production RWKV systems usable in practice.
Multimodal frontier models (Gemini)
Co-authored Gemini: A Family of Highly Capable Multimodal Models.