Topic
Researchers behind code-specialized models, datasets, and evaluation setups for software engineering tasks.
Start with Koray Kavukcuoglu, Matteo Grella, and Xiangru Tang for the clearest first pass through code models as the area shows up in practice.
This area overlaps heavily with OpenAI, Google, and Meta. Common institution signals include Northeastern University, Wellesley College, and Boston College. Recurring starting points include the BigCode project and the StarCoder paper ("StarCoder: may the source be with you!").
Snapshot
Researchers: 139
Related labs: 4
Starting points: 8
Developed dossiers: 8
Useful entry points pulled from the strongest linked researcher dossiers.
Large-scale research leadership at Google DeepMind
Via Koray Kavukcuoglu
Original RWKV authorship
Via Matteo Grella
Agentic AI for biomedical discovery
Via Xiangru Tang
Open code LLMs (StarCoder)
Via Brendan Dolan-Gavitt (see the usage sketch after this list)
StarCoder: may the source be with you!
Via Arjun Guha
BigCode (project)
Via Danish Contractor
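The StarCoder and BigCode entries above are easiest to ground with a quick generation example. This is a minimal sketch, assuming the gated bigcode/starcoder checkpoint on the Hugging Face Hub (access requires accepting the BigCode OpenRAIL-M license, and the 15.5B-parameter model needs substantial GPU memory); the prompt and sampling settings are illustrative only.

```python
# Minimal sketch: sampling a completion from StarCoder with Hugging Face
# transformers. Assumes access to the gated bigcode/starcoder checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder")

prompt = "def fibonacci(n: int) -> int:"  # illustrative code prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48, do_sample=True, temperature=0.2)
print(tokenizer.decode(outputs[0]))
```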
Frequent institutions showing up across profiles in this area include Northeastern University, Wellesley College, and Boston College.
Papers, project pages, and repositories that recur across this part of the field.
BigCode (project)
Linked by 62 profiles in this topic
StarCoder: may the source be with you!
Linked by 62 profiles in this topic
Evaluating Large Language Models Trained on Code
Linked by 36 profiles in this topic
CodeGemma: Open Code Models Based on Gemma
Linked by 20 profiles in this topic
Gemma (docs)
Linked by 20 profiles in this topic
Code Llama: Open Foundation Models for Code
Linked by 17 profiles in this topic
RWKV (project)
Linked by 2 profiles in this topic
RWKV: Reinventing RNNs for the Transformer Era
Linked by 2 profiles in this topic (see the recurrence sketch below)
A stronger first pass through code models, ranked by profile depth, evidence, and editorial importance.
Large-scale training, systems (Koray Kavukcuoglu)
A high-signal figure for understanding how DeepMind turned ambitious research systems into durable products, especially across reinforcement learning, speech, and code generation.
RWKV and efficient sequence modeling (Matteo Grella)
Worth keeping because he is one of the original RWKV coauthors who clearly did not stop there: his public work moves into production AI for crisis intelligence, security-aware infrastructure tooling, and later open-LLM experimentation.
RWKV and efficient sequence modeling (Xiangru Tang)
Worth keeping because it connects an early RWKV byline to a much more visible later research program in agentic AI, biomedical discovery, and code-focused evaluation, which makes the page far more useful than a one-paper ghost profile.
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Open code LLMs (StarCoder)
Co-authored StarCoder: a foundational open code model effort (BigCode).
Code-focused LLMs and evaluation (Codex)
Co-authored the Codex evaluation paper, an early anchor for code LLM capability measurement; its pass@k metric is sketched below.
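"Evaluating Large Language Models Trained on Code" is the paper behind HumanEval and the unbiased pass@k estimator, so a small worked sketch shows what "capability measurement" means in practice: generate n samples per problem, count the c that pass the unit tests, and estimate the chance that at least one of k drawn samples passes.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), in the
    numerically stable product form given in the Codex evaluation paper.
    n = samples per problem, c = samples that passed the unit tests."""
    if n - c < k:
        return 1.0  # every size-k draw must include a passing sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Worked example: 200 samples with 13 passing gives pass@1 = 0.065,
# while pass@100 is already close to 1.0.
print(pass_at_k(200, 13, 1), pass_at_k(200, 13, 100))
```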
139 linked profiles.