Topic
Researchers working on decision-making, planning, self-play, and RL methods that still shape modern AI systems.
Start with Demis Hassabis, Chris Olah, and Dario Amodei if you want the clearest first pass through reinforcement learning as it shows up in practice.
This area overlaps heavily with Anthropic, Google DeepMind, and AI21. Common institution signals include Anthropic, Google DeepMind, and Google. Recurring starting points include Constitutional AI: Harmlessness from AI Feedback and Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback.
Snapshot
Researchers: 90
Related labs: 6
Starting points: 8
Developed dossiers: 27
Useful entry points pulled from the strongest linked researcher dossiers.
Deep reinforcement learning (via Demis Hassabis)
Feature visualization and interpretability (via Chris Olah)
Frontier-model scaling and deployment tradeoffs (via Dario Amodei)
Behavior shaping in large models (via Amanda Askell)
Reward modeling (via Paul Christiano)
Policy optimization and reinforcement learning (via John Schulman)
Frequent institutions showing up across profiles in this area.
Papers, project pages, and repositories that recur across this part of the field.
Constitutional AI: Harmlessness from AI Feedback (linked by 48 profiles in this topic)
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (linked by 47 profiles in this topic)
Discovering Language Model Behaviors with Model-Written Evaluations (linked by 22 profiles in this topic)
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (linked by 20 profiles in this topic)
Challenges in evaluating AI systems (linked by 9 profiles in this topic)
Question Decomposition Improves the Faithfulness of Model-Generated Reasoning (linked by 9 profiles in this topic)
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (linked by 8 profiles in this topic)
Measuring Faithfulness in Chain-of-Thought Reasoning (linked by 8 profiles in this topic)
Source clusters that repeatedly anchor researchers in this area.
Constitutional AI: Harmlessness from AI Feedback (used across 48 researcher pages in this topic)
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (used across 47 researcher pages in this topic)
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (used across 6 researcher pages in this topic)
Deep Reinforcement Learning from Human Preferences (used across 4 researcher pages in this topic)
Playing Atari with Deep Reinforcement Learning (used across 4 researcher pages in this topic)
Reflexion: Language Agents with Verbal Reinforcement Learning (used across 4 researcher pages in this topic)
A stronger first pass through reinforcement learning, ranked by profile depth, evidence, and editorial importance.
Demis Hassabis: Deep RL, scientific AI, leadership
Important both as a researcher and as an institution builder whose long-running agenda tied deep RL, multimodal systems, and scientific AI into one coherent lab strategy.
Chris Olah: Mechanistic interpretability, visualization
One of the clearest interpreters of neural-network internals, especially in the line of work that turned interpretability into a concrete research agenda rather than a vague aspiration.
Dario Amodei: Alignment, post-training, frontier LLMs
A high-signal figure for understanding the frontier-model era because his work sits at the intersection of scaling, post-training, and deployment-risk framing.
Amanda Askell: Alignment, behavior shaping, safety
A high-signal researcher for understanding how post-training and behavioral steering become concrete product behavior rather than abstract alignment talk.
Paul Christiano: Alignment theory, reward modeling
A foundational thinker in oversight, reward modeling, and delegation-style alignment ideas that influenced much of the modern post-training conversation.
John Schulman: Reinforcement learning, post-training
A key bridge between reinforcement-learning methodology and the post-training techniques now used to shape assistant behavior.
Scaling laws, LLM training dynamics
One of the clearest anchors for understanding why scaling laws became such a central planning tool for frontier-model research and training strategy.
Alignment research, scalable oversight
One of the clearest public anchors for scalable oversight and alignment research in the frontier-model era.
Alignment via AI feedback (Constitutional AI)
High-signal for the seam between machine learning and hardware systems, especially where learned optimization methods begin affecting the actual compute infrastructure underneath frontier models.
90 linked profiles.