Open-source LLMs, datasets
Topic
Researchers pushing open-weight language, code, and multimodal systems that the broader ecosystem can inspect and build on.
Start with Stella Biderman, Bartłomiej Koptyra, and Eric Alcaide if you want the clearest first pass at how open models show up in practice.
This area overlaps heavily with work from Meta, Google, and DeepSeek. Common institution signals include Meta, EleutherAI, and Google. Recurring starting points include The Llama 3 Herd of Models and Llama (site).
Snapshot
Researchers: 1,545
Related labs: 8
Starting points: 8
Developed dossiers: 76
Useful entry points pulled from the strongest linked researcher dossiers.
Open-model infrastructure (via Stella Biderman)
Eagle and Finch (via Bartłomiej Koptyra)
Machine learning for molecules, proteins, and graph learning (via Eric Alcaide)
GPT-NeoX and open-source large-model training (via Eric Hallahan)
LM Evaluation Harness (via Anish Thite)
Open language models and open pretraining corpora (via Alon Albalak)
Papers, project pages, and repositories that recur across this part of the field.
The Llama 3 Herd of Models
Linked by 484 profiles in this topic
Llama (site)
Linked by 482 profiles in this topic
Gemma (docs)
Linked by 359 profiles in this topic
DeepSeek (project)
Linked by 195 profiles in this topic
DeepSeek-V3 Technical Report
Linked by 195 profiles in this topic
Gemma 2: Improving Open Language Models at a Practical Size
Linked by 141 profiles in this topic
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Linked by 122 profiles in this topic
Gemma 3 Technical Report
Linked by 113 profiles in this topic
Source clusters that repeatedly anchor researchers in this area.
Llama (site)
Used across 482 researcher pages in this topic
The Llama 3 Herd of Models
Used across 482 researcher pages in this topic
Gemma (docs)
Used across 359 researcher pages in this topic
DeepSeek (project)
Used across 195 researcher pages in this topic
DeepSeek-V3 Technical Report
Used across 195 researcher pages in this topic
Gemma 2: Improving Open Language Models at a Practical Size
Used across 141 researcher pages in this topic
A stronger first pass through open models, ranked by profile depth, evidence, and editorial importance.
Open-source LLMs, datasets
A key open-model ecosystem builder whose work matters because it combines research, public infrastructure, and field-level coordination rather than isolated paper output alone.
RWKV and efficient sequence modeling
A strong page to keep because he links the early RWKV work to the later Wrocław-centered PLLuM effort, which makes him one of the clearer continuity threads between open sequence models and Polish-language LLM development.
RWKV and efficient sequence modeling
A distinctive page because his work bridges open-sequence-model experimentation with applied machine learning for molecules, proteins, and structural biology. He also appears on multiple RWKV-family papers, including the hybrid GoldFinch branch, rather than only the first release.
Open-source LLMs (EleutherAI)
Useful because his footprint runs through the early EleutherAI training stack, GPT-NeoX, and Pythia, which makes the page a better map of open-model infrastructure than a generic one-paper profile.
Open-source LLMs (EleutherAI)
Useful to follow if you care about the practical evaluation layer of open models, especially where benchmark tooling and reproducible comparisons actually shape what the ecosystem measures.
RWKV and efficient sequence modeling
A strong open-model and data-centric page because his work sits close to the infrastructure that made OLMo and Dolma useful to the broader research community, rather than being just another benchmark-driven model release.
Open-weight LLMs
One of the clearest people to track if you want to understand how frontier open-weight labs balance model quality, deployment speed, and product ambition.
Open models, governance, communication
An important bridge figure between open-weight language-model communities and the modern alignment debate, especially when you want to understand how frontier capability, openness, and control arguments collide in practice.
Open-weight foundation models (LLaMA)
Important for the code-model side of the open-weight ecosystem, especially where general-purpose LLaMA work turns into stronger coding systems.
1,545 linked profiles.