Margaret Mitchell
Large-scale open code data (The Stack)
Co-authored The Stack, a large, permissively licensed source-code dataset used to train open code models.
Highlights
Datasets, Code, Open data
Focus: Large-scale open code data (The Stack)
Why it matters: Co-authored The Stack, a large, permissively licensed source-code dataset used to train open code models.
Research Areas
Datasets, Code, Open data