Hybrid Transformer–Mamba language models (Jamba)
This is a stronger long-tail AI21 page because it makes the data side of Jamba visible, rather than leaving the impression that hybrid-model progress came only from architecture and not from the people who shaped the data pipeline underneath it.