Matt Fredrikson

Universal jailbreak-style attacks on aligned language models

Co-authored "Universal and Transferable Adversarial Attacks on Aligned Language Models" (Zou et al., 2023), which showed that automatically generated adversarial suffixes can jailbreak aligned language models and transfer across different models.


Research Areas

Security · Jailbreaks · Adversarial ML