Milad Nasr

Universal jailbreak-style attacks on aligned LMs

Co-authored universal and transferable adversarial attacks on aligned language models.

Highlights

Focus: Universal jailbreak-style attacks on aligned LMs
Why it matters: Co-authored work on universal and transferable adversarial attacks that elicit objectionable behavior from aligned language models.

Research Areas

Security · Jailbreaks · Adversarial