Milad Nasr
Universal jailbreak-style attacks on aligned LMs
Co-authored the paper "Universal and Transferable Adversarial Attacks on Aligned Language Models."
Highlights
Security, Jailbreaks, Adversarial
Focus: Universal jailbreak-style attacks on aligned LMs
Why it matters: Co-authored the paper "Universal and Transferable Adversarial Attacks on Aligned Language Models."
Research Areas
Security, Jailbreaks, Adversarial