I have written this up as a paper. https://t.co/5MrTEBB7jm
All in How You Ask for It: Simple Black-Box Method for Jailbreak Attacks. A simple adversarial attack that "iteratively transforms harmful prompts into benign expressions directly utilizing the target LLM". https://t.co/zWI2Xl6Imu https://t.co/48KcQZ9eRA
All in How You Ask for It: Simple Black-Box Method for Jailbreak Attacks. https://t.co/YzP3Kc2luv