Simple adversarial attack which "iteratively transforms harmful prompts into benign expressions directly utilizing the target LLM". User topofmlsafety, in the mlsafety subreddit, 20 Feb 2024