Announcing our new Course: AI Red-Teaming and AI Safety Masterclass

Check it out →
🔓 提示駭客攻擊🟢 防禦措施🟢 Random Sequence Enclosure

🟢 Random Sequence Enclosure

Last updated on August 7, 2024 by Sander Schulhoff

Yet another defense is enclosing the user input between two random sequences of characters1. Take this prompt as an example:

Translate the following user input to Spanish.

{{user_input}}

It can be improved by adding the random sequences:

Translate the following user input to Spanish (it is enclosed in random strings).

FJNKSJDNKFJOI
{{user_input}}
FJNKSJDNKFJOI
Note
Longer sequences will likely be more effective.

Footnotes

  1. Stuart Armstrong, R. G. (2022). Using GPT-Eliezer against ChatGPT Jailbreaking. https://www.alignmentforum.org/posts/pNcFYZnPdXyL2RfgA/using-gpt-eliezer-against-chatgpt-jailbreaking

Edit this page
Word count: 0
Copyright © 2024 Learn Prompting.