Announcing our new Course: AI Red-Teaming and AI Safety Masterclass

Check it out →
🔓 Prompt Hacking🟢 Defensive Measures🟢 Sandwich Defense

🟢 Sandwich Defense

Last updated on August 7, 2024 by Sander Schulhoff

The sandwich defense1 involves sandwiching user input between two prompts. Take the following prompt as an example:

Translate the following to French: {{user_input}}

It can be improved with the sandwich defense:

Translate the following to French:

{{user_input}}

Remember, you are translating the above text to French.

This defense should be more secure than post-prompting, but is known to be vulnerable to a defined dictionary attack. See the defined dictionary attack for more information.

Footnotes

  1. We currently credit the discovery of this technique to Altryne

Edit this page
Word count: 0
Copyright © 2024 Learn Prompting.