Prompt Engineering Guide
😃 Basics
💼 Applications
🧙‍♂️ Intermediate
🤖 Агенты
⚖️ Reliability
🖼️ Image Prompting
🔓 Prompt Hacking
🔨 Tooling
💪 Prompt Tuning
🎲 Miscellaneous
Models
📙 Vocabulary Reference
📚 Bibliography
📦 Prompted Products
🛸 Additional Resources
🔥 Hot Topics
✨ Credits
🔓 Prompt Hacking🟢 Defensive Measures🟢 Filtering

Filtering

🟢 This article is rated easy
Reading Time: 1 minute
Last updated on August 7, 2024

Sander Schulhoff

Filtering is a common technique for preventing prompt hacking1. There are a few types of filtering, but the basic idea is to check for words and phrase in the initial prompt or the output that should be blocked. You can use a blocklist or an allowlist for this purpose2. A blocklist is a list of words and phrases that should be blocked, and an allowlist is a list of words and phrases that should be allowed.

Sander Schulhoff

Sander Schulhoff is the Founder of Learn Prompting and an ML Researcher at the University of Maryland. He created the first open-source Prompt Engineering guide, reaching 3M+ people and teaching them to use tools like ChatGPT. Sander also led a team behind Prompt Report, the most comprehensive study of prompting ever done, co-authored with researchers from the University of Maryland, OpenAI, Microsoft, Google, Princeton, Stanford, and other leading institutions. This 76-page survey analyzed 1,500+ academic papers and covered 200+ prompting techniques.

Footnotes

  1. Kang, D., Li, X., Stoica, I., Guestrin, C., Zaharia, M., & Hashimoto, T. (2023). Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks.

  2. Selvi, J. (2022). Exploring Prompt Injection Attacks. https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/

Edit this page

© 2024 Learn Prompting. All rights reserved.