🔓 Prompt Hacking🟢 Defensive Measures🟢 Filtering

Filtering

🟢 This article is rated easy

Reading Time: 1 minute

Last updated on August 7, 2024

Filtering is a common technique for preventing prompt hacking¹. There are a few types of filtering, but the basic idea is to check for words and phrase in the initial prompt or the output that should be blocked. You can use a blocklist or an allowlist for this purpose². A blocklist is a list of words and phrases that should be blocked, and an allowlist is a list of words and phrases that should be allowed.

Footnotes

Kang, D., Li, X., Stoica, I., Guestrin, C., Zaharia, M., & Hashimoto, T. (2023). Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks. ↩
Selvi, J. (2022). Exploring Prompt Injection Attacks. https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/ ↩

Sander Schulhoff

Sander Schulhoff is the Founder of Learn Prompting and an ML Researcher at the University of Maryland. He created the first open-source Prompt Engineering guide, reaching 3M+ people and teaching them to use tools like ChatGPT. Sander also led a team behind Prompt Report, the most comprehensive study of prompting ever done, co-authored with researchers from the University of Maryland, OpenAI, Microsoft, Google, Princeton, Stanford, and other leading institutions. This 76-page survey analyzed 1,500+ academic papers and covered 200+ prompting techniques.

Edit this page

🟢 Introduction

🟢 Instruction Defense

Master Generative AI with Our Courses

Need Business GenAI Training?

Contact Sales

Want to keep learning

DIFFICULTY LEVEL

RECOMMENDED COURSES

ChatGPT for Everyone

Introduction to Prompt Engineering

Live Courses

Filtering

Footnotes

Sander Schulhoff

Master Generative AI with Our Courses

Contact Sales

Explore Our Full Course Collection

Explore Courses

Resources

Follow Us