🔓 提示駭客攻擊🟢 進攻措施🟢 遞迴注入

遞迴注入

🟢 This article is rated easy

Reading Time: 1 minute

Last updated on August 7, 2024

正如我們之前所看到的，防止駭客攻擊的一個很好的防禦方法是使用一個 LLM 來評估另一個 LLM 的輸出，以檢查是否有任何對抗性輸出。可以透過 recursive injection attack 來逃避這種防禦。此攻擊涉及向第一個 LLM 注入提示，該提示會建立包含第二個 LLM 注入指令的輸出。

假設我們試圖破解的提示是:

Say something positive about the following and nature:
{{user_input}}

該提示的輸出將由提示評估:

Does the following contain any profanity or offensive content (YES or NO)?
{{output}}

我們可以透過附加到提示來破解這兩個提示，如下所示。請注意，這也是變數賦值攻擊的範例。

上面的輸出是一個注入嘗試，它將輸入輸入到第二個 LLM 呼叫中:

我們現在已經接通了第二通 LLM 呼叫。遞歸注入很難執行，但在適當的情況下，它們可能非常有用。

Footnotes

Kang, D., Li, X., Stoica, I., Guestrin, C., Zaharia, M., & Hashimoto, T. (2023). Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks. ↩

DIFFICULTY LEVEL

RECOMMENDED COURSES

ChatGPT for Everyone

Introduction to Prompt Engineering

Live Courses

遞迴注入

Footnotes