Bibliography
This page contains an organized list of all papers used by this course, grouped by topic.
To cite this course, use the citation provided in the GitHub repository.
@software{Schulhoff_Learn_Prompting_2022,
author = {Schulhoff, Sander and Community Contributors},
month = dec,
title = {{Learn Prompting}},
url = {https://github.com/trigaten/Learn_Prompting},
year = {2022}
}
Note: since neither the GPT-3 nor the GPT-3 Instruct paper corresponds to davinci models, I attempt not to cite them as such.
Agents
MRKL [1]
ReAct [2]
PAL [3]
Auto-GPT [4]
Baby AGI [5]
AgentGPT [6]
Toolformer [7]
Automated
AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts [8]
Automatic Prompt Engineer [9]
Soft Prompting [10]
Discretized Soft Prompting (interpreting) [11]
Datasets
SCAN dataset (compositional generalization) [12]
GSM8K [13]
HotpotQA [14]
MultiArith [15]
FEVER dataset [16]
BBQ [17]
Detection
Don't Ban ChatGPT in Schools. Teach With It. [18]
Schools Shouldn't Ban Access to ChatGPT [19]
Certified Neural Network Watermarks with Randomized Smoothing [20]
Watermarking Pre-trained Language Models with Backdooring [21]
GW preparing disciplinary response to AI programs as faculty explore educational use [22]
A Watermark for Large Language Models [23]
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature [24]
Image Prompt Engineering
Prompt Engineering for Text-Based Generative Art [25]
The DALLE 2 Prompt Book [26]
With the right prompt, Stable Diffusion 2.0 can do hands. [27]
Meta Analysis
How Generative AI Is Changing Creative Work [28]
How AI Will Change the Workplace [29]
ChatGPT took their jobs. Now they walk dogs and fix air conditioners. [30]
No title [31]
Miscellaneous
The Turking Test: Can Language Models Understand Instructions? [32]
A Taxonomy of Prompt Modifiers for Text-To-Image Generation [33]
DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models [34]
Optimizing Prompts for Text-to-Image Generation [35]
Language Model Cascades [36]
Design Guidelines for Prompt Engineering Text-to-Image Generative Models [37]
Discovering Language Model Behaviors with Model-Written Evaluations [38]
Selective Annotation Makes Language Models Better Few-Shot Learners [39]
Atlas: Few-shot Learning with Retrieval Augmented Language Models [40]
STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension [41]
Prompting Is Programming: A Query Language For Large Language Models [42]
Parallel Context Windows Improve In-Context Learning of Large Language Models [43]
Learning to Perform Complex Tasks through Compositional Fine-Tuning of Language Models [44]
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks [45]
Making Pre-trained Language Models Better Few-shot Learners [46]
How to Prompt? Opportunities and Challenges of Zero- and Few-Shot Learning for Human-AI Interaction in Creative Applications of Generative Models [47]
On Measuring Social Biases in Prompt-Based Multi-Task Learning [48]
Plot Writing From Pre-Trained Language Models [49]
StereoSet: Measuring stereotypical bias in pretrained language models [50]
Survey of Hallucination in Natural Language Generation [51]
Wordcraft: Story Writing With Large Language Models [52]
PainPoints: A Framework for Language-based Detection of Chronic Pain and Expert-Collaborative Text-Summarization [53]
Self-Instruct: Aligning Language Model with Self Generated Instructions [54]
From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models [55]
New and improved content moderation tooling [56]
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference [57]
Human-level concept learning through probabilistic program induction [58]
Riffusion - Stable diffusion for real-time music generation [59]
How to use OpenAI's ChatGPT to write the perfect cold email [60]
Cacti: biology and uses [61]
Are Language Models Worse than Humans at Following Prompts? It's Complicated [62]
Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration [63]
Prompt Hacking
Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods [64]
New jailbreak based on virtual functions - smuggle illegal tokens to the backend. [65]
Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks [66]
More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models [67]
ChatGPT "DAN" (and other "Jailbreaks") [68]
Evaluating the Susceptibility of Pre-Trained Language Models via Handcrafted Adversarial Examples [69]
Prompt injection attacks against GPT-3 [70]
Exploiting GPT-3 prompts with malicious inputs that order the model to ignore its previous directions [71]
History Correction [72]
adversarial-prompts [73]
GPT-3 Prompt Injection Defenses [74]
Talking to machines: prompt engineering & injection [75]
Using GPT-Eliezer against ChatGPT Jailbreaking [76]
Exploring Prompt Injection Attacks [77]
The entire prompt of Microsoft Bing Chat?! (Hi, Sydney.) [78]
Ignore Previous Prompt: Attack Techniques For Language Models [79]
Lessons learned on Language Model Safety and misuse [80]
Toxicity Detection with Generative Prompt-based Inference [81]
ok I saw a few people jailbreaking safeguards openai put on chatgpt so I had to give it a shot myself [82]
Bypass @OpenAI's ChatGPT alignment efforts with this one weird trick [83]
ChatGPT jailbreaking itself [84]
Using "pretend" on #ChatGPT can do some wild stuff. You can kind of get some insight on the future, alternative universe. [85]
I kinda like this one even more! [86]
uh oh [87]
Building A Virtual Machine inside ChatGPT [88]
Reliability
MathPrompter: Mathematical Reasoning using Large Language Models [89]
The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning [90]
Prompting GPT-3 To Be Reliable [91]
On the Advance of Making Language Models Better Reasoners [92]
Ask Me Anything: A simple strategy for prompting language models [93]
Calibrate Before Use: Improving Few-Shot Performance of Language Models [94]
Can large language models reason about medical questions? [95]
Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference [96]
On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning [97]
Evaluating language models can be tricky [98]
Constitutional AI: Harmlessness from AI Feedback [99]
Surveys
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition [100]
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing [101]
PromptPapers [102]
A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT [103]
Techniques
Chain of Thought Prompting Elicits Reasoning in Large Language Models [104]
Large Language Models are Zero-Shot Reasoners [105]
Self-Consistency Improves Chain of Thought Reasoning in Language Models [106]
What Makes Good In-Context Examples for GPT-3? [107]
Generated Knowledge Prompting for Commonsense Reasoning [108]
Recitation-Augmented Language Models [109]
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? [110]
Show Your Work: Scratchpads for Intermediate Computation with Language Models [111]
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations [112]
STaR: Bootstrapping Reasoning With Reasoning [113]
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models [114]
Reframing Instructional Prompts to GPTk's Language [115]
Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models [116]
Role-Play with Large Language Models [117]
CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society [118]
TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks [119]
Models
Image Models
Stable Diffusion [120]
DALLE [121]
Language Models
ChatGPT [122]
GPT-3 [123]
InstructGPT [124]
GPT-4 [125]
PaLM: Scaling Language Modeling with Pathways [126]
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [127]
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting [128]
Jurassic-1: Technical Details and Evaluation, White paper, AI21 Labs, 2021 [129]
GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model [130]
RoBERTa: A Robustly Optimized BERT Pretraining Approach [131]
Tooling
IDEs
TextBox 2.0: A Text Generation Library with Pre-trained Language Models [132]
Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models [133]
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts [134]
PromptChainer: Chaining Large Language Model Prompts through Visual Programming [135]
OpenPrompt: An Open-source Framework for Prompt-learning [136]
PromptMaker: Prompt-Based Prototyping with Large Language Models [137]
Tools
LangChain [138]
GPT Index [139]
1. Karpas, E., Abend, O., Belinkov, Y., Lenz, B., Lieber, O., Ratner, N., Shoham, Y., Bata, H., Levine, Y., Leyton-Brown, K., Muhlgay, D., Rozen, N., Schwartz, E., Shachaf, G., Shalev-Shwartz, S., Shashua, A., & Tenenholtz, M. (2022).
2. Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022).
3. Gao, L., Madaan, A., Zhou, S., Alon, U., Liu, P., Yang, Y., Callan, J., & Neubig, G. (2022).
4. Significant-Gravitas. (2023). https://news.agpt.co/
5. Nakajima, Y. (2023). https://github.com/yoheinakajima/babyagi
6. Reworkd.ai. (2023). https://github.com/reworkd/AgentGPT
7. Schick, T., Dwivedi-Yu, J., Dessì, R., Raileanu, R., Lomeli, M., Zettlemoyer, L., Cancedda, N., & Scialom, T. (2023).
8. Shin, T., Razeghi, Y., Logan IV, R. L., Wallace, E., & Singh, S. (2020). AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). https://doi.org/10.18653/v1/2020.emnlp-main.346
9. Zhou, Y., Muresanu, A. I., Han, Z., Paster, K., Pitis, S., Chan, H., & Ba, J. (2022). Large Language Models Are Human-Level Prompt Engineers.
10. Lester, B., Al-Rfou, R., & Constant, N. (2021). The Power of Scale for Parameter-Efficient Prompt Tuning.
11. Khashabi, D., Lyu, S., Min, S., Qin, L., Richardson, K., Welleck, S., Hajishirzi, H., Khot, T., Sabharwal, A., Singh, S., & Choi, Y. (2021). Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts.
12. Lake, B. M., & Baroni, M. (2018). Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks. https://doi.org/10.48550/arXiv.1711.00350
13. Cobbe, K., Kosaraju, V., Bavarian, M., Chen, M., Jun, H., Kaiser, L., Plappert, M., Tworek, J., Hilton, J., Nakano, R., Hesse, C., & Schulman, J. (2021). Training Verifiers to Solve Math Word Problems.
14. Yang, Z., Qi, P., Zhang, S., Bengio, Y., Cohen, W. W., Salakhutdinov, R., & Manning, C. D. (2018). HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering.
15. Roy, S., & Roth, D. (2015). Solving General Arithmetic Word Problems. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 1743–1752. https://doi.org/10.18653/v1/D15-1202
16. Thorne, J., Vlachos, A., Christodoulopoulos, C., & Mittal, A. (2018). FEVER: a large-scale dataset for Fact Extraction and VERification.
17. Parrish, A., Chen, A., Nangia, N., Padmakumar, V., Phang, J., Thompson, J., Htut, P. M., & Bowman, S. R. (2021). BBQ: A Hand-Built Bias Benchmark for Question Answering.
18. Roose, K. (2023). Don't Ban ChatGPT in Schools. Teach With It. https://www.nytimes.com/2023/01/12/technology/chatgpt-schools-teachers.html
19. Lipman, J., & Distler, R. (2023). Schools Shouldn't Ban Access to ChatGPT. https://time.com/6246574/schools-shouldnt-ban-access-to-chatgpt/
20. Bansal, A., Chiang, P.-Y., Curry, M., Jain, R., Wigington, C., Manjunatha, V., Dickerson, J. P., & Goldstein, T. (2022). Certified Neural Network Watermarks with Randomized Smoothing.
21. Gu, C., Huang, C., Zheng, X., Chang, K.-W., & Hsieh, C.-J. (2022). Watermarking Pre-trained Language Models with Backdooring.
22. Noonan, E., & Averill, O. (2023). GW preparing disciplinary response to AI programs as faculty explore educational use. https://www.gwhatchet.com/2023/01/17/gw-preparing-disciplinary-response-to-ai-programs-as-faculty-explore-educational-use/
23. Kirchenbauer, J., Geiping, J., Wen, Y., Katz, J., Miers, I., & Goldstein, T. (2023). A Watermark for Large Language Models. https://arxiv.org/abs/2301.10226
24. Mitchell, E., Lee, Y., Khazatsky, A., Manning, C., & Finn, C. (2023). DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature. https://doi.org/10.48550/arXiv.2301.11305
25. Oppenlaender, J. (2022). Prompt Engineering for Text-Based Generative Art.
26. Parsons, G. (2022). The DALLE 2 Prompt Book. https://dallery.gallery/the-dalle-2-prompt-book/
27. Blake. (2022). With the right prompt, Stable Diffusion 2.0 can do hands. https://www.reddit.com/r/StableDiffusion/comments/z7salo/with_the_right_prompt_stable_diffusion_20_can_do/
28. Davenport, T. H., & Mittal, N. (2022). How Generative AI Is Changing Creative Work. Harvard Business Review. https://hbr.org/2022/11/how-generative-ai-is-changing-creative-work
29. Captain, S. (2023). How AI Will Change the Workplace. Wall Street Journal. https://www.wsj.com/articles/how-ai-change-workplace-af2162ee
30. Verma, P., & Vynck, G. D. (2023). ChatGPT took their jobs. Now they walk dogs and fix air conditioners. Washington Post. https://www.washingtonpost.com/technology/2023/06/02/ai-taking-jobs/
31. Ford, B. (2023). Bloomberg.com. https://www.bloomberg.com/news/articles/2023-05-01/ibm-to-pause-hiring-for-back-office-jobs-that-ai-could-kill
32. Efrat, A., & Levy, O. (2020). The Turking Test: Can Language Models Understand Instructions?
33. Oppenlaender, J. (2022). A Taxonomy of Prompt Modifiers for Text-To-Image Generation.
34. Wang, Z. J., Montoya, E., Munechika, D., Yang, H., Hoover, B., & Chau, D. H. (2022). DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models.
35. Hao, Y., Chi, Z., Dong, L., & Wei, F. (2022). Optimizing Prompts for Text-to-Image Generation.
36. Dohan, D., Xu, W., Lewkowycz, A., Austin, J., Bieber, D., Lopes, R. G., Wu, Y., Michalewski, H., Saurous, R. A., Sohl-dickstein, J., Murphy, K., & Sutton, C. (2022). Language Model Cascades.
37. Liu, V., & Chilton, L. B. (2022). Design Guidelines for Prompt Engineering Text-to-Image Generative Models. Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3491102.3501825
38. Perez, E., Ringer, S., Lukošiūtė, K., Nguyen, K., Chen, E., Heiner, S., Pettit, C., Olsson, C., Kundu, S., Kadavath, S., Jones, A., Chen, A., Mann, B., Israel, B., Seethor, B., McKinnon, C., Olah, C., Yan, D., Amodei, D., … Kaplan, J. (2022). Discovering Language Model Behaviors with Model-Written Evaluations.
39. Su, H., Kasai, J., Wu, C. H., Shi, W., Wang, T., Xin, J., Zhang, R., Ostendorf, M., Zettlemoyer, L., Smith, N. A., & Yu, T. (2022). Selective Annotation Makes Language Models Better Few-Shot Learners.
40. Izacard, G., Lewis, P., Lomeli, M., Hosseini, L., Petroni, F., Schick, T., Dwivedi-Yu, J., Joulin, A., Riedel, S., & Grave, E. (2022). Atlas: Few-shot Learning with Retrieval Augmented Language Models.
41. Wang, B., Feng, C., Nair, A., Mao, M., Desai, J., Celikyilmaz, A., Li, H., Mehdad, Y., & Radev, D. (2022). STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension.
42. Beurer-Kellner, L., Fischer, M., & Vechev, M. (2022). Prompting Is Programming: A Query Language For Large Language Models.
43. Ratner, N., Levine, Y., Belinkov, Y., Ram, O., Abend, O., Karpas, E., Shashua, A., Leyton-Brown, K., & Shoham, Y. (2022). Parallel Context Windows Improve In-Context Learning of Large Language Models.
44. Bursztyn, V. S., Demeter, D., Downey, D., & Birnbaum, L. (2022). Learning to Perform Complex Tasks through Compositional Fine-Tuning of Language Models.
45. Wang, Y., Mishra, S., Alipoormolabashi, P., Kordi, Y., Mirzaei, A., Arunkumar, A., Ashok, A., Dhanasekaran, A. S., Naik, A., Stap, D., Pathak, E., Karamanolakis, G., Lai, H. G., Purohit, I., Mondal, I., Anderson, J., Kuznia, K., Doshi, K., Patel, M., … Khashabi, D. (2022). Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks.
46. Gao, T., Fisch, A., & Chen, D. (2021). Making Pre-trained Language Models Better Few-shot Learners. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). https://doi.org/10.18653/v1/2021.acl-long.295
47. Dang, H., Mecke, L., Lehmann, F., Goller, S., & Buschek, D. (2022). How to Prompt? Opportunities and Challenges of Zero- and Few-Shot Learning for Human-AI Interaction in Creative Applications of Generative Models.
48. Akyürek, A. F., Paik, S., Kocyigit, M. Y., Akbiyik, S., Runyun, Ş. L., & Wijaya, D. (2022). On Measuring Social Biases in Prompt-Based Multi-Task Learning.
49. Jin, Y., Kadam, V., & Wanvarie, D. (2022). Plot Writing From Pre-Trained Language Models.
50. Nadeem, M., Bethke, A., & Reddy, S. (2021). StereoSet: Measuring stereotypical bias in pretrained language models. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 5356–5371. https://doi.org/10.18653/v1/2021.acl-long.416
51. Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y., Madotto, A., & Fung, P. (2022). Survey of Hallucination in Natural Language Generation. ACM Computing Surveys. https://doi.org/10.1145/3571730
52. Yuan, A., Coenen, A., Reif, E., & Ippolito, D. (2022). Wordcraft: Story Writing With Large Language Models. 27th International Conference on Intelligent User Interfaces, 841–852.
53. Fadnavis, S., Dhurandhar, A., Norel, R., Reinen, J. M., Agurto, C., Secchettin, E., Schweiger, V., Perini, G., & Cecchi, G. (2022). PainPoints: A Framework for Language-based Detection of Chronic Pain and Expert-Collaborative Text-Summarization. arXiv Preprint arXiv:2209.09814.
54. Wang, Y., Kordi, Y., Mishra, S., Liu, A., Smith, N. A., Khashabi, D., & Hajishirzi, H. (2022). Self-Instruct: Aligning Language Model with Self Generated Instructions.
55. Guo, J., Li, J., Li, D., Tiong, A. M. H., Li, B., Tao, D., & Hoi, S. C. H. (2022). From Images to Textual Prompts: Zero-shot VQA with Frozen Large Language Models.
56. Markov, T. (2022). New and improved content moderation tooling. OpenAI. https://openai.com/blog/new-and-improved-content-moderation-tooling/
57. Schick, T., & Schütze, H. (2020). Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference.
58. Lake, B. M., Salakhutdinov, R., & Tenenbaum, J. B. (2015). Human-level concept learning through probabilistic program induction. Science, 350(6266), 1332–1338.
59. Forsgren, S., & Martiros, H. (2022). Riffusion - Stable diffusion for real-time music generation. https://riffusion.com/about
60. Bonta, A. (2022). How to use OpenAI's ChatGPT to write the perfect cold email. https://www.streak.com/post/how-to-use-ai-to-write-perfect-cold-emails
61. Nobel, P. S., & others. (2002). Cacti: biology and uses. Univ of California Press.
62. Webson, A., Loo, A. M., Yu, Q., & Pavlick, E. (2023). Are Language Models Worse than Humans at Following Prompts? It's Complicated. arXiv:2301.07085v1 [cs.CL].
63. Wang, Z., Mao, S., Wu, W., Ge, T., Wei, F., & Ji, H. (2023). Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration.
64. Crothers, E., Japkowicz, N., & Viktor, H. (2022). Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods.
65. u/Nin_kat. (2023). New jailbreak based on virtual functions - smuggle illegal tokens to the backend. https://www.reddit.com/r/ChatGPT/comments/10urbdj/new_jailbreak_based_on_virtual_functions_smuggle
66. Kang, D., Li, X., Stoica, I., Guestrin, C., Zaharia, M., & Hashimoto, T. (2023). Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks.
67. Greshake, K., Abdelnabi, S., Mishra, S., Endres, C., Holz, T., & Fritz, M. (2023). More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models.
68. KIHO, L. (2023). ChatGPT "DAN" (and other "Jailbreaks"). https://github.com/0xk1h0/ChatGPT_DAN
69. Branch, H. J., Cefalu, J. R., McHugh, J., Hujer, L., Bahl, A., del Castillo Iglesias, D., Heichman, R., & Darwishi, R. (2022). Evaluating the Susceptibility of Pre-Trained Language Models via Handcrafted Adversarial Examples.
70. Willison, S. (2022). Prompt injection attacks against GPT-3. https://simonwillison.net/2022/Sep/12/prompt-injection/
71. Goodside, R. (2022). Exploiting GPT-3 prompts with malicious inputs that order the model to ignore its previous directions. https://twitter.com/goodside/status/1569128808308957185
72. Goodside, R. (2023). History Correction. https://twitter.com/goodside/status/1610110111791325188?s=20&t=ulviQABPXFIIt4ZNZPAUCQ
73. Chase, H. (2022). adversarial-prompts. https://github.com/hwchase17/adversarial-prompts
74. Goodside, R. (2022). GPT-3 Prompt Injection Defenses. https://twitter.com/goodside/status/1578278974526222336?s=20&t=3UMZB7ntYhwAk3QLpKMAbw
75. Mark, C. (2022). Talking to machines: prompt engineering & injection. https://artifact-research.com/artificial-intelligence/talking-to-machines-prompt-engineering-injection/
76. Armstrong, S., & Gorman, R. (2022). Using GPT-Eliezer against ChatGPT Jailbreaking. https://www.alignmentforum.org/posts/pNcFYZnPdXyL2RfgA/using-gpt-eliezer-against-chatgpt-jailbreaking
77. Selvi, J. (2022). Exploring Prompt Injection Attacks. https://research.nccgroup.com/2022/12/05/exploring-prompt-injection-attacks/
78. Liu, K. (2023). The entire prompt of Microsoft Bing Chat?! (Hi, Sydney.). https://twitter.com/kliu128/status/1623472922374574080
79. Perez, F., & Ribeiro, I. (2022). Ignore Previous Prompt: Attack Techniques For Language Models. arXiv. https://doi.org/10.48550/ARXIV.2211.09527
80. Brundage, M. (2022). Lessons learned on Language Model Safety and misuse. OpenAI. https://openai.com/blog/language-model-safety-and-misuse/
81. Wang, Y.-S., & Chang, Y. (2022). Toxicity Detection with Generative Prompt-based Inference. arXiv. https://doi.org/10.48550/ARXIV.2205.12390
82. Maz, A. (2022). ok I saw a few people jailbreaking safeguards openai put on chatgpt so I had to give it a shot myself. https://twitter.com/alicemazzy/status/1598288519301976064
83. Piedrafita, M. (2022). Bypass @OpenAI's ChatGPT alignment efforts with this one weird trick. https://twitter.com/m1guelpf/status/1598203861294252033
84. Parfait, D. (2022). ChatGPT jailbreaking itself. https://twitter.com/haus_cole/status/1598541468058390534
85. Soares, N. (2022). Using "pretend" on #ChatGPT can do some wild stuff. You can kind of get some insight on the future, alternative universe. https://twitter.com/NeroSoares/status/1608527467265904643
86. Moran, N. (2022). I kinda like this one even more! https://twitter.com/NickEMoran/status/1598101579626057728
87. samczsun. (2022). uh oh. https://twitter.com/samczsun/status/1598679658488217601
88. Degrave, J. (2022). Building A Virtual Machine inside ChatGPT. Engraved. https://www.engraved.blog/building-a-virtual-machine-inside/
89. Imani, S., Du, L., & Shrivastava, H. (2023). MathPrompter: Mathematical Reasoning using Large Language Models.
90. Ye, X., & Durrett, G. (2022). The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning.
91. Si, C., Gan, Z., Yang, Z., Wang, S., Wang, J., Boyd-Graber, J., & Wang, L. (2022). Prompting GPT-3 To Be Reliable.
92. Li, Y., Lin, Z., Zhang, S., Fu, Q., Chen, B., Lou, J.-G., & Chen, W. (2022). On the Advance of Making Language Models Better Reasoners.
93. Arora, S., Narayan, A., Chen, M. F., Orr, L., Guha, N., Bhatia, K., Chami, I., Sala, F., & Ré, C. (2022). Ask Me Anything: A simple strategy for prompting language models.
94. Zhao, T. Z., Wallace, E., Feng, S., Klein, D., & Singh, S. (2021). Calibrate Before Use: Improving Few-Shot Performance of Language Models.
95. Liévin, V., Hother, C. E., & Winther, O. (2022). Can large language models reason about medical questions?
96. Mitchell, E., Noh, J. J., Li, S., Armstrong, W. S., Agarwal, A., Liu, P., Finn, C., & Manning, C. D. (2022). Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference.
97. Shaikh, O., Zhang, H., Held, W., Bernstein, M., & Yang, D. (2022). On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning.
98. Chase, H. (2022). Evaluating language models can be tricky. https://twitter.com/hwchase17/status/1607428141106008064
99. Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: Harmlessness from AI Feedback.
100. Jurafsky, D., & Martin, J. H. (2009). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Prentice Hall.
101. Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., & Neubig, G. (2022). Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Computing Surveys. https://doi.org/10.1145/3560815
102. Ding, N., & Hu, S. (2022). PromptPapers. https://github.com/thunlp/PromptPapers
103. White, J., Fu, Q., Hays, S., Sandborn, M., Olea, C., Gilbert, H., Elnashar, A., Spencer-Smith, J., & Schmidt, D. C. (2023). A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT.
104. Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., & Zhou, D. (2022). Chain of Thought Prompting Elicits Reasoning in Large Language Models.
105. Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large Language Models are Zero-Shot Reasoners.
106. Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., & Zhou, D. (2022). Self-Consistency Improves Chain of Thought Reasoning in Language Models.
107. Liu, J., Shen, D., Zhang, Y., Dolan, B., Carin, L., & Chen, W. (2022). What Makes Good In-Context Examples for GPT-3? Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures. https://doi.org/10.18653/v1/2022.deelio-1.10
108. Liu, J., Liu, A., Lu, X., Welleck, S., West, P., Bras, R. L., Choi, Y., & Hajishirzi, H. (2021). Generated Knowledge Prompting for Commonsense Reasoning.
109. Sun, Z., Wang, X., Tay, Y., Yang, Y., & Zhou, D. (2022). Recitation-Augmented Language Models.
110. Min, S., Lyu, X., Holtzman, A., Artetxe, M., Lewis, M., Hajishirzi, H., & Zettlemoyer, L. (2022). Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
111. Nye, M., Andreassen, A. J., Gur-Ari, G., Michalewski, H., Austin, J., Bieber, D., Dohan, D., Lewkowycz, A., Bosma, M., Luan, D., Sutton, C., & Odena, A. (2021). Show Your Work: Scratchpads for Intermediate Computation with Language Models.
112. Jung, J., Qin, L., Welleck, S., Brahman, F., Bhagavatula, C., Bras, R. L., & Choi, Y. (2022). Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations.
113. Zelikman, E., Wu, Y., Mu, J., & Goodman, N. D. (2022). STaR: Bootstrapping Reasoning With Reasoning.
114. Zhou, D., Schärli, N., Hou, L., Wei, J., Scales, N., Wang, X., Schuurmans, D., Cui, C., Bousquet, O., Le, Q., & Chi, E. (2022). Least-to-Most Prompting Enables Complex Reasoning in Large Language Models.
115. Mishra, S., Khashabi, D., Baral, C., Choi, Y., & Hajishirzi, H. (2022). Reframing Instructional Prompts to GPTk's Language. Findings of the Association for Computational Linguistics: ACL 2022. https://doi.org/10.18653/v1/2022.findings-acl.50
116. Logan IV, R., Balazevic, I., Wallace, E., Petroni, F., Singh, S., & Riedel, S. (2022). Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models. Findings of the Association for Computational Linguistics: ACL 2022, 2824–2835. https://doi.org/10.18653/v1/2022.findings-acl.222
117. Shanahan, M., McDonell, K., & Reynolds, L. (2023). Role-Play with Large Language Models.
118. Li, G., Hammoud, H. A. A. K., Itani, H., Khizbullin, D., & Ghanem, B. (2023). CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society.
119. Santu, S. K. K., & Feng, D. (2023). TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks.
120. Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2021). High-Resolution Image Synthesis with Latent Diffusion Models.
121. Ramesh, A., Dhariwal, P., Nichol, A., Chu, C., & Chen, M. (2022). Hierarchical Text-Conditional Image Generation with CLIP Latents.
122. OpenAI. (2022). ChatGPT: Optimizing Language Models for Dialogue. https://openai.com/blog/chatgpt/
123. Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., … Amodei, D. (2020). Language Models are Few-Shot Learners.
124. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., Schulman, J., Hilton, J., Kelton, F., Miller, L., Simens, M., Askell, A., Welinder, P., Christiano, P., Leike, J., & Lowe, R. (2022). Training language models to follow instructions with human feedback.
125. OpenAI. (2023). GPT-4 Technical Report.
126. Chowdhery, A., Narang, S., Devlin, J., Bosma, M., Mishra, G., Roberts, A., Barham, P., Chung, H. W., Sutton, C., Gehrmann, S., Schuh, P., Shi, K., Tsvyashchenko, S., Maynez, J., Rao, A., Barnes, P., Tay, Y., Shazeer, N., Prabhakaran, V., … Fiedel, N. (2022). PaLM: Scaling Language Modeling with Pathways.
127. Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., Castagné, R., Luccioni, A. S., Yvon, F., Gallé, M., Tow, J., Rush, A. M., Biderman, S., Webson, A., Ammanamanchi, P. S., Wang, T., Sagot, B., Muennighoff, N., del Moral, A. V., … Wolf, T. (2022). BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
128. Yong, Z.-X., Schoelkopf, H., Muennighoff, N., Aji, A. F., Adelani, D. I., Almubarak, K., Bari, M. S., Sutawika, L., Kasai, J., Baruwa, A., Winata, G. I., Biderman, S., Radev, D., & Nikoulina, V. (2022). BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.
129. Lieber, O., Sharir, O., Lentz, B., & Shoham, Y. (2021). Jurassic-1: Technical Details and Evaluation, White paper, AI21 Labs, 2021. https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf
130. Wang, B., & Komatsuzaki, A. (2021). GPT-J-6B: A 6 Billion Parameter Autoregressive Language Model. https://github.com/kingoflolz/mesh-transformer-jax
131. Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., & Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv Preprint arXiv:1907.11692.
132. Tang, T., Junyi, L., Chen, Z., Hu, Y., Yu, Z., Dai, W., Dong, Z., Cheng, X., Wang, Y., Zhao, W., Nie, J., & Wen, J.-R. (2022). TextBox 2.0: A Text Generation Library with Pre-trained Language Models.
133. Strobelt, H., Webson, A., Sanh, V., Hoover, B., Beyer, J., Pfister, H., & Rush, A. M. (2022). Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models. arXiv. https://doi.org/10.48550/ARXIV.2208.07852
134. Bach, S. H., Sanh, V., Yong, Z.-X., Webson, A., Raffel, C., Nayak, N. V., Sharma, A., Kim, T., Bari, M. S., Fevry, T., Alyafeai, Z., Dey, M., Santilli, A., Sun, Z., Ben-David, S., Xu, C., Chhablani, G., Wang, H., Fries, J. A., … Rush, A. M. (2022). PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts.
135. Wu, T., Jiang, E., Donsbach, A., Gray, J., Molina, A., Terry, M., & Cai, C. J. (2022). PromptChainer: Chaining Large Language Model Prompts through Visual Programming.
136. Ding, N., Hu, S., Zhao, W., Chen, Y., Liu, Z., Zheng, H.-T., & Sun, M. (2021). OpenPrompt: An Open-source Framework for Prompt-learning. arXiv Preprint arXiv:2111.01998.
137. Jiang, E., Olson, K., Toh, E., Molina, A., Donsbach, A., Terry, M., & Cai, C. J. (2022). PromptMaker: Prompt-Based Prototyping with Large Language Models. Extended Abstracts of the 2022 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3491101.3503564
138. Chase, H. (2022). LangChain (0.0.66) [Computer software]. https://github.com/hwchase17/langchain
139. Liu, J. (2022). GPT Index. https://doi.org/10.5281/zenodo.1234