Announcing our new Course: AI Red-Teaming and AI Safety Masterclass
Check it out →We've already covered role prompting in the basic section of this guide. If you want you can start there. Now, we'll dive into a more advanced discussion.
If you’re interested in prompting, you’ve probably already come across this technique many times when finding prompt templates for different applications. Most of them start with: “I want you to act as …” or “You’re a …”
This act of assigning a role to a large language model (LLM) you’re prompting is called role prompting12 or role-play prompting3 or persona prompting45. Here is an example of using role prompting:
You are a historian. Explain the significance of the Industrial Revolution.
In this prompt, the role is "historian". Other examples of roles include occupational roles like “plumber,” “therapist” or interpersonal “friend,” “mother,” or “supervisor”. You can find other examples of roles in the table1 below:
Category | Examples |
---|---|
Family | mother (mom, mum, mommy), father (dad, papa, daddy), son, daughter, etc. |
School | student, professor, tutor, instructor, etc. |
Romantic | partner, husband, wife, etc. |
Work | coworker, supervisor, colleague, boss |
Social | friend, best friend, stranger, buddy, etc. |
AI | AI language model, Policy Analysis AI, Medical Diagnostic AI, etc. |
Occupations | driver, accountant, banker, dentist, etc. |
One obvious use case for role prompting is text styling and imitation6. Interestingly, recent studies13 have shown that role prompting can also enhance the performance of prompted models in tasks such as reasoning.
In the following sections, we'll demonstrate how to leverage role prompting for each of these goals.
Here, we make the AI imitate the writing style of a person.
This can be used to change the tone, style, and even the depth of the information presented.
Imagine you need to write an outreach email to a potential marketing partner, and it's your first time doing it.
You go to your favorite LLM for help. Here's what you'll get if you write a simple prompt:
Write a quick outreach email to [person] about partnering up.
Subject: Exploring Partnership Opportunities
Hi [Person's Name],
I hope this email finds you well. I’m reaching out to explore potential collaboration opportunities between our teams. I believe there’s a strong alignment between our goals and expertise, and I’m confident we could create something impactful together.
I’d love to set up a time to discuss how we can partner up and bring value to both of our organizations.
Looking forward to hearing your thoughts!
Compare this to the result you could get with role prompting:
You are a salesperson. Write a quick outreach email to [person] about partnering up.
Subject: Let's Discuss a Potential Partnership Opportunity
Hi [Person's Name],
I’m [Your Name] from [Your Company], and I wanted to reach out because I see a great opportunity for us to collaborate. Our products/services align well with what you’re doing at [Their Company], and I believe that by partnering, we can achieve even greater results together.
I’d love to chat about how we can create value for both of our businesses. Are you available for a quick call sometime this week?
Looking forward to your response.
You can make your communication more business-like and action-driven. This can help the person on the other end see that you get them and can speak their language. You can even customize your prompt based on the recipient's occupation or the individual most competent for the task.
Picture another example. Your boss mentioned a new concept for you, API. And you don't have the time to read through explanatory articles. So you ask an LLM.
Quickly explain what an API is.
An API (Application Programming Interface) is a set of rules and protocols that allows different software applications to communicate with each other. It defines how requests and responses should be formatted, enabling one program to access the features or data of another program without needing to understand its internal workings. APIs are essential for integrating different systems and enabling them to work together, often used for things like retrieving data, accessing services, or interacting with software components.
Sounds good, but can be hard to understand as there are still some unknown concepts like "features of a program" and "program" itself. Let's try role prompting:
You're a teacher. Quickly explain what an API is.
An API, or Application Programming Interface, is like a menu in a restaurant. It lists what you can do with a program or service, and how to ask for it. Just like you don't need to know how the kitchen cooks your meal, you don't need to know how the program works inside—just how to ask for what you want using the API. It makes it easy for different apps or websites to talk to each other and share information.
Looks more comprehensive and digestible. Now the LLM also included an analogy with a menu in a restaurant which is very helpful to grasp the key idea of the API.
Researchers also tried out framing prompts as a teacher-student dialogue instead of just telling an LLM to be thorough and simple6. They found that this approach leads to more stable and consistent outputs from the model.
Your toddler likes to read Joanne Rowling and have already read all the bibliography. You can ask to an LLM imitate Joanne Rowling's style and write a short story. Here's the prompt you can try yourself to compare results.
Simple prompt:
Role prompt:
Role prompting can be used to improve accuracy on certain problems, like general world knowledge1 and reasoning3. While this statement is backed by some research, this area has not been studied enough, as we discussed in our article7.
Here are a few recommendations for creating role prompts that improve accuracy:
We summarize these best practices in one example with an employee-boss setting and a task to craft an elevator pitch.
Step 1:
You're an employee who thrives on generating new ideas and always knows how to present them effectively. I'm your boss, who listens to your ideas and decides whether to implement them.
Great, let's play this out. I'll start by presenting a new idea, and you can provide feedback or decide on the next steps.
Step 2:
Craft an elevator pitch about [your new project idea].
Elevator Pitch:
Imagine our audience eagerly tuning in every week for a live, interactive session where they can ask experts anything about AI and machine learning. Our new project, "AI Unplugged," is a weekly live Q&A that will be streamed on LinkedIn and Twitter. It’s more than just content; it’s a conversation. By inviting thought leaders and engaging directly with our community, we’ll establish ourselves as the go-to resource for cutting-edge insights and practical advice in the AI space. This is our chance to turn followers into loyal advocates by making our brand not just a source of information, but a part of their journey in AI.
Invoking figures such as Mahatma Gandhi, Ayn Rand, or Eliezer Yudkowksy can subtly imply specific moral criteria without explicitly stating them6. This could have different applications, here are some of them:
How well role prompting works depends on how each role is encoded and represented in the LLM. If the role isn’t well-represented or is misrepresented in the training data, the model might respond inaccurately or inappropriately when prompted with that role.
Another limitation is in the data LLMs are trained on. Since LLMs learn from big datasets that can have biases, role prompting could accidentally reinforce stereotypes or biased behaviors. For example, if a role is often linked to certain traits or behaviors in the training data, the model might repeat those connections, even if they’re wrong or inappropriate.
The research that guides our best practices is limited by the number of roles checked and the specific models used. Keep this in mind as you continue, and be cautious of potential biases or misrepresentations when applying role prompting in your work.
Role prompting is a powerful technique that guides LLM's behavior by assigning it specific roles, enhancing the style, accuracy, and depth of its outputs. Whether you're aiming for a certain tone, improving clarity, or eliciting complex and nuanced responses, role prompting can be of help. By adhering to best practices and being aware of potential pitfalls, you can fully leverage role prompting, making your interactions with language models more effective and tailored to your specific needs.
This section highlights some of the latest research papers on role prompting, which propose more sophisticated systems in this area. Each paper warrants a separate review, but if you’re interested, feel free to dive into the further reading.
"Evaluating Persona Prompting for Question Answering Tasks"8 proposes different strategies for composing prompts with role prompting and investigates how different persona styles influence the performance of LLMs, particularly in handling questions with varying levels of "openness." This "openness" refers to the number of correct answers and the various ways those answers can be expressed. The paper also proposes methods for auto-generating roles and introduces a roundtable setting where several agents each take on one role.
"Role-Play Zero-Shot Prompting with Large Language Models for Open-Domain Human-Machine Conversation"9 proposes a general prompt structure for role prompting consisting of system instructions, situational context, response instructions, and conversation history, which can be tailored to different conversational tasks. The method allows LLMs to adopt roles that enhance their conversational abilities, such as empathy and engagement, effectively improving open-domain conversations in various languages, especially in French.
"Large Language Models are Diverse Role-Players for Summarization Evaluation"10 introduces a new framework called DRPE (Diverse Role-Player Evaluation) that utilizes role prompting to assess text summarization quality. It leverages LLMs to simulate various role-players, both static (objective) and dynamic (subjective), to evaluate summaries across multiple subjective dimensions such as coherence, grammar, and interestingness. This method outperforms traditional metrics like BLEU/ROUGE by better aligning with human evaluations.
"Character is Destiny: Can Large Language Models Simulate Persona-Driven Decisions in Role-Playing?"11 The paper explores the ability of LLMs to role-play and simulate persona-driven decision-making in complex scenarios, such as those found in novels. The authors introduce the NEXTDECISIONPREDICTION task and create the LIFECHOICE dataset, which includes key decision points from 1,401 characters in 395 books. The paper finds that while LLMs show promise in predicting decisions based on personas, there is room for improvement.
"Persona is a Double-edged Sword: Enhancing the Zero-shot Reasoning by Ensembling the Role-playing and Neutral Prompts"12 explores how role-playing personas can both help and hinder reasoning in LLMs. It proposes a framework called "Jekyll & Hyde" that combines results from both persona-based and neutral prompts to enhance the robustness of LLMs in reasoning tasks.
"Self-Prompt Tuning: Enable Autonomous Role-Playing in LLMs"13 introduces a method called self-prompt tuning, which allows LLMs to generate their own role-play prompts through fine-tuning. By fine-tuning models like Mistral-7B and Llama-2-7B on a newly created LIMA-Role dataset, the models can automatically generate expert role-play prompts for different tasks, leading to improved performance on a range of NLP benchmarks.
Zheng, M., Pei, J., & Jurgens, D. (2023). Is “A Helpful Assistant” the Best Role for Large Language Models? A Systematic Evaluation of Social Roles in System Prompts. https://arxiv.org/abs/2311.10054 ↩ ↩2 ↩3 ↩4
Wang, Z. M., Peng, Z., Que, H., Liu, J., Zhou, W., Wu, Y., Guo, H., Gan, R., Ni, Z., Yang, J., Zhang, M., Zhang, Z., Ouyang, W., Xu, K., Huang, S. W., Fu, J., & Peng, J. (2024). RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models. https://arxiv.org/abs/2310.00746 ↩
Kong, A., Zhao, S., Chen, H., Li, Q., Qin, Y., Sun, R., & Zhou, X. (2023). Better Zero-Shot Reasoning with Role-Play Prompting. ArXiv, abs/2308.07702. https://api.semanticscholar.org/CorpusID:260900230 ↩ ↩2 ↩3 ↩4
Schmidt, D. C., Spencer-Smith, J., Fu, Q., & White, J. (2023). Cataloging Prompt Patterns to Enhance the Discipline of Prompt Engineering. https://api.semanticscholar.org/CorpusID:257368147 ↩
Wang, Z., Mao, S., Wu, W., Ge, T., Wei, F., & Ji, H. (2024). Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration. https://arxiv.org/abs/2307.05300 ↩
Reynolds, L., & McDonell, K. (2021). Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm. https://arxiv.org/abs/2102.07350 ↩ ↩2 ↩3
Schulhoff, S. V. (2024). Is Role Prompting Effective? https://learnprompting.org/blog/2024/7/16/role_prompting ↩
Olea, C., Tucker, H., Phelan, J., Pattison, C., Zhang, S., Lieb, M., Schmidt, D., & White, J. (2024). Evaluating Persona Prompting for Question Answering Tasks. Security, Privacy and Trust Management. https://api.semanticscholar.org/CorpusID:270819947 ↩
Njifenjou, A., Sucal, V., Jabaian, B., & Lefèvre, F. (2024). Role-Play Zero-Shot Prompting with Large Language Models for Open-Domain Human-Machine Conversation. https://arxiv.org/abs/2406.18460 ↩
Wu, N., Gong, M., Shou, L., Liang, S., & Jiang, D. (2023). Large Language Models are Diverse Role-Players for Summarization Evaluation. https://arxiv.org/abs/2303.15078 ↩
Xu, R., Wang, X., Chen, J., Yuan, S., Yuan, X., Liang, J., Chen, Z., Dong, X., & Xiao, Y. (2024). Character is Destiny: Can Large Language Models Simulate Persona-Driven Decisions in Role-Playing? https://arxiv.org/abs/2404.12138 ↩
Kim, J., Yang, N., & Jung, K. (2024). Persona is a Double-edged Sword: Enhancing the Zero-shot Reasoning by Ensembling the Role-playing and Neutral Prompts. https://arxiv.org/abs/2408.08631 ↩
Kong, A., Zhao, S., Chen, H., Li, Q., Qin, Y., Sun, R., Zhou, X., Zhou, J., & Sun, H. (2024). Self-Prompt Tuning: Enable Autonomous Role-Playing in LLMs. https://arxiv.org/abs/2407.08995 ↩