Prompt Mining

Last updated on September 11, 2024 by Andres Caceres
Takeaways
  • Prompt Mining is a way to select the optimal prompt template for a given task by searching a corpus of text for the template that appears most often.
  • It improves the model's performance by providing it with prompts it's most 'familiar' with.
  • Limitations: It can be computationally expensive and doesn't always improve performance if the task is highly specific.

What is Prompt Mining?

Prompt Mining[1] is a technique used to identify the best prompt template for a given relation between a subject and an object[2] from a corpus of text. Much like traditional mining, where you search for valuable resources, in prompt mining you use algorithms to uncover the prompt template that yields the most accurate results.

The key point here is that Prompt Mining isn’t about selecting the best template for any general task. Instead, it’s focused on improving how large language models (LLMs) retrieve factual knowledge. Essentially, it boosts accuracy by discovering the language patterns and templates the model has "learned" best during training. The goal is to find prompts that consistently trigger the model to predict correct factual information.

A prompt template is a structured format for presenting questions or statements to the model, often with placeholders for customization. For example:

Prompt

Q: Why is the sky blue?

A:

In this case, the template would be:

Template 1

Q: {question}?

A:

Alternatively, another template could guide the model to complete a statement:

Prompt

The sky is blue because ...

Template 2

[x] is [y] because ...

As you can see, in both cases, the user intent is the same. However, Prompt Mining seeks the template the model is most 'acquainted' with based on its training data. Being 'acquainted' means the template frequently appears in the corpus, reflecting language patterns familiar to the model, even if not in exactly the same form.
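To make the placeholder idea concrete, here is a minimal Python sketch of filling such templates programmatically. The fill_template helper and the example values are purely illustrative and not part of any particular library.

```python
# Minimal sketch: filling placeholder-based prompt templates.
# The helper and the example values are illustrative only.

def fill_template(template: str, x: str, y: str) -> str:
    """Replace the [x] and [y] placeholders in a template string."""
    return template.replace("[x]", x).replace("[y]", y)

question_template = "Q: {question}?\nA:"
completion_template = "[x] is [y] because ..."

print(question_template.format(question="Why is the sky blue"))
# Q: Why is the sky blue?
# A:

print(fill_template(completion_template, x="The sky", y="blue"))
# The sky is blue because ...
```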

How to Use Prompt Mining

Prompt Mining is a two-stage process:

  1. Prompt Generation: Extracting prompt templates from a large corpus (e.g., Wikipedia).
  2. Prompt Selection: Choosing the best prompt template based on a selected metric.

Let’s break these stages down in more detail.

Stage 1: Prompt Generation

In this stage, you need a large corpus of text that is representative of the data the model was trained on. For instance, if your model was trained on Wikipedia and research papers, you should use those as your corpus.

Here are two common methods for generating prompt templates:

Method 1: Mining-Based Generation

This method extracts prompts from the corpus by identifying relationships between subject-object pairs. The words between the subject and object typically represent the relation, which can be converted into a prompt template. For example:

Template

[x] was born in [y]

Another approach is to use syntactic analysis to identify relationships between (subject, object) pairs. This method is flexible because it doesn’t require a manually created reference prompt, making it applicable to various types of relations.
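The sketch below illustrates the middle-words idea on a toy in-memory corpus with simple exact string matching; the sentences and (subject, object) pairs are made up for illustration, and the sentence splitting, entity linking, and dependency-parsing variants used in practice are omitted.

```python
from collections import Counter

# Toy corpus standing in for Wikipedia sentences (illustrative only).
corpus = [
    "Barack Obama was born in Hawaii.",
    "Marie Curie was born in Warsaw.",
    "Alan Turing was born in London.",
    "Ada Lovelace lived in London.",
]

# Known (subject, object) pairs for the relation "place of birth".
pairs = [
    ("Barack Obama", "Hawaii"),
    ("Marie Curie", "Warsaw"),
    ("Alan Turing", "London"),
]

def mine_templates(corpus, pairs):
    """Turn the words between each subject and object into a candidate template."""
    templates = Counter()
    for sentence in corpus:
        for subj, obj in pairs:
            if subj in sentence and obj in sentence:
                start = sentence.index(subj) + len(subj)
                end = sentence.index(obj)
                if start < end:
                    middle = sentence[start:end].strip()
                    templates[f"[x] {middle} [y]"] += 1
    return templates

print(mine_templates(corpus, pairs).most_common())
# [('[x] was born in [y]', 3)]
```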

Method 2: Paraphrasing-Based Generation

This method starts with an original prompt and aims to improve lexical diversity by paraphrasing the seed prompt while maintaining the same meaning. For example:

Original Template

[x] shares a border with [y]

Paraphrased Templates

- [x] has a common border with [y]
- [x] adjoins [y]
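One common way to implement this, used in the original paper, is back-translation: translate the seed prompt into another language and back, keeping candidates that preserve the placeholders. The sketch below only outlines that idea; translate_en_to_de and translate_de_to_en are hypothetical stand-ins for whatever translation model or API you choose.

```python
# Sketch of paraphrasing-based generation via back-translation.
# translate_en_to_de / translate_de_to_en are hypothetical helpers; wire them
# to a real translation model or API in your own setup.

def translate_en_to_de(text: str) -> list[str]:
    """Return several German translations of the English text (hypothetical)."""
    raise NotImplementedError("connect this to your translation backend")

def translate_de_to_en(text: str) -> list[str]:
    """Return several English back-translations (hypothetical)."""
    raise NotImplementedError("connect this to your translation backend")

def paraphrase(seed_template: str, keep_top: int = 5) -> list[str]:
    """Round-trip the seed prompt through another language to vary its wording."""
    candidates = set()
    for german in translate_en_to_de(seed_template):
        for english in translate_de_to_en(german):
            # Keep only candidates that still contain both placeholders.
            if "[x]" in english and "[y]" in english:
                candidates.add(english)
    return sorted(candidates)[:keep_top]

# Usage: paraphrase("[x] shares a border with [y]") might yield
# "[x] has a common border with [y]", "[x] adjoins [y]", ...
```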

Stage 2: Prompt Selection

Once you’ve generated a set of prompt templates, you need a criterion for selecting the optimal one. The simplest approach is to choose the template that appears most frequently in the corpus.

A more advanced approach is to select the template that produces the most accurate results, based on ground-truth data. Ground truth refers to the correct labels or facts in a dataset that are used to evaluate the model's predictions. For example, if you're working with the relation "[x] is owned by [y]" and the fact you're testing is "YouTube is owned by Alphabet," the ground-truth object is "Alphabet." Accuracy is then determined by how often the model's predicted object matches this ground-truth object.
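Here is a minimal sketch of both selection strategies. It assumes you already have the mined template counts, and predict_object is a hypothetical helper that sends a filled-in prompt to your model and returns its predicted object string.

```python
# Sketch of Stage 2: selecting a template by frequency or by accuracy.
# predict_object is a hypothetical helper that queries your LLM (or masked
# language model) with a filled-in prompt and returns the predicted object.

def select_by_frequency(template_counts: dict) -> str:
    """Pick the template that occurred most often in the mined corpus."""
    return max(template_counts, key=template_counts.get)

def select_by_accuracy(templates, ground_truth, predict_object) -> str:
    """Pick the template whose predictions match the ground-truth objects most often."""
    def accuracy(template: str) -> float:
        hits = 0
        for subject, true_object in ground_truth:
            # For a masked LM you might fill [y] with a [MASK] token;
            # for a chat model you could instead ask it to complete the sentence.
            prompt = template.replace("[x]", subject).replace("[y]", "[MASK]")
            if predict_object(prompt) == true_object:
                hits += 1
        return hits / len(ground_truth)
    return max(templates, key=accuracy)

# Usage (illustrative counts):
# counts = {"[y] introduced the [x]": 594, "[y] announced the [x]": 286}
# select_by_frequency(counts)   # -> "[y] introduced the [x]"
```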

Prompt Mining Example

Let’s say you’re working with the relation "manufacturer." To illustrate the difference, suppose you start with a manual prompt, that is, a prompt you’ve written based on your own intuition:

Manual Prompt

[y] manufactured the [x]

This serves as your baseline. Now, you turn to Stage 1 and generate prompt templates from a Wikipedia corpus. You might get results like these:

  • [y] introduced the [x]
  • [y] released the [x]
  • [x] attributed to the [y]
  • [y] sprinter [x]
  • [y] announced the [x]
  • [y] launched the [x]
  • [y] introduces the [x]

This is a list of potential templates. Next, you use a metric to select the best template, for example, by choosing the one that appears most frequently. Here are the results:

Template                      Frequency
[y] introduced the [x]        0.5940
[y] released the [x]          0.0022
[x] attributed to the [y]     0.1109
[y] sprinter [x]              0.00005
[y] announced the [x]         0.2857
[y] launched the [x]          0.0040
[y] introduces the [x]        0.00057

As you can see, the most frequent prompt template is "[y] introduced the [x]." Now you can compare the performance of the manual prompt versus the one identified through Prompt Mining.

Manual Prompt

[y] manufactured the [x]

Template Found with Prompt Mining

[y] introduced the [x]
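To compare the two templates empirically, you could evaluate each against a small set of ground-truth (subject, object) pairs, as in the sketch below. The pairs shown are illustrative, and predict_object is again a hypothetical function that queries your model.

```python
# Sketch: evaluate the manual vs. mined template on ground-truth pairs.
# predict_object is a hypothetical function that queries your model.

manual_template = "[y] manufactured the [x]"
mined_template = "[y] introduced the [x]"

# Illustrative (subject, object) pairs for the "manufacturer" relation.
ground_truth = [("iPhone", "Apple"), ("Walkman", "Sony"), ("PlayStation", "Sony")]

def accuracy(template, pairs, predict_object):
    """Fraction of pairs where the model's predicted manufacturer is correct."""
    hits = sum(
        predict_object(template.replace("[x]", product).replace("[y]", "[MASK]")) == maker
        for product, maker in pairs
    )
    return hits / len(pairs)

# for name, template in [("manual", manual_template), ("mined", mined_template)]:
#     print(name, accuracy(template, ground_truth, predict_object))
```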

Limitations of Prompt Mining

While Prompt Mining can enhance accuracy, it comes with a few limitations:

  • Computational Cost: Mining through large text corpora is computationally expensive, and the potential performance gains might not always justify the computing power required.

  • Minimal Performance Gains: Sometimes, the improvement in performance is minimal, and in certain cases, using a mined prompt template could even result in worse outcomes if it doesn’t align with the specific nuances of the task.

Conclusion

Prompt Mining is a powerful technique that helps improve the accuracy of large language models by identifying the most effective prompt templates based on their training data. By finding patterns the model is familiar with, it enhances the likelihood of retrieving factual information more reliably.

Footnotes

  1. Jiang, Z., Xu, F. F., Araki, J., & Neubig, G. (2019). How Can We Know What Language Models Know? https://arxiv.org/abs/1911.12543

  2. Petroni, F., Rocktäschel, T., Lewis, P., Bakhtin, A., Wu, Y., Miller, A. H., & Riedel, S. (2019). Language Models as Knowledge Bases? https://arxiv.org/abs/1909.01066
