Prompt Mining

Last updated on September 11, 2024 by Andres Caceres
Takeaways
  • Prompt Mining is a way to select the optimal prompt template for a given task by searching a corpus of text for the template that appears most often.
  • It improves the model's performance by providing it with prompts it's most 'familiar' with.
  • Limitations: It can be computationally expensive and doesn't always improve performance if the task is highly specific.

What is Prompt Mining?

Prompt Mining[1] is a technique used to identify the best prompt template for a given relation between a subject and an object[2] from a corpus of text. Much like traditional mining, where you search for valuable resources, in prompt mining you use algorithms to uncover the prompt template that yields the most accurate results.

The key point here is that Prompt Mining isn’t about selecting the best template for any general task. Instead, it’s focused on improving how large language models (LLMs) retrieve factual knowledge. Essentially, it boosts accuracy by discovering the language patterns and templates the model has "learned" best during training. The goal is to find prompts that consistently trigger the model to predict correct factual information.

A prompt template is a structured format for presenting questions or statements to the model, often with placeholders for customization. For example:

Prompt

Q: Why is the sky blue?

A:

In this case, the template would be:

Template 1

Q: {question}?

A:

Alternatively, another template could guide the model to complete a statement:

Prompt

The sky is blue because ...

Template 2

[x] is [y] because ...

As you can see, in both cases, the user intent is the same. However, Prompt Mining seeks the template the model is most 'acquainted' with based on its training data. Being 'acquainted' means the template frequently appears in the corpus, reflecting language patterns familiar to the model, even if not in exactly the same form.
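To make the placeholder idea concrete, here is a minimal Python sketch of filling such templates programmatically. The fill_template helper and the example values are purely illustrative and not part of any particular library.

```python
# Minimal sketch: filling placeholder-based prompt templates.
# The helper and the example values are illustrative only.

def fill_template(template: str, x: str, y: str) -> str:
    """Replace the [x] and [y] placeholders in a template string."""
    return template.replace("[x]", x).replace("[y]", y)

question_template = "Q: {question}?\nA:"
completion_template = "[x] is [y] because ..."

print(question_template.format(question="Why is the sky blue"))
# Q: Why is the sky blue?
# A:

print(fill_template(completion_template, x="The sky", y="blue"))
# The sky is blue because ...
```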

How to Use Prompt Mining

Prompt Mining is a two-stage process:

  1. Prompt Generation: Extracting prompt templates from a large corpus (e.g., Wikipedia).
  2. Prompt Selection: Choosing the best prompt template based on a selected metric.

Let’s break these stages down in more detail.

Stage 1: Prompt Generation

In this stage, you need a large corpus of text that is representative of the data the model was trained on. For instance, if your model was trained on Wikipedia and research papers, you should use those as your corpus.

Here are two common methods for generating prompt templates:

Method 1: Mining-Based Generation

This method extracts prompts from the corpus by identifying relationships between subject-object pairs. The words between the subject and object typically represent the relation, which can be converted into a prompt template. For example:

Template

[x] was born in [y]

Another approach is to use syntactic analysis to identify relationships between (subject, object) pairs. This method is flexible because it doesn’t require a manually created reference prompt, making it applicable to various types of relations.
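The sketch below illustrates the middle-words idea on a toy in-memory corpus with simple exact string matching; the sentences and (subject, object) pairs are made up for illustration, and the sentence splitting, entity linking, and dependency-parsing variants used in practice are omitted.

```python
from collections import Counter

# Toy corpus standing in for Wikipedia sentences (illustrative only).
corpus = [
    "Barack Obama was born in Hawaii.",
    "Marie Curie was born in Warsaw.",
    "Alan Turing was born in London.",
    "Ada Lovelace lived in London.",
]

# Known (subject, object) pairs for the relation "place of birth".
pairs = [
    ("Barack Obama", "Hawaii"),
    ("Marie Curie", "Warsaw"),
    ("Alan Turing", "London"),
]

def mine_templates(corpus, pairs):
    """Turn the words between each subject and object into a candidate template."""
    templates = Counter()
    for sentence in corpus:
        for subj, obj in pairs:
            if subj in sentence and obj in sentence:
                start = sentence.index(subj) + len(subj)
                end = sentence.index(obj)
                if start < end:
                    middle = sentence[start:end].strip()
                    templates[f"[x] {middle} [y]"] += 1
    return templates

print(mine_templates(corpus, pairs).most_common())
# [('[x] was born in [y]', 3)]
```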

Method 2: Paraphrasing-Based Generation

This method starts with an original prompt and aims to improve lexical diversity by paraphrasing the seed prompt while maintaining the same meaning. For example:

Original Template

[x] shares a border with [y]

Paraphrased Templates

- [x] has a common border with [y]
- [x] adjoins [y]
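One common way to implement this, used in the original paper, is back-translation: translate the seed prompt into another language and back, keeping candidates that preserve the placeholders. The sketch below only outlines that idea; translate_en_to_de and translate_de_to_en are hypothetical stand-ins for whatever translation model or API you choose.

```python
# Sketch of paraphrasing-based generation via back-translation.
# translate_en_to_de / translate_de_to_en are hypothetical helpers; wire them
# to a real translation model or API in your own setup.

def translate_en_to_de(text: str) -> list[str]:
    """Return several German translations of the English text (hypothetical)."""
    raise NotImplementedError("connect this to your translation backend")

def translate_de_to_en(text: str) -> list[str]:
    """Return several English back-translations (hypothetical)."""
    raise NotImplementedError("connect this to your translation backend")

def paraphrase(seed_template: str, keep_top: int = 5) -> list[str]:
    """Round-trip the seed prompt through another language to vary its wording."""
    candidates = set()
    for german in translate_en_to_de(seed_template):
        for english in translate_de_to_en(german):
            # Keep only candidates that still contain both placeholders.
            if "[x]" in english and "[y]" in english:
                candidates.add(english)
    return sorted(candidates)[:keep_top]

# Usage: paraphrase("[x] shares a border with [y]") might yield
# "[x] has a common border with [y]", "[x] adjoins [y]", ...
```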

Stage 2: Prompt Selection

Once you’ve generated a set of prompt templates, you need a criterion for selecting the optimal one. The simplest approach is to choose the template that appears most frequently in the corpus.

A more advanced approach is to select the template that produces the most accurate results, based on ground-truth data. Ground truth refers to the correct labels or facts in a dataset that are used to evaluate the model's predictions. For example, if you're working with the relation "[x] is owned by [y]" and the fact you're testing is "YouTube is owned by Alphabet," the ground-truth object is "Alphabet." Accuracy is then determined by how often the model's predicted object matches this ground-truth object.
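Here is a minimal sketch of both selection strategies. It assumes you already have the mined template counts, and predict_object is a hypothetical helper that sends a filled-in prompt to your model and returns its predicted object string.

```python
# Sketch of Stage 2: selecting a template by frequency or by accuracy.
# predict_object is a hypothetical helper that queries your LLM (or masked
# language model) with a filled-in prompt and returns the predicted object.

def select_by_frequency(template_counts: dict) -> str:
    """Pick the template that occurred most often in the mined corpus."""
    return max(template_counts, key=template_counts.get)

def select_by_accuracy(templates, ground_truth, predict_object) -> str:
    """Pick the template whose predictions match the ground-truth objects most often."""
    def accuracy(template: str) -> float:
        hits = 0
        for subject, true_object in ground_truth:
            # For a masked LM you might fill [y] with a [MASK] token;
            # for a chat model you could instead ask it to complete the sentence.
            prompt = template.replace("[x]", subject).replace("[y]", "[MASK]")
            if predict_object(prompt) == true_object:
                hits += 1
        return hits / len(ground_truth)
    return max(templates, key=accuracy)

# Usage (illustrative counts):
# counts = {"[y] introduced the [x]": 594, "[y] announced the [x]": 286}
# select_by_frequency(counts)   # -> "[y] introduced the [x]"
```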

Prompt Mining Example

Let’s say you’re working with the relation "manufacturer." To illustrate the difference, suppose you start with a manual prompt, that is, a prompt you’ve written based on your own intuition:

Manual Prompt

[y] manufactured the [x]

This serves as your baseline. Now, you turn to Stage 1 and generate prompt templates from a Wikipedia corpus. You might get results like these:

  • [y] introduced the [x]
  • [y] released the [x]
  • [x] attributed to the [y]
  • [y] sprinter [x]
  • [y] announced the [x]
  • [y] launched the [x]
  • [y] introduces the [x]

This is a list of potential templates. Next, you use a metric to select the best template, for example, by choosing the one that appears most frequently. Here are the results:

Template                      Frequency
[y] introduced the [x]        0.5940
[y] released the [x]          0.0022
[x] attributed to the [y]     0.1109
[y] sprinter [x]              0.00005
[y] announced the [x]         0.2857
[y] launched the [x]          0.0040
[y] introduces the [x]        0.00057

As you can see, the most frequent prompt template is "[y] introduced the [x]." Now you can compare the performance of the manual prompt versus the one identified through Prompt Mining.

Manual Prompt

[y] manufactured the [x]

Template Found with Prompt Mining

[y] introduced the [x]
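To compare the two templates empirically, you could evaluate each against a small set of ground-truth (subject, object) pairs, as in the sketch below. The pairs shown are illustrative, and predict_object is again a hypothetical function that queries your model.

```python
# Sketch: evaluate the manual vs. mined template on ground-truth pairs.
# predict_object is a hypothetical function that queries your model.

manual_template = "[y] manufactured the [x]"
mined_template = "[y] introduced the [x]"

# Illustrative (subject, object) pairs for the "manufacturer" relation.
ground_truth = [("iPhone", "Apple"), ("Walkman", "Sony"), ("PlayStation", "Sony")]

def accuracy(template, pairs, predict_object):
    """Fraction of pairs where the model's predicted manufacturer is correct."""
    hits = sum(
        predict_object(template.replace("[x]", product).replace("[y]", "[MASK]")) == maker
        for product, maker in pairs
    )
    return hits / len(pairs)

# for name, template in [("manual", manual_template), ("mined", mined_template)]:
#     print(name, accuracy(template, ground_truth, predict_object))
```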

Limitations of Prompt Mining

While Prompt Mining can enhance accuracy, it comes with a few limitations:

  • Computational Cost: Mining through large text corpora is computationally expensive, and the potential performance gains might not always justify the computing power required.

  • Minimal Performance Gains: Sometimes, the improvement in performance is minimal, and in certain cases, using a mined prompt template could even result in worse outcomes if it doesn’t align with the specific nuances of the task.

Conclusion

Prompt Mining is a powerful technique that helps improve the accuracy of large language models by identifying the most effective prompt templates based on their training data. By finding patterns the model is familiar with, it enhances the likelihood of retrieving factual information more reliably.

Footnotes

  1. Jiang, Z., Xu, F. F., Araki, J., & Neubig, G. (2019). How Can We Know What Language Models Know? https://arxiv.org/abs/1911.12543

  2. Petroni, F., Rocktäschel, T., Lewis, P., Bakhtin, A., Wu, Y., Miller, A. H., & Riedel, S. (2019). Language Models as Knowledge Bases? https://arxiv.org/abs/1909.01066
