🧠 Advanced

🟢 Self-Generated In-Context Learning (SG-ICL)

Last updated on October 3, 2024 by Andres Caceres

Takeaways
  • Self-Generated In-Context Learning (SG-ICL) is a technique for getting few-shot exemplars directly from the model you're trying to get answers from.
  • It's intuitive, easy-to-use, and fast, and it comes in handy when you don't have a dataset of exemplars available.
  • However, its speed and ease of use come at the expense of quality: it doesn't perform as well as techniques that select exemplars from a dataset.

Information and Links

| Technique | Institution | Date of Publication | Paper |
|---|---|---|---|
| Self-Generated In-Context Learning (SG-ICL) | Seoul National University, Hanyang University, NAVER AI Lab, NAVER CLOVA | Jun 2022 | Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator |

What is Self-Generated In-Context Learning?

Self-Generated In-Context Learning (SG-ICL) is a technique to generate exemplars for a few-shot standard prompt by asking the model itself to generate them.

Typically, in-context learning (ICL) relies on a few input-label pairs (called exemplars or demonstrations) to help models perform tasks without fine-tuning. However, these demonstrations are usually drawn from an external dataset, so the approach depends on having one. SG-ICL generates the demonstrations with the language model itself, reducing reliance on external datasets and improving performance consistency.

How does SG-ICL work?

It works in two steps:

  1. Self-Generation Step: The model generates exemplars closely related to the specific task at hand, improving input-demonstration correlation.

  2. Inference Step: The generated samples are used as exemplars. The model then predicts the class of the test input from these samples. Because they are tailored to the task, results tend to be more consistent than with randomly selected external examples.
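The two steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the paper's implementation: `llm` is a hypothetical stand-in for any function that takes a prompt string and returns the model's completion, the function names are made up for illustration, and the prompt wording follows the two-category sentiment templates on this page.

```python
from typing import Callable, List, Tuple

# `llm` is a hypothetical callable: prompt string in, completion string out.
LLM = Callable[[str], str]


def self_generate(llm: LLM, test_input: str, labels: List[str]) -> List[Tuple[str, str]]:
    """Step 1: ask the model for one exemplar per label, conditioned on the test input."""
    exemplars = []
    for label in labels:
        prompt = f"Generate a review: {test_input}\n\nGenerate a {label} review:"
        exemplars.append((llm(prompt).strip(), label))
    return exemplars


def infer(llm: LLM, test_input: str, exemplars: List[Tuple[str, str]]) -> str:
    """Step 2: put the generated exemplars before the test input and ask for its label."""
    shots = "".join(
        f"Review: {text}\nSentiment: {label}\n\n" for text, label in exemplars
    )
    prompt = f"{shots}Review: {test_input}\nSentiment:"
    return llm(prompt).strip()


def sg_icl(llm: LLM, test_input: str, labels: List[str]) -> str:
    """Run both steps with the same model."""
    return infer(llm, test_input, self_generate(llm, test_input, labels))
```

Passing the model in as a plain callable keeps the sketch independent of any particular API client. With one exemplar per label, each test input costs `len(labels)` generation calls plus one inference call.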

Benefits and Applications

SG-ICL offers several benefits for text classification tasks such as sentiment analysis and natural language inference:

  • No external data required: The main advantage of SG-ICL is that you don't need a dataset of exemplars, and it's really easy to use.
  • Low variance in performance: By generating task-specific examples, SG-ICL provides more consistent results compared to relying on randomly selected demonstrations from datasets.

How SG-ICL differs from existing techniques

SG-ICL stands out because it self-generates demonstrations instead of retrieving them from external datasets. Here's how it differs from other methods:

  • Few-Shot Learning: Few-shot learning uses a small number of manually selected training samples. SG-ICL, on the other hand, eliminates the need for any external training data, performing well even without direct training samples.

  • Zero-Shot Learning: In zero-shot learning, models perform tasks without any examples or training data. SG-ICL generates its own in-context examples, resulting in better performance than zero-shot models, which have no context for the task.

How to use SG-ICL

The following are the templates for the self-generation step:

| Task | Self Generation Template |
|---|---|
| Sentiment analysis (2 categories) | Generate a review: a fast, funny, highly enjoyable movie. Generate a "negative" review: |
| Sentiment analysis (5 categories) | Generate a review: it 's worth taking the kids to. Generate a "negative" review: |
| Recognizing Textual Entailment | Premise: Dana Reeve, the widow of the actor Christopher Reeve, has died of lung cancer at age 44, according to the Christopher Reeve Foundation. Generate a Hypothesis: Christopher Reeve had an accident. Generate a "true" Hypothesis: |
| CommitmentBank | Premise: It was a complex language. Not written down but handed down. One might say it was peeled down. Generate a Hypothesis: the language was peeled down. Generate a "neither" Hypothesis: |
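One way to work with these templates is to store them as format strings and fill in the test input and the target label. The sketch below is illustrative only: the dictionary keys, the field names (`text`, `premise`, `hypothesis`), and the helper `self_generation_prompts` are assumptions made for this example. The two entailment-style tasks share one template because their wording in the table is the same apart from the label.

```python
from typing import Dict, List

# Format strings copied from the table above; keys and field names are illustrative.
SELF_GENERATION_TEMPLATES = {
    "sentiment": 'Generate a review: {text} Generate a "{label}" review:',
    "entailment": (
        "Premise: {premise} Generate a Hypothesis: {hypothesis} "
        'Generate a "{label}" Hypothesis:'
    ),
}


def self_generation_prompts(task: str, labels: List[str], **fields: str) -> Dict[str, str]:
    """Build one self-generation prompt per target label."""
    template = SELF_GENERATION_TEMPLATES[task]
    return {label: template.format(label=label, **fields) for label in labels}


prompts = self_generation_prompts(
    "sentiment",
    ["positive", "negative"],
    text="a fast, funny, highly enjoyable movie.",
)
print(prompts["negative"])
# Generate a review: a fast, funny, highly enjoyable movie. Generate a "negative" review:
```

Each returned prompt would be sent to the model separately, and each completion becomes one exemplar carrying the label it was generated for.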

Those templates are used to generate exemplars for the inference step. The templates for the inference step are as follows:

| Task | Inference Template | Verbalizer |
|---|---|---|
| Minimal | a fast , funny , highly enjoyable movie . positive | - |
| Sentiment analysis (2 categories) | Review: a fast, funny, highly enjoyable movie. Sentiment : positive | positive / negative |
| Sentiment analysis (5 categories) | Review: it 's worth taking the kids to. | terrible / bad / okay / good / great |
| Recognizing Textual Entailment | Premise: Dana Reeve, the widow of the actor Christopher Reeve, has died of lung cancer at age 44, according to the Christopher Reeve Foundation. Hypothesis: Christopher Reeve had an accident. True or False? false | true / false |
| CommitmentBank | Premise: It was a complex language. Not written down but handed down. One might say it was peeled down. Hypothesis: the language was peeled down Yes, No, or Neither? yes | yes / no / neither |

Note that these templates don't include the exemplars generated in the self-generation step. A real prompt would place those exemplars before the actual inference template.
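To make the role of the verbalizer concrete, here is a hedged sketch for the Recognizing Textual Entailment template. It assembles an inference prompt with the exemplars first and maps the raw completion back onto one of the verbalizer words. The function names, the blank-line separator between exemplars, and the first-word parsing rule are all assumptions made for illustration, and the premises and hypotheses in the usage lines are made-up placeholders.

```python
from typing import List, Optional, Sequence, Tuple

RTE_VERBALIZER = ("true", "false")  # label words from the table above


def format_rte(premise: str, hypothesis: str, label: str = "") -> str:
    """Fill the RTE inference template; leave the label empty for the test input."""
    return f"Premise: {premise} Hypothesis: {hypothesis} True or False? {label}".rstrip()


def build_inference_prompt(
    exemplars: List[Tuple[str, str, str]], test_pair: Tuple[str, str]
) -> str:
    """Place the (premise, hypothesis, label) exemplars before the unlabeled test pair."""
    blocks = [format_rte(p, h, label) for p, h, label in exemplars]
    blocks.append(format_rte(*test_pair))
    return "\n\n".join(blocks)  # blank-line separator is an assumption


def to_label(completion: str, verbalizer: Sequence[str]) -> Optional[str]:
    """Map a raw completion to a verbalizer word, or None if it doesn't match."""
    tokens = completion.strip().lower().split()
    if not tokens:
        return None
    first = tokens[0].strip(".,!?:;\"'")
    return first if first in verbalizer else None


# Placeholder data, not taken from any dataset.
prompt = build_inference_prompt(
    [("A dog runs.", "An animal moves.", "true")],
    ("It rained.", "The ground is dry."),
)
print(prompt)
print(to_label(" False.", RTE_VERBALIZER))
```

Returning `None` for an unrecognized completion makes it explicit when the model answered outside the verbalizer, so the caller can decide whether to retry or discard the prediction.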

SG-ICL Example: Sentiment Analysis

Let's say you have a review from a restaurant customer that goes like this: "the food was amazingly bad, and the service wasn't anything to write home about."

You want to determine the sentiment of the review, but you don't have a dataset of example reviews to go off of, so you decide to use SG-ICL.

Step 1: Self Generation

First, you include your original review in a prompt that asks the model to generate more example reviews. We'll generate two for this example: one positive and one negative.

Note that this initial step is a few-shot prompt of its own, with one exemplar. That's why it's better to have "generate a review" prefacing your example.

Prompt


Generate a review: the food was amazingly bad, and the service wasn't anything to write home about.

Generate a positive review:

AI Output


the food was insanely good!

Prompt


Generate a review: the food was amazingly bad, and the service wasn't anything to write home about.

Generate a negative review:

AI Output


I could've gone back home and made homemade lasagna in the time they took to complete my order.

Step 2: Inference

This part is simple. You just put it all together into one prompt, input it into the model, and get back your answer.

Prompt


Review: the food was insanely good!
Sentiment: Positive

Review: I could've gone back home and made homemade lasagna in the time they took to complete my order.
Sentiment: Negative

Review: the food was amazingly bad, and the service wasn't anything to write home about.
Sentiment:

AI Output


Negative

Limitations of SG-ICL

SG-ICL is a good fit when you don't have a dataset, and it is much less computationally expensive than techniques that operate on a dataset (e.g., KNN, Vote-k). It performs worse than those techniques, though, so it should only really be used when no dataset is available or when you have one but lack the computational resources to use it.

Conclusion

Self-Generated In-Context Learning is an intuitive method for generating exemplars for a few-shot prompt directly from the model that you're going to be prompting. It works best when you don't have access to a dataset of exemplars or don't have the computational resources to run operations on one. SG-ICL performs better than zero-shot prompting, but worse than techniques that operate on datasets, such as KNN or Vote-k.

Footnotes

  1. Kim, H. J., Cho, H., Kim, J., Kim, T., Yoo, K. M., & Lee, S.-g. (2022). Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator. https://arxiv.org/abs/2206.08082

Copyright © 2024 Learn Prompting.