Vote-K

Last updated on September 27, 2024 by Andres Caceres

Takeaways
  • Vote-K is a graph-based technique for selecting diverse and representative exemplars from unlabeled datasets for few-shot prompts.
  • It ensures efficient selection by focusing on diversity and covering a broad range of topics using as few exemplars as possible.
  • Stage 1: Selects initial diverse exemplars based on dissimilarity using a directed graph and cosine similarity.
  • Stage 2: Chooses challenging examples by ranking the remaining data based on model confidence, ensuring a wide range of difficulty.
  • Use cases: Ideal for scenarios with unlabeled data or when dealing with complex, multifaceted topics that require diverse representations.

What is Vote-K?

Vote-K is a graph-based technique for selecting diverse, representative exemplars from an unlabeled dataset, so that they can be annotated and used in a standard few-shot prompt.

It scores each exemplar by the "votes" it receives from similar exemplars in the dataset, and it discounts votes from exemplars that are already close to ones you have selected. In other words, an exemplar ranks higher when it represents many others while differing from what has already been picked.

The idea behind Vote-K is that annotating the entire training set is often costly and time-consuming. Instead, Vote-K efficiently selects the most informative exemplars for annotation, which can then serve as exemplars in your few-shot prompt [1].

How to Use Vote-K?

To use Vote-K, you first need a dataset of unlabeled exemplars related to your task.

Then, you apply Vote-K, a two-stage process:

Stage 1: Initial Selection

Imagine having a bucket where you collect unlabeled exemplars. At first, your bucket is empty. As you iterate during Stage 1, you gradually fill it by picking diverse and representative exemplars.

Here are the steps:

  1. Vote-K computes vector representations of each unlabeled sample using Sentence-BERT, a model used to derive sentence embeddings that can be compared using cosine similarity.

  2. Using the embeddings, a directed graph $G = (V, E)$ is constructed. In this graph:

  • Each vertex $v \in V$ represents an unlabeled instance.
  • An edge is created from each vertex to its $k$ nearest neighbors based on the cosine similarity of the embeddings. This forms local neighborhoods of similar instances.

  3. Using a metric that reflects both diversity and representativeness, the algorithm scores each unchosen vertex $u$ based on its connections in the graph. The score for a vertex decreases if it is too close to already selected exemplars stored in your bucket, encouraging diversity in the selected set.

  4. In each iteration, the vertex with the highest score is selected for annotation. The process continues until $M/10$ samples are selected, where $M$ is the total annotation budget.
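
The Stage 1 steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the original implementation: it assumes the embeddings are already computed (for example with Sentence-BERT) and passed in as an array, and the function names, the discount base `rho`, and the exact discount form (a voter's weight shrinks by a factor of `rho` for each already-selected exemplar it points to) are assumptions made for this sketch.

```python
import numpy as np

def knn_edges(embeddings, k):
    """Adjacency matrix with edges[v, u] = 1 if u is one of v's k nearest neighbors."""
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit vectors: dot product = cosine
    sim = x @ x.T
    np.fill_diagonal(sim, -np.inf)                    # a vertex is not its own neighbor
    n = len(x)
    k = min(k, n - 1)
    nearest = np.argsort(-sim, axis=1)[:, :k]
    edges = np.zeros((n, n))
    edges[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return edges

def vote_k_stage1(embeddings, k, n_select, rho=10.0):
    """Greedily pick n_select vertices; rho is an assumed discount base (> 1)."""
    edges = knn_edges(embeddings, k)
    n = edges.shape[0]
    unchosen = np.ones(n, dtype=bool)
    hits = np.zeros(n)   # hits[v]: how many selected exemplars v points to
    selected = []        # the "bucket"
    for _ in range(min(n_select, n)):
        # Each unchosen vertex votes for its neighbors; its vote is
        # discounted if it is already close to selected exemplars.
        weights = np.where(unchosen, rho ** (-hits), 0.0)
        scores = weights @ edges          # scores[u] = sum of votes received by u
        scores[~unchosen] = -np.inf       # never re-select a vertex
        u = int(np.argmax(scores))
        selected.append(u)
        unchosen[u] = False
        hits += edges[:, u]
    return selected
```

With an annotation budget of `M`, you would call something like `vote_k_stage1(embeddings, k, n_select=M // 10)` and send the returned indices for annotation.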

Stage 2: Confidence-Based Selection

Now that your bucket has some annotated examples, you focus on the remaining unlabeled instances to select the most challenging and diverse ones, avoiding the easiest examples.

  1. Using a language model, Vote-K uses the already-annotated exemplars from Stage 1 to make predictions for the rest of the data.
  2. The remaining instances are then ranked based on the model's confidence scores (how confident the model is in its predictions). The less confident the model is about an instance, the more likely it is to be selected, as these cases are typically more diverse and challenging.
  3. The remaining unlabeled samples are divided into $M$ equal-sized buckets based on their confidence scores. From each of the $9M/10$ least confident buckets, the most diverse example is chosen. The $M/10$ most confident buckets (the easiest cases) are discarded, so the final selection favors difficult, diverse instances.
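
The bucketing step can be sketched as below. This is an illustration under stated assumptions: `confidence` holds one score per remaining unlabeled sample (higher means more confident), `diversity` is a hypothetical per-sample score standing in for the graph-based score from Stage 1, and the function name is made up for this example.

```python
import numpy as np

def vote_k_stage2(confidence, diversity, budget):
    """Pick one sample from each of the 9M/10 least confident buckets (M = budget)."""
    confidence = np.asarray(confidence, dtype=float)
    diversity = np.asarray(diversity, dtype=float)
    order = np.argsort(confidence, kind="stable")   # least confident first
    buckets = np.array_split(order, budget)         # M (near-)equal-sized buckets
    n_keep = 9 * budget // 10                       # drop the M/10 most confident buckets
    picks = []
    for bucket in buckets[:n_keep]:
        if len(bucket) == 0:                        # fewer samples than buckets
            continue
        best = bucket[np.argmax(diversity[bucket])] # most diverse sample in this bucket
        picks.append(int(best))
    return picks
```

The returned indices refer to positions in the `confidence` array, so they need to be mapped back to the original dataset before annotation.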

As a result of these two stages, you have a complete set of $M$ selected exemplars:

  • $M/10$ exemplars from Stage 1 (the initial diverse and representative samples, already annotated).
  • $9M/10$ exemplars from Stage 2 (instances chosen based on the model's confidence scores).

Once you annotate the Stage 2 exemplars as well, you can plug all $M$ exemplars into your few-shot prompt!

Vote-K Example

You're an ecology student who wants to understand climate change at a deeper level by asking an AI about it. You have a dataset of climate change-related questions, so you decide to use a few-shot prompt, but you don't know which questions would make useful exemplars and, frankly, you don't have time to sift through all of them. You decide to let Vote-K choose the exemplars for you. Your dataset is as follows:

  1. "How does global warming affect polar bear populations?"
  2. "What are the primary sources of methane emissions?"
  3. "How do deforestation and land-use change contribute to carbon emissions?"
  4. "What are the long-term effects of ocean acidification on marine ecosystems?"
  5. "How does climate change influence extreme weather events like hurricanes and floods?"
  6. "What are the economic impacts of transitioning to renewable energy sources?"
  7. "How do rising temperatures affect agricultural productivity in different regions?"
  8. "What are the social consequences of climate-induced migration?"
  9. "What role do international agreements play in addressing climate change?"
  10. "How does climate change impact biodiversity loss?"
  11. etc.

Stage 1: Initial Diversity-Based Selection

Vote-K will first compute vector representations of each question in your dataset using a model like Sentence-BERT. These vectors allow the questions to be compared for similarity. The algorithm will then create a graph connecting similar questions based on their cosine similarity. In this graph, each question will be a vertex, and edges will connect it to its $k$ nearest neighbors (the most similar questions).

Vote-K will then start selecting diverse exemplars from this graph using its scoring metric. Remember, this is an iterative process, and you want to pick an exemplar that's not too similar to the ones you've already picked. So in each iteration, a score is calculated for every remaining exemplar and the one with the highest score is selected. For example, the first selected exemplars are:

  • "How does global warming affect polar bear populations?" (focus on biodiversity and animal species).
  • "What are the primary sources of methane emissions?" (focus on emissions sources).
  • "What are the economic impacts of transitioning to renewable energy sources?" (focus on economic factors).

Now that you've selected three exemplars, you annotate them with the corresponding answers. In theory, Stage 1 is supposed to select $M/10$ exemplars, but for the sake of simplicity we'll ignore that in this example ($M$ would have to be 30).

Stage 2: Confidence-Based Selection

You now give these annotated exemplars to a language model as a few-shot prompt and have it make predictions on the rest of the unlabeled questions in your dataset.

From the rest of the unlabeled questions, Vote-K will now choose questions where the model's confidence in its predictions is lower, since that suggests they're harder and more distinctive. For example, the model could struggle with:

  • "What are the social consequences of climate-induced migration?"
  • "How do rising temperatures affect agricultural productivity in different regions?" etc.

Vote-K will continue until it has selected these $9M/10$ exemplars, and once it's done, you annotate all of the ones it chooses.
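
To make "lower confidence" concrete, here is a small sketch that ranks questions from least to most confident. It assumes confidence is measured as the average log-probability of the tokens the model generated, which is one common choice and an assumption of this example. The function names are hypothetical, and the log-probabilities in the usage lines are placeholder values, not real model output.

```python
def mean_logprob(token_logprobs):
    """Average token log-probability; closer to 0 means more confident."""
    return sum(token_logprobs) / len(token_logprobs)

def rank_least_confident(logprobs_by_question):
    """Return the questions sorted from least to most confident."""
    scores = {q: mean_logprob(lp) for q, lp in logprobs_by_question.items()}
    return sorted(scores, key=scores.get)

# Placeholder log-probabilities, for illustration only.
ranking = rank_least_confident({
    "What are the social consequences of climate-induced migration?": [-2.1, -1.7, -2.4],
    "How does climate change impact biodiversity loss?": [-0.2, -0.4, -0.1],
})
```

The questions at the front of `ranking` are the ones the confidence-based stage would favor.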

Final Outcome:

By the end of both stages, Vote-K has selected a total of $M$ exemplars:

  • The $M/10$ from the initial diversity-based selection
  • Then, the $9M/10$ from the confidence-based selection

These exemplars have all been annotated and are ready for use in your few-shot prompt.
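
As a rough sketch of that last step, the annotated exemplars can be joined into a few-shot prompt like this. The `Q:`/`A:` layout and the function name are assumptions for illustration, and the answer strings are placeholders for your own annotations.

```python
def build_few_shot_prompt(exemplars, query):
    """exemplars: list of (question, answer) pairs; query: the new question."""
    blocks = [f"Q: {question}\nA: {answer}" for question, answer in exemplars]
    blocks.append(f"Q: {query}\nA:")   # leave the final answer for the model
    return "\n\n".join(blocks)

prompt = build_few_shot_prompt(
    [
        ("How does global warming affect polar bear populations?", "<your annotation>"),
        ("What are the primary sources of methane emissions?", "<your annotation>"),
    ],
    "How does climate change impact biodiversity loss?",
)
```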

Limitations of Vote-K

Vote-K prioritizes diversity when selecting exemplars, but it doesn't inherently evaluate whether the selected exemplars are relevant or useful for the task at hand. This means that while you might cover a wide range of topics, you might end up with exemplars that are too peripheral or not directly helpful for your specific task. Additionally, for very large datasets of exemplars, Vote-K can be computationally expensive, and having to label the exemplars can be a pain, especially if you do it manually.

Conclusion

Vote-K is a structured approach to finding good exemplars for your few-shot prompt: it selects a diverse set that covers a wide range of topics as efficiently as possible. It's a powerful weapon to add to your exemplar-finding arsenal, especially if you find yourself with a dataset of unlabeled exemplars. It does have limitations, though: its focus on diversity can lead it to ignore quality or relevance, it can be resource-intensive, and having to label the exemplars can be annoying.

Footnotes

  1. Su, H., Kasai, J., Wu, C. H., Shi, W., Wang, T., Xin, J., Zhang, R., Ostendorf, M., Zettlemoyer, L., Smith, N. A., & Yu, T. (2022). Selective Annotation Makes Language Models Better Few-Shot Learners. https://arxiv.org/abs/2209.01975

Copyright © 2024 Learn Prompting.