Select by similarity
The SemanticSimilarityExampleSelector selects examples based on their similarity to the input text. It uses cosine similarity between sentence embeddings to find the most relevant examples.
How it works
Under the hood, the SemanticSimilarityExampleSelector does the following:
Generates embeddings for each example using a sentence embedding model like OpenAIEmbeddings.
Stores these embeddings in a vector store like Chroma to enable fast nearest neighbor search.
When given an input, generates its embedding using the same sentence embedding model.
Finds the most similar example embeddings in the vector store using cosine similarity.
Returns the examples corresponding to the most similar embeddings.
Performance considerations
Using SemanticSimilarityExampleSelector can be slow for large example sets, since it needs to embed every example. Some options to improve performance:
- Reduce the number of examples if possible
- Use a faster embedding model like SBERT
- Increase k so fewer examples need to be compared per query
Choosing k
The k parameter controls how many similar examples to return. Lower k improves performance, higher k potentially improves accuracy. Some guidelines:
- Start with k=1 or k=2 for baseline performance
- Increase k if model accuracy is poor and more examples may help
- Decrease k if query latency is too high and fewer examples are needed
Adding new examples
When adding new examples via add_example()
, be sure to call embed_examples()
afterwards to generate embeddings needed for similarity search.
Usage
Import required classes
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
Prepare examples
examples = [
{"input": "happy", "output": "sad"},
{"input": "tall", "output": "short"},
{"input": "energetic", "output": "lethargic"},
]
Create selector
selector = SemanticSimilarityExampleSelector.from_examples(
examples,
OpenAIEmbeddings(),
Chroma,
k=2
)
This creates the selector preloaded with the examples, ready to find similar ones.
Use selector
Pass the selector to FewShotPromptTemplate:
prompt = FewShotPromptTemplate(
example_selector=selector,
# ...
)
Or select examples manually:
input = "joyful"
selected = selector.select_examples({"input": input})
Examples
Question Answering
examples = [
{"question": "Who is the CEO of Apple?", "answer": "Tim Cook"},
{"question": "When was Python created?", "answer": "1991"},
]
selector = SemanticSimilarityExampleSelector.from_examples(examples, OpenAIEmbeddings(), Chroma, k=1)
prompt = FewShotPromptTemplate(
example_selector=selector,
# ...
)
Summarization
examples = [
{"text": "The fox jumped over the lazy dog.", "summary": "A fox jumped over a lazy dog."},
{"text": "Apple sells iPhones, iPads and Macbooks.", "summary": "Apple sells various electronics."},
]
selector = SemanticSimilarityExampleSelector.from_examples(examples, OpenAIEmbeddings(), Chroma, k=1)
prompt = FewShotPromptTemplate(
example_selector=selector,
# ...
)