Search Types in LangChain Retrievers

Posted: Feb 2, 2025.

The SearchType enum in LangChain allows you to specify what kind of vector search to perform when using retrievers. Let's explore how this works and how to use it effectively.

What is SearchType?

SearchType is an enumeration class that defines the types of vector search a LangChain retriever can perform. This post covers two options:

similarity: A standard similarity search that finds the most similar vectors to the query
mmr (Maximal Marginal Relevance): A search method that balances relevance to the query against diversity among the documents already selected

Reference

Option	Description
`similarity`	Performs a standard similarity search using vector embeddings
`mmr`	Uses Maximal Marginal Relevance to rerank similarity search results for better diversity
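For orientation, an enum with these two options could be declared roughly as follows. This is a minimal sketch, not the library source: the class name `SearchTypeSketch` is made up for illustration, and it assumes the members are string-valued so that they compare equal to their plain-string names.

```python
from enum import Enum


class SearchTypeSketch(str, Enum):
    """Illustrative stand-in for a search-type enum (hypothetical name)."""

    similarity = "similarity"
    mmr = "mmr"


# Members can be looked up by value, which is handy when the choice
# comes from a config file or an environment variable.
chosen = SearchTypeSketch("mmr")
is_mmr = chosen is SearchTypeSketch.mmr

# Because the enum subclasses str, a member also equals its raw string.
equals_plain_string = chosen == "mmr"
```

Looking a member up by value raises `ValueError` for an unknown string, so a typo in configuration fails early instead of silently falling back to a default search.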

How to Use SearchType

Here are some examples of using SearchType in your retrievers:

Basic Similarity Search

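from langchain_community.retrievers import ZepRetriever
# ZepRetriever types its search_type field with its own SearchType enum,
# defined alongside the retriever with the same similarity/mmr members
from langchain_community.retrievers.zep import SearchType

# Initialize retriever with similarity search
# (assumes a Zep server is reachable at this URL)
retriever = ZepRetriever(
    session_id="123",
    url="http://localhost:8000",
    top_k=5,
    search_type=SearchType.similarity,
)

# Retrieve documents using similarity search
# (invoke replaces the deprecated get_relevant_documents)
docs = retriever.invoke("What is machine learning?")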

Using MMR for Diverse Results

# Initialize retriever with MMR search
# (reuses the ZepRetriever and SearchType imports from the previous example)
retriever = ZepRetriever(
    session_id="123",
    url="http://localhost:8000",
    top_k=5,
    search_type=SearchType.mmr,
    mmr_lambda=0.5,  # Controls diversity vs similarity tradeoff
)

# Retrieve documents using MMR
# (invoke replaces the deprecated get_relevant_documents)
docs = retriever.invoke("What is machine learning?")

With MultiVectorRetriever

from langchain.retrievers.multi_vector import MultiVectorRetriever, SearchType
from langchain_chroma import Chroma
from langchain.storage import InMemoryStore

# Initialize stores
vectorstore = Chroma()
docstore = InMemoryStore()

# Create retriever with MMR search
retriever = MultiVectorRetriever(
    vectorstore=vectorstore,
    docstore=docstore,
    search_type=SearchType.mmr,
    search_kwargs={"k": 2},  # Forwarded to the vector store's search method
)

# Retrieve using MMR
# (invoke replaces the deprecated get_relevant_documents)
results = retriever.invoke("query")
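One way to picture what the search_type setting does is that the retriever calls a different vector store method depending on the enum value, forwarding search_kwargs as keyword arguments. The sketch below is a simplification under stated assumptions: `StubVectorStore`, `SearchTypeSketch`, and `run_search` are hypothetical names and not part of LangChain. Only the method names `similarity_search` and `max_marginal_relevance_search` mirror the LangChain vector store interface.

```python
from enum import Enum
from typing import Any, Dict, List


class SearchTypeSketch(str, Enum):
    """Hypothetical stand-in for the search-type enum."""

    similarity = "similarity"
    mmr = "mmr"


class StubVectorStore:
    """Hypothetical in-memory stand-in that records which method ran."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def similarity_search(self, query: str, k: int = 4) -> List[str]:
        self.calls.append("similarity_search")
        return [f"doc-{i}" for i in range(k)]

    def max_marginal_relevance_search(self, query: str, k: int = 4) -> List[str]:
        self.calls.append("max_marginal_relevance_search")
        return [f"diverse-doc-{i}" for i in range(k)]


def run_search(
    store: StubVectorStore,
    search_type: SearchTypeSketch,
    query: str,
    search_kwargs: Dict[str, Any],
) -> List[str]:
    """Pick the store method from the search type and forward the kwargs."""
    if search_type == SearchTypeSketch.mmr:
        return store.max_marginal_relevance_search(query, **search_kwargs)
    return store.similarity_search(query, **search_kwargs)


store = StubVectorStore()
mmr_hits = run_search(store, SearchTypeSketch.mmr, "query", {"k": 2})
sim_hits = run_search(store, SearchTypeSketch.similarity, "query", {"k": 2})
```

In this sketch, anything placed in search_kwargs, such as `k`, reaches the store unchanged. That is why search options are set on the retriever once rather than on every query.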

The key difference between similarity and mmr is that MMR will try to ensure diversity in the results by balancing similarity scores with how different documents are from each other. This can be particularly useful when you want to:

Avoid redundant information in your results
Get a broader coverage of potentially relevant information
Reduce the impact of near-duplicate documents in your results

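When using MMR with ZepRetriever, you can adjust the mmr_lambda parameter (typically between 0 and 1) to control the tradeoff between relevance and diversity. A higher value gives more weight to similarity to the query, while a lower value prioritizes diversity among the results. With MultiVectorRetriever, the corresponding knob is the vector store's lambda_mult argument, which you pass through search_kwargs.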
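To make the tradeoff concrete, here is a minimal sketch of greedy MMR selection in plain Python. It is an illustration of the general technique, not the code LangChain or Zep runs. The names `cosine`, `mmr_select`, and `lam` are assumptions, with `lam` playing the role of the lambda parameter. Each step picks the candidate that maximizes `lam * relevance - (1 - lam) * redundancy`, where redundancy is the highest similarity to any document already chosen.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int,
    lam: float,
) -> List[int]:
    """Greedily pick up to k candidate indices by MMR score."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Similarity to the closest already-selected document.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [
    [1.0, 0.05],  # index 0: very close to the query
    [1.0, 0.06],  # index 1: near-duplicate of index 0
    [0.6, 0.8],   # index 2: less similar, but points somewhere else
]

relevance_heavy = mmr_select(query, candidates, k=2, lam=1.0)  # [0, 1]
diversity_heavy = mmr_select(query, candidates, k=2, lam=0.3)  # [0, 2]
```

With `lam=1.0` the selection reduces to a plain similarity ranking, so the near-duplicate is returned. With `lam=0.3` the second pick skips the near-duplicate in favor of the more distinct vector.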

This functionality is particularly useful in applications like:

Question answering systems where you want diverse context
Content recommendation systems
Document summarization where you need varied perspectives
Research tasks requiring broad coverage of a topic
