Mastering Search Types in LangChain with SearchType Enum

Posted: Feb 3, 2025.

LangChain's SearchType enum lets you choose which vector search strategy a retriever uses when it fetches documents. This guide shows how to use SearchType to tune your retrieval results.

What is SearchType?

SearchType is an enumeration that tells a retriever which kind of search to run against its vector store. This guide covers two search methods:

  • similarity: Standard similarity search using vector embeddings
  • mmr: Maximal Marginal Relevance (MMR) search which optimizes for both relevance and diversity in results

Reference

Method      | Description
similarity  | Performs a standard similarity search using vector embeddings to find the most relevant documents
mmr         | Uses Maximal Marginal Relevance algorithm to balance between relevance and diversity in search results
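
To make the two members concrete, here is a minimal stand-in for the enum. It assumes SearchType follows the common string-backed Enum pattern; the class below is an illustration written for this guide, not LangChain's source.

```python
from enum import Enum


class SearchType(str, Enum):
    """Illustrative stand-in with the two members listed above."""

    similarity = "similarity"
    mmr = "mmr"


# A string-backed enum can be built from its raw value...
chosen = SearchType("mmr")

# ...and compares equal to that raw value.
is_same_member = chosen is SearchType.mmr
equals_string = chosen == "mmr"

# All supported values, in definition order.
supported = [member.value for member in SearchType]

# An unknown value is rejected with ValueError.
try:
    SearchType("keyword")
    rejected = False
except ValueError:
    rejected = True
```

Under that assumption, passing the member (SearchType.mmr) rather than a bare string means a typo surfaces as an error at the point where you write it.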

How to use SearchType

Let's look at different ways to use SearchType in your LangChain applications.

The default search type is similarity search. Here's how to explicitly specify it:

from langchain.retrievers.multi_vector import MultiVectorRetriever, SearchType

# `vectorstore` and `store` are assumed to be created elsewhere
# (any LangChain vector store and byte store).
retriever = MultiVectorRetriever(
    vectorstore=vectorstore,
    byte_store=store,
    search_type=SearchType.similarity,
)

# Perform a similarity search
results = retriever.invoke("What is quantum computing?")

MMR search helps reduce redundancy in search results by considering both relevance and diversity:

from langchain.retrievers.multi_vector import MultiVectorRetriever, SearchType

# `vectorstore` and `store` are assumed to be created elsewhere.
retriever = MultiVectorRetriever(
    vectorstore=vectorstore,
    byte_store=store,
    search_type=SearchType.mmr,
    search_kwargs={"k": 10, "lambda_mult": 0.5},  # Configure MMR parameters
)

# Perform MMR search
results = retriever.invoke("What are the applications of AI?")

Using SearchType with Zep Retriever

ZepRetriever, which searches a Zep session's long-term memory, has its own SearchType enum in the same module, so import both from there:

from langchain_community.retrievers.zep import SearchType, ZepRetriever

# Initialize retriever with MMR search
retriever = ZepRetriever(
    session_id="user_123",
    url="http://localhost:8000",
    top_k=5,
    search_type=SearchType.mmr,
    mmr_lambda=0.5  # Control diversity vs relevance trade-off
)

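# Search the session's memory. ainvoke() is a coroutine, so this line
# must run inside an async function (or a notebook with top-level await).
results = await retriever.ainvoke("Tell me about deep learning")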
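
Outside a notebook, an awaited call has to live inside a coroutine that an event loop runs. The sketch below shows that wrapping pattern. FakeRetriever is a hypothetical stand-in so the snippet runs without a Zep server; in real code you would construct the ZepRetriever shown above instead.

```python
import asyncio


class FakeRetriever:
    """Hypothetical stand-in for any retriever that exposes ainvoke()."""

    async def ainvoke(self, query: str) -> list[str]:
        await asyncio.sleep(0)  # placeholder for the network round trip
        return [f"result for: {query}"]


async def main() -> list[str]:
    retriever = FakeRetriever()
    # The await is legal here because main() is a coroutine.
    return await retriever.ainvoke("Tell me about deep learning")


# asyncio.run() starts an event loop, runs main() to completion, and closes the loop.
results = asyncio.run(main())
```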

Combining with MultiVector Retrieval

With MultiVectorRetriever, the search type controls how the child chunks in the vector store are searched before their parent documents are returned:

from langchain.retrievers.multi_vector import MultiVectorRetriever, SearchType
from langchain_chroma import Chroma
from langchain.storage import InMemoryStore

# Set up stores. `embeddings` is assumed to be a LangChain Embeddings
# instance created elsewhere; Chroma needs one to run MMR search.
vectorstore = Chroma(collection_name="my_collection", embedding_function=embeddings)
docstore = InMemoryStore()

# Initialize retriever with MMR search
retriever = MultiVectorRetriever(
    vectorstore=vectorstore,
    docstore=docstore,
    search_type=SearchType.mmr,
    search_kwargs={
        "k": 5,  # Number of child chunks to return from the vector store
        "fetch_k": 20,  # Number of candidates to fetch before MMR reranking
        "lambda_mult": 0.7,  # Balance between relevance and diversity
    },
)

# Add documents and perform search.
# child_documents: Documents whose metadata["doc_id"] (the default id_key)
# holds the ID of their parent.
# parent_documents: a list of (doc_id, Document) pairs, the shape mset() expects.
retriever.vectorstore.add_documents(child_documents)
retriever.docstore.mset(parent_documents)
results = retriever.invoke("How does natural language processing work?")
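
The link between child chunks and parent documents is easier to see in isolation. The following is a simplified, pure-Python sketch of that lookup step. Plain dicts stand in for the vector store hits and the docstore, and resolve_parents is a hypothetical helper written for this guide, not a LangChain function.

```python
# Child hits as a vector store might return them, best match first.
# Each one carries its parent's ID under the "doc_id" metadata key.
child_hits = [
    {"text": "NLP splits text into tokens", "metadata": {"doc_id": "p1"}},
    {"text": "Tokens are mapped to embeddings", "metadata": {"doc_id": "p1"}},
    {"text": "Transformers rely on attention", "metadata": {"doc_id": "p2"}},
]

# The docstore maps parent IDs to full parent documents.
docstore = {"p1": "Full parent document 1", "p2": "Full parent document 2"}


def resolve_parents(hits, store, id_key="doc_id"):
    """Collect parent IDs in hit order, drop repeats, then look up each parent."""
    parent_ids = []
    for hit in hits:
        parent_id = hit["metadata"].get(id_key)
        if parent_id is not None and parent_id not in parent_ids:
            parent_ids.append(parent_id)
    return [store[pid] for pid in parent_ids if pid in store]


parents = resolve_parents(child_hits, docstore)
# parents == ["Full parent document 1", "Full parent document 2"]
```

Because several child chunks can point at the same parent, the number of documents returned can be smaller than k. This is one reason MMR's diversity can help here: more varied child hits tend to point at more distinct parents.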

When using SearchType.mmr, the lambda_mult parameter sets the trade-off between relevance and diversity. It ranges from 0 to 1: values near 1 favor relevance, while values near 0 promote diversity in the results.
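
The effect of lambda_mult can be checked by hand on a toy example. Below is a minimal greedy MMR sketch in pure Python, assuming cosine similarity and the usual scoring rule (lambda_mult times relevance, minus (1 - lambda_mult) times similarity to the closest already-selected item). The function names and vectors are made up for illustration, and this is not LangChain's implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


def mmr_select(query, candidates, k, lambda_mult):
    """Greedily pick k candidate indices, returned in selection order."""
    selected = []
    remaining = list(range(len(candidates)))

    def score(i):
        relevance = cosine(query, candidates[i])
        # Similarity to the closest item already chosen (0 if none yet).
        redundancy = max(
            (cosine(candidates[i], candidates[j]) for j in selected), default=0.0
        )
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
docs = [
    [1.0, 0.1],   # 0: very close to the query
    [1.0, 0.2],   # 1: close to the query, but nearly a duplicate of 0
    [0.6, -0.8],  # 2: less relevant, points in a different direction
]

relevance_only = mmr_select(query, docs, k=2, lambda_mult=1.0)  # [0, 1]
balanced = mmr_select(query, docs, k=2, lambda_mult=0.5)        # [0, 2]
```

With lambda_mult=1.0 the near-duplicate wins the second slot on relevance alone. At 0.5 its similarity to the first pick outweighs its relevance, so the more distinct document is chosen instead.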

This configuration gives you fine-grained control over how documents are retrieved, allowing you to optimize for your specific use case whether you need the most relevant results or a more diverse set of documents.

