Using ScaNN Vector Store for Efficient Similarity Search in LangChain

Posted: Jan 28, 2025.

ScaNN (Scalable Nearest Neighbors) is a powerful vector store implementation in LangChain that enables efficient similarity search at scale. In this guide, we'll explore how to use ScaNN with LangChain for document storage and retrieval.

What is ScaNN?

ScaNN is a nearest-neighbor search library developed by Google Research for performing efficient vector similarity search at scale. It provides:

  • Fast approximate nearest neighbor search
  • Support for different distance metrics like Euclidean distance and Maximum Inner Product Search (MIPS)
  • Optimized implementation for x86 processors with AVX2 support
  • Search space pruning and quantization techniques for better performance

The LangChain ScaNN integration allows you to use these capabilities as a vector store for document embeddings.
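To get a feel for what the pruning and quantization knobs mentioned above look like, here is a minimal sketch using the scann library directly. The dataset is random and the tuning values (num_leaves, reorder size, and so on) are illustrative, not recommendations:

import numpy as np
import scann

# Illustrative dataset: 1,000 random 128-dimensional unit vectors
dataset = np.random.rand(1000, 128).astype(np.float32)
dataset /= np.linalg.norm(dataset, axis=1, keepdims=True)

# tree() partitions the dataset to prune the search space;
# score_ah() applies asymmetric hashing (quantization) for fast scoring;
# reorder() rescores the top candidates exactly to recover accuracy
searcher = (
    scann.scann_ops_pybind.builder(dataset, 10, "dot_product")
    .tree(num_leaves=100, num_leaves_to_search=10, training_sample_size=1000)
    .score_ah(2, anisotropic_quantization_threshold=0.2)
    .reorder(100)
    .build()
)

# Query with the first vector; returns neighbor indices and scores
neighbors, distances = searcher.search(dataset[0], final_num_neighbors=5)
print(neighbors, distances)

The LangChain vector store builds and manages a searcher like this internally, so in everyday use you won't need to touch the builder API.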

Reference

Here are the key methods provided by the ScaNN vector store class:

Method                             Description
------                             -----------
from_documents()                   Create a ScaNN index from a list of Documents
from_texts()                       Create a ScaNN index from raw text strings
from_embeddings()                  Create a ScaNN index from pre-computed embeddings
similarity_search()                Find similar documents using a text query
similarity_search_by_vector()      Find similar documents using an embedding vector
max_marginal_relevance_search()    Find diverse similar documents using MMR
save_local()                       Save the ScaNN index to disk
load_local()                       Load a saved ScaNN index from disk

How to use ScaNN Vector Store

Basic Setup

First, install the required dependencies. ScaNN ships as the scann package (prebuilt wheels target Linux on x86-64); the examples below also use langchain-community and sentence-transformers:

pip install scann langchain-community sentence-transformers

Creating a ScaNN Index

Here's how to create a ScaNN vector store from documents:

from langchain_community.vectorstores import ScaNN
from langchain_community.embeddings import HuggingFaceEmbeddings

# Initialize embeddings
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)

# Create ScaNN index from texts
texts = ["hello world", "hello langchain", "goodbye world"]
db = ScaNN.from_texts(texts, embeddings)
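If your data is already wrapped in LangChain Document objects, you can use from_documents instead. Metadata is stored alongside each document and comes back with search results; the file names below are made up for illustration:

from langchain_core.documents import Document

docs = [
    Document(page_content="hello world", metadata={"source": "greetings.txt"}),
    Document(page_content="hello langchain", metadata={"source": "greetings.txt"}),
    Document(page_content="goodbye world", metadata={"source": "farewells.txt"}),
]

# Same index type as from_texts, but each result carries its metadata
db = ScaNN.from_documents(docs, embeddings)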

Performing Similarity Search

Perform a similarity search to find relevant documents:

# Basic similarity search
results = db.similarity_search(
    "hello", 
    k=2  # Number of results to return
)

# Search with score
results_with_scores = db.similarity_search_with_score(
    "hello",
    k=2
)

# Search using embedding vector directly
vector = embeddings.embed_query("hello")
vector_results = db.similarity_search_by_vector(vector, k=2)
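Each result is a Document, and similarity_search_with_score returns (Document, score) tuples; with the default Euclidean distance strategy, a lower score means a closer match. A quick way to inspect what came back:

# Print each hit with its distance score
for doc, score in results_with_scores:
    print(f"score={score:.4f}  content={doc.page_content!r}")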

Maximal Marginal Relevance (MMR) Search

Use MMR to get results that are both relevant and diverse:

# MMR search balances relevance with diversity
diverse_results = db.max_marginal_relevance_search(
    "hello",
    k=2,  # Number of results
    fetch_k=10,  # Number of initial candidates
    lambda_mult=0.5  # Diversity factor (0=max diversity, 1=max relevance)
)

Saving and Loading

Save your ScaNN index to disk and load it later:

# Save to disk
db.save_local("path/to/save", index_name="my_index")

# Load from disk
loaded_db = ScaNN.load_local(
    "path/to/save",
    embeddings,
    index_name="my_index",
    # Required in recent langchain_community releases: loading the
    # pickled docstore can execute arbitrary code, so you must opt in
    allow_dangerous_deserialization=True,
)
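As a quick sanity check, the reloaded index should answer queries exactly like the original:

# Should print the closest match to "hello" from the saved index
print(loaded_db.similarity_search("hello", k=1)[0].page_content)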

Using with a Retrieval Chain

Integrate ScaNN with a retrieval chain for question answering:

from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI  # requires: pip install langchain-openai

# Create a retriever backed by the ScaNN index
retriever = db.as_retriever(search_kwargs={"k": 3})

# Create the QA chain (ChatOpenAI reads OPENAI_API_KEY from the environment)
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    chain_type="stuff",
    retriever=retriever,
)

# Get an answer grounded in the retrieved documents
answer = qa_chain.invoke({"query": "What would you like to know?"})["result"]
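The retriever can also be configured to use MMR under the hood, which helps when the top-k similarity results would otherwise be near-duplicates:

# Retriever that applies MMR instead of plain similarity search
mmr_retriever = db.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 3, "fetch_k": 10},
)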

By using ScaNN as your vector store, you get efficient similarity search capabilities that can scale well with large document collections. The implementation is particularly well-suited for applications that need fast approximate nearest neighbor search with good accuracy.
