Using LangChain Azure Cosmos DB NoSQL Vector Search Store
Posted: Feb 8, 2025.
Azure Cosmos DB NoSQL Vector Search allows you to store and search high-dimensional vectors alongside your data in Cosmos DB NoSQL containers. This guide shows you how to use LangChain's integration to leverage vector search capabilities for similarity search and retrieval.
What is AzureCosmosDBNoSqlVectorSearch?
AzureCosmosDBNoSqlVectorSearch is a LangChain vector store that uses Azure Cosmos DB's built-in vector search capabilities. It allows you to:
- Store documents and their vector embeddings in Cosmos DB NoSQL containers
- Perform semantic similarity searches using different distance functions (cosine, Euclidean, or dot product)
- Index high-dimensional vectors for efficient search at scale
- Co-locate vectors with other document data in the same logical unit
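To make the distance functions above concrete, here is a minimal pure-Python sketch of the three measures. It is for illustration only: Cosmos DB computes these on the server, and the function names below are hypothetical helpers, not part of LangChain or the Azure SDK.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of element-wise products; larger values mean more similar vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the two vectors after normalizing each to unit length."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot_product(a, b) / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance; smaller values mean more similar vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Note that cosine similarity ignores vector length while Euclidean distance does not, so the two can rank the same documents differently unless the embeddings are normalized.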
Reference
Key attributes and methods:
Method | Description |
---|---|
__init__ | Initialize the vector store with a Cosmos client, an embedding function, and the vector and indexing policies |
add_texts | Add text strings and their embeddings to the store |
add_documents | Add LangChain documents to the store |
similarity_search | Find similar documents based on query |
similarity_search_with_score | Search with similarity scores |
delete | Remove documents by IDs |
as_retriever | Convert into a LangChain retriever |
How to Use AzureCosmosDBNoSqlVectorSearch
Initial Setup
First, set up an Azure Cosmos DB NoSQL account and create the required embedding models:
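A minimal sketch of this setup step might look like the following. The environment variable names (COSMOS_ENDPOINT, COSMOS_KEY), the helper function names, and the choice of OpenAIEmbeddings with the text-embedding-3-small model are assumptions made for illustration; substitute whatever credentials and embedding model your project uses. It also assumes the azure-cosmos and langchain-openai packages are installed and that OPENAI_API_KEY is set.

```python
import os
from typing import Any, Mapping, Optional, Tuple


def load_cosmos_settings(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Read the account endpoint and key (assumed variable names) from the environment."""
    env = os.environ if env is None else env
    endpoint = env.get("COSMOS_ENDPOINT", "")
    key = env.get("COSMOS_KEY", "")
    if not endpoint or not key:
        raise ValueError("COSMOS_ENDPOINT and COSMOS_KEY must both be set")
    return endpoint, key


def make_clients(endpoint: str, key: str) -> Tuple[Any, Any]:
    """Create the Cosmos client and the embedding model."""
    # Imported here so the settings helper above works without these packages.
    from azure.cosmos import CosmosClient
    from langchain_openai import OpenAIEmbeddings

    cosmos_client = CosmosClient(endpoint, credential=key)
    # Assumption: an OpenAI embedding model; it reads OPENAI_API_KEY itself.
    embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
    return cosmos_client, embeddings
```

Typical usage would be `cosmos_client, embeddings = make_clients(*load_cosmos_settings())`.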
Configure Vector Search
Set up vector search policies:
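The policies are plain dictionaries in the shape Cosmos DB expects: a vector embedding policy that describes the vector field, and an indexing policy that adds a vector index on the same path. The sketch below assumes the vector lives at /embedding, has 1536 dimensions, and uses cosine distance with a quantizedFlat index; the helper name is hypothetical, and all of these values should be adjusted to match your embedding model.

```python
from typing import Any, Dict, Tuple

ALLOWED_DISTANCES = {"cosine", "dotproduct", "euclidean"}
ALLOWED_INDEX_TYPES = {"flat", "quantizedFlat", "diskANN"}


def build_vector_policies(
    path: str = "/embedding",
    dimensions: int = 1536,
    distance_function: str = "cosine",
    index_type: str = "quantizedFlat",
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (vector_embedding_policy, indexing_policy) for one vector field."""
    if distance_function not in ALLOWED_DISTANCES:
        raise ValueError(f"unsupported distance function: {distance_function}")
    if index_type not in ALLOWED_INDEX_TYPES:
        raise ValueError(f"unsupported vector index type: {index_type}")

    vector_embedding_policy = {
        "vectorEmbeddings": [
            {
                "path": path,
                "dataType": "float32",
                "distanceFunction": distance_function,
                "dimensions": dimensions,
            }
        ]
    }
    indexing_policy = {
        "indexingMode": "consistent",
        "includedPaths": [{"path": "/*"}],
        # Keep the raw vector out of the regular index; only the vector index covers it.
        "excludedPaths": [{"path": '/"_etag"/?'}, {"path": f"{path}/*"}],
        "vectorIndexes": [{"path": path, "type": index_type}],
    }
    return vector_embedding_policy, indexing_policy
```

The dimensions value must equal the length of the vectors your embedding model produces, otherwise inserts will be rejected.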
Initialize Vector Store
Create an instance of the vector store:
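One way to wire the pieces together is sketched below. The keyword arguments follow the LangChain community integration as documented around the time of this post, but the constructor has changed between releases (some newer releases add arguments such as vector_search_fields), so check the signature of your installed langchain-community version. The helper names, the /id partition key, and the default database and container names are assumptions.

```python
from typing import Any, Dict


def check_policy_paths(
    vector_embedding_policy: Dict[str, Any], indexing_policy: Dict[str, Any]
) -> None:
    """Fail early if a vector index points at a path with no embedding policy."""
    embedded = {e["path"] for e in vector_embedding_policy.get("vectorEmbeddings", [])}
    indexed = {i["path"] for i in indexing_policy.get("vectorIndexes", [])}
    missing = indexed - embedded
    if missing:
        raise ValueError(f"vector index paths without an embedding policy: {sorted(missing)}")


def build_vector_store(
    cosmos_client: Any,
    embedding: Any,
    vector_embedding_policy: Dict[str, Any],
    indexing_policy: Dict[str, Any],
    database_name: str = "vectorSearchDB",
    container_name: str = "vectorSearchContainer",
    **extra_kwargs: Any,
) -> Any:
    """Create the vector store; extra_kwargs passes through version-specific arguments."""
    # Imported here so check_policy_paths works without the Azure and LangChain packages.
    from azure.cosmos import PartitionKey
    from langchain_community.vectorstores.azure_cosmos_db_no_sql import (
        AzureCosmosDBNoSqlVectorSearch,
    )

    check_policy_paths(vector_embedding_policy, indexing_policy)
    return AzureCosmosDBNoSqlVectorSearch(
        cosmos_client=cosmos_client,
        embedding=embedding,
        vector_embedding_policy=vector_embedding_policy,
        indexing_policy=indexing_policy,
        # Assumption: partition on the item id.
        cosmos_container_properties={"partition_key": PartitionKey(path="/id")},
        cosmos_database_properties={},
        database_name=database_name,
        container_name=container_name,
        **extra_kwargs,
    )
```

Validating the policies before calling the constructor surfaces a mismatched path as a clear local error instead of a failed container creation.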
Search Documents
Perform similarity search:
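As a rough sketch, the helper below wraps similarity_search_with_score and returns each match's text with its score. It relies only on the standard LangChain vector store interface (a query string and k); the helper name is hypothetical, and store is assumed to be an already initialized AzureCosmosDBNoSqlVectorSearch instance.

```python
from typing import Any, List, Tuple


def top_matches(store: Any, query: str, k: int = 4) -> List[Tuple[str, float]]:
    """Return (page_content, score) pairs for the k closest documents."""
    results = store.similarity_search_with_score(query, k=k)
    return [(doc.page_content, score) for doc, score in results]
```

For example, `top_matches(store, "How does vector indexing work?", k=3)` would return up to three pairs. If you only need the documents, `store.similarity_search(query, k=3)` returns them without scores.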
Use as Retriever
Convert to LangChain retriever for use in chains:
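A small sketch of this step, using the standard as_retriever and invoke calls from the LangChain vector store and retriever interfaces. The helper name and the choice to join results with blank lines are illustrative assumptions; store is assumed to be an initialized vector store.

```python
from typing import Any


def retrieve_context(store: Any, question: str, k: int = 3) -> str:
    """Fetch the k most relevant documents and join their text into one context string."""
    retriever = store.as_retriever(search_kwargs={"k": k})
    docs = retriever.invoke(question)
    return "\n\n".join(doc.page_content for doc in docs)
```

The returned string can be placed into a prompt template as the context for a question-answering chain, or the retriever itself can be passed to any chain that accepts a LangChain retriever.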
Best Practices
- Match the vector dimensions to your embedding model's output size and choose a distance function suited to that model
- Use pre-filtering to narrow searches when needed
- Consider score thresholds to filter low-quality matches
- Batch document additions using add_texts() for better performance
- Use delete() to remove outdated or irrelevant documents
- Monitor Cosmos DB metrics to optimize performance
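Two of these practices, batching and score thresholds, can be sketched with small helpers. The function names and the default batch size are hypothetical, and the threshold direction is an assumption: with cosine similarity a higher score means a closer match, whereas with Euclidean distance you would keep scores below a cutoff instead.

```python
from typing import Any, List, Sequence, Tuple


def add_in_batches(store: Any, texts: Sequence[str], batch_size: int = 100) -> List[str]:
    """Call add_texts() once per batch and collect the returned document IDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ids: List[str] = []
    for start in range(0, len(texts), batch_size):
        ids.extend(store.add_texts(list(texts[start : start + batch_size])))
    return ids


def filter_by_score(
    results: Sequence[Tuple[Any, float]], min_score: float
) -> List[Tuple[Any, float]]:
    """Keep only (document, score) pairs at or above min_score (higher = more similar)."""
    return [(doc, score) for doc, score in results if score >= min_score]
```

filter_by_score is meant to be applied to the output of similarity_search_with_score; a sensible cutoff depends on the embedding model and is best chosen by inspecting scores on real queries.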
Azure Cosmos DB NoSQL Vector Search provides a powerful way to implement semantic search in your LangChain applications while leveraging Azure's enterprise-grade database capabilities.