Using LangChain's CassandraSemanticCache for Semantic-Based LLM Response Caching
Posted: Feb 19, 2025.
LangChain's CassandraSemanticCache enables you to cache LLM responses based on semantic similarity, allowing you to reuse responses for semantically similar prompts. This guide explains how to use this cache effectively with Apache Cassandra or compatible databases.
What is CassandraSemanticCache?
CassandraSemanticCache is a caching implementation that uses Cassandra as a vector store backend for semantic (similarity-based) lookup of cached LLM responses. Unlike traditional exact-match caching, it can return cached results for prompts that are semantically similar but not identical, potentially reducing API calls and improving response times.
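To make the idea of similarity-based lookup concrete, here is a minimal in-memory sketch. It is not how CassandraSemanticCache is implemented: `ToySemanticCache`, `toy_embed`, and the keyword vocabulary are hypothetical stand-ins, and the real cache delegates embedding to a proper model and vector search to Cassandra. The sketch only shows the core rule: embed the prompt, find the closest stored prompt, and return its response if the similarity clears a threshold.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: keyword counts."""
    vocab = ["capital", "france", "weather", "today"]
    words = text.lower().replace("?", "").split()
    return [float(words.count(term)) for term in vocab]


class ToySemanticCache:
    """In-memory illustration only; Cassandra does the vector search in the real cache."""

    def __init__(self, embed, score_threshold=0.85):
        self.embed = embed
        self.score_threshold = score_threshold
        self.entries = []  # (prompt vector, cached response) pairs

    def update(self, prompt, response):
        self.entries.append((self.embed(prompt), response))

    def lookup(self, prompt):
        query = self.embed(prompt)
        best_score, best_response = -1.0, None
        for vector, response in self.entries:
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        # A hit requires the closest stored prompt to clear the threshold.
        return best_response if best_score >= self.score_threshold else None


cache = ToySemanticCache(toy_embed)
cache.update("What is the capital of France?", "Paris")
print(cache.lookup("Tell me the capital of France"))  # similar wording: cache hit
print(cache.lookup("What is the weather today?"))     # unrelated: cache miss (None)
```

An exact-match cache would miss on the second wording because the strings differ; the similarity check is what allows the reuse.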
Reference
Here are the key methods available in CassandraSemanticCache:
Method | Description |
---|---|
lookup | Look up cached results based on prompt and LLM string |
update | Store new results in the cache |
clear | Clear the entire semantic cache |
lookup_with_id | Look up results and return document ID if found |
delete_by_document_id | Delete a cached entry by its document ID |
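In normal use LangChain calls these methods for you, but calling them directly can help when testing or pre-warming the cache. The sketch below assumes `cache` is an already constructed cache object and that `llm_string` is the string LangChain uses to identify a model configuration, so entries written for one model setup are not returned for another. The helper name `store_and_fetch` is hypothetical.

```python
def store_and_fetch(cache, prompt, llm_string, generations):
    """Write one entry with update(), then read it back with lookup().

    `cache` is any object exposing the methods in the table above;
    `generations` is the list of Generation objects LangChain would cache.
    """
    cache.update(prompt, llm_string, generations)
    return cache.lookup(prompt, llm_string)  # None on a miss
```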
How to Use CassandraSemanticCache
Setting up the Cache
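First, install the required dependency. LangChain's Cassandra-backed caches rely on the cassio package, which can be installed with pip install cassio alongside langchain-community.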
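Then initialize the cache by passing a Cassandra session, a keyspace, and an embedding model to the CassandraSemanticCache constructor, and register it as the global LLM cache with set_llm_cache.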
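A minimal sketch of initialization, assuming `langchain-community`, `langchain-core`, and `cassio` are installed. The helper name `build_semantic_cache` and the table name `semantic_cache_demo` are hypothetical. The session, keyspace, and embedding model are supplied by the caller, and the imports are deferred into the function so the snippet loads even where the packages are absent.

```python
def build_semantic_cache(session, keyspace, embedding, table_name="semantic_cache_demo"):
    """Create a CassandraSemanticCache and register it as LangChain's global LLM cache."""
    # Deferred imports: only needed once the function is actually called.
    from langchain_core.globals import set_llm_cache
    from langchain_community.cache import CassandraSemanticCache

    cache = CassandraSemanticCache(
        session=session,        # an open Cassandra driver session
        keyspace=keyspace,      # keyspace that will hold the cache table
        embedding=embedding,    # any LangChain Embeddings implementation
        table_name=table_name,  # hypothetical name; pick your own
    )
    set_llm_cache(cache)
    return cache


# Example wiring (needs a reachable cluster; host and keyspace are placeholders):
#   from cassandra.cluster import Cluster
#   from langchain_openai import OpenAIEmbeddings
#   session = Cluster(["127.0.0.1"]).connect()
#   cache = build_semantic_cache(session, "my_keyspace", OpenAIEmbeddings())
```

The embedding model matters as much as the database: it determines which prompts count as similar.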
Basic Usage
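Once configured, the cache handles caching for your LLM calls automatically. A prompt that is semantically close enough to an earlier one is answered from the cache instead of triggering a new API call.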
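One way to observe this is to time two calls with differently worded but similar prompts. The sketch below assumes a cache has already been registered with `set_llm_cache` and that `llm` is any LangChain model exposing `invoke`. The helper names are hypothetical, and whether the second prompt actually hits the cache depends on the embedding model and the similarity threshold.

```python
import time


def timed_invoke(llm, prompt):
    """Invoke a LangChain model and return (text, elapsed_seconds)."""
    start = time.perf_counter()
    result = llm.invoke(prompt)
    elapsed = time.perf_counter() - start
    # Chat models return a message with `.content`; plain LLMs return a string.
    text = getattr(result, "content", result)
    return text, elapsed


def compare_prompts(llm, first_prompt, similar_prompt):
    """Run two prompts back to back so their latencies can be compared."""
    first = timed_invoke(llm, first_prompt)       # expected: real API call
    second = timed_invoke(llm, similar_prompt)    # expected: cache hit if similar enough
    return first, second


# Example (requires a configured model and cache):
#   compare_prompts(llm, "What is the capital of France?",
#                   "Tell me the capital city of France")
```

A clearly lower latency on the second call, together with an identical answer, suggests the response came from the cache.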
Advanced Configuration
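You can customize the cache behavior, most notably the similarity threshold a cached prompt must reach to count as a hit and the time-to-live (TTL) of cached entries.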
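A sketch of a tuned setup, assuming the constructor's `score_threshold`, `ttl_seconds`, and `table_name` keyword arguments. The helper names and the specific values (0.9, one hour, `semantic_cache_demo`) are illustrative assumptions, not recommendations.

```python
def semantic_cache_options(score_threshold=0.9, ttl_seconds=3600,
                           table_name="semantic_cache_demo"):
    """Collect keyword arguments for CassandraSemanticCache, rejecting a bad TTL."""
    if ttl_seconds is not None and ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be a positive integer or None")
    return {
        "score_threshold": score_threshold,  # higher: stricter matching, fewer hits
        "ttl_seconds": ttl_seconds,          # None: entries do not expire
        "table_name": table_name,            # hypothetical table name
    }


def build_tuned_cache(session, keyspace, embedding, **overrides):
    """Create a CassandraSemanticCache with the options above (overridable)."""
    from langchain_community.cache import CassandraSemanticCache  # deferred import

    options = semantic_cache_options(**overrides)
    return CassandraSemanticCache(
        session=session, keyspace=keyspace, embedding=embedding, **options
    )


# Example: stricter matching, entries expire after ten minutes.
#   cache = build_tuned_cache(session, "my_keyspace", embeddings,
#                             score_threshold=0.95, ttl_seconds=600)
```

Keeping the options in one small function makes it easier to experiment with different thresholds while leaving the connection code untouched.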
Invalidating Cache Entries
To remove specific entries from the cache, look up the entry's document ID with lookup_with_id and pass that ID to delete_by_document_id.
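A minimal sketch of that two-step invalidation, assuming `lookup_with_id` returns either `None` or a `(document_id, generations)` pair. The helper name `invalidate_entry` is hypothetical, and the caller must supply the same `llm_string` that was used when the entry was stored.

```python
def invalidate_entry(cache, prompt, llm_string):
    """Delete the cached entry matching `prompt`, if any.

    Returns True when an entry was found and deleted, False on a cache miss.
    """
    hit = cache.lookup_with_id(prompt, llm_string)
    if hit is None:
        return False
    document_id, _cached_generations = hit
    cache.delete_by_document_id(document_id)
    return True
```

Because the lookup is semantic, the prompt passed here does not have to match the stored prompt character for character; it only has to be similar enough to resolve to the entry you want to remove.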
Clearing the Entire Cache
To clear all cached entries, call the cache's clear method.
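A small sketch, assuming the cache was registered globally. The helper name `clear_cache` is hypothetical. When no cache object is passed in, it falls back to whatever `get_llm_cache` returns.

```python
def clear_cache(cache=None):
    """Clear every entry in the given cache, or in the globally registered one.

    Returns True if a cache was cleared, False if none is configured.
    """
    if cache is None:
        from langchain_core.globals import get_llm_cache  # deferred import

        cache = get_llm_cache()
    if cache is None:
        return False
    cache.clear()  # removes all cached entries, not only those of one model
    return True
```

Clearing is coarse: for a single stale answer, deleting by document ID is usually the gentler option.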
CassandraSemanticCache reduces duplicate API calls by answering semantically similar prompts from Cassandra instead of the LLM. The similarity threshold and TTL settings control the trade-off between hit rate and accuracy: a higher threshold returns fewer but closer matches, a lower one raises the hit rate at the risk of reusing an answer for a different question, and a TTL bounds how long a stale response can be served. Tune both for your specific use case.