Using SingleStore DB Semantic Cache in LangChain

Posted: Feb 20, 2025.

What is SingleStoreDBSemanticCache?

SingleStoreDBSemanticCache is a caching implementation in LangChain that uses SingleStore DB as a backend for semantically caching LLM responses. Unlike traditional exact-match caching, semantic caching allows you to retrieve cached results based on the semantic similarity between prompts, potentially reducing API calls for similar questions.

Reference

The SingleStoreDBSemanticCache class has the following key methods:

  • lookup(prompt, llm_string): Returns a cached response if one exists whose prompt is semantically similar to the given prompt, for the given LLM configuration
  • update(prompt, llm_string, return_val): Stores a new prompt-response pair in the cache
  • clear(**kwargs): Clears the cache
  • alookup(prompt, llm_string): Async version of lookup
  • aupdate(prompt, llm_string, return_val): Async version of update
  • aclear(**kwargs): Async version of clear
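The contract behind lookup and update can be sketched with a minimal in-memory stand-in. This is only an illustration of the semantic-cache idea, not the SingleStore-backed implementation: the toy embedding function, the cosine scoring, and the threshold value are all placeholders.

```python
import math

def embed(text):
    # Toy embedding: bag-of-words counts over a tiny vocabulary.
    # A real cache would use a model such as OpenAIEmbeddings instead.
    vocab = ["capital", "france", "paris", "weather", "today"]
    return [text.lower().count(w) for w in vocab]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class ToySemanticCache:
    """In-memory sketch of the lookup/update contract."""

    def __init__(self, search_threshold=0.5):
        self.search_threshold = search_threshold
        self.entries = []  # (embedding, llm_string, return_val)

    def update(self, prompt, llm_string, return_val):
        self.entries.append((embed(prompt), llm_string, return_val))

    def lookup(self, prompt, llm_string):
        query = embed(prompt)
        for vec, cached_llm, val in self.entries:
            if cached_llm != llm_string:
                continue  # cache entries are scoped to one LLM configuration
            denom = math.sqrt(dot(vec, vec)) * math.sqrt(dot(query, query))
            # Cosine similarity must clear the threshold for a cache hit.
            if denom > 0 and dot(query, vec) / denom >= self.search_threshold:
                return val
        return None

cache = ToySemanticCache(search_threshold=0.5)
cache.update("What is the capital of France?", "gpt-4", "Paris")

# A semantically similar prompt hits the cache even though the text differs.
print(cache.lookup("capital of France?", "gpt-4"))  # Paris
print(cache.lookup("weather today?", "gpt-4"))      # None
```

The real class works the same way at a high level: update writes the prompt's embedding alongside the response, and lookup runs a similarity search against stored embeddings, returning the cached value only when the match clears search_threshold.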

How to use SingleStoreDBSemanticCache

Basic Configuration

Here's a basic example of how to set up SingleStoreDBSemanticCache:

import langchain
from langchain.cache import SingleStoreDBSemanticCache
from langchain_openai import OpenAIEmbeddings

langchain.llm_cache = SingleStoreDBSemanticCache(
    embedding=OpenAIEmbeddings(),
    host="https://user:password@127.0.0.1:3306/database"
)

This sets up the cache with default settings; only the embedding model and a connection string are required.

Advanced Configuration

For more control over the cache behavior and connection settings:

import langchain
from langchain.cache import SingleStoreDBSemanticCache
from langchain_openai import OpenAIEmbeddings

langchain.llm_cache = SingleStoreDBSemanticCache(
    embedding=OpenAIEmbeddings(),
    host="127.0.0.1",
    port=3306,
    user="user",
    password="password",
    database="db",
    cache_table_prefix="custom_cache_",
    search_threshold=0.3,
    use_vector_index=True,
    vector_size=1536,
    pool_size=10,
    timeout=60,
    distance_strategy="DOT_PRODUCT"
)

This configuration includes:

  • A custom table prefix for the cache tables
  • An adjusted similarity threshold (search_threshold) for semantic matching
  • A vector index enabled for faster similarity search
  • Connection pool settings for concurrent access
  • A custom connection timeout
  • An explicit distance strategy (DOT_PRODUCT) for vector similarity
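To see how distance_strategy and search_threshold interact, it helps to compute the two common metrics by hand. The exact filtering rule is SingleStore's, but the metrics themselves are standard; this is a pure-Python illustration with made-up vectors:

```python
import math

def dot_product(a, b):
    # DOT_PRODUCT: a higher score means a closer match.
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    # EUCLIDEAN_DISTANCE: a lower score means a closer match.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [0.6, 0.8]    # unit-length query embedding
close = [0.8, 0.6]    # similar direction
far = [-0.6, -0.8]    # opposite direction

print(dot_product(query, close))         # ≈ 0.96
print(dot_product(query, far))           # ≈ -1.0
print(euclidean_distance(query, close))  # ≈ 0.283
print(euclidean_distance(query, far))    # ≈ 2.0
```

Because the two strategies score in opposite directions, the meaning of a given search_threshold value depends on which strategy you choose.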

Using with Vector Indexing

To enable vector indexing for better performance (requires SingleStoreDB 8.5+):

langchain.llm_cache = SingleStoreDBSemanticCache(
    embedding=OpenAIEmbeddings(),
    host="127.0.0.1",
    user="user",
    password="password",
    database="db",
    use_vector_index=True,
    vector_size=1536,
    vector_index_name="cache_vector_idx",
    vector_index_options={
        "index_type": "IVF_PQFS"
    }
)

SSL Configuration

For secure connections, you can configure SSL settings:

langchain.llm_cache = SingleStoreDBSemanticCache(
    embedding=OpenAIEmbeddings(),
    host="127.0.0.1",
    user="user",
    password="password",
    database="db",
    ssl_ca="/path/to/ca.pem",
    ssl_cert="/path/to/cert.pem",
    ssl_key="/path/to/key.pem",
    ssl_verify_cert=True,
    ssl_verify_identity=True
)

The SingleStoreDBSemanticCache is particularly useful when you want to:

  • Cache LLM responses in a robust, scalable database
  • Implement semantic similarity search for cached responses
  • Optimize response times by retrieving similar cached answers
  • Set up a production-grade caching solution with proper connection pooling and SSL

Remember that the effectiveness of semantic caching depends on:

  • The quality of the embedding model used
  • The search_threshold setting (lower values allow more liberal matching)
  • The vector indexing configuration for performance
  • The connection pool settings for concurrent access
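One practical detail behind the first two points: with DOT_PRODUCT, the score only behaves like a bounded cosine similarity (and a threshold such as 0.3 only has a stable meaning) when the embeddings are unit-length. If your embedding model does not return normalized vectors, you can normalize them yourself; a pure-Python sketch:

```python
import math

def normalize(vec):
    # Scale a vector to unit length so dot product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

raw_a = [3.0, 4.0]   # length 5
raw_b = [6.0, 8.0]   # same direction, length 10

# The raw dot product grows with vector magnitude ...
print(dot(raw_a, raw_b))                        # 50.0

# ... while normalized vectors give a bounded score in [-1, 1].
print(dot(normalize(raw_a), normalize(raw_b)))  # ≈ 1.0
```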
