Loading Documents from Cassandra with LangChain
Posted: Feb 17, 2025.
The CassandraLoader in LangChain provides a convenient way to load documents from Apache Cassandra databases. This guide will show you how to use it effectively to retrieve and process your data.
What is CassandraLoader?
CassandraLoader is a document loader that allows you to fetch data from Apache Cassandra, a NoSQL database, and convert it into LangChain Document objects. You can load data either by specifying a table name or by providing a custom CQL query. The loader supports both synchronous and asynchronous operations.
Reference
Here are the key methods available in CassandraLoader:
| Method | Description |
|---|---|
| load() | Synchronously loads data and returns a list of Document objects |
| aload() | Asynchronously loads data and returns a list of Document objects |
| lazy_load() | Returns a synchronous iterator of Document objects |
| alazy_load() | Returns an asynchronous iterator of Document objects |
| load_and_split() | Loads documents and splits them into chunks using a TextSplitter |
How to Use CassandraLoader
Basic Setup
There are two ways to initialize the CassandraLoader:
1. Using a Cassandra Driver Session
2. Using Cassio
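The two initialization paths above can be sketched as follows. This is a minimal sketch, not a definitive setup: it assumes a Cassandra node reachable on 127.0.0.1, and the function names and the keyspace and table arguments are hypothetical placeholders. The third-party imports sit inside the functions so the file can be imported before the driver or cassio is installed.

```python
def loader_from_session(keyspace: str, table: str):
    """Option 1: pass an explicit Cassandra driver session to the loader."""
    from cassandra.cluster import Cluster
    from langchain_community.document_loaders import CassandraLoader

    cluster = Cluster()          # assumption: default contact point 127.0.0.1
    session = cluster.connect()
    loader = CassandraLoader(table=table, session=session, keyspace=keyspace)
    # Return the cluster too, so the caller can call cluster.shutdown() later.
    return cluster, loader


def loader_from_cassio(keyspace: str, table: str, contact_points: str = "127.0.0.1"):
    """Option 2: let cassio hold the session and keyspace globally."""
    import cassio
    from langchain_community.document_loaders import CassandraLoader

    cassio.init(contact_points=contact_points, keyspace=keyspace)
    # No session or keyspace argument: the loader resolves them through cassio.
    return CassandraLoader(table=table)
```

With the first option, something like `cluster, loader = loader_from_session("demo", "movie_reviews")` followed by `docs = loader.load()` would load every row of the table; call `cluster.shutdown()` once loading is done.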
Custom Data Mapping
You can customize how row data is converted to document content and metadata by passing your own page_content_mapper and metadata_mapper callables.
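As an illustration of custom mapping, the sketch below defines two mapper functions and passes them to the loader. The movie_reviews table and its title, review, and rating columns are hypothetical, and the mappers assume the driver's default row factory, which returns named tuples with one attribute per column.

```python
from collections import namedtuple


def review_to_text(row) -> str:
    """Build the document text from a row (hypothetical columns)."""
    return f"{row.title}: {row.review}"


def review_to_metadata(row) -> dict:
    """Keep selected columns as document metadata."""
    return {"title": row.title, "rating": row.rating}


def build_mapped_loader(session, keyspace: str):
    """Create a loader that applies both mappers to every row."""
    from langchain_community.document_loaders import CassandraLoader

    return CassandraLoader(
        table="movie_reviews",  # hypothetical table name
        session=session,
        keyspace=keyspace,
        page_content_mapper=review_to_text,
        metadata_mapper=review_to_metadata,
    )


# The mappers are plain functions, so they can be checked without a database.
Row = namedtuple("Row", ["title", "review", "rating"])
sample = Row("Dune", "Slow start, great finale.", 4)
print(review_to_text(sample))
print(review_to_metadata(sample))
```

Keeping the mappers as standalone functions makes them easy to unit test against a stand-in row before pointing the loader at a real table.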
Using Custom Queries
Instead of specifying a table, you can pass a custom CQL query through the query parameter.
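A custom query might be wired up as in the sketch below. The build_query helper is a hypothetical convenience, not part of LangChain: it assembles a SELECT statement from identifiers it has validated, and the resulting string is passed as the loader's query argument in place of table.

```python
import re

# Conservative pattern for unquoted CQL identifiers (an assumption for this sketch).
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def build_query(keyspace: str, table: str, columns: list, limit: int) -> str:
    """Compose a SELECT statement, rejecting anything that is not a plain identifier."""
    for name in (keyspace, table, *columns):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe CQL identifier: {name!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"SELECT {', '.join(columns)} FROM {keyspace}.{table} LIMIT {int(limit)};"


def loader_from_query(session, query: str):
    """Create a loader driven by a query instead of a table name."""
    from langchain_community.document_loaders import CassandraLoader

    return CassandraLoader(query=query, session=session)
```

For example, `loader_from_query(session, build_query("demo", "movie_reviews", ["title", "review"], 100))` would restrict loading to two columns and the first 100 rows of the hypothetical table.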
Async Loading
In async applications, aload() and alazy_load() let you fetch documents without blocking the event loop.
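One way to benefit from the async methods is to run several loads concurrently. The sketch below assumes you already have one or more configured loaders; load_many is a hypothetical helper name, and it relies only on each loader exposing the aload() coroutine listed in the reference table.

```python
import asyncio


async def load_many(loaders) -> list:
    """Await aload() on every loader concurrently and flatten the results."""
    results = await asyncio.gather(*(loader.aload() for loader in loaders))
    # gather() returns results in the same order as the loaders were given.
    return [doc for docs in results for doc in docs]
```

Inside an async application this would be called as `docs = await load_many([reviews_loader, plots_loader])`, where both loader names are placeholders. From synchronous code, `asyncio.run(load_many([...]))` drives the coroutine to completion.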
Lazy Loading
When dealing with large datasets, lazy_load() yields documents one at a time instead of building the full list in memory.
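A common pattern with lazy loading is to process documents in fixed-size batches, so that only one batch is held in memory at a time. The sketch below is one possible approach; batched_documents and its batch_size default are hypothetical, and it relies only on the loader's lazy_load() iterator.

```python
from itertools import islice


def batched_documents(loader, batch_size: int = 100):
    """Yield lists of at most batch_size documents from loader.lazy_load()."""
    iterator = iter(loader.lazy_load())
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # iterator exhausted
        yield batch
```

A loop such as `for batch in batched_documents(loader, 500): index(batch)` would then hand each batch to a downstream step; index here is a placeholder for whatever processing you need.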
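The CassandraLoader makes it easy to integrate Cassandra data into your LangChain applications. Remember to handle your database connections properly: in production, reuse a single Cluster and Session across the application (the driver manages its connection pools internally) and shut the cluster down when you are finished.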
When working with large datasets, consider using the lazy loading methods to prevent memory issues, and take advantage of the async capabilities if you're building an async application.