Loading Email Files with LangChain UnstructuredEmailLoader
Posted: Jan 31, 2025.
The UnstructuredEmailLoader in LangChain provides an easy way to extract content from email files, supporting both .eml and .msg formats. Let's explore how to use this document loader effectively.
What is UnstructuredEmailLoader?
UnstructuredEmailLoader is a document loader that processes email files and converts them into Document objects that can be used in your LangChain applications. It supports:
- Reading .eml and .msg file formats
- Extracting email content and metadata
- Processing email attachments (optional)
- Different loading modes for structured content extraction
Reference
Here are the key methods available in UnstructuredEmailLoader:
| Method | Description |
|---|---|
| load() | Loads the email file and returns a list of Document objects |
| lazy_load() | Iterator that loads Documents one at a time to save memory |
| aload() | Asynchronous version of load() |
| alazy_load() | Asynchronous version of lazy_load() |
| load_and_split() | Loads and splits the documents using a text splitter |
How to Use UnstructuredEmailLoader
Basic Usage
Here's a simple example of loading an email file:
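A minimal sketch of basic loading might look like the following. The sample message, its addresses, and the helper names `write_sample_eml` and `load_email` are made up for illustration. The LangChain import sits inside the function so the rest of the snippet runs with only the standard library installed.

```python
import tempfile
from email.message import EmailMessage
from pathlib import Path


def write_sample_eml(path: Path) -> Path:
    """Write a small, made-up .eml file so there is something to load."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Quarterly report"
    msg.set_content("Hi Bob,\n\nThe report is ready for review.\n\nAlice")
    path.write_bytes(msg.as_bytes())
    return path


def load_email(path: Path):
    """Load one email file into a list of Document objects."""
    # Imported here so the helper above works without LangChain installed.
    from langchain_community.document_loaders import UnstructuredEmailLoader

    loader = UnstructuredEmailLoader(str(path))  # default mode is "single"
    return loader.load()


# Hypothetical sample file in a temporary directory.
sample_path = write_sample_eml(Path(tempfile.mkdtemp()) / "sample.eml")
```

With the dependencies installed, `docs = load_email(sample_path)` should return a list whose items expose `page_content` (the email text) and `metadata` (a dict of source details).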
Retaining Email Elements
By default (mode="single"), the loader combines all text elements into a single Document. To keep each element as its own Document, with element-level metadata attached, use the "elements" mode:
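As a rough sketch, elements mode can be combined with a small summary of what came back. The helper names are hypothetical, and the `category` metadata key is an assumption about what the underlying unstructured library emits, so the code falls back to "unknown" when it is absent.

```python
from collections import Counter
from typing import Iterable


def load_email_elements(path: str):
    """Load an email with one Document per element (title, paragraph, ...)."""
    # Imported here so count_categories() is usable without LangChain installed.
    from langchain_community.document_loaders import UnstructuredEmailLoader

    loader = UnstructuredEmailLoader(path, mode="elements")
    return loader.load()


def count_categories(metadatas: Iterable[dict]) -> Counter:
    """Count elements per category; 'category' is an assumed metadata key."""
    return Counter(meta.get("category", "unknown") for meta in metadatas)
```

A typical call would be `count_categories(doc.metadata for doc in load_email_elements("example.eml"))`, where `example.eml` is a placeholder path.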
Processing Attachments
To handle email attachments, enable attachment processing:
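The sketch below passes `process_attachments=True` to the loader, as shown in the LangChain documentation. How attachments are partitioned depends on the installed unstructured version, so treat this as a starting point. The test message and the helper names are invented for illustration.

```python
from email.message import EmailMessage


def build_message_with_attachment() -> EmailMessage:
    """Build a made-up message carrying one plain-text attachment."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Meeting notes"
    msg.set_content("Notes are attached.")
    # Passing a str creates a text/plain part marked as an attachment.
    msg.add_attachment("Agenda: budget review", filename="notes.txt")
    return msg


def load_email_with_attachments(path: str):
    """Load the email body and, where supported, its attachments."""
    # Imported here so the message builder works without LangChain installed.
    from langchain_community.document_loaders import UnstructuredEmailLoader

    loader = UnstructuredEmailLoader(
        path,
        mode="elements",
        process_attachments=True,
    )
    return loader.load()


message = build_message_with_attachment()
attachment_names = [part.get_filename() for part in message.iter_attachments()]
```

To try it end to end, write `message.as_bytes()` to a `.eml` file and pass that path to `load_email_with_attachments`.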
Async Loading
In async applications, await aload() instead of calling load(), so the load does not block your event loop:
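One possible shape for this, assuming a langchain-core version recent enough to provide `aload()` on loaders, is sketched below. `load_email_async` and `load_many` are hypothetical helper names.

```python
import asyncio


async def load_email_async(path: str):
    """Load a single email file without blocking the caller's event loop."""
    # Imported here so the module itself needs only the standard library.
    from langchain_community.document_loaders import UnstructuredEmailLoader

    loader = UnstructuredEmailLoader(path)
    return await loader.aload()


async def load_many(paths):
    """Load several email files concurrently; returns one list per file."""
    return await asyncio.gather(*(load_email_async(p) for p in paths))
```

From synchronous code, `asyncio.run(load_many(["a.eml", "b.eml"]))` would drive the whole batch; the file names are placeholders.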
Memory-Efficient Loading
For large email files, use lazy loading to conserve memory:
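A small sketch of lazy loading follows, with hypothetical helper names. As far as I can tell the file itself is still parsed in one pass, so the saving comes mainly from not holding every Document in a list at once, which matters most in elements mode or when streaming through many files.

```python
from itertools import islice


def iter_email_documents(path: str, mode: str = "elements"):
    """Yield Document objects one at a time instead of building a list."""
    # Imported on first iteration, so defining this needs no LangChain install.
    from langchain_community.document_loaders import UnstructuredEmailLoader

    loader = UnstructuredEmailLoader(path, mode=mode)
    yield from loader.lazy_load()


def take(iterable, n: int) -> list:
    """Return at most the first n items of any iterable."""
    return list(islice(iterable, n))
```

For example, `take(iter_email_documents("large.eml"), 5)` would pull only the first five Documents from a placeholder file named `large.eml`.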
Before using UnstructuredEmailLoader, make sure you have the required dependencies installed:
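The LangChain documentation installs the loader's dependencies with `pip install langchain-community unstructured`. Some formats, such as .msg, may need extra packages, so check the unstructured documentation for your version. The check below is only a convenience sketch, with a hypothetical helper name, that reports which of the two modules cannot be found.

```python
from importlib.util import find_spec

# Top-level module names the loader relies on.
REQUIRED_MODULES = ("langchain_community", "unstructured")


def missing_modules(modules=REQUIRED_MODULES) -> list:
    """Return the names of modules that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
    print("Try: pip install langchain-community unstructured")
else:
    print("All required modules are importable.")
```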
This loader is particularly useful when you need to:
- Extract content from email archives for analysis
- Process email threads for chatbots or QA systems
- Build search functionality over email collections
- Create structured data from email communications
Remember that the extracted content will be converted into Document objects that can be used with other LangChain components like text splitters, embeddings, or vector stores.