Load Mastodon Posts with LangChain's MastodonTootsLoader
Posted: Feb 15, 2025.
Mastodon is a federated social media platform that offers an alternative to centralized networks. In this guide, we'll explore how to use LangChain's MastodonTootsLoader to extract content from Mastodon posts (known as "toots") for processing in your applications.
What is MastodonTootsLoader?
MastodonTootsLoader is a document loader that fetches posts from Mastodon accounts. It reads public accounts without authentication and private accounts with an API access token. Each post is converted into a Document object that can be used in LangChain's document processing pipelines.
Reference
Here are the key methods and parameters available in MastodonTootsLoader:
Method | Description |
---|---|
load() | Loads toots and returns them as a list of Document objects |
lazy_load() | Returns an iterator of Document objects for memory-efficient loading |
aload() | Async version of load() |
alazy_load() | Async version of lazy_load() |
load_and_split() | Loads documents and splits them into chunks |
Constructor parameters:
- mastodon_accounts: List of Mastodon accounts to query (in @username@instance format)
- number_toots: Number of toots to fetch per account (default: 100)
- exclude_replies: Whether to exclude replies from the results (default: False)
- access_token: API token for authenticated access
- api_base_url: Base URL for the Mastodon instance (default: https://mastodon.social)
How to Use MastodonTootsLoader
Basic Usage with Public Accounts
Here's how to load toots from a public Mastodon account:
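A minimal sketch of the basic case. It assumes the loader is imported from langchain_community.document_loaders and that the Mastodon.py client package is installed alongside langchain-community. The helper name load_public_toots and the account handle in the comment are illustrative, not part of the library.

```python
def load_public_toots(account: str, limit: int = 50):
    """Fetch recent public toots from one account as LangChain Documents."""
    # Imported inside the function so the optional dependencies are only
    # needed when the loader is actually used.
    from langchain_community.document_loaders import MastodonTootsLoader

    loader = MastodonTootsLoader(
        mastodon_accounts=[account],  # "@username@instance" format
        number_toots=limit,           # toots to fetch per account
    )
    return loader.load()              # list of Document objects


# Example call (needs network access; the handle is only an example):
# docs = load_public_toots("@Gargron@mastodon.social")
# print(docs[0].page_content[:200])
# print(docs[0].metadata)
```

No access token is passed here, which matches the unauthenticated path described above.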
Authenticated Access
For private accounts or instances, you'll need to use authentication:
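A hedged sketch of authenticated loading, using the access_token and api_base_url constructor parameters listed in the reference above. The helper name load_with_token is hypothetical, and the token and instance URL are values you supply from your own Mastodon application settings.

```python
def load_with_token(
    account: str,
    token: str,
    base_url: str = "https://mastodon.social",
):
    """Load toots using an API token issued by the given instance."""
    # Assumed import path; requires langchain-community and Mastodon.py.
    from langchain_community.document_loaders import MastodonTootsLoader

    loader = MastodonTootsLoader(
        mastodon_accounts=[account],
        access_token=token,     # API token for authenticated access
        api_base_url=base_url,  # instance that issued the token
    )
    return loader.load()
```

The token should come from the same instance that api_base_url points to; a token from one instance will generally not be accepted by another.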
You can also set the access token via environment variable:
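The sketch below assumes the loader falls back to an environment variable named MASTODON_ACCESS_TOKEN when access_token is omitted. That name is what the langchain-community implementation reads as far as I know, but confirm it against your installed version. The helpers require_token and load_with_env_token are hypothetical.

```python
import os

# Assumed variable name; verify it against your langchain-community version.
TOKEN_ENV_VAR = "MASTODON_ACCESS_TOKEN"


def require_token() -> str:
    """Return the token from the environment, or fail with a clear message."""
    token = os.environ.get(TOKEN_ENV_VAR)
    if not token:
        raise RuntimeError(f"Set {TOKEN_ENV_VAR} before creating the loader")
    return token


def load_with_env_token(account: str):
    """Load toots, letting the loader pick the token up from the environment."""
    from langchain_community.document_loaders import MastodonTootsLoader

    require_token()  # fail early instead of sending an unauthenticated request
    loader = MastodonTootsLoader(mastodon_accounts=[account])  # no access_token
    return loader.load()
```

Checking the variable up front keeps a missing token from surfacing later as a less obvious API error.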
Async Loading
In async applications, use aload() so that fetching toots does not block the event loop:
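A small sketch of the async path using aload() from the method table above. The coroutine name load_toots_async is hypothetical, and the import path is assumed to be langchain_community.document_loaders.

```python
import asyncio


async def load_toots_async(account: str, limit: int = 50):
    """Load toots without blocking the running event loop."""
    from langchain_community.document_loaders import MastodonTootsLoader

    loader = MastodonTootsLoader(
        mastodon_accounts=[account],
        number_toots=limit,
    )
    return await loader.aload()  # async version of load()


# Example call from synchronous code (needs network access):
# docs = asyncio.run(load_toots_async("@Gargron@mastodon.social"))
```

As far as I know, the base loader's async methods run the synchronous fetch in a worker thread, so the gain is a responsive event loop rather than faster network calls.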
Memory-Efficient Loading
If you're working with a large number of toots, use lazy loading:
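One way to sketch lazy loading is a generator that forwards documents from lazy_load() one at a time, so the caller never holds the full list. The name iter_toots is hypothetical; the import path is assumed as in the other examples.

```python
def iter_toots(account: str, limit: int = 100):
    """Yield Documents one by one instead of building a list."""
    from langchain_community.document_loaders import MastodonTootsLoader

    loader = MastodonTootsLoader(
        mastodon_accounts=[account],
        number_toots=limit,
    )
    # lazy_load() returns an iterator of Document objects.
    yield from loader.lazy_load()


# Example usage (needs network access):
# for doc in iter_toots("@Gargron@mastodon.social"):
#     print(len(doc.page_content))
```

Because iter_toots is a generator, nothing is imported or fetched until the first document is requested.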
Working with Multiple Accounts
You can load toots from multiple accounts in a single call:
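A sketch that passes several handles through mastodon_accounts and also sets exclude_replies from the parameter list above. The helper name load_many is hypothetical. Note that number_toots applies per account, so the total can be up to limit times the number of accounts.

```python
from typing import Iterable


def load_many(accounts: Iterable[str], limit: int = 20, skip_replies: bool = True):
    """Load toots for several accounts with one loader instance."""
    from langchain_community.document_loaders import MastodonTootsLoader

    loader = MastodonTootsLoader(
        mastodon_accounts=list(accounts),  # each in "@username@instance" format
        number_toots=limit,                # per account, not in total
        exclude_replies=skip_replies,      # drop replies from the results
    )
    return loader.load()


# Example call (needs network access; handles are placeholders):
# docs = load_many(["@alice@example.social", "@bob@example.social"])
```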
Note that each loaded document's page_content contains HTML as returned by the Mastodon API. If you need plain text, strip the tags and decode the HTML entities before further processing.
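One possible way to get plain text is a small converter built on Python's standard html.parser module. This is a sketch, not something the loader provides: the class _TextExtractor and the function toot_html_to_text are hypothetical names, and the handling of paragraphs and line breaks is a simple convention chosen here.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content, turning <br> and </p> into line breaks."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities like &amp;
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self._parts.append("\n")

    def handle_endtag(self, tag):
        if tag == "p":
            self._parts.append("\n\n")

    def handle_data(self, data):
        self._parts.append(data)

    def text(self) -> str:
        return "".join(self._parts).strip()


def toot_html_to_text(html: str) -> str:
    """Convert the HTML of one toot into plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = '<p>Hello <a href="https://example.com">world</a> &amp; friends</p>'
print(toot_html_to_text(sample))  # prints "Hello world & friends"
```

To apply it to loaded documents, assign toot_html_to_text(doc.page_content) back to doc.page_content for each document before passing them on.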