Working with MLflow Embeddings in LangChain

Posted: Nov 15, 2024.

MLflow's AI Gateway provides a unified interface for working with various LLM providers. In this guide, we'll explore how to use the MlflowEmbeddings class in LangChain to generate embeddings through MLflow's gateway.

What is MlflowEmbeddings?

MlflowEmbeddings is a LangChain embeddings class that allows you to generate vector embeddings for text using MLflow's AI Gateway. It provides a convenient way to interact with embedding models (like OpenAI's text-embedding-ada-002) through a centralized MLflow endpoint, making it easier to manage and standardize embedding operations across your organization.

Reference

Methods:

  • embed_documents(texts): Generates embeddings for a list of text documents
  • embed_query(text): Generates an embedding for a single query text
  • aembed_documents(texts): Asynchronous version of embed_documents
  • aembed_query(text): Asynchronous version of embed_query

Configuration parameters:

  • target_uri: The URI of your MLflow gateway server (required)
  • endpoint: The name of the embeddings endpoint configured in MLflow (required)
  • documents_params: Additional parameters for document embedding (optional)
  • query_params: Additional parameters for query embedding (optional)

How to use MlflowEmbeddings

Setup and Installation

First, install MLflow with GenAI dependencies, along with langchain-community, which provides the MlflowEmbeddings class:

pip install 'mlflow[genai]' langchain-community

You'll need to configure and start the MLflow gateway with an embeddings endpoint. Here's a sample configuration:

endpoints:
  - name: embeddings
    endpoint_type: llm/v1/embeddings
    model:
      provider: openai
      name: text-embedding-ada-002
      config:
        openai_api_key: $OPENAI_API_KEY
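
With the configuration saved to a file (config.yaml is assumed here), start the gateway so it listens on the address your code will target. The exact command depends on your MLflow version: newer releases expose the gateway through the deployments CLI, while older ones ship a standalone gateway command.

# Newer MLflow releases
mlflow deployments start-server --config-path config.yaml --port 5000

# Older releases
mlflow gateway start --config-path config.yaml --port 5000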

Basic Usage

Here's how to initialize and use MlflowEmbeddings:

from langchain_community.embeddings import MlflowEmbeddings

embeddings = MlflowEmbeddings(
    target_uri="http://localhost:5000",
    endpoint="embeddings"
)

# Generate embedding for a single text
query_embedding = embeddings.embed_query("What is machine learning?")

# Generate embeddings for multiple texts
document_embeddings = embeddings.embed_documents(
    ["Machine learning is...", "Deep learning is...", "Neural networks are..."]
)
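
Both methods return plain Python lists: embed_query yields a single vector (a list of floats), and embed_documents yields one vector per input text. Continuing the example above, a quick sanity check looks like this (the 1536-dimensional output is specific to text-embedding-ada-002 and will differ for other models):

# embed_query returns List[float]; embed_documents returns List[List[float]]
print(len(query_embedding))         # vector dimensionality, e.g. 1536 for text-embedding-ada-002
print(len(document_embeddings))     # 3, one vector per input document
print(len(document_embeddings[0]))  # same dimensionality as the query embedding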

Async Operations

In async applications, use the async variants so embedding calls don't block the event loop:

import asyncio
from langchain_community.embeddings import MlflowEmbeddings

async def generate_embeddings():
    embeddings = MlflowEmbeddings(
        target_uri="http://localhost:5000",
        endpoint="embeddings"
    )
    
    # Generate embeddings asynchronously
    query_embedding = await embeddings.aembed_query("What is machine learning?")
    
    documents = [
        "Machine learning is...",
        "Deep learning is...",
        "Neural networks are..."
    ]
    doc_embeddings = await embeddings.aembed_documents(documents)
    
    return query_embedding, doc_embeddings

# Run the async function
query_emb, doc_embs = asyncio.run(generate_embeddings())
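
Because both calls are awaitable, you can also issue them concurrently with asyncio.gather instead of one after the other. Note that, depending on your LangChain version, these async methods may simply run the synchronous implementation in a thread pool, but gather still lets the two requests overlap. A minimal sketch reusing the same endpoint and inputs as above:

import asyncio
from langchain_community.embeddings import MlflowEmbeddings

async def generate_embeddings_concurrently():
    embeddings = MlflowEmbeddings(
        target_uri="http://localhost:5000",
        endpoint="embeddings"
    )
    documents = [
        "Machine learning is...",
        "Deep learning is...",
        "Neural networks are..."
    ]

    # Schedule both embedding calls and wait for them together
    query_embedding, doc_embeddings = await asyncio.gather(
        embeddings.aembed_query("What is machine learning?"),
        embeddings.aembed_documents(documents),
    )
    return query_embedding, doc_embeddings

query_emb, doc_embs = asyncio.run(generate_embeddings_concurrently())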

Using with Additional Parameters

You can provide additional parameters for document and query embedding operations; these are forwarded to the gateway endpoint with each request, and the accepted keys depend on the endpoint's underlying provider:

embeddings = MlflowEmbeddings(
    target_uri="http://localhost:5000",
    endpoint="embeddings",
    documents_params={"batch_size": 32},
    query_params={"truncate": "end"}
)

The MlflowEmbeddings class provides a convenient way to generate embeddings through MLflow's AI Gateway, allowing you to centralize and standardize your embedding operations. This is particularly useful in enterprise settings where you want to maintain control over model access and usage while providing a consistent interface to your development team.
