Using Cross Encoders in LangChain with HuggingFace Models

Posted: Nov 11, 2024.

Cross encoders are powerful models for comparing text pairs and computing similarity scores. In this guide, we'll explore how to use the HuggingFaceCrossEncoder in LangChain to leverage these models effectively.

What is HuggingFaceCrossEncoder?

HuggingFaceCrossEncoder is a LangChain class that wraps HuggingFace cross encoder models, making them easy to use for text similarity scoring tasks. Cross encoders are particularly useful for:

  • Reranking search results
  • Computing semantic similarity between text pairs
  • Question answering relevance scoring
  • Document retrieval refinement

Unlike bi-encoders that encode texts separately, cross encoders look at both texts simultaneously, often providing more accurate similarity scores at the cost of computational efficiency.
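To make the difference concrete, here is a minimal sketch of the bi-encoder flow using toy, hand-written embeddings (the vectors and the contrast with the cross-encoder call are illustrative assumptions, not real model outputs):

```python
import math

def cosine_similarity(u, v):
    """Compare two bi-encoder embeddings that were computed separately."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Bi-encoder flow: each text gets its own vector (toy 3-d embeddings here),
# so any pair can be compared cheaply after encoding.
vec_query = [0.9, 0.1, 0.0]  # hypothetical embedding of the query
vec_doc = [0.8, 0.2, 0.1]    # hypothetical embedding of a document
print(cosine_similarity(vec_query, vec_doc))

# Cross-encoder flow, by contrast: no per-text vectors exist. The model must
# see both texts together in one forward pass, e.g.
# cross_encoder.score([[query, doc]]).
```

Because the bi-encoder's per-text vectors can be precomputed and reused, it scales to large corpora; the cross encoder's joint pass is what buys its extra accuracy.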

Reference

Method/Parameter  Description
----------------  -----------
model_name        Name of the HuggingFace model to use. Defaults to 'BAAI/bge-reranker-base'.
model_kwargs      Optional dictionary of keyword arguments passed to the underlying model.
score()           Computes a similarity score for each pair in a list of text pairs and returns one score per pair.

How to Use HuggingFaceCrossEncoder

Basic Usage

Here's how to initialize and use the cross encoder for basic scoring:

from langchain_community.cross_encoders import HuggingFaceCrossEncoder

# Initialize the cross encoder
cross_encoder = HuggingFaceCrossEncoder(
    model_name="BAAI/bge-reranker-base",
    model_kwargs={'device': 'cpu'}
)

# Create text pairs to score (score() expects a list of [text, text] pairs)
text_pairs = [
    ["How do I bake bread?", "This is a recipe for sourdough bread"],
    ["How do I bake bread?", "The weather is nice today"]
]

# Get one similarity score per pair; higher means more related
scores = cross_encoder.score(text_pairs)
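Since score() returns one float per pair, a common next step is to sort the candidate texts by their scores. A small sketch with made-up scores (in practice they come from cross_encoder.score(...)):

```python
# Rank candidate documents by cross-encoder score. The scores here are
# hypothetical values for illustration only.
documents = [
    "The weather is nice today",
    "This is a recipe for sourdough bread",
]
scores = [-4.2, 1.7]  # one hypothetical score per (query, document) pair

ranked = [doc for _, doc in sorted(zip(scores, documents), reverse=True)]
print(ranked[0])  # the recipe document, which scored highest
```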

Using with ContextualCompressionRetriever

A common pattern is to use the cross encoder to rerank results from a base retriever:

from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CrossEncoderReranker
from langchain_community.cross_encoders import HuggingFaceCrossEncoder

# Initialize the cross encoder
model = HuggingFaceCrossEncoder(model_name="BAAI/bge-reranker-base")

# Create a reranker compressor
compressor = CrossEncoderReranker(model=model, top_n=3)

# Wrap your base retriever with the compression retriever
compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=your_base_retriever
)

# Use the compression retriever
compressed_docs = compression_retriever.invoke("Your query here")
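Conceptually, the reranker scores every retrieved document against the query and keeps the top_n best. The sketch below illustrates that logic in plain Python; it is a simplified illustration, not the actual LangChain implementation:

```python
# Simplified sketch of rerank-and-truncate: score each (query, document)
# pair, sort by score descending, keep the top_n documents.
def rerank(query, documents, score_fn, top_n=3):
    scores = score_fn([[query, doc] for doc in documents])
    ranked = sorted(zip(scores, documents), key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in ranked[:top_n]]

# Usage with a toy scoring function (word overlap); a real one would be
# cross_encoder.score.
fake_score = lambda pairs: [len(set(q.split()) & set(d.split())) for q, d in pairs]
docs = ["bake bread at home", "stock market news", "bread flour tips"]
print(rerank("how to bake bread", docs, fake_score, top_n=2))
```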

Custom Model Configuration

You can customize the model behavior using model_kwargs:

# Using a specific device and other custom parameters
cross_encoder = HuggingFaceCrossEncoder(
    model_name="cross-encoder/ms-marco-MiniLM-L-6-v2",
    model_kwargs={
        'device': 'cuda',  # Use GPU
        'max_length': 512  # Set maximum sequence length
    }
)

Using Alternative Models

The HuggingFaceCrossEncoder works with various cross encoder models from HuggingFace:

# Using different cross encoder models
models = [
    "cross-encoder/ms-marco-MiniLM-L-6-v2",
    "cross-encoder/stsb-roberta-large",
    "BAAI/bge-reranker-base"
]

# Initialize with your chosen model
cross_encoder = HuggingFaceCrossEncoder(
    model_name=models[0]  # Choose the model you want to use
)

Remember that cross encoders compute similarity by processing both texts together, which makes them more accurate but slower than bi-encoders for large-scale comparisons. They're best used in scenarios where accuracy is more important than speed, such as reranking a small set of pre-filtered results.
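The cost asymmetry is easy to quantify: a cross encoder needs one full forward pass per (query, document) pair, so a back-of-the-envelope comparison (corpus size and candidate count below are illustrative assumptions) shows why the retrieve-then-rerank pattern is standard:

```python
# Forward-pass counts for answering one query over a corpus of N documents.
N = 1_000_000  # corpus size (hypothetical)
k = 20         # candidates kept by a fast first-stage (bi-encoder) retriever

full_cross_encoder_passes = N    # scoring the query against every document
rerank_cross_encoder_passes = k  # scoring only the pre-filtered candidates

# Reranking the top-k is orders of magnitude cheaper than scoring the corpus.
print(full_cross_encoder_passes // rerank_cross_encoder_passes)
```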
