
Overview

The Flashrank reranker improves search result relevance by re-scoring retrieved documents against the query, so only the most relevant ones are passed to the LLM.

Features

  • Better Relevance - Re-scores vector search results against the query for higher accuracy
  • Fast Performance - Lightweight, CPU-friendly ONNX models
  • Automatic Download - Models are fetched automatically on first use
  • Easy Setup - A few lines of configuration

Basic Usage

from langchat.reranker import Flashrank

reranker = Flashrank(
    model_name="ms-marco-MiniLM-L-12-v2",
    top_n=3  # Keep top 3 after reranking
)

Configuration

reranker = Flashrank(
    model_name="ms-marco-MiniLM-L-12-v2",  # Flashrank model to load
    top_n=3,                               # Number of documents to keep after reranking
    cache_dir="rerank_models"              # Directory where downloaded models are cached
)
Reranker models are downloaded automatically on first use (~50 MB).

Using with LangChat

from langchat import LangChat
from langchat.llm import OpenAI
from langchat.vector_db import Pinecone
from langchat.database import Supabase
from langchat.reranker import Flashrank

llm = OpenAI(api_key="sk-...", model="gpt-4o-mini")
vector_db = Pinecone(api_key="...", index_name="...")
db = Supabase(url="https://...", key="...")
reranker = Flashrank(top_n=3)

ai = LangChat(
    llm=llm,
    vector_db=vector_db,
    db=db,
    reranker=reranker
)

How It Works

  1. Vector search retrieves the top K candidate documents
  2. The reranker re-scores each candidate against the query
  3. The top N highest-scoring documents are kept
  4. Those documents are passed as context for the LLM response
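The steps above can be sketched as a toy pipeline. The term-overlap score here is a simple stand-in for Flashrank's cross-encoder model, and all names below are illustrative, not part of the langchat API:

```python
# Toy retrieve-then-rerank pipeline illustrating the four steps above.
# The overlap-based score stands in for Flashrank's cross-encoder model.

def rerank(query: str, documents: list[str], top_n: int) -> list[str]:
    """Re-score documents against the query and keep the top N."""
    query_terms = set(query.lower().split())

    def score(doc: str) -> float:
        doc_terms = set(doc.lower().split())
        # Fraction of query terms that appear in the document
        return len(query_terms & doc_terms) / len(query_terms)

    ranked = sorted(documents, key=score, reverse=True)
    return ranked[:top_n]

# Step 1: pretend vector search returned K=4 candidates
candidates = [
    "Reranking improves retrieval quality",
    "Bananas are rich in potassium",
    "Vector search retrieves candidate documents",
    "Rerankers score documents against a query",
]

# Steps 2-3: re-score and keep the top N=2
top = rerank("how do rerankers score documents", candidates, top_n=2)
print(top)
# → ['Rerankers score documents against a query',
#    'Vector search retrieves candidate documents']
```

The point of the two-stage design is that vector search is cheap but approximate, while the reranker's query-document scoring is more accurate but too slow to run over the whole index; reranking only a small candidate set gets the best of both.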

Best Practices

1. Balance Retrieval and Reranking

# Retrieve a wider candidate set, then keep only the best after reranking
retriever = vector_db.get_retriever(k=10)  # Retrieve 10 candidates
reranker = Flashrank(top_n=3)              # Keep the top 3

2. Use Appropriate Top N

# Precise queries: a few highly relevant documents are enough
reranker = Flashrank(top_n=3)

# Broad queries: keep more documents for wider coverage
reranker = Flashrank(top_n=5)

Built with ❤️ by NeuroBrain