Overview
The Flashrank reranker improves search result relevance by re-scoring retrieved documents.
Features
- ✅ Better Relevance - Improves search accuracy
- ✅ Fast Performance - Optimized ONNX models
- ✅ Automatic Download - Models are fetched on first use
- ✅ Easy Setup - Simple configuration
Basic Usage
```python
from langchat.reranker import Flashrank

reranker = Flashrank(
    model_name="ms-marco-MiniLM-L-12-v2",
    top_n=3  # Keep the top 3 documents after reranking
)
```
Configuration
```python
reranker = Flashrank(
    model_name="ms-marco-MiniLM-L-12-v2",  # Flashrank model to load
    top_n=3,                               # Number of documents to keep
    cache_dir="rerank_models"              # Where downloaded models are stored
)
```
Reranker models are automatically downloaded on first use (~50MB).
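If you would rather pay the download cost ahead of time (for example, during a container build), one option is to instantiate the underlying flashrank `Ranker` once with the same `cache_dir`. This is a sketch that assumes the langchat wrapper delegates to the open-source flashrank package and reuses its cache layout:

```python
from flashrank import Ranker

# Constructing the ranker triggers the one-time ~50MB model download
# into cache_dir; later runs find the cached model and skip the download.
Ranker(model_name="ms-marco-MiniLM-L-12-v2", cache_dir="rerank_models")
```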
Using with LangChat
```python
from langchat import LangChat
from langchat.llm import OpenAI
from langchat.vector_db import Pinecone
from langchat.database import Supabase
from langchat.reranker import Flashrank

llm = OpenAI(api_key="sk-...", model="gpt-4o-mini")
vector_db = Pinecone(api_key="...", index_name="...")
db = Supabase(url="https://...", key="...")
reranker = Flashrank(top_n=3)

ai = LangChat(
    llm=llm,
    vector_db=vector_db,
    db=db,
    reranker=reranker
)
```
How It Works
1. Vector search retrieves K candidate documents.
2. The reranker re-scores each candidate against the query.
3. The top N highest-scoring documents are kept.
4. Those documents become the context for the LLM response.
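To make steps 2 and 3 concrete, here is a minimal sketch of the rerank step on its own. It assumes langchat delegates to the open-source flashrank package (where these model names come from); `Ranker` and `RerankRequest` are flashrank's public classes, and the query and passages here are purely illustrative:

```python
from flashrank import Ranker, RerankRequest

# Step 2 in isolation: re-score candidate passages against the query.
ranker = Ranker(model_name="ms-marco-MiniLM-L-12-v2", cache_dir="rerank_models")

passages = [
    {"id": 1, "text": "Flashrank re-scores retrieved documents by relevance."},
    {"id": 2, "text": "Unrelated text about cooking pasta."},
    {"id": 3, "text": "Rerankers run after vector search and before the LLM."},
]

request = RerankRequest(query="How does reranking work?", passages=passages)
scored = ranker.rerank(request)  # Passages returned sorted by relevance score

top_n = scored[:3]  # Step 3: keep only the top N documents as LLM context
```

Each returned passage carries a relevance score, so selecting the top N is simply a matter of slicing the sorted results.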
Best Practices
1. Balance Retrieval and Reranking
```python
# Retrieve more, rerank to fewer
retriever = vector_db.get_retriever(k=10)  # Retrieve 10 candidates
reranker = Flashrank(top_n=3)              # Keep the top 3
```
2. Use Appropriate Top N
```python
# Precise queries: a small top_n keeps the context focused
reranker = Flashrank(top_n=3)

# Broad queries: a larger top_n preserves more coverage
reranker = Flashrank(top_n=5)
```
Built with ❤️ by NeuroBrain