The pattern

All providers live in langchat.providers. Every provider reads credentials from environment variables automatically — you don’t pass keys in code.
from langchat import LangChat
from langchat.providers import OpenAI, Pinecone, Supabase

lc = LangChat(
    llm=OpenAI("gpt-4o-mini"),      # reads OPENAI_API_KEY
    vector_db=Pinecone("my-index"), # reads PINECONE_API_KEY
    db=Supabase(),                  # reads SUPABASE_URL + SUPABASE_KEY
)
You can also pass credentials explicitly when environment variables aren’t available:
lc = LangChat(
    llm=OpenAI("gpt-4o-mini", api_key="sk-..."),
    vector_db=Pinecone("my-index", api_key="pcsk-..."),
    db=Supabase(url="https://xxxx.supabase.co", key="eyJ..."),
)

LLM providers

OpenAI

from langchat.providers import OpenAI

llm = OpenAI("gpt-4o-mini")                    # default
llm = OpenAI("gpt-4o", temperature=0.3)        # more deterministic
llm = OpenAI("gpt-4o-mini", temperature=1.0)   # more creative (default)
Environment variable: OPENAI_API_KEY
Available models: gpt-4o-mini · gpt-4o · gpt-4-turbo · gpt-3.5-turbo · any valid OpenAI chat model
Multiple API keys (automatic rotation on failure):
llm = OpenAI("gpt-4o-mini", api_keys=["sk-key1", "sk-key2", "sk-key3"])
model (str, default "gpt-4o-mini")
  OpenAI model name. First positional argument.
api_key (str | None, default None)
  API key. Defaults to the OPENAI_API_KEY environment variable.
api_keys (list[str] | None, default None)
  Multiple API keys for automatic rotation. Takes precedence over api_key.
temperature (float, default 1.0)
  Sampling temperature. Lower = more deterministic.
max_retries_per_key (int, default 2)
  Retries per key before rotating to the next one; see the sketch below.
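The two rotation settings compose: a failing key is retried up to max_retries_per_key times before langchat moves to the next key in api_keys. A sketch (key values are placeholders):
from langchat.providers import OpenAI

llm = OpenAI(
    "gpt-4o-mini",
    api_keys=["sk-key1", "sk-key2", "sk-key3"],  # takes precedence over api_key
    max_retries_per_key=3,                       # 3 tries per key before rotating
)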

Anthropic

from langchat.providers import Anthropic

llm = Anthropic("claude-3-5-sonnet-20241022")
llm = Anthropic("claude-3-5-haiku-20241022")   # faster, cheaper
llm = Anthropic("claude-3-opus-20240229")       # most capable
Environment variable: ANTHROPIC_API_KEY
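Any LLM provider slots into the same constructor shown in the pattern section; for example:
from langchat import LangChat
from langchat.providers import Anthropic, Pinecone, Supabase

lc = LangChat(
    llm=Anthropic("claude-3-5-sonnet-20241022"),  # reads ANTHROPIC_API_KEY
    vector_db=Pinecone("my-index"),
    db=Supabase(),
)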

Google Gemini

from langchat.providers import Gemini

llm = Gemini("gemini-1.5-flash")   # fast and efficient
llm = Gemini("gemini-1.5-pro")     # higher quality
llm = Gemini("gemini-2.0-flash")   # latest
Environment variables: GEMINI_API_KEY or GOOGLE_API_KEY (either works)

Mistral

from langchat.providers import Mistral

llm = Mistral("mistral-large-latest")
llm = Mistral("mistral-small-latest")   # faster, cheaper
Environment variable: MISTRAL_API_KEY

Cohere

from langchat.providers import Cohere

llm = Cohere("command-r-plus")
llm = Cohere("command-r")        # faster, cheaper
Environment variable: COHERE_API_KEY

Ollama (local)

No API key required. Ollama runs on your own machine.
from langchat.providers import Ollama

llm = Ollama("llama3.2")                                  # default endpoint
llm = Ollama("llama3.2", base_url="http://192.168.1.5:11434")  # remote instance
Install Ollama and pull a model first:
ollama pull llama3.2
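To keep the LLM fully local, Ollama drops into the same slot (the vector and history stores below still read their own credentials):
from langchat import LangChat
from langchat.providers import Ollama, Pinecone, Supabase

lc = LangChat(
    llm=Ollama("llama3.2"),          # no API key needed
    vector_db=Pinecone("my-index"),  # reads PINECONE_API_KEY
    db=Supabase(),                   # reads SUPABASE_URL + SUPABASE_KEY
)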

Vector database

Pinecone

from langchat.providers import Pinecone

vector_db = Pinecone("my-index")                              # default embedding model
vector_db = Pinecone("my-index", embedding_model="text-embedding-3-small")  # smaller, cheaper
Environment variables: PINECONE_API_KEY and OPENAI_API_KEY (for embeddings)
index (str, required)
  Pinecone index name. First positional argument.
api_key (str | None, default None)
  Pinecone API key. Defaults to PINECONE_API_KEY.
embedding_api_key (str | None, default None)
  OpenAI API key for embeddings. Defaults to OPENAI_API_KEY.
embedding_model (str, default "text-embedding-3-large")
  OpenAI embedding model. Use text-embedding-3-small for lower cost.
Your Pinecone index dimensions must match the embedding model:
  • text-embedding-3-large → 3072 dimensions
  • text-embedding-3-small → 1536 dimensions
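If you still need to create an index with matching dimensions, a sketch using the standalone pinecone SDK (v3+); the cloud and region values are placeholders, and the client is aliased to avoid clashing with langchat's Pinecone provider:
from pinecone import Pinecone as PineconeClient, ServerlessSpec

pc = PineconeClient(api_key="pcsk-...")
pc.create_index(
    name="my-index",
    dimension=3072,  # must match text-embedding-3-large (see above)
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)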

History database

Supabase

from langchat.providers import Supabase

db = Supabase()                                              # reads from env
db = Supabase(url="https://xxxx.supabase.co", key="eyJ...")  # explicit
Environment variables: SUPABASE_URL and SUPABASE_KEY (or SUPABASE_SERVICE_ROLE_KEY)
LangChat creates the required tables automatically on first run.

LangChat settings

Pass these to LangChat(...) as keyword arguments:
llm (Any, required)
  LLM provider instance.
vector_db (Any, required)
  Vector database provider instance.
db (Any, required)
  History database provider instance.
reranker (Any | None, default None)
  Reranker instance. Defaults to FlashrankRerankAdapter with ms-marco-MiniLM-L-12-v2.
prompt_template (str | None, default None)
  System prompt template. Must contain {context}, {chat_history}, and {question}. See the Prompts guide.
standalone_question_prompt (str | None, default None)
  Prompt for reformulating follow-up questions. Must contain {chat_history} and {question}. See the sketch after this list.
verbose (bool, default False)
  Enable verbose LangChain chain logging.
max_chat_history (int, default 20)
  Number of past exchanges to include in each prompt. One exchange = one user message + one AI response.
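The standalone question prompt is not demonstrated elsewhere on this page; a minimal sketch, where the wording is illustrative and only the {chat_history} and {question} placeholders are required:
from langchat import LangChat
from langchat.providers import OpenAI, Pinecone, Supabase

lc = LangChat(
    llm=OpenAI("gpt-4o-mini"),
    vector_db=Pinecone("my-index"),
    db=Supabase(),
    standalone_question_prompt="""Rewrite the follow-up question below as a
self-contained question, using the conversation for context.

Conversation history:
{chat_history}

Follow-up question: {question}
Standalone question:""",
)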

Complete example

import asyncio
from langchat import LangChat
from langchat.providers import OpenAI, Pinecone, Supabase

LangChat.load_env()

lc = LangChat(
    llm=OpenAI("gpt-4o-mini", temperature=0.7),
    vector_db=Pinecone("my-index"),
    db=Supabase(),
    max_chat_history=10,
    prompt_template="""You are a helpful assistant.

Context:
{context}

Conversation history:
{chat_history}

User: {question}
Assistant:""",
)

async def main():
    response = await lc.chat(query="Hello!", user_id="alice")
    print(response)

asyncio.run(main())