Overview
The OpenAILLMService is a production-ready wrapper around OpenAI’s Chat API that provides:
- Automatic API Key Rotation: Seamlessly rotate between multiple API keys
- Fault Tolerance: Automatic retry with key rotation on failures
- Error Recovery: Handles rate limits and API errors gracefully
Features
Automatic API Key Rotation
When configured with multiple API keys, the service automatically rotates between them:
- On API failures
- On rate limit errors
- When a key is exhausted
Fault Tolerance
Built-in retry mechanism:
- Automatic retry on failures
- Key rotation on errors
- Configurable retry attempts per key
Performance
- Efficient key cycling using Python’s itertools.cycle
- Lazy initialization of ChatOpenAI instances
- Minimal overhead
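The two ideas above, round-robin cycling and lazy client creation, can be sketched as follows. This is a minimal illustration, not the service's actual code: LazyKeyPool, its factory argument, and its method names are hypothetical, and the factory stands in for whatever builds a ChatOpenAI instance from a key.

```python
from itertools import cycle
from typing import Callable, Dict, Generic, Iterator, List, TypeVar

T = TypeVar("T")


class LazyKeyPool(Generic[T]):
    """Hypothetical helper: cycles through keys, builds one client per key on first use."""

    def __init__(self, api_keys: List[str], factory: Callable[[str], T]) -> None:
        if not api_keys:
            raise ValueError("at least one API key is required")
        self._keys: Iterator[str] = cycle(api_keys)  # endless round-robin iterator
        self._factory = factory                      # assumed: builds a client from a key
        self._clients: Dict[str, T] = {}             # cache of clients built so far
        self.current_key = next(self._keys)

    def rotate(self) -> str:
        """Advance to the next key, wrapping around at the end of the list."""
        self.current_key = next(self._keys)
        return self.current_key

    def client(self) -> T:
        """Return the client for the current key, creating it only when first needed."""
        if self.current_key not in self._clients:
            self._clients[self.current_key] = self._factory(self.current_key)
        return self._clients[self.current_key]
```

Because clients are cached per key, rotating back to a key that was used earlier reuses its existing client instead of constructing a new one.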
Configuration
The service is configured through LangChatConfig, using the options listed below.
Configuration Options
| Parameter | Type | Default | Description |
|---|---|---|---|
| openai_api_keys | List[str] | Required | List of OpenAI API keys for rotation |
| openai_model | str | "gpt-4o-mini" | OpenAI model to use |
| openai_temperature | float | 1.0 | Model temperature (0.0-2.0) |
| max_llm_retries | int | 2 | Number of retries per API key |
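As a rough illustration of the options in the table, the stand-in dataclass below mirrors their names, types, and defaults. OpenAISettings and its validation rules are assumptions made for this sketch; it is not LangChatConfig, whose actual constructor and checks may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OpenAISettings:
    """Hypothetical stand-in mirroring the table; not the real LangChatConfig."""

    openai_api_keys: List[str]          # required, no default
    openai_model: str = "gpt-4o-mini"
    openai_temperature: float = 1.0
    max_llm_retries: int = 2            # retries per API key

    def __post_init__(self) -> None:
        # Assumed validation, derived only from the ranges stated in the table.
        if not self.openai_api_keys:
            raise ValueError("openai_api_keys must contain at least one key")
        if not 0.0 <= self.openai_temperature <= 2.0:
            raise ValueError("openai_temperature must be between 0.0 and 2.0")
        if self.max_llm_retries < 0:
            raise ValueError("max_llm_retries must not be negative")


# Only the key list is required; the other fields fall back to the table defaults.
settings = OpenAISettings(openai_api_keys=["sk-first", "sk-second"])
```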
Usage
Basic Usage
The service is automatically initialized by LangChatEngine.
Direct Usage (Advanced)
API Reference
Class: OpenAILLMService
Constructor
OpenAI model name (e.g., “gpt-4o-mini”, “gpt-4”, “gpt-3.5-turbo”)
Model temperature (0.0-2.0)
List of OpenAI API keys for rotation
Maximum retries per API key (default: 2)
Methods
invoke(messages, **kwargs)
Invoke the OpenAI Chat API with automatic retry and key rotation.
Parameters:
- messages: List of message dictionaries (LangChain format)
- **kwargs: Additional arguments passed to ChatOpenAI
Returns:
- AI model response
Raises:
- Exception: If all API keys are exhausted
How It Works
Key Rotation Flow
Retry Logic
The service applies the following retry rules:
- Per Key Retries: Each key gets max_retries_per_key attempts
- Total Retries: Total attempts = len(api_keys) × max_retries_per_key
- Error Handling: Distinguishes between retryable and non-retryable errors
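The rules above can be sketched as a nested loop: an outer loop over keys and an inner loop over attempts for each key. This is a simplified illustration under stated assumptions, not the service's implementation. RetryableError, call_with_rotation, and the call argument are hypothetical names; how the real service classifies errors is not shown in this document.

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical marker for errors worth retrying (for example, rate limits)."""


def call_with_rotation(
    api_keys: List[str],
    max_retries_per_key: int,
    call: Callable[[str], T],
) -> T:
    """Try each key up to max_retries_per_key times before moving to the next key."""
    last_error: Optional[Exception] = None
    for key in api_keys:                          # rotate to the next key
        for _ in range(max_retries_per_key):      # attempts for the current key
            try:
                return call(key)
            except RetryableError as exc:
                last_error = exc                  # retry; rotate once attempts run out
            # Any other exception is treated as non-retryable and propagates at once.
    # Reached only after len(api_keys) * max_retries_per_key failed attempts.
    raise RuntimeError("all API keys exhausted") from last_error
```

With 3 keys and max_retries_per_key set to 2, a call that always hits a retryable error is attempted 3 × 2 = 6 times before the final exception is raised.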
Best Practices
Multiple API Keys
Use multiple API keys to:
- Handle rate limits
- Distribute load
- Improve fault tolerance
Error Handling
The service handles common errors:
- Rate Limits: Automatically rotates keys
- API Errors: Retries with different key
- Network Errors: Retries with exponential backoff (via LangChain)
Monitoring
Monitor API usage:
- Track which keys are being used
- Monitor error rates
- Watch for exhausted keys
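One possible way to collect these three signals is a small in-memory tracker that the calling code updates after each request. KeyUsageMonitor is a hypothetical helper, not something the service provides, and the 0.5 error-rate threshold is an arbitrary example value. Keys are stored by their last four characters so that full secrets never reach logs.

```python
from collections import Counter
from typing import Dict, List


class KeyUsageMonitor:
    """Hypothetical in-memory tracker of calls and errors per (masked) API key."""

    def __init__(self) -> None:
        self.calls: Counter = Counter()
        self.errors: Counter = Counter()

    @staticmethod
    def mask(api_key: str) -> str:
        """Keep only the last four characters so full keys are never stored."""
        return "..." + api_key[-4:]

    def record(self, api_key: str, ok: bool) -> None:
        """Count one call for the key, and one error if the call failed."""
        label = self.mask(api_key)
        self.calls[label] += 1
        if not ok:
            self.errors[label] += 1

    def error_rates(self) -> Dict[str, float]:
        """Fraction of failed calls for every key seen so far."""
        return {label: self.errors[label] / n for label, n in self.calls.items()}

    def suspect_keys(self, threshold: float = 0.5) -> List[str]:
        """Keys whose error rate meets the threshold, e.g. possibly exhausted ones."""
        return [k for k, rate in self.error_rates().items() if rate >= threshold]
```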
Related Documentation
- Configuration Guide - Configuration options
- Adapters Overview - All adapters