Overview

The OpenAILLMService is a production-ready wrapper around OpenAI’s Chat API that provides:
  • Automatic API Key Rotation: Seamlessly rotate between multiple API keys
  • Fault Tolerance: Automatic retry with key rotation on failures
  • Error Recovery: Handles rate limits and API errors gracefully

Features

Automatic API Key Rotation

When configured with multiple API keys, the service automatically rotates between them:
  • On API failures
  • On rate limit errors
  • When a key is exhausted

Fault Tolerance

Built-in retry mechanism:
  • Automatic retry on failures
  • Key rotation on errors
  • Configurable retry attempts per key

Performance

  • Efficient key cycling using Python’s itertools.cycle
  • Lazy initialization of ChatOpenAI instances
  • Minimal overhead
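
The two techniques named above, cycling with itertools.cycle and lazy client creation, can be combined roughly as in the sketch below. It uses only the standard library. `KeyRing` and `FakeClient` are hypothetical names for illustration; the real service builds ChatOpenAI instances and its internals may differ.

```python
from itertools import cycle
from typing import Dict, List


class FakeClient:
    """Hypothetical stand-in for a ChatOpenAI instance."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key


class KeyRing:
    """Cycles through API keys and builds one client per key on first use."""

    def __init__(self, api_keys: List[str]) -> None:
        if not api_keys:
            raise ValueError("at least one API key is required")
        self._keys = cycle(api_keys)  # endless round-robin iterator
        self._clients: Dict[str, FakeClient] = {}
        self.current_key = next(self._keys)

    def rotate(self) -> str:
        """Advance to the next key, wrapping around at the end of the list."""
        self.current_key = next(self._keys)
        return self.current_key

    def client(self) -> FakeClient:
        """Return the client for the current key, creating it only when needed."""
        if self.current_key not in self._clients:
            self._clients[self.current_key] = FakeClient(self.current_key)
        return self._clients[self.current_key]


ring = KeyRing(["sk-a", "sk-b"])
first = ring.client()  # builds the client for "sk-a"
ring.rotate()          # now on "sk-b", but no client is built yet
ring.rotate()          # wraps back to "sk-a"
```

Because clients are created on demand, a key that is never rotated to never costs a client construction.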

Configuration

The service is configured through LangChatConfig:
from langchat.config import LangChatConfig

config = LangChatConfig(
    openai_api_keys=["sk-...", "sk-...", "sk-..."],  # Multiple keys for rotation
    openai_model="gpt-4o-mini",
    openai_temperature=1.0,
    max_llm_retries=2  # Retries per key
)

Configuration Options

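| Parameter          | Type      | Default       | Description                          |
| ------------------ | --------- | ------------- | ------------------------------------ |
| openai_api_keys    | List[str] | Required      | List of OpenAI API keys for rotation |
| openai_model       | str       | "gpt-4o-mini" | OpenAI model to use                  |
| openai_temperature | float     | 1.0           | Model temperature (0.0-2.0)          |
| max_llm_retries    | int       | 2             | Number of retries per API key        |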
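
The documented constraints can be checked before the config is built. The sketch below is one way to do that: `validate_llm_settings` is a hypothetical helper, not part of LangChat, and whether LangChatConfig runs similar checks itself is not stated here. The minimum of one retry per key is an assumption.

```python
from typing import List


def validate_llm_settings(api_keys: List[str], temperature: float, max_retries: int) -> None:
    """Raise ValueError if a setting falls outside the documented ranges."""
    if not api_keys or any(not key.strip() for key in api_keys):
        raise ValueError("openai_api_keys must be a non-empty list of non-blank strings")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("openai_temperature must be between 0.0 and 2.0")
    if max_retries < 1:  # assumption: at least one attempt per key is wanted
        raise ValueError("max_llm_retries must be at least 1")


# Try one valid and two invalid combinations, collecting the error text (or None).
errors = []
for keys, temp in [(["sk-a"], 1.0), ([], 1.0), (["sk-a"], 2.5)]:
    try:
        validate_llm_settings(keys, temp, 2)
        errors.append(None)
    except ValueError as exc:
        errors.append(str(exc))
```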

Usage

Basic Usage

The service is automatically initialized by LangChatEngine:
from langchat import LangChat, LangChatConfig

config = LangChatConfig(
    openai_api_keys=["your-api-key"],
    # ... other config
)

langchat = LangChat(config=config)
# Service is automatically initialized

Direct Usage (Advanced)

from langchat.adapters.services.openai_service import OpenAILLMService

service = OpenAILLMService(
    model="gpt-4o-mini",
    temperature=1.0,
    api_keys=["sk-...", "sk-..."],
    max_retries_per_key=2
)

# Use the service
messages = [{"role": "user", "content": "Hello!"}]
response = service.invoke(messages)

API Reference

Class: OpenAILLMService

Constructor

OpenAILLMService(
    model: str,
    temperature: float,
    api_keys: List[str],
    max_retries_per_key: int = 2
)
Parameters:
  • model (str, required): OpenAI model name (e.g., "gpt-4o-mini", "gpt-4", "gpt-3.5-turbo")
  • temperature (float, required): Model temperature (0.0-2.0)
  • api_keys (List[str], required): List of OpenAI API keys for rotation
  • max_retries_per_key (int, default: 2): Maximum retries per API key

Methods

invoke(messages, **kwargs)
Invoke the OpenAI Chat API with automatic retry and key rotation.
Parameters:
  • messages: List of message dictionaries (LangChain format)
  • **kwargs: Additional arguments passed to ChatOpenAI
Returns:
  • AI model response; its text is available as response.content
Raises:
  • Exception: If all API keys are exhausted
Example:
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Python?"}
]

response = service.invoke(messages)
print(response.content)

How It Works

Key Rotation Flow

1. Initialize with list of API keys
2. Use first key for API calls
3. On failure:
   a. Rotate to next key
   b. Retry the request
   c. If all keys exhausted, raise exception
4. On success, continue with current key
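
The numbered flow above can be sketched as follows. This is a simplified illustration, not the service's actual code: `RotatingCaller`, `AllKeysExhausted`, and `flaky_request` are hypothetical names, and per-key retries are left out so that each key gets a single attempt.

```python
from typing import Callable, List, Optional


class AllKeysExhausted(Exception):
    """Hypothetical error type; the docs only say an Exception is raised."""


class RotatingCaller:
    def __init__(self, api_keys: List[str]) -> None:
        self.api_keys = api_keys  # step 1: initialize with the list of keys
        self.index = 0            # step 2: start with the first key

    def call(self, request: Callable[[str], str]) -> str:
        last_error: Optional[Exception] = None
        for _ in range(len(self.api_keys)):
            key = self.api_keys[self.index]
            try:
                # step 4: on success the index is untouched, so the key is reused
                return request(key)
            except Exception as exc:
                last_error = exc
                # step 3a: rotate to the next key; the loop then retries (3b)
                self.index = (self.index + 1) % len(self.api_keys)
        # step 3c: every key has failed
        raise AllKeysExhausted("all API keys failed") from last_error


def flaky_request(key: str) -> str:
    """Simulated API call that fails for one particular key."""
    if key == "sk-bad":
        raise RuntimeError("rate limited")
    return f"ok via {key}"


caller = RotatingCaller(["sk-bad", "sk-good"])
result = caller.call(flaky_request)
```

After the first call the caller stays on the working key, so later requests do not pay for the failed one again.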

Retry Logic

The retry logic works as follows:
  • Per Key Retries: Each key gets max_retries_per_key attempts
  • Total Retries: Total attempts = len(api_keys) × max_retries_per_key
  • Error Handling: Distinguishes between retryable and non-retryable errors
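
To make the attempt budget concrete, the sketch below counts attempts for three keys with two attempts each, giving 3 × 2 = 6 in total. `invoke_with_budget` and `NonRetryableError` are hypothetical names; how the real service classifies errors is not specified here, so the marker exception is only an assumption.

```python
from typing import Callable, List, Tuple


class NonRetryableError(Exception):
    """Hypothetical marker for errors that retrying cannot fix."""


def invoke_with_budget(
    api_keys: List[str],
    max_retries_per_key: int,
    request: Callable[[str], str],
) -> Tuple[str, int]:
    """Try each key up to max_retries_per_key times; return (result, attempts used)."""
    attempts = 0
    for key in api_keys:
        for _ in range(max_retries_per_key):
            attempts += 1
            try:
                return request(key), attempts
            except NonRetryableError:
                raise  # retrying or rotating would not help, so surface it at once
            except Exception:
                continue  # retryable: use the next attempt or the next key
    raise RuntimeError(f"all keys exhausted after {attempts} attempts")


calls: List[str] = []


def always_fails(key: str) -> str:
    """Simulated request that records the key used and always times out."""
    calls.append(key)
    raise TimeoutError("simulated failure")


try:
    invoke_with_budget(["k1", "k2", "k3"], 2, always_fails)
except RuntimeError as exc:
    message = str(exc)
```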

Best Practices

Multiple API Keys

Use multiple API keys to:
  • Handle rate limits
  • Distribute load
  • Improve fault tolerance
from langchat.config import LangChatConfig

config = LangChatConfig(
    openai_api_keys=[
        "sk-org1-key1",
        "sk-org1-key2",
        "sk-org2-key1"  # Different organization
    ],
    max_llm_retries=2
)

Error Handling

The service handles common errors:
  • Rate Limits: Automatically rotates keys
  • API Errors: Retries with different key
  • Network Errors: Retries with exponential backoff (via LangChain)
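
For readers unfamiliar with exponential backoff, the sketch below shows the general shape of such a delay schedule. It is a generic illustration: `backoff_delays` is a hypothetical helper, and the base, factor, and cap that LangChain actually applies are not documented here and may differ.

```python
from typing import List


def backoff_delays(
    attempts: int,
    base: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 30.0,
) -> List[float]:
    """Return the wait in seconds before each retry: base * factor**n, capped at max_delay."""
    return [min(base * factor ** n, max_delay) for n in range(attempts)]


delays = backoff_delays(attempts=5, max_delay=10.0)
# → [1.0, 2.0, 4.0, 8.0, 10.0]
```

Real implementations often add random jitter to each delay so that many clients do not retry at the same instant.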

Monitoring

Monitor API usage:
  • Track which keys are being used
  • Monitor error rates
  • Watch for exhausted keys
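
One way to collect these numbers is a small counter wrapped around each call. The sketch below is an assumption-laden illustration: `KeyUsageMonitor` is a hypothetical class, and the service is not documented to expose which key served a request, so the caller is assumed to know it.

```python
from collections import Counter


class KeyUsageMonitor:
    """Counts successes and failures per key, identified by a masked suffix."""

    def __init__(self) -> None:
        self.successes: Counter = Counter()
        self.failures: Counter = Counter()

    @staticmethod
    def mask(api_key: str) -> str:
        # Keep only the last four characters so full keys never reach logs.
        return "..." + api_key[-4:]

    def record(self, api_key: str, ok: bool) -> None:
        """Record one request outcome for the given key."""
        target = self.successes if ok else self.failures
        target[self.mask(api_key)] += 1

    def error_rate(self, api_key: str) -> float:
        """Fraction of recorded requests that failed; 0.0 if none were recorded."""
        label = self.mask(api_key)
        total = self.successes[label] + self.failures[label]
        return self.failures[label] / total if total else 0.0


monitor = KeyUsageMonitor()
monitor.record("sk-example-0001", ok=True)
monitor.record("sk-example-0001", ok=False)
monitor.record("sk-example-0002", ok=True)
```

A key whose error rate stays high across many requests is a likely candidate for being exhausted or revoked.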