Class: DocumentIndexer

Standalone document loader and indexer for Pinecone. It requires only a Pinecone API key and an OpenAI API key (for embeddings), making it well suited to indexing documents without a full LangChat setup.

Constructor

DocumentIndexer(
    pinecone_api_key: str,
    pinecone_index_name: str,
    openai_api_key: str,
    embedding_model: str = "text-embedding-3-large"
)
Creates a new DocumentIndexer instance.
Parameters:
pinecone_api_key
str
required
Your Pinecone API key. Get it from pinecone.io.
pinecone_index_name
str
required
Name of your Pinecone index. Must be pre-created in your Pinecone dashboard.
openai_api_key
str
required
Your OpenAI API key for generating embeddings. Get it from platform.openai.com.
embedding_model
str
default:"text-embedding-3-large"
OpenAI embedding model to use. The model's output dimension must match the dimension of your Pinecone index. Options:
  • "text-embedding-3-large" (recommended, 3072 dimensions)
  • "text-embedding-3-small" (faster, 1536 dimensions)
  • "text-embedding-ada-002" (legacy, 1536 dimensions)
Example:
from langchat.utils.document_indexer import DocumentIndexer

# Initialize DocumentIndexer
indexer = DocumentIndexer(
    pinecone_api_key="pcsk-...",
    pinecone_index_name="my-index",
    openai_api_key="sk-...",
    embedding_model="text-embedding-3-large"
)
DocumentIndexer automatically connects to Pinecone and verifies the index is accessible on initialization.
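
If the index doesn't exist or a key is invalid, that check fails at construction time. Below is a minimal sketch of guarding initialization; the exact exception type DocumentIndexer raises is not documented here, so it catches broadly:
from langchat.utils.document_indexer import DocumentIndexer

try:
    indexer = DocumentIndexer(
        pinecone_api_key="pcsk-...",
        pinecone_index_name="my-index",
        openai_api_key="sk-..."
    )
except Exception as e:
    # A missing index or bad credentials surface here, before any indexing
    print(f"❌ Could not initialize DocumentIndexer: {e}")
    raise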

Methods

load_and_index_documents()

Load documents from a file, split them into chunks, and index them to Pinecone.
def load_and_index_documents(
    self,
    file_path: str,
    chunk_size: int = 1000,
    chunk_overlap: int = 200,
    namespace: Optional[str] = None,
    prevent_duplicates: bool = True
) -> Dict
Parameters:
file_path
str
required
Path to the document file. Supports:
  • PDF files (.pdf)
  • Text files (.txt)
  • CSV files (.csv)
  • And more (see docsuite documentation)
chunk_size
int
default:"1000"
Size of each text chunk in characters. Recommended values:
  • Small documents: 500
  • Medium documents: 1000 (default)
  • Large documents: 1500-2000
chunk_overlap
int
default:"200"
Overlap between chunks in characters. Should be 10-20% of chunk_size. Default 200 (20% of 1000).
namespace
str | None
default:"None"
Optional Pinecone namespace to store documents in. Use namespaces to organize documents by topic, project, or department.
prevent_duplicates
bool
default:"True"
If True, checks for existing documents before indexing to prevent duplicates. Uses a SHA256 hash of file path + content (an illustrative sketch follows this method's Raises list below).
Returns: Dict with the following keys:
  • status (str): "success" on success, otherwise an error status
  • chunks_indexed (int): Number of chunks successfully indexed
  • chunks_skipped (int): Number of duplicate chunks skipped (if prevent_duplicates=True)
  • documents_loaded (int): Number of documents loaded from file
  • file_path (str): Path to the indexed file
  • namespace (str | None): Namespace used (if any)
  • message (str, optional): Additional message (e.g., "No documents to index")
Example:
from langchat.utils.document_indexer import DocumentIndexer

indexer = DocumentIndexer(
    pinecone_api_key="pcsk-...",
    pinecone_index_name="my-index",
    openai_api_key="sk-..."
)

# Index a single document
result = indexer.load_and_index_documents(
    file_path="document.pdf",
    chunk_size=1000,
    chunk_overlap=200,
    namespace="company-docs",
    prevent_duplicates=True
)

print(f"Indexed: {result['chunks_indexed']} chunks")
print(f"Skipped: {result['chunks_skipped']} duplicates")
Raises:
  • UnsupportedFileTypeError: If the file type is not supported by docsuite
  • RuntimeError: If indexing to Pinecone fails
  • ValueError: If required parameters are missing
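
For illustration, the duplicate check described under prevent_duplicates can be pictured as a content fingerprint. The sketch below mirrors the idea (SHA256 over file path + content) but is not DocumentIndexer's exact implementation:
import hashlib

def document_fingerprint(file_path: str) -> str:
    # Illustrative only: hash the path together with the raw file bytes,
    # so re-indexing the same file from the same location is detectable
    with open(file_path, "rb") as f:
        content = f.read()
    return hashlib.sha256(file_path.encode("utf-8") + content).hexdigest()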

load_and_index_multiple_documents()

Load multiple documents, split them, and index them to Pinecone in batch.
def load_and_index_multiple_documents(
    self,
    file_paths: List[str],
    chunk_size: int = 1000,
    chunk_overlap: int = 200,
    namespace: Optional[str] = None,
    prevent_duplicates: bool = True
) -> Dict
Parameters:
file_paths
List[str]
required
List of file paths to load and index. All files will be processed with the same chunk settings.
chunk_size
int
default:"1000"
Size of each text chunk (same as load_and_index_documents)
chunk_overlap
int
default:"200"
Overlap between chunks (same as load_and_index_documents)
namespace
str | None
default:"None"
Optional namespace for all documents (same as load_and_index_documents)
prevent_duplicates
bool
default:"True"
Prevent duplicate indexing (same as load_and_index_documents)
Returns: Dict with the following keys:
  • status (str): "completed"
  • total_chunks_indexed (int): Total chunks indexed across all files
  • total_chunks_skipped (int): Total duplicate chunks skipped
  • files_processed (int): Total number of files processed
  • files_succeeded (int): Number of files successfully indexed
  • files_failed (int): Number of files that failed
  • results (List[Dict]): Detailed results for each file
  • errors (List[Dict] | None): List of errors (if any)
Example:
# Index multiple documents
result = indexer.load_and_index_multiple_documents(
    file_paths=[
        "doc1.pdf",
        "doc2.txt",
        "data.csv"
    ],
    chunk_size=1000,
    chunk_overlap=200,
    namespace="project-docs"
)

print(f"Total chunks: {result['total_chunks_indexed']}")
print(f"Files succeeded: {result['files_succeeded']}")
print(f"Files failed: {result['files_failed']}")

# Check individual file results
for file_result in result['results']:
    print(f"{file_result['file_path']}: {file_result['status']}")

Properties

index

Access the Pinecone index directly (advanced usage).
index: pinecone.Index
Example:
# Access index stats
stats = indexer.index.describe_index_stats()
print(f"Total vectors: {stats.total_vector_count}")

embeddings

Access the OpenAI embeddings model (advanced usage).
embeddings: OpenAIEmbeddings
Example:
# Generate custom embedding
embedding = indexer.embeddings.embed_query("custom text")
print(f"Embedding dimension: {len(embedding)}")

vector_store

Access the LangChain PineconeVectorStore (advanced usage).
vector_store: PineconeVectorStore
Example:
# Use vector store directly
retriever = indexer.vector_store.as_retriever(search_kwargs={"k": 5})

Usage Examples

Basic Single Document Indexing

from langchat.utils.document_indexer import DocumentIndexer

# Initialize
indexer = DocumentIndexer(
    pinecone_api_key="pcsk-...",
    pinecone_index_name="my-index",
    openai_api_key="sk-..."
)

# Index document
result = indexer.load_and_index_documents("document.pdf")
print(f"✅ Indexed {result['chunks_indexed']} chunks")

Batch Document Indexing

# Index multiple documents
files = ["doc1.pdf", "doc2.txt", "doc3.csv"]

result = indexer.load_and_index_multiple_documents(
    file_paths=files,
    chunk_size=1000,
    chunk_overlap=200
)

print(f"✅ Total: {result['total_chunks_indexed']} chunks")
print(f"📄 Files: {result['files_succeeded']}/{result['files_processed']}")

Using Namespaces

# Organize documents by topic
indexer.load_and_index_documents(
    file_path="product-docs.pdf",
    namespace="products"
)

indexer.load_and_index_documents(
    file_path="support-articles.pdf",
    namespace="support"
)
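
Documents stored in a namespace are only returned by queries that target that same namespace. A minimal retrieval sketch via the vector_store property, assuming the langchain-pinecone similarity_search signature, which accepts a namespace keyword:
# Search only within the "products" namespace
docs = indexer.vector_store.similarity_search(
    "pricing tiers",
    k=5,
    namespace="products"
)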

Error Handling

from docsuite.exceptions import UnsupportedFileTypeError

try:
    result = indexer.load_and_index_documents("document.pdf")
    
    if result['chunks_indexed'] == 0:
        print("⚠️  No chunks indexed")
        if result.get('chunks_skipped', 0) > 0:
            print("All chunks were duplicates")
    
except UnsupportedFileTypeError as e:
    print(f"❌ Unsupported file type: {e}")
except RuntimeError as e:
    print(f"❌ Indexing failed: {e}")
except Exception as e:
    print(f"❌ Unexpected error: {e}")

Custom Chunk Settings

# For short documents
result = indexer.load_and_index_documents(
    file_path="short-doc.txt",
    chunk_size=500,
    chunk_overlap=100
)

# For long documents
result = indexer.load_and_index_documents(
    file_path="long-doc.pdf",
    chunk_size=2000,
    chunk_overlap=400
)

Disabling Duplicate Prevention

# Allow duplicates (not recommended)
result = indexer.load_and_index_documents(
    file_path="document.pdf",
    prevent_duplicates=False
)

Integration with LangChat

DocumentIndexer is automatically used by LangChat’s load_and_index_documents() method:
from langchat import LangChat, LangChatConfig

config = LangChatConfig.from_env()
langchat = LangChat(config=config)

# Uses DocumentIndexer internally
result = langchat.load_and_index_documents("document.pdf")

Advanced Usage

Custom Embedding Models

# Use smaller, faster model
indexer = DocumentIndexer(
    pinecone_api_key="...",
    pinecone_index_name="...",
    openai_api_key="...",
    embedding_model="text-embedding-3-small"  # Faster
)

Direct Index Access

# Query index directly
results = indexer.index.query(
    vector=[0.0] * 3072,  # Dummy vector (3072 dimensions matches text-embedding-3-large)
    top_k=5,
    filter={"source_file": {"$eq": "document.pdf"}},
    namespace="my-namespace"
)
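
The dummy vector above only exercises the metadata filter. For a real semantic lookup, embed the query text first using the embeddings property:
# Query with a real embedding instead of a dummy vector
query_vector = indexer.embeddings.embed_query("refund policy")
results = indexer.index.query(
    vector=query_vector,
    top_k=5,
    namespace="my-namespace"
)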

Custom Vector Store Operations

# Use LangChain retriever
retriever = indexer.vector_store.as_retriever(
    # namespace is passed through to the Pinecone search, not as a metadata filter
    search_kwargs={"k": 10, "namespace": "my-namespace"}
)

# Retrieve documents (invoke supersedes the deprecated get_relevant_documents)
docs = retriever.invoke("query text")

Questions? Check the Document Indexing Guide for detailed examples and best practices!