# DocumentIndexer

> Standalone document indexing class for advanced use cases.

## Overview

`DocumentIndexer` is the underlying class that powers `LangChat.index()`. Use it directly only when you need to index documents outside of a `LangChat` instance, for example in a standalone indexing script that doesn't start the full chatbot. For most use cases, use `lc.index()` instead.

```python
from langchat.core.utils.document_indexer import DocumentIndexer
```

***

## Constructor

```python
DocumentIndexer(
    pinecone_api_key: str,
    pinecone_index_name: str,
    openai_api_key: str,
    embedding_model: str = "text-embedding-3-large",
)
```

* `pinecone_api_key`: Pinecone API key.
* `pinecone_index_name`: Pinecone index name.
* `openai_api_key`: OpenAI API key for creating embeddings.
* `embedding_model`: OpenAI embedding model.

***

## Methods

### `load_and_index_documents()`

Index a single file.

```python
def load_and_index_documents(
    self,
    file_path: str,
    chunk_size: int = 1000,
    chunk_overlap: int = 200,
    namespace: str | None = None,
    prevent_duplicates: bool = True,
) -> dict
```

* `file_path`: Path to the document file.
* `chunk_size`: Characters per chunk.
* `chunk_overlap`: Overlap between adjacent chunks.
* `namespace`: Pinecone namespace.
* `prevent_duplicates`: Skip chunks already in Pinecone (checked by content hash).

**Returns:** `dict` with `chunks_indexed`, `chunks_skipped`, and metadata.

***

### `load_and_index_multiple_documents()`

Index multiple files.

```python
def load_and_index_multiple_documents(
    self,
    file_paths: list[str],
    chunk_size: int = 1000,
    chunk_overlap: int = 200,
    namespace: str | None = None,
    prevent_duplicates: bool = True,
) -> dict
```

Same parameters as `load_and_index_documents()`, but accepts a list of file paths.
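To make `chunk_size`, `chunk_overlap`, and `prevent_duplicates` more concrete, here is a minimal sketch of character chunking with overlap and content-hash deduplication. It is not LangChat's implementation: the function names are hypothetical, the fixed-window splitting is a simplification, and the choice of SHA-256 is an assumption (the page only says chunks are "checked by content hash").

```python
import hashlib


def split_into_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Illustrative only: a real splitter may also respect sentence or
    paragraph boundaries instead of cutting at exact character offsets.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def content_hash(chunk: str) -> str:
    """Fingerprint a chunk by its content (SHA-256 is an assumption)."""
    return hashlib.sha256(chunk.encode("utf-8")).hexdigest()


def filter_new_chunks(chunks: list[str], seen_hashes: set[str]) -> tuple[list[str], int]:
    """Return the chunks whose hash is not yet in seen_hashes, plus the skip count.

    seen_hashes stands in for whatever record of already-indexed chunks
    the vector store keeps; it is updated in place.
    """
    new_chunks = []
    skipped = 0
    for chunk in chunks:
        digest = content_hash(chunk)
        if digest in seen_hashes:
            skipped += 1
            continue
        seen_hashes.add(digest)
        new_chunks.append(chunk)
    return new_chunks, skipped
```

With the default values, each window advances by 800 characters, so the last 200 characters of one chunk reappear at the start of the next. Running `filter_new_chunks` twice over the same chunks with the same `seen_hashes` set returns nothing new on the second pass, which is the behaviour `chunks_skipped` would report under this simplified model.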
***

## Standalone indexing script

Use `DocumentIndexer` directly when you want to index documents independently of the chatbot:

```python
# standalone_indexer.py
import os

from langchat.core.utils.document_indexer import DocumentIndexer
from dotenv import load_dotenv

load_dotenv()

indexer = DocumentIndexer(
    pinecone_api_key=os.environ["PINECONE_API_KEY"],
    pinecone_index_name="my-index",
    openai_api_key=os.environ["OPENAI_API_KEY"],
)

# Index multiple documents
result = indexer.load_and_index_multiple_documents(
    file_paths=[
        "docs/manual.pdf",
        "docs/faq.md",
        "docs/policies.txt",
    ],
    chunk_size=1000,
    chunk_overlap=200,
    namespace="main",
    prevent_duplicates=True,
)

print(f"Indexed: {result['chunks_indexed']}")
print(f"Skipped: {result['chunks_skipped']}")
```

`LangChat.index()` is a convenience wrapper around `DocumentIndexer` that reads credentials from environment variables automatically. Prefer it when you already have a `LangChat` instance.
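In a standalone script, the `file_paths` list is often built from a directory rather than hard-coded. The sketch below shows one way to do that with the standard library. `collect_file_paths` and `SUPPORTED_SUFFIXES` are hypothetical names, not part of LangChat, and the suffix set simply mirrors the file types used in the script above; which formats the indexer actually accepts is not stated on this page.

```python
from pathlib import Path

# Assumption: these match the file types shown in the standalone script.
SUPPORTED_SUFFIXES = frozenset({".pdf", ".md", ".txt"})


def collect_file_paths(root: str, suffixes: frozenset[str] = SUPPORTED_SUFFIXES) -> list[str]:
    """Recursively gather files under root whose extension is in suffixes.

    Hypothetical helper for building the file_paths argument; extensions
    are compared case-insensitively and results are sorted for stable runs.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    return sorted(
        str(path)
        for path in root_path.rglob("*")
        if path.is_file() and path.suffix.lower() in suffixes
    )
```

The returned list could then be passed as `file_paths=collect_file_paths("docs")` to `load_and_index_multiple_documents()`.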