Ingest any document format into your knowledge base with the Docling Reader

The DoclingReader provides a single interface for the document formats an AI agent typically encounters: PDFs, Word files, PowerPoint decks, Excel spreadsheets, images, and even audio and video files. Everything goes through the same reader, with no format-specific ingestion logic and no sprawling set of dependencies. Built on IBM Research's open-source Docling library, it preserves document structure (headings, tables, hierarchies, formulas, and layout) during extraction, so context is not lost before content reaches your vector store.

Details:

Supports PDFs, .docx, .pptx, .xlsx, markup files, images (JPEG, PNG), and audio/video (MP4 and others via FFmpeg and Whisper)
Structure-preserving extraction keeps tables, headings, and hierarchies intact for higher-quality RAG retrieval
Outputs flow directly into Agno's chunking pipeline with no additional preprocessing required
Configurable output_format supports Markdown (default), plain text, JSON, HTML, DocTags, and VTT for audio/video transcripts
Load from local paths or directly from URLs with the same interface
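
As a rough illustration of the last two points, the sketch below picks an output_format for a given source, whether that source is a local path or a URL. It uses only the Python standard library and does not call the reader itself. The helper names (is_url, pick_output_format, reader_options), the extension set, and the lowercase format strings "markdown" and "vtt" are assumptions made for this example, not part of the Docling Reader API, so check the docs for the exact values the reader accepts.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical extension set: the notes above name MP4 explicitly and only
# say "and others", so the remaining entries are illustrative guesses.
AUDIO_VIDEO_EXTENSIONS = {".mp4", ".mp3", ".wav"}


def is_url(source: str) -> bool:
    """Return True when the source looks like an http(s) URL, not a local path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def pick_output_format(source: str) -> str:
    """Suggest VTT for audio/video sources and Markdown (the default) otherwise."""
    # For URLs, inspect only the path component so query strings are ignored.
    path = urlparse(source).path if is_url(source) else source
    suffix = PurePosixPath(path).suffix.lower()
    return "vtt" if suffix in AUDIO_VIDEO_EXTENSIONS else "markdown"


def reader_options(source: str) -> dict:
    """Build keyword arguments one might hand to the reader (assumed shape)."""
    return {"output_format": pick_output_format(source)}


if __name__ == "__main__":
    for src in ("reports/q3.pdf", "https://example.com/talk.mp4?dl=1"):
        print(src, reader_options(src))
```

Keeping the format choice in one small function means the same call site can serve both documents and recordings, which mirrors the single-interface idea described above.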

See Docling Reader docs.