OpenXtract
Extract structured data from any document

Turn anything into structured data

An open-source toolkit for extracting clean, structured data from PDFs, images, and text.
Inputs are routed automatically, with minimal setup.

Install
uv add open-xtract

Features

Auto-routing

A single method automatically detects whether the input is a PDF, an image, a URL, or raw text.
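
To make the auto-routing idea concrete, here is a minimal sketch of how one entry point might classify its input. This is not OpenXtract's implementation: the function name `detect_input_kind`, the `IMAGE_SUFFIXES` set, and the detection rules (file signatures for bytes, URL scheme and file extension for strings) are all assumptions chosen for illustration.

```python
from pathlib import PurePosixPath
from typing import Union
from urllib.parse import urlparse

# Hypothetical set of extensions treated as images in this sketch.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def detect_input_kind(source: Union[str, bytes]) -> str:
    """Hypothetical router: classify input as 'pdf', 'image', 'url', or 'text'."""
    if isinstance(source, bytes):
        # Raw bytes: rely on well-known file signatures (magic numbers).
        if source.startswith(b"%PDF-"):
            return "pdf"
        if source.startswith(b"\x89PNG\r\n\x1a\n") or source.startswith(b"\xff\xd8\xff"):
            return "image"  # PNG or JPEG signature
        return "text"

    stripped = source.strip()
    parsed = urlparse(stripped)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"

    # A single-line string may be a file path; classify it by extension.
    if "\n" not in stripped:
        suffix = PurePosixPath(stripped).suffix.lower()
        if suffix == ".pdf":
            return "pdf"
        if suffix in IMAGE_SUFFIXES:
            return "image"

    # Anything else is treated as raw text.
    return "text"
```

With a router like this, the caller passes whatever it has (bytes, a path, a URL, or plain text) and the dispatcher picks the matching extraction path. A real implementation would likely need more signatures and a fallback for ambiguous strings.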

Multi-provider

Supports Claude, GPT, and Gemini, with automatic provider detection.
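
The page does not say how provider detection works. One plausible approach, sketched below purely as an assumption, is to infer the provider from the model name's prefix. The function `detect_provider`, the `PROVIDER_PREFIXES` table, and the provider labels are hypothetical and are not taken from OpenXtract.

```python
# Hypothetical mapping from model-name prefix to provider label.
PROVIDER_PREFIXES = {
    "claude": "anthropic",
    "gpt": "openai",
    "gemini": "google",
}


def detect_provider(model: str) -> str:
    """Infer a provider from a model name such as 'gpt-4o' (illustrative only)."""
    name = model.strip().lower()
    for prefix, provider in PROVIDER_PREFIXES.items():
        if name.startswith(prefix):
            return provider
    # Fail loudly instead of guessing when the name matches no known prefix.
    raise ValueError(f"Cannot infer provider from model name: {model!r}")
```

Raising on unknown names keeps a typo in the model string from silently selecting the wrong backend.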

Structured output

Pydantic schemas define the output shape, so extracted data comes back clean and typed.
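
As an illustration of what a Pydantic schema contributes, the sketch below validates a raw dictionary (standing in for model output) against a schema. The `Invoice` and `LineItem` models and the sample data are invented for this example. Only standard Pydantic behavior is shown, not OpenXtract's own API.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class LineItem(BaseModel):
    description: str
    amount: float


class Invoice(BaseModel):
    invoice_number: str
    total: float
    items: List[LineItem]


# Stand-in for untyped extraction output: numbers arrive as strings.
raw = {
    "invoice_number": "INV-001",
    "total": "42.50",
    "items": [{"description": "Widget", "amount": "42.50"}],
}

# Validation coerces numeric strings to floats and builds nested models.
invoice = Invoice(**raw)

# Missing required fields are rejected instead of passing through silently.
try:
    Invoice(invoice_number="INV-002")
    missing_fields_rejected = False
except ValidationError:
    missing_fields_rejected = True
```

The schema acts as a contract: downstream code can rely on `invoice.total` being a float and `invoice.items` being a list of `LineItem` objects, and malformed output fails at the boundary.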

Examples

Python