Turn documents into structured data.
Multi-Media Support
Extract from documents, images, audio, and video. The input type is detected automatically and routed to the matching handler.
Any LLM Provider
Built on pydantic-ai. Use OpenAI, Anthropic, Google, or any compatible provider with a single model string.
Type Safe
Leverage Pydantic schemas to ensure extracted data is validated, typed, and clean every time.
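To make the type-safety claim concrete, here is a minimal sketch of what Pydantic validation does with a good and a bad payload. It uses only Pydantic itself (v2 API), not open_xtract. The `LineItem` fields and the sample data are assumptions made up for illustration.

```python
from pydantic import BaseModel, ValidationError


class LineItem(BaseModel):
    # Hypothetical fields, chosen only for illustration.
    description: str
    price: float


class Receipt(BaseModel):
    vendor: str
    items: list[LineItem]
    total: float


# A well-formed payload, such as an LLM might return, is coerced into typed objects.
good = Receipt.model_validate(
    {
        "vendor": "Corner Cafe",
        "items": [{"description": "Espresso", "price": "3.50"}],
        "total": 3.5,
    }
)

# A malformed payload is rejected instead of silently passing through.
try:
    Receipt.model_validate({"vendor": "Corner Cafe", "items": [], "total": "n/a"})
    rejected = False
except ValidationError:
    rejected = True
```

In Pydantic's default (lax) mode the numeric string `"3.50"` is converted to a float, while `"n/a"` cannot be converted and raises `ValidationError`.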
Built for developers.
Stop writing regex. Just define your schema and let the LLM do the heavy lifting.
- Works with documents, images, audio, and video
- Structured error handling with typed exceptions
- Optional logfire instrumentation for tracing
receipt_parser.py
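from pydantic import BaseModel

from open_xtract import extract, UrlFetchError, ModelError

# Illustrative fields; the original snippet references LineItem without defining it.
class LineItem(BaseModel):
    description: str
    price: float

class Receipt(BaseModel):
    vendor: str
    items: list[LineItem]
    total: float

try:
    result = extract(
        schema=Receipt,
        model="anthropic:claude-sonnet-4-5",
        url="https://example.com/receipt.jpg",
        instructions="Extract receipt details",
    )
    print(f"Vendor: {result.vendor}, Total: ${result.total}")
except UrlFetchError as e:
    print(f"Failed to fetch: {e}")
except ModelError as e:
    # Assumed: ModelError covers failures from the LLM call itself.
    print(f"Model call failed: {e}")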