Overview
Use POST /v1/documents/ingest when parsing has already happened in your own pipeline.
You send vendor output (unstructured, llamaparse, or canonical) and OkraPDF handles normalization, hydration, lifecycle processing, and document endpoints.
Request
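A minimal sketch of building the request in Python, using only the standard library. The endpoint path and the vendor field come from this page. The base URL, the bearer-token header, and the payload field name are assumptions made for illustration and may not match the real request schema.

```python
import json
import urllib.request

# Assumptions (not taken from this page): the host, the Authorization
# scheme, and the "payload" field name are placeholders.
BASE_URL = "https://api.example.com"  # hypothetical host


def build_ingest_request(vendor, payload, api_key):
    """Build (but do not send) a POST request to /v1/documents/ingest."""
    body = {"payload": payload}
    if vendor is not None:
        # Leaving vendor out asks OkraPDF to auto-detect from the payload shape.
        body["vendor"] = vendor
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/documents/ingest",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Unstructured output: an array of elements with type and metadata.page_number.
elements = [
    {"type": "Title", "text": "Quarterly report", "metadata": {"page_number": 1}}
]
request = build_ingest_request("unstructured", elements, api_key="YOUR_API_KEY")
```

Passing the result to urllib.request.urlopen would send it. The sketch stops short of that so it can be inspected without network access.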
Supported connector IDs
| vendor value | Expected shape |
|---|---|
| unstructured | array of Unstructured elements (type, metadata.page_number) |
| llamaparse | object with pages[].items[] entries |
| canonical | object with canonical pages[].blocks[] |
If vendor is omitted, OkraPDF tries to auto-detect the connector from the payload shape.
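To make the shape-based distinction concrete, here is a client-side sketch that guesses a connector ID from the shapes listed in the table. It is an illustration only: guess_vendor is a hypothetical helper, and OkraPDF's actual detection logic is not documented here and may differ.

```python
def guess_vendor(payload):
    """Return a connector ID guessed from the payload shape, or None.

    Hypothetical helper; mirrors the documented shapes, not OkraPDF's code.
    """
    if isinstance(payload, list):
        # unstructured: a non-empty array of elements, each with a "type" field.
        if payload and all(isinstance(el, dict) and "type" in el for el in payload):
            return "unstructured"
        return None
    if isinstance(payload, dict):
        pages = payload.get("pages")
        if isinstance(pages, list) and pages and isinstance(pages[0], dict):
            if "items" in pages[0]:
                return "llamaparse"  # pages[].items[]
            if "blocks" in pages[0]:
                return "canonical"  # pages[].blocks[]
    return None
```

A pre-flight check like this lets a pipeline set vendor explicitly, so that a shape problem surfaces before the request is sent.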
Response model
The endpoint returns 202 Accepted and starts lifecycle processing.
What happens after ingest
- Vendor payload is normalized to Okra’s canonical parse shape.
- Parsed nodes are hydrated into the document graph.
- Lifecycle jobs run (snapshot/materialization/projection workflow).
- Standard document surfaces become available (pages, chat/completion, output profiles, URL builder).
Failure modes
- Unknown payload shape without vendor: 422 with supported connector list.
- Invalid payload for chosen connector: 422 normalization error.
- Workflow startup failure: 500 with error payload.
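A client might branch on these status codes roughly as follows. The status codes are the ones listed on this page. The function name, the action labels, and the suggestion to retry on 500 are assumptions made for this sketch, not documented OkraPDF behavior.

```python
def interpret_ingest_status(status_code):
    """Map an ingest response status code to a suggested client action."""
    if status_code == 202:
        # Accepted: lifecycle processing has started.
        return "accepted"
    if status_code == 422:
        # Unknown shape without vendor, or a payload the chosen connector
        # could not normalize. Fix the payload or set vendor before resending.
        return "fix_payload"
    if status_code == 500:
        # Workflow startup failure. Retrying later is an assumption here.
        return "retry_later"
    return "unexpected"
```

Because both 422 cases share one status code, a caller would need to read the response body to tell an unknown shape from a normalization error.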
When to use this endpoint
Use the Ingest API when you:
- already run extraction with external vendors,
- want OkraPDF delivery + policy + output layers,
- need a stable doc-... lifecycle without re-running OCR in Okra.