POST /document/{documentId}/structured-output
Structured Output
curl --request POST \
  --url https://api.example.com/document/{documentId}/structured-output \
  --header 'Content-Type: application/json' \
  --data '
{
  "query": "<string>",
  "schema": {},
  "timeoutMs": 123
}
'
Extract structured data from a completed document by providing a JSON Schema and a natural-language query.

Request

documentId (string, required): The document ID (e.g., ocr-abc123).
query (string, required): Natural-language prompt describing what to extract.
schema (object, required): JSON Schema defining the expected output shape.
timeoutMs (number, optional): Server-side timeout in milliseconds.
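
As a rough illustration of how the request fields fit together, the sketch below builds the endpoint path and JSON body in TypeScript. The helper and type names (StructuredOutputBody, buildStructuredOutputBody, structuredOutputPath) are hypothetical and are not taken from any SDK; the only validation shown is an assumption about what a client might want to check before sending.

```typescript
// Hypothetical request body type mirroring the documented fields.
interface StructuredOutputBody {
  query: string;
  schema: Record<string, unknown>;
  timeoutMs?: number;
}

// Build the JSON body. The empty-query check is a client-side
// assumption, not documented server behavior.
function buildStructuredOutputBody(
  query: string,
  schema: Record<string, unknown>,
  timeoutMs?: number,
): StructuredOutputBody {
  if (query.trim() === "") {
    throw new Error("query is required");
  }
  const body: StructuredOutputBody = { query, schema };
  // timeoutMs is optional, so include it only when the caller sets it.
  if (timeoutMs !== undefined) {
    body.timeoutMs = timeoutMs;
  }
  return body;
}

// Build the endpoint path; the document ID is URL-encoded defensively.
function structuredOutputPath(documentId: string): string {
  return `/document/${encodeURIComponent(documentId)}/structured-output`;
}
```

The body returned here can be passed through JSON.stringify and sent with any HTTP client.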

Authentication

Include your API key via the x-api-key header:
x-api-key: okra_YOUR_KEY
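
A minimal sketch of assembling the headers in TypeScript follows. The function name buildAuthHeaders is hypothetical; the header names come from the curl example on this page.

```typescript
// Build the headers for a structured-output request.
// buildAuthHeaders is an illustrative helper, not an SDK function.
function buildAuthHeaders(apiKey: string): Record<string, string> {
  if (apiKey === "") {
    throw new Error("API key is required");
  }
  return {
    "x-api-key": apiKey,
    "Content-Type": "application/json",
  };
}
```

The returned object can be passed as the headers option of most HTTP clients.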

Example

curl -X POST https://api.okrapdf.com/document/ocr-abc123/structured-output \
  -H "x-api-key: okra_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Extract the company name and total revenue",
    "schema": {
      "type": "object",
      "properties": {
        "companyName": { "type": "string" },
        "totalRevenue": { "type": "string" }
      },
      "required": ["companyName", "totalRevenue"]
    }
  }'

Response (200)

{
  "data": {
    "companyName": "Acme Corp",
    "totalRevenue": "$4.2B"
  },
  "meta": {
    "confidence": 1,
    "model": "accounts/fireworks/models/kimi-k2p5",
    "durationMs": 13200,
    "citations": []
  }
}
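
To make the response shape concrete, here is a sketch of a TypeScript type and a defensive parser for it. The type and function names are hypothetical, and the field types are inferred only from the sample response above (for example, the element type of citations is unknown because the sample array is empty).

```typescript
// Hypothetical types inferred from the sample 200 response.
interface StructuredOutputMeta {
  confidence: number;
  model: string;
  durationMs: number;
  citations: unknown[]; // element shape is not shown in the sample
}

interface StructuredOutputResponse {
  data: unknown; // shape is determined by the schema sent in the request
  meta: StructuredOutputMeta;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Parse a response body and check that the fields seen in the sample exist.
function parseStructuredOutput(json: string): StructuredOutputResponse {
  const parsed: unknown = JSON.parse(json);
  if (!isRecord(parsed) || !("data" in parsed)) {
    throw new Error("unexpected response shape");
  }
  const data = parsed.data;
  const meta = parsed.meta;
  if (!isRecord(meta)) {
    throw new Error("missing meta object");
  }
  const { confidence, model, durationMs, citations } = meta;
  if (
    typeof confidence !== "number" ||
    typeof model !== "string" ||
    typeof durationMs !== "number" ||
    !Array.isArray(citations)
  ) {
    throw new Error("unexpected meta shape");
  }
  return { data, meta: { confidence, model, durationMs, citations } };
}
```

The data field is left as unknown here; validating it against the same JSON Schema sent in the request is left to the caller.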

Error codes

Code | Status | Description
SCHEMA_VALIDATION_FAILED | 422 | Request body missing required fields or schema is invalid.
EXTRACTION_FAILED | 500 | LLM extraction failed or returned unparseable output.
TIMEOUT | 504 | Server-side extraction timed out.
DOCUMENT_NOT_FOUND | 404 | No document found with the given ID.
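
One way a client might act on these codes is sketched below in TypeScript. The code-to-status mapping comes from the error list above; the retry policy (retry on 5xx only) is an assumption for illustration, and the names ErrorCode, STATUS_BY_CODE, isErrorCode, and isRetryable are hypothetical. The shape of the error response body is not documented here, so the sketch works on the code string alone.

```typescript
// Error codes and HTTP statuses as listed in the error codes section.
type ErrorCode =
  | "SCHEMA_VALIDATION_FAILED"
  | "EXTRACTION_FAILED"
  | "TIMEOUT"
  | "DOCUMENT_NOT_FOUND";

const STATUS_BY_CODE: Record<ErrorCode, number> = {
  SCHEMA_VALIDATION_FAILED: 422,
  EXTRACTION_FAILED: 500,
  TIMEOUT: 504,
  DOCUMENT_NOT_FOUND: 404,
};

// Narrow an arbitrary string to a known error code.
function isErrorCode(value: string): value is ErrorCode {
  return Object.prototype.hasOwnProperty.call(STATUS_BY_CODE, value);
}

// Assumed policy: server-side failures (5xx) may succeed on retry,
// while client-side failures (4xx) need a changed request first.
function isRetryable(code: ErrorCode): boolean {
  return STATUS_BY_CODE[code] >= 500;
}
```

Under this policy a TIMEOUT could be retried, possibly with a larger timeoutMs, while SCHEMA_VALIDATION_FAILED and DOCUMENT_NOT_FOUND call for fixing the request.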

Using the TypeScript SDK

For a typed client with Zod validation, see the @okrapdf/runtime SDK.