Quick Start Guide

Get started with DocDigitizer's AI-powered document extraction API in minutes. This guide walks you through your first API call to extract structured data from documents.

Prerequisites

Before you begin, ensure you have the following:

API Key – Your unique API key provided by DocDigitizer. If you don’t have one, contact our sales team.
Context ID – A UUID that identifies your processing context/configuration. This is provided during your account setup.
A PDF document – The document you want to process, such as an invoice, contract, or ID card.

Authentication

All API requests require authentication using an API key. Include your API key in the request header:

Header	Value	Description
`x-api-key`	Your API Key	Required. Your unique API key for authentication.

Important: Keep your API key secure. Never expose it in client-side code or public repositories.
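
One way to keep the key out of source code is to read it from an environment variable at runtime. The sketch below is illustrative only: the variable name DOCDIGITIZER_API_KEY and the helper build_auth_headers are assumptions made for this example, not names defined by DocDigitizer.

```python
import os


def build_auth_headers(env_var: str = "DOCDIGITIZER_API_KEY") -> dict:
    """Return request headers with the API key read from the environment.

    The variable name DOCDIGITIZER_API_KEY is a hypothetical convention
    chosen for this sketch, not something the API requires.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early instead of sending a request without credentials.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"x-api-key": api_key}
```

The returned dictionary can be passed as the headers argument of an HTTP client call, so the key never appears in the script itself.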

Your First API Call

API Endpoint

POST https://apix.docdigitizer.com/sync

Request Format

The API accepts multipart/form-data requests with the following parameters:

Parameter	Type	Required	Description
`files`	File	Yes	The PDF document to process.
`id`	UUID	Yes	A unique identifier for this document/request. Generate a new UUID for each request.
`contextId`	UUID	Yes	Your context identifier that determines the processing pipeline and schema configuration.
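
Because both `id` and `contextId` must be UUIDs, it can help to validate them locally before uploading a file. The following is a minimal sketch using Python's standard uuid module; the helper name build_form_fields is hypothetical, and whether the API also accepts other UUID spellings (uppercase, no hyphens) is not stated in this guide, so the sketch normalizes to the canonical lowercase form.

```python
import uuid
from typing import Optional


def build_form_fields(context_id: str, document_id: Optional[str] = None) -> dict:
    """Build the non-file form fields, validating both UUIDs locally."""
    # uuid.UUID raises ValueError if the string is not a well-formed UUID.
    context = uuid.UUID(context_id)
    # Generate a fresh ID for each request unless the caller supplies one.
    document = uuid.UUID(document_id) if document_id else uuid.uuid4()
    return {"id": str(document), "contextId": str(context)}
```

Calling build_form_fields with a placeholder such as "YOUR_CONTEXT_ID" raises ValueError, which catches an unreplaced placeholder before any request is sent.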

Example: cURL

curl -X POST https://apix.docdigitizer.com/sync \
  -H "x-api-key: YOUR_API_KEY" \
  -F "files=@/path/to/your/document.pdf" \
  -F "id=550e8400-e29b-41d4-a716-446655440000" \
  -F "contextId=YOUR_CONTEXT_ID"

Example: PowerShell

$headers = @{
    "x-api-key" = "YOUR_API_KEY"
}

# Generate a new UUID for each request
$form = @{
    files = Get-Item -Path "C:\path\to\your\document.pdf"
    id = [guid]::NewGuid().ToString()
    contextId = "YOUR_CONTEXT_ID"
}

# The -Form parameter requires PowerShell 6.1 or later
$response = Invoke-RestMethod -Uri "https://apix.docdigitizer.com/sync" `
    -Method Post `
    -Headers $headers `
    -Form $form

$response | ConvertTo-Json -Depth 10

Example: Python

import requests
import uuid

url = "https://apix.docdigitizer.com/sync"

headers = {
    "x-api-key": "YOUR_API_KEY"
}

# Generate a unique document ID
document_id = str(uuid.uuid4())

data = {
    "id": document_id,
    "contextId": "YOUR_CONTEXT_ID"
}

# Open the file in a with-block so the handle is closed after the upload
with open("/path/to/your/document.pdf", "rb") as pdf:
    response = requests.post(url, headers=headers, files={"files": pdf}, data=data)

print(response.json())

Example: JavaScript (Node.js)

const FormData = require('form-data');
const fs = require('fs');
// require() works with node-fetch v2; v3 is ESM-only and must be imported instead
const fetch = require('node-fetch');
const { v4: uuidv4 } = require('uuid');

const form = new FormData();
form.append('files', fs.createReadStream('/path/to/your/document.pdf'));
form.append('id', uuidv4());
form.append('contextId', 'YOUR_CONTEXT_ID');

fetch('https://apix.docdigitizer.com/sync', {
    method: 'POST',
    headers: {
        'x-api-key': 'YOUR_API_KEY'
    },
    body: form
})
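.then(response => response.json())
.then(data => console.log(data))
// Log network or JSON parsing failures instead of leaving the rejection unhandled
.catch(error => console.error(error));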

Example: C# (.NET)

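// Place these using directives at the top of the file
using System.Net.Http;
using System.Net.Http.Headers; // MediaTypeHeaderValue

using var client = new HttpClient();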
using var form = new MultipartFormDataContent();

// Add API key header
client.DefaultRequestHeaders.Add("x-api-key", "YOUR_API_KEY");

// Add file
var fileContent = new ByteArrayContent(File.ReadAllBytes(@"C:\path\to\your\document.pdf"));
fileContent.Headers.ContentType = new MediaTypeHeaderValue("application/pdf");
form.Add(fileContent, "files", "document.pdf");

// Add parameters
form.Add(new StringContent(Guid.NewGuid().ToString()), "id");
form.Add(new StringContent("YOUR_CONTEXT_ID"), "contextId");

// Send request
var response = await client.PostAsync("https://apix.docdigitizer.com/sync", form);
var result = await response.Content.ReadAsStringAsync();
Console.WriteLine(result);

Understanding the Response

A successful response returns a JSON object containing the extracted data from your document.

Response Structure

{
    "StateText": "COMPLETED",
    "TraceId": "ABC1234",
    "NumberPages": 2,
    "Output": [
        {
            "docType": "Invoice",
            "country": "PT",
            "pages": [1, 2],
            "schema": "Invoice_PT.json",
            "extraction": {
                "invoiceNumber": "INV-2024-001",
                "invoiceDate": "2024-01-15",
                "totalAmount": 1250.00,
                "vendorName": "Example Corp",
                "vendorTaxId": "123456789",
                ...
            }
        }
    ]
}

Response Fields

Field	Type	Description
`StateText`	String	Processing status: COMPLETED, PROCESSING, or ERROR.
`TraceId`	String	Unique trace identifier for debugging and support requests.
`NumberPages`	Integer	Number of pages in the processed document.
`Output`	Array	Array of extraction results. A PDF that contains several documents may produce multiple entries.

Extraction Object Fields

Each entry in the `Output` array contains the following fields:

Field	Type	Description
`docType`	String	The detected document type (Invoice, Contract, CitizenCard, etc.).
`country`	String	ISO country code for the document (for example, PT).
`pages`	Array	Page numbers where this document was found.
`schema`	String	The schema used for extraction.
`extraction`	Object	The extracted field values. Structure varies by document type.
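
To illustrate how these fields fit together, the sketch below walks the `Output` array of an already-parsed response and pulls out the document type, pages, and extracted values for each entry. The helper name summarize_output is hypothetical, and the sample payload is a trimmed copy of the response structure shown above; real `extraction` contents depend on your document type and schema.

```python
def summarize_output(response: dict) -> list:
    """Return one (docType, pages, extraction) tuple per entry in Output."""
    results = []
    # A missing Output key (for example, in an error response) yields no rows.
    for entry in response.get("Output", []):
        results.append(
            (entry.get("docType"), entry.get("pages", []), entry.get("extraction", {}))
        )
    return results


# Sample payload trimmed from the response structure shown above.
sample = {
    "StateText": "COMPLETED",
    "TraceId": "ABC1234",
    "NumberPages": 2,
    "Output": [
        {
            "docType": "Invoice",
            "country": "PT",
            "pages": [1, 2],
            "schema": "Invoice_PT.json",
            "extraction": {"invoiceNumber": "INV-2024-001", "totalAmount": 1250.00},
        }
    ],
}

for doc_type, pages, fields in summarize_output(sample):
    print(doc_type, pages, fields["invoiceNumber"])
# prints: Invoice [1, 2] INV-2024-001
```

Using .get with defaults keeps the loop from failing when an entry omits a field, which is a defensive choice for this sketch rather than documented API behavior.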

Error Response

If an error occurs, `StateText` is set to ERROR and the `Messages` array describes what went wrong:

{
    "StateText": "ERROR",
    "TraceId": "XYZ9876",
    "Messages": [
        "Invalid file format. Only PDF files are accepted."
    ]
}
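
Client code can check `StateText` before reading any results. The following is a minimal sketch under stated assumptions: the exception class DocDigitizerError and the helper ensure_completed are hypothetical names, and this guide does not say which HTTP status code accompanies an error, so the check relies only on the JSON body.

```python
class DocDigitizerError(Exception):
    """Hypothetical exception type carrying the TraceId and Messages."""

    def __init__(self, trace_id, messages):
        self.trace_id = trace_id
        self.messages = list(messages)
        super().__init__(f"TraceId {trace_id}: " + "; ".join(self.messages))


def ensure_completed(response: dict) -> dict:
    """Return the response if StateText is COMPLETED, otherwise raise."""
    state = response.get("StateText")
    if state == "ERROR":
        raise DocDigitizerError(response.get("TraceId"), response.get("Messages", []))
    if state != "COMPLETED":
        # For example PROCESSING; how to follow up is not covered in this guide.
        raise DocDigitizerError(response.get("TraceId"), [f"Unexpected state: {state}"])
    return response
```

Keeping the `TraceId` on the exception makes it easy to include in logs and support requests.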