Quick start guide
Get started with DocDigitizer's AI-powered document extraction API in minutes. This guide walks you through your first API call to extract structured data from documents.
Prerequisites
Before you begin, ensure you have the following:
- API Key – Your unique API key provided by DocDigitizer. If you don’t have one, contact our sales team.
- Context ID – A UUID that identifies your processing context/configuration. This is provided during your account setup.
- A PDF document – The document you want to process (invoices, contracts, ID cards, etc.)
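A quick local check can catch a non-PDF file before it is uploaded. The sketch below is a heuristic and not part of the DocDigitizer API: it looks for the `%PDF-` marker that PDF files carry near the start of the file. The function names `has_pdf_header` and `is_probably_pdf` are hypothetical.

```python
PDF_MARKER = b"%PDF-"


def has_pdf_header(head: bytes) -> bool:
    """Return True if the PDF marker appears in the first 1024 bytes."""
    return PDF_MARKER in head[:1024]


def is_probably_pdf(path: str) -> bool:
    """Read only the start of the file and look for the PDF marker."""
    with open(path, "rb") as handle:
        return has_pdf_header(handle.read(1024))
```

A file that passes this check can still be rejected by the API, so treat it as an early warning rather than a guarantee.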
Authentication
All API requests require authentication using an API key. Include your API key in the request header:
| Header | Value | Description |
|---|---|---|
| x-api-key | Your API Key | Required. Your unique API key for authentication. |
Important: Keep your API key secure. Never expose it in client-side code or public repositories.
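One common way to keep the key out of source code is to read it from an environment variable at runtime. The sketch below assumes a variable named `DOCDIGITIZER_API_KEY`; that name and the helper functions are illustrative choices, not something DocDigitizer prescribes.

```python
import os


def load_api_key(env_var: str = "DOCDIGITIZER_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is missing."""
    # The variable name is an assumption made for this example.
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before calling the API")
    return api_key


def auth_headers(api_key: str) -> dict:
    """Build the request headers using the x-api-key header described above."""
    return {"x-api-key": api_key}
```

The returned dictionary can be passed as the `headers` argument of an HTTP client call.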
Your First API Call
API Endpoint
POST https://apix.docdigitizer.com/sync
Request Format
The API accepts multipart/form-data requests with the following parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| files | File | Yes | The PDF document to process. |
| id | UUID | Yes | A unique identifier for this document/request. Generate a new UUID for each request. |
| contextId | UUID | Yes | Your context identifier that determines the processing pipeline and schema configuration. |
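Because both `id` and `contextId` must be UUIDs, it can help to validate them before sending the request. The following is a minimal sketch using Python's standard `uuid` module; `build_form_fields` is a hypothetical helper, not part of any DocDigitizer SDK.

```python
import uuid
from typing import Optional


def build_form_fields(context_id: str, document_id: Optional[str] = None) -> dict:
    """Return the non-file form fields, validating that both values are UUIDs."""
    # uuid.UUID raises ValueError when the string is not a well-formed UUID.
    context = uuid.UUID(context_id)
    # Generate a fresh id for each request unless the caller supplies one.
    document = uuid.UUID(document_id) if document_id else uuid.uuid4()
    return {"id": str(document), "contextId": str(context)}
```

A placeholder such as `YOUR_CONTEXT_ID` left in by mistake raises `ValueError` here instead of producing a failed API call.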
Example: cURL
curl -X POST https://apix.docdigitizer.com/sync \
  -H "x-api-key: YOUR_API_KEY" \
  -F "files=@/path/to/your/document.pdf" \
  -F "id=550e8400-e29b-41d4-a716-446655440000" \
  -F "contextId=YOUR_CONTEXT_ID"
Example: PowerShell
# Note: the -Form parameter requires PowerShell 6.1 or later
$headers = @{
    "x-api-key" = "YOUR_API_KEY"
}

$form = @{
    files     = Get-Item -Path "C:\path\to\your\document.pdf"
    id        = "550e8400-e29b-41d4-a716-446655440000"
    contextId = "YOUR_CONTEXT_ID"
}

$response = Invoke-RestMethod -Uri "https://apix.docdigitizer.com/sync" `
    -Method Post `
    -Headers $headers `
    -Form $form

$response | ConvertTo-Json -Depth 10
Example: Python
import requests
import uuid

url = "https://apix.docdigitizer.com/sync"
headers = {
    "x-api-key": "YOUR_API_KEY"
}

# Generate a unique document ID
document_id = str(uuid.uuid4())

data = {
    "id": document_id,
    "contextId": "YOUR_CONTEXT_ID"
}

# Open the PDF in a context manager so the file handle is closed after the upload
with open("/path/to/your/document.pdf", "rb") as pdf:
    response = requests.post(url, headers=headers, files={"files": pdf}, data=data)

print(response.json())
Example: JavaScript (Node.js)
// Note: require('node-fetch') works with node-fetch v2; v3 is ESM-only
const FormData = require('form-data');
const fs = require('fs');
const fetch = require('node-fetch');
const { v4: uuidv4 } = require('uuid');

const form = new FormData();
form.append('files', fs.createReadStream('/path/to/your/document.pdf'));
form.append('id', uuidv4());
form.append('contextId', 'YOUR_CONTEXT_ID');

fetch('https://apix.docdigitizer.com/sync', {
  method: 'POST',
  headers: {
    'x-api-key': 'YOUR_API_KEY'
  },
  body: form
})
  .then(response => response.json())
  .then(data => console.log(data))
  .catch(error => console.error(error));
Example: C# (.NET)
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

using var client = new HttpClient();
using var form = new MultipartFormDataContent();

// Add API key header
client.DefaultRequestHeaders.Add("x-api-key", "YOUR_API_KEY");

// Add file
var fileContent = new ByteArrayContent(File.ReadAllBytes(@"C:\path\to\your\document.pdf"));
fileContent.Headers.ContentType = new MediaTypeHeaderValue("application/pdf");
form.Add(fileContent, "files", "document.pdf");

// Add parameters
form.Add(new StringContent(Guid.NewGuid().ToString()), "id");
form.Add(new StringContent("YOUR_CONTEXT_ID"), "contextId");

// Send request
var response = await client.PostAsync("https://apix.docdigitizer.com/sync", form);
var result = await response.Content.ReadAsStringAsync();
Console.WriteLine(result);
Understanding the Response
A successful response returns a JSON object containing the extracted data from your document.
Response Structure
{
  "StateText": "COMPLETED",
  "TraceId": "ABC1234",
  "NumberPages": 2,
  "Output": [
    {
      "docType": "Invoice",
      "country": "PT",
      "pages": [1, 2],
      "schema": "Invoice_PT.json",
      "extraction": {
        "invoiceNumber": "INV-2024-001",
        "invoiceDate": "2024-01-15",
        "totalAmount": 1250.00,
        "vendorName": "Example Corp",
        "vendorTaxId": "123456789",
        ...
      }
    }
  ]
}
Response Fields
| Field | Type | Description |
|---|---|---|
| StateText | String | Processing status: COMPLETED, PROCESSING, or ERROR. |
| TraceId | String | Unique trace identifier for debugging and support requests. |
| NumberPages | Integer | Number of pages in the processed document. |
| Output | Array | Array of extraction results. Multi-document PDFs may contain multiple entries. |
Extraction Object Fields
| Field | Type | Description |
|---|---|---|
| docType | String | The detected document type (Invoice, Contract, CitizenCard, etc.). |
| country | String | ISO country code for the document. |
| pages | Array | Page numbers where this document was found. |
| schema | String | The schema used for extraction. |
| extraction | Object | The extracted field values. Structure varies by document type. |
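Since a single PDF may yield several entries in `Output`, client code usually loops over the array rather than reading only the first element. The sketch below walks a parsed response shaped like the example above; `summarize_output` is a hypothetical helper, and the sample payload only mirrors the documented fields.

```python
def summarize_output(payload: dict) -> list:
    """Return one summary per detected document in the response."""
    summaries = []
    for entry in payload.get("Output", []):
        summaries.append({
            "docType": entry.get("docType"),
            "country": entry.get("country"),
            "pages": entry.get("pages", []),
            # The extraction structure varies by document type, so keep it as-is.
            "fields": entry.get("extraction", {}),
        })
    return summaries


# Sample payload copied from the documented response shape (fields abbreviated).
sample = {
    "StateText": "COMPLETED",
    "TraceId": "ABC1234",
    "NumberPages": 2,
    "Output": [
        {
            "docType": "Invoice",
            "country": "PT",
            "pages": [1, 2],
            "schema": "Invoice_PT.json",
            "extraction": {"invoiceNumber": "INV-2024-001", "totalAmount": 1250.00},
        }
    ],
}

for doc in summarize_output(sample):
    print(doc["docType"], doc["country"], doc["pages"], doc["fields"].get("invoiceNumber"))
```

Using `.get` with defaults keeps the loop from failing if a field is absent for a given document type.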
Error Response
If an error occurs, StateText is set to ERROR and the Messages array describes what went wrong:
{
  "StateText": "ERROR",
  "TraceId": "XYZ9876",
  "Messages": [
    "Invalid file format. Only PDF files are accepted."
  ]
}
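Client code can turn an error payload like this into an exception that carries the `TraceId`, which is useful when contacting support. This is a minimal sketch under the assumption that errors are signalled by `StateText` being `ERROR`, as in the example; `ExtractionError` and `check_response` are hypothetical names.

```python
class ExtractionError(Exception):
    """Raised when the API reports StateText == "ERROR"."""

    def __init__(self, trace_id, messages):
        self.trace_id = trace_id
        self.messages = list(messages)
        detail = "; ".join(self.messages) or "no message provided"
        super().__init__(f"Extraction failed (TraceId {trace_id}): {detail}")


def check_response(payload: dict) -> dict:
    """Return the payload unchanged, or raise ExtractionError on an error state."""
    if payload.get("StateText") == "ERROR":
        raise ExtractionError(payload.get("TraceId"), payload.get("Messages", []))
    return payload
```

Calling `check_response(response.json())` right after the request keeps error handling in one place; a `PROCESSING` state passes through unchanged and is left to the caller to handle.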