Quick start guide

Get started with DocDigitizer's AI-powered document extraction API in minutes. This guide walks you through your first API call to extract structured data from documents.

Prerequisites

Before you begin, ensure you have the following:

  • API Key – Your unique API key provided by DocDigitizer. If you don’t have one, contact our sales team.
  • Context ID – A UUID that identifies your processing context/configuration. This is provided during your account setup.
  • A PDF document – The document you want to process (for example, an invoice, contract, or ID card).

Authentication

All API requests require authentication using an API key. Include your API key in the request header:

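| Header    | Value        | Description                                       |
|-----------|--------------|---------------------------------------------------|
| x-api-key | Your API Key | Required. Your unique API key for authentication. |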

Important: Keep your API key secure. Never expose it in client-side code or public repositories.
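
One common way to keep the key out of source code is to read it from an environment variable at runtime. The sketch below illustrates that idea only; the variable name DOCDIGITIZER_API_KEY and the helper build_auth_headers are assumptions made for this example, not names defined by DocDigitizer.

```python
import os

# Hypothetical variable name; use whatever your deployment defines.
API_KEY_ENV_VAR = "DOCDIGITIZER_API_KEY"


def build_auth_headers(env=None):
    """Return the x-api-key header, reading the key from the environment."""
    env = os.environ if env is None else env
    api_key = env.get(API_KEY_ENV_VAR, "").strip()
    if not api_key:
        # Fail early instead of sending an unauthenticated request
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} before calling the API")
    return {"x-api-key": api_key}
```

The returned dictionary can be passed as the headers argument of an HTTP client, so the key never appears in the code itself.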


Your First API Call

API Endpoint

POST https://apix.docdigitizer.com/sync

Request Format

The API accepts multipart/form-data requests with the following parameters:

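| Parameter | Type | Required | Description                                                                              |
|-----------|------|----------|------------------------------------------------------------------------------------------|
| files     | File | Yes      | The PDF document to process.                                                             |
| id        | UUID | Yes      | A unique identifier for this document/request. Generate a new UUID for each request.     |
| contextId | UUID | Yes      | Your context identifier that determines the processing pipeline and schema configuration. |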
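
Because both id and contextId are UUIDs, it can help to validate them locally before sending anything. The following is a minimal sketch under that assumption; build_form_fields is a hypothetical helper, not part of any DocDigitizer SDK, and it covers only the two text fields (the file is attached separately).

```python
import uuid


def build_form_fields(context_id, document_id=None):
    """Return the non-file form fields, validating both UUIDs."""
    # uuid.UUID raises ValueError if the string is not a well-formed UUID
    context = uuid.UUID(context_id)
    # Generate a fresh document ID unless the caller supplies one
    document = uuid.UUID(document_id) if document_id else uuid.uuid4()
    return {"id": str(document), "contextId": str(context)}
```

With the requests library, the returned dictionary can be passed as the data argument alongside the files argument that carries the PDF.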

Example: cURL

curl -X POST https://apix.docdigitizer.com/sync \
  -H "x-api-key: YOUR_API_KEY" \
  -F "files=@/path/to/your/document.pdf" \
  -F "id=550e8400-e29b-41d4-a716-446655440000" \
  -F "contextId=YOUR_CONTEXT_ID"

Example: PowerShell

$headers = @{
    "x-api-key" = "YOUR_API_KEY"
}

$form = @{
    files = Get-Item -Path "C:\path\to\your\document.pdf"
    id = "550e8400-e29b-41d4-a716-446655440000"
    contextId = "YOUR_CONTEXT_ID"
}

# The -Form parameter requires PowerShell 6.1 or later
$response = Invoke-RestMethod -Uri "https://apix.docdigitizer.com/sync" `
    -Method Post `
    -Headers $headers `
    -Form $form

$response | ConvertTo-Json -Depth 10

Example: Python

import requests
import uuid

url = "https://apix.docdigitizer.com/sync"

headers = {
    "x-api-key": "YOUR_API_KEY"
}

# Generate a unique document ID
document_id = str(uuid.uuid4())

data = {
    "id": document_id,
    "contextId": "YOUR_CONTEXT_ID"
}

# Open the file in a context manager so the handle is closed after the upload
with open("/path/to/your/document.pdf", "rb") as pdf:
    files = {"files": pdf}
    response = requests.post(url, headers=headers, files=files, data=data)

print(response.json())

Example: JavaScript (Node.js)

const FormData = require('form-data');
const fs = require('fs');
// require() works with node-fetch v2; v3 is ESM-only and must be loaded with import
const fetch = require('node-fetch');
const { v4: uuidv4 } = require('uuid');

const form = new FormData();
form.append('files', fs.createReadStream('/path/to/your/document.pdf'));
form.append('id', uuidv4());
form.append('contextId', 'YOUR_CONTEXT_ID');

fetch('https://apix.docdigitizer.com/sync', {
    method: 'POST',
    headers: {
        'x-api-key': 'YOUR_API_KEY'
    },
    body: form
})
.then(response => response.json())
.then(data => console.log(data))
.catch(error => console.error('Request failed:', error));

Example: C# (.NET)

using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

using var client = new HttpClient();
using var form = new MultipartFormDataContent();

// Add API key header
client.DefaultRequestHeaders.Add("x-api-key", "YOUR_API_KEY");

// Add file
var fileContent = new ByteArrayContent(File.ReadAllBytes(@"C:\path\to\your\document.pdf"));
fileContent.Headers.ContentType = new MediaTypeHeaderValue("application/pdf");
form.Add(fileContent, "files", "document.pdf");

// Add parameters
form.Add(new StringContent(Guid.NewGuid().ToString()), "id");
form.Add(new StringContent("YOUR_CONTEXT_ID"), "contextId");

// Send request
var response = await client.PostAsync("https://apix.docdigitizer.com/sync", form);
var result = await response.Content.ReadAsStringAsync();
Console.WriteLine(result);

Understanding the Response

A successful response returns a JSON object containing the extracted data from your document.

Response Structure

{
    "StateText": "COMPLETED",
    "TraceId": "ABC1234",
    "NumberPages": 2,
    "Output": [
        {
            "docType": "Invoice",
            "country": "PT",
            "pages": [1, 2],
            "schema": "Invoice_PT.json",
            "extraction": {
                "invoiceNumber": "INV-2024-001",
                "invoiceDate": "2024-01-15",
                "totalAmount": 1250.00,
                "vendorName": "Example Corp",
                "vendorTaxId": "123456789",
                ...
            }
        }
    ]
}

Response Fields

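| Field       | Type    | Description                                                                       |
|-------------|---------|-----------------------------------------------------------------------------------|
| StateText   | String  | Processing status: COMPLETED, PROCESSING, or ERROR.                               |
| TraceId     | String  | Unique trace identifier for debugging and support requests.                       |
| NumberPages | Integer | Number of pages in the processed document.                                        |
| Output      | Array   | Array of extraction results. Multi-document PDFs may contain multiple entries.    |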

Output Entry Fields

| Field      | Type   | Description                                                        |
|------------|--------|--------------------------------------------------------------------|
| docType    | String | The detected document type (Invoice, Contract, CitizenCard, etc.). |
| country    | String | ISO country code for the document.                                 |
| pages      | Array  | Page numbers where this document was found.                        |
| schema     | String | The schema used for extraction.                                    |
| extraction | Object | The extracted field values. Structure varies by document type.     |
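
To make the fields above concrete, here is a minimal sketch of walking the Output array of a parsed response. The helper summarize_output is a hypothetical name chosen for this example, and the sample dictionary simply mirrors part of the response shown earlier; real responses may carry more fields.

```python
def summarize_output(response):
    """Return one (docType, pages, extraction) tuple per entry in Output."""
    results = []
    # Output may hold several entries when one PDF contains multiple documents
    for entry in response.get("Output", []):
        results.append(
            (entry.get("docType"), entry.get("pages", []), entry.get("extraction", {}))
        )
    return results


# Sample data copied from the response structure shown in this guide
sample = {
    "StateText": "COMPLETED",
    "TraceId": "ABC1234",
    "NumberPages": 2,
    "Output": [
        {
            "docType": "Invoice",
            "country": "PT",
            "pages": [1, 2],
            "schema": "Invoice_PT.json",
            "extraction": {"invoiceNumber": "INV-2024-001"},
        }
    ],
}

for doc_type, pages, fields in summarize_output(sample):
    print(doc_type, pages, fields.get("invoiceNumber"))  # Invoice [1, 2] INV-2024-001
```

Using .get with defaults keeps the loop from failing if a field is absent, which seems prudent given that the extraction structure varies by document type.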

Error Response

If an error occurs, StateText is set to ERROR and the Messages array describes what went wrong:

{
    "StateText": "ERROR",
    "TraceId": "XYZ9876",
    "Messages": [
        "Invalid file format. Only PDF files are accepted."
    ]
}
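
A client will usually want to turn an error response like this into an exception rather than inspect it by hand. The sketch below shows one way to do that, assuming the parsed JSON has the shape shown above; ExtractionError and check_response are hypothetical names introduced for illustration.

```python
class ExtractionError(Exception):
    """Raised when the API reports StateText == "ERROR"."""


def check_response(response):
    """Return the response unchanged, or raise if it describes an error."""
    if response.get("StateText") == "ERROR":
        # Messages is taken from the error example; fall back if it is missing
        messages = response.get("Messages") or ["no message provided"]
        trace_id = response.get("TraceId")
        raise ExtractionError(f"[{trace_id}] " + "; ".join(messages))
    return response
```

Including the TraceId in the exception text keeps it at hand for support requests.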