Your First API Call
This step-by-step tutorial walks you through making your first successful API call to DocDigitizer. By the end, you'll have extracted structured data from a PDF document.
Before You Begin
Ensure you have the following ready:
| Item | Description | Status |
|---|---|---|
| API Key | Your DocDigitizer API key | [ ] Ready |
| Context ID | Your processing context UUID | [ ] Ready |
| PDF Document | A sample PDF to process (invoice, ID card, contract, etc.) | [ ] Ready |
| HTTP Client | cURL, PowerShell, Postman, or your preferred tool | [ ] Ready |
If you don’t have your credentials yet, see the Installation guide.
Step 1: Prepare Your Request
Every API call requires three pieces of information in the request body and one in the headers.
Request Components
| Component | Location | Value |
|---|---|---|
| x-api-key | Header | Your API key |
| files | Body (form-data) | Your PDF file |
| id | Body (form-data) | A new UUID for this request |
| contextId | Body (form-data) | Your Context ID |
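The header and the two text fields from the table can be assembled before any network call is made. The following is a minimal Python sketch under stated assumptions: `build_request_parts` is a hypothetical helper name, not part of any DocDigitizer SDK, and it only prepares the values listed above. The PDF itself (the files field) is attached separately when the request is sent.

```python
import uuid


def build_request_parts(api_key, context_id, document_id=None):
    """Return (headers, form_fields) for one request. Hypothetical helper."""
    if not api_key or not context_id:
        raise ValueError("api_key and context_id are both required")
    # A fresh UUID is generated when the caller does not supply one.
    document_id = document_id or str(uuid.uuid4())
    # Stray whitespace is stripped from the credentials.
    headers = {"x-api-key": api_key.strip()}
    form_fields = {"id": document_id, "contextId": context_id.strip()}
    return headers, form_fields


headers, form_fields = build_request_parts("YOUR_API_KEY", "YOUR_CONTEXT_ID")
print(headers)
print(form_fields)
```

Keeping this step separate makes it easy to check the field names (x-api-key, id, contextId) before sending a document.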
Generate a Document ID
Each request needs a unique document ID (UUID). Here’s how to generate one:
PowerShell
[guid]::NewGuid().ToString() # Output: 550e8400-e29b-41d4-a716-446655440000
Linux/macOS
uuidgen # Output: 550e8400-e29b-41d4-a716-446655440000
Python
import uuid
print(uuid.uuid4())  # Output: 550e8400-e29b-41d4-a716-446655440000
Online
Use an online UUID generator like uuidgenerator.net
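If the document ID comes from user input or another system, it may be worth checking that it is a well-formed UUID before sending it. The sketch below uses Python's standard uuid module; `is_valid_uuid` is a hypothetical helper name. Whether the API accepts other UUID spellings (no hyphens, braces) is not stated here, so the check is deliberately strict and accepts only the hyphenated form.

```python
import uuid


def is_valid_uuid(value):
    """Return True if value is a UUID in the hyphenated 8-4-4-4-12 form."""
    try:
        # uuid.UUID also accepts braces and unhyphenated hex, so the parsed
        # value is compared with the input to reject those spellings.
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


print(is_valid_uuid("550e8400-e29b-41d4-a716-446655440000"))  # True
print(is_valid_uuid("not-a-uuid"))                            # False
```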
Step 2: Send the Request
Choose your preferred method to send the API request.
Option A: Using cURL (Recommended for Testing)
Open your terminal and run:
curl -X POST https://apix.docdigitizer.com/sync \
  -H "x-api-key: YOUR_API_KEY" \
  -F "files=@/path/to/your/document.pdf" \
  -F "id=550e8400-e29b-41d4-a716-446655440000" \
  -F "contextId=YOUR_CONTEXT_ID"
Replace:
- YOUR_API_KEY – Your actual API key
- /path/to/your/document.pdf – Path to your PDF file
- 550e8400-... – A newly generated UUID
- YOUR_CONTEXT_ID – Your Context ID
Option B: Using PowerShell
# Set your credentials
$apiKey = "YOUR_API_KEY"
$contextId = "YOUR_CONTEXT_ID"
$pdfPath = "C:\path\to\your\document.pdf"
# Create the request
$headers = @{
    "x-api-key" = $apiKey
}
$form = @{
    files     = Get-Item -Path $pdfPath
    id        = [guid]::NewGuid().ToString()
    contextId = $contextId
}
# Send the request (the -Form parameter requires PowerShell 6.1 or later)
$response = Invoke-RestMethod -Uri "https://apix.docdigitizer.com/sync" `
    -Method Post `
    -Headers $headers `
    -Form $form
# Display the response
$response | ConvertTo-Json -Depth 10
Option C: Using Postman
- Create a new POST request
- Set URL to: https://apix.docdigitizer.com/sync
- Go to Headers tab:
  - Add header: x-api-key = YOUR_API_KEY
- Go to Body tab:
  - Select form-data
  - Add key files (type: File) – select your PDF
  - Add key id (type: Text) – enter a UUID
  - Add key contextId (type: Text) – enter your Context ID
- Click Send
Expected Processing Time
Processing time depends on document complexity:
| Document Type | Typical Time |
|---|---|
| Single-page invoice | 2-5 seconds |
| Multi-page contract | 5-15 seconds |
| Complex multi-document PDF | 15-60 seconds |
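One way to turn the table above into a client-side timeout is to take the upper bound for the expected document category and multiply it by a safety margin. The sketch below is illustrative only: the category labels and the factor of 3 are assumptions made for this example, not values documented by DocDigitizer.

```python
# Upper bounds in seconds, copied from the "Typical Time" column above.
# The dictionary keys are hypothetical labels chosen for this sketch.
TYPICAL_MAX_SECONDS = {
    "single_page_invoice": 5,
    "multi_page_contract": 15,
    "complex_multi_document": 60,
}


def pick_timeout(kind, safety_factor=3):
    """Return a client timeout in seconds (the safety factor is an assumption)."""
    # Unknown categories fall back to the slowest one in the table.
    upper = TYPICAL_MAX_SECONDS.get(kind, max(TYPICAL_MAX_SECONDS.values()))
    return upper * safety_factor


print(pick_timeout("single_page_invoice"))     # 15
print(pick_timeout("complex_multi_document"))  # 180
```

The returned value can be passed as the timeout argument of whichever HTTP client you use.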
Step 3: Understand the Response
A successful API call returns a JSON response with the extracted data.
Success Response Example
{
  "StateText": "COMPLETED",
  "TraceId": "ABC1234",
  "NumberPages": 2,
  "Output": [
    {
      "docType": "Invoice",
      "country": "PT",
      "pages": [1, 2],
      "schema": "Invoice_PT.json",
      "extraction": {
        "invoiceNumber": "INV-2024-001",
        "invoiceDate": "2024-01-15",
        "dueDate": "2024-02-15",
        "vendorName": "Example Supplier Ltd",
        "vendorTaxId": "PT123456789",
        "customerName": "Your Company",
        "subtotal": 1000.00,
        "taxAmount": 230.00,
        "totalAmount": 1230.00,
        "currency": "EUR",
        "lineItems": [
          {
            "description": "Professional Services",
            "quantity": 10,
            "unitPrice": 100.00,
            "amount": 1000.00
          }
        ]
      }
    }
  ]
}
Response Fields Explained
| Field | Description |
|---|---|
| StateText | COMPLETED = success, ERROR = failure |
| TraceId | Unique ID for this request (save this for support inquiries) |
| NumberPages | Total pages in the PDF |
| Output | Array of extracted documents (one PDF can contain multiple documents) |
| Output[].docType | Detected document type (Invoice, Contract, CitizenCard, etc.) |
| Output[].extraction | The extracted field values |
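Because Output is an array, a single PDF can yield several documents, so client code should loop over it rather than assume one entry. The sketch below walks a trimmed copy of the sample response and builds a short summary per document. `summarize` is a hypothetical helper name, and the use of `.get()` for optional keys is a defensive assumption rather than documented API behaviour.

```python
import json

# A trimmed copy of the success response shown earlier.
sample = json.loads("""
{
  "StateText": "COMPLETED",
  "TraceId": "ABC1234",
  "NumberPages": 2,
  "Output": [
    {"docType": "Invoice", "country": "PT", "pages": [1, 2],
     "extraction": {"invoiceNumber": "INV-2024-001", "totalAmount": 1230.00}}
  ]
}
""")


def summarize(result):
    """Return one summary dict per detected document; raise if not COMPLETED."""
    if result.get("StateText") != "COMPLETED":
        # Include the TraceId so it can be quoted in a support inquiry.
        raise RuntimeError(f"Request failed (TraceId={result.get('TraceId')})")
    return [
        {
            "docType": doc.get("docType"),
            "pages": doc.get("pages", []),
            "fields": sorted(doc.get("extraction", {})),
        }
        for doc in result.get("Output", [])
    ]


for entry in summarize(sample):
    print(entry)
```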
Step 4: Extract the Data
Now that you have a response, here’s how to access the extracted data in different languages.
PowerShell
# After receiving $response from the API call
# Get the first document's extraction
$extraction = $response.Output[0].extraction
# Access specific fields
Write-Host "Invoice Number: $($extraction.invoiceNumber)"
Write-Host "Total Amount: $($extraction.totalAmount)"
Write-Host "Vendor: $($extraction.vendorName)"
# Loop through line items
foreach ($item in $extraction.lineItems) {
    Write-Host " - $($item.description): $($item.amount)"
}
Python
import uuid

import requests

# Make the API call (the with-block closes the file once the upload finishes)
with open("document.pdf", "rb") as pdf_file:
    response = requests.post(
        "https://apix.docdigitizer.com/sync",
        headers={"x-api-key": "YOUR_API_KEY"},
        files={"files": pdf_file},
        data={
            "id": str(uuid.uuid4()),
            "contextId": "YOUR_CONTEXT_ID",
        },
    )

# Parse the response
result = response.json()

# Check for success
if result["StateText"] == "COMPLETED":
    # Get the first document
    doc = result["Output"][0]
    extraction = doc["extraction"]
    print(f"Document Type: {doc['docType']}")
    print(f"Invoice Number: {extraction.get('invoiceNumber', 'N/A')}")
    print(f"Total Amount: {extraction.get('totalAmount', 'N/A')}")
else:
    print(f"Error: {result.get('Messages', ['Unknown error'])}")
JavaScript (Node.js)
// Uses node-fetch v2 (CommonJS); node-fetch v3 is ESM-only and cannot be loaded with require()
const FormData = require('form-data');
const fs = require('fs');
const fetch = require('node-fetch');
const { v4: uuidv4 } = require('uuid');

async function extractDocument() {
  // Build the multipart form body
  const form = new FormData();
  form.append('files', fs.createReadStream('document.pdf'));
  form.append('id', uuidv4());
  form.append('contextId', 'YOUR_CONTEXT_ID');

  const response = await fetch('https://apix.docdigitizer.com/sync', {
    method: 'POST',
    headers: { 'x-api-key': 'YOUR_API_KEY' },
    body: form
  });
  const result = await response.json();

  if (result.StateText === 'COMPLETED') {
    const doc = result.Output[0];
    const extraction = doc.extraction;
    console.log(`Document Type: ${doc.docType}`);
    console.log(`Invoice Number: ${extraction.invoiceNumber}`);
    console.log(`Total Amount: ${extraction.totalAmount}`);
  } else {
    console.error('Error:', result.Messages);
  }
}

// Report network or parsing failures instead of leaving the rejection unhandled
extractDocument().catch(console.error);
C# (.NET)
using System.Text.Json;
// After making the HTTP request and getting responseString
var result = JsonDocument.Parse(responseString);
var root = result.RootElement;
if (root.GetProperty("StateText").GetString() == "COMPLETED")
{
    var output = root.GetProperty("Output")[0];
    var extraction = output.GetProperty("extraction");
    Console.WriteLine($"Document Type: {output.GetProperty("docType")}");
    Console.WriteLine($"Invoice Number: {extraction.GetProperty("invoiceNumber")}");
    Console.WriteLine($"Total Amount: {extraction.GetProperty("totalAmount")}");
}
Complete Working Examples
Copy and paste these complete examples to get started quickly.
Complete PowerShell Script
# DocDigitizer API - Complete Example
# Save as: Extract-Document.ps1
# Requires PowerShell 6.1 or later (for Invoke-RestMethod -Form)
param(
    [Parameter(Mandatory=$true)]
    [string]$PdfPath,
    [Parameter(Mandatory=$true)]
    [string]$ApiKey,
    [Parameter(Mandatory=$true)]
    [string]$ContextId
)

# Validate file exists
if (-not (Test-Path $PdfPath)) {
    Write-Error "File not found: $PdfPath"
    exit 1
}

# Create request
$headers = @{ "x-api-key" = $ApiKey }
$documentId = [guid]::NewGuid().ToString()
$form = @{
    files     = Get-Item -Path $PdfPath
    id        = $documentId
    contextId = $ContextId
}

Write-Host "Sending document to DocDigitizer API..."
Write-Host "Document ID: $documentId"

try {
    $response = Invoke-RestMethod -Uri "https://apix.docdigitizer.com/sync" `
        -Method Post `
        -Headers $headers `
        -Form $form

    if ($response.StateText -eq "COMPLETED") {
        Write-Host "`nExtraction successful!" -ForegroundColor Green
        Write-Host "Trace ID: $($response.TraceId)"
        Write-Host "Pages: $($response.NumberPages)"
        Write-Host "`nExtracted Documents:"
        foreach ($doc in $response.Output) {
            Write-Host "`n--- $($doc.docType) ---"
            $doc.extraction | Format-List
        }
    } else {
        Write-Host "`nExtraction failed!" -ForegroundColor Red
        Write-Host "Messages: $($response.Messages -join ', ')"
    }
}
catch {
    Write-Error "API request failed: $_"
}

# Usage: .\Extract-Document.ps1 -PdfPath "invoice.pdf" -ApiKey "your_key" -ContextId "your_context"
Complete Python Script
#!/usr/bin/env python3
"""
DocDigitizer API - Complete Example
Save as: extract_document.py
"""
import json
import os
import sys
import uuid

import requests


def extract_document(pdf_path, api_key, context_id):
    """Extract data from a PDF document using DocDigitizer API."""
    # Validate file exists
    if not os.path.exists(pdf_path):
        print(f"Error: File not found: {pdf_path}")
        return None

    # Generate unique document ID
    document_id = str(uuid.uuid4())
    print("Sending document to DocDigitizer API...")
    print(f"Document ID: {document_id}")

    # Make API request (the with-block closes the file after the upload)
    try:
        with open(pdf_path, "rb") as pdf_file:
            response = requests.post(
                "https://apix.docdigitizer.com/sync",
                headers={"x-api-key": api_key},
                files={"files": pdf_file},
                data={
                    "id": document_id,
                    "contextId": context_id,
                },
                timeout=300,  # 5 minute timeout
            )
        response.raise_for_status()
        result = response.json()
    except requests.exceptions.RequestException as e:
        print(f"API request failed: {e}")
        return None

    # Process response
    if result.get("StateText") == "COMPLETED":
        print("\nExtraction successful!")
        print(f"Trace ID: {result.get('TraceId')}")
        print(f"Pages: {result.get('NumberPages')}")
        for i, doc in enumerate(result.get("Output", []), 1):
            print(f"\n--- Document {i}: {doc.get('docType')} ---")
            print(json.dumps(doc.get("extraction", {}), indent=2))
        return result
    else:
        print("\nExtraction failed!")
        print(f"Messages: {result.get('Messages', ['Unknown error'])}")
        return None


if __name__ == "__main__":
    if len(sys.argv) != 4:
        print("Usage: python extract_document.py <pdf_path> <api_key> <context_id>")
        sys.exit(1)
    extract_document(sys.argv[1], sys.argv[2], sys.argv[3])
Troubleshooting
Common Issues
| Problem | Cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid API key | Check that your API key is correct and has no extra spaces |
| 400 Bad Request | Missing required fields | Ensure files, id, and contextId are all present |
| Invalid file format | File is not a valid PDF | Verify the file is a genuine PDF (check magic bytes) |
| Timeout | Large or complex document | Increase timeout; consider splitting large PDFs |
| Empty extraction | Document type not supported or poor quality | Check document quality; contact support if issue persists |
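The magic-bytes check mentioned in the table can be done locally before uploading. A PDF file begins with the signature %PDF-, so reading the first five bytes is enough for a quick test. The sketch below is a strict check under that assumption; `looks_like_pdf` is a hypothetical helper name, and files with extra bytes before the header would be rejected even though some PDF readers tolerate them.

```python
import os
import tempfile


def looks_like_pdf(path):
    """Return True if the file starts with the PDF signature b'%PDF-'."""
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"


# Demonstration with two throwaway files (not real documents).
with tempfile.TemporaryDirectory() as folder:
    good = os.path.join(folder, "good.pdf")
    bad = os.path.join(folder, "bad.pdf")
    with open(good, "wb") as handle:
        handle.write(b"%PDF-1.7\n")
    with open(bad, "wb") as handle:
        handle.write(b"PK\x03\x04 renamed zip archive")
    print(looks_like_pdf(good))  # True
    print(looks_like_pdf(bad))   # False
```

A file that fails this check was most likely renamed to .pdf from another format and should be converted before it is sent.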