Near 100% Accuracy:
How We Do It

Only (near) 100% Accuracy is good enough!

Data extraction is not an accuracy game. A bank cannot survive extraction mistakes in “just” 20% of its documents when those mistakes lead to a wrong KYC decision or loan approval. Nor can it survive with 10%, 1%, or even 0.1%. The business and reputational risks are too high.

OCR and other cognitive data capture solutions talk extensively about accuracy rates. But that is a flawed discussion. What those providers are actually doing is transferring the burden and responsibility of data curation to the customer.

DocDigitizer not only guarantees near 100% accuracy; we also back that guarantee with SLAs in our contracts.

Why and how can DocDigitizer claim (near) 100% accuracy?

Instead of shifting the burden to the customer, we acknowledge that you cannot live without near 100% accuracy and that managing the entire data curation and revision process is a daunting task. You have to manage not only the technology but also, and most importantly, the revision team. You have to manage demand peaks, off time, and scaling up and down.

In fact, cognitive data capture is still far from fully automated, 100% accurate AI / ML data extraction. We like to compare it with the state of the art in self-driving: it works reasonably well in very well-behaved scenarios. The problem is that, unlike roads and cars, which have been stable for decades, document formats and the types of information in documents are growing fast.

The figure below shows our process. We run proprietary AI / ML algorithms to extract data. After that, we leverage our human-in-the-loop team to curate and edit the information. Humans are a core part of the process, and they are required for the corner cases that technology still doesn’t solve.
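The two-stage flow described above can be illustrated with a short sketch. This is not DocDigitizer's implementation: the confidence score, the `REVIEW_THRESHOLD` value, the `Field` type, the `curate` function, and the sample data are all hypothetical, chosen only to show how automated extraction might hand corner cases to a human reviewer.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical threshold: fields scored below this go to a human reviewer.
REVIEW_THRESHOLD = 0.98


@dataclass
class Field:
    name: str
    value: str
    confidence: float  # assumed model score in [0, 1]
    reviewed: bool = False


def curate(fields: list[Field], review: Callable[[Field], str],
           threshold: float = REVIEW_THRESHOLD) -> list[Field]:
    """Return the fields, with every low-confidence value passed through review."""
    curated = []
    for f in fields:
        if f.confidence < threshold:
            # Corner case: hand the field to the (hypothetical) review callback.
            curated.append(Field(f.name, review(f), f.confidence, reviewed=True))
        else:
            curated.append(f)
    return curated


# Made-up stand-in for the output of the automated extraction step.
extracted = [
    Field("iban", "PT50 0002 0123 1234 5678 9015 4", 0.999),
    Field("full_name", "J0hn Smith", 0.71),
]

# Stand-in for the human reviewer: here, a fixed lookup of corrections.
corrections = {"full_name": "John Smith"}
result = curate(extracted, lambda f: corrections.get(f.name, f.value))
```

In this sketch only the low-confidence `full_name` field reaches the reviewer; a real pipeline might instead route every field, or whole documents, to review.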

You send us a document, and you get (near) 100% accurate extracted data.

It's that simple!
