Intelligent Document Processing (IDP) Model Benchmarks

AI Models Testing On Digital Documents

We continuously test large language models on business automation tasks. Our AI model benchmarks use datasets of digital documents with varied layouts and languages, representative of the documents processed in real projects. We measure how well AI models extract data from complex documents by assessing the accuracy and completeness of data detection.

Testing Criteria

We evaluate document recognition models on multiple criteria:

Recognition Accuracy

How accurately an AI model detects and extracts data from a document, such as field titles and values, document layout, and text and character blocks.
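As an illustration only, field-level accuracy could be computed along the following lines. This is a minimal sketch, not the scoring method used for the benchmarks: it assumes that a field counts as correct when the extracted value equals the labelled value after lowercasing and whitespace normalization, and the names `normalize` and `field_accuracy` and the sample invoice are hypothetical.

```python
def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(value.lower().split())


def field_accuracy(expected: dict[str, str], extracted: dict[str, str]) -> float:
    """Share of expected fields whose extracted value matches after normalization.

    Assumption: exact match after normalization; a real benchmark may score
    partial matches or weight fields differently.
    """
    if not expected:
        return 0.0
    correct = sum(
        1
        for name, value in expected.items()
        if name in extracted and normalize(extracted[name]) == normalize(value)
    )
    return correct / len(expected)


# Hypothetical invoice: three labelled fields, one of them extracted incorrectly.
expected = {"invoice_number": "INV-001", "total": "120.00", "currency": "EUR"}
extracted = {"invoice_number": "inv-001", "total": "120.00", "currency": "USD"}

accuracy = field_accuracy(expected, extracted)
print(f"{accuracy:.0%}")  # prints "67%"
```

Restricting `expected` to a subset of fields (for example, only the essential ones) would give a per-group score in the same way.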

Processing Duration

How long it takes a model to process one document on average.
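A rough sketch of how an average duration with a spread (the "mean ± value" form) might be measured is shown below. It assumes the spread is the sample standard deviation, which the report does not state, and `time_documents`, `summarize`, and the sample timings are hypothetical.

```python
import statistics
import time
from typing import Callable


def time_documents(process: Callable[[str], object], documents: list[str]) -> list[float]:
    """Run `process` on each document and return wall-clock durations in seconds."""
    durations = []
    for doc in documents:
        start = time.perf_counter()
        process(doc)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation); the spread is 0.0 for a single run."""
    mean = statistics.mean(durations)
    spread = statistics.stdev(durations) if len(durations) > 1 else 0.0
    return mean, spread


# Made-up timings in seconds, for illustration only (not benchmark data).
sample = [1.0, 1.4, 1.2, 1.1, 1.3]
mean, spread = summarize(sample)
print(f"{mean:.1f} ± {spread:.1f}")  # prints "1.2 ± 0.2"
```

In practice `process` would wrap the call to the document service, so the measured time includes network latency as well as model processing time.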

Cost

The processing cost per 1000 pages and any additional costs.
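For token-priced models, the cost per 1000 pages depends on how many tokens a page consumes. The sketch below uses the GPT-4o prices quoted in the February 2025 results ($2.50 per 1M input tokens, $10.00 per 1M output tokens). The per-page token counts are assumptions chosen for illustration, not measured values, and `cost_per_1000_pages` is a hypothetical helper.

```python
INPUT_PRICE_PER_M = 2.50    # USD per 1M input tokens (price quoted in the report)
OUTPUT_PRICE_PER_M = 10.00  # USD per 1M output tokens (price quoted in the report)


def cost_per_1000_pages(
    input_tokens_per_page: int,
    output_tokens_per_page: int,
    fixed_cost: float = 0.0,
) -> float:
    """Estimated USD cost of processing 1000 pages.

    `fixed_cost` covers any per-1000-pages fee that is not token based,
    such as a separate OCR service.
    """
    pages = 1000
    input_cost = pages * input_tokens_per_page / 1_000_000 * INPUT_PRICE_PER_M
    output_cost = pages * output_tokens_per_page / 1_000_000 * OUTPUT_PRICE_PER_M
    return fixed_cost + input_cost + output_cost


# Assumed token usage: 2000 input and 500 output tokens per page (illustrative only).
llm_only = cost_per_1000_pages(2000, 500)
with_ocr = cost_per_1000_pages(2000, 500, fixed_cost=10.0)
print(f"${llm_only:.2f}, ${with_ocr:.2f}")  # prints "$10.00, $20.00"
```

Because output tokens cost four times as much as input tokens at these prices, the length of the extracted result can matter as much as the size of the document itself.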

Monthly Reports

  • February 2025 — Best AI Services For Automatic Invoice Processing: Amazon Analyze Expense API, Azure AI Document Intelligence, Google Document AI, GPT-4o API, GPT-4o API - text input with 3rd party OCR.

AI Models Benchmarks | February 2025

We analysed the five most popular AI document detection models to test how well they work “out-of-the-box” on a set of digital invoices with various layouts and languages.

| Service | All Fields Detection Accuracy | Essential Fields Detection Accuracy | Product List Detection Accuracy | Processing duration, s | Cost, per 1000 pages | Cost, additional |
|---|---|---|---|---|---|---|
| Amazon Analyze Expense API | 54% | 69% | 82% | 2.9 ± 0.2 | $10 | $0.008 per page after one million per month |
| Azure AI Document Intelligence | 44.4% | 86% | 97% | 4.3 ± 0.2 | $10 | - |
| Google Document AI | 38.8% | 70% | 40% | 3.8 ± 0.2 | $10 | - |
| GPT-4o using 3rd-party OCR (Prebuilt Layout model by Azure AI) | 57.5% | 84% | 63% | 33.0 ± 2.3 | $10 + est. $10 ($2.50 / 1M input tokens, $10.00 / 1M output tokens) | - |
| GPT-4o only | 53.7% | 83% | 57% | 16.9 ± 1.9 | est. $10 ($2.50 / 1M input tokens, $10.00 / 1M output tokens) | - |


FAQ

So far we have tested Amazon Analyze Expense API, Azure AI Document Intelligence, Google Document AI, GPT-4o API, and GPT-4o API with text input from 3rd-party OCR. We constantly research new AI models to evaluate and test.
We use a variety of digital documents, including invoices, receipts, contracts, and forms, with different layouts and languages to ensure comprehensive testing.
We update our benchmarks monthly to ensure that our evaluations reflect the latest advancements and updates in AI models.
Our benchmarks are based on extensive testing with diverse datasets. We use multiple criteria, including recognition accuracy, processing duration, and cost, to provide a thorough assessment.
We offer custom benchmarking services tailored to specific business requirements. Contact us to discuss your needs.
Our monthly reports include recommendations on the best AI models for specific tasks, such as invoice processing, based on our comprehensive evaluations.
We evaluate the processing duration of AI models to assess their suitability for real-time document processing tasks.
We continuously monitor updates and new versions of AI models. When a significant update is released, we retest the model to ensure our benchmarks remain accurate and up-to-date.
Key factors in choosing an AI model for document processing include recognition accuracy, processing speed, cost, ease of integration, and the ability to handle complex layouts and multiple languages.
The GPT-4o API processes documents directly, while the GPT-4o API with text input from 3rd-party OCR first uses a third-party Optical Character Recognition (OCR) service to convert documents to text and then processes that text.
