Intelligent Document Processing (IDP) Models Benchmark

AI Models Testing On Digital Documents

We are constantly testing large language models for business automation tasks. AI model benchmark is based on digital documents datasets of various layouts and languages that represent documents processed in real projects. We test how well AI models work at extracting data from complex documents by assessing data detection accuracy and completeness.

Testing Criteria

We evaluate document recognition models on multiple criteria:

Recognition Accuracy

How accurately an AI model detects and extracts data from a document, like field titles and values, document layout, text and character blocks.

Processing Duration

How long it takes a model to process one document on average.

Cost

The processing cost per 1000 pages and any additional costs.

Monthly Reports

  • February 2025 — Best AI Services For Automatic Invoice Processing: Amazon Analyze Expense API, Azure AI Document Intelligence, Google Document AI, GPT-4o API, GPT-4o API - text input with 3rd party OCR.

AI Models Benchmark | February 2025

We have analysed 5 most popular AI document detection models to test how well they work “out-of-the-box” on a set of digital invoices and have assessed how well they process invoices of various layouts and languages.

 

Service

Essential Fields Detection Accuracy

All Fields Detection Accuracy

Product List Detection Accuracy

Processing duration Per 1 Page, s

Cost, per 1000 pages

Azure AI Document Intelligence

86%

44,4%

97%

4.3 ± 0.2

$10

GPT-4o using 3d party OCR (Prebuilt Layout model by Azure AI)

84%

57,5%

63%

33.0 ± 2.3

$8,8

GPT-4o only

83%

53,7%

57%

16.9 ± 1.9

$8,8

Google Document AI

70%

38,8%

40%

3.8 ± 0.2

$10

Amazon Analyze Expense API

69%

54%

82%

2.9 ± 0.2

$10 1

 

Notes

1 — Additional costs: $0.008 per page after one million

Looking for the best AI model for your project?

Contact us to get a consultation and model recommendations
Contact Us

FAQ

Currently we have tested Amazon Analyze Expense API, Azure AI Document Intelligence, Google Document AI, GPT-4o API, GPT-4o API - text input with 3rd party OCR. We constantly research new AI models to evaluate and text.
We use a variety of digital documents, including invoices, receipts, contracts, and forms, with different layouts and languages to ensure comprehensive testing.
We update our benchmark monthly to ensure that our evaluations reflect the latest advancements and updates in AI models.
Our benchmark is based on extensive testing with diverse datasets. We use multiple criteria, including recognition accuracy, processing duration, and cost, to provide a thorough assessment.
Yes, we offer custom benchmarking services tailored to specific business requirements. Contact us to discuss your needs.
Yes, our monthly reports include recommendations on the best AI models for specific tasks, such as invoice processing, based on our comprehensive evaluations.
Yes, we evaluate the processing duration of AI models to assess their suitability for real-time document processing tasks.
We continuously monitor updates and new versions of AI models. When a significant update is released, we retest the model to ensure our benchmark remains accurate and up-to-date.
Key factors include recognition accuracy, processing speed, cost, ease of integration, and the ability to handle complex layouts and multiple languages.
The GPT-4o API processes documents directly, while the GPT-4o API - text input with 3rd party OCR uses a third-party Optical Character Recognition (OCR) service to convert documents to text before processing.

Our Services

Let's Work Together!

Do you want to know the total cost of development and realization of the project? Tell us about your requirements, our specialists will contact you as soon as possible.

Please read our warning about a Whatsapp job scam.

BWT Chatbot