We analysed seven of the most popular AI document detection models to test how well they work "out of the box" on a set of digital invoices, and assessed how well they handle invoices of various layouts and languages.

| Service | Invoice Detection Accuracy Without Items | Invoice Detection Accuracy With Items | Processing Duration per Page, s | Cost per 1000 Pages |
|---|---|---|---|---|
| | 85.8% | 85.7% | 4.3 ± 0.2 | $10 |
| GPT-4o using third-party OCR (Prebuilt Layout model by Azure AI) | 90.8% | 86.5% | 33.0 ± 2.3 | $8.8 ¹ |
| | 88.3% | 89.2% | 16.9 ± 1.9 | $8.8 |
| | 83.8% | 68.1% | 3.8 ± 0.2 | $10 |
| | 91.3% | 91.1% | 2.9 ± 0.2 | $10 ² |
| Gemini 2.0 Pro | 90% | 90.2% | 8 ± 1.5 | $4.5 ³ |
| DeepSeek v3 API (Prebuilt Layout model by Azure AI) | 93.3% | 88.1% | 69 | $11 |

¹ Plus an additional $10 per 1000 pages for the text recognition model.
² Additional $0.008 per page after the first one million pages.
³ Token pricing per 1 million tokens: input $1.25 (prompts ≤ 128k tokens) or $2.50 (prompts > 128k tokens); output $5.00 (prompts ≤ 128k tokens) or $10.00 (prompts > 128k tokens).
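
To compare the services for a concrete workload, the table figures can be turned into a rough batch estimate. The sketch below is only an illustration: the `ServiceCost` class and `estimate_batch` function are hypothetical names, the durations are the mean values from the table, and it assumes pages are processed one after another with no parallelism. The OCR surcharge from footnote 1 is added to the GPT-4o row that carries it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceCost:
    """Per-service figures copied from the comparison table (hypothetical container)."""
    name: str
    cost_per_1000_pages: float             # USD, "Cost per 1000 pages" column
    seconds_per_page: float                # mean of "Processing duration" column
    surcharge_per_1000_pages: float = 0.0  # USD, e.g. the OCR fee from footnote 1


def estimate_batch(service: ServiceCost, pages: int) -> tuple[float, float]:
    """Return (total cost in USD, total processing time in hours) for a batch.

    Assumes strictly sequential processing and flat per-page pricing;
    volume discounts such as the one in footnote 2 are not modelled.
    """
    per_1000 = service.cost_per_1000_pages + service.surcharge_per_1000_pages
    cost = per_1000 * pages / 1000
    hours = service.seconds_per_page * pages / 3600
    return cost, hours


# Two rows from the table whose service names are given in the source.
gpt4o_ocr = ServiceCost("GPT-4o + Azure layout OCR", 8.8, 33.0, surcharge_per_1000_pages=10.0)
gemini = ServiceCost("Gemini 2.0 Pro", 4.5, 8.0)

if __name__ == "__main__":
    for svc in (gpt4o_ocr, gemini):
        cost, hours = estimate_batch(svc, pages=5000)
        print(f"{svc.name}: ${cost:.2f}, {hours:.1f} h")
```

With the surcharge included, the GPT-4o plus OCR pipeline comes to $18.8 per 1000 pages rather than the $8.8 shown in the cost column, which changes its ranking on price.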