Data Extraction AI For Old Construction Drawings

Industry: Construction & Architecture
Client: Confidential
Platform: Cloud
Duration: 2 months
Cheaper than manual labor

Project Summary

An AI-powered system for processing old construction drawings. It corrects the rotation of scanned documents and extracts key data fields from the title block.

Services

AI prototype development
AI system development

Team

1 Project manager
1 AI developer

Target Audience

Construction bureaus
Architectural firms
Digital archives

Challenge

Our client operates a system that onboards customers who have accumulated large volumes of construction drawings, some of which are very old paper documents that have been scanned. Before these drawings can be uploaded into the system, they must be indexed by extracting key information from the title block, such as the drawing number, scale, revision, and other parameters. Because this task is highly repetitive and time-consuming, it was outsourced to manual labor, an approach that proved slow and costly. The client sought a faster, more efficient solution using machine learning and AI.

Solution

The resulting system is an AI-powered solution that automates the extraction of key data fields from construction drawings and corrects their orientation for seamless integration into the client’s system. It combines Azure Document Intelligence for data extraction with a two-step image orientation correction process using Tesseract and PaddleOCR.

Data Extraction with Azure Document Intelligence

Azure Document Intelligence forms the backbone of the system. We trained a custom model using sample drawings provided by the client, teaching it to recognize and extract six key fields: title, type, drawing number, sheet number, scale, and revision.

Unlike generic OCR tools, a custom Azure Document Intelligence model learns from labeled data, which keeps accuracy high despite the variability found in old, scanned drawings. Automating this step removes the need for manual intervention and significantly speeds up the indexing process.
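As an illustration of how the extracted fields might be turned into an index entry, here is a minimal sketch. It assumes the service response has already been flattened into a dictionary mapping each field name to a (text, confidence) pair. The snake_case field keys, the IndexRecord type, and the 0.8 confidence threshold are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass

# The six fields named in this case study. The snake_case keys are
# hypothetical labels, not the names used in the client's model.
TARGET_FIELDS = (
    "title", "type", "drawing_number", "sheet_number", "scale", "revision",
)


@dataclass
class IndexRecord:
    values: dict        # field name -> cleaned text, or None if missing
    needs_review: list  # fields that are missing or below the threshold


def build_index_record(extracted, min_confidence=0.8):
    """Build an index entry from {field: (text, confidence)} pairs.

    min_confidence is an assumed cut-off for flagging doubtful values.
    """
    values = {}
    needs_review = []
    for name in TARGET_FIELDS:
        text, confidence = extracted.get(name, (None, 0.0))
        text = text.strip() if text else None
        values[name] = text
        if text is None or confidence < min_confidence:
            needs_review.append(name)
    return IndexRecord(values, needs_review)


# Made-up sample values for demonstration only.
sample = {
    "title": ("Ground Floor Plan", 0.97),
    "type": ("Plan", 0.91),
    "drawing_number": ("A-101 ", 0.95),
    "sheet_number": ("3", 0.62),
    "scale": ("1:100", 0.99),
}
record = build_index_record(sample)
print(record.values)
print(record.needs_review)
```

Keeping the per-field confidence alongside the value would let an occasional doubtful reading be spot-checked without returning to fully manual indexing.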

Image Orientation Correction

To ensure all drawings are correctly aligned before data extraction, the system employs a two-step orientation correction process:

  • Tesseract: Identifies the rotation angle of the entire image (0°, 90°, 180°, or 270°) and rotates it accordingly.
  • PaddleOCR: After initial rotation, PaddleOCR detects all words and their bounding boxes, calculates the dominant text orientation, and fine-tunes the alignment. This is particularly useful for drawings with mixed vertical and horizontal text.
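The two steps above can be sketched in plain Python, assuming the OCR engines have already been run. The first function reads the correction angle from orientation-and-script-detection text of the kind pytesseract's image_to_osd returns. The second takes word boxes as four-point polygons, the shape PaddleOCR's detector produces, and votes on the dominant text direction. The function names and the length-weighted voting rule are assumptions made for illustration, not the project's actual implementation.

```python
import math
import re


def parse_osd_rotation(osd_text):
    """Read the correction angle from Tesseract OSD text (the 'Rotate:' line)."""
    match = re.search(r"Rotate:\s*(\d+)", osd_text)
    if match is None:
        raise ValueError("no 'Rotate:' line found in OSD output")
    return int(match.group(1)) % 360


def long_edge(box):
    """Return (angle_degrees, length) of the longer side of a 4-point box.

    The angle is folded into (-90, 90] because a text line has no direction.
    """
    (x0, y0), (x1, y1), (x2, y2) = box[0], box[1], box[2]
    side_a = (x1 - x0, y1 - y0)  # first side of the quadrilateral
    side_b = (x2 - x1, y2 - y1)  # adjacent side
    dx, dy = max(side_a, side_b, key=lambda s: math.hypot(s[0], s[1]))
    angle = math.degrees(math.atan2(dy, dx))
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle, math.hypot(dx, dy)


def dominant_text_orientation(boxes):
    """Vote 'horizontal' or 'vertical', weighting each box by its long side."""
    totals = {"horizontal": 0.0, "vertical": 0.0}
    for box in boxes:
        angle, length = long_edge(box)
        label = "horizontal" if abs(angle) <= 45 else "vertical"
        totals[label] += length
    return max(totals, key=totals.get)


# Made-up inputs for demonstration only.
osd_sample = (
    "Page number: 0\n"
    "Orientation in degrees: 270\n"
    "Rotate: 90\n"
    "Orientation confidence: 4.21\n"
)
boxes = [
    [[0, 0], [100, 0], [100, 20], [0, 20]],     # wide box: horizontal text
    [[0, 40], [80, 40], [80, 60], [0, 60]],     # wide box: horizontal text
    [[200, 0], [220, 0], [220, 150], [200, 150]],  # tall box: vertical text
]
print(parse_osd_rotation(osd_sample))
print(dominant_text_orientation(boxes))
```

Weighting by the long side means a few long vertical labels do not outvote many horizontal ones unless they cover more text. If the vote comes out vertical, a further check is still needed to choose between a 90° and a 270° turn.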

Cost-Effective Automation

The system was designed with cost efficiency in mind. By automating the data extraction and orientation correction processes, it eliminates the need for expensive manual labor. Budget estimates show that the automated solution is already more cost-effective than outsourcing, while also delivering faster and more consistent results.

Results

The results exceeded expectations. By training a custom Azure Document Intelligence model, we achieved highly accurate data extraction for the six target fields. The automated solution not only outperformed manual labor in speed and cost but also introduced an additional benefit: automatic orientation correction, a step that the manual process had skipped.

We estimated the budget for this automation and found it to be about one third of the cost of outsourcing to manual labor. Additionally, orientation correction adds significant value, as it ensures all documents are uniformly aligned within the system.

Let's Work Together!

Would you like to know the total cost of developing and delivering a project like this? Tell us about your requirements, and our specialists will contact you as soon as possible.