Tabular Data Extraction from Invoice Documents

The task of extracting information from tables is a long-running problem statement in the world of machine learning and image processing. Although the latest accomplishments in the field of deep learning have seen a lot of success, tabular data extraction still remains a challenge due to the vast amount of ways in which tables are represented both visually and structurally. Below are some of the examples:

Fig. 1

Fig. 2

Fig. 3

Fig. 4

Fig. 5

Invoice Documents

Many companies process their bills in the form of invoices which contain tables that hold information about the items along with their prices and quantities. This information is generally required to be stored in databases while these invoices get processed.

Traditionally, this information is required to be hand filled into a database software however, this approach has some drawbacks:

1. The whole process is time consuming.

2. Certain errors might get induced during the data entry process.

3. Extra cost of manual data entry.

An invoice automation system can be deployed to address these shortcomings. The idea is to upload the invoice document and the system will read and generate the tabular information in the digital format making the whole process faster and more cost-effective for companies.

Fig. 6

Fig. 6 shows a sample invoice that contains some regular invoice details such as Invoice No, Invoice Date, Company details, and two tables holding transaction information. Now, our goal is to extract the information present in the two tables.

Tabular Information

The problem of extracting tables from invoices can be condensed into 2 main subtasks.

1. Table Detection

2. Tabular Structure Extraction.

What is Table Detection?

Table Detection is the process of identifying and locating tables that are present in a document, usually an image. There are multiple ways to detect tables in an image. Some of the approaches make use of image processing toolkits like OpenCV while some of the other approaches use statistical models on features extracted from the documents such as Text Position and Text Characteristics. Recently more deep learning approaches have been used to detect tables using trained neural networks similar to the ones used in Object Detection.

What is Table Structure Extraction?

Table Structure Extraction is the process of extracting the tabular information once the boundaries of the table are detected through Table Detection. The information within the rows and columns is then extracted and transferred to the desired format, usually CSV or Excel file.

Table Detection using Faster RCNN

Faster RCNN is a neural network model that comes from the RCNN family. It is the successor of Fast RCNN created by Ross Girshick in 2015. The name Faster RCNN is to signify an improvement over the previous model both in terms of training speed and detection speed.

To read more about the model framework, one can access the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.

There are many other object detection model architectures that are available for use today. Each model comes with certain advantages and disadvantages in terms of prediction accuracy, model parameter size, inference speed, etc.

For the task of detecting tables in invoice documents, we will select the Faster RCNN model with FPN(Feature Pyramid Network) as a feature extraction network. The model is pre-trained on the ImageNet corpus using ResNET 101 architecture. The ImageNet corpus is a public dataset that consists of more than 20,000 image categories of everyday objects. We will therefore make use of a Pytorch framework to train and test the model.

The above mentioned model gives us a fast inference time and a high Mean Average Precision. It is preferred for cases where a quick real time detection is desired.

First, the model is to be trained using public datasets for Table Detection such as Marmot and UNLV datasets. Next, we further fine-tune the model with our custom labeled dataset. For the purpose of labeling, we will follow the COCO annotation format.

Once trained, the model displayed an accuracy close to 86% on our custom dataset. There are certain scenarios where the model fails to locate the tables such as cases containing watermarks and/or overlapping texts. Tables without borders are also missed in a few instances. However, the model has shown its ability to learn from examples and detect tables in multiple different invoice documents.

Fig. 7

After running inference on the sample invoice from Fig 6, we can see two table boundaries being detected by the model in Fig 7. The first table gets detected with 100% accuracy and the second table is detected with 99% accuracy.

Table Structure Extraction

Once the boundaries of the table are detected by the model, an OCR (Optical Character Reader) mechanism is used to extract the text within the boundaries. The text is then processed using the information that is part of a unique table.

We were able to extract the correct structure of the table, including its headers and line items using logics derived from the invoices. The difficulty of this process depends on the type of invoice format at hand.

There are multiple challenges that one may encounter while building an algorithm to extract structure. Some of them are:

The span of some table columns may overlap making it difficult to determine the boundaries between columns.
The fonts and sizes present within tables may vary from one table to another. The algorithm should be able to accomodate for this variation.
The tables might get split into two pages and detecting the continuation of a table might be challenging.

Certain deep learning approaches have also been published recently to determine the structure of a table. However, training them on custom datasets still remains a challenge.

Fig 8

The final result is then stored in a CSV file and can be edited or stored according to one’s convenience as shown in Fig 8 which displays the first table information.

Conclusion

The deep learning approach to extracting information from structured documents is a step in the right direction. With high accuracy and low running time, the systems can only learn to perform better with more data. The recent and upcoming advancements in computer vision approaches have made processes such as invoice automation significantly accessible and robust.

About the author:

Prateek Sethi is a Data Scientist working at Mantra Labs. His work involves leveraging Artificial Intelligence to create data-driven solutions. Apart from his work he takes a keen interest in football and exploring the outdoors.

Further Reading:

Smart Manufacturing starts with real-time visibility.

Manufacturing companies today generate data by the second through sensors, machines, ERP systems, and MES platforms. But without real-time insights, even the most advanced production lines are essentially flying blind.

Manufacturers are implementing real-time dashboards that serve as control towers for their daily operations, enabling them to shift from reactive to proactive decision-making. These tools are essential to the evolution of Smart Manufacturing, where connected systems, automation, and intelligent analytics come together to drive measurable impact.

Data is available, but what’s missing is timely action.

For many plant leaders and COOs, one challenge persists: operational data is dispersed throughout systems, delayed, or hidden in spreadsheets. And this delay turns into a liability.

Real-time dashboards help uncover critical answers:

What caused downtime during last night’s shift?
Was there a delay in maintenance response?
Did a specific inventory threshold trigger a quality issue?

By converting raw inputs into real-time manufacturing analytics, dashboards make operational intelligence accessible to operators, supervisors, and leadership alike, enabling teams to anticipate problems rather than react to them.

1. Why Static Reports Fall Short

Reports often arrive late—after downtime, delays, or defects have occurred.
Disconnected data across ERP, MES, and sensors limits cross-functional insights.
Static formats lack embedded logic for proactive decision support.

2. What Real-Time Dashboards Enable

Line performance and downtime trends
Track OEE in real time and identify underperforming lines.

Predictive maintenance alerts
Utilize historical and sensor data to identify potential part failures in advance.

Inventory heat maps & reorder thresholds
Anticipate stockouts or overstocks based on dynamic reorder points.

Quality metrics linked to operator actions
Isolate shifts or procedures correlated with spikes in defects or rework.

These insights allow production teams to drive day-to-day operations in line with Smart Manufacturing principles.

3. Dashboards That Drive Action

Role-based dashboards
Dashboards can be configured for machine operators, shift supervisors, and plant managers, each with a tailored view of KPIs.

Embedded alerts and nudges
Real-time prompts, like “Line 4 below efficiency threshold for 15+ minutes,” reduce response times and minimize disruptions.

Cross-functional drill-downs
Teams can identify root causes more quickly because users can move from plant-wide overviews to detailed machine-level data in seconds.

4. What Powers These Dashboards

Data lakehouse integration
Unified access to ERP, MES, IoT sensor, and QA systems—ensuring reliable and timely manufacturing analytics.

ETL pipelines
Real-time data ingestion from high-frequency sources with minimal latency.

Visualization tools
Custom builds using Power BI, or customized solutions designed for frontline usability and operational impact.

Smart Manufacturing in Action: Reducing Market Response Time from 48 Hours to 30 Minutes

Mantra Labs partnered with a North American die-casting manufacturer to unify its operational data into a real-time dashboard. Fragmented data, manual reporting, delayed pricing decisions, and inconsistent data quality hindered operational efficiency and strategic decision-making.

Tech Enablement:

Centralized Data Hub with real-time access to critical business insights.
Automated report generation with data ingestion and processing.
Accurate price modeling with real-time visibility into metal price trends, cost impacts, and customer-specific pricing scenarios.
Proactive market analysis with intuitive Power BI dashboards and reports.

Business Outcomes:

Faster response to machine alerts
Quality incidents traced to specific operator workflows
4X faster access to insights led to improved inventory optimization.

As this case shows, real-time dashboards are not just operational tools—they’re strategic enablers.

(Learn More: Powering the Future of Metal Manufacturing with Data Engineering)

Key Takeaways: Smart Manufacturing Dashboards at a Glance

Aspect	What You Should Know
1. Why Static Reports Fall Short	Delayed insights after issues occur Disconnected systems (ERP, MES, sensors) No real-time alerts or embedded decision logic
2. What Real-Time Dashboards Enable	Track OEE and downtime in real-time Predictive maintenance using sensor data Dynamic inventory heat maps Quality linked to operators
3. Dashboards That Drive Action	Role-based views (operator to CEO) Embedded alerts like “Line 4 down for 15+ mins” Drilldowns from plant-level to machine-level
4. What Powers These Dashboards	Unified Data Lakehouse (ERP + IoT + MES) Real-time ETL pipelines Power BI or custom dashboards built for frontline usability

Conclusion

Smart Manufacturing dashboards aren’t just analytics tools—they’re productivity engines. Dashboards that deliver real-time insight empower frontline teams to make faster, better decisions—whether it’s adjusting production schedules, triggering preventive maintenance, or responding to inventory fluctuations.

Explore how Mantra Labs can help you unlock operations intelligence that’s actually usable.

By submitting this form, you consent to receive occasional updates, newsletters, and exclusive insights from Mantra Labs. We respect your privacy and won't share your information. For more details, see our Privacy Policy.

Tabular Data Extraction from Invoice Documents

Invoice Documents

Tabular Information

What is Table Detection?