Kavian Braanaas

Content Writer

Reading time: 3 min.
January 13, 2025

How to Extract Data From PDF Tables to Excel With AI

In this post, we’ll examine the challenges of extracting data from tabular PDFs, a common bottleneck for businesses that process large volumes of documents. We'll also demonstrate how to use Cradl AI to automate data extraction from complex tables, and how to integrate with Excel via automation tools like Zapier to transform unstructured PDFs into actionable data.

A common business challenge

Extracting data from tables in documents like PDFs and images is a significant hurdle for businesses, especially when dealing with unstructured formats. While APIs (Application Programming Interfaces) or EDI (Electronic Data Interchange) are becoming standard in digitalisation, many processes still rely on manually exchanging PDFs via email.

Ironically, tables—designed for structured data—are often stored in unstructured formats like PDFs or images. Their complex layouts can confuse tools like OCR (Optical Character Recognition). As a result, manual data entry remains the go-to solution for many, despite being time-consuming and error-prone.

Why traditional OCR struggles with tables

Traditional OCR struggles with tables because it lacks the ability to understand the relationships between rows and columns, treating each piece of text in isolation. This becomes problematic when tables have varying structures, such as merged cells, inconsistent column widths, or different row counts. OCR is often trained on specific, consistent layouts, making it less adaptable to these variations. Moreover, tables in PDFs or images can be distorted, with issues like skewed text, low resolution, or overlapping elements, which further impair OCR’s ability to extract data accurately and consistently.
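To make the "text in isolation" problem concrete, here is a minimal sketch of what it takes to rebuild rows from OCR-style output. The `Word` type, the coordinates, and the `y_tolerance` value are all hypothetical, and this naive heuristic is not how Cradl AI or any particular OCR engine works. It only illustrates that row and column relationships are not part of the recognized text and have to be inferred from geometry.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    x: float  # left edge of the word box (hypothetical units)
    y: float  # vertical centre of the word box


def group_into_rows(words, y_tolerance=5.0):
    """Cluster words into rows by vertical position, then order each row left to right."""
    rows = []
    for word in sorted(words, key=lambda w: w.y):
        # Start a new row when the vertical gap to the previous word is too large.
        if rows and abs(rows[-1][-1].y - word.y) <= y_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    return [[w.text for w in sorted(row, key=lambda w: w.x)] for row in rows]


# Hypothetical OCR output: words arrive in no meaningful order.
words = [
    Word("12.50", 300, 101),
    Word("Widget", 50, 100),
    Word("Gadget", 50, 120),
    Word("2", 200, 99),
    Word("1", 200, 121),
    Word("40.00", 300, 119),
]

flat_text = [w.text for w in words]  # what "text in isolation" looks like
table_rows = group_into_rows(words)  # rows recovered from geometry alone
print(flat_text)
print(table_rows)
```

A fixed tolerance like this works on a clean, level scan, but it is exactly the kind of rule that breaks on skewed pages, merged cells, or rows of uneven height.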

Why AI is a game-changer

AI overcomes these challenges by combining text recognition with contextual understanding. Unlike OCR, AI-powered models recognize the structure of tables and extract data while preserving the relationships between rows and columns. This flexibility allows AI to accurately process complex layouts and adapt to various table formats, even when there’s significant variation. AI transforms unstructured PDFs into clean, structured data, eliminating the need for error-prone manual entry and streamlining workflows.

Step-by-step: using Cradl AI for table extraction

Let’s now explore how to successfully extract data from tables to Excel by using Cradl AI.

Before we begin, make sure you’ve created a Cradl AI account!

1. Define data points to extract

Once inside the Cradl AI app, your first task is to define a schema for the data you want to extract. Tables, with their structured rows and columns, are ideal for using Cradl AI's «Line Items» field.

The Line Items field is specifically designed for tabular data and leverages Cradl AI's pre-trained model to extract information efficiently. Whether your table has a few rows or hundreds, this field captures all relevant data points, such as descriptions, unit prices, tax amounts, and totals, in one go.

Screenshot of the AI model configuration UI inside Cradl AI
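As a rough illustration of why a line-items structure suits tabular data, the sketch below flattens a nested result into spreadsheet-style rows. The JSON shape and the field names (`line_items`, `description`, `unit_price`, `tax_amount`, `total`) are assumptions made for this example and are not taken from Cradl AI's actual output format.

```python
import json

# Hypothetical extraction result; the shape and field names are illustrative only.
extraction = json.loads("""
{
  "invoice_id": "INV-001",
  "line_items": [
    {"description": "Consulting", "unit_price": "100.00", "tax_amount": "25.00", "total": "125.00"},
    {"description": "Hosting", "unit_price": "40.00", "tax_amount": "10.00", "total": "50.00"}
  ]
}
""")

# Column order for the target spreadsheet (assumed to match its header row).
COLUMNS = ["description", "unit_price", "tax_amount", "total"]


def line_items_to_rows(result, columns):
    """Turn each line item into one row, using a fixed column order."""
    return [[item.get(col, "") for col in columns] for item in result["line_items"]]


rows = line_items_to_rows(extraction, COLUMNS)
for row in rows:
    print(row)
```

Each table row in the document becomes one entry in the list, so the same code handles a table with two rows or two hundred.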


2. Extract data from your first table

Once you have configured and saved your field schema, you're ready to extract data by uploading a document. When the processing is complete, you can review the results in the validation interface.

A particularly handy feature is the visual data-location mapping. Click on an extracted data field, and its location in the document is highlighted.


This, along with the confidence scores assigned to each data field by the AI, makes it easy for you to verify the AI’s output before exporting it to Excel.
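One way to think about confidence scores is as a filter for where to spend review time. The sketch below is only an illustration of that idea: the prediction structure, the field names, and the 0.90 threshold are assumptions, not Cradl AI's data model or a recommended setting.

```python
REVIEW_THRESHOLD = 0.90  # assumed cut-off; tune it to your own tolerance for errors

# Hypothetical predictions: each field has a value and a confidence between 0 and 1.
predictions = [
    {"field": "description", "value": "Consulting", "confidence": 0.99},
    {"field": "unit_price", "value": "100.00", "confidence": 0.97},
    {"field": "tax_amount", "value": "2S.00", "confidence": 0.61},
]


def needs_review(preds, threshold=REVIEW_THRESHOLD):
    """Return the names of fields whose confidence falls below the threshold."""
    return [p["field"] for p in preds if p["confidence"] < threshold]


flagged = needs_review(predictions)
print(flagged)
```

Fields above the threshold can be skimmed, while flagged fields deserve a closer look before they reach the spreadsheet.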

3. Create an Excel sheet for your data

You now have an AI model that extracts data, so the next step is to create an Excel sheet to store the extracted data.

Cradl AI connects with Excel through popular automation tools like Zapier, so make sure you create the new spreadsheet in a cloud file management service that integrates with Zapier or Power Automate, such as Google Drive or Microsoft OneDrive.

4. Export the data from Cradl AI to Excel

You have a model that extracts data and a spreadsheet to store it, so now it is time to connect the two.

As mentioned above, Cradl AI integrates with Excel by using automation tools like Zapier (a direct integration between Cradl AI and Excel will be released in the future).

Create a new Zap

Head over to Zapier to create a free account and create your first «Zap».

If you are unfamiliar with Zapier, refer to Cradl AI’s video tutorial on connecting with Zapier for detailed guidance.

Choosing connectors

Use Cradl AI's «Document Parsing Completed» trigger to start the Zap automatically whenever a document has been uploaded and validated in Cradl AI. Because we want to write multiple rows to our Excel sheet in one batch, we'll use Excel's «Add Row(s)» action.


When you're mapping Cradl AI's extracted data fields to your spreadsheet's headers in the «Add Row(s)» action, you'll notice that you can choose from many more extracted values than the handful you defined in your spreadsheet's headers and AI model.

In almost all cases, you are looking for the values that are prefixed with «Validated Predictions» and suffixed with «Value», such as «Validated Predictions Services Description Value», «Validated Predictions Services Unit Value», and so on.
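The naming rule above can be expressed as a simple prefix and suffix filter. In Zapier you make this selection by hand in the mapping UI, so the sketch below is only a way to visualize the rule. The `available_fields` dictionary and every label other than the «Validated Predictions ... Value» pattern are made up for the example.

```python
PREFIX = "Validated Predictions "
SUFFIX = " Value"

# Hypothetical list of values offered in the mapping step; labels are illustrative.
available_fields = {
    "Validated Predictions Services Description Value": "Consulting",
    "Validated Predictions Services Description Confidence": "0.99",
    "Validated Predictions Services Unit Value": "hours",
    "Predictions Services Unit Value": "hrs",
    "Document Id": "abc123",
}


def pick_validated_values(fields):
    """Keep only labels that start with the prefix and end with the suffix."""
    picked = {}
    for label, value in fields.items():
        if label.startswith(PREFIX) and label.endswith(SUFFIX):
            # Strip the prefix and suffix to get a short column name.
            picked[label[len(PREFIX):-len(SUFFIX)]] = value
    return picked


columns = pick_validated_values(available_fields)
print(columns)
```

Everything that does not match both ends of the pattern, such as confidence values or unvalidated predictions, is left out of the spreadsheet mapping.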

Your data extraction flow is ready to run

Run a test to ensure your flow works. If configured correctly, Cradl AI will extract the table data and populate your Excel spreadsheet automatically!

Summary

Extracting data from PDF tables can be set up within minutes with Cradl AI. By using an AI model with the «Line Items» field, you can transform unstructured tables into structured JSON data. After validation, this data can flow into your workflows through Cradl AI’s pre-built integrations or APIs.

Get started for free

We’ll help get you started with your document automation journey.

Schedule a free demo with our team today!