How can I extract data from a PDF file in column format?

  • Last Post 30 November 2021
Flavio Mendes posted this 18 November 2021

I have the following file:


And I need to export its content into a CSV file in the following format:





How can I set a condition so that only the data in the columns is read and only that information is exported?


Thank you for the support!

luca.scarpati posted this 30 November 2021

Hi Flavio,


Yes, you can read this information. For now, the way to do it is a process that reads part of the OCR output and uses regular expressions (RegExp) to work out where a "table" begins and where it ends. The desired values are then captured, again via RegExp, into %VARIABLE% placeholders created during the process, and a DataExport module writes them to the CSV.
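To illustrate the general idea outside the product, here is a minimal Python sketch of the same technique: find the start and end of a table in OCR text with regular expressions, capture the column values, and write them to CSV. It is not the module configuration itself. The sample text, the start/end markers, the row pattern, and the column names are all assumptions made for the example, since the actual file layout is not shown in this thread.

```python
import csv
import io
import re

# Hypothetical OCR output; the real PDF layout is not shown in this thread.
OCR_TEXT = """\
ACME Ltd - Statement
Code Description Qty Price
A100 Widget small 2 10.50
B200 Widget large 1 99.00
Total 120.00
Thank you
"""

# Assumed markers: the table starts after the header row and ends at "Total".
TABLE_START = re.compile(r"^Code\s+Description\s+Qty\s+Price\s*$")
TABLE_END = re.compile(r"^Total\b")

# Assumed row shape: code, free-text description, integer quantity, decimal price.
ROW = re.compile(
    r"^(?P<code>\S+)\s+(?P<description>.+?)\s+(?P<qty>\d+)\s+(?P<price>\d+\.\d{2})\s*$"
)

FIELDS = ["code", "description", "qty", "price"]


def extract_rows(text):
    """Return the rows found between the start and end markers."""
    rows = []
    in_table = False
    for line in text.splitlines():
        if not in_table:
            # Skip everything until the header row is seen.
            in_table = bool(TABLE_START.match(line))
            continue
        if TABLE_END.match(line):
            break
        match = ROW.match(line)
        if match:
            rows.append(match.groupdict())
    return rows


def rows_to_csv(rows):
    """Serialize the captured rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(extract_rows(OCR_TEXT)), end="")
```

Lines outside the two markers are ignored, which is what restricts the output to the column data only. For a real document the three patterns would need to be adapted to the actual OCR text.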

Attached is a customer use case. For more specific assistance, please contact our support team.


Just for information: the next monthly release will include a new Smart Invoice module that automatically extracts the fields of an invoice.


Best regards,


Attached Files