Misread string - OCR using SmartOCR

  • Last Post 04 February 2022
Anders Hantveit posted this 03 February 2022

Hello people!

Does anyone have a tip for how to read more accurately?

I am supposed to find the ChassisNr using SmartOCR, and have created a RegEx for this. I have even allowed a "space" character in the RegEx after the initial 2-3 characters, because we saw misreads there from time to time, and I then use a script to remove the spaces from the string. But now a "space" was also detected between the first 2 characters.
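For illustration, the "allow a space anywhere, then strip it" idea can be sketched as below. This is only a rough Python sketch, not the scripting language of the product, and the number format (2 uppercase letters followed by 6 digits) is a made-up assumption, since the real ChassisNr format is not shown in this thread. The names `CHAR_CLASSES`, `build_tolerant_regex` and `extract_chassis_number` are hypothetical.

```python
import re

# ASSUMPTION: hypothetical format for illustration only,
# 2 uppercase letters followed by 6 digits (e.g. "AB123456").
CHAR_CLASSES = ["[A-Z]"] * 2 + [r"\d"] * 6


def build_tolerant_regex(char_classes):
    """Build a regex that allows at most one stray space between
    any two adjacent characters of the expected value."""
    return re.compile(r"\b" + " ?".join(char_classes) + r"\b")


def extract_chassis_number(ocr_text):
    """Return the first match with the stray spaces removed, or None."""
    match = build_tolerant_regex(CHAR_CLASSES).search(ocr_text)
    if match is None:
        return None
    return match.group(0).replace(" ", "")


print(extract_chassis_number("ChassisNr: A B12 3456"))
```

Building the pattern from a list of character classes means a newly observed space position does not require editing the RegEx by hand again. The trade-off is that a more tolerant pattern can also match unrelated text, so the expected format should be kept as strict as the real numbers allow.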

 

I can fix the issue by changing the RegEx again, but I am getting tired of misread text even when the quality is this good (although the font size is small).

I wish the OCR engine was as good as the human eye, but it fails too often, and I have to make rules to fix the misreads.

The current input is a 400 DPI color TIF.

In this string, 2 spaces were wrongly "detected":

Does anyone have a good tip on how to make the OCR engine perform more accurately?

I have not played with the "font type settings" yet...

Thanks!

 

 

Senior System Engineer, MCS, PRS and IMS ## Konica Minolta Business Solutions Norway AS

luca.scarpati posted this 04 February 2022

Hi Anders,

 

The Smart OCR module performs a different kind of OCR, depending on a mix of rules (coordinates of the words, structure of the document...). That is different from the standard OCR, which for example runs over the full page line by line. It can be more or less accurate, depending on the case.

We have two engines available, OmniPage (default) and ABBYY (extra). Maybe your case should be tested with both, to see whether one of them gives a different result.

Again, in your case the quality of the document during processing could give false results in the output, so we suggest trying all possible ways:

  1. Zone OCR + Script RegExp
  2. Smart OCR (your actual case)
  3. Script that reads the variable %OCRTEXT% and tries to extract the "data"
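As a rough idea of what option 3 can look like, here is a minimal Python sketch (not the product's own scripting language). It takes the full OCR text as a plain string, finds the line with the label, removes all whitespace after the label and only then checks a strict pattern. How the %OCRTEXT% value is passed into a script is not shown here. The label spelling, the number format (2 uppercase letters followed by 6 digits) and the names `LABEL`, `STRICT_FORMAT` and `extract_after_label` are assumptions for illustration only.

```python
import re

# ASSUMPTION: hypothetical label spelling, e.g. "ChassisNr:" or "Chassis Nr."
LABEL = re.compile(r"chassis\s*nr\.?\s*:?", re.IGNORECASE)
# ASSUMPTION: hypothetical strict format, 2 uppercase letters + 6 digits
STRICT_FORMAT = re.compile(r"[A-Z]{2}\d{6}")


def extract_after_label(full_text):
    """Scan the full-page OCR text line by line and return the value
    that follows the label, ignoring any spaces the OCR inserted."""
    for line in full_text.splitlines():
        label = LABEL.search(line)
        if label is None:
            continue
        # Remove every whitespace character after the label,
        # then validate against the strict format.
        compact = re.sub(r"\s+", "", line[label.end():])
        value = STRICT_FORMAT.match(compact)
        if value is not None:
            return value.group(0)
    return None


sample = "Invoice 1\nChassis Nr: A B12 3456\nTotal 100"
print(extract_after_label(sample))
```

Because the spaces are removed before the strict pattern is applied, the RegEx itself does not need to know where the OCR engine might insert a space. Note that any text following the number on the same line is joined to it as well, so the strict pattern has to be specific enough to stop at the right place.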

Many script examples for options 1 and 3 above are available in our private section. Please try any of them, and if you need anything else, please contact our support for more information or specific help with your documents and version.

Have a nice day.

 

Best regards,

Luca
