Extract an Account number from various documents using VB script and name the file

  • Last Post 21 October 2022
Ianw posted this 21 October 2022

Hi,

I am using a VB script (below) to search for an account number that starts with one of various number/letter prefixes, then capture the whole account number to name the file.

E.g. account numbers start with prefixes such as MP0, AP0, XP0, 0P0, 30P0, 33P0, etc. If no account is found, the file is named ## NOT FOUND ##.

My script runs fine, but when it looks for 30P0 it names the file 0P0, because 0P0 is also in the list of prefixes to search for. I assume this is a Windows thing?

If I remove 0P0 from the search string in the script, it names the file 30P0 as it should.

Is there anything that can be modified in the script so that it finds both 30P0 and 0P0 and names the files correctly?

An actual account number is in the format 30P012345.

Hope this makes sense!

ocrText = Metadata.Values("OCRTEXT")

On Error Resume Next '###needed to skip the error handling

'Variables contain the regular expressions that will be use to look for the required information, will need to be modified when more combinations may occur
Dim Target, Target2
Target = ".*(EP0|NP0|MP0|AP0|SP0|FP0|WP0|XP0|HP0|TP0|KP0|BP0|UP0|UPO|RPO|LPO|RP0|LP0|BPO|CP0|MPo|MPO|APo|SPo|UPo|FPo|FPO|WPo|XPo|HPo|TPo|KPo|BPo|CPo|EPo|NPo|29PO|30PO|31PO|29P0|30P0|31P0|32P0|32PO|33PO|33P0|3opo|3op0|30PO|30po|dpo|Dp0|Dpo|OPO|opo|0p0|0P0|OP0)([A-z]*[0-9]+)"
'Target2 = ""

Dim arrLines
arrLines = Split(ocrText, "\r\n")

call Metadata.SetValues("MY_TEXT", "## NOT FOUND ##")
'call Metadata.SetValues("MY_SUPPLIER", "## NOT FOUND ##")

Dim matchedValue

'Then you can iterate it like this
For Each strline in arrLines
    match = GetFirstMatch(target, strline)
    If match <> "" Then
        call Metadata.SetValues("MY_TEXT", match)
    End If
    'match = GetFirstMatch(target2, strline)
    'If match <> "" Then
    'call Metadata.SetValues("MY_SUPPLIER", match)
    'End If
Next

' Get the first objRE submatch from the string
' Returns empty string if not found, otherwise returns the matched string
Function GetFirstMatch(PatternToMatch, StringToSearch)
    Dim objRE, CurrentMatch, objMatch

    Set objRE = New RegExp
    objRE.Pattern = PatternToMatch
    objRE.IgnoreCase = True
    objRE.Global = False

    Set objMatch = objRE.Execute(StringToSearch)

    GetFirstMatch = ""
    ' We should get only 1 match since the Global property is FALSE
    If objMatch.Count = 1 Then
        ' Item(0) is the (first and only) matching target parts,
        ' Submatches(1) is the substring between the second set of
        ' parentheses (all indexes are zero based)
        GetFirstMatch = objMatch.Item(0).Submatches(0) + objMatch.Item(0).Submatches(1)
    End If

    Set objRE = Nothing
End Function
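
A note on the behaviour described above: it does not look like a Windows issue. The pattern starts with a greedy .* which first swallows the whole line and then backs off one character at a time, so the prefix that begins furthest to the right is tried first. In 30P012345 that is 0P0 (starting at the second character), which is found before 30P0. The sketch below reproduces this with a shortened prefix list and shows one possible change: replacing the leading .* with \b (a word boundary). It assumes the account number is preceded by a space or the start of the line in the OCR text. The function name FirstAccount and the sample strings are made up for the illustration, and WScript.Echo is only there so it can be run with cscript.exe outside the product.

```vbscript
' Sketch only: a shortened prefix list stands in for the full one in the script.
Option Explicit

Dim greedyPattern, fixedPattern
greedyPattern = ".*(30P0|33P0|0P0|MP0)([A-z]*[0-9]+)"   ' same shape as the original
fixedPattern  = "\b(30P0|33P0|0P0|MP0)([A-z]*[0-9]+)"   ' leading .* replaced by \b

WScript.Echo FirstAccount(greedyPattern, "Account 30P012345")  ' 0P012345
WScript.Echo FirstAccount(fixedPattern,  "Account 30P012345")  ' 30P012345
WScript.Echo FirstAccount(fixedPattern,  "Account 0P012345")   ' 0P012345
WScript.Echo FirstAccount(fixedPattern,  "no account here")    ' ## NOT FOUND ##

' Hypothetical helper: returns prefix + remainder of the first match,
' or the same placeholder the script uses when nothing matches.
Function FirstAccount(patternToMatch, stringToSearch)
    Dim re, matches
    Set re = New RegExp
    re.Pattern = patternToMatch
    re.IgnoreCase = True
    re.Global = False
    Set matches = re.Execute(stringToSearch)
    If matches.Count = 1 Then
        FirstAccount = matches.Item(0).SubMatches(0) & matches.Item(0).SubMatches(1)
    Else
        FirstAccount = "## NOT FOUND ##"
    End If
End Function
```

If the OCR sometimes glues the account number to the preceding word, \b would stop it from matching. In that case simply dropping the leading .* (with no \b) may be the safer variant, since the search then runs left to right and reaches 30P0 before 0P0.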

 

 

luca.scarpati posted this 21 October 2022

 Hi Ian,

 

If I understand correctly what you need, just change this line:

   GetFirstMatch = objMatch.Item(0).Submatches(0)

This way, for example, with account number 30P012345 the MY_TEXT variable will contain the value 0P0.

 

Have a nice weekend!

 

Best regards,

Luca
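
For anyone unsure what the two Submatches expressions return, here is a minimal sketch using the same pattern shape as the script (shortened to two prefixes, which is an assumption for brevity). Submatches(0) is the text matched by the first pair of parentheses (the prefix) and Submatches(1) is the text matched by the second pair (the rest of the account number). WScript.Echo is used so the snippet can be run with cscript.exe.

```vbscript
' Sketch only: two prefixes stand in for the full list in the script.
Option Explicit

Dim re, matches, firstMatch
Set re = New RegExp
re.Pattern = ".*(30P0|0P0)([A-z]*[0-9]+)"
re.IgnoreCase = True
re.Global = False

Set matches = re.Execute("30P012345")
Set firstMatch = matches.Item(0)

WScript.Echo firstMatch.SubMatches(0)                              ' 0P0
WScript.Echo firstMatch.SubMatches(1)                              ' 12345
WScript.Echo firstMatch.SubMatches(0) & firstMatch.SubMatches(1)   ' 0P012345
```

So returning only Submatches(0) yields just the prefix, while the original line returns the prefix plus the digits that follow it.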

Ianw posted this 21 October 2022

Thanks for the reply. I am no VB expert and I'm lost. What do you mean?

luca.scarpati posted this 21 October 2022

Hi Ian,

 

Open your script and replace the line of code that starts with GetFirstMatch = ......

with this: GetFirstMatch = objMatch.Item(0).Submatches(0)

 

If you need something more specific, please write to our support and send more information about what you are trying to do. They may be able to help you or point you in the right direction with some examples.

 

Best regards,

Luca
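
One more detail in the script from the first post that may be worth checking: VBScript string literals have no backslash escapes, so Split(ocrText, "\r\n") splits on the four literal characters \r\n, not on a carriage return plus line feed. Whether that matters depends on what the OCRTEXT value actually contains, which is not shown in this thread. The sketch below uses a made-up two-line sample and compares both delimiters; vbCrLf is the built-in VBScript constant for a real line break.

```vbscript
' Sketch only: sampleText is invented; real OCRTEXT content may differ.
Option Explicit

Dim sampleText
sampleText = "Invoice 1001" & vbCrLf & "Account 30P012345"

' Number of pieces when splitting on the literal characters \r\n
WScript.Echo UBound(Split(sampleText, "\r\n")) + 1   ' 1

' Number of pieces when splitting on a real carriage return + line feed
WScript.Echo UBound(Split(sampleText, vbCrLf)) + 1   ' 2
```

If the text is never split, the loop in the script still runs once over the whole text, so it can appear to work while only ever returning a single match.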
