Useful Regular Expressions
Introduction
This appendix contains some useful regular expressions to use in Print&Share. Remark: Print&Share also includes a list of example regular expressions (regex). You can find it in the Recognition dialog, on the Specific tab, in the drop-down list of regular expressions.
For more information and an explanation about regular expressions, go to: http://www.regular-expressions.info or a similar website.
Examples
String and text (general)
Example 1
- Goal: Find the number in the text between the dashes.
- Input:
ABC-1234-XYZ
- RegEx:
(?<=-).+(?=-)
- Result:
1234
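To try a pattern like this outside Print&Share, a short script can help. The following is a minimal sketch in Python (the choice of Python and the helper name between_dashes are assumptions of this example, not part of Print&Share). It uses the pattern (?<=-).+(?=-), which returns everything between the first and the last dash.

```python
import re

# (?<=-) requires a dash directly before the match,
# (?=-) requires a dash directly after it.
BETWEEN_DASHES = re.compile(r"(?<=-).+(?=-)")

def between_dashes(text):
    """Return the text between the first and the last dash, or None."""
    match = BETWEEN_DASHES.search(text)
    return match.group(0) if match else None

print(between_dashes("ABC-1234-XYZ"))  # 1234
```

Because .+ is greedy, an input with more than two dashes, such as A-1-2-B, yields 1-2 rather than only the first segment.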
Example 2
- Goal: Find the string KLM in the text.
- Input:
ABCKLMXYZ or ABCklmXYZ
- RegEx:
(KLM)|(klm)
- Result:
KLM or klm
File extensions
Example 1
- Goal: Match all PDF files, that is, file names ending with the .pdf extension.
- Input:
document.pdf
- RegEx:
.*[.]pdf
- Result: document.pdf
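The above regular expression is in its simplest form. If you search for the PDF file name inside a longer text, add word boundaries to it by using \b:
- RegEx:
\b.*[.]pdf\b
\b matches at a word boundary, that is, at the position just before the first or just after the last character of a word. Note that .* also matches spaces, so the match can extend over the words that precede the file name on the same line.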
In case you want to match document.PDF, document.pDf and document.pdf, you can make the regex case insensitive by using (?i):
- RegEx:
(?i).*[.]pdf
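The file extension patterns can be checked with a small script. This is a hedged sketch in Python, not something Print&Share itself runs. The helper names is_pdf_name and pdf_names_in are hypothetical, and the \S+ variant used for searching inside text is an assumption of this example: it replaces .* so that a match cannot run across spaces.

```python
import re

# Simplest form from the appendix: case sensitive.
SIMPLE = re.compile(r".*[.]pdf")

# (?i) at the start of the pattern makes the whole match case insensitive.
CASE_INSENSITIVE = re.compile(r"(?i).*[.]pdf")

# Assumed variant for searching inside text: \S+ stops at whitespace,
# and \b anchors both ends of the file name at word boundaries.
IN_TEXT = re.compile(r"(?i)\b\S+[.]pdf\b")

def is_pdf_name(name):
    """True if the whole name ends in .pdf, in any letter case."""
    return CASE_INSENSITIVE.fullmatch(name) is not None

def pdf_names_in(text):
    """Return all PDF file names found in a piece of text."""
    return IN_TEXT.findall(text)

print(is_pdf_name("document.pDf"))
# True
print(pdf_names_in("See report.pdf and Invoice.PDF, not notes.txt"))
# ['report.pdf', 'Invoice.PDF']
```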
XML-files
Example 1
- Goal: Get the value of an XML-tag.
- Input:
<MY_TAG>my information</MY_TAG>
- RegEx:
(?<=<MY_TAG>).+(?=</MY_TAG>)
- Result:
my information
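The same lookbehind and lookahead idea can be wrapped in a small function for testing. The sketch below is in Python and rests on a few assumptions: the helper name tag_value is hypothetical, the tag has no attributes, and .+? (lazy) is used instead of .+ so that the match stops at the first closing tag when a tag occurs more than once.

```python
import re

def tag_value(xml, tag):
    """Return the text between <tag> and </tag>, or None if not found."""
    name = re.escape(tag)
    # Fixed-width lookbehind on the opening tag, lookahead on the closing tag.
    pattern = rf"(?<=<{name}>).+?(?=</{name}>)"
    match = re.search(pattern, xml)
    return match.group(0) if match else None

print(tag_value("<MY_TAG>my information</MY_TAG>", "MY_TAG"))
# my information
```

A regular expression like this is adequate for simple, flat tags; it is not a replacement for an XML parser when tags are nested or carry attributes.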
Example 2
- Goal: Get the value of an XML-tag and remove leading zeros.
- Input:
<ORDERNUMB>000123</ORDERNUMB>
- RegEx:
(?<=<ORDERNUMB>.*?)[1-9].*(?=</ORDERNUMB>)
- Result:
123
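The lookbehind in this example has a variable length (it contains .*?), which not every regex engine accepts. Python's re module, for instance, only accepts fixed-width lookbehinds. As a hedged alternative for such engines, the sketch below uses a capture group instead; the helper name order_number is hypothetical and the tag name is taken from the example.

```python
import re

# 0* consumes leading zeros; the capture group keeps the remaining digits.
# If the value consists only of zeros, backtracking leaves a single "0".
ORDER_NUMBER = re.compile(r"<ORDERNUMB>0*(\d+)</ORDERNUMB>")

def order_number(xml):
    """Return the order number without leading zeros, or None."""
    match = ORDER_NUMBER.search(xml)
    return match.group(1) if match else None

print(order_number("<ORDERNUMB>000123</ORDERNUMB>"))  # 123
```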
Example 3
- Goal: Get the value of an XML-tag, keeping only the text before a specific phrase.
- Input:
<MY_TAG>my information I don’t need this</MY_TAG>
- RegEx:
(?<=<MY_TAG>.*?).+(?= I don’t need this.*</MY_TAG>)
- Result:
my information
Example 4
- Goal: Get the value of an XML-tag, keeping only the text after a specific phrase.
- Input:
<MY_TAG>I don’t need this my information</MY_TAG>
- RegEx:
(?<=<MY_TAG>.*?I don’t need this ).+(?=</MY_TAG>)
- Result:
my information
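Keeping the text before or after a known phrase can also be tested with capture groups, which avoids variable-length lookbehinds. The following Python sketch is an illustration under stated assumptions: the helper names text_before and text_after are hypothetical, the tag name MY_TAG is taken from the examples, and surrounding whitespace is trimmed with \s*.

```python
import re

PHRASE = "I don’t need this"

def text_before(xml, phrase=PHRASE):
    """Return the tag text that precedes the phrase, or None."""
    pattern = rf"<MY_TAG>\s*(.+?)\s*{re.escape(phrase)}.*?</MY_TAG>"
    match = re.search(pattern, xml)
    return match.group(1) if match else None

def text_after(xml, phrase=PHRASE):
    """Return the tag text that follows the phrase, or None."""
    pattern = rf"<MY_TAG>.*?{re.escape(phrase)}\s*(.+?)\s*</MY_TAG>"
    match = re.search(pattern, xml)
    return match.group(1) if match else None

print(text_before("<MY_TAG>my information I don’t need this</MY_TAG>"))
# my information
print(text_after("<MY_TAG>I don’t need this my information</MY_TAG>"))
# my information
```

re.escape is applied to the phrase so that any characters with a special meaning in a regular expression are matched literally.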