Regular expressions consist of alphanumeric characters and a number of syntax elements which are considered non-alphanumeric.
Here you can find most of the RegEx syntax elements that can be used in regular expressions in Querix 4gl programs.
The following modifiers act as flags on a regular expression and change how the search is performed.
|
m |
treats the string as multiple lines, so that ^ and $ match at the beginning and end of each line: /(\w+)/m |
|
s |
treats the string as a single line, so that . also matches the newline character: /(\w+)/s |
|
i |
makes the search pattern case-insensitive: /word/i -- matches "word", "Word", "WORD", "wOrD", etc. regardless of case |
|
x |
allows the pattern to include whitespace and comments: /(\w+) (\w+)/x -- the space is ignored, so the pattern matches "runtime" but not "run time"; /(\d+)#(\w+)/x -- treats all the characters after # as a comment, so the pattern matches "7" but not "word" |
|
g |
replaces every occurrence of the pattern in the string rather than only the first one (makes sense only with util.REGEX.replace) |
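The effect of the modifiers can be tried out in any engine with Perl-style flags. The sketch below uses Python's `re` module as a stand-in, on the assumption that its flags re.M, re.S, re.I and re.X behave like m, s, i and x here; it does not use the Querix util.REGEX API, and the `count` argument of `re.sub` only approximates the g modifier.

```python
import re

text = "first line\nsecond line"

# m: ^ and $ match at every line, not only at the ends of the whole string.
starts_plain = re.findall(r"^\w+", text)
starts_multi = re.findall(r"^\w+", text, re.M)

# s: the dot also matches the newline character.
dot_plain = re.search(r"first.*second", text)
dot_single = re.search(r"first.*second", text, re.S)

# i: letter case is ignored.
case_hits = re.findall(r"word", "word Word WORD wOrD", re.I)

# x: whitespace in the pattern is ignored and # starts a comment.
verbose = re.fullmatch(r"(\w+) (\w+)", "runtime", re.X)
verbose_spaced = re.fullmatch(r"(\w+) (\w+)", "run time", re.X)
commented = re.fullmatch(r"(\d+)#(\w+)", "7", re.X)

# g (approximation): replace only the first occurrence or all of them.
first_only = re.sub(r"a", "o", "banana", count=1)
every = re.sub(r"a", "o", "banana")

print(starts_plain, starts_multi)
print(first_only, every)
```

Here `fullmatch` is used for the x examples so that the pattern has to cover the whole string, which is how the "matches ... but not ..." notes in the table are meant to be read.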
|
[] |
matches any one of the characters listed between the brackets, that is, defines a set (class) of characters: /[ab]c/ -- matches "ac" or "bc" but not "abc"; /[ab]+/ -- matches any non-empty string of a's and b's like "ab", "aabbab", "babbaabbbaaa", etc.; /([ab]+)c/ -- matches any non-empty string of a's and b's followed by c like "abc", "aabbabc", "babbaabbbaaac", etc. |
|
[x-y] |
matches any one character in the range from x to y inclusive, in ASCII order: /[a-d]/ -- matches "a", "b", "c", or "d" but not "e", "f", etc.; /[a-d]+/ -- matches any non-empty combination of a's, b's, c's, and d's like "a", "abcd", "acbddd" |
|
[\-] |
matches the hyphen (-) character: /[\-]/ -- matches "-" in "fifty-fifty" |
|
[\n] |
matches the newline character |
|
[^smth] |
matches any character except those listed after ^: /[^c]/ -- matches any single character other than c; /[^c]+/ -- matches any combination of characters that does not contain c like "hdfj" but not "hdcfj" |
|
^ |
matches the beginning of the string; with the m modifier, also matches the beginning of each line |
|
$ |
matches the end of the string; with the m modifier, also matches the end of each line |
|
. |
matches any character except the newline character |
|
| |
alternation, matches either the expression on its left or the one on its right: /(a*)|(b*)/ -- matches "a", "aa", "aaa", etc., or "b", "bb", "bbb", etc. but not "ab", "aabb" |
|
() |
allows a part of a regular expression to be treated as a single unit: /(ab)c/ -- matches "abc"; /(a|b)c/ -- matches "ac" or "bc" but not "abc"; /(\d\d):(\d\d):(\d\d)/ -- matches time values given in the hh:mm:ss format |
|
\ |
quotes (escapes) the following metacharacter so that it is matched literally: /\|/ -- matches "|" |
|
* |
matches 0 or more times (={0,}): /(a*)/ -- matches "", "a", "aa", "aaa", etc. |
|
+ |
matches 1 or more times (={1,}): /(a+)/ -- matches "a", "aa", "aaa", etc. but not ""; /(a++a)/ -- never matches "aaaa", as the possessive a++ takes all the a's and leaves nothing for the remaining part of the pattern |
|
? |
matches 0 or 1 time (={0,1}); placed after another quantifier, makes it match the shortest possible string: /(a?)/ -- matches "" and "a" but not "aaa" |
|
{n} |
repetition = matches exactly n times: /(a{5})/ -- matches "aaaaa" but not "aa", "aaa", "aaaaaaaaaa" etc. |
|
{n,} |
matches at least n times: /(a{5,})/ -- matches "aaaaa" and "aaaaaaaaaa" but not "aa", "aaa", etc. |
|
{,m} |
matches no more than m times: /(a{,5})/ -- matches "aa", "aaa", and "aaaaa" but not "aaaaaaaaaa" |
|
{n,m} |
matches at least n but no more than m times: /(a{2,5})/ -- matches "aa", "aaa", "aaaa", and "aaaaa" but not "a" or "aaaaaaaaaa" |
|
\w |
matches any word character. Word characters include all alphanumeric characters, _ (the underscore character), other connector punctuation characters, and Unicode marks |
|
\W |
matches non-"word" characters |
|
\s |
matches a whitespace character |
|
\S |
matches any non-whitespace characters |
|
\d |
matches decimal digits (0-9) |
|
\D |
matches non-digits |
|
\t |
matches a tab character |
|
\n |
matches a newline character |
|
\N |
matches any character except the newline character ("\n") |
|
\b |
matches a word boundary. A word boundary is a spot between two characters that has a word character (\w) on one side and a non-word character (\W) on the other, counting the imaginary characters off the beginning and end of the string as matching \W. Within character classes, \b represents backspace rather than a word boundary, just as it normally does in any double-quoted string. For more details, refer to the Perl documentation. |
|
\B |
matches any position that is not a word boundary |
|
\A |
matches only the beginning of the string |
|
\z |
matches only the end of the string |
|
\Z |
matches only at the end of the string or before a newline character at the end of the string |
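The examples in the table can be checked mechanically. The following is a minimal sketch that again uses Python's `re` module as a stand-in for a Perl-style engine rather than the Querix util.REGEX API; the helper `whole` is hypothetical and requires the pattern to cover the entire string, which is how the "matches ... but not ..." notes above are read. The \z, \Z and \N escapes and possessive quantifiers are left out because their support in Python differs from Perl.

```python
import re

def whole(pattern, text):
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# (pattern, candidate, expected result) triples based on the table above.
cases = [
    (r"[ab]c", "ac", True),
    (r"[ab]c", "abc", False),
    (r"[ab]+", "aabbab", True),
    (r"[a-d]+", "acbddd", True),
    (r"[a-d]+", "abe", False),
    (r"[^c]+", "hdfj", True),
    (r"[^c]+", "hdcfj", False),
    (r"(a*)|(b*)", "aaa", True),
    (r"(a*)|(b*)", "ab", False),
    (r"(a|b)c", "bc", True),
    (r"\|", "|", True),
    (r"a{5}", "aaaaa", True),
    (r"a{5}", "aaa", False),
    (r"a{5,}", "aaaaaaaaaa", True),
    (r"a{2,5}", "a", False),
    (r"a{2,5}", "aaaa", True),
    (r"a?", "aaa", False),
]
mismatches = [(p, t) for p, t, expected in cases if whole(p, t) != expected]

# Grouping: each pair of parentheses captures one part of the time value.
time_parts = re.fullmatch(r"(\d\d):(\d\d):(\d\d)", "12:34:56").groups()

# Greedy versus shortest match: ? after a quantifier makes it lazy.
greedy = re.search(r"a+", "aaa").group()
lazy = re.search(r"a+?", "aaa").group()

# \b matches at word boundaries, \B everywhere else.
bounded = re.findall(r"\bcat\b", "cat concat cats cat.")
unbounded = re.findall(r"\Bcat", "cat concat")

# ^ and $ match per line only with the m modifier; \A never does.
lines_plain = re.findall(r"^\w+$", "one\ntwo")
lines_multi = re.findall(r"^\w+$", "one\ntwo", re.M)
string_start = re.findall(r"\A\w+", "one\ntwo", re.M)

print("mismatches:", mismatches)
print(time_parts, greedy, lazy)
```

Keeping the expectations in a list of triples makes it easy to add further rows from the table and rerun the check against whichever engine is actually in use.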