
RGXG(1) User Commands RGXG(1)

NAME

rgxg - ReGular eXpression Generator

SYNOPSIS

rgxg COMMAND [ARGS]

DESCRIPTION

rgxg is a generator for (extended) regular expressions.

For instance, it can generate a regular expression that exactly matches a numeric range or all addresses of a given CIDR block.

COMMANDS

alternation [options] [PATTERN...]

Generate a regular expression that matches any of the given patterns.

Options

-N

Omit the outer parentheses, if any, of the regular expression. This option can be useful if the generated regular expression is used within another alternation.

-h

Display help and exit.

Examples

Match either lion, elephant, rhino, buffalo or leopard:

$ rgxg alternation lion elephant rhino buffalo leopard
(lion|elephant|rhino|buffalo|leopard)
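
As a rough sketch of how the generated alternation might be used, the following script checks a few words against the pattern with grep -E. The pattern is copied from the example output above rather than produced by calling rgxg, and the helper name matches_animal is hypothetical.

```shell
#!/bin/sh
# Pattern copied verbatim from the example output above (rgxg is not invoked).
pattern='(lion|elephant|rhino|buffalo|leopard)'

# Hypothetical helper: succeeds only if the whole argument matches the pattern.
# grep -x requires the match to span the full line, -q suppresses output.
matches_animal() {
    printf '%s\n' "$1" | grep -qxE "$pattern"
}

for word in lion rhino zebra lioness; do
    if matches_animal "$word"; then
        echo "$word: match"
    else
        echo "$word: no match"
    fi
done
```

Because of -x, 'lioness' is rejected even though it contains 'lion'. Without -x, grep would report a match for any line containing one of the words.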

cidr [options] CIDR

Generate a regular expression that matches all addresses of the given CIDR block. Both IPv4 and IPv6 CIDR blocks are supported.

Options

-N

Omit the outer parentheses, if any, of the regular expression. This option can be useful if the generated regular expression is used within another alternation.

-l

Match only IPv6 addresses with lower case letters. By default both lower and upper case letters are matched.

-U

Match only IPv6 addresses with upper case letters. By default both lower and upper case letters are matched.

-u

Do not match IPv6 addresses with zero compression (second form of text representation of IPv6 addresses mentioned in section 2.2 of RFC 4291).

-s

Do not match IPv6 addresses in mixed notation (third form of text representation of IPv6 addresses mentioned in section 2.2 of RFC 4291).

-h

Display help and exit.

Examples

Match 192.168.0.0/24:

$ rgxg cidr 192.168.0.0/24
192.168.0.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])

Match 2001:db8:aaaa:bbbb:cccc:dddd::/112 limited to lower case letters:

$ rgxg cidr -l 2001:db8:aaaa:bbbb:cccc:dddd::/112
2001:0?db8:aaaa:bbbb:cccc:dddd((::[0-9a-f]{1,4}|::|:0?0?0?0(::|:[0-9a-f]{1,4}))|:0.0(.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){2})

Match 2001:db8:1234::/48 restricted to uncompressed standard notation:

$ rgxg cidr -u -s 2001:db8:1234::/48
2001:0?[Dd][Bb]8:1234(:[0-9A-Fa-f]{1,4}){5}

escape [options] STRING

Generate a regular expression that matches the given string literally, by escaping every character that has a special meaning in regular expressions.

Options

Display help and exit.

Examples

Match '1+(2*(3-4))':

$ rgxg escape '1+(2*(3-4))'
1\+\(2\*\(3-4\)\)
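
The effect of escaping can be checked with a small sketch. Both strings are copied from the example above, rgxg itself is not called, and the helper name full_match is hypothetical.

```shell
#!/bin/sh
literal='1+(2*(3-4))'
# Escaped form copied from the example output above.
escaped='1\+\(2\*\(3-4\)\)'

# Hypothetical helper: whole-line ERE match of string $1 against pattern $2.
full_match() {
    printf '%s\n' "$1" | grep -qxE -e "$2"
}

# The escaped pattern matches the literal string.
full_match "$literal" "$escaped" && echo "escaped pattern matches the literal"

# Used as a pattern, the unescaped string does not match itself ...
full_match "$literal" "$literal" || echo "unescaped pattern misses the literal"

# ... because '+', '*' and the parentheses act as operators instead.
full_match '1223-4' "$literal" && echo "unescaped pattern matches 1223-4"
```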

help [COMMAND]

Describe the usage of rgxg or the given COMMAND.

range [options] FIRST [LAST]

Generate a regular expression that matches the number range between FIRST and LAST. If LAST is omitted, the regular expression matches all numbers that are greater than or equal to FIRST. The numbers must be non-negative and given in base-10 notation.

Options

-b BASE

Generate the regular expression for the number range with base BASE. BASE must be between 2 and 32. The default base is 10.

-N

Omit the outer parentheses, if any, of the regular expression. This option can be useful if the generated regular expression is used within another alternation.

-l

For bases greater than 10, only match lower case letters. By default both lower and upper case letters are matched.

-U

For bases greater than 10, only match upper case letters. By default both lower and upper case letters are matched.

-h

Display help and exit.

-z

Only match numbers with leading zeros. By default the number of leading zeros depends on the length (i.e. the number of digits) of LAST (see also -m). The default is to not match numbers with leading zeros.

-Z

Match numbers with a variable number of leading zeros. By default the maximum number of leading zeros depends on the length (i.e. the number of digits) of LAST (see also -m). The default is to not match numbers with leading zeros.

-m LENGTH

With -z or -Z, the minimum LENGTH of matched numbers. For instance, with LENGTH set to 3 and the -z option set, the number 5 is matched as '005'. If LENGTH is less than or equal to the number of digits of LAST, this option has no effect.

Examples

Match the numbers from 0 to 31:

$ rgxg range 0 31
(3[01]|[12]?[0-9])
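
One way to gain confidence in a generated range is to count how many integers it accepts. The sketch below does this for the pattern above, which is copied from the example output and not produced by calling rgxg. The helper name count_matches is hypothetical.

```shell
#!/bin/sh
# Pattern copied from the example output above (rgxg is not invoked).
pattern='(3[01]|[12]?[0-9])'

# Hypothetical helper: print how many integers from $1 to $2 (inclusive)
# match the pattern as a whole line.
count_matches() {
    i=$1
    while [ "$i" -le "$2" ]; do
        echo "$i"
        i=$((i + 1))
    done | grep -cxE "$pattern"
}

# Prints 32: the numbers 0 to 31 match, 32 to 40 do not.
count_matches 0 40
```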

Match numbers from 0 to 31 with base 2:

$ rgxg range -b 2 0 31
(1[01]{0,4}|0)

Match 0 to 31 with base 16:

$ rgxg range -b 16 0 31
1?[0-9A-Fa-f]

Match 0 to 31 with base 16 limited to upper case letters:

$ rgxg range -b 16 -U 0 31
1?[0-9A-F]

Match 0 to 31 with base 16 limited to lower case letters:

$ rgxg range -b 16 -l 0 31
1?[0-9a-f]

Match 00 to 31:

$ rgxg range -z 0 31
(3[01]|[0-2][0-9])

Match 0000 to 0031:

$ rgxg range -z -m 4 0 31
(003[01]|00[0-2][0-9])

Match 0 to 31 and 00 to 31 and 000 to 031:

$ rgxg range -Z -m 3 0 31
(0?3[01]|0?[0-2]?[0-9])
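
The difference between fixed-width (-z) and variable-width (-Z) leading zeros can be seen by testing a few inputs against the two patterns from the examples above. This is only a sketch: the patterns are copied from the example outputs, and the helper name full_match is hypothetical.

```shell
#!/bin/sh
# Patterns copied from the example outputs above (rgxg is not invoked).
fixed='(003[01]|00[0-2][0-9])'        # from: rgxg range -z -m 4 0 31
variable='(0?3[01]|0?[0-2]?[0-9])'    # from: rgxg range -Z -m 3 0 31

# Hypothetical helper: whole-line ERE match of string $1 against pattern $2.
full_match() {
    printf '%s\n' "$1" | grep -qxE -e "$2"
}

for n in 5 05 005 0005 031 032; do
    if full_match "$n" "$fixed"; then f=yes; else f=no; fi
    if full_match "$n" "$variable"; then v=yes; else v=no; fi
    echo "$n: -z -m 4 -> $f, -Z -m 3 -> $v"
done
```

Of these inputs, the fixed-width pattern accepts only '0005', while the variable-width pattern accepts '5', '05', '005' and '031'. Neither accepts '032'.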

Match 0 to 31 and omit outer parentheses:

$ rgxg range -N 0 31
3[01]|[12]?[0-9]
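
The following sketch shows why the outer parentheses matter when the result is combined with other pattern parts. It embeds the pattern above in a larger alternation, then shows what happens when anchors are attached to the bare pattern. The pattern is copied from the example output, and the helper name has_match and the extra branch 'any' are hypothetical.

```shell
#!/bin/sh
# Pattern copied from the example output above (rgxg range -N 0 31).
inner='3[01]|[12]?[0-9]'

# Hypothetical helper: does string $1 contain a match for ERE $2?
has_match() {
    printf '%s\n' "$1" | grep -qE -e "$2"
}

# Inside a larger alternation the bare pattern needs no group of its own.
has_match 'any' "^(${inner}|any)\$" && echo "any: accepted"
has_match '17'  "^(${inner}|any)\$" && echo "17: accepted"

# Pitfall: anchoring the bare pattern directly. '^' binds only to the first
# branch and '$' only to the last, so 310 is accepted via '^3[01]'.
has_match '310' "^${inner}\$" && echo "310: accepted without a group"
has_match '310' "^(${inner})\$" || echo "310: rejected once grouped"
```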

Match all numbers greater than or equal to 4096:

$ rgxg range 4096
([1-9][0-9]{4,}|[5-9][0-9]{3}|4[1-9][0-9]{2}|409[6-9])
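
The boundaries of an open-ended range can be spot-checked in the same way. The sketch below tests values around 4096 against the pattern above, which is copied from the example output and not produced by calling rgxg. The helper name in_range is hypothetical.

```shell
#!/bin/sh
# Pattern copied from the example output above (rgxg is not invoked).
pattern='([1-9][0-9]{4,}|[5-9][0-9]{3}|4[1-9][0-9]{2}|409[6-9])'

# Hypothetical helper: succeeds if the whole argument matches the pattern.
in_range() {
    printf '%s\n' "$1" | grep -qxE "$pattern"
}

for n in 4095 4096 4100 9999 10000 123456; do
    if in_range "$n"; then
        echo "$n: match"
    else
        echo "$n: no match"
    fi
done
```

Only 4095 is rejected. Each branch of the alternation covers one part of the range: numbers with five or more digits, 5000 to 9999, 4100 to 4999, and 4096 to 4099.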

version

Print the version of the rgxg command.

EXIT STATUS

The exit status is 0 if the regular expression has been successfully generated. If an error occurs, the exit status is 1.

NOTES

The regular expressions generated by rgxg are meant to be usable in any context, so they are not anchored. This may lead to some side effects.

For instance consider the following:

$ echo '192.168.0.999' | grep -E "$(rgxg cidr 192.168.0.0/24)"
192.168.0.999
$

This is correct because the regular expression for '192.168.0.0/24' matches the substring '192.168.0.99' of the input.

One can verify this by adding '-o' to grep:

$ echo '192.168.0.999' | grep -oE "$(rgxg cidr 192.168.0.0/24)"
192.168.0.99
$

As rgxg cannot know in which context the generated regular expression is used, it is up to the user to ensure that the regular expression works as expected (e.g. by adding anchors like '^' and '$').

In the example above adding line anchors leads to the expected behaviour:

$ echo '192.168.0.999' | grep -E "^$(rgxg cidr 192.168.0.0/24)$"
$
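
When every candidate is on a line of its own, grep's -x option (match whole lines only) is one alternative to adding explicit anchors. The sketch below assumes that layout. The pattern is copied from the cidr example earlier in this page rather than generated by calling rgxg, and the helper name in_block is hypothetical.

```shell
#!/bin/sh
# Pattern copied from the cidr example output (rgxg is not invoked).
pattern='192.168.0.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])'

# Hypothetical helper: succeeds only if the whole line $1 matches the pattern.
in_block() {
    printf '%s\n' "$1" | grep -qxE "$pattern"
}

in_block 192.168.0.99  && echo "192.168.0.99: match"
in_block 192.168.0.999 || echo "192.168.0.999: no match"
```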

SEE ALSO

regex(7)

AUTHOR

Hannes von Haugwitz <hannes@vonhaugwitz.com>

April 12, 2020 rgxg 0.1.2