Lugaru's Epsilon Programmer's Editor 14.04
Character Classes
In place of any letter, you can specify a character class. A character class consists of a sequence of characters between square brackets. For example, the character class [adef] stands for any of the following characters: "a", "d",
"e", or "f".
In place of a letter in a character class, you can specify a range of
characters using a hyphen: the character class
To specify the complement of a character class, put a caret as the
first character in the class. Using the above examples, the class
If you need to put a right square bracket character in
a character class, put it immediately after the opening
left square bracket, or in the case of an inverted character
class, immediately after the caret. For example, the class
To include the hyphen character
Any regular expression you can write with character classes you can also write without character classes. But character classes sometimes let you write much shorter regular expressions.
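The equivalence between a character class and a longer pattern without one can be tried out in Python's `re` module, which uses the same bracket notation for simple classes. This is only an illustrative sketch in Python, not Epsilon's own regular expression engine, and the word list is made up for the example.

```python
import re

# The same pattern written with a character class and with plain alternation.
with_class = re.compile(r"b[aeiou]t")
without_class = re.compile(r"b(?:a|e|i|o|u)t")

# Hypothetical sample words, chosen only for illustration.
words = ["bat", "bet", "bit", "bot", "but", "byt", "boot"]

class_hits = [w for w in words if with_class.fullmatch(w)]
alt_hits = [w for w in words if without_class.fullmatch(w)]

# Both patterns accept exactly the same words; the class form is shorter.
print(class_hits == alt_hits)
```

The alternation form grows with every extra character, while the class form adds only one character per alternative.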
The period character (outside a character class)
represents any character except a <Newline>. For example, the
pattern
You can also specify a character class using a
variant of the angle bracket syntax described in the previous section
for entering special characters. The expression
<Comma|Period|Question> represents any one of those three
punctuation characters. The expression
You can also use a few character class names that match some common sets of characters.
The character class <hspace> includes <Space> and <Tab>, plus all Unicode characters in category Z (separators). The name <wspace> includes all those plus <Newline>. Similarly, the character classes for digits, letters, and so forth include all Unicode characters of the appropriate category.
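The membership rule described for <hspace> and <wspace> can be approximated in Python with the standard `unicodedata` module. The function names `is_hspace` and `is_wspace` are hypothetical helpers invented for this sketch. They follow the description above, not Epsilon's actual implementation.

```python
import unicodedata

def is_hspace(ch: str) -> bool:
    """Approximate <hspace>: Space, Tab, or any Unicode category Z character."""
    return ch in (" ", "\t") or unicodedata.category(ch).startswith("Z")

def is_wspace(ch: str) -> bool:
    """Approximate <wspace>: everything in <hspace>, plus Newline."""
    return is_hspace(ch) or ch == "\n"

# A no-break space (U+00A0) is in category Zs, so it counts as horizontal space.
print(is_hspace("\u00a0"), is_hspace("\n"), is_wspace("\n"))
```

Python's Unicode tables may be from a different Unicode version than Epsilon's, so membership at the edges could differ.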
You can match all characters with a particular Unicode
property, using the syntax <p:hex-digit>. After the
You can combine character classes using addition, subtraction, or intersection. Addition means a matching character can be in either of two classes, as in <alpha|digit> to match either alphabetic characters or digits. Intersection means a matching character must be a member of both classes, as in <p:HexDigit&p:numeric-type=decimal>, which matches characters with the HexDigit binary Unicode property that also have a Numeric-Type property of Decimal. Subtraction means a matching character must be a member of one class but not another, as in <p:currency-symbol&!dollar sign&!cent sign> which matches all characters with the Currency-Symbol property except for the dollar sign and cent sign characters.
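Addition, intersection, and subtraction of character classes behave like the corresponding operations on sets of characters. The Python sketch below models this with ordinary sets. It rests on several assumptions: the helper `chars_where` is hypothetical, Python's `str.isalpha`, `str.isdigit`, and `str.isdecimal` stand in for Epsilon's classes, general category `Sc` stands in for the Currency-Symbol property, the HexDigit set is limited to ASCII, and only code points below U+3000 are scanned to keep the example small.

```python
import unicodedata

def chars_where(predicate, limit=0x3000):
    """Collect the characters below `limit` that satisfy `predicate`."""
    return {chr(cp) for cp in range(limit) if predicate(chr(cp))}

# Stand-ins for Epsilon's classes (assumptions, see the text above).
alpha = chars_where(str.isalpha)
digit = chars_where(str.isdigit)
decimal = chars_where(str.isdecimal)
currency = chars_where(lambda c: unicodedata.category(c) == "Sc")
ascii_hex = set("0123456789abcdefABCDEF")  # simplified HexDigit: ASCII only

# Addition: a character may belong to either class, like <alpha|digit>.
alpha_or_digit = alpha | digit

# Intersection: a character must belong to both classes.
hex_and_decimal = ascii_hex & decimal

# Subtraction: currency symbols except the dollar sign and the cent sign.
currency_minus = currency - {"$", "\u00a2"}

print(sorted(hex_and_decimal))
```

In this simplified model the intersection leaves only the ten ASCII digits, since the letters a through f are hex digits but not decimal digits.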
More precisely, we can say that inside the angle brackets you can put
one or more character "rules", each separated from the next by
either a vertical bar
Each character rule may be a character specification or a range, a
character class name from the table above, or a Unicode property
specification using the
Separately, Epsilon recognizes the syntax <h:0d 0a 45> as a shorthand to search for a series of characters by their
hexadecimal codes. This example searches for a carriage return (hex 0d), a line feed (hex 0a), and the letter "E" (hex 45), in that order.
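To see which characters the hexadecimal codes in the <h:0d 0a 45> example stand for, they can be decoded with a few lines of Python. This only illustrates the code-to-character mapping. It does not show how Epsilon itself parses the syntax, and the variable names are made up for the example.

```python
# The hexadecimal codes from the <h:0d 0a 45> example, separated by spaces.
hex_codes = "0d 0a 45"

# Convert each code to its integer value, then to the matching character.
chars = "".join(chr(int(code, 16)) for code in hex_codes.split())

# repr() makes the control characters visible.
print(repr(chars))
```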