Lugaru's Epsilon Programmer's Editor


### Character Types

int isspace(int ch)
int isdigit(int ch)
int isalpha(int ch)
int islower(int ch)
int isupper(int ch)
int isalnum(int ch)  /* basic.e */
int isident(int ch)  /* basic.e */
int any_uppercase(char *p)

Epsilon has several primitives for determining whether a character belongs to a certain class. The isspace( ) primitive tells whether its character argument is a space, tab, or newline character. It returns 1 if so, and 0 otherwise.

In the same way, the isdigit( ) primitive tells if a character is a digit (one of the characters 0 through 9), and the isalpha( ) primitive tells if the character is a letter. The islower( ) and isupper( ) primitives tell if the character is a lower case letter or upper case letter, respectively.

The isalnum( ) subroutine returns nonzero if the specified character is alphanumeric: either a letter or a digit. The isident( ) subroutine returns nonzero if the specified character is an identifier character: a letter, a digit, or the _ character.

The any_uppercase( ) subroutine returns nonzero if there are any upper case characters in its string argument p.
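To illustrate how these classes relate to one another, here is a minimal sketch in standard C, whose syntax EEL closely follows. The names sketch_isalnum, sketch_isident, and sketch_any_uppercase are hypothetical stand-ins; they are not the definitions found in basic.e, and the sketch relies on the C library's <ctype.h> functions rather than Epsilon's primitives.

```c
#include <ctype.h>

/* Hypothetical stand-in for isalnum(): a letter or a digit. */
int sketch_isalnum(int ch)
{
    return isalpha(ch) || isdigit(ch);
}

/* Hypothetical stand-in for isident(): a letter, a digit, or '_'. */
int sketch_isident(int ch)
{
    return sketch_isalnum(ch) || ch == '_';
}

/* Hypothetical stand-in for any_uppercase(): nonzero if the string p
   contains at least one upper case character. */
int sketch_any_uppercase(char *p)
{
    for (; *p; p++)
        if (isupper((unsigned char) *p))
            return 1;
    return 0;
}
```

Under these assumptions, sketch_isident('_') is nonzero while sketch_isalnum('_') is zero, and sketch_any_uppercase("fooBar") is nonzero because of the B.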

int tolower(int ch)
int toupper(int ch)

The tolower( ) primitive converts an upper case letter to the corresponding lower case letter, and returns any other character unchanged. The toupper( ) primitive converts a lower case letter to its upper case equivalent, and likewise leaves other characters unchanged.
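A common use of such a conversion is comparing text without regard to case. The following is a minimal sketch in standard C using the <ctype.h> version of tolower( ); the helper name equal_ignoring_case is hypothetical and is not part of Epsilon. Epsilon's own searching and sorting rely on the case-folding property described later in this section, not on this approach.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if a and b are equal when case is
   ignored, else 0. */
int equal_ignoring_case(char *a, char *b)
{
    while (*a && *b) {
        /* The casts keep the arguments in the range <ctype.h> requires. */
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* equal only if both strings ended together */
}
```

Because non-letters pass through tolower( ) unchanged, digits and punctuation must match exactly, while letters match in either case.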

int set_character_property(int ch, int propcode, int value)

You can alter the rules Epsilon uses for determining if a particular character is alphabetic, uppercase, or lowercase, and how Epsilon case-folds when searching, sorting or otherwise comparing text, using the set_character_property( ) primitive. It takes the numeric code of the character whose properties you want to modify, a property code indicating which of its properties to access, and a new value for that property.

The property code CPROP_CTYPE sets whether the isalpha( ), isupper( ), islower( ), and isdigit( ) primitives consider a character alphabetic, uppercase, lowercase, or a digit, respectively. These attributes are independent, though there are conventions for their use. (For instance, only alpha characters generally have a case, no character is both uppercase and lowercase, and so forth.) The bits C_ALPHA, C_LOWER, C_UPPER, and C_DIGIT represent these attributes. The bits also control whether the regular expressions <digit>, <alpha>, <alphanum>, and <word> match these characters; see Character Classes.

The property code CPROP_TOLOWER controls what value the tolower( ) primitive returns for the specified character, and the property code CPROP_TOUPPER controls what value the toupper( ) primitive returns for it.

The property code CPROP_FOLD controls how Epsilon case-folds that character during searching, sorting, and similar functions, whenever case folding is in use. It specifies a replacement character to be used in place of the original during comparisons. The complete set of case-folding properties must follow two rules: if some character X folds to Y, then Y must fold to itself, and character codes below 256 must never fold to a value greater than or equal to 256. (If a particular group of characters should be treated as equal when searching, setting the case folding property of each to the code of the lowest-numbered one is sufficient to comply with these rules.)
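The two rules can be checked mechanically. The sketch below, in standard C, models the case-folding properties as a plain array indexed by character code. That array is an assumption made for illustration; it does not reflect Epsilon's internal storage and does not call set_character_property( ). The names fold_group and fold_table_is_valid are hypothetical.

```c
/* Fold every character in group[0..n-1] to the lowest-numbered member,
   the technique suggested above for treating a group as equal. */
void fold_group(int *fold, int *group, int n)
{
    int i, lowest = group[0];

    for (i = 1; i < n; i++)
        if (group[i] < lowest)
            lowest = group[i];
    for (i = 0; i < n; i++)
        fold[group[i]] = lowest;
}

/* Return 1 if fold[0..size-1] obeys both rules, else 0. */
int fold_table_is_valid(int *fold, int size)
{
    int x;

    for (x = 0; x < size; x++) {
        int y = fold[x];

        if (y < 0 || y >= size)
            return 0;       /* target lies outside the modeled range */
        if (fold[y] != y)
            return 0;       /* rule 1: a fold target must fold to itself */
        if (x < 256 && y >= 256)
            return 0;       /* rule 2: codes below 256 must stay below 256 */
    }
    return 1;
}
```

Folding to the lowest-numbered member satisfies both rules at once: the lowest member folds to itself, and if any member of the group is below 256, then so is the lowest.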

The primitive returns the previous value of the specified property of that character. If the new value is out of range for the property (such as a negative value), it will be ignored, and the primitive will just return the current value. You can use this to retrieve the current properties of a character without changing them.
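The return convention can be pictured with a toy model in standard C. Everything here is hypothetical: model_set_property and model_get_property are illustrative names, the model tracks a single property for codes 0 through 255 only, and it treats only negative values as out of range. It is not Epsilon's implementation.

```c
#define MODEL_SIZE 256                  /* hypothetical: codes 0..255 only */

static int model_value[MODEL_SIZE];     /* one property value per character,
                                           initially all zero */

/* Toy model of the convention: always return the previous value, and
   ignore a new value that is out of range (here, negative). */
int model_set_property(int ch, int value)
{
    int previous = model_value[ch];

    if (value >= 0)
        model_value[ch] = value;
    return previous;
}

/* Read a property without changing it by passing an out-of-range value. */
int model_get_property(int ch)
{
    return model_set_property(ch, -1);
}
```

The same convention supports a save-and-restore pattern: keep the value returned when setting a temporary property, then pass it back later to restore the original.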

Epsilon doesn't store current character properties in its state file. If you want to use non-default properties all the time, write a startup function that calls this primitive. See Starting and Finishing.

Epsilon always starts with character classifications based on standard Unicode properties, except for the Win32 console version. That version, when running with a DOS/OEM character set (see the console-ansi-font variable), begins with its classifications for 8-bit characters set to match the current OEM font.

int get_direction()         /* window.e */

The get_direction( ) subroutine converts the last key pressed into a direction. It understands arrow keys, as well as the equivalent control characters. It returns BTOP, BBOTTOM, BLEFT, or BRIGHT, or -1 if the key doesn't correspond to any direction.
