Lugaru's Epsilon Programmer's Editor 14.04


### Examining Strings

```
int strlen(char *s)
```
Epsilon provides various functions for manipulating strings, or equivalently, zero-terminated arrays of characters. (General-purpose functions for modifying strings are covered in the next section.) The strlen( ) primitive returns the length of a string. That is, it tells the position in the array of the first zero character.

```
int strcmp(char *first, char *second)
int strncmp(char *first, char *second, int count)
```
The strcmp( ) primitive tells if two strings are identical. It returns `0` if all characters in them are the same (and if they have the same length). Otherwise, it returns a negative number if the lexicographic ordering of these strings would put the first before the second. It returns a positive number otherwise. The strncmp( ) primitive is like strcmp( ), except only the first `count` characters matter.
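Since these primitives follow the C conventions of the same names, the return-value rules can be illustrated with the standard C library. The following is a minimal sketch in plain C rather than EEL. The helper `sign_of` and the function `strcmp_demo` are hypothetical names introduced for illustration; only the sign of a comparison result is assumed to be meaningful.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: collapse a comparison result to -1, 0, or 1,
   because only the sign of the result is meaningful. */
static int sign_of(int n)
{
    return (n > 0) - (n < 0);
}

/* A few checks that mirror the rules described above. */
static void strcmp_demo(void)
{
    assert(sign_of(strcmp("apple", "apple")) == 0);       /* identical */
    assert(sign_of(strcmp("apple", "banana")) == -1);     /* first sorts earlier */
    assert(sign_of(strcmp("pear", "peach")) == 1);        /* first sorts later */
    assert(sign_of(strcmp("app", "apple")) == -1);        /* a proper prefix sorts earlier */
    assert(sign_of(strncmp("apple", "apply", 4)) == 0);   /* only 4 characters matter */
    assert(sign_of(strncmp("apple", "apply", 5)) == -1);  /* fifth character differs */
}
```

Testing the sign rather than a specific value keeps code portable, since the text above promises only a negative, zero, or positive result.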

```
int strfcmp(char *first, char *second)
int strnfcmp(char *first, char *second, int count)
int charfcmp(int first, int second)
```
Epsilon also has similar comparison primitives that consider upper case and lower case letters to be equal. The strfcmp( ) primitive acts like strcmp( ) and the strnfcmp( ) primitive acts like strncmp( ), but if the buffer-specific variable `case_fold` is nonzero, Epsilon folds characters before making the comparison, in the same way searching or sorting would. The charfcmp( ) primitive takes two characters and performs the same comparison on them. For characters a and b, `charfcmp('a', 'b')` equals `strfcmp("a", "b")`. (EEL also recognizes the corresponding ANSI C name stricmp( ) instead of strfcmp( ).)
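The folded comparison can be modeled in plain C as follows. This is only a sketch of the behavior described above, not Epsilon's implementation: `fold_char_cmp`, `fold_str_cmp`, and `fold_demo` are hypothetical names, and folding is assumed to be ASCII `tolower()`, whereas Epsilon applies the current buffer's folding rules when `case_fold` is nonzero.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-in for charfcmp(): compare two characters after
   folding. Assumption: folding is plain ASCII tolower(). */
static int fold_char_cmp(int first, int second)
{
    return tolower((unsigned char)first) - tolower((unsigned char)second);
}

/* Hypothetical stand-in for strfcmp(): like strcmp(), but folded. */
static int fold_str_cmp(const char *first, const char *second)
{
    /* Advance while the strings agree and first has not ended. */
    while (*first && fold_char_cmp(*first, *second) == 0) {
        first++;
        second++;
    }
    return fold_char_cmp(*first, *second);
}

static void fold_demo(void)
{
    assert(fold_str_cmp("Hello", "hELLO") == 0);  /* case differences ignored */
    assert(fold_str_cmp("abc", "ABD") < 0);       /* 'c' sorts before 'd' */
    assert(fold_str_cmp("abcd", "ABC") > 0);      /* longer string sorts later */
    /* The character and one-character string comparisons agree. */
    assert(fold_char_cmp('a', 'b') == fold_str_cmp("a", "b"));
}
```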

```
int compare_chars(char *str1, char *str2, int num, int fold)
```
The compare_chars( ) primitive works like strcmp( ), except that it makes no assumptions about zero-termination. It takes two strings and a size, then compares that many characters from each string. If the strings exactly match, compare_chars( ) returns zero. If `str1` would be alphabetically before `str2`, it returns a negative value. If `str2` would be alphabetically before `str1`, it returns a positive value. It ignores the case of the characters when comparing if `fold` is nonzero.
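The difference from strcmp( ) shows up when the data contains zero bytes. The sketch below models the described behavior in plain C; `compare_chars_model` and `compare_chars_demo` are hypothetical names, and folding is assumed to be ASCII `tolower()` rather than Epsilon's own rules.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical model of compare_chars(): compare exactly num characters
   and never stop at a zero byte. Assumption: folding is ASCII tolower(). */
static int compare_chars_model(const char *str1, const char *str2,
                               int num, int fold)
{
    for (int i = 0; i < num; i++) {
        int a = (unsigned char)str1[i];
        int b = (unsigned char)str2[i];
        if (fold) {
            a = tolower(a);
            b = tolower(b);
        }
        if (a != b)
            return a - b;   /* sign gives the ordering */
    }
    return 0;               /* all num characters matched */
}

static void compare_chars_demo(void)
{
    /* The difference lies after an embedded zero byte, so a
       zero-terminated comparison would not see it. */
    assert(compare_chars_model("ab\0cd", "ab\0ce", 5, 0) < 0);
    assert(compare_chars_model("ABC", "abc", 3, 1) == 0);    /* folded */
    assert(compare_chars_model("abcX", "abcY", 3, 0) == 0);  /* only 3 compared */
}
```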

```
char *index(char *s, int ch)
char *rindex(char *s, int ch)
char *strstr(char *s, char *t)
char *strpbrk(char *s, char *charset)
char *strpbrk_cnt(char *s, char *charset, int skip)
```
The index( ) primitive tells if a character `ch` appears in the string `s`. It returns a pointer to the first appearance of `ch`, or a null pointer if there is none. The rindex( ) primitive works the same, but returns a pointer to the last appearance of `ch`. (EEL also recognizes the corresponding ANSI C names strchr( ) instead of index( ) and strrchr( ) instead of rindex( ).)
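Because the returned pointer points into `s`, subtracting `s` from it yields the character's offset. A minimal sketch in plain C, using the ANSI C names strchr( ) and strrchr( ) mentioned above; the wrapper names `first_offset`, `last_offset`, and `index_demo` are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Offset of the first occurrence of ch in s, or -1 if absent. */
static long first_offset(const char *s, int ch)
{
    const char *p = strchr(s, ch);
    return p ? (long)(p - s) : -1;
}

/* Offset of the last occurrence of ch in s, or -1 if absent. */
static long last_offset(const char *s, int ch)
{
    const char *p = strrchr(s, ch);
    return p ? (long)(p - s) : -1;
}

static void index_demo(void)
{
    assert(first_offset("src/eel/str.e", '/') == 3);
    assert(last_offset("src/eel/str.e", '/') == 7);
    assert(first_offset("abc", 'x') == -1);   /* null pointer case */
}
```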

The strstr( ) primitive searches the string `s` for a copy of the string `t`. It returns a pointer to the first appearance of `t`, or a null pointer if there is none. It case-folds as described above for strfcmp( ).

The strpbrk( ) subroutine returns a pointer to the first character in `s` that appears in the list of characters `charset`. Both strings must be null-terminated. If the strings have no characters in common, it returns a null pointer.

The strpbrk_cnt( ) subroutine is similar, but it skips over the first `skip` characters in `s` that also appear in `charset`. For instance, with `skip` set to `1`, it returns a pointer to the second character in `s` that also appears in `charset`.
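One way to picture the `skip` parameter is as repeated calls to strpbrk( ). The following plain C sketch models the described behavior on top of the standard `strpbrk()`; the names `strpbrk_cnt_model` and `strpbrk_cnt_demo` are hypothetical, and this is not necessarily how Epsilon implements the subroutine.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of strpbrk_cnt(): skip the first `skip` characters
   of s that appear in charset, then return a pointer to the next one
   that does, or NULL if there is none. */
static const char *strpbrk_cnt_model(const char *s, const char *charset,
                                     int skip)
{
    const char *p = strpbrk(s, charset);
    while (p != NULL && skip-- > 0)
        p = strpbrk(p + 1, charset);   /* resume just past the last hit */
    return p;
}

static void strpbrk_cnt_demo(void)
{
    const char *s = "a,b;c";
    assert(strpbrk_cnt_model(s, ",;", 0) == s + 1);  /* the ',' */
    assert(strpbrk_cnt_model(s, ",;", 1) == s + 3);  /* the ';' */
    assert(strpbrk_cnt_model(s, ",;", 2) == NULL);   /* no third match */
}
```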

```
int fpatmatch(char *s, char *pat, int prefix, int flags)
#define FPAT_FOLD                      1
#define FPAT_IGNORE_SQUARE_BRACKETS    2
```
The fpatmatch( ) primitive returns nonzero if a string `s` matches a pattern `pat`. It uses a simple filename-style pattern syntax: `*` matches any number of characters, `?` matches a single character, and `[a-z]` matches a character class (with the same character class syntax as other patterns in Epsilon). It also recognizes `|` to permit alternatives. If `prefix` is nonzero, `s` must begin with text matching `pat`; otherwise `pat` must match all of `s`.

The `flags` parameter recognizes two bits. The `FPAT_FOLD` bit makes Epsilon fold characters before comparing, according to the current buffer's folding rules. The `FPAT_IGNORE_SQUARE_BRACKETS` bit makes Epsilon treat the character `[` in a pattern like any other, instead of interpreting it as the start of a character class.
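To make the roles of `prefix` and `FPAT_FOLD` concrete, here is a plain C sketch of a matcher for a subset of this syntax. It handles only `*` and `?`; character classes, `|` alternatives, and `FPAT_IGNORE_SQUARE_BRACKETS` are left out. The names `glob_match` and `glob_demo` are hypothetical, folding is assumed to be ASCII `tolower()`, and nothing here reflects how Epsilon actually implements fpatmatch( ).

```c
#include <assert.h>
#include <ctype.h>

#define FPAT_FOLD 1   /* same value as in the declaration above */

/* Hypothetical model of a subset of fpatmatch(): only `*` and `?`. */
static int glob_match(const char *s, const char *pat, int prefix, int flags)
{
    if (*pat == '\0')                 /* pattern used up */
        return prefix || *s == '\0';  /* leftover text is fine only in prefix mode */
    if (*pat == '*') {
        /* Try letting `*` absorb 0, 1, 2, ... characters of s. */
        for (;; s++) {
            if (glob_match(s, pat + 1, prefix, flags))
                return 1;
            if (*s == '\0')
                return 0;
        }
    }
    if (*s == '\0')
        return 0;                     /* text ran out before the pattern */
    if (*pat != '?') {
        int a = (unsigned char)*s, b = (unsigned char)*pat;
        if (flags & FPAT_FOLD) {
            a = tolower(a);
            b = tolower(b);
        }
        if (a != b)
            return 0;
    }
    return glob_match(s + 1, pat + 1, prefix, flags);
}

static void glob_demo(void)
{
    assert(glob_match("main.c", "*.c", 0, 0));
    assert(!glob_match("main.cpp", "*.c", 0, 0));  /* whole string must match */
    assert(glob_match("main.cpp", "*.c", 1, 0));   /* prefix "main.c" matches */
    assert(glob_match("README", "read??", 0, FPAT_FOLD));
    assert(!glob_match("README", "read??", 0, 0));
}
```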

```
int string_matches_regex(char *str, char *pat, int fold)
int string_matches_pattern(char *str, char *pat)
int regex_replace_in_string(char *dest, char *src, char *pat, char *repl)
```
The string_matches_regex( ) subroutine returns nonzero if the start of the string `str` matches the regular expression pattern `pat`. Use `<eof>` at the end of the pattern to check if the entire string matches. It does case-folding if `fold` is nonzero.

The similar string_matches_pattern( ) subroutine returns the length of the match (which differs from the above only with patterns that can match zero-length text), and uses `case_fold.default`.

Both return zero when given an invalid regular expression pattern.

The regex_replace_in_string( ) subroutine copies `src` to `dest`, performing a regular expression replacement as it does. As in string_replace( ), which it runs, the replacement text may contain `#` sequences to interpolate text from each match. It returns the number of replacements it performed, or `-1` if an error like an invalid search pattern was detected. If `src` and `dest` are the same, the text is updated in place.

```
int word_in_list(char *word, char *list, int fold)
int starts_with_in_list(char *word, char *list, int fold)
```
The word_in_list( ) subroutine returns nonzero whenever the text in word appears in the `|`-separated list of words `list`. "Word" here means any text that doesn't contain an actual `|` character. The list of words must begin and end with `|` delimiters. The similar starts_with_in_list( ) subroutine returns nonzero whenever word starts with one of the words in the list. Both do case-folding if `fold` is nonzero. They are faster than the regular-expression-based subroutines above.
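The requirement that the list begin and end with `|` means every entry is bracketed by delimiters, which makes a simple scan possible. The plain C sketch below models both lookups without case-folding. The names `word_in_list_model`, `starts_with_in_list_model`, and `list_demo` are hypothetical, and treating empty entries as never matching a prefix is an assumption rather than documented behavior.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of word_in_list() without folding:
   look for word between two `|` delimiters. */
static int word_in_list_model(const char *word, const char *list)
{
    size_t len = strlen(word);
    if (strchr(word, '|') != NULL)
        return 0;                       /* a word cannot contain `|` */
    for (const char *p = strchr(list, '|'); p != NULL;
         p = strchr(p + 1, '|')) {
        if (strncmp(p + 1, word, len) == 0 && p[1 + len] == '|')
            return 1;
    }
    return 0;
}

/* Hypothetical model of starts_with_in_list() without folding: true if
   some nonempty entry of the list is a prefix of word. */
static int starts_with_in_list_model(const char *word, const char *list)
{
    const char *p = strchr(list, '|');
    while (p != NULL && p[1] != '\0') {
        const char *end = strchr(p + 1, '|');
        if (end == NULL)
            break;                      /* malformed list: no closing `|` */
        size_t n = (size_t)(end - (p + 1));
        if (n > 0 && strncmp(word, p + 1, n) == 0)
            return 1;
        p = end;
    }
    return 0;
}

static void list_demo(void)
{
    assert(word_in_list_model("int", "|char|int|short|"));
    assert(!word_in_list_model("in", "|char|int|short|"));  /* partial entry */
    assert(starts_with_in_list_model("redo", "|un|re|"));
    assert(!starts_with_in_list_model("do", "|un|re|"));
}
```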
