Lugaru's Epsilon Programmer's Editor

Regular Expression Searching


int re_search(int flags, char *pat)
int re_compile(int flags, char *pat)
int re_match()
#define RE_FORWARD      0
#define RE_REVERSE      2
#define RE_FIRST_END    4
#define RE_SHORTEST     8
#define RE_IGNORE_COLOR 16

Several searching primitives work with regular expressions, a kind of pattern that lets you search for complex text. A regular expression is a string formed according to the rules in Regular Expressions.

The re_search( ) primitive searches the buffer for one of these patterns. It operates like the search( ) primitive, taking a direction and pattern and returning 1 if it finds the pattern. It moves to the far end of the pattern from the starting point, and sets matchstart to the near end. If it doesn't find the pattern, or if the pattern is illegal, it returns 0. If the pattern is illegal, point doesn't move; if the pattern is legal but isn't found, point moves to the end of the buffer (or to the beginning, when searching backward).
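The outcomes described above can be summarized in a small model. This is a sketch in plain C, not EEL code that runs inside Epsilon: the struct search_outcome and the function model_search( ) are hypothetical names introduced for illustration, positions are assumed to run from 0 to buf_size, and the match is assumed to span lo to hi with lo <= hi. The text does not say what happens to matchstart when a search fails, so the sketch assumes it stays unchanged.

```c
/* Hypothetical record of the state a search leaves behind. */
struct search_outcome {
    int ret;        /* 1 if the pattern was found, otherwise 0 */
    int point;      /* new value of point */
    int matchstart; /* new value of matchstart */
};

/*
 * Model of the documented outcomes. dir is 1 (forward) or -1 (reverse).
 * legal and found say whether the pattern compiled and whether it
 * matched; lo and hi bound the match and are ignored on failure.
 */
struct search_outcome model_search(int dir, int old_point, int old_matchstart,
                                   int buf_size, int legal, int found,
                                   int lo, int hi)
{
    struct search_outcome r;

    r.ret = 0;
    r.point = old_point;
    r.matchstart = old_matchstart; /* assumption: untouched on failure */

    if (!legal)
        return r;                  /* illegal pattern: point doesn't move */

    if (!found) {
        /* Legal but not found: point goes to the far edge of the buffer. */
        r.point = (dir > 0) ? buf_size : 0;
        return r;
    }

    r.ret = 1;
    if (dir > 0) {
        r.point = hi;              /* far end of the match */
        r.matchstart = lo;         /* near end of the match */
    } else {
        r.point = lo;
        r.matchstart = hi;
    }
    return r;
}
```

For a forward search that matches positions 20 through 25, the model leaves point at 25 and matchstart at 20; for the same match found by a reverse search, the two values are swapped.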

When you specify a direction using 1 or -1, Epsilon selects the first-beginning, longest match, unless the search string overrides this. However, instead of providing a direction (1 or -1) as the first parameter to re_search( ) or re_compile( ), you can provide a set of flags. These let you specify finding the shortest possible match, for example, without altering the search string.

The RE_FORWARD flag searches forward, while the RE_REVERSE flag searches backward. (If you don't include either, Epsilon searches forward.) The RE_FIRST_END flag says to find a match that ends first, rather than one that begins first. The RE_SHORTEST flag says to find the shortest possible match, rather than the longest. However, if the search string contains sequences that specify first-ending, first-beginning, shortest, or longest matches, those sequences override any flags.
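As a minimal sketch of how these flags combine, the following plain C fragment (again, not EEL code that runs inside Epsilon) repeats the constants listed at the top of this section and decodes a flag word. The helper names searches_backward( ), wants_first_end( ), and wants_shortest( ) are hypothetical. The sketch does not model sequences in the search string that override the flags.

```c
/* Flag values as listed at the top of this section. */
#define RE_FORWARD      0
#define RE_REVERSE      2
#define RE_FIRST_END    4
#define RE_SHORTEST     8
#define RE_IGNORE_COLOR 16

/* Hypothetical helpers: each one tests a single bit of a flag word. */

/* Nonzero if the flags ask for a backward search. RE_FORWARD is 0, so
   a flag word without RE_REVERSE means a forward search. */
int searches_backward(int flags)
{
    return (flags & RE_REVERSE) != 0;
}

/* Nonzero if the flags ask for the match that ends first. */
int wants_first_end(int flags)
{
    return (flags & RE_FIRST_END) != 0;
}

/* Nonzero if the flags ask for the shortest possible match. */
int wants_shortest(int flags)
{
    return (flags & RE_SHORTEST) != 0;
}
```

Because each flag occupies its own bit, a caller would combine them with the | operator; RE_REVERSE | RE_SHORTEST, for example, asks for a backward search that prefers the shortest match.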

A pattern may include color class assertions, as described in Regular Expression Assertions. The RE_IGNORE_COLOR flag makes Epsilon ignore such assertions. The do_color_searching( ) subroutine uses this; if your search might include such assertions, calling that subroutine instead of these primitives will take care of ensuring that the buffer's syntax highlighting is up to date.

The re_compile( ) primitive checks a pattern for legality. It takes the same arguments as re_search( ) and returns 1 if the pattern is illegal, otherwise 0. (This is the reverse of the re_search( ) convention, where 0 signals failure.) The re_match( ) primitive reports whether the last-compiled pattern matches at the current location in the buffer, returning the far end of the match if it does, or -1 if it does not.

int parse_string(int flags, char *pat, ?char *dest)
int matches_at(int pos, int dir, char *pat)
int matches_at_length(int pos, int dir, char *pat)
int matches_in(int start, int end, char *pat)

The parse_string( ) primitive looks for a match of a pattern beginning at point, using the same rules as re_match( ). It takes a direction (or flags) and a pattern, like re_compile( ), plus a character pointer dest. It returns the length of the match, or zero if there was no match.

The third argument dest may be a null pointer, or may be omitted entirely. But if it's a pointer to a character array, parse_string( ) copies the characters of the match there, and moves point past them. If the pattern does not match, dest isn't modified.

The matches_at( ) subroutine accepts a regular expression pat and returns nonzero if the given pattern matches at a particular position in the buffer in the given direction. The matches_at_length( ) subroutine is similar, but it returns the length of the match, or zero if there was no match.

The matches_in( ) subroutine accepts a regular expression pat and searches for the pattern in the specified buffer range, returning nonzero to indicate it matches. Neither matches_at( ) nor matches_in( ) moves point.

int find_group(int n, int open)

The find_group( ) primitive tells where in the buffer certain parts of the last pattern matched. It counts opening parentheses used for grouping in the last pattern, numbered from 1, and returns the position it was at when it reached a certain parenthesis. If open is nonzero, it returns the position of the nth left parenthesis; otherwise, it returns the position of the matching right parenthesis. If n is zero, it returns information on the whole pattern. If n is too large or negative, the primitive aborts with an error message. Parentheses that use the syntax (?: ) don't count.
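The numbering rule can be illustrated with a short plain C sketch that counts the groups find_group( ) would number in a pattern string. The function count_numbered_groups( ) is a hypothetical helper, not part of Epsilon. It is deliberately simplified: it treats every ( as a grouping parenthesis unless it begins (?: and it does not account for quoted parentheses or parentheses inside character classes.

```c
/*
 * Count the opening parentheses that would receive a group number,
 * skipping those that start the non-counting (?: ) form.
 * Simplification: quoting and character classes are not handled.
 */
int count_numbered_groups(const char *pat)
{
    int count = 0;
    int i;

    for (i = 0; pat[i] != '\0'; i++) {
        if (pat[i] != '(')
            continue;
        /* pat[i + 1] is safe to read: at worst it is the terminator. */
        if (pat[i + 1] == '?' && pat[i + 2] == ':')
            continue;              /* (?: ) groups don't count */
        count++;
    }
    return count;
}
```

Under these assumptions, the pattern (a)(?:b)(c) has two numbered groups, so the largest valid n for it would be 2, with group 2 referring to (c).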





Lugaru Copyright (C) 1984, 2012 Lugaru Software Ltd. All Rights Reserved.