regexec Subroutine

Purpose

Compares the null-terminated string specified by the value of the String parameter against the compiled basic or extended regular expression Preg, which must have previously been compiled by a call to the regcomp subroutine.

Library

Standard C Library (libc.a)

Syntax

#include <regex.h>

int regexec (Preg, String, NMatch, PMatch, EFlags)
const regex_t * Preg;
const char * String;
size_t  NMatch;
regmatch_t * PMatch;
int  EFlags;

Description

The regexec subroutine compares the null-terminated string in the String parameter with the compiled basic or extended regular expression in the Preg parameter initialized by a previous call to the regcomp subroutine. If a match is found, the regexec subroutine returns a value of 0. The regexec subroutine returns a nonzero value if it finds no match or it finds an error.

If the NMatch parameter has a value of 0, or if the REG_NOSUB flag was set on the call to the regcomp subroutine, the regexec subroutine ignores the PMatch parameter. Otherwise, the PMatch parameter points to an array of at least the number of elements specified by the NMatch parameter. The regexec subroutine fills in the elements of the array pointed to by the PMatch parameter with offsets of the substrings of the String parameter. The offsets correspond to the parenthetic subexpressions of the original pattern parameter that was specified to the regcomp subroutine.

The rm_so member of each element is the byte offset of the beginning of the substring, and the rm_eo member is one greater than the byte offset of the end of the substring. Subexpression i begins at the ith matched open parenthesis, counting from 1. Element 0 of the array corresponds to the entire pattern. Unused elements of the PMatch parameter, up to PMatch[NMatch-1], are filled with -1. If the pattern contains more subexpressions than the number specified by the NMatch parameter (the pattern itself counts as a subexpression), only the first NMatch-1 subexpressions are recorded.

When a basic or extended regular expression is being matched, any given parenthetic subexpression of the pattern parameter might match several different substrings of the String parameter. Alternatively, it might not match any substring even though the pattern as a whole did match.

The following rules are used to determine which substrings to report in the PMatch parameter when regular expressions are matched:

If the REG_NOSUB flag was set in the cflags parameter in the call to the regcomp subroutine, and the NMatch parameter is not equal to 0 in the call to the regexec subroutine, the content of the PMatch array is unspecified.

If the REG_NEWLINE flag was not set in the cflags parameter when the regcomp subroutine was called, then a new-line character in the pattern or String parameter is treated as an ordinary character. If the REG_NEWLINE flag was set when the regcomp subroutine was called, the new-line character is treated as an ordinary character except as follows:

Parameters

Item Description
Preg Contains the compiled basic or extended regular expression to compare against the String parameter.
String Contains the data to be matched.
NMatch Contains the number of subexpressions to match.
PMatch Contains the array of offsets into the String parameter that match the corresponding subexpression in the Preg parameter.
EFlags Contains the bitwise inclusive OR of 0 or more of the flags that control the behavior of the regexec subroutine.

The EFlags parameter modifies the interpretation of the contents of the String parameter. It is the bitwise inclusive OR of 0 or more of the following flags, which are defined in the regex.h file:

REG_NOTBOL
The first character of the string pointed to by the String parameter is not the beginning of the line. Therefore, the ^ (circumflex), when used as a special character, does not match the beginning of the String parameter.
REG_NOTEOL
The last character of the string pointed to by the String parameter is not the end of the line. Therefore, the $ (dollar sign), when used as a special character, does not match the end of the String parameter.

Return Values

On successful completion, the regexec subroutine returns a value of 0 to indicate that the contents of the String parameter matched the contents of the pattern parameter, or returns REG_NOMATCH to indicate that no match occurred. The REG_NOMATCH value is defined in the regex.h file.

Error Codes

If the regexec subroutine is unsuccessful, it returns a nonzero value indicating the type of problem. The following macros for possible error codes that can be returned are defined in the regex.h file:

Item Description
REG_NOMATCH Indicates the basic or extended regular expression was unable to find a match.
REG_BADPAT Indicates a basic or extended regular expression that is not valid.
REG_ECOLLATE Indicates a collating element referenced that is not valid.
REG_ECTYPE Indicates a character class-type reference that is not valid.
REG_EESCAPE Indicates a trailing \ (backslash) in the pattern.
REG_ESUBREG Indicates a number in \digit is not valid or is in error.
REG_EBRACK Indicates a [ ] (left and right brackets) imbalance.
REG_EPAREN Indicates a \( \) (backslash, left parenthesis, backslash, right parenthesis) or ( ) (left and right parentheses) imbalance.
REG_EBRACE Indicates a \{ \} (backslash, left brace, backslash, right brace) imbalance.
REG_BADBR Indicates the content of \{ \} (backslash, left brace, backslash, right brace) is unusable (not a number, number too large, more than two numbers, or first number larger than second).
REG_ERANGE Indicates an unusable end point in range expression.
REG_ESPACE Indicates out of memory.
REG_BADRPT Indicates a ? (question mark), * (asterisk), or + (plus sign) not preceded by a valid basic or extended regular expression.

If the value of the Preg parameter to the regexec subroutine is not a compiled basic or extended regular expression returned by the regcomp subroutine, the result is undefined.

Examples

The following example demonstrates how the REG_NOTBOL flag can be used with the regexec subroutine to find all substrings in a line that match a pattern supplied by a user. (For simplicity, very little error-checking is done in this example.)

const char *cursor = buffer;          /* start of the unsearched text */
(void) regcomp (&re, pattern, 0);
/* This call to regexec finds the first match on the line. */
error = regexec (&re, cursor, 1, &pm, 0);
while (error == 0) {   /* while matches found */
    /* substring found between cursor + pm.rm_so and cursor + pm.rm_eo */
    cursor += pm.rm_eo;   /* offsets are relative to the string passed in */
    /* This call to regexec finds the next match. */
    error = regexec (&re, cursor, 1, &pm, REG_NOTBOL);
}