Regex Class Reference

#include <Pt/Regex.h>

Regular Expressions for Unicode Strings.

Public Member Functions

 Regex ()
 Default Constructor.
 
 Regex (const Pt::Char *ex)
 Construct from regex string.
 
 Regex (const Pt::String &ex)
 Construct from regex string.
 
 Regex (const Regex &other)
 Copy constructor.
 
 ~Regex ()
 Destructor.
 
bool match (const Pt::String &str, RegexSMatch &sm) const
 Matches the regular expression against a string and stores the result in sm.
 
bool match (const Pt::String &str) const
 Returns true if string matches.
 
bool match (const Char *str, RegexSMatch &sm) const
 Matches the regular expression against a string and stores the result in sm.
 
bool match (const Char *str) const
 Returns true if string matches.
 
Regex & operator= (const Regex &other)
 Assignment operator.
 

Detailed Description

The Pt::Regex class matches string patterns in Unicode text. It resembles the std::basic_regex class and can serve as a replacement on systems whose standard C++ implementation does not provide std::basic_regex. The syntax for the match pattern is similar to the extended POSIX syntax. The following table shows the special characters that can be used to write regular expressions:

.       Any character
[ ]     A character in a given set
[^ ]    A character not in a given set
^       Beginning of a line
$       End of a line
\<      Beginning of a word
\>      End of a word
( )     A marked subexpression
*       Matches the preceding element zero or more times
?       Matches the preceding element zero or one time
+       Matches the preceding element one or more times
|       Matches either the expression before or after the operator
\       Escapes the next character
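
The operators above can be tried out with a small helper. The following is only a sketch: it uses std::wregex with the POSIX extended grammar as a stand-in, because Pt::Regex is described as resembling std::basic_regex. The helper name found() is hypothetical, and Pt::Regex may differ in details. In particular, the word-boundary operators \< and \> are not part of the standard extended grammar and are not used here.

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not part of Pt: returns true if `pattern`
// (POSIX extended syntax) matches anywhere inside `text`.
// std::wregex is used as a stand-in for Pt::Regex.
bool found(const std::wstring& pattern, const std::wstring& text)
{
    const std::wregex re(pattern, std::regex_constants::extended);
    return std::regex_search(text, re);
}
```

For instance, found(L"colou?r", L"color") and found(L"cat|dog", L"hot dog") both succeed, while found(L"cats$", L"I like cats!") fails because the text ends in "!". Note that a literal dot has to be written as L"1\\.5" in C++ source, since the backslash itself must be escaped inside a string literal.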

The regular expression is constructed from a Unicode string, either a Pt::String or a null-terminated sequence of Unicode characters of type Pt::Char. It can then be matched against Unicode strings, as shown in the next example:

Pt::String expr = L"[hc]ats";
Pt::Regex regex(expr);
Pt::String str1 = L"I like cats!";
Pt::String str2 = L"I like hats!";
Pt::String str3 = L"I like bats!";
// this does match
bool matched = regex.match(str1);
// this does also match
matched = regex.match(str2);
// this does not match
matched = regex.match(str3);

It is also possible to match a regular expression against a Unicode input string and find out which tokens in the string actually matched. The match() member function has an overload that fills a Pt::RegexSMatch with the result. Note that the first result at index 0 is always the input string itself. The following example illustrates this:

// backslashes are doubled so that the regex sees "\." (a literal dot)
Pt::String expr = L"([0-9]+)\\.([0-9]+)\\.([0-9]+)\\.([0-9]+)";
Pt::Regex regex(expr);
Pt::String str = L"My IP address is 192.168.0.77";
// receives the match results
Pt::RegexSMatch smatch;
bool matched = regex.match(str, smatch);
if(matched)
{
    std::cout << "IP: " << smatch.str(1).narrow() << std::endl;
}
else
{
    std::cout << "No IP in " << smatch.str(0).narrow() << std::endl;
}

Member Function Documentation

bool match ( const Pt::String & str,
RegexSMatch & sm 
) const

The result sm holds pointers into the original string that was matched and therefore must not be used after the original string has been destroyed.

bool match ( const Char * str,
RegexSMatch & sm 
) const

The result sm holds pointers into the original string that was matched and therefore must not be used after the original string has been destroyed.
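
One way to respect this lifetime rule is to copy the matched text out of the result while the source string is still alive. The sketch below illustrates the idea with std::wregex and std::wsmatch, which have the same constraint (the match object refers to the searched string). It is an analogy, not Pt code, and the function name firstNumber() is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not part of Pt: returns an independent copy of the
// first run of digits in `text`, or an empty string if there is none.
std::wstring firstNumber(const std::wstring& text)
{
    static const std::wregex digits(L"[0-9]+", std::regex_constants::extended);

    std::wsmatch m;  // refers to positions inside `text`, owns no characters
    if (!std::regex_search(text, m, digits))
        return std::wstring();

    // str() builds a new string, so the returned value stays valid
    // even after `text` has been destroyed.
    return m.str(0);
}
```

Returning the match object itself from such a function would be unsafe whenever the caller's string is a temporary. Returning a copied string avoids the problem at the cost of one allocation.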