Regular Expression Syntax cheat sheet.

Published: 5th of August, 2011
Regular Expression Syntax
Expression Description Examples and expansions
Single character expressions
. any single character spi.e matches "spice", "spike", etc.
\char for a non-alphanumeric char, matches char literally \* matches "*"
\n new line character
\r carriage return character
\t tab character
[...] any single character listed in the brackets [abc] matches "a", "b", or "c"
[...-...] any single character in the range [0-9] matches "0" or "1" ... or "9"
[^...] any single character not listed [^sS] matches one character that is neither "s" nor "S"
[^...-...] any single character not in the range [^A-Z] matches one character that is not an uppercase letter
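The single-character expressions above can be tried out with a short script. This is a minimal sketch that assumes Python's standard `re` module; the sample words and variable names are illustrative and not from the original table. Note that in Python, `.` does not match a newline unless `re.DOTALL` is set.

```python
import re

# "." stands for exactly one character, so "spie" (too short) is rejected.
words = ["spice", "spike", "spie", "spine"]
dot_matches = [w for w in words if re.fullmatch(r"spi.e", w) is not None]

# A backslash makes a non-alphanumeric character literal.
star_found = re.search(r"\*", "2 * 3") is not None

# Bracket expressions match exactly one character from a set or range.
abc = re.findall(r"[abc]", "cabbage")       # ['c', 'a', 'b', 'b', 'a']
digits = re.findall(r"[0-9]", "a1b22c")     # ['1', '2', '2']

# A leading "^" inside the brackets negates the set or range.
no_s = re.findall(r"[^sS]", "Sass")         # ['a']
not_upper = re.findall(r"[^A-Z]", "AbCd")   # ['b', 'd']

print(dot_matches)   # ['spice', 'spike', 'spine']
print(star_found)    # True
```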
Anchors/Expressions which match positions
^ beginning of the line
$ end of the line
\b word boundary nt\b matches "nt" in "paint" but not in "pants"
\B word non-boundary all\B matches "all" in "ally" but not in "wall"
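The anchor examples above can be checked the same way. This sketch again assumes Python's `re` module; the two-line sample text is made up for illustration. In Python, `^` and `$` match at every line only when `re.MULTILINE` is passed; without it they anchor to the whole string.

```python
import re

text = "first line\nsecond line"

# With re.MULTILINE, ^ and $ match at the start and end of each line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)  # ['first', 'second']
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)    # ['line', 'line']

# Without the flag, ^ only matches at the very start of the string.
string_start = re.findall(r"^\w+", text)                     # ['first']

# \b requires a word boundary right after "nt".
nt_in_paint = re.search(r"nt\b", "paint") is not None   # True
nt_in_pants = re.search(r"nt\b", "pants") is not None   # False

# \B requires that "all" is NOT followed by a word boundary.
all_in_ally = re.search(r"all\B", "ally") is not None   # True
all_in_wall = re.search(r"all\B", "wall") is not None   # False
```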
Counters/Expressions which quantify previous expressions
* zero or more of previous r.e. a* matches "", "a", "aa", "aaa",...
+ one or more of previous r.e. a+ matches "a", "aa", "aaa",...
? exactly one or zero of previous r.e. colou?r matches "color" or "colour"
{n} n of previous r.e. a{4} matches "aaaa"
{n,m} from n to m of previous r.e.
{n,} at least n of previous r.e.
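A quick way to see how the counters behave is to test whole strings against each pattern. The following is a sketch using Python's `re.fullmatch`; the helper `accepted` and the candidate strings are hypothetical names introduced only for this illustration.

```python
import re

def accepted(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s) is not None]

runs = ["", "a", "aa", "aaa", "aaaa", "aaaaa"]

star = accepted(r"a*", runs)          # every run, including the empty string
plus = accepted(r"a+", runs)          # every run except the empty string
exact = accepted(r"a{4}", runs)       # ['aaaa']
between = accepted(r"a{2,3}", runs)   # ['aa', 'aaa']
at_least = accepted(r"a{3,}", runs)   # ['aaa', 'aaaa', 'aaaaa']

# "?" makes the preceding "u" optional.
spellings = accepted(r"colou?r", ["color", "colour", "colouur"])

print(spellings)  # ['color', 'colour']
```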

.* any string of characters
(...) grouping for precedence and memory for backreference
...|... matches either of neighbor r.e.s (dog)|(cat) matches "dog" or "cat"
\d any digit [0-9]
\D any non-digit [^0-9]
\w any alphanumeric/underscore [a-zA-Z0-9_]
\W any non-alphanumeric [^a-zA-Z0-9_]
\s whitespace (space, tab) [ \r\t\n\f]
\S non-whitespace [^ \r\t\n\f]
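Grouping, alternation, backreferences, and the shorthand classes can be sketched as follows, again assuming Python's `re` module with made-up sample strings. One caveat: on Python `str` patterns, `\d`, `\w`, and `\s` also match non-ASCII digits, letters, and whitespace by default; passing `re.ASCII` restricts them to ASCII-only sets like the expansions in the table.

```python
import re

# Alternation picks either neighbor; parentheses are optional here.
pets = re.findall(r"dog|cat", "dog, cat, cow")          # ['dog', 'cat']

# A group remembers what it matched; \1 refers back to group 1.
doubled = re.search(r"\b(\w+) \1\b", "it is is fine")
repeated_word = doubled.group(1)                        # 'is'

# Shorthand character classes.
numbers = re.findall(r"\d+", "room 42, floor 7")        # ['42', '7']
tokens = re.findall(r"\w+", "snake_case, x9!")          # ['snake_case', 'x9']
fields = re.split(r"\s+", "a b\tc\nd")                  # ['a', 'b', 'c', 'd']
non_digits = re.findall(r"\D+", "12ab34")               # ['ab']

# A plain space counts as whitespace for \s.
space_is_ws = re.fullmatch(r"\s", " ") is not None      # True
```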

This cheat-sheet table is from the book Speech and Language Processing by Daniel Jurafsky & James H. Martin.
