Python Regex Cheat Sheet
Basic Patterns
Pattern | Description |
---|---|
^ | Matches the beginning of a string. |
$ | Matches the end of a string, or just before a newline at the end of the string. |
. | Matches any single character except newline characters. |
[...] | Matches any character listed between the square brackets. |
[^...] | Matches any character not listed between the square brackets. |
\d | Matches any digit. Equivalent to [0-9] for bytes patterns or with re.ASCII; for str patterns it matches any Unicode decimal digit. |
\D | Matches any non-digit. |
\w | Matches any word character (alphanumeric plus underscore). |
\W | Matches any non-word character. |
\s | Matches any whitespace character. |
\S | Matches any non-whitespace character. |
\b | Matches a word boundary. |
\B | Matches a non-word boundary. |
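As a minimal sketch of the basic patterns above, the following snippet applies a few of them with the standard `re` module. The sample strings and variable names are made up for illustration.

```python
import re

text = "Order 66 shipped to cat-lovers on 2024-01-15"

# \d+ finds runs of digits.
digits = re.findall(r"\d+", text)  # ['66', '2024', '01', '15']

# \b keeps "cat" from matching inside a longer word.
whole_word = re.search(r"\bcat\b", text) is not None            # True
inside_word = re.search(r"\bcat\b", "concatenate") is not None  # False

# ^ and $ anchor the pattern to the start and end of the string.
is_date = re.search(r"^\d\d\d\d-\d\d-\d\d$", "2024-01-15") is not None  # True

# A negated set [^...] matches any character not listed;
# here everything except vowels and whitespace is removed.
only_vowels = re.sub(r"[^aeiou\s]", "", "regex rules")  # 'ee ue'
```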
Quantifiers
Pattern | Description |
---|---|
* | Matches zero or more occurrences of the preceding element. |
+ | Matches one or more occurrences of the preceding element. |
? | Matches zero or one occurrence of the preceding element. |
{n} | Matches exactly n occurrences of the preceding element. |
{n,} | Matches n or more occurrences of the preceding element. |
{n,m} | Matches between n and m occurrences of the preceding element. |
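The quantifiers above can be compared side by side. This is a small illustrative sketch; the ZIP-code pattern and the sample strings are assumptions chosen for the example, not a validated format check.

```python
import re

# {n} requires an exact count; ? makes the preceding group optional.
zip_code = re.compile(r"\d{5}(-\d{4})?")
short_zip = zip_code.fullmatch("12345") is not None       # True
long_zip = zip_code.fullmatch("12345-6789") is not None   # True
bad_zip = zip_code.fullmatch("1234") is not None          # False

# * allows zero occurrences of "b"; + requires at least one.
star = re.fullmatch(r"ab*c", "ac") is not None  # True
plus = re.fullmatch(r"ab+c", "ac") is not None  # False

# {n,m} is inclusive on both ends: words of two or three characters.
ranged = re.findall(r"\b\w{2,3}\b", "a an the four")  # ['an', 'the']
```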
Groups and Lookaround
Pattern | Description |
---|---|
(...) | Defines a capturing group. |
(?:...) | Defines a non-capturing group. |
(?=...) | Positive lookahead assertion. |
(?!...) | Negative lookahead assertion. |
(?<=...) | Positive lookbehind assertion. |
(?<!...) | Negative lookbehind assertion. |
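Lookaround assertions check the surrounding text without consuming it. The sketch below shows each of the four forms, plus the difference between capturing and non-capturing groups; the sample strings are invented for illustration.

```python
import re

prices = "apple $5, pear €7, plum $12"

# (?<=\$) requires a preceding "$" but does not include it in the match.
dollars = re.findall(r"(?<=\$)\d+", prices)  # ['5', '12']

# (?<![\$\d]) rejects digits preceded by "$" or by another digit,
# so the match cannot start in the middle of a dollar amount.
not_dollars = re.findall(r"(?<![\$\d])\d+", prices)  # ['7']

# (?=...) and (?!...) look ahead: "foo" only when (not) followed by "bar".
followed = [m.start() for m in re.finditer(r"foo(?=bar)", "foobar foobaz")]      # [0]
not_followed = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz")]  # [7]

# (?:...) groups without capturing, so findall returns only group 1.
captured = re.findall(r"(?:ab)+(c)", "ababc")  # ['c']
```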
Modifiers
Modifier | Description |
---|---|
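i | Case-insensitive matching. Inline (?i) or re.IGNORECASE. |
m | Multiline matching: ^ and $ also match at the start and end of each line. Inline (?m) or re.MULTILINE. |
s | Dot matches any character, including newline. Inline (?s) or re.DOTALL. |
x | Verbose (extended) mode: unescaped whitespace in the pattern is ignored and # starts a comment. Inline (?x) or re.VERBOSE. |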
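In Python these modifiers are written either inline at the start of the pattern, such as `(?i)`, or passed as flags such as `re.IGNORECASE`. The following sketch shows both forms; the sample text and variable names are illustrative.

```python
import re

text = "First line\nsecond LINE"

# i: (?i) at the start of the pattern is the inline form of re.IGNORECASE.
inline = re.findall(r"(?i)line", text)               # ['line', 'LINE']
flagged = re.findall(r"line", text, re.IGNORECASE)   # ['line', 'LINE']

# m: ^ also matches at the start of each line.
line_starts = re.findall(r"(?m)^\w+", text)  # ['First', 'second']

# s: the dot matches newlines, so the match can span both lines.
with_dotall = re.search(r"(?s)First.*LINE", text) is not None  # True
without_dotall = re.search(r"First.*LINE", text) is not None   # False

# x: whitespace and # comments inside the pattern are ignored.
verbose = re.compile(r"""(?x)
    (\d{3})   # prefix
    -
    (\d{4})   # line number
""")
parts = verbose.search("call 555-1234").groups()  # ('555', '1234')
```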
Escape Sequences
Pattern | Description |
---|---|
\ | Escapes a special character. |
\t | Matches a tab character. |
\n | Matches a newline character. |
\r | Matches a carriage return character. |
\f | Matches a form feed character. |
\v | Matches a vertical tab character. |
\A | Matches the start of a string. |
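\Z | Matches only at the end of a string. Unlike $, it does not match before a trailing newline. |
\z | Matches the absolute end of a string. Python's re accepts it only in recent versions (added in 3.14 as a synonym for \Z); earlier versions reject it, so use \Z there. |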
Special Sequences
Pattern | Description |
---|---|
\A | Matches only at the start of the string. |
\Z | Matches only at the end of the string. |
\number | Matches the contents of the group of the same number. |
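A short sketch of how these special sequences behave in practice: `\A` and `\Z` are stricter than `^` and `$`, and `\1` refers back to the text captured by group 1. The sample strings are made up for illustration.

```python
import re

# $ also matches just before a single trailing newline; \Z does not.
dollar_end = re.search(r"end$", "the end\n") is not None   # True
strict_end = re.search(r"end\Z", "the end\n") is not None  # False

# ^ follows re.MULTILINE and matches at each line start; \A never does.
caret = re.findall(r"^\w+", "one\ntwo", re.MULTILINE)     # ['one', 'two']
only_a = re.findall(r"\A\w+", "one\ntwo", re.MULTILINE)   # ['one']

# \1 matches the same text that group 1 captured: find doubled words.
doubled = re.findall(r"\b(\w+) \1\b", "it is is what it it is")  # ['is', 'it']
```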
Flags
Flag | Description |
---|---|
re.ASCII | Make \w, \W, \b, \B, \d, \D, \s and \S perform ASCII-only matching. |
re.DEBUG | Display debug information about the compiled expression. |
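re.LOCALE | Make \w, \W, \b, \B and case-insensitive matching dependent on the current locale. Usable only with bytes patterns. |
re.UNICODE | Make \w, \W, \b, \B, \d, \D, \s, \S follow the Unicode character properties database. This is the default for str patterns in Python 3, so the flag is redundant there. |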
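The effect of `re.ASCII` is easiest to see on text that mixes ASCII and non-ASCII characters. In this sketch the sample string is an assumption for illustration; it contains "ñ" (U+00F1) and two Arabic-Indic digits (U+0663, U+0664), written as escapes to keep the source unambiguous.

```python
import re

sample = "a\u00f1o 2024 \u0663\u0664"

# Default for str patterns: Unicode-aware \w and \d.
unicode_words = re.findall(r"\w+", sample)   # three tokens, including the non-ASCII ones
unicode_digits = re.findall(r"\d+", sample)  # both digit runs

# re.ASCII restricts \w to [a-zA-Z0-9_] and \d to [0-9].
ascii_words = re.findall(r"\w+", sample, re.ASCII)   # ['a', 'o', '2024']
ascii_digits = re.findall(r"\d+", sample, re.ASCII)  # ['2024']
```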
Non-capturing and Named Groups
Pattern | Description |
---|---|
(?:...) | Non-capturing version of regular parentheses. |
(?P<name>...) | Group matches the expression and can be accessed by the name. |
(?P=name) | Backreference to a named group. |
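Named groups make match results self-describing. The sketch below reads groups by name and uses `(?P=name)` to require a matching closing quote; the date format and sample strings are illustrative assumptions.

```python
import re

# Groups can be read by name or collected into a dict.
date = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
m = date.search("released 2024-01-15")
year = m.group("year")   # '2024'
fields = m.groupdict()   # {'year': '2024', 'month': '01', 'day': '15'}

# (?P=q) must match the same quote character that group "q" captured.
quoted = re.compile(r"(?P<q>['\"])(.*?)(?P=q)")
inner = quoted.search("say 'hi' now").group(2)          # 'hi'
mismatched = quoted.search("say 'hi\" now") is None     # True
```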
Comments
Pattern | Description |
---|---|
(?#...) | Comment; the contents of the parentheses are simply ignored. |
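An inline comment does not change what a pattern matches. A brief sketch, with an invented year-month pattern:

```python
import re

# The (?#...) parts are ignored by the regex engine.
commented = re.compile(r"\d{4}(?#year)-\d{2}(?#month)")
plain = re.compile(r"\d{4}-\d{2}")

same_result = (
    commented.fullmatch("2024-01") is not None
    and plain.fullmatch("2024-01") is not None
)  # True
```

For longer patterns, verbose mode (the x modifier) with `#` comments is usually easier to read than several `(?#...)` groups.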
Conditional Matching
Pattern | Description |
---|---|
(?(id/name)yes-pattern|no-pattern) | Tries to match yes-pattern if the group with the given id or name matched; otherwise tries no-pattern, which is optional. |
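A common illustration of conditional matching is an address that may be wrapped in angle brackets: the closing ">" is required only if the opening "<" was matched. This is a simplified sketch, not a real e-mail validator, and the sample addresses are made up.

```python
import re

# Group 1 is the optional "<". If it matched, require ">"; otherwise require the end.
addr = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")

bracketed = addr.match("<user@host.com>") is not None  # True
bare = addr.match("user@host.com") is not None         # True
missing_close = addr.match("<user@host.com") is None   # True
stray_close = addr.match("user@host.com>") is None     # True
```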
Greedy vs. Non-Greedy Matching
Pattern | Description |
---|---|
*? | Matches as few characters as possible (non-greedy version of *). |
+? | Matches as few characters as possible (non-greedy version of +). |
?? | Matches as few characters as possible (non-greedy version of ?). |
{m,n}? | Matches as few characters as possible (non-greedy version of {m,n}). |
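The difference between greedy and non-greedy quantifiers shows up when a pattern could stop at more than one place. A minimal sketch with invented sample strings:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy .+ runs to the last ">" in the string.
greedy = re.findall(r"<.+>", html)  # ['<b>bold</b> and <i>italic</i>']

# Non-greedy .+? stops at the first ">" it can.
lazy = re.findall(r"<.+?>", html)   # ['<b>', '</b>', '<i>', '</i>']

# {m,n} takes as many as allowed; {m,n}? takes as few as allowed.
most = re.search(r"\d{2,4}", "123456").group()    # '1234'
fewest = re.search(r"\d{2,4}?", "123456").group() # '12'
```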
Escape Sequences for Special Characters
Pattern | Description |
---|---|
\\ | Matches a single backslash. |
\^ | Matches a caret (^) character. |
\$ | Matches a dollar ($) character. |
\. | Matches a period (.) character. |
\* | Matches an asterisk (*) character. |
\+ | Matches a plus (+) character. |
\? | Matches a question mark (?) character. |
\{ | Matches an opening curly brace ({) character. |
\} | Matches a closing curly brace (}) character. |
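When a literal string contains several special characters, `re.escape` can add the backslashes automatically. The sketch below compares manual escaping with `re.escape`; the sample strings are illustrative.

```python
import re

# Manual escaping in a raw string: literal "$" and "." characters.
amounts = re.findall(r"\$\d+\.\d{2}", "cost: $3.50 or $10.00")  # ['$3.50', '$10.00']

# A literal backslash is written \\ in the pattern.
backslashes = re.findall(r"\\", r"C:\temp\new")  # two single-backslash matches

# re.escape escapes every special character in a literal string.
literal = "1+1=2 (true?)"
haystack = "we know 1+1=2 (true?)"
found_escaped = re.search(re.escape(literal), haystack) is not None  # True

# Unescaped, "+", "?" and the parentheses act as metacharacters,
# so the same text no longer matches itself.
found_raw = re.search(literal, haystack) is not None  # False
```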
POSIX Character Classes
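Python's built-in re module does not support POSIX bracket classes such as [[:alnum:]]; use the ASCII equivalents below. The third-party regex module does accept the POSIX forms.
POSIX class | Description | Equivalent in re |
---|---|---|
[:alnum:] | Alphanumeric characters. | [A-Za-z0-9] |
[:alpha:] | Alphabetic characters. | [A-Za-z] |
[:digit:] | Digits. | [0-9] |
[:lower:] | Lowercase alphabetic characters. | [a-z] |
[:upper:] | Uppercase alphabetic characters. | [A-Z] |
[:space:] | Whitespace characters. | [ \t\n\r\f\v] |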
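Python's built-in `re` module does not understand POSIX class names, so patterns ported from grep or sed need explicit character sets instead. The sketch below uses a hypothetical lookup table, `POSIX_EQUIVALENTS`, and a hypothetical helper, `posix_class`, to build ASCII equivalents; neither is part of the standard library.

```python
import re

# Hypothetical mapping from POSIX class names to ASCII set bodies.
POSIX_EQUIVALENTS = {
    "alnum": "A-Za-z0-9",
    "alpha": "A-Za-z",
    "digit": "0-9",
    "lower": "a-z",
    "upper": "A-Z",
    "space": r" \t\n\r\f\v",
}

def posix_class(name: str) -> str:
    """Return a bracketed set that re understands for a POSIX class name."""
    return "[" + POSIX_EQUIVALENTS[name] + "]"

upper_runs = re.findall(posix_class("upper") + "+", "Hello World ABC")  # ['H', 'W', 'ABC']
digit_runs = re.findall(posix_class("digit") + "+", "room 101, floor 3")  # ['101', '3']
space_runs = re.findall(posix_class("space") + "+", "a \t b")  # [' \t ']
```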