Java Regex Cheat Sheet
Basic Patterns
Pattern | Description |
---|---|
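^ | Matches the beginning of the input, or the beginning of each line in multiline mode. |
$ | Matches the end of the input (just before a final line terminator, if there is one), or the end of each line in multiline mode. |
. | Matches any single character except line terminators, unless dotall mode is enabled. |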
[...] | Matches any character listed between the square brackets. |
[^...] | Matches any character not listed between the square brackets. |
\d | Matches any digit, equivalent to [0-9]. |
\D | Matches any non-digit. |
\w | Matches any word character (alphanumeric plus underscore). |
\W | Matches any non-word character. |
\s | Matches any whitespace character. |
\S | Matches any non-whitespace character. |
\b | Matches a word boundary. |
\B | Matches a non-word boundary. |
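As a rough illustration of how a few of these patterns combine, the sketch below extracts standalone integers with \b\d+\b. The class and method names are invented for this example. Note that each backslash has to be doubled inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BasicPatternsDemo {
    // \b\d+\b : one or more digits with a word boundary on each side.
    private static final Pattern WHOLE_NUMBER = Pattern.compile("\\b\\d+\\b");

    // Collects every standalone run of digits found in the text.
    static List<String> wholeNumbers(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = WHOLE_NUMBER.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // "b7" is skipped: there is no word boundary between 'b' and '7'.
        System.out.println(wholeNumbers("room 42, gate b7, flight 1093"));
    }
}
```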
Quantifiers
Pattern | Description |
---|---|
* | Matches zero or more occurrences of the preceding element. |
+ | Matches one or more occurrences of the preceding element. |
? | Matches zero or one occurrence of the preceding element. |
{n} | Matches exactly n occurrences of the preceding element. |
{n,} | Matches n or more occurrences of the preceding element. |
{n,m} | Matches at least n and at most m occurrences of the preceding element. |
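A small sketch of the counted quantifiers in use. The helper names are hypothetical, and the date check only validates the shape of the string, not whether the date exists.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Exactly 4 digits, a dash, exactly 2 digits, a dash, exactly 2 digits.
    private static final Pattern ISO_DATE_SHAPE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");
    // Between 2 and 4 digits. Quantifiers are greedy, so the longest allowed run is taken.
    private static final Pattern SHORT_RUN = Pattern.compile("\\d{2,4}");

    static boolean looksLikeIsoDate(String s) {
        return ISO_DATE_SHAPE.matcher(s).matches();
    }

    // Returns the first run of 2 to 4 digits, or null if there is none.
    static String firstShortRun(String s) {
        Matcher m = SHORT_RUN.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIsoDate("2024-01-31"));
        System.out.println(firstShortRun("123456"));
    }
}
```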
Groups and Lookaround
Pattern | Description |
---|---|
(...) | Defines a capturing group. |
(?:...) | Defines a non-capturing group. |
(?=...) | Positive lookahead assertion. |
(?!...) | Negative lookahead assertion. |
(?<=...) | Positive lookbehind assertion. |
(?<!...) | Negative lookbehind assertion. |
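Lookaround assertions check the surrounding text without consuming it. The sketch below, with made-up names, uses a positive lookbehind (?<=\$) so that only numbers directly preceded by a dollar sign are returned, and a non-capturing group (?:...) for the optional cents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // (?<=\$)        : the previous character must be '$' (not part of the match)
    // \d+            : the whole-dollar part
    // (?:\.\d{2})?   : optional cents, grouped without capturing
    private static final Pattern DOLLAR_AMOUNT =
            Pattern.compile("(?<=\\$)\\d+(?:\\.\\d{2})?");

    static List<String> dollarAmounts(String text) {
        List<String> amounts = new ArrayList<>();
        Matcher m = DOLLAR_AMOUNT.matcher(text);
        while (m.find()) {
            amounts.add(m.group());
        }
        return amounts;
    }

    public static void main(String[] args) {
        // The quantity 3 is ignored because no '$' precedes it.
        System.out.println(dollarAmounts("cost $15.50, qty 3, fee $2"));
    }
}
```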
Modifiers
Modifier | Description |
---|---|
i | Case-insensitive matching. |
m | Multiline matching, affects ^ and $. |
s | Dot matches newline. |
x | Comments mode. Whitespace in the pattern is ignored, and # starts a comment that runs to the end of the line. |
u | Enables Unicode-aware case folding when combined with i. |
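These modifiers can be embedded at the start of a pattern as (?i), (?m), and so on, or combined as in (?im). A minimal sketch with hypothetical helper names:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InlineFlagDemo {
    // (?i) switches on case-insensitive matching for the rest of the pattern.
    private static final Pattern ERROR_WORD = Pattern.compile("(?i)error");
    // (?im) combines case-insensitive and multiline: ^ and $ now work per line.
    private static final Pattern TODO_LINE = Pattern.compile("(?im)^todo\\b.*$");

    static boolean mentionsError(String text) {
        return ERROR_WORD.matcher(text).find();
    }

    // Counts the lines that begin with "todo" in any letter case.
    static int countTodoLines(String text) {
        Matcher m = TODO_LINE.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(mentionsError("ERROR: disk full"));
        System.out.println(countTodoLines("todo: a\nnote\nTODO: b"));
    }
}
```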
Escape Sequences
Pattern | Description |
---|---|
\ | Escapes the following special character so that it is matched literally (for example, \. matches a dot). |
\t | Matches a tab character. |
\n | Matches a newline character. |
\r | Matches a carriage return character. |
\f | Matches a form feed character. |
\A | Matches the start of a string. |
\Z | Matches the end of a string. If the string ends with a newline, it matches just before the newline. |
\z | Matches the absolute end of a string. |
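The difference between \Z and \z only shows up when the input ends with a line terminator. A short sketch, using invented method names:

```java
import java.util.regex.Pattern;

class EndAnchorDemo {
    // \Z tolerates one final line terminator after the match.
    private static final Pattern END_LENIENT = Pattern.compile("end\\Z");
    // \z requires the match to finish at the very last character.
    private static final Pattern END_STRICT = Pattern.compile("end\\z");

    static boolean endsLeniently(String s) {
        return END_LENIENT.matcher(s).find();
    }

    static boolean endsStrictly(String s) {
        return END_STRICT.matcher(s).find();
    }

    public static void main(String[] args) {
        String withNewline = "the end\n";
        System.out.println(endsLeniently(withNewline));
        System.out.println(endsStrictly(withNewline));
    }
}
```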
Character Classes
Pattern | Description |
---|---|
\h | Matches any horizontal whitespace character. |
\H | Matches any character that's not a horizontal whitespace. |
\v | Matches any vertical whitespace character. |
\V | Matches any character that's not a vertical whitespace. |
\R | Matches any Unicode linebreak sequence: either \u000D\u000A as a unit, or any single character in [\u000A\u000B\u000C\u000D\u0085\u2028\u2029]. |
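One common use of \R is splitting text into lines regardless of which platform produced it, and \h is handy for collapsing runs of spaces and tabs. The sketch below assumes Java 8 or later, where these constructs are available. The helper names are made up.

```java
import java.util.Arrays;

class LinebreakDemo {
    // Splits on any linebreak sequence; "\r\n" counts as a single break.
    static String[] lines(String text) {
        return text.split("\\R");
    }

    // Replaces each run of horizontal whitespace (spaces, tabs, ...) with one space.
    static String collapseSpaces(String text) {
        return text.replaceAll("\\h+", " ");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(lines("a\r\nb\nc\rd")));
        System.out.println(collapseSpaces("col1 \t  col2"));
    }
}
```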
Java-Specific Constructs
Pattern | Description |
---|---|
\X | Matches a Unicode extended grapheme cluster (Java 9 and later). |
\C | Not supported by java.util.regex. \C is a Perl/PCRE construct, and compiling it in Java throws a PatternSyntaxException. |
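Because \X matches one user-perceived character at a time, it can be used to count characters where String.length() would count UTF-16 code units. This is a sketch that assumes Java 9 or later. The class and method names are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GraphemeDemo {
    private static final Pattern GRAPHEME = Pattern.compile("\\X");

    // Counts extended grapheme clusters by repeatedly matching \X.
    static int countGraphemes(String s) {
        Matcher m = GRAPHEME.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // 'e' followed by U+0301 COMBINING ACUTE ACCENT: two chars, one cluster.
        String accented = "e\u0301";
        System.out.println(accented.length());
        System.out.println(countGraphemes(accented));
    }
}
```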
Backreferences
Pattern | Description |
---|---|
\k<name> | Backreference to a named-capturing group. |
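A named backreference matches the same text that the named group captured earlier in the match. One possible use, sketched here with hypothetical names, is spotting an accidentally doubled word.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BackreferenceDemo {
    // (?<word>\w+) captures a word; \k<word> then requires the same word again.
    private static final Pattern DOUBLED_WORD =
            Pattern.compile("\\b(?<word>\\w+)\\s+\\k<word>\\b");

    // Returns the first word that appears twice in a row, or null if none does.
    static String firstRepeatedWord(String text) {
        Matcher m = DOUBLED_WORD.matcher(text);
        return m.find() ? m.group("word") : null;
    }

    public static void main(String[] args) {
        System.out.println(firstRepeatedWord("this is is a test"));
    }
}
```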
POSIX Character Classes (US-ASCII only)
Pattern | Description |
---|---|
\p{Alnum} | Matches alphanumeric characters. |
\p{Alpha} | Matches alphabetic characters. |
\p{Blank} | Matches a space or a tab. |
\p{Cntrl} | Matches control characters. |
\p{Digit} | Matches digits. |
\p{Graph} | Matches visible characters (alphanumeric or punctuation). |
\p{Lower} | Matches lowercase alphabetic characters. |
\p{Print} | Matches visible characters and the space character. |
\p{Punct} | Matches punctuation characters. |
\p{Space} | Matches whitespace characters. |
\p{Upper} | Matches uppercase alphabetic characters. |
\p{XDigit} | Matches hexadecimal digits. |
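In java.util.regex the POSIX classes are written in the \p{Name} form, such as \p{Alnum} or \p{Punct}. The bracketed [:alnum:] notation from other regex flavors is read by Java as an ordinary character class. A minimal sketch with invented helper names, which also shows that these classes cover US-ASCII only by default:

```java
class PosixClassDemo {
    // Removes ASCII punctuation such as , ! ? and quotes.
    static String stripPunctuation(String text) {
        return text.replaceAll("\\p{Punct}", "");
    }

    // True when the whole string consists of hexadecimal digits.
    static boolean isHex(String text) {
        return text.matches("\\p{XDigit}+");
    }

    // True when the single-character string is an ASCII letter.
    static boolean isAsciiLetter(String ch) {
        return ch.matches("\\p{Alpha}");
    }

    public static void main(String[] args) {
        System.out.println(stripPunctuation("Hello, World!"));
        System.out.println(isHex("CAFEbabe"));
        // U+00E9 (e with acute accent) is a letter, but not a US-ASCII one.
        System.out.println(isAsciiLetter("\u00E9"));
    }
}
```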
Java Regex Flags
Flag | Description |
---|---|
Pattern.CASE_INSENSITIVE or (?i) | Enables case-insensitive matching (US-ASCII only unless UNICODE_CASE is also set). |
Pattern.MULTILINE or (?m) | Enables multiline mode, in which ^ and $ match at the start and end of each line. |
Pattern.DOTALL or (?s) | Enables dotall mode, which makes . match line terminators as well. |
Pattern.UNICODE_CASE or (?u) | Enables Unicode-aware case folding when combined with CASE_INSENSITIVE. |
Pattern.CANON_EQ | Enables canonical equivalence. |
Pattern.UNIX_LINES or (?d) | Only the '\n' line terminator is recognized in the behavior of ., ^, and $. |
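Flag constants are passed as the second argument to Pattern.compile and combined with the bitwise OR operator. The sketch below, whose class and method names are hypothetical, shows the effect of adding UNICODE_CASE to CASE_INSENSITIVE for a non-ASCII letter.

```java
import java.util.regex.Pattern;

class FlagConstantsDemo {
    // Matches the whole input against the regex, ignoring letter case.
    // When unicodeAware is false, only US-ASCII letters are case-folded.
    static boolean matchesIgnoringCase(String regex, String input, boolean unicodeAware) {
        int flags = Pattern.CASE_INSENSITIVE;
        if (unicodeAware) {
            flags |= Pattern.UNICODE_CASE;
        }
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String regex = "\u00E9cole";  // lowercase e-acute followed by "cole"
        String input = "\u00C9COLE";  // the same word in upper case
        System.out.println(matchesIgnoringCase(regex, input, false));
        System.out.println(matchesIgnoringCase(regex, input, true));
    }
}
```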
Quotation
Pattern | Description |
---|---|
\Q...\E | Quotes all characters in between \Q and \E, making them literal. |
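When the text to quote comes from a variable, Pattern.quote builds a literal pattern for it, so the \Q and \E do not have to be added by hand. A small sketch with an invented helper name:

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Searches for the literal text, treating characters such as + and . as plain text.
    static boolean containsLiteral(String text, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "total: 1+1=2";
        // Quoted: the '+' is an ordinary character.
        System.out.println(containsLiteral(text, "1+1=2"));
        // Unquoted: "1+" means "one or more 1s", so the text is not found.
        System.out.println(Pattern.compile("1+1=2").matcher(text).find());
    }
}
```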
Named Groups
Pattern | Description |
---|---|
(?<name>...) | Defines a named capturing group. |
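Named groups are read back with Matcher.group(String). The sketch below pulls the parts out of a date-shaped string. The group names and the parseDate helper are chosen for this example only, and no calendar validation is attempted.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupDemo {
    private static final Pattern DATE =
            Pattern.compile("(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})");

    // Returns {year, month, day}, or null when the input has a different shape.
    static int[] parseDate(String s) {
        Matcher m = DATE.matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group("year")),
            Integer.parseInt(m.group("month")),
            Integer.parseInt(m.group("day"))
        };
    }

    public static void main(String[] args) {
        int[] parts = parseDate("2024-01-31");
        System.out.println(parts[0] + " / " + parts[1] + " / " + parts[2]);
    }
}
```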
Java-Specific Methods
Method | Description |
---|---|
Pattern.compile() | Compiles the given regular expression into a pattern. |
Matcher.find() | Searches for the next occurrence that matches the pattern. |
Matcher.matches() | Tests whether the entire region matches the pattern. |
Matcher.group() | Returns the matched subsequence. |
Matcher.group(int group) | Returns the matched subsequence of the given group index. |
Matcher.group(String name) | Returns the matched subsequence of the named group. |
Matcher.lookingAt() | Attempts to match the pattern starting at the beginning of the region, without requiring the entire region to match. |
Pattern.split(CharSequence input) | Splits the given input sequence around matches of this pattern. |
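The three matching methods differ in how much of the input must match. The sketch below applies the same pattern with each of them and adds a Pattern.split call. All helper names are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class MatcherMethodsDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    // matches(): the entire input must be digits.
    static boolean wholeInput(String s) {
        return DIGITS.matcher(s).matches();
    }

    // lookingAt(): the input must start with digits; anything may follow.
    static boolean prefix(String s) {
        return DIGITS.matcher(s).lookingAt();
    }

    // find(): digits may appear anywhere in the input.
    static boolean anywhere(String s) {
        return DIGITS.matcher(s).find();
    }

    // split(): cuts the input at each comma, dropping surrounding whitespace.
    static String[] fields(String line) {
        return COMMA.split(line);
    }

    public static void main(String[] args) {
        String input = "123abc";
        System.out.println(wholeInput(input) + " " + prefix(input) + " " + anywhere(input));
        System.out.println(Arrays.toString(fields("a , b,c")));
    }
}
```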
Java Regex Flags (Continued)
Flag | Description |
---|---|
Pattern.LITERAL | Pattern is treated as a sequence of literal characters. Special characters and escape sequences are given no special meaning. |
Pattern.COMMENTS or (?x) | Permits whitespace and comments in the pattern. Whitespace is ignored, and a comment starts with # and runs to the end of the line. |
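Comments mode makes it possible to spread a pattern over several lines and annotate each part, while the literal flag switches off all metacharacters. A sketch of both follows. The seven-digit phone format and the names are assumptions made for the example.

```java
import java.util.regex.Pattern;

class CommentsLiteralDemo {
    // With COMMENTS, whitespace is ignored and '#' starts a comment to end of line.
    private static final Pattern LOCAL_PHONE = Pattern.compile(
            "(\\d{3})   # exchange: exactly three digits\n"
          + "[-.]?      # optional separator\n"
          + "(\\d{4})   # line number: exactly four digits",
            Pattern.COMMENTS);

    // With LITERAL, the dot is an ordinary character rather than a wildcard.
    private static final Pattern LITERAL_A_DOT_C = Pattern.compile("a.c", Pattern.LITERAL);

    static boolean isLocalPhone(String s) {
        return LOCAL_PHONE.matcher(s).matches();
    }

    static boolean containsADotC(String s) {
        return LITERAL_A_DOT_C.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isLocalPhone("555-1234"));
        System.out.println(containsADotC("abc"));
    }
}
```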