Java Regular Expressions (java regex)

Regular expressions define string patterns that can be used for searching, manipulating and editing text. They are also known as regex (short for regular expressions).

The Java Regex or Regular Expression is an API to define a pattern for searching or manipulating strings.

Example:
import java.util.regex.*;

class Test {
    public static void main(String args[]) {
        String content = "This is Rama " + "from cprogramcoding.com.";
        String pattern = ".*program.*";
        boolean match = Pattern.matches(pattern, content);
        System.out.println("The text contains 'program'? " + match);
    }
}


Output:
The text contains 'program'? true

In this tutorial we will learn how to define patterns and how to use them. The java.util.regex API (the package which we need to import while dealing with Regex) has two main classes:

java.util.regex.Pattern – Used for defining patterns

java.util.regex.Matcher – Used for performing match operations on text using patterns

java.util.regex.Pattern class:
1) Pattern.matches()

We have already seen this method in the example above, where we searched for the string “program” in a given text. This is one of the simplest and easiest ways of searching for a String in a text using regex.

String con = "This is a tutorial program!";
String pattern = ".*tutorial.*";
boolean match = Pattern.matches(pattern, con);
System.out.println("The text contains 'tutorial'? " + match);

As you can see, we have used the matches() method of the Pattern class to search for the pattern in the given text. The pattern .*tutorial.* allows zero or more characters before and after the string “tutorial” (the expression .* matches zero or more characters).

Limitations: Pattern.matches() only reports whether the entire text matches the pattern; it cannot tell you where a match occurs or how many times. To find multiple occurrences, compile the pattern with the Pattern.compile() method (discussed in the next section) and use a Matcher.
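
Because Pattern.matches() succeeds only when the whole text matches, the pattern above had to be wrapped in .* on both sides. The following is a minimal sketch contrasting it with Matcher.find(), which looks for the pattern anywhere in the text. The class and helper names are hypothetical and used only for illustration.

```java
import java.util.regex.Pattern;

class WholeVersusPartial {
    // True only if the ENTIRE text matches the regex.
    static boolean matchesWhole(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    // True if the regex matches ANY part of the text.
    static boolean containsMatch(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "This is a tutorial program!";
        System.out.println(matchesWhole("tutorial", text));     // false
        System.out.println(matchesWhole(".*tutorial.*", text)); // true
        System.out.println(containsMatch("tutorial", text));    // true
    }
}
```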

2) Pattern.compile()

The example above searched for the string “tutorial” with a case-sensitive match. If you want a CASE INSENSITIVE search, or want to find multiple occurrences, first compile the pattern using Pattern.compile() before searching the text. This is how the method can be used for a case-insensitive search.

String con = "This is a tutorial Website!";
String patternString = ".*tuToRiAl.*";
Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);

Here we have used the flag Pattern.CASE_INSENSITIVE for a case-insensitive search. Several other flags are available for different purposes; to read more about them, refer to the documentation of the Pattern class.
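
Flags are int constants, so more than one can be passed by combining them with the bitwise OR operator. The sketch below assumes we want a case-insensitive match in which . may also match line terminators (Pattern.DOTALL). The class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

class FlagDemo {
    // Hypothetical helper: compiles the regex with two flags combined using bitwise OR.
    static boolean matchesIgnoringCaseAcrossLines(String regex, String text) {
        Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        return p.matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "This is a\nTUTORIAL Website!";

        // With CASE_INSENSITIVE alone, '.' cannot match the '\n', so the whole-text match fails.
        Pattern caseOnly = Pattern.compile(".*tutorial.*", Pattern.CASE_INSENSITIVE);
        System.out.println(caseOnly.matcher(text).matches());                     // false

        // Adding DOTALL lets '.' match the line terminator as well.
        System.out.println(matchesIgnoringCaseAcrossLines(".*tutorial.*", text)); // true
    }
}
```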

Now what? We have obtained a Pattern instance, but how do we match it against a text? For that we need a Matcher instance, which we can get using the Pattern.matcher() method. Let's discuss it.

3) Pattern.matcher() method

Here we will learn how to get a Matcher instance from a Pattern instance by using the matcher() method.

import java.util.regex.*;

class Test {
    public static void main(String args[]) {
        String content = "This is a tutorial Website!";
        String patternString = ".*tuToRiAl.*";
        Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(content);
        boolean isMatched = matcher.matches();
        System.out.println("Is it a Match?" + isMatched);
    }
}


Output:
Is it a Match?true
4) Pattern.split()

To split a text into multiple strings based on a delimiter (here the delimiter is specified using a regex), we can use the Pattern.split() method. This is how it can be done.

import java.util.regex.*;

class Test {
    public static void main(String args[]) {
        String text = "ThisIsRama.ItISMyWebsite";
        // Pattern for delimiter
        String patternS = "is";
        Pattern pattern = Pattern.compile(patternS, Pattern.CASE_INSENSITIVE);
        String[] myStrings = pattern.split(text);
        for (String temp : myStrings) {
            System.out.println(temp);
        }
        System.out.println("Number of split strings: " + myStrings.length);
    }
}


Output:
Th

Rama.It
MyWebsite
Number of split strings: 4
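
The count of 4 above includes an empty string produced between the two adjacent delimiters "is" and "Is". By default split() keeps such empty strings in the middle but drops trailing ones. The overload split(CharSequence, int) takes a limit that changes this. Here is a small sketch of that behavior; the class name, helper name and sample text are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitLimitDemo {
    // Hypothetical helper: a negative limit keeps trailing empty strings.
    static String[] splitKeepingTrailing(String regex, String text) {
        return Pattern.compile(regex).split(text, -1);
    }

    public static void main(String[] args) {
        Pattern comma = Pattern.compile(",");
        String text = "a,b,,";

        // Default: trailing empty strings are removed.
        System.out.println(Arrays.toString(comma.split(text)));               // [a, b]

        // Positive limit: at most that many pieces; the last holds the rest of the input.
        System.out.println(Arrays.toString(comma.split(text, 2)));            // [a, b,,]

        // Negative limit: every piece is kept, including trailing empty strings.
        System.out.println(Arrays.toString(splitKeepingTrailing(",", text))); // [a, b, , ]
    }
}
```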
java.util.regex.Matcher Class
Creating a Matcher instance
String content = "Some text";
String patternS = ".*somestring.*";
Pattern pattern = Pattern.compile(patternS);
Matcher matcher = pattern.matcher(content);
Main methods

matches(): It matches the regular expression against the whole text passed to the Pattern.matcher() method while creating Matcher instance.

...
Matcher matcher = pattern.matcher(content);
boolean isMatch = matcher.matches();

lookingAt(): Similar to the matches() method, except that it only requires the regular expression to match at the beginning of the text, while matches() requires it to match the whole text.

find(): Searches for occurrences of the regular expression in the text. Mainly used when we are searching for multiple occurrences.

start() and end(): Both these methods are generally used along with the find() method. They return the start and end indexes of the match most recently found by find(). The end index is the position just after the last matched character.
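
The difference between matches() and lookingAt() is easiest to see side by side. The following is a minimal sketch; the class name, helper names and sample text are hypothetical.

```java
import java.util.regex.Pattern;

class PrefixMatchDemo {
    // True if the regex matches starting at the beginning of the text.
    static boolean startsWithMatch(String regex, String text) {
        return Pattern.compile(regex).matcher(text).lookingAt();
    }

    // True only if the regex matches the whole text.
    static boolean wholeMatch(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "Rama is here";
        System.out.println(startsWithMatch("Rama", text)); // true  (text begins with "Rama")
        System.out.println(wholeMatch("Rama", text));      // false (text continues after "Rama")
        System.out.println(startsWithMatch("is", text));   // false ("is" occurs, but not at the start)
    }
}
```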

Example to find out the multiple occurrences using Matcher methods:
import java.util.regex.*;

class Test {
    public static void main(String args[]) {
        String content = "XXX RR PP qq LLL AAA UU";
        String string = "RR";
        Pattern pattern = Pattern.compile(string);
        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            System.out.println("Found at: " + matcher.start() + " - " + matcher.end());
        }
    }
}


Output:
Found at: 4 - 6
java.util.regex package

The Matcher and Pattern classes provide the regular expression facility in Java. The java.util.regex package provides the following classes and interfaces for regular expressions.

  1. MatchResult interface
  2. Matcher class
  3. Pattern class
  4. PatternSyntaxException class
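
Of the four types listed above, PatternSyntaxException is the one thrown by Pattern.compile() when the regex itself is malformed. It is an unchecked exception, so catching it is optional, but doing so is useful when the pattern comes from user input. A minimal sketch follows; the class and helper names are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxCheckDemo {
    // Hypothetical helper: reports whether the given string is a valid regex.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidRegex("[a-z]+")); // true
        System.out.println(isValidRegex("[a-z"));   // false (character class is never closed)

        try {
            Pattern.compile("(abc");                // group is never closed
        } catch (PatternSyntaxException e) {
            // getDescription() and getIndex() say what went wrong and where.
            System.out.println(e.getDescription() + " at index " + e.getIndex());
        }
    }
}
```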
Matcher class

It implements the MatchResult interface. It is a regex engine which is used to perform match operations on a character sequence.

| No. | Method | Description |
| --- | --- | --- |
| 1 | boolean matches() | tests whether the entire input matches the pattern. |
| 2 | boolean find() | finds the next subsequence of the input that matches the pattern. |
| 3 | boolean find(int start) | resets the matcher and finds the next subsequence that matches the pattern, starting at the given index. |
| 4 | String group() | returns the matched subsequence. |
| 5 | int start() | returns the starting index of the matched subsequence. |
| 6 | int end() | returns the index just after the last character of the matched subsequence. |
| 7 | int groupCount() | returns the number of capturing groups in the pattern. |
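
The methods above are often used together with capturing groups, which are written as parentheses in the pattern. The sketch below also uses group(int), a Matcher method not listed in the table, to read an individual group. The class name, helper name, pattern and sample text are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Two capturing groups: (1) a word before '=', (2) digits after '='.
    private static final Pattern KEY_VALUE = Pattern.compile("(\\w+)=(\\d+)");

    // Hypothetical helper: returns {key, value} for the first match, or an empty array.
    static String[] parseKeyValue(String text) {
        Matcher m = KEY_VALUE.matcher(text);
        if (!m.find()) {
            return new String[0];
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        Matcher m = KEY_VALUE.matcher("width=40 height=25");

        // groupCount() depends only on the pattern, not on the input.
        System.out.println(m.groupCount()); // 2

        while (m.find()) {
            // group() is the whole match; group(1) and group(2) are the captured parts.
            System.out.println(m.group() + " -> " + m.group(1) + ", " + m.group(2));
        }

        // find(int) resets the matcher and searches again from the given index.
        if (m.find(8)) {
            System.out.println(m.group() + " at " + m.start() + " - " + m.end()); // height=25 at 9 - 18
        }
    }
}
```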
Pattern class
| No. | Method | Description |
| --- | --- | --- |
| 1 | static Pattern compile(String regex) | compiles the given regex and returns the instance of the Pattern. |
| 2 | Matcher matcher(CharSequence input) | creates a matcher that matches the given input with the pattern. |
| 3 | static boolean matches(String regex, CharSequence input) | It works as the combination of compile and matcher methods. It compiles the regular expression and matches the given input with the pattern. |
| 4 | String[] split(CharSequence input) | splits the given input string around matches of given pattern. |
| 5 | String pattern() | returns the regex pattern. |
Example:
import java.util.regex.*;

public class Test {
    public static void main(String args[]) {
        // 1st way: compile, create a matcher, then match
        Pattern p = Pattern.compile(".p");
        Matcher m = p.matcher("rp");
        boolean b = m.matches();

        // 2nd way: the same steps chained in one expression
        boolean b2 = Pattern.compile(".p").matcher("rp").matches();

        // 3rd way: the static convenience method
        boolean b3 = Pattern.matches(".p", "rp");

        System.out.println(b + " " + b2 + " " + b3);
    }
}


Output:
true true true
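
All three forms give the same result, but Pattern.matches() compiles the regex again on every call. When the same pattern is applied to many inputs, it is usually better to compile it once and reuse the Pattern instance, which is immutable and safe to share. A minimal sketch of that idea; the class name, helper name and inputs are hypothetical.

```java
import java.util.regex.Pattern;

class ReusePatternDemo {
    // Compiled once and shared by every call below.
    private static final Pattern DOT_P = Pattern.compile(".p");

    // Hypothetical helper: counts how many inputs match the precompiled pattern in full.
    static int countMatches(String[] inputs) {
        int count = 0;
        for (String input : inputs) {
            if (DOT_P.matcher(input).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] inputs = { "rp", "ap", "mpt", "p" };
        System.out.println(countMatches(inputs)); // 2 ("rp" and "ap")
        System.out.println(DOT_P.pattern());      // .p
    }
}
```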
Regular Expression . Example:
import java.util.regex.*;

class Test {
    public static void main(String args[]) {
        System.out.println(Pattern.matches(".p", "ap"));   //true (2nd char is p)
        System.out.println(Pattern.matches(".p", "mk"));   //false (2nd char is not p)
        System.out.println(Pattern.matches(".p", "mpt"));  //false (has more than 2 char)
        System.out.println(Pattern.matches(".p", "ammp")); //false (has more than 2 char)
        System.out.println(Pattern.matches("..p", "map")); //true (3rd char is p)
    }
}


Output:
true
false
false
false
true
Regex Character classes
| No. | Character Class | Description |
| --- | --- | --- |
| 1 | [abc] | a, b, or c (simple class) |
| 2 | [^abc] | Any character except a, b, or c (negation) |
| 3 | [a-zA-Z] | a through z or A through Z, inclusive (range) |
| 4 | [a-d[m-p]] | a through d, or m through p: [a-dm-p] (union) |
| 5 | [a-z&&[def]] | d, e, or f (intersection) |
| 6 | [a-z&&[^bc]] | a through z, except for b and c: [ad-z] (subtraction) |
| 7 | [a-z&&[^m-p]] | a through z, and not m through p: [a-lq-z] (subtraction) |
import java.util.regex.*;

class Test {
    public static void main(String args[]) {
        // A character class without a quantifier matches exactly one character.
        System.out.println(Pattern.matches("[amn]", "abcd"));   //false (more than one character)
        System.out.println(Pattern.matches("[amn]", "a"));      //true (among a or m or n)
        System.out.println(Pattern.matches("[amn]", "ammmna")); //false (more than one character)
    }
}


Output:
false
true
false
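
The union, intersection and subtraction forms from the table can be checked the same way. The following sketch tests single characters against three of those classes; the class and helper names are hypothetical.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: true if s is exactly one character belonging to the class.
    static boolean isSingle(String charClass, String s) {
        return Pattern.matches(charClass, s);
    }

    public static void main(String[] args) {
        // Union: a through d, or m through p
        System.out.println(isSingle("[a-d[m-p]]", "n"));   // true
        System.out.println(isSingle("[a-d[m-p]]", "e"));   // false

        // Intersection: only d, e, or f
        System.out.println(isSingle("[a-z&&[def]]", "e")); // true
        System.out.println(isSingle("[a-z&&[def]]", "a")); // false

        // Subtraction: a through z, except b and c
        System.out.println(isSingle("[a-z&&[^bc]]", "d")); // true
        System.out.println(isSingle("[a-z&&[^bc]]", "b")); // false
    }
}
```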
Regex Quantifiers
| Regex | Description |
| --- | --- |
| X? | X occurs once or not at all |
| X+ | X occurs once or more times |
| X* | X occurs zero or more times |
| X{n} | X occurs exactly n times |
| X{n,} | X occurs n or more times |
| X{y,z} | X occurs at least y times but not more than z times |
import java.util.regex.*;

class Test {
    public static void main(String args[]) {
        System.out.println("? quantifier ....");
        System.out.println(Pattern.matches("[rmn]?", "r"));       //true (r or m or n comes one time)
        System.out.println(Pattern.matches("[rmn]?", "rrr"));     //false (r comes more than one time)
        System.out.println(Pattern.matches("[rmn]?", "rrmmmnn")); //false (r, m and n come more than one time)
        System.out.println(Pattern.matches("[rmn]?", "rrzztr"));  //false (more than one character)
        System.out.println(Pattern.matches("[rmn]?", "rm"));      //false (r or m or n may come at most one time)

        System.out.println("+ quantifier ....");
        System.out.println(Pattern.matches("[rmn]+", "r"));       //true (r or m or n once or more times)
        System.out.println(Pattern.matches("[rmn]+", "rrr"));     //true (r comes more than one time)
        System.out.println(Pattern.matches("[rmn]+", "rrmmmnn")); //true (r or m or n comes more than once)
        System.out.println(Pattern.matches("[rmn]+", "rrzztr"));  //false (z and t are not matching pattern)

        System.out.println("* quantifier ....");
        System.out.println(Pattern.matches("[amn]*", "ammmna"));  //true (a or m or n may come zero or more times)
    }
}


Output:
? quantifier ....
true
false
false
false
false
+ quantifier ....
true
true
true
false
* quantifier ....
true
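
The example above covers ?, + and *. The counted forms X{n}, X{n,} and X{y,z} can be tried in the same way; note that both bounds of X{y,z} are inclusive. A minimal sketch, with hypothetical class and helper names:

```java
import java.util.regex.Pattern;

class BoundedQuantifierDemo {
    // Hypothetical helper: whole-string match, same as Pattern.matches().
    static boolean fits(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        // X{n}: exactly n times
        System.out.println(fits("a{2}", "aa"));      // true
        System.out.println(fits("a{2}", "aaa"));     // false

        // X{n,}: n or more times
        System.out.println(fits("a{2,}", "aaaa"));   // true
        System.out.println(fits("a{2,}", "a"));      // false

        // X{y,z}: between y and z times, both inclusive
        System.out.println(fits("a{2,4}", "aaaa"));  // true
        System.out.println(fits("a{2,4}", "aaaaa")); // false
    }
}
```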
Regex Metacharacters
RegexDescription
| Regex | Description |
| --- | --- |
| . | Any character (may or may not match line terminators) |
| \d | Any digit, short for [0-9] |
| \D | Any non-digit, short for [^0-9] |
| \s | Any whitespace character, short for [ \t\n\x0B\f\r] |
| \S | Any non-whitespace character, short for [^\s] |
| \w | Any word character, short for [a-zA-Z_0-9] |
| \W | Any non-word character, short for [^\w] |
| \b | A word boundary |
| \B | A non-word boundary |
Regular Expression Metacharacters Example
import java.util.regex.*;

class Test {
    public static void main(String args[]) {
        System.out.println("metacharacters d....");
        // \\d means digit
        System.out.println(Pattern.matches("\\d", "pqr"));    //false (non-digit)
        System.out.println(Pattern.matches("\\d", "5"));      //true (digit and comes once)
        System.out.println(Pattern.matches("\\d", "9473"));   //false (digit but comes more than once)
        System.out.println(Pattern.matches("\\d", "527pqr")); //false (digit and char)

        System.out.println("metacharacters D....");
        // \\D means non-digit
        System.out.println(Pattern.matches("\\D", "pqr"));    //false (non-digit but comes more than once)
        System.out.println(Pattern.matches("\\D", "7"));      //false (digit)
        System.out.println(Pattern.matches("\\D", "7775"));   //false (digit)
        System.out.println(Pattern.matches("\\D", "157pqr")); //false (digit and char)
        System.out.println(Pattern.matches("\\D", "n"));      //true (non-digit and comes once)

        System.out.println("metacharacters D with quantifier....");
        System.out.println(Pattern.matches("\\D*", "mak"));   //true (non-digit and may come 0 or more times)
    }
}
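
The example above exercises \d and \D only. As a complement, here is a small sketch of \b and \s: it counts whole-word occurrences using word boundaries and splits a text on runs of whitespace. The class name, helper name and sample strings are hypothetical; Pattern.quote() is used so the search word is treated literally.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryDemo {
    // Hypothetical helper: counts occurrences of 'word' that stand alone as a word.
    static int countWholeWord(String word, String text) {
        Matcher m = Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // "concat" and "category" contain "cat" but not as a separate word.
        System.out.println(countWholeWord("cat", "cat concat cat's category")); // 2

        // \s+ matches one or more whitespace characters (spaces, tabs, ...).
        String[] parts = Pattern.compile("\\s+").split("a  b\tc");
        System.out.println(Arrays.toString(parts));                             // [a, b, c]
    }
}
```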


Regular Expression Question 1
import java.util.regex.*;

class Test {
    public static void main(String args[]) {
        // The pattern accepts exactly six alphanumeric characters.
        System.out.println(Pattern.matches("[a-zA-Z0-9]{6}", "Rama36")); //true
        System.out.println(Pattern.matches("[a-zA-Z0-9]{6}", "Pavi31")); //true
        System.out.println(Pattern.matches("[a-zA-Z0-9]{6}", "Rama1"));  //false (fewer than 6 char)
        System.out.println(Pattern.matches("[a-zA-Z0-9]{6}", "Rama$2")); //false ($ is not matched)
    }
}


Output:
true
true
false
false
Regular Expression Question 2:
/* Create a regular expression that accepts 10 digit numeric characters starting with 7, 8 or 9 only. */
import java.util.regex.*;

class Test {
    public static void main(String args[]) {
        System.out.println("by character classes and quantifiers ...");
        System.out.println(Pattern.matches("[789]{1}[0-9]{9}", "9983435949")); //true
        System.out.println(Pattern.matches("[789][0-9]{9}", "9852737947"));    //true
        System.out.println(Pattern.matches("[789][0-9]{9}", "98730389490"));   //false (11 characters)
        System.out.println(Pattern.matches("[789][0-9]{9}", "2753938959"));    //false (starts with 2)
        System.out.println(Pattern.matches("[789][0-9]{9}", "8753538949"));    //true

        System.out.println("by metacharacters ...");
        System.out.println(Pattern.matches("[789]{1}\\d{9}", "8873838945"));   //true
        System.out.println(Pattern.matches("[789]{1}\\d{9}", "5875373899"));   //false (starts with 5)
    }
}


Output:
by character classes and quantifiers ...
true
true
false
false
true
by metacharacters ...
true
false
Java Regex Finder Example
import java.util.regex.Pattern;
import java.util.Scanner;
import java.util.regex.Matcher;

public class Test {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        // Keep asking for a pattern and a text until the program is terminated.
        while (true) {
            System.out.println("Enter regex pattern:");
            Pattern pattern = Pattern.compile(sc.nextLine());
            System.out.println("Enter text:");
            Matcher matcher = pattern.matcher(sc.nextLine());
            boolean found = false;
            while (matcher.find()) {
                System.out.println("I found the text " + matcher.group()
                        + " starting at index " + matcher.start()
                        + " and ending at index " + matcher.end());
                found = true;
            }
            if (!found) {
                System.out.println("No match found.");
            }
        }
    }
}


Output:
C:\new>javac Test.java
C:\new>java Test
Enter regex pattern:
Rama
Enter text:
My name is rama
No match found.
Enter regex pattern:
Rama
Enter text:
My name is Rama
I found the text Rama starting at index 11 and ending at index 15
Enter regex pattern:
Rama
Enter text:
My name is Rama . My first name is Rama
I found the text Rama starting at index 11 and ending at index 15
I found the text Rama starting at index 35 and ending at index 39


