Explain quantifiers in Java regular expressions


Quantifiers in Java regular expressions are special characters that specify how many times the preceding character, character class, or group may occur in a match. The most common quantifiers are:

  • *: Zero or more instances of the character or set of characters that came before it.

  • ?: Zero or one instance of the character or set of characters that came before it.

  • +: One or more instances of the character or collection of characters that came before it.

  • {n}: Specifically n instances of the character or set of characters that came before it.

  • {n,}: The character or set of characters before it appears at least n times.

  • {n,m}: n to m instances of the character or set of characters that came before it.
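The list above can be checked with a short sketch. The class name QuantifierBasics and the helper fullMatch are made up for this illustration; Pattern.matches() is the standard method that tests whether the whole input matches a regular expression.

```java
import java.util.regex.Pattern;

class QuantifierBasics {
    // Hypothetical helper: true only if the ENTIRE input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("a*", ""));          // true:  * allows zero instances
        System.out.println(fullMatch("a+", ""));          // false: + needs at least one
        System.out.println(fullMatch("a?", "aa"));        // false: ? allows at most one
        System.out.println(fullMatch("a{3}", "aaa"));     // true:  exactly three
        System.out.println(fullMatch("a{2,}", "a"));      // false: at least two required
        System.out.println(fullMatch("a{2,4}", "aaaaa")); // false: at most four allowed
    }
}
```

Because fullMatch tests the whole input, each result depends only on the quantifier's allowed repetition count.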

Quantifiers can be greedy, reluctant, or possessive. Greedy quantifiers attempt to match as much of the input text as possible, whereas reluctant quantifiers attempt to match as little as possible. Possessive quantifiers also match as much as possible, but they never give back characters they have already matched, so the overall match can fail where a greedy quantifier would have succeeded by backtracking.

Types of Quantifiers

  • Greedy Quantifier

  • Reluctant Quantifier

  • Possessive Quantifier

Greedy Quantifier

Greedy quantifiers are the default type of quantifier in regular expressions. They try to match the longest possible string that matches the pattern. For example, the regular expression a+ will match the string aaaa as aaaa, not a.

A greedy quantifier first consumes as much of the input as it can. If the rest of the pattern then fails to match, the regex engine backtracks: it gives up the last character the quantifier consumed and tries again. This continues until the whole pattern matches or the quantifier cannot give up any more characters.
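Capturing groups make this give-back step visible. The following is a minimal sketch; the class name GreedyBacktracking, the helper captureTwoGroups, and the pattern (\d+)(\d) are chosen only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyBacktracking {
    // Hypothetical helper: returns the text of groups 1 and 2,
    // or null if the whole input does not match the regex.
    static String[] captureTwoGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // The greedy \d+ first takes all five digits, then gives one back
        // so that the trailing \d still has a digit to match.
        String[] groups = captureTwoGroups("(\\d+)(\\d)", "12345");
        System.out.println(groups[0] + " | " + groups[1]); // prints "1234 | 5"
    }
}
```

The first group ends up with four digits rather than five because the engine had to release one character for the second group.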

while()

This code in Java looks for the pattern "a+" in the input "aaa". It finds where this pattern appears and shows each match's starting and ending positions. The program outputs a message like "Pattern found ranging from 0 to 2," indicating where it found the pattern in the input.

Algorithm

  • Step 1: Compile the regular expression pattern using Pattern.compile() method with pattern "a+" and assign it to variable p.

  • Step 2: Generate a Matcher object called m by using the matcher() method on pattern p and passing input string "aaa".

  • Step 3: Initiate a while loop for iterating through the Matcher's matches.

  • Step 4: Check for a match using the find() method of Matcher m.

  • Step 5: If a match is found, execute code within the loop.

  • Step 6: Print "Pattern found ranging from" concatenated with m.start() and " to " and (m.end()-1).

  • Step 7: Close the while loop.

  • Step 8: Conclude the main method and the TLP class.

Example

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TLP
{
	public static void main(String[] args)
	{
		Pattern p = Pattern.compile("a+");

		Matcher m = p.matcher("aaa");

		while (m.find())
			System.out.println("Pattern found ranging from " + m.start() +
							" to " + (m.end()-1));

	}
}

Output

Pattern found ranging from 0 to 2

Reluctant Quantifier

Reluctant (also called non-greedy or lazy) quantifiers are the opposite of greedy quantifiers. They are written by appending ? to a quantifier, and they try to match the shortest possible string that satisfies the pattern. For instance, the regular expression a+? matches only a single a in the string aaaa, not the whole string.
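The difference is easier to see when something follows the quantifier. Here is a small sketch comparing the two modes on the same input; the class name ReluctantVsGreedy, the helper firstMatch, and the sample text are assumptions made for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantVsGreedy {
    // Hypothetical helper: returns the first substring matched by the regex,
    // or null if there is no match.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<b>bold</b>";
        // Greedy: .+ runs to the end of the input, then backs up to the last '>'.
        System.out.println(firstMatch("<.+>", input));  // prints "<b>bold</b>"
        // Reluctant: .+? stops at the first '>' it can reach.
        System.out.println(firstMatch("<.+?>", input)); // prints "<b>"
    }
}
```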

while()

The code creates a regular expression pattern, a+?, that matches one or more a characters but as few as possible, so each match is a single a. The code then creates a matcher object for the input string "aaa". The matcher object is used to find all instances of the pattern in the input string. For each occurrence of the pattern, the code prints the message "Pattern ranging from start() to end()-1". The start() method of the matcher object returns the index of the first matched character, and end() returns the index just past the last matched character, which is why the code prints end()-1.

Algorithm

  • Step 1: Create a Pattern object by compiling the regular expression "a+?".

  • Step 2: Create a Matcher object by passing the input string "aaa" to it.

  • Step 3: Enter a loop to find and print matches in the input string.

  • Step 4: The outcome will display the beginning and ending indexes of the matching patterns found in the input string "aaa."

Example

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TLP
{
	public static void main(String[] args)
	{
		Pattern p = Pattern.compile("a+?");

		Matcher m = p.matcher("aaa");

		while (m.find())
			System.out.println("Pattern ranging from " + m.start() +
							" to " + (m.end()-1));

	}
}

Output

Pattern ranging from 0 to 0
Pattern ranging from 1 to 1
Pattern ranging from 2 to 2

Possessive Quantifier

Like a greedy quantifier, a possessive quantifier matches as many characters as possible. Unlike a greedy quantifier, however, it never backtracks: if the rest of the pattern fails to match, it does not give back any of the characters it has consumed. A possessive quantifier is written by appending + to a quantifier, as in c++.
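The lack of backtracking only shows up when the pattern continues after the quantifier. The sketch below illustrates this under that assumption; the class name PossessiveNoBacktrack, the helper found, and the patterns c+c and c++c are invented for this example.

```java
import java.util.regex.Pattern;

class PossessiveNoBacktrack {
    // Hypothetical helper: true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // Greedy: c+ takes "ccc", then gives one c back so the final c can match.
        System.out.println(found("c+c", "ccc"));  // prints "true"
        // Possessive: c++ takes every remaining c and keeps them,
        // so the final c in the pattern has nothing left to match.
        System.out.println(found("c++c", "ccc")); // prints "false"
    }
}
```

When nothing follows the possessive quantifier, as in the pattern c++ on its own, the result is the same as with the greedy c+.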

while()

The code creates a regular expression pattern, c++, which is the character c followed by the possessive quantifier ++; it matches one or more c characters without backtracking. This example then creates a matcher object for the input string "ccc". The matcher object is used to find all instances of the pattern in the input string. For each occurrence of the pattern, the code prints the starting index, start(), and the index of the last matched character, end()-1.

Algorithm

  • Step 1: Declare the main class named TLP.

  • Step 2: Create a Pattern object by compiling the regular expression "c++".

  • Step 3: Create a Matcher object by passing the input string "ccc" to it.

  • Step 4: Enter a loop to find and print matches in the input string.

  • Step 5: When the code runs, it displays the beginning and ending indexes of each match of the pattern "c++" in the input string "ccc".

Example

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TLP
{
	public static void main(String[] args)
	{
		Pattern p = Pattern.compile("c++");

		// Making an instance of Matcher class
		Matcher m = p.matcher("ccc");

		while (m.find())
			System.out.println("Pattern ranging from " + m.start() +
							" to " + (m.end()-1));
	}
}

Output

Pattern ranging from 0 to 2

Conclusion

In Java regular expressions, quantifiers indicate how many times a character or set of characters can appear in a match. Quantifiers come in three varieties: greedy, reluctant, and possessive.

Greedy quantifiers match as many characters as possible and backtrack when the rest of the pattern requires it. Reluctant quantifiers match as few characters as possible. Possessive quantifiers match as many characters as possible but never backtrack.

Here are some examples of quantifiers in Java regular expressions:

  • a+ matches one or more a characters

  • a* matches zero or more a characters

  • a? matches one a character or none

  • a{3} matches exactly three a characters

Updated on: 29-Aug-2023
