How to find duplicate elements in a Stream in Java


Finding duplicate elements in a stream of data is a common question in Java interviews and exams. Java provides several ways to find duplicate elements. We will focus on two: the first uses the Set interface of the Java Collections Framework, and the second uses Collections.frequency(), a static utility method of the java.util.Collections class.

Java Program to find duplicate elements in a Stream

Before discussing the different ways to get duplicate items from a collection of data, we need to look at the filter() method, because every example program below relies on it.

filter()

It allows us to select elements of the stream based on a specified condition. It is a higher-order function: it takes a predicate as an argument and applies it to each stream element. It is an intermediate operation that returns a new stream consisting only of the elements that match the predicate.

Syntax

                
filter(predicate);
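
To make the behavior concrete, here is a minimal sketch of filter() on its own. The class name FilterDemo and the helper evens() are hypothetical names chosen for illustration. The point it shows is that filter() produces another stream, so a terminal operation such as collect() is still needed to obtain a list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FilterDemo {
   // Hypothetical helper: keeps only the even numbers of the given list
   static List<Integer> evens(List<Integer> numbers) {
      return numbers.stream()
         .filter(n -> n % 2 == 0)       // intermediate operation: yields a new Stream<Integer>
         .collect(Collectors.toList()); // terminal operation: gathers the matches into a List
   }

   public static void main(String[] args) {
      System.out.println(evens(Arrays.asList(10, 12, 33, 40, 61))); // prints [10, 12, 40]
   }
}
```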

Using Set of Java Collection Framework

Set is a subinterface of the Collection interface that does not allow duplicate values, much like a mathematical set. Its add() method inserts an element only if it is not already present: it returns true when the element was added and false when the set already contained it. That return value is what lets us detect duplicates. Because Set is an interface, we use the HashSet class, which implements it.
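
The return value of add() can be checked in isolation. The following is a small sketch, in which SetAddDemo and isRepeat() are hypothetical names used only for illustration. It shows that the second attempt to add the same value is rejected.

```java
import java.util.HashSet;
import java.util.Set;

class SetAddDemo {
   // Hypothetical helper: true when value was already in seen, i.e. add() returned false
   static boolean isRepeat(Set<Integer> seen, int value) {
      return !seen.add(value);
   }

   public static void main(String[] args) {
      Set<Integer> seen = new HashSet<>();
      System.out.println(isRepeat(seen, 10)); // false: 10 is added for the first time
      System.out.println(isRepeat(seen, 10)); // true: 10 is already present, set is unchanged
      System.out.println(seen.size());        // 1
   }
}
```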

Example 

The following example illustrates how to use the Set interface to find duplicate elements in a stream.

Approach

  • Create a fixed-size list using the Arrays.asList() method.

  • Then, define a Set using the HashSet class to store only distinct elements.

  • Now, call stream() on the list and use filter() with a predicate that keeps only the elements add() rejects, that is, the elements the set has already seen. Finally, use forEach() to iterate over and print those duplicates.

import java.util.*;
public class Duplicate {
   public static void main(String[] args) {
      // create a list with duplicate items
      List<Integer> itemsList = Arrays.asList(10, 12, 10, 33, 40, 40, 61, 61);
      // declaring a new Set 
      Set<Integer> newitemSet = new HashSet<>();
      System.out.println("The list of duplicate Items: ");
      itemsList.stream() // converting list to stream
         .filter(nums -> !newitemSet.add(nums)) // keep only elements already in the set, for which add() returns false
         .forEach(System.out::println); // print the duplicates
   }
}

Output

The list of duplicate Items: 
10
40
61
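
Note that the program above prints an element once for every repeat occurrence, so a value that appears three times in the list would be printed twice. One possible way to report each duplicate only once, sketched here with the hypothetical names DistinctDuplicates and findDuplicates(), is to collect the filtered stream into a LinkedHashSet. The sketch assumes a sequential stream, since the lambda modifies the seen set while the stream runs.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class DistinctDuplicates {
   // Hypothetical helper: returns each repeated value once, in order of first repeat
   static Set<Integer> findDuplicates(List<Integer> items) {
      Set<Integer> seen = new HashSet<>();
      return items.stream()          // sequential stream; the lambda below mutates seen
         .filter(n -> !seen.add(n))  // keep values that were already seen
         .collect(Collectors.toCollection(LinkedHashSet::new)); // drop repeated duplicates
   }

   public static void main(String[] args) {
      List<Integer> items = Arrays.asList(10, 12, 10, 10, 33, 40, 40, 61, 61);
      System.out.println(findDuplicates(items)); // prints [10, 40, 61]
   }
}
```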

Using Collections.frequency() method

Another simple way to find duplicate elements in a stream or collection is to use the Collections.frequency() method of the java.util package, which returns the number of elements in a specified collection that are equal to a given object.

Syntax

Collections.frequency(nameOfCollection, obj);

Here, nameOfCollection is the collection to search, and obj is the element whose frequency is to be determined.
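
As a quick illustration of the method by itself, the sketch below counts a few values in a small list. FrequencyDemo and occurrences() are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FrequencyDemo {
   // Hypothetical helper: number of elements in items that are equal to value
   static int occurrences(List<Integer> items, Integer value) {
      return Collections.frequency(items, value);
   }

   public static void main(String[] args) {
      List<Integer> items = Arrays.asList(10, 12, 10, 10, 33);
      System.out.println(occurrences(items, 10)); // 3
      System.out.println(occurrences(items, 12)); // 1
      System.out.println(occurrences(items, 99)); // 0, because 99 is not in the list
   }
}
```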

Example 

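In the following example, we will use the Collections.frequency() method to count the occurrences of each element in the list and keep the elements that occur more than once. The program prints every occurrence of each duplicated element, followed by the total number of those occurrences. Note that Collections.frequency() scans the whole list for every element, so this approach takes quadratic time on large lists.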

import java.util.*;
public class FindDuplicates {
   public static void main(String[] args) {
      // create a list with duplicate items
      List<Integer> itemslist = Arrays.asList(10, 12, 10, 10, 33, 40, 40, 61, 61);
      System.out.println("The list of duplicate Items with frequency: ");
      itemslist.stream() // converting list to stream
         .filter(itr -> Collections.frequency(itemslist, itr) > 1) // keep elements that occur more than once
         .forEach(System.out::println); // print every occurrence of the duplicated elements
      System.out.println("Count of duplicate items: ");
      // count all occurrences of the duplicated elements
      System.out.println(itemslist.stream()
         .filter(itr -> Collections.frequency(itemslist, itr) > 1)
         .count());
   }
}

Output

The list of duplicate Items with frequency: 
10
10
10
40
40
61
61
Count of duplicate items: 
7

Example

Here is another example that uses the Set interface and the Collections.frequency() method together to get each duplicate element only once. The filter() step uses Collections.frequency() to keep the elements that occur more than once, and collect() then gathers them into a Set, which discards the repeated occurrences. The resulting Set contains only the duplicate elements from the stream. Because Collectors.toSet() makes no guarantee about ordering, the elements may print in a different order than they appear in the list.

import java.util.stream.*;
import java.util.*;
public class FindDuplicates {
   public static void main(String[] args) {
      // create a list with duplicate items
      List<Integer> itemslist = Arrays.asList(10, 12, 10, 10, 33, 40, 40, 61, 61);
      // set to store duplicate items
      Set<Integer> duplicates = itemslist.stream()
         .filter(itr -> Collections.frequency(itemslist, itr) > 1) // keep elements that occur more than once
         .collect(Collectors.toSet()); // the set keeps each duplicated value only once
      // printing the duplicate items    
      System.out.println("The list of duplicate Items:" + duplicates); 
   }
}

Output

The list of duplicate Items:[40, 10, 61]

Conclusion

To conclude, here are the key points from the examples above. The filter() method selects elements from a stream by applying a Predicate to each element. A Set stores only distinct elements, and its add() method reports whether an element was already present, which makes it a good fit for detecting duplicates. Collections.frequency() offers a simple alternative by counting how many times each element occurs in the collection.

Updated on: 19-Jul-2023
