Java program to delete duplicate lines in a text file

The Set interface does not allow duplicate elements. Its add() method adds the given element to the Set object and returns true if the addition succeeds. If you try to add an element that already exists in the set, the set is left unchanged and add() returns false.
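The behaviour of add() described above can be checked with a small sketch. The class name SetAddDemo and the helper addTwice are hypothetical names chosen for illustration, and HashSet is assumed as the Set implementation.

```java
import java.util.HashSet;
import java.util.Set;

public class SetAddDemo {
    // Hypothetical helper: adds the same value to a fresh set twice
    // and returns the result of each add() call.
    static boolean[] addTwice(String value) {
        Set<String> seen = new HashSet<>();
        boolean first = seen.add(value);   // value is new, so add() returns true
        boolean second = seen.add(value);  // value is already present, so add() returns false
        return new boolean[] { first, second };
    }

    public static void main(String[] args) {
        boolean[] results = addTwice("Hello how are you");
        System.out.println(results[0]); // prints true
        System.out.println(results[1]); // prints false
    }
}
```

The boolean returned by add() is what makes a separate contains() check unnecessary when filtering duplicates.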

Therefore, to remove duplicate lines from a file −

  • Instantiate the Scanner class (or any class that reads data from a file).

  • Instantiate the FileWriter class (or any class that writes data into a file).

  • Create an object of a class that implements the Set interface, such as HashSet.

  • Read each line of the file and store it in a String, say input.

  • Try to add this String to the Set object.

  • If the addition is successful, append that line to the FileWriter.

  • Finally, flush the contents of the FileWriter to the output file.

If the file contains a particular line more than once, the first occurrence is added to the Set object and therefore appended to the FileWriter.

When the same line is encountered again later in the file, it already exists in the Set object, so add() returns false and the line is not written again.
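To make the first-occurrence rule concrete, here is a minimal sketch that applies it to lines held in memory rather than in a file. The class FirstOccurrenceDemo and the method keepFirstOccurrences are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FirstOccurrenceDemo {
    // Hypothetical helper: keeps only the first occurrence of each line,
    // preserving the order in which the lines were read.
    static List<String> keepFirstOccurrences(List<String> lines) {
        Set<String> seen = new HashSet<>();
        List<String> unique = new ArrayList<>();
        for (String line : lines) {
            // add() returns false for a line that was already seen
            if (seen.add(line)) {
                unique.add(line);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Hello how are you",
            "Hello how are you",
            "welcome to Tutorialspoint");
        System.out.println(keepFirstOccurrences(lines));
        // prints [Hello how are you, welcome to Tutorialspoint]
    }
}
```

Because each line is emitted at the moment it is first seen, the output order follows the input order even though HashSet itself is unordered.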


Assume we have a file named sample.txt with the following contents −

Hello how are you
Hello how are you
welcome to Tutorialspoint

The following Java program reads the above file, skips the duplicate lines, and writes the remaining unique lines to a file named output.txt.

import java.io.File;
import java.io.FileWriter;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;
public class DeletingDuplcateLines {
   public static void main(String args[]) throws Exception {
      String filePath = "D://sample.txt";
      String input = null;
      //Instantiating the Scanner class
      Scanner sc = new Scanner(new File(filePath));
      //Instantiating the FileWriter class
      FileWriter writer = new FileWriter("D://output.txt");
      //Instantiating a HashSet, an implementation of the Set interface
      Set<String> set = new HashSet<String>();
      while (sc.hasNextLine()) {
         input = sc.nextLine();
         //add() returns true only the first time a line is seen
         if(set.add(input)) {
            writer.append(input + "\n");
         }
      }
      writer.flush();
      //Releasing the file handles
      writer.close();
      sc.close();
      System.out.println("Contents added............");
   }
}


When executed, the program prints the following on the console −

Contents added............

The contents of output.txt will be −

Hello how are you
welcome to Tutorialspoint