Merging PDFs using Java


The PDFMergerUtility class from Apache PDFBox merges multiple PDF documents into a single PDF document. It takes a number of source PDF files, merges them, and saves the result as a new document. Using it requires the Apache PDFBox library (Maven artifact org.apache.pdfbox:pdfbox) on the classpath. There are also other approaches to merging PDF files in Java, two of which are covered in the Approaches section.

Definition: Merging PDFs means combining the pages of two or more PDF documents, in order, into a single output document.

Example

Input − PDF1 = Alice.pdf, PDF2 = Bob.pdf

Output − newMerged.pdf // merged pdf of pdf1 and pdf2

Program code

// Merging two PDF documents with Apache PDFBox 2.x

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;
import java.io.File;
import java.io.IOException;

public class GFG {
   public static void main(String[] args) throws IOException {

      // Load the PDF files we wish to merge
      File file1 = new File("/Users/abhilasha/Desktop/Merging Pdfs/file1.pdf");
      File file2 = new File("/Users/abhilasha/Desktop/Merging Pdfs/file2.pdf");

      // Instantiate the PDFMergerUtility class
      PDFMergerUtility obj = new PDFMergerUtility();

      // Set the destination file path
      obj.setDestinationFileName("/Users/abhilasha/Desktop/Merging Pdfs/newMerged.pdf");

      // Add all source files to be merged, in the order their pages should appear
      obj.addSource(file1);
      obj.addSource(file2);

      // Merge the documents (the no-argument mergeDocuments() is deprecated in PDFBox 2.x)
      obj.mergeDocuments(MemoryUsageSetting.setupMainMemoryOnly());

      System.out.println("PDF Documents merged to a single file");
   }
}

Algorithm

Using the Apache PDFBox library, follow these steps to merge multiple PDF documents −

  • Step 1 − Instantiate the PDFMergerUtility class.

  • Step 2 − Use the setDestinationFileName() method to set the destination file.

  • Step 3 − Use the addSource() method to add each source file.

  • Step 4 − Call the mergeDocuments() method of the PDFMergerUtility class to merge the PDF documents.
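The order of the addSource() calls in step 3 determines the order of the pages in the merged file. As a rough sketch of how a list of source names could be prepared before those calls, the standard-library-only example below keeps the .pdf names from a folder listing, skips the destination file so that a previous result is not merged into itself, and sorts the rest by name. The class MergeSourceOrder and the method sourcesInOrder() are hypothetical names used for illustration; they are not part of PDFBox.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class MergeSourceOrder {

    /**
     * Returns the names ending in ".pdf" (case-insensitive), excluding the
     * destination file, sorted case-insensitively by name.
     */
    public static List<String> sourcesInOrder(List<String> fileNames, String destinationName) {
        List<String> sources = new ArrayList<>();
        for (String name : fileNames) {
            boolean isPdf = name.toLowerCase(Locale.ROOT).endsWith(".pdf");
            boolean isDestination = name.equalsIgnoreCase(destinationName);
            if (isPdf && !isDestination) {
                sources.add(name);
            }
        }
        sources.sort(String.CASE_INSENSITIVE_ORDER);
        return sources;
    }

    public static void main(String[] args) {
        // Hypothetical folder listing; the names are assumptions for this sketch.
        List<String> folder = Arrays.asList("file2.pdf", "notes.txt", "newMerged.pdf", "file1.pdf");
        // Each returned name would be passed to addSource() in this order.
        System.out.println(sourcesInOrder(folder, "newMerged.pdf")); // prints [file1.pdf, file2.pdf]
    }
}
```

Sorting by name is only one possible policy; any explicit ordering works as long as the sources are added in the order the pages should appear.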

Output

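Console output −

PDF Documents merged to a single file

After execution, the merged file newMerged.pdf is written to the destination path set in the code.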

Approaches

Approach 1 − Merge Two PDF Files Using iText in Java

Approach 2 − Merge Multiple PDF Files Using InputStream in Java

Approach 1: Merge Two PDF Files Using iText in Java

Code

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;

/**
* This class is used to merge two or more
* existing PDF files using the iText jar.
* @author w3spoint
*/
public class PDFMergeExample {
   static void mergePdfFiles(List<InputStream> inputPdfList, OutputStream outputStream) throws Exception{
 
      //Create document and pdfReader objects.
      Document document = new Document();
      List<PdfReader> readers = new ArrayList<PdfReader>();
      int totalPages = 0;
 
      //Create pdf Iterator object using inputPdfList.
      Iterator<InputStream> pdfIterator = inputPdfList.iterator();

      // Create reader list for the input pdf files.
      while (pdfIterator.hasNext()) {
         InputStream pdf = pdfIterator.next();
         PdfReader pdfReader = new PdfReader(pdf);
         readers.add(pdfReader);
         totalPages = totalPages + pdfReader.getNumberOfPages();
      }
 
      // Create writer for the outputStream
      PdfWriter writer = PdfWriter.getInstance(document, outputStream);
 
      //Open document.
      document.open();

      //Contain the pdf data.
      PdfContentByte pageContentByte = writer.getDirectContent();
 
      PdfImportedPage pdfImportedPage;
      int currentPdfReaderPage = 1;
      Iterator<PdfReader> iteratorPDFReader = readers.iterator();
 
      // Iterate and process the reader list.
      while (iteratorPDFReader.hasNext()) {
         PdfReader pdfReader = iteratorPDFReader.next();
         //Create page and add content.
         while (currentPdfReaderPage <= pdfReader.getNumberOfPages()) {
            document.newPage();
            pdfImportedPage = writer.getImportedPage(
            pdfReader,currentPdfReaderPage);
            pageContentByte.addTemplate(pdfImportedPage, 0, 0);
            currentPdfReaderPage++;
         }
         currentPdfReaderPage = 1;
      }
 
      //Close document and outputStream.
      outputStream.flush();
      document.close();
      outputStream.close();
 
      System.out.println("Pdf files merged successfully.");
   }
   public static void main(String[] args) {
      try {
         //Prepare input pdf file list as list of input stream.
         List<InputStream> inputPdfList = new ArrayList<InputStream>();
         inputPdfList.add(new FileInputStream("D:\\TestFile1.pdf"));
         inputPdfList.add(new FileInputStream("D:\\TestFile2.pdf"));

         //Prepare output stream for merged pdf file.
         OutputStream outputStream = new FileOutputStream("D:\\MergeFile.pdf");

         //call method to merge pdf files.
         mergePdfFiles(inputPdfList, outputStream);
      } catch (Exception e) {
         e.printStackTrace();
      }
   }
}

Output

Pdf files merged successfully.
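The nested loops in mergePdfFiles() copy pages source by source: every page of the first reader, then every page of the second, with the page counter reset to 1 in between. The sketch below reproduces only that ordering, with plain integers standing in for PdfReader page counts, so it runs without iText. MergedPageOrder and mergedOrder() are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class MergedPageOrder {

    /**
     * For each page of the merged document, returns "source:page" (both 1-based),
     * visiting the sources one after another.
     */
    public static List<String> mergedOrder(int[] pagesPerSource) {
        List<String> order = new ArrayList<>();
        for (int source = 0; source < pagesPerSource.length; source++) {
            // The page counter restarts at 1 for every source document.
            for (int page = 1; page <= pagesPerSource[source]; page++) {
                order.add((source + 1) + ":" + page);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Hypothetical inputs: a 2-page PDF followed by a 3-page PDF.
        System.out.println(mergedOrder(new int[] {2, 3})); // prints [1:1, 1:2, 2:1, 2:2, 2:3]
    }
}
```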

Approach 2: Merge Multiple PDF Files Using InputStream in Java

To merge PDF files from InputStream objects, this approach uses the Spire.PDF for Java library: the static PdfDocument.mergeFiles() method accepts an array of InputStream objects and returns the merged document.

Code

import com.spire.pdf.PdfDocument;
import com.spire.pdf.PdfDocumentBase;
import java.io.*;

public class MergePdfsUsingInputStreams {
   public static void main(String[] args) throws IOException {
      //Load the PDF files into FileInputStream objects
      FileInputStream stream1 = new FileInputStream("File1.pdf");
      FileInputStream stream2 = new FileInputStream("File2.pdf");
      FileInputStream stream3 = new FileInputStream("File3.pdf");
      //Create an InputStream array for the FileInputStream objects
      InputStream[] streams = new InputStream[]{stream1, stream2, stream3};
      //Merge the PDF files
      PdfDocumentBase pdf = PdfDocument.mergeFiles(streams);
      //Create an OutputStream for the merged PDF
      OutputStream outputStream = new FileOutputStream("Merge.pdf");
      //Save the merged PDF file
      pdf.save(outputStream);
      //Close the streams
      outputStream.close();
      stream1.close();
      stream2.close();
      stream3.close();
      System.out.println("Pdf file merged successfully.");
   }
}

Output

Pdf file merged successfully.
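When the number of input files is not fixed at three, the stream array for this approach can be built in a loop. The following is a minimal sketch using only the standard library; the class PdfStreams and the helpers openAll() and closeAll() are hypothetical names introduced here and are not part of Spire.PDF. It only opens and cleans up the streams; the resulting array would be passed to the merge call in the program above.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

public class PdfStreams {

    /** Opens every path as a stream; if one fails, closes those already opened. */
    public static InputStream[] openAll(List<String> paths) throws IOException {
        InputStream[] streams = new InputStream[paths.size()];
        try {
            for (int i = 0; i < streams.length; i++) {
                streams[i] = new FileInputStream(paths.get(i));
            }
            return streams;
        } catch (IOException e) {
            closeAll(streams);
            throw e;
        }
    }

    /** Closes every non-null stream, ignoring failures on close. */
    public static void closeAll(InputStream[] streams) {
        for (InputStream stream : streams) {
            if (stream != null) {
                try {
                    stream.close();
                } catch (IOException ignored) {
                    // Nothing useful can be done if close itself fails.
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: the file paths come from the command line.
        InputStream[] streams = openAll(Arrays.asList(args));
        try {
            System.out.println("Opened " + streams.length + " stream(s)");
            // The array would be handed to the merge call here.
        } finally {
            closeAll(streams);
        }
    }
}
```

Closing the already opened streams when a later file is missing avoids leaking file handles, which matters more as the list of inputs grows.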

Conclusion

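If we check the destination path used in the first program, we can see that a new document named newMerged.pdf has been generated, containing the pages of both source files. The iText and Spire.PDF approaches likewise write their merged output to the paths given in their code.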

Updated on: 18-Jul-2023
