Spring Batch - CSV to XML



In this chapter, we will create a simple Spring Batch application which uses a CSV Reader and an XML Writer.

Reader − The reader we are using in the application is FlatFileItemReader to read data from the CSV files.

Writer − The Writer we are using in the application is StaxEventItemWriter to write the data to XML file.

Processor − The Processor we are using in the application is a custom processor which just prints the records read from the CSV file.

Create Project

Create a new maven project as discussed in Spring Batch - Environment Chapter.

Following is the input CSV file we are using in this application. This document holds data records which specify details like tutorial id, tutorial author, tutorial title, submission date, tutorial icon and tutorial description.

report.csv

Create this file in src > main > resources folder of maven project.

1001, "Sanjay", "Learn Java", 06/05/2007 
1002, "Abdul S", "Learn MySQL", 19/04/2007 
1003, "Krishna Kasyap", "Learn JavaFX", 06/07/2017

pom.xml

Following are the content of pom.xml file used in this maven project.

<project xmlns = "http://maven.apache.org/POM/4.0.0" 
   xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation = "http://maven.apache.org/POM/4.0.0 
   http://maven.apache.org/maven-v4_0_0.xsd"> 
   <modelVersion>4.0.0</modelVersion> 
   <groupId>com.tutorialspoint</groupId> 
   <artifactId>SpringBatchSample</artifactId> 
   <packaging>jar</packaging> 
   <version>1.0-SNAPSHOT</version> 
   <name>SpringBatchExample</name>
   <url>http://maven.apache.org</url>  

   <properties> 
      <jdk.version>21</jdk.version> 
      <spring.version>5.3.14</spring.version> 
      <spring.batch.version>4.3.4</spring.batch.version> 
      <mysql.driver.version>8.4.0</mysql.driver.version> 
      <junit.version>4.11</junit.version> 
      <pdf.version>2.0.32</pdf.version>
      <xstream.version>1.4.17</xstream.version>
      <jaxb.version>2.3.1</jaxb.version> 
      <jaxb.impl.version>2.3.4</jaxb.impl.version> 
   </properties>  

   <dependencies> 
      <!-- Spring Core --> 
      <dependency> 
         <groupId>org.springframework</groupId> 
         <artifactId>spring-core</artifactId> 
         <version>${spring.version}</version> 
      </dependency>  

      <!-- Spring jdbc, for database --> 
      <dependency> 
         <groupId>org.springframework</groupId> 
         <artifactId>spring-jdbc</artifactId> 
         <version>${spring.version}</version> 
      </dependency>  

      <!-- Spring XML to/back object --> 
      <dependency> 
         <groupId>org.springframework</groupId> 
         <artifactId>spring-oxm</artifactId> 
         <version>${spring.version}</version> 
      </dependency>  

      <!-- MySQL database driver --> 
      <dependency>
         <groupId>com.mysql</groupId>
         <artifactId>mysql-connector-j</artifactId>
         <version>${mysql.driver.version}</version>
      </dependency>  

      <!-- Spring Batch dependencies --> 
      <dependency> 
         <groupId>org.springframework.batch</groupId> 
         <artifactId>spring-batch-core</artifactId> 
         <version>${spring.batch.version}</version> 
      </dependency> 

      <dependency> 
         <groupId>org.springframework.batch</groupId> 
         <artifactId>spring-batch-infrastructure</artifactId> 
         <version>${spring.batch.version}</version> 
      </dependency>  

      <!-- Spring Batch unit test --> 
      <dependency> 
         <groupId>org.springframework.batch</groupId> 
         <artifactId>spring-batch-test</artifactId> 
         <version>${spring.batch.version}</version> 
      </dependency>  

      <!-- Junit --> 
      <dependency> 
         <groupId>junit</groupId> 
         <artifactId>junit</artifactId> 
         <version>${junit.version}</version> 
         <scope>test</scope> 
      </dependency> 

      <!-- pdfbox -->
      <dependency>
         <groupId>org.apache.pdfbox</groupId>
         <artifactId>pdfbox</artifactId>
         <version>${pdf.version}</version>
      </dependency>
	  
      <!-- xstream -->
      <dependency>
         <groupId>com.thoughtworks.xstream</groupId>
         <artifactId>xstream</artifactId>
         <version>${xstream.version}</version>
      </dependency>
	  
      <!-- jaxb-api -->
      <dependency>
         <groupId>javax.xml.bind</groupId>
         <artifactId>jaxb-api</artifactId>
         <version>${jaxb.version}</version>
      </dependency>
	  
      <!-- JAXB RI -->
      <dependency>
         <groupId>com.sun.xml.bind</groupId>
         <artifactId>jaxb-impl</artifactId>
         <version>${jaxb.impl.version}</version>
      </dependency>

   </dependencies> 

   <build> 
      <finalName>spring-batch</finalName> 
      <plugins> 
         <plugin> 
            <groupId>org.apache.maven.plugins</groupId> 
            <artifactId>maven-eclipse-plugin</artifactId>
            <version>2.9</version> 
            <configuration> 
               <downloadSources>true</downloadSources> 
               <downloadJavadocs>false</downloadJavadocs> 
            </configuration> 
         </plugin> 
   
         <plugin> 
            <groupId>org.apache.maven.plugins</groupId> 
            <artifactId>maven-compiler-plugin</artifactId> 
            <version>2.3.2</version> 
            <configuration> 
               <source>${jdk.version}</source> 
               <target>${jdk.version}</target> 
            </configuration> 
         </plugin> 
      </plugins> 
   </build> 
</project>    

jobConfig.xml

Following is the configuration file of our sample Spring Batch application. In this file, we will define the Job and the steps. In addition to these, we also define the beans for ItemReader, ItemProcessor, and ItemWriter. (Here, we associate them with respective classes and pass the values for the required properties to configure them.)

Create this file in src > main > resources folder of maven project.

<beans xmlns = "http://www.springframework.org/schema/beans" 
   xmlns:batch = "http://www.springframework.org/schema/batch" 
   xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation = "http://www.springframework.org/schema/batch 
      http://www.springframework.org/schema/batch/spring-batch-2.2.xsd 
      http://www.springframework.org/schema/beans 
      http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">  
   
   <import resource = "context.xml" />  
   
   <bean id = "report" class = "Report" scope = "prototype" /> 
   <bean id = "itemProcessor" class = "CustomItemProcessor" />  
   
   <batch:job id = "helloWorldJob"> 
   
      <batch:step id = "step1"> 
   
         <batch:tasklet> 
            <batch:chunk reader = "cvsFileItemReader" writer = "xmlItemWriter" 
               processor = "itemProcessor" commit-interval = "10"> 
            </batch:chunk> 
         </batch:tasklet> 
      </batch:step> 
   </batch:job>  
 
   <bean id = "cvsFileItemReader" 
      class = "org.springframework.batch.item.file.FlatFileItemReader">  
      <property name = "resource" value = "classpath:report.csv" /> 
      <property name = "lineMapper"> 
         <bean 
            class = "org.springframework.batch.item.file.mapping.DefaultLineMapper"> 
            <property name = "lineTokenizer"> 
               <bean    
                  class = "org.springframework.batch.item.file.transform.DelimitedLineTokenizer"> 
                  <property name = "names" value = "tutorial_id, 
                     tutorial_author, Tutorial_title, submission_date" /> 
               </bean> 
            </property> 
      
            <property name = "fieldSetMapper"> 
               <bean class = "ReportFieldSetMapper" /> 
            </property> 
         </bean> 
      </property> 
   </bean>  
   
   <bean id = "xmlItemWriter" 
      class = "org.springframework.batch.item.xml.StaxEventItemWriter"> 
      <property name = "resource" value = "file:tutorials.xml" /> 
      <property name = "marshaller" ref = "reportMarshaller" /> 
      <property name = "rootTagName" value = "tutorials" /> 
   </bean>  
 
   <bean id = "reportMarshaller" 
      class = "org.springframework.oxm.jaxb.Jaxb2Marshaller">
      <property name = "classesToBeBound"> 
         <list> 
            <value>Tutorial</value> 
         </list> 
      </property> 
   </bean> 
</beans> 

Context.xml

Following is the context.xml of our Spring Batch application. In this file, we will define the beans like job repository, job launcher, and transaction manager.

Create this file in src > main > resources folder of maven project.

<beans xmlns = "http://www.springframework.org/schema/beans" 
   xmlns:jdbc = "http://www.springframework.org/schema/jdbc" 
   xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation = "http://www.springframework.org/schema/beans 
      http://www.springframework.org/schema/beans/spring-beans-3.2.xsd 
      http://www.springframework.org/schema/jdbc 
      http://www.springframework.org/schema/jdbc/spring-jdbc-3.2.xsd">  
   <!-- stored job-meta in database --> 
   <bean id = "jobRepository" 
      class = "org.springframework.batch.core.repository.support.JobRepositoryFactoryBean"> 
      <property name = "dataSource" ref = "dataSource" /> 
      <property name = "transactionManager" ref = "transactionManager" /> 
      <property name = "databaseType" value = "mysql" /> 
   </bean>  
 
   <bean id = "transactionManager" 
      class = "org.springframework.batch.support.transaction.ResourcelessTransactionManager" />  
   <bean id = "jobLauncher" 
      class = "org.springframework.batch.core.launch.support.SimpleJobLauncher"> 
      <property name = "jobRepository" ref = "jobRepository" /> 
   </bean>  
   
   <bean id = "dataSource" class = "org.springframework.jdbc.datasource.DriverManagerDataSource"> 
      <property name = "url" value = "jdbc:mysql://localhost:3306/details" />
      <property name = "username" value = "myuser" /> 
      <property name = "password" value = "password" /> 
   </bean> 
  
   <!-- create job-meta tables automatically --> 
   <jdbc:initialize-database data-source = "dataSource">   
      <jdbc:script location = "org/springframework/batch/core/schema-drop-mysql.sql" /> 
      <jdbc:script location = "org/springframework/batch/core/schema-mysql.sql" /> 
   </jdbc:initialize-database> 
</beans>

CustomItemProcessor.java

Following is the Processor class. In this class, we write the code of processing in the application. Here, we are printing the contents of each record.

Create this class in src > main > java folder of maven project.

import org.springframework.batch.item.ItemProcessor;  

public class CustomItemProcessor implements ItemProcessor<Tutorial, Tutorial> {  
   
   @Override 
   public Tutorial process(Tutorial item) throws Exception {  
      System.out.println("Processing..." + item); 
      return item; 
   } 
} 

TutorialFieldSetMapper.java

Following is the TutorialFieldSetMapper class which sets the data to the Tutorial class.

Create this class in src > main > java folder of maven project.

import org.springframework.batch.item.file.mapping.FieldSetMapper; 
import org.springframework.batch.item.file.transform.FieldSet; 
import org.springframework.validation.BindException;  

public class TutorialFieldSetMapper implements FieldSetMapper<Tutorial> {  

   @Override 
   public Tutorial mapFieldSet(FieldSet fieldSet) throws BindException {  
      
      //Instantiating the report object  
      Tutorial tutorial = new Tutorial(); 
       
      //Setting the fields  
      tutorial.setTutorial_id(fieldSet.readInt(0)); 
      tutorial.setTutorial_author(fieldSet.readString(1)); 
      tutorial.setTutorial_title(fieldSet.readString(2)); 
      tutorial.setSubmission_date(fieldSet.readString(3)); 
       
      return tutorial; 
   } 
}

Tutorial.java class

Following is the Tutorial class. It is a simple Java class with setter and getter methods. In this class, we are using annotations to associate the methods of this class with the tags of the XML file.

Create this class in src > main > java folder of maven project.

import javax.xml.bind.annotation.XmlAttribute; 
import javax.xml.bind.annotation.XmlElement; 
import javax.xml.bind.annotation.XmlRootElement; 

@XmlRootElement(name = "tutorial") 
public class Tutorial {  
   private int tutorial_id; 
   private String tutorial_author; 
   private String tutorial_title;
   private String submission_date;  
 
   @XmlAttribute(name = "tutorial_id") 
   public int getTutorial_id() { 
      return tutorial_id; 
   }  
 
   public void setTutorial_id(int tutorial_id) { 
      this.tutorial_id = tutorial_id; 
   }  
 
   @XmlElement(name = "tutorial_author") 
   public String getTutorial_author() { 
      return tutorial_author; 
   }  
   public void setTutorial_author(String tutorial_author) { 
      this.tutorial_author = tutorial_author; 
   }  
      
   @XmlElement(name = "tutorial_title") 
   public String getTutorial_title() { 
      return tutorial_title; 
   }  
   
   public void setTutorial_title(String tutorial_title) { 
      this.tutorial_title = tutorial_title; 
   }  
   
   @XmlElement(name = "submission_date") 
   public String getSubmission_date() { 
      return submission_date; 
   }  
   
   public void setSubmission_date(String submission_date) { 
      this.submission_date = submission_date; 
   } 
   
   @Override 
   public String toString() { 
      return "  [Tutorial id=" + tutorial_id + 
         ",Tutorial Author=" + tutorial_author  + 
         ",Tutorial Title=" + tutorial_title + 
         ",Submission Date=" + submission_date + "]"; 
   } 
}  

App.java

Following is the code which launches the batch process. In this class, we will launch the batch application by running the JobLauncher.

Create this class in src > main > java folder of maven project.

import org.springframework.batch.core.Job; 
import org.springframework.batch.core.JobExecution; 
import org.springframework.batch.core.JobParameters; 
import org.springframework.batch.core.launch.JobLauncher; 
import org.springframework.context.ApplicationContext; 
import org.springframework.context.support.ClassPathXmlApplicationContext;  

public class App {  
   public static void main(String[] args) throws Exception { 
     
      String[] springConfig  =  { "jobConfig.xml" };  
      
      // Creating the application context object        
      ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);  
      
      // Creating the job launcher 
      JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher"); 
   
      // Creating the job 
      Job job = (Job) context.getBean("helloWorldJob"); 
   
      // Executing the JOB 
      JobExecution execution = jobLauncher.run(job, new JobParameters());
      System.out.println("Exit Status : " + execution.getStatus()); 
   } 
}       

Output

Right click on the project in eclipse, select run as -> maven build . Set goals as clean package and run the project. You'll see following output.

[INFO] Scanning for projects...
[INFO] 
[INFO] [1m----------------< [0;36mcom.tutorialspoint:SpringBatchSample[0;1m >----------------[m
[INFO] [1mBuilding SpringBatchExample 1.0-SNAPSHOT[m
[INFO]   from pom.xml
[INFO] [1m--------------------------------[ jar ]---------------------------------[m
[INFO] 
[INFO] [1m--- [0;32mclean:3.2.0:clean[m [1m(default-clean)[m @ [36mSpringBatchSample[0;1m ---[m
[INFO] Deleting C:\Users\Tutorialspoint\eclipse-workspace\SpringBatchSample\target
[INFO] 
[INFO] [1m--- [0;32mresources:3.3.1:resources[m [1m(default-resources)[m @ [36mSpringBatchSample[0;1m ---[m
[WARNING] Using platform encoding (UTF-8 actually) to copy filtered resources, i.e. build is platform dependent!
[INFO] Copying 4 resources from src\main\resources to target\classes
[INFO] 
[INFO] [1m--- [0;32mcompiler:2.3.2:compile[m [1m(default-compile)[m @ [36mSpringBatchSample[0;1m ---[m
[WARNING] File encoding has not been set, using platform encoding UTF-8, i.e. build is platform dependent!
[INFO] Compiling 5 source files to C:\Users\Tutorialspoint\eclipse-workspace\SpringBatchSample\target\classes
[INFO] 
[INFO] [1m--- [0;32mresources:3.3.1:testResources[m [1m(default-testResources)[m @ [36mSpringBatchSample[0;1m ---[m
[WARNING] Using platform encoding (UTF-8 actually) to copy filtered resources, i.e. build is platform dependent!
[INFO] Copying 0 resource from src\test\resources to target\test-classes
[INFO] 
[INFO] [1m--- [0;32mcompiler:2.3.2:testCompile[m [1m(default-testCompile)[m @ [36mSpringBatchSample[0;1m ---[m
[INFO] Nothing to compile - all classes are up to date
[INFO] 
[INFO] [1m--- [0;32msurefire:3.1.2:test[m [1m(default-test)[m @ [36mSpringBatchSample[0;1m ---[m
[INFO] 
[INFO] [1m--- [0;32mjar:3.3.0:jar[m [1m(default-jar)[m @ [36mSpringBatchSample[0;1m ---[m
[INFO] Building jar: C:\Users\Tutorialspoint\eclipse-workspace\SpringBatchSample\target\spring-batch.jar
[INFO] [1m------------------------------------------------------------------------[m
[INFO] [1;32mBUILD SUCCESS[m
[INFO] [1m------------------------------------------------------------------------[m
[INFO] Total time:  3.323 s
[INFO] Finished at: 2024-07-31T12:54:04+05:30
[INFO] [1m------------------------------------------------------------------------[m

To check the output of the above SpringBatch program, right click on App.java class and select run as -> Java application. It will produce the following output −

Jul 31, 2024 1:00:16 PM org.springframework.batch.core.launch.support.SimpleJobLauncher afterPropertiesSet
INFO: No TaskExecutor has been set, defaulting to synchronous executor.
Jul 31, 2024 1:00:19 PM org.springframework.batch.core.launch.support.SimpleJobLauncher$1 run
INFO: Job: [FlowJob: [name=helloWorldJob]] launched with the following parameters: [{}]
Jul 31, 2024 1:00:19 PM org.springframework.batch.core.job.SimpleStepHandler handleStep
INFO: Executing step: [step1]
Processing...  [Tutorial id=1001,Tutorial Author=Sanjay,Tutorial Title=Learn Java,Submission Date=06/05/2007]
Processing...  [Tutorial id=1002,Tutorial Author=Abdul S,Tutorial Title=Learn MySQL,Submission Date=19/04/2007]
Processing...  [Tutorial id=1003,Tutorial Author=Krishna Kasyap,Tutorial Title=Learn JavaFX,Submission Date=06/07/2017]
Jul 31, 2024 1:00:20 PM org.springframework.batch.core.step.AbstractStep execute
INFO: Step: [step1] executed in 215ms
Jul 31, 2024 1:00:20 PM org.springframework.batch.core.launch.support.SimpleJobLauncher$1 run
INFO: Job: [FlowJob: [name=helloWorldJob]] completed with the following parameters: [{}] and the following status: [COMPLETED] in 540ms
Exit Status : COMPLETED

This will generate an XML file (tutorials.xml in root folder) with the following contents.

<?xml version = "1.0" encoding = "UTF-8"?> 
<tutorials> 
   <tutorial tutorial_id = "1001"> 
      <submission_date>06/05/2007</submission_date> 
      <tutorial_author>Sanjay</tutorial_author> 
      <tutorial_title>Learn Java</tutorial_title> 
   </tutorial> 
   
   <tutorial tutorial_id = "1002"> 
      <submission_date>19/04/2007</submission_date> 
      <tutorial_author>Abdul S</tutorial_author> 
      <tutorial_title>Learn MySQL</tutorial_title> 
   </tutorial> 
   
   <tutorial tutorial_id = "1003"> 
      <submission_date>06/07/2017</submission_date>
      <tutorial_author>Krishna Kasyap</tutorial_author> 
      <tutorial_title>Learn JavaFX</tutorial_title> 
   </tutorial> 
</tutorials>
Advertisements