Spring Batch - CSV to XML



In this chapter, we will create a simple Spring Batch application which uses a CSV Reader and an XML Writer.

Reader − The reader we are using in the application is FlatFileItemReader to read data from the CSV files.

Following is the input CSV file we are using in this application. This document holds data records which specify details like tutorial id, tutorial author, tutorial title, submission date, tutorial icon and tutorial description.

1001, "Sanjay", "Learn Java", 06/05/2007 
1002, "Abdul S", "Learn MySQL", 19/04/2007 
1003, "Krishna Kasyap", "Learn JavaFX", 06/07/2017

Writer − The Writer we are using in the application is StaxEventItemWriter to write the data to XML file.

Processor − The Processor we are using in the application is a custom processor which just prints the records read from the CSV file.

jobConfig.xml

Following is the configuration file of our sample Spring Batch application. In this file, we will define the Job and the steps. In addition to these, we also define the beans for ItemReader, ItemProcessor, and ItemWriter. (Here, we associate them with respective classes and pass the values for the required properties to configure them.)

<beans xmlns = " http://www.springframework.org/schema/beans" 
   xmlns:batch = "http://www.springframework.org/schema/batch" 
   xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation = "http://www.springframework.org/schema/batch 
      http://www.springframework.org/schema/batch/spring-batch-2.2.xsd 
      http://www.springframework.org/schema/beans 
      http://www.springframework.org/schema/beans/spring-beans-3.2.xsd">  
   
   <import resource = "../jobs/context.xml" />  
   
   <bean id = "report" class = "Report" scope = "prototype" /> 
   <bean id = "itemProcessor" class = "CustomItemProcessor" />  
   
   <batch:job id = "helloWorldJob"> 
   
      <batch:step id = "step1"> 
   
         <batch:tasklet> 
            <batch:chunk reader = "cvsFileItemReader" writer = "xmlItemWriter" 
               processor = "itemProcessor" commit-interval = "10"> 
            </batch:chunk> 
         </batch:tasklet> 
      </batch:step> 
   </batch:job>  
 
   <bean id = "cvsFileItemReader" 
      class = "org.springframework.batch.item.file.FlatFileItemReader">  
      <property name = "resource" value = "classpath:resources/report.csv" /> 
      <property name = "lineMapper"> 
         <bean 
            class = "org.springframework.batch.item.file.mapping.DefaultLineMapper"> 
            <property name = "lineTokenizer"> 
               <bean    
                  class = "org.springframework.batch.item.file.transform.DelimitedLineTokenizer"> 
                  <property name = "names" value = "tutorial_id, 
                     tutorial_author, Tutorial_title, submission_date" /> 
               </bean> 
            </property> 
      
            <property name = "fieldSetMapper"> 
               <bean class = "ReportFieldSetMapper" /> 
            </property> 
         </bean> 
      </property> 
   </bean>  
   
   <bean id = "xmlItemWriter" 
      class = "org.springframework.batch.item.xml.StaxEventItemWriter"> 
      <property name = "resource" value = "file:xml/outputs/tutorials.xml" /> 
      <property name = "marshaller" ref = "reportMarshaller" /> 
      <property name = "rootTagName" value = "tutorials" /> 
   </bean>  
 
   <bean id = "reportMarshaller" 
      class = "org.springframework.oxm.jaxb.Jaxb2Marshaller">
      <property name = "classesToBeBound"> 
         <list> 
            <value>Tutorial</value> 
         </list> 
      </property> 
   </bean> 
</beans> 

Context.xml

Following is the context.xml of our Spring Batch application. In this file, we will define the beans like job repository, job launcher, and transaction manager.

<beans xmlns = "http://www.springframework.org/schema/beans" 
   xmlns:jdbc = "http://www.springframework.org/schema/jdbc" 
   xmlns:xsi = "http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation = "http://www.springframework.org/schema/beans 
      http://www.springframework.org/schema/beans/spring-beans-3.2.xsd 
      http://www.springframework.org/schema/jdbc 
      http://www.springframework.org/schema/jdbc/spring-jdbc-3.2.xsd">  
   <!-- stored job-meta in database --> 
   <bean id = "jobRepository" 
      class = "org.springframework.batch.core.repository.support.JobRepositoryFactoryBean"> 
      <property name = "dataSource" ref = "dataSource" /> 
      <property name = "transactionManager" ref = "transactionManager" /> 
      <property name = "databaseType" value = "mysql" /> 
   </bean>  
 
   <bean id = "transactionManager" 
      class = "org.springframework.batch.support.transaction.ResourcelessTransactionManager" />  
   <bean id = "jobLauncher" 
      class = "org.springframework.batch.core.launch.support.SimpleJobLauncher"> 
      <property name = "jobRepository" ref = "jobRepository" /> 
   </bean>  
   
   <bean id = "dataSource" class = "org.springframework.jdbc.datasource.DriverManagerDataSource"> 
      <property name = "driverClassName" value = "com.mysql.jdbc.Driver" /> 
      <property name = "url" value = "jdbc:mysql://localhost:3306/details" />
      <property name = "username" value = "myuser" /> 
      <property name = "password" value = "password" /> 
   </bean> 
  
   <!-- create job-meta tables automatically --> 
   <jdbc:initialize-database data-source = "dataSource">   
      <jdbc:script location = "org/springframework/batch/core/schema-drop-mysql.sql" /> 
      <jdbc:script location = "org/springframework/batch/core/schema-mysql.sql" /> 
   </jdbc:initialize-database> 
</beans>

CustomItemProcessor.java

Following is the Processor class. In this class, we write the code of processing in the application. Here, we are printing the contents of each record.

import org.springframework.batch.item.ItemProcessor;  

public class CustomItemProcessor implements ItemProcessor<Tutorial, Tutorial> {  
   
   @Override 
   public Tutorial process(Tutorial item) throws Exception {  
      System.out.println("Processing..." + item); 
      return item; 
   } 
} 

TutorialFieldSetMapper.java

Following is the TutorialFieldSetMapper class which sets the data to the Tutorial class.

import org.springframework.batch.item.file.mapping.FieldSetMapper; 
import org.springframework.batch.item.file.transform.FieldSet; 
import org.springframework.validation.BindException;  

public class TutorialFieldSetMapper implements FieldSetMapper<Tutorial> {  

   @Override 
   public Tutorial mapFieldSet(FieldSet fieldSet) throws BindException {  
      
      //Instantiating the report object  
      Tutorial tutorial = new Tutorial(); 
       
      //Setting the fields  
      tutorial.setTutorial_id(fieldSet.readInt(0)); 
      tutorial.setTutorial_author(fieldSet.readString(1)); 
      tutorial.setTutorial_title(fieldSet.readString(2)); 
      tutorial.setSubmission_date(fieldSet.readString(3)); 
       
      return tutorial; 
   } 
}

Tutorial.java class

Following is the Tutorial class. It is a simple Java class with setter and getter methods. In this class, we are using annotations to associate the methods of this class with the tags of the XML file.

import javax.xml.bind.annotation.XmlAttribute; 
import javax.xml.bind.annotation.XmlElement; 
import javax.xml.bind.annotation.XmlRootElement;  

@XmlRootElement(name = "tutorial") 
public class Tutorial {  
   private int tutorial_id; 
   private String tutorial_author; 
   private String tutorial_title;
   private String submission_date;  
 
   @XmlAttribute(name = "tutorial_id") 
   public int getTutorial_id() { 
      return tutorial_id; 
   }  
 
   public void setTutorial_id(int tutorial_id) { 
      this.tutorial_id = tutorial_id; 
   }  
 
   @XmlElement(name = "tutorial_author") 
   public String getTutorial_author() { 
      return tutorial_author; 
   }  
   public void setTutorial_author(String tutorial_author) { 
      this.tutorial_author = tutorial_author; 
   }  
      
   @XmlElement(name = "tutorial_title") 
   public String getTutorial_title() { 
      return tutorial_title; 
   }  
   
   public void setTutorial_title(String tutorial_title) { 
      this.tutorial_title = tutorial_title; 
   }  
   
   @XmlElement(name = "submission_date") 
   public String getSubmission_date() { 
      return submission_date; 
   }  
   
   public void setSubmission_date(String submission_date) { 
      this.submission_date = submission_date; 
   } 
   
   @Override 
   public String toString() { 
      return "  [Tutorial id=" + tutorial_id + ", 
         Tutorial Author=" + tutorial_author  + ", 
         Tutorial Title=" + tutorial_title + ", 
         Submission Date=" + submission_date + "]"; 
   } 
}  

App.java

Following is the code which launches the batch process. In this class, we will launch the batch application by running the JobLauncher.

import org.springframework.batch.core.Job; 
import org.springframework.batch.core.JobExecution; 
import org.springframework.batch.core.JobParameters; 
import org.springframework.batch.core.launch.JobLauncher; 
import org.springframework.context.ApplicationContext; 
import org.springframework.context.support.ClassPathXmlApplicationContext;  

public class App {  
   public static void main(String[] args) throws Exception { 
     
      String[] springConfig  =  { "jobs/job_hello_world.xml" };  
      
      // Creating the application context object        
      ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);  
      
      // Creating the job launcher 
      JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher"); 
   
      // Creating the job 
      Job job = (Job) context.getBean("helloWorldJob"); 
   
      // Executing the JOB 
      JobExecution execution = jobLauncher.run(job, new JobParameters());
      System.out.println("Exit Status : " + execution.getStatus()); 
   } 
}       

On executing this application, it will produce the following output.

May 08, 2017 10:10:12 AM org.springframework.context.support.ClassPathXmlApplicationContext prepareRefresh 
INFO: Refreshing 
org.springframework.context.support.ClassPathXmlApplicationContext@3d646c37: startup date 
[Mon May 08 10:10:12 IST 2017]; root of context hierarchy 
May 08, 2017 10:10:12 AM org.springframework.beans.factory.xml.XmlBeanDefinitionReader loadBeanDefinitions 
May 08, 2017 10:10:15 AM org.springframework.jdbc.datasource.init.ScriptUtils executeSqlScript 
INFO: Executing step: [step1] 
Processing...  [Tutorial id=1001, Tutorial Author=Sanjay, 
Tutorial Title=Learn Java, Submission Date=06/05/2007] 
Processing...  [Tutorial id=1002, Tutorial Author=Abdul S, 
Tutorial Title=Learn MySQL, Submission Date=19/04/2007] 
Processing...  [Tutorial id=1003, Tutorial Author=Krishna Kasyap, 
Tutorial Title=Learn JavaFX, Submission Date=06/07/2017] 
May 08, 2017 10:10:21 AM org.springframework.batch.core.launch.support.SimpleJobLauncher run 
INFO: Job: [FlowJob: [name=helloWorldJob]] completed with the following parameters: 
[{}] and the following status: [COMPLETED] 
Exit Status : COMPLETED

This will generate an XML file with the following contents.

<?xml version = "1.0" encoding = "UTF-8"?> 
<tutorials> 
   <tutorial tutorial_id = "1001"> 
      <submission_date>06/05/2007</submission_date> 
      <tutorial_author>Sanjay</tutorial_author> 
      <tutorial_title>Learn Java</tutorial_title> 
   </tutorial> 
   
   <tutorial tutorial_id = "1002"> 
      <submission_date>19/04/2007</submission_date> 
      <tutorial_author>Abdul S</tutorial_author> 
      <tutorial_title>Learn MySQL</tutorial_title> 
   </tutorial> 
   
   <tutorial tutorial_id = "1003"> 
      <submission_date>06/07/2017</submission_date>
      <tutorial_author>Krishna Kasyap</tutorial_author> 
      <tutorial_title>Learn JavaFX</tutorial_title> 
   </tutorial> 
</tutorials>
Advertisements