Spring Batch - Architecture
The following figure shows the architecture of Spring Batch. As depicted in the figure, the architecture contains three main components: Application, Batch Core, and Batch Infrastructure.
Application − This component contains all the jobs and the code we write using the Spring Batch framework.
Batch Core − This component contains all the API classes that are needed to control and launch a Batch Job.
Batch Infrastructure − This component contains the readers, writers, and services used by both the Application and Batch Core components.
Components of Spring Batch
The following illustration shows the different components of Spring Batch and how they are connected with each other.
In a Spring Batch application, a job is the batch process that is to be executed. It runs from start to finish without interruption. A job is divided into one or more steps.
We will configure a job in Spring Batch using an XML file or a Java class. Following is the XML configuration of a Job in Spring Batch.
<job id = "jobid">
   <step id = "step1" next = "step2"/>
   <step id = "step2" next = "step3"/>
   <step id = "step3"/>
</job>
A batch job is configured within the tags <job></job>. It has an attribute named id. Within these tags, we define the steps and the order in which they run.
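The ordering expressed by the next attributes can be pictured with a small, framework-free sketch. This is only a simplified model under stated assumptions, not how Spring Batch itself resolves steps: the class StepOrderSketch and the method executionOrder are hypothetical names, and the sketch assumes the first declared step is the starting point.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of step ordering; hypothetical names, NOT the Spring Batch API.
class StepOrderSketch {

    /**
     * Starts at the first declared step and follows each step's "next"
     * reference until a step has no successor.
     */
    static List<String> executionOrder(Map<String, String> nextByStep) {
        List<String> order = new ArrayList<>();
        if (nextByStep.isEmpty()) {
            return order;
        }
        String current = nextByStep.keySet().iterator().next();
        // Stop at the last step (no "next") or if a step repeats (cycle guard).
        while (current != null && !order.contains(current)) {
            order.add(current);
            current = nextByStep.get(current);
        }
        return order;
    }

    public static void main(String[] args) {
        // Mirrors the XML above: step1 -> step2 -> step3.
        Map<String, String> steps = new LinkedHashMap<>();
        steps.put("step1", "step2");
        steps.put("step2", "step3");
        steps.put("step3", null); // last step has no "next" attribute
        System.out.println(executionOrder(steps)); // prints [step1, step2, step3]
    }
}
```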
Restartable − By default, launching a job for a JobInstance that already has an execution is treated as a restart of that instance. If a job should never be restarted, set the restartable attribute to false as shown below. An attempt to restart such a job then fails with a JobRestartException.
<job id = "jobid" restartable = "false" > </job>
A step is an independent part of a job. It contains the information needed to define and execute that part of the job.
As specified in the diagram, each step is composed of an ItemReader, ItemProcessor (optional) and an ItemWriter. A job may contain one or more steps.
Readers, Writers, and Processors
An item reader reads data into a Spring Batch application from a particular source, whereas an item writer writes data from the Spring Batch application to a particular destination.
An item processor is a class that contains the code for processing the data read into the Spring Batch application. If the application reads "n" records, the code in the processor is executed once for each record.
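The division of work between reader, processor, and writer can be illustrated with a minimal, framework-free sketch. It is a simplified model and not Spring Batch's implementation: the class ChunkStepSketch, the method runStep, and the chunkSize parameter are hypothetical, and the writer is modeled as a list that receives one group of items per write call.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Simplified model of a read-process-write step; hypothetical names, NOT the Spring Batch API.
class ChunkStepSketch {

    /**
     * Reads items one at a time, runs the processor once per item, and hands
     * the results to the writer in groups of at most chunkSize items.
     * Returns the number of items read.
     */
    static <I, O> int runStep(Iterator<I> reader,
                              Function<I, O> processor,
                              List<List<O>> writer,
                              int chunkSize) {
        int readCount = 0;
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            I item = reader.next();              // reader: one record at a time
            readCount++;
            chunk.add(processor.apply(item));    // processor: runs once per record
            if (chunk.size() == chunkSize) {
                writer.add(new ArrayList<>(chunk)); // writer: one call per full chunk
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {
            writer.add(new ArrayList<>(chunk));  // write the final, partial chunk
        }
        return readCount;
    }

    public static void main(String[] args) {
        List<List<String>> written = new ArrayList<>();
        int n = runStep(List.of("a", "b", "c").iterator(),
                        s -> s.toUpperCase(), written, 2);
        System.out.println(n + " records read, written as " + written);
    }
}
```

With three input records and a chunk size of 2, the processor runs three times and the writer is called twice, once with a full chunk and once with the remaining record.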
When no reader and writer are given, a tasklet acts as the processor for the step and performs only a single task. A step that reads and writes data instead declares a chunk inside the tasklet element. For example, if we write a job with a simple step that reads data from a MySQL database, processes it, and writes it to a flat file, then the step uses −
A reader that reads from the MySQL database.
A writer that writes to a flat file.
A custom processor that processes the data as required.
<job id = "helloWorldJob">
   <step id = "step1">
      <tasklet>
         <chunk reader = "mysqlReader" writer = "fileWriter" processor = "CustomitemProcessor"></chunk>
      </tasklet>
   </step>
</job>
Spring Batch provides a long list of readers and writers. Using these predefined classes, we can define beans for them. We will discuss readers and writers in greater detail in the coming chapters.
A Job repository in Spring Batch provides Create, Retrieve, Update, and Delete (CRUD) operations for the JobLauncher, Job, and Step implementations. We will define a job repository in an XML file as shown below.
<job-repository id = "jobRepository"/>
In addition to id, several optional attributes are available. Following is a job repository configuration with all the options set explicitly. The values shown are the defaults, except for max-varchar-length, which defaults to 2500.
<job-repository id = "jobRepository"
   data-source = "dataSource"
   transaction-manager = "transactionManager"
   isolation-level-for-create = "SERIALIZABLE"
   table-prefix = "BATCH_"
   max-varchar-length = "1000"/>
In-Memory Repository − If you don’t want to persist the Spring Batch domain objects in a database, you can configure the in-memory version of the jobRepository as shown below.
<bean id = "jobRepository"
   class = "org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
   <property name = "transactionManager" ref = "transactionManager"/>
</bean>
JobLauncher is an interface that launches a Spring Batch job with a given set of parameters. SimpleJobLauncher is the class that implements the JobLauncher interface. Following is the configuration of the JobLauncher.
<bean id = "jobLauncher"
   class = "org.springframework.batch.core.launch.support.SimpleJobLauncher">
   <property name = "jobRepository" ref = "jobRepository" />
</bean>
A JobInstance represents the logical run of a job; it is created when we run a job. Each job instance is differentiated by the name of the job and the parameters passed to it while running.
If a JobInstance execution fails, the same JobInstance can be executed again. Hence, each JobInstance can have multiple job executions.
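The relationship between a job instance and its executions can be sketched without the framework. The following is a simplified model under stated assumptions, not the Spring Batch classes: JobInstanceSketch, recordExecution, and the in-memory map are hypothetical, as are the job name, the date parameter, and the status strings used in main.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of JobInstance identity; hypothetical names, NOT the Spring Batch classes.
class JobInstanceSketch {

    // Key = (job name, parameters); value = outcomes of every execution of that instance.
    private final Map<Map.Entry<String, Map<String, String>>, List<String>> executions =
            new HashMap<>();

    /** Records one execution and returns how many executions the instance now has. */
    int recordExecution(String jobName, Map<String, String> parameters, String status) {
        // Same name and same parameters produce an equal key, hence the same instance.
        Map.Entry<String, Map<String, String>> key =
                Map.entry(jobName, Map.copyOf(parameters));
        List<String> runs = executions.computeIfAbsent(key, k -> new ArrayList<>());
        runs.add(status);
        return runs.size();
    }

    /** Number of distinct job instances seen so far. */
    int instanceCount() {
        return executions.size();
    }

    public static void main(String[] args) {
        JobInstanceSketch repository = new JobInstanceSketch();
        // A failed run followed by a rerun with the same parameters: one instance, two executions.
        repository.recordExecution("importJob", Map.of("date", "2024-01-01"), "FAILED");
        int runs = repository.recordExecution("importJob", Map.of("date", "2024-01-01"), "COMPLETED");
        // Different parameters: a new instance.
        repository.recordExecution("importJob", Map.of("date", "2024-01-02"), "COMPLETED");
        System.out.println("executions of first instance: " + runs);
        System.out.println("distinct instances: " + repository.instanceCount());
    }
}
```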
JobExecution and StepExecution
JobExecution and StepExecution represent a single execution of a job and of a step, respectively. They contain run information such as the start time and end time of the job or step.