Groovy - Querying XML



In Groovy, we can query XML intuitively with the help of XmlSlurper. XmlSlurper parses the XML into a GPathResult tree, which can be navigated using dot notation (GPath expressions).

Load and Parse an Existing XML Document

Using XmlSlurper, we can load an XML document into the application.

def xmlMovies = '''
<movies>
   <movie title = 'Enemy Behind'> 
      <type>War, Thriller</type> 
      <format>DVD</format> 
      <year>2003</year>
      <director>John Adam</director>	  
      <rating>PG</rating> 
      <stars>PG</stars> 
      <description>10</description> 
   </movie> 
   <movie title = 'Transformers'> 
      <type>Anime, Science Fiction</type> 
      <format>DVD</format> 
      <year>1989</year>
      <director>Cooper</director>
      <rating>R</rating> 
      <stars>R</stars> 
      <description>8</description> 
   </movie> 
</movies>
'''

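// Parse the XML string into a movies object (a GPathResult)
// On Groovy 4 and later, add "import groovy.xml.XmlSlurper" at the top of the script
def movies = new XmlSlurper().parseText(xmlMovies)

// An XML file can be parsed as well
// def movies = new XmlSlurper().parse(new File('movies.xml'))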
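
As a minimal sketch of the loading options mentioned above, the following script parses the same document from a String, from a Reader, and from a File. It assumes Groovy 3 or later (where XmlSlurper lives in the groovy.xml package); the trimmed XML and the temporary file standing in for movies.xml are illustrative only.

```groovy
// Assumes Groovy 3 or later; on Groovy 2.x remove this import
import groovy.xml.XmlSlurper

// Trimmed copy of the sample document, kept short for this sketch
def xml = '''
<movies>
   <movie title='Enemy Behind'><year>2003</year></movie>
   <movie title='Transformers'><year>1989</year></movie>
</movies>
'''

// 1. Parse from an in-memory String
def fromText = new XmlSlurper().parseText(xml)

// 2. Parse from any java.io.Reader
def fromReader = new XmlSlurper().parse(new StringReader(xml))

// 3. Parse from a File; a temporary file stands in for movies.xml here
def tmp = File.createTempFile('movies', '.xml')
tmp.text = xml
def fromFile = new XmlSlurper().parse(tmp)
tmp.delete()

println "From text:   ${fromText.movie.size()} movies"
println "From reader: ${fromReader.movie.size()} movies"
println "From file:   ${fromFile.movie.size()} movies"
```

All three calls return the same kind of object, so the rest of the querying code does not need to know where the XML came from.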

Accessing Elements

Now we can navigate the XML structure and access elements using dot notation. When an element occurs more than once, the result can be treated like a list.

// access the root name, prints movies
println movies.name()

// Access all movie elements; the result is a GPathResult that behaves like a list
def allMovies = movies.movie

// prints Number of movies: 2
println "Number of movies: ${allMovies.size()}"

// Access the first movie
def movie = movies.movie[0]

// print movie type
println "Movie type: ${movie.type.text()}"
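
The sketch below illustrates a few more navigation details under the same assumptions as the snippet above: indexing into repeated elements, listing child elements with children(), and what happens when a path matches nothing. It assumes Groovy 3 or later for the import, and the producer element is a hypothetical name chosen because it does not exist in the sample.

```groovy
// Assumes Groovy 3 or later; on Groovy 2.x remove this import
import groovy.xml.XmlSlurper

def xml = '''
<movies>
   <movie title='Enemy Behind'>
      <type>War, Thriller</type>
      <year>2003</year>
   </movie>
   <movie title='Transformers'>
      <type>Anime, Science Fiction</type>
      <year>1989</year>
   </movie>
</movies>
'''
def movies = new XmlSlurper().parseText(xml)

// Repeated elements can be indexed like a list
def second = movies.movie[1]
println second.type.text()

// children() returns the direct child elements of a node
def childNames = movies.movie[0].children().collect { it.name() }
println childNames

// A path that matches nothing yields an empty result rather than null
// ('producer' is a hypothetical element that is absent from the sample)
def missing = movies.movie[0].producer
println missing.isEmpty()
```

Because a missing element produces an empty result, chained navigation such as movie.producer.text() does not throw a NullPointerException; it simply returns an empty string.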

Accessing Attributes

We can access attributes using the @ notation, as shown in the example below:

// Access the title of the first movie
println movies.movie[0].@title
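
Besides the @ notation, an attribute can also be read with subscript syntax, and all attributes of a single element are available as a map through attributes(). The following is a minimal sketch, assuming Groovy 3 or later for the import and a trimmed copy of the sample XML.

```groovy
// Assumes Groovy 3 or later; on Groovy 2.x remove this import
import groovy.xml.XmlSlurper

def xml = '''
<movies>
   <movie title='Enemy Behind'><year>2003</year></movie>
   <movie title='Transformers'><year>1989</year></movie>
</movies>
'''
def movies = new XmlSlurper().parseText(xml)
def first = movies.movie[0]

// Two equivalent ways to read one attribute
def viaAt = first.@title.text()
def viaSubscript = first['@title'].text()
println viaAt
println viaSubscript

// attributes() returns every attribute of the element as a Map
def attrs = first.attributes()
println attrs.title
```

Calling text() converts the attribute to a plain String, which is useful before comparing it or storing it in a collection.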

Iterating Elements

We can use methods such as each, collect and findAll to iterate over XML elements, as shown below:

// Print the title and type of each movie
movies.movie.each { mv ->
    println "Title: ${mv.@title.text()}, Type: ${mv.type.text()}"
}

// Collect all movie titles
def titles = movies.movie.collect { it.@title.text() }
println "All titles: $titles" 
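
One common pattern is to collect the elements into plain Groovy maps first and then work with ordinary list methods. The sketch below does this and sorts the movies by year; it assumes Groovy 3 or later for the import, and the variable names are illustrative.

```groovy
// Assumes Groovy 3 or later; on Groovy 2.x remove this import
import groovy.xml.XmlSlurper

def xml = '''
<movies>
   <movie title='Enemy Behind'><year>2003</year></movie>
   <movie title='Transformers'><year>1989</year></movie>
</movies>
'''
def movies = new XmlSlurper().parseText(xml)

// Convert each movie element into a plain map with typed values
def summary = movies.movie.collect { mv ->
    [title: mv.@title.text(), year: mv.year.text().toInteger()]
}

// sort(false) returns a sorted copy and leaves 'summary' unchanged
def oldestFirst = summary.sort(false) { it.year }
oldestFirst.each { println "${it.year}: ${it.title}" }

def titlesByYear = oldestFirst.collect { it.title }
println titlesByYear
```

Once the values are extracted with text() and converted, the resulting list no longer depends on the parsed XML and can be passed around freely.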

Filtering Elements

We can filter elements easily using findAll:

// Find all movies whose type includes 'Thriller'
def thrillerMovies = movies.movie.findAll { it.type.text().contains('Thriller') }
println "Number of thriller movies: ${thrillerMovies.size()}"
thrillerMovies.each { println "Thriller movie title: ${it.@title.text()}" }

// Find movies released after 2000
def recentMovies = movies.movie.findAll { it.year.text().toInteger() > 2000 }
println "Recent movie title: ${recentMovies[0].@title.text()}"
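
Filters can combine several conditions, and a filter that matches nothing returns an empty result, so it is worth checking before indexing into it. The following is a minimal sketch, assuming Groovy 3 or later for the import; the 1950 cutoff is an arbitrary value chosen so that nothing matches.

```groovy
// Assumes Groovy 3 or later; on Groovy 2.x remove this import
import groovy.xml.XmlSlurper

def xml = '''
<movies>
   <movie title='Enemy Behind'>
      <format>DVD</format>
      <year>2003</year>
      <rating>PG</rating>
   </movie>
   <movie title='Transformers'>
      <format>DVD</format>
      <year>1989</year>
      <rating>R</rating>
   </movie>
</movies>
'''
def movies = new XmlSlurper().parseText(xml)

// Combine conditions inside a single findAll closure
def pgDvds = movies.movie.findAll {
    it.format.text() == 'DVD' && it.rating.text() == 'PG'
}
def pgTitles = pgDvds.collect { it.@title.text() }
println "PG DVDs: $pgTitles"

// A filter with no matches returns an empty result
def veryOld = movies.movie.findAll { it.year.text().toInteger() < 1950 }
if (veryOld.isEmpty()) {
    println 'No movies before 1950'
}
```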

Using XPath-like Queries for Complex Searches

XmlSlurper does not evaluate XPath expressions itself, but we can write XPath-like queries by combining its depthFirst() or breadthFirst() method with the find() and findAll() methods:

// Find all director elements
def directorElements = movies.depthFirst().findAll { it.name() == 'director' }
directorElements.each { println "Found Director: ${it.text()}" }

// Find the movie by a particular director
def johnMovie = movies.movie.find { it.director.text() == 'John Adam' }
println "Movie by John Adam: ${johnMovie.@title.text()}"
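
GPath also offers '**' as a shorthand for depthFirst(), which reads a little closer to the XPath // axis. The sketch below uses it to search the whole tree; it assumes Groovy 3 or later for the import and a trimmed copy of the sample XML.

```groovy
// Assumes Groovy 3 or later; on Groovy 2.x remove this import
import groovy.xml.XmlSlurper

def xml = '''
<movies>
   <movie title='Enemy Behind'>
      <year>2003</year>
      <director>John Adam</director>
   </movie>
   <movie title='Transformers'>
      <year>1989</year>
      <director>Cooper</director>
   </movie>
</movies>
'''
def movies = new XmlSlurper().parseText(xml)

// '**' walks every node depth-first, like movies.depthFirst()
def years = movies.'**'
        .findAll { it.name() == 'year' }
        .collect { it.text().toInteger() }
println "Years: $years"

// find() stops at the first node for which the closure is true
def cooperMovie = movies.'**'.find {
    it.name() == 'movie' && it.director.text() == 'Cooper'
}
println "Movie by Cooper: ${cooperMovie.@title.text()}"
```

Searching from the root this way is convenient when the depth of the target element is not known in advance, at the cost of visiting every node in the document.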

Complete Example

Example.groovy

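// Needed on Groovy 4 and later (available since Groovy 3); on Groovy 2.x remove this import
import groovy.xml.XmlSlurper

def xmlMovies = '''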
<movies>
   <movie title = 'Enemy Behind'> 
      <type>War, Thriller</type> 
      <format>DVD</format> 
      <year>2003</year>
      <director>John Adam</director>	  
      <rating>PG</rating> 
      <stars>PG</stars> 
      <description>10</description> 
   </movie> 
   <movie title = 'Transformers'> 
      <type>Anime, Science Fiction</type> 
      <format>DVD</format> 
      <year>1989</year>
      <director>Cooper</director>
      <rating>R</rating> 
      <stars>R</stars> 
      <description>8</description> 
   </movie> 
</movies>
'''

// Parse the XML string into a movies object (a GPathResult)
def movies = new XmlSlurper().parseText(xmlMovies)

// An XML file can be parsed as well
// def movies = new XmlSlurper().parse(new File('movies.xml'))

// access the root name, prints movies
println movies.name()

// Access all movie elements; the result is a GPathResult that behaves like a list
def allMovies = movies.movie

// prints Number of movies: 2
println "Number of movies: ${allMovies.size()}"

// Access the first movie
def movie = movies.movie[0]

// print movie type
println "Movie type: ${movie.type.text()}"

// Access the title of the first movie
println movies.movie[0].@title

// Print the title and type of each movie
movies.movie.each { mv ->
    println "Title: ${mv.@title.text()}, Type: ${mv.type.text()}"
}

// Collect all movie titles
def titles = movies.movie.collect { it.@title.text() }
println "All titles: $titles" 

// Find all movies whose type includes 'Thriller'
def thrillerMovies = movies.movie.findAll { it.type.text().contains('Thriller') }
println "Number of thriller movies: ${thrillerMovies.size()}"
thrillerMovies.each { println "Thriller movie title: ${it.@title.text()}" }

// Find movies released after 2000
def recentMovies = movies.movie.findAll { it.year.text().toInteger() > 2000 }
println "Recent movie title: ${recentMovies[0].@title.text()}"

// Find all director elements
def directorElements = movies.depthFirst().findAll { it.name() == 'director' }
directorElements.each { println "Found Director: ${it.text()}" }

// Find the movie by a particular director
def johnMovie = movies.movie.find { it.director.text() == 'John Adam' }
println "Movie by John Adam: ${johnMovie.@title.text()}"

Output

Running the above program produces the following result.

movies
Number of movies: 2
Movie type: War, Thriller
Enemy Behind
Title: Enemy Behind, Type: War, Thriller
Title: Transformers, Type: Anime, Science Fiction
All titles: [Enemy Behind, Transformers]
Number of thriller movies: 1
Thriller movie title: Enemy Behind
Recent movie title: Enemy Behind
Found Director: John Adam
Found Director: Cooper
Movie by John Adam: Enemy Behind