
jsoup - Using DOM Methods
Overview
Jsoup.parse(String html) parses the input HTML string into a new Document. The Document object can then be used to traverse the HTML DOM and read its details.
Syntax
Document document = Jsoup.parse(html);
Element sampleDiv = document.getElementById("sampleDiv");
Elements links = sampleDiv.getElementsByTag("a");
Where
document − Document object that represents the HTML DOM.
Jsoup − main class used to parse the given HTML string.
html − HTML string.
sampleDiv − Element object that represents the HTML node identified by the id "sampleDiv".
links − Elements object that represents the nodes identified by the tag "a".
Get an element by ID
Element sampleDiv = document.getElementById("sampleDiv");
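getElementById returns null when no element carries the requested id, so a lookup on HTML you do not control is worth guarding. The following is a minimal sketch, assuming a cut-down version of the sample HTML used on this page; the class name SafeIdLookup, the helper textById, and the id "missingDiv" are hypothetical names used only for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SafeIdLookup {

   // Hypothetical helper: returns the text of the element with the given id,
   // or the fallback value when no such element exists.
   public static String textById(Document document, String id, String fallback) {
      Element element = document.getElementById(id);
      return (element != null) ? element.text() : fallback;
   }

   public static void main(String[] args) {
      String html = "<html><body>"
         + "<div id='sampleDiv'><a href='www.google.com'>Google</a></div>"
         + "</body></html>";
      Document document = Jsoup.parse(html);

      // The id exists, so the element text is printed.
      System.out.println(textById(document, "sampleDiv", "(not found)"));
      // The id does not exist, so the fallback is printed instead of
      // throwing a NullPointerException.
      System.out.println(textById(document, "missingDiv", "(not found)"));
   }
}
```

With this input the sketch should print "Google" followed by "(not found)".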
Get elements by Tag
Elements links = sampleDiv.getElementsByTag("a");
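getElementsByTag searches only the element it is called on and that element's descendants, and it returns an empty Elements collection (not null) when nothing matches. The sketch below illustrates the difference between searching the whole document and searching inside sampleDiv; the class name TagScopeDemo, the helper countByTag, and the extra "Outside" link are assumptions added for illustration and are not part of the original examples.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class TagScopeDemo {

   // Hypothetical helper: counts the elements with the given tag name
   // found at or below the given root element.
   public static int countByTag(Element root, String tagName) {
      Elements matches = root.getElementsByTag(tagName);
      return matches.size();
   }

   public static void main(String[] args) {
      String html = "<html><body>"
         + "<a href='www.example.com'>Outside</a>"
         + "<div id='sampleDiv'><a href='www.google.com'>Google</a></div>"
         + "</body></html>";
      Document document = Jsoup.parse(html);
      Element sampleDiv = document.getElementById("sampleDiv");

      // Document is itself an Element, so it can be used as the search root.
      System.out.println("Links in document: " + countByTag(document, "a"));
      // Only the link nested inside the div is found here.
      System.out.println("Links in sampleDiv: " + countByTag(sampleDiv, "a"));
      // No match yields an empty collection, so the count is 0.
      System.out.println("Spans in sampleDiv: " + countByTag(sampleDiv, "span"));
   }
}
```

Because the result is never null, a for-each loop over it is safe even when no element matches.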
Example - Parsing an HTML String and Reading Paragraphs
JsoupTester.java
package com.tutorialspoint;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupTester {
   public static void main(String[] args) {
      String html = "<html><head><title>Sample Title</title></head>"
         + "<body>"
         + "<p>Sample Content</p>"
         + "<div id='sampleDiv'><a href='www.google.com'>Google</a></div>"
         + "</body></html>";

      // Parse the HTML string into a DOM tree.
      Document document = Jsoup.parse(html);
      System.out.println(document.title());

      // Collect every <p> element and print its text.
      Elements paragraphs = document.getElementsByTag("p");
      for (Element paragraph : paragraphs) {
         System.out.println(paragraph.text());
      }
   }
}
Verify the result
Compile and run the JsoupTester to verify the result −
Sample Title
Sample Content
Example - Parsing an HTML String and Reading a Particular Div
JsoupTester.java
package com.tutorialspoint;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupTester {
   public static void main(String[] args) {
      String html = "<html><head><title>Sample Title</title></head>"
         + "<body>"
         + "<p>Sample Content</p>"
         + "<div id='sampleDiv'><a href='www.google.com'>Google</a></div>"
         + "</body></html>";

      // Parse the HTML string into a DOM tree.
      Document document = Jsoup.parse(html);

      // Look up the div by its id and print its combined text.
      Element sampleDiv = document.getElementById("sampleDiv");
      System.out.println("Data: " + sampleDiv.text());
   }
}
Verify the result
Compile and run the JsoupTester to verify the result −
Data: Google
Example - Parsing an HTML String and Reading Links
JsoupTester.java
package com.tutorialspoint;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupTester {
   public static void main(String[] args) {
      String html = "<html><head><title>Sample Title</title></head>"
         + "<body>"
         + "<p>Sample Content</p>"
         + "<div id='sampleDiv'><a href='www.google.com'>Google</a></div>"
         + "</body></html>";

      // Parse the HTML string into a DOM tree.
      Document document = Jsoup.parse(html);
      Element sampleDiv = document.getElementById("sampleDiv");

      // Collect the <a> elements inside the div and print each href and text.
      Elements links = sampleDiv.getElementsByTag("a");
      for (Element link : links) {
         System.out.println("Href: " + link.attr("href"));
         System.out.println("Text: " + link.text());
      }
   }
}
Verify the result
Compile and run the JsoupTester to verify the result −
Href: www.google.com
Text: Google