jsoup - Parsing String



Overview

The Jsoup.parse(String) method parses the input HTML into a new Document. The resulting Document object can be used to traverse the HTML DOM and read details such as the title and individual elements.

The following examples show how to parse an HTML String into a Document object.

Syntax

Document document = Jsoup.parse(html);

Where

  • document − the Document object that represents the parsed HTML DOM.

  • Jsoup − the main class, whose static parse method parses the given HTML String.

  • html − the HTML String to parse.
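The input does not have to be a complete page. As a minimal sketch, assuming the jsoup library is on the classpath, the following parses a fragment with an unclosed paragraph and no html, head, or body tags. The class name LenientParseSketch and the helper bodyText are hypothetical names used only for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class LenientParseSketch {

   // Hypothetical helper: parses an HTML String and returns the text of its body.
   static String bodyText(String html) {
      Document document = Jsoup.parse(html);
      return document.body().text();
   }

   public static void main(String[] args) {
      // A fragment: no <html>, <head>, or <body>, and the <p> is never closed.
      String fragment = "<p>Sample Content";
      Document document = Jsoup.parse(fragment);

      // The parser still builds a full document around the fragment.
      System.out.println(bodyText(fragment));          // prints "Sample Content"

      // With no <title> element present, title() returns an empty string.
      System.out.println(document.title().isEmpty());  // prints "true"
   }
}
```

Because title() returns an empty string rather than null when no title exists, callers can test the result with isEmpty() without a null check.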

Get the tags using the Document object

String title = document.title();
Elements paragraphs = document.getElementsByTag("p");

Read tag values

for (Element paragraph : paragraphs) {
   System.out.println(paragraph.text());
}
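The two snippets above can be combined into one reusable method. This is a sketch under the assumption that jsoup is on the classpath; the class name ParagraphTextSketch and the method paragraphTexts are hypothetical and not part of the jsoup API.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class ParagraphTextSketch {

   // Hypothetical helper: returns the text of every <p> element, in document order.
   static List<String> paragraphTexts(String html) {
      Document document = Jsoup.parse(html);
      Elements paragraphs = document.getElementsByTag("p");

      List<String> texts = new ArrayList<>();
      for (Element paragraph : paragraphs) {
         texts.add(paragraph.text());
      }
      return texts;
   }

   public static void main(String[] args) {
      String html = "<html><body><p>First</p><p>Second</p></body></html>";

      // Print each paragraph's text on its own line.
      for (String text : paragraphTexts(html)) {
         System.out.println(text);
      }
   }
}
```

When the HTML contains no p elements, getElementsByTag returns an empty Elements collection, so the method returns an empty list rather than failing.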

Example - Parsing an HTML String to get the Title of the HTML

JsoupTester.java

package com.tutorialspoint;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class JsoupTester {
   public static void main(String[] args) {
   
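      // HTML to parse, supplied as a plain Java String
      String html = "<html><head><title>Sample Title</title></head>"
         + "<body><p>Sample Content</p></body></html>";

      // Parse the String into a Document (the DOM tree)
      Document document = Jsoup.parse(html);

      // Print the text of the <title> element
      System.out.println(document.title());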
   }
}

Verify the result

Compile and run the JsoupTester to verify the result −

Sample Title

Example - Parsing an HTML String to get the Body of the HTML

JsoupTester.java

package com.tutorialspoint;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupTester {
   public static void main(String[] args) {
   
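      // HTML to parse, supplied as a plain Java String
      String html = "<html><head><title>Sample Title</title></head>"
         + "<body><p>Sample Content</p></body></html>";

      // Parse the String into a Document (the DOM tree)
      Document document = Jsoup.parse(html);

      // Collect every <p> element and print its text
      Elements paragraphs = document.getElementsByTag("p");
      for (Element paragraph : paragraphs) {
         System.out.println(paragraph.text());
      }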
   }
}

Verify the result

Compile and run the JsoupTester to verify the result −

Sample Content