jsoup - Parsing HTML Fragment/Body



Overview

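The Jsoup.parseBodyFragment(String) method parses the input HTML as a body fragment and returns a new Document. jsoup creates an empty document shell (html, head and body elements) and places the parsed fragment inside the body. The Document object can then be used to traverse the fragment and read its details.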

The following example shows how to parse an HTML fragment into a Document object.

Syntax

Document document = Jsoup.parseBodyFragment(html);

Where

  • document − the Document object that represents the parsed HTML DOM.

  • Jsoup − the main class used to parse the given HTML String.

  • html − the HTML fragment String to parse.
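To make the document shell concrete, here is a minimal sketch, assuming jsoup is on the classpath. The class name FragmentShellDemo and the helper bodyChildCount are hypothetical names chosen for illustration. The fragment deliberately omits html, head and body tags and leaves its p tags unclosed, to show that the parser still produces a complete document.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class FragmentShellDemo {
   // Hypothetical helper: counts the direct element children of <body>.
   static int bodyChildCount(String fragment) {
      Document document = Jsoup.parseBodyFragment(fragment);
      return document.body().children().size();
   }

   public static void main(String[] args) {
      // The fragment has no <html>, <head> or <body> tags and no closing </p> tags.
      Document document = Jsoup.parseBodyFragment("<p>One<p>Two");

      // The parsed document still has a head and a body element.
      System.out.println(document.head() != null);

      // Each unclosed <p> becomes its own element inside <body>.
      System.out.println(bodyChildCount("<p>One<p>Two"));

      // Print the whole document, including the generated shell.
      System.out.println(document.html());
   }
}
```

Printing document.html() is a quick way to see what the parser added around the fragment.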

Get the body using document object

Element body = document.body();

Here body is the document's body element. For a document produced by parseBodyFragment, it is the same element as the first result of document.getElementsByTag("body").
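The relationship between document.body() and a tag lookup can be checked with a short sketch like the one below. It assumes jsoup is on the classpath. The class name BodyLookupDemo and the helper sameBody are hypothetical names used only for this illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class BodyLookupDemo {
   // Hypothetical helper: true when body() and the tag lookup return the same element.
   static boolean sameBody(String fragment) {
      Document document = Jsoup.parseBodyFragment(fragment);
      Element body = document.body();
      // getElementsByTag returns a list (Elements), so take its first entry.
      Elements bodies = document.getElementsByTag("body");
      return bodies.size() == 1 && bodies.first() == body;
   }

   public static void main(String[] args) {
      Document document = Jsoup.parseBodyFragment("<div>Sample Content</div>");
      // body() returns a single Element whose tag name is "body".
      System.out.println(document.body().tagName());
      System.out.println(sameBody("<div>Sample Content</div>"));
   }
}
```

Note the difference in return types: body() gives a single Element, while getElementsByTag gives an Elements list that has to be indexed.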

Read tag values

Elements paragraphs = body.getElementsByTag("p");

for (Element paragraph : paragraphs) {
   System.out.println(paragraph.text());
}
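The loop above prints the text of each paragraph. As a sketch of what text() returns when a paragraph contains nested tags, the following example collects the paragraph texts into a list. It assumes jsoup is on the classpath. The class name ParagraphTextDemo and the helper paragraphTexts are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class ParagraphTextDemo {
   // Hypothetical helper: returns the text of every <p> in the fragment.
   static List<String> paragraphTexts(String fragment) {
      Element body = Jsoup.parseBodyFragment(fragment).body();
      List<String> texts = new ArrayList<>();
      for (Element paragraph : body.getElementsByTag("p")) {
         // text() combines the element's own text with that of its children.
         texts.add(paragraph.text());
      }
      return texts;
   }

   public static void main(String[] args) {
      String html = "<p>Sample <b>Content</b></p><p>More</p>";
      // Nested tags such as <b> are dropped; only their text is kept.
      System.out.println(paragraphTexts(html)); // [Sample Content, More]
   }
}
```

If the markup inside a paragraph is needed rather than its plain text, Element.html() returns the inner HTML instead.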

Example - Parsing an HTML Fragment String to read paragraphs

JsoupTester.java

package com.tutorialspoint;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupTester {
   public static void main(String[] args) {
   
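      // HTML fragment without html, head or body tags
      String html = "<div><p>Sample Content</p></div>";
      // Parse the fragment; jsoup wraps it in a full document
      Document document = Jsoup.parseBodyFragment(html);
      Element body = document.body();
      // Find every <p> element inside the body and print its text
      Elements paragraphs = body.getElementsByTag("p");
      for (Element paragraph : paragraphs) {
         System.out.println(paragraph.text());
      }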
   }
}

Verify the result

Compile and run the JsoupTester to verify the result −

Sample Content

Example - Parsing an HTML Fragment String to read div tags

JsoupTester.java

package com.tutorialspoint;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupTester {
   public static void main(String[] args) {
   
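      // HTML fragment containing a single div
      String html = "<div>Sample Content</div>";
      // Parse the fragment; jsoup wraps it in a full document
      Document document = Jsoup.parseBodyFragment(html);
      Element body = document.body();
      // Find every <div> element inside the body and print its text
      Elements divs = body.getElementsByTag("div");
      for (Element div : divs) {
         System.out.println(div.text());
      }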
   }
}

Verify the result

Compile and run the JsoupTester to verify the result −

Sample Content