jsoup - Sanitizing HTML
Overview
The Jsoup.clean() method sanitizes an HTML string using a Safelist configuration: only the tags and attributes that the Safelist allows are kept. This helps prevent cross-site scripting (XSS) attacks.
Syntax
String safeHtml = Jsoup.clean(html, Safelist.basic());
Where
Jsoup − main class to parse the given HTML String.
html − Initial HTML String.
safeHtml − Cleaned HTML.
Safelist − Object that defines which tags and attributes are allowed; Safelist.basic() returns one of the default configurations.
clean() − Cleans the HTML, keeping only what the Safelist allows.
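To make the role of the Safelist argument more concrete, here is a minimal sketch that runs the same input through three of jsoup's built-in presets: Safelist.none(), Safelist.basic() and Safelist.relaxed(). The class name SafelistPresets, its helper methods and the sample HTML are assumptions made for illustration only; they are not part of the tutorial's JsoupTester example.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

// Hypothetical demo class, not part of the tutorial's JsoupTester example.
class SafelistPresets {
    // Strip every tag and keep only the text.
    static String textOnly(String html) {
        return Jsoup.clean(html, Safelist.none());
    }

    // Keep simple text formatting and links.
    static String basic(String html) {
        return Jsoup.clean(html, Safelist.basic());
    }

    // Keep a wider set of structural tags, such as headings and tables.
    static String relaxed(String html) {
        return Jsoup.clean(html, Safelist.relaxed());
    }

    public static void main(String[] args) {
        // Sample input (an assumption): a heading, formatted text and a script.
        String html = "<h1>Title</h1><p>Some <b>bold</b> text"
            + "<script>alert('x')</script></p>";

        System.out.println("none():    " + textOnly(html));
        System.out.println("basic():   " + basic(html));
        System.out.println("relaxed(): " + relaxed(html));
    }
}
```

In this sketch the script element is dropped by all three presets, the h1 heading survives only with relaxed(), and none() leaves plain text. Which preset fits depends on how much markup the application needs to accept from untrusted input.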
Example - Sanitize HTML Content
JsoupTester.java
package com.tutorialspoint;

import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

public class JsoupTester {
   public static void main(String[] args) {
      // Untrusted input: the onclick handler is a potential XSS vector
      String html = "<p><a href='http://example.com/'"
         + " onclick='checkData()'>Link</a></p>";
      System.out.println("Initial HTML: " + html);

      // Keep only the tags and attributes allowed by the basic Safelist
      String safeHtml = Jsoup.clean(html, Safelist.basic());
      System.out.println("Cleaned HTML: " + safeHtml);
   }
}
Verify the result
Compile and run the JsoupTester to verify the result −
Initial HTML: <p><a href='http://example.com/' onclick='checkData()'>Link</a></p>
Cleaned HTML: <p><a href="http://example.com/" rel="nofollow">Link</a></p>
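In the output above, the onclick attribute is removed because Safelist.basic() does not allow it, and rel="nofollow" appears because the basic preset enforces that attribute on links. When a preset is too strict, it can be extended. The following is a minimal sketch, assuming we want to additionally allow h2 headings and a title attribute on links; the class name CustomSafelist and the choice of extra tag and attribute are hypothetical and only serve as an illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

// Hypothetical demo class; the extra tag and attribute are only an illustration.
class CustomSafelist {
    // Start from the basic preset, then additionally allow <h2> headings
    // and a "title" attribute on <a> elements.
    static String clean(String html) {
        Safelist safelist = Safelist.basic()
            .addTags("h2")
            .addAttributes("a", "title");
        return Jsoup.clean(html, safelist);
    }

    public static void main(String[] args) {
        String html = "<h2>News</h2><p><a href='http://example.com/'"
            + " title='Home' onclick='checkData()'>Link</a></p>";

        System.out.println("Basic:  " + Jsoup.clean(html, Safelist.basic()));
        System.out.println("Custom: " + clean(html));
    }
}
```

With the custom Safelist, the h2 element and the title attribute are kept, while onclick is still removed. Every tag or attribute added this way widens what untrusted input may contain, so it is safer to add only what is actually needed.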