How to remove the HTML tags from a given string in Java?


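String is a final class in Java, and its objects are immutable: once a String is created its contents cannot be changed, although a variable that refers to it can be reassigned to a different String. The HTML tags can be removed from a given string by calling the replaceAll() method of the String class with a regular expression that matches a tag and an empty string as the replacement. Because of immutability, replaceAll() does not modify the original string; it returns a new string containing only the plain text. This approach suits simple markup: it does not decode entities such as &amp;amp;, and a > character inside an attribute value can end a match too early.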
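To illustrate the immutability point, here is a minimal sketch showing that replaceAll() leaves the original string untouched and hands back a new one. The class name ReplaceAllImmutabilityDemo and the helper stripTags are hypothetical names chosen for this illustration.

```java
class ReplaceAllImmutabilityDemo {
    // Hypothetical helper: returns a new string with every <...> span removed.
    static String stripTags(String input) {
        return input.replaceAll("<.*?>", "");
    }

    public static void main(String[] args) {
        String original = "<i>Hello</i>";
        String stripped = stripTags(original);

        // The original object is unchanged; only the new string lacks the tags.
        System.out.println(original); // prints <i>Hello</i>
        System.out.println(stripped); // prints Hello
    }
}
```

To keep only the cleaned text, assign the result back to the same variable, for example str = str.replaceAll(...).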

Syntax

public String replaceAll(String regex, String replacement)

Example

public class RemoveHTMLTagsTest {
   public static void main(String[] args) {
      String str = "<p><b>Welcome to Tutorials Point</b></p>";
      System.out.println("Before removing HTML Tags: " + str);
      // Match "<", then any characters other than ">", then ">".
      // Neither "<" nor ">" is a regex metacharacter, so no escaping is needed.
      str = str.replaceAll("<[^>]*>", "");
      System.out.println("After removing HTML Tags: " + str);
   }
}

Output

Before removing HTML Tags: <p><b>Welcome to Tutorials Point</b></p>
After removing HTML Tags: Welcome to Tutorials Point
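
When the same cleanup runs on many strings, it may help to compile the pattern once and reuse it. The following is a sketch under that assumption, not a full HTML parser; the class name HtmlTagStripper and the method stripTags are hypothetical. It uses the pattern <[^>]*>, which, unlike <.*?>, also matches a tag that is split across lines, because a negated character class matches line terminators.

```java
import java.util.regex.Pattern;

class HtmlTagStripper {
    // Compiled once and reused; [^>]* also matches line breaks inside a tag.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // Hypothetical helper: returns the input with tag-like spans removed.
    // A null input is treated as an empty string (an assumption of this sketch).
    static String stripTags(String html) {
        if (html == null) {
            return "";
        }
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // A tag whose attributes continue on the next line is still removed.
        System.out.println(stripTags("<p\nclass=\"intro\">Hi</p>"));        // prints Hi
        // Entities are not decoded; they are left exactly as written.
        System.out.println(stripTags("<a href=\"#\">Tom</a> &amp; Jerry")); // prints Tom &amp; Jerry
    }
}
```

For untrusted or malformed HTML, a regular expression is only a rough filter, and a dedicated HTML parser is usually the safer choice.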
raja
Published on 07-Aug-2019 17:35:40