How to check if a string is HTML or not using JavaScript?


Sometimes, developers need to manage HTML from JavaScript. For example, they may need to append HTML nodes to a particular HTML element that they access from JavaScript.

So, before we append an HTML string to any HTML element using JavaScript, we need to evaluate the string we are appending and check if it is valid.

If we append an HTML string that has an opening tag but doesn’t contain the closing tag, it can produce unexpected results on the webpage. So, we will learn different approaches to validate the HTML string using JavaScript.

Use the regular expression to validate the HTML string

Programmers can use a regular expression to create a search pattern for a string. By following the regular expression rules, we can write a pattern that matches a string containing an opening tag followed by its matching closing tag.

After that, we can use the test() method of the regular expression, which returns true if the string passed as a parameter matches the regular expression and false otherwise.

Syntax

Users can follow the syntax below to match the regular expression with the HTML string.

let regexForHTML = /<([A-Za-z][A-Za-z0-9]*)\b[^>]*>(.*?)<\/\1>/;
let isValid = regexForHTML.test(string);

In the above syntax, we have passed the string as a test() method parameter, which needs to be matched with the regexForHTML regular expression.

Regex Explanation

Here, we explain the regular expression that we used to match the HTML string.

The regular expression is divided into three parts.

  • <([A-Za-z][A-Za-z0-9]*)\b[^>]*> − This is the first part of the regular expression, which matches the opening tag of the HTML string. The opening tag should contain ‘<’, then a tag name that starts with a letter and continues with letters or digits, then optional attributes, and ‘>’ at the end. The parentheses capture the tag name as the first group.

  • (.*?) − This is the second part of the regular expression, which matches the content after the opening tag. It matches zero or more characters (as few as possible), so the content can also be empty. Note that ‘.’ does not match line breaks unless the s flag is used.

  • <\/\1> − This is the third part of the regular expression, which matches the closing tag. The string should contain ‘</’, then the same tag name that the first group captured, and ‘>’ at the end.
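To make the three parts more concrete, the sketch below uses exec() instead of test() to show what each capturing group holds. The describeMatch() helper is a hypothetical name introduced only for this illustration; it is not part of the original example. The last line also shows one limitation worth keeping in mind: the pattern only looks for one matching pair of tags somewhere in the string, so a string with an extra unclosed tag still passes.

```javascript
// Same pattern as in the syntax section above.
const regexForHTML = /<([A-Za-z][A-Za-z0-9]*)\b[^>]*>(.*?)<\/\1>/;

// Hypothetical helper (an assumption for this sketch):
// returns the captured pieces, or null when the pattern does not match.
function describeMatch(string) {
  const match = regexForHTML.exec(string);
  if (match === null) {
    return null;
  }
  return {
    tagName: match[1], // first group: the tag name of the opening tag
    content: match[2], // second group: the text between the two tags
  };
}

// tagName is "b", content is " Hello users! "
console.log(describeMatch("<b> Hello users! </b>"));

// No closing tag, so there is no match and the result is null.
console.log(describeMatch("<Hi there!>"));

// Empty content still matches, because (.*?) accepts zero characters.
console.log(describeMatch("<p></p>"));

// Limitation: the outer <div> is never closed, yet this prints true,
// because the pattern finds one matching pair inside the string.
console.log(regexForHTML.test("<div><div>a</div>"));
```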

Example

In the example below, we have created two different strings. The string1 is a valid HTML string, and string2 is an invalid one.

We have created the validateHTMLString() function, which uses the test() method to match a string with a regular expression.

<html>
<body>
   <h3>Using the <i>regular expression</i> to validate the HTML string.</h3>
   <div id = "output"> </div>
   <script>
      let Output = document.getElementById("output");
      
      // Creating the regular expression
      let regexForHTML = /<([A-Za-z][A-Za-z0-9]*)\b[^>]*>(.*?)<\/\1>/;
      let string1 = "<b> Hello users! </b>";
      let string2 = "<Hi there!>";
      function validateHTMLString(string) {
         
         // check if the regular expression matches the string
         let isValid = regexForHTML.test(string);
         if (isValid) {
            Output.innerHTML += "The " + string + " is a valid HTML string <br/>";
         } else {
            Output.innerHTML += "The " + string + " is not a valid HTML string <br/>";
         }
      }
      validateHTMLString(string1);
      validateHTMLString(string2);
   </script>
</body>
</html>

Use the nodeType property of the HTML element

We can create a dummy HTML element and set the string as the inner HTML of the element using the innerHTML property. After that, we can use the nodeType property of every child node to check whether it is an HTML element.

For any HTML element, the value of the nodeType property is equal to 1. Other nodes have different values; for example, a text node has a nodeType of 3.

Syntax

Users can follow the syntax below to validate the HTML string using the nodeType property of the HTML element.

var element = document.createElement("p");
element.innerHTML = string;
var childNodes = element.childNodes;
for (var i = 0; i < childNodes.length; i++) {
   if (childNodes[i].nodeType != 1) {
      
      // string is not valid
      return;
   }
   if (childNodes[i].nodeType == 1 && i == childNodes.length - 1) {
      
      // string is valid
      return;
   } 
}
// string is not valid 

In the above syntax, we are checking the nodeType of every child node to verify that the string contains only HTML element nodes.

Steps

Users can follow the below steps to implement the above syntax.

Step 1 − Create a dummy HTML element. It can be div, p, or any other element to store a string as an HTML.

Step 2 − Use the innerHTML property of the dummy element, and store the string as an HTML into that.

Step 3 − Get all the child nodes of the dummy element using the childNodes property.

Step 4 − Use the for loop to iterate through every child node of the dummy element.

Step 5 − In the for loop, check the node type of every child node. If it is not equal to 1, the string is not a valid HTML string, so return from the function to terminate it.

Step 6 − If you reach the last child node while iterating through all child nodes, and the last child node is also an element, the HTML string is valid, so return from the function to terminate it.
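The steps above can also be sketched as a function that returns a boolean instead of writing to the page. This is a minimal sketch, not a definitive implementation: the names ELEMENT_NODE, containsOnlyElements(), and isHTMLString() are assumptions introduced for this illustration. The node check is kept separate from the DOM code so that it can be tried with plain objects, while isHTMLString() itself needs a browser, because it relies on document.

```javascript
// nodeType value of element nodes, as described above.
const ELEMENT_NODE = 1;

// Steps 4 to 6 (hypothetical helper): true only when the list is not
// empty and every node in it is an element node.
function containsOnlyElements(childNodes) {
  const nodes = Array.from(childNodes);
  return (
    nodes.length > 0 &&
    nodes.every((node) => node.nodeType === ELEMENT_NODE)
  );
}

// Steps 1 to 3 (hypothetical helper): needs a browser DOM.
function isHTMLString(string) {
  const element = document.createElement("div"); // dummy element
  element.innerHTML = string;                    // let the browser parse it
  return containsOnlyElements(element.childNodes);
}

// The node check can be tried without a browser by using stand-in objects.
console.log(containsOnlyElements([{ nodeType: 1 }, { nodeType: 1 }])); // true
console.log(containsOnlyElements([{ nodeType: 1 }, { nodeType: 3 }])); // false
console.log(containsOnlyElements([]));                                 // false
```

As in the loop-based version, any text between or around the tags, including whitespace, becomes a text node, so such a string is reported as not valid.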

Example

In the example below, we have created the validateHTMLString() function, which implements the above steps to validate an HTML string.

<html>
<body>
   <h3>Using the <i> nodeType property </i> to validate the HTML string.</h3>
   <div id = "output"> </div>
   <script>
      let output = document.getElementById("output");
      let string1 = "<b> This is a valid HTML! </b>";
      let string2 = "<Hi there!";
      function validateHTMLString(string) {
         var element = document.createElement("p");
         element.innerHTML = string;
         var childNodes = element.childNodes;
         for (var i = 0; i < childNodes.length; i++) {
            if (childNodes[i].nodeType != 1) {
               output.innerHTML += "The string is not valid HTML string! <br/>";
               return;
            }
            if (childNodes[i].nodeType == 1 && i == childNodes.length - 1) {
               output.innerHTML += "The " + string + " is a valid HTML string! <br/>";
               return;
            }
         }
         output.innerHTML += "The string is not valid HTML string! <br/>";
      }
      validateHTMLString(string1);
      validateHTMLString(string2);
   </script>
</body>
</html>

Users learned two different approaches to check whether an HTML string is valid. The regular expression approach is the most concise, as it allows us to validate an HTML string by writing a single line of code.

Updated on: 10-Mar-2023
