XHTML vs HTML
Because XHTML is an XML application, certain practices that were perfectly legal in SGML-based HTML 4 must be changed. You have already seen XHTML syntax in the previous chapter, so the differences between XHTML and HTML should be easy to spot. The following sections compare XHTML and HTML.
XHTML documents must be well-formed
Well-formedness is a concept introduced by XML. Essentially, it means that every element must have a closing tag (or be written in the empty-element form) and that elements must be nested properly.
CORRECT : nested elements
<p>Here is an emphasized <em>paragraph</em>.</p>
INCORRECT : overlapping elements
<p>Here is an emphasized <em>paragraph.</p></em>
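One way to see the difference is to hand both fragments to an XML parser. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed is made up for illustration and is not part of XHTML or of any library.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if the XML parser accepts the markup fragment."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

nested = "<p>Here is an emphasized <em>paragraph</em>.</p>"
overlapping = "<p>Here is an emphasized <em>paragraph.</p></em>"

print(is_well_formed(nested))       # True
print(is_well_formed(overlapping))  # False
```

An HTML parser would typically recover from the overlapping tags, whereas an XML parser stops at the first mismatched end tag.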
Elements and attributes must be in lower case
XHTML documents must use lower case for all HTML element and attribute names. This difference is necessary because an XHTML document is assumed to be an XML document, and XML is case-sensitive. For example, <li> and <LI> are different tags.
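The case sensitivity can be observed with a generic XML parser. This is a minimal sketch using Python's xml.etree.ElementTree; it only shows that XML keeps <li> and <LI> apart, and does not validate against the XHTML DTD. The helper name parses is hypothetical.

```python
import xml.etree.ElementTree as ET

lower = ET.fromstring("<li>item</li>")
upper = ET.fromstring("<LI>item</LI>")

# The parser preserves case, so the two tag names are not equal.
print(lower.tag, upper.tag)    # li LI
print(lower.tag == upper.tag)  # False

def parses(markup: str) -> bool:
    """Return True if the XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# A start tag and an end tag that differ only in case do not match.
print(parses("<LI>item</li>"))  # False
```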
End tags are required for all elements
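In HTML, certain elements are permitted to omit the end tag, but XML does not allow end tags to be omitted. Empty elements such as br and hr must either have an end tag or be closed with /> in the start tag.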
CORRECT : terminated elements
<p>Here is a paragraph.</p><p>here is another paragraph.</p> <br/><hr/>
INCORRECT : unterminated elements
<p>Here is a paragraph.<p>here is another paragraph. <br><hr>
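The following sketch runs both fragments through Python's xml.etree.ElementTree. A <body> wrapper is added as an assumption, because an XML parser needs a single root element; the helper name child_tags is hypothetical.

```python
import xml.etree.ElementTree as ET

# The <body> wrapper is an illustrative addition: XML requires one root element.
terminated = (
    "<body><p>Here is a paragraph.</p><p>Here is another paragraph.</p>"
    "<br/><hr/></body>"
)
unterminated = (
    "<body><p>Here is a paragraph.<p>Here is another paragraph."
    "<br><hr></body>"
)

def child_tags(markup: str):
    """Return the tag names of the root's children, or None if parsing fails."""
    try:
        return [child.tag for child in ET.fromstring(markup)]
    except ET.ParseError:
        return None

print(child_tags(terminated))    # ['p', 'p', 'br', 'hr']
print(child_tags(unterminated))  # None
```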
Attribute values must always be quoted
All attribute values, including numeric values, must be quoted.
CORRECT : quoted attribute values
<td rowspan="3">
INCORRECT : unquoted attribute values
<td rowspan=3>
Attributes cannot be minimized
XML does not support attribute minimization. Attribute-value pairs must be written in full. Attribute names such as compact and checked cannot occur in elements without their value being specified.
CORRECT : non minimized attributes
<dl compact="compact">
INCORRECT : minimized attributes
<dl compact>
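Both attribute rules, quoting and no minimization, can be checked with an XML parser. The sketch below uses Python's xml.etree.ElementTree; the markup strings are illustrative and the helper name attributes_or_none is hypothetical.

```python
import xml.etree.ElementTree as ET

def attributes_or_none(markup: str):
    """Return the root element's attributes as a dict, or None if parsing fails."""
    try:
        return dict(ET.fromstring(markup).attrib)
    except ET.ParseError:
        return None

# Quoted value: accepted. Unquoted value: rejected.
print(attributes_or_none('<td rowspan="3">cell</td>'))  # {'rowspan': '3'}
print(attributes_or_none('<td rowspan=3>cell</td>'))    # None

# Full attribute-value pair: accepted. Minimized attribute: rejected.
print(attributes_or_none('<dl compact="compact"></dl>'))  # {'compact': 'compact'}
print(attributes_or_none('<dl compact></dl>'))            # None
```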
Whitespace handling in attribute values
When a browser processes attributes, it does the following:
Strips leading and trailing whitespace.
Maps sequences of one or more whitespace characters (including line breaks) to a single inter-word space.
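The two steps above can be sketched as a small function. This is only an illustration of the described behaviour, not the code any browser uses; the function name and the choice of space, tab, carriage return, and line feed as the whitespace set are assumptions.

```python
import re

def normalize_attribute_value(value: str) -> str:
    """Apply the two whitespace steps described above to an attribute value."""
    # Step 1: strip leading and trailing whitespace.
    stripped = value.strip(" \t\r\n")
    # Step 2: map each run of whitespace (including line breaks) to one space.
    return re.sub(r"[ \t\r\n]+", " ", stripped)

print(normalize_attribute_value("  nav \n   main\tmenu  "))  # nav main menu
```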
Script and style elements
In XHTML, the content of the script and style elements should not contain the < and & characters directly; if they appear, they are treated as the start of markup. Entities such as &lt; and &amp; are recognized as entity references by the XML processor and are expanded to the < and & characters respectively.
Wrapping the content of the script or style element within a CDATA marked section avoids the expansion of these entities.
An alternative is to use external script and style documents.
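The effect of a CDATA section can be seen by parsing the same script content three ways. This is a sketch using Python's xml.etree.ElementTree; the script body and the helper name script_text are invented for illustration.

```python
import xml.etree.ElementTree as ET

raw = "<script>if (a < b && c) { run(); }</script>"
escaped = "<script>if (a &lt; b &amp;&amp; c) { run(); }</script>"
wrapped = "<script><![CDATA[if (a < b && c) { run(); }]]></script>"

def script_text(markup: str):
    """Return the element's text content, or None if the markup is rejected."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None

print(script_text(raw))      # None (the bare < and & are treated as markup)
print(script_text(escaped))  # if (a < b && c) { run(); }
print(script_text(wrapped))  # if (a < b && c) { run(); }
```

The escaped and CDATA forms give the parser the same text; the CDATA form simply lets the script be written without entity references.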
The elements with id and name attributes
XHTML recommends replacing the name attribute with the id attribute. Note that in XHTML 1.0, the name attribute of the a, applet, form, frame, iframe, img, and map elements is formally deprecated, and it will be removed in a subsequent version of XHTML.
Attributes with pre-defined value sets
HTML and XHTML both have some attributes with pre-defined, limited sets of values, for example the type attribute of the input element. In HTML and XML, these are called enumerated attributes. Under HTML 4, the interpretation of these values was case-insensitive, so a value of TEXT was equivalent to a value of text.
Under XHTML, the interpretation of these values is case-sensitive, so all of these values are defined in lower case.
Entity references as hex values
HTML and XML both permit references to characters by their hexadecimal value. In HTML, these references can be written as either &#xnn; or &#Xnn;, and both are valid. In XHTML documents, you must use the lower-case x only, as in &#xnn;.
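The restriction on the x can be checked with an XML parser. In this sketch, based on Python's xml.etree.ElementTree, the character code 41 (the letter A) and the helper name parsed_text are chosen only for illustration.

```python
import xml.etree.ElementTree as ET

def parsed_text(markup: str):
    """Return the element's text, or None if the parser rejects the markup."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None

# Lower-case x: a valid hexadecimal character reference.
print(parsed_text("<p>&#x41;</p>"))  # A
# Upper-case X: not a valid character reference in XML.
print(parsed_text("<p>&#X41;</p>"))  # None
```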
The <html> Element is a must
All XHTML elements must be nested within the <html> root element. All other elements can have child elements, which must be in pairs and correctly nested within their parent element. The basic document structure is:
<!DOCTYPE html....>
<html>
<head> ... </head>
<body> ... </body>
</html>
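As a rough check of this structure, the sketch below parses a complete minimal document with Python's xml.etree.ElementTree and inspects the root. The XHTML 1.0 Strict DOCTYPE, the xmlns value, and the title and paragraph text are assumptions picked for the example; the parser here does not fetch or validate against the DTD.

```python
import xml.etree.ElementTree as ET

# Assumed example document: XHTML 1.0 Strict DOCTYPE and placeholder content.
document = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello</p></body>
</html>"""

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

root = ET.fromstring(document)
# ElementTree reports namespaced tags as "{namespace}name".
children = [child.tag.replace(XHTML_NS, "") for child in root]

print(root.tag == XHTML_NS + "html")  # True
print(children)                       # ['head', 'body']
```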