XHTML stands for eXtensible Hypertext Markup Language. It is
designed to be very similar to HTML but
structured in a fashion that also adheres to the rules of XML.
Most browsers accept some types of ‘badly formed’ HTML, e.g. HTML in which a
document’s <head> element is not properly closed before its <body> element is
opened, even though such markup fails to adhere to the rules that HTML is
supposed to follow. However, such pages may not work well or consistently on
some devices. A browser incurs a processing overhead when it tries to interpret
badly-formed HTML, which may not be practical for some smaller devices, and
there may be several possible ways of interpreting the same badly-formed
markup. XML is more rigidly structured than HTML (and it is easier to test that
its rules are being adhered to), making it an easier vehicle through which to
introduce disciplines that aim to ensure all markup text is well formed.
The main differences between HTML and XHTML are:
1. A <!DOCTYPE> element (which takes the form <!DOCTYPE ...>) must be present at the start of the document.
2. <html>, <head>, <title> and <body> elements must also be present, and the xmlns attribute of the <html> element must be set to the XHTML namespace (http://www.w3.org/1999/xhtml).
3. All XHTML elements must be properly closed (and properly nested), e.g. using </p> to close a paragraph (<p>) element and not just starting a new one with a new <p>. Note that browsers usually interpret <p> text1 <p> text2 </p> as two consecutive paragraphs even though the first paragraph is never explicitly closed (HTML allows the </p> end tag to be omitted; XHTML does not).
4. A corollary of 3. is that HTML empty elements such as the <br>, <hr> and <img> elements must also be properly closed in XHTML, i.e. they should be written as <br />, <hr /> and <img ... />.
5. XHTML element and attribute names must use lower case, e.g. the XHTML <p> element must be written as <p> text </p> rather than <P> text </P>.
6. XHTML attribute values must be enclosed in quotes. So, HTML such as <p width=100px> is wrong in XHTML and should be replaced by <p width="100px">.
7. Attribute minimisation is forbidden. Attribute minimisation in HTML involves including just the attribute name rather than both the name and its value if its value is the same as its name. For example, HTML of the form <input type="checkbox" name="fruit" value="apple" checked /> should be replaced in XHTML by <input type="checkbox" name="fruit" value="apple" checked="checked" />.
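Several of these rules (proper closing and nesting, quoted attribute values, no attribute minimisation) are ordinary XML well-formedness rules, so any XML parser can test them. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree; the helper name is_well_formed is a hypothetical name chosen for illustration, and the check covers well-formedness only, not full XHTML validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if an XML parser accepts the fragment as well-formed."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Fragments modelled on the examples in the list above.
samples = [
    "<p>text1 <p>text2</p>",                 # first <p> never closed
    "<p>line one<br>line two</p>",           # empty element not closed
    "<p>line one<br />line two</p>",         # empty element closed
    "<p width=100px>text</p>",               # unquoted attribute value
    '<p width="100px">text</p>',             # quoted attribute value
    '<input type="checkbox" name="fruit" value="apple" checked />',
    '<input type="checkbox" name="fruit" value="apple" checked="checked" />',
]

for fragment in samples:
    verdict = "well-formed" if is_well_formed(fragment) else "rejected"
    print(f"{verdict:12} {fragment}")
```

Note that the lower-case rule is not a well-formedness rule: an XML parser accepts <P> text </P> because the start and end tags match, although XHTML only defines the lower-case names. A mismatched pair such as <P> text </p> is rejected, since XML names are case-sensitive.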
In practice, it is usually quite easy (if possibly laborious) to convert HTML to XHTML by:
(a) Adding a suitable XHTML <!DOCTYPE> statement to the first line of the page and adding an xmlns attribute to the html element of the page
(b) Changing all element names and attribute names to lower case
(c) Closing all elements (including empty elements)
(d) Putting all attribute values in quotes (and eliminating any attribute minimisation that is present)
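Steps (b) to (d) are mechanical enough to sketch in code. The following Python example is one possible approach, not a complete converter: it uses the standard library's html.parser.HTMLParser to re-serialise a fragment, and the names XhtmlRewriter, html_to_xhtml_fragment and VOID_ELEMENTS are hypothetical, chosen for illustration. It assumes a short, fixed list of empty elements, does not perform step (a), does not insert end tags that are missing from non-empty elements, and silently drops comments and declarations.

```python
from html import escape
from html.parser import HTMLParser

# Assumed (incomplete) list of HTML empty elements, for illustration only.
VOID_ELEMENTS = {"br", "hr", "img", "input", "meta", "link"}


class XhtmlRewriter(HTMLParser):
    """Re-emit parsed HTML using XHTML-style syntax."""

    def __init__(self):
        # Keep character/entity references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _attrs(self, attrs):
        parts = []
        for name, value in attrs:
            if value is None:          # minimised attribute, e.g. "checked"
                value = name           # expand to checked="checked"
            parts.append(f' {name}="{escape(value, quote=True)}"')
        return "".join(parts)

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lower-cases element and attribute names.
        if tag in VOID_ELEMENTS:
            self.out.append(f"<{tag}{self._attrs(attrs)} />")
        else:
            self.out.append(f"<{tag}{self._attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        self.out.append(f"<{tag}{self._attrs(attrs)} />")

    def handle_endtag(self, tag):
        if tag not in VOID_ELEMENTS:   # empty elements were closed on output
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def html_to_xhtml_fragment(html: str) -> str:
    rewriter = XhtmlRewriter()
    rewriter.feed(html)
    rewriter.close()
    return "".join(rewriter.out)


print(html_to_xhtml_fragment("<P WIDTH=100px>Line one<BR>line two</P>"))
print(html_to_xhtml_fragment('<INPUT TYPE="checkbox" NAME="fruit" VALUE="apple" CHECKED>'))
```

For real pages, a dedicated tool is likely a better choice than a hand-written rewriter, because tolerant parsing of badly-formed HTML involves many more cases than this sketch handles.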
An example of a minimal XHTML page is:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0
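As a hedged illustration of the required parts of such a page, the Python sketch below holds one possible minimal XHTML page in a string and checks it for a doctype, the XHTML namespace on the html element, and the head, title and body elements. The choice of the XHTML 1.0 Strict doctype is an assumption made for this sketch, as are the names MINIMAL_PAGE and missing_required_parts and the page content itself. The doctype test is deliberately simplistic (it only looks at the first characters of the text).

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Assumed example page: XHTML 1.0 Strict, with placeholder title and text.
MINIMAL_PAGE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Minimal page</title>
</head>
<body>
<p>Hello</p>
</body>
</html>
"""


def missing_required_parts(page: str) -> list:
    """List which required parts are absent from a well-formed page."""
    problems = []
    if not page.lstrip().startswith("<!DOCTYPE"):
        problems.append("doctype")
    root = ET.fromstring(page)  # raises ET.ParseError if not well-formed
    if root.tag != f"{{{XHTML_NS}}}html":
        problems.append("html/xmlns")
    for path in ("head", "head/title", "body"):
        # Qualify every step of the path with the XHTML namespace.
        qualified = "/".join(f"{{{XHTML_NS}}}{step}" for step in path.split("/"))
        if root.find(qualified) is None:
            problems.append(path)
    return problems


print(missing_required_parts(MINIMAL_PAGE))
```

The parser does not fetch the DTD named in the doctype, so this checks only structure and namespace, not validity against the DTD.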