Recently, I have been updating some of my HTML generation tools to output valid HTML5, rather than the XHTML 1.0 standard I have been using for the last few years. The main advantage from my perspective is the ability to use the more semantic block elements, such as the nav, section and article elements.
In general this is a fairly straightforward task, as I am generating clean XHTML using XSLT and my template library works pretty well, but I ran into some problems whilst validating the output using the W3C Validator.
The first issue is to sort the DOCTYPE out. The XHTML doctype looks like this:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
This is easy to generate in XSLT using the following output element.
<xsl:output encoding="UTF-8" indent="yes" method="xml" omit-xml-declaration="yes" doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd" />
Unfortunately, this makes the validator check the document against XHTML 1.0, which does not include all the lovely new semantic elements, so my new documents are suddenly invalid!
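We need to generate:
<!DOCTYPE html>
which is really hard to do using XSLT, because xsl:output only emits a DOCTYPE when a doctype-system value is supplied, and it then always writes that identifier into the declaration. I have read a number of articles that suggest you output the declaration as text; however, this is extremely ugly and, as it turns out, incorrect.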
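For reference, the text-output workaround those articles describe usually looks something like the sketch below. It relies on disable-output-escaping, which XSLT 1.0 processors are not required to support, so a processor that ignores it will write the escaped text (&lt;!DOCTYPE html&gt;) into the page instead of a real declaration. The match pattern and the page skeleton are placeholders of mine, shown only to illustrate what is being avoided.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8" indent="yes"
      omit-xml-declaration="yes" />

  <xsl:template match="/">
    <!-- Writes the doctype as raw text; only works if the processor
         honours disable-output-escaping, which is optional -->
    <xsl:text disable-output-escaping="yes">&lt;!DOCTYPE html&gt;&#10;</xsl:text>
    <html>
      <head><title>Placeholder page</title></head>
      <body><p>Placeholder content</p></body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```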
The correct XSLT incantation is:
<xsl:output method="xml" doctype-system="about:legacy-compat" encoding="UTF-8" indent="yes" />
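This produces <!DOCTYPE html SYSTEM "about:legacy-compat"> rather than the short <!DOCTYPE html> form. The about:legacy-compat system identifier is a dummy value, not a real DTD, and the HTML5 specification explicitly permits it for generators such as XSLT processors that cannot emit the short doctype.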
Now the W3C validator will happily validate against the HTML5 specification rather than the XHTML 1.0 specification.
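To show how the output element fits into a whole stylesheet, here is a minimal sketch. The input format (a page element with title and body children), the navigation link and the added omit-xml-declaration attribute are assumptions of mine for illustration, not part of my actual template library.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- Same doctype-system setting as above; omit-xml-declaration is an
       added assumption to keep the doctype on the first line -->
  <xsl:output method="xml" doctype-system="about:legacy-compat"
      encoding="UTF-8" indent="yes" omit-xml-declaration="yes" />

  <!-- Hypothetical input: <page><title>...</title><body>...</body></page> -->
  <xsl:template match="/page">
    <html>
      <head>
        <title><xsl:value-of select="title" /></title>
      </head>
      <body>
        <!-- HTML5 semantic elements that XHTML 1.0 Strict would reject -->
        <nav><a href="/">Home</a></nav>
        <article>
          <section>
            <p><xsl:value-of select="body" /></p>
          </section>
        </article>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Run through an XSLT 1.0 processor (for example xsltproc page.xsl page.xml, with both file names hypothetical), the result should begin with a declaration along the lines of <!DOCTYPE html SYSTEM "about:legacy-compat">, followed by the html element in the XHTML namespace.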