The web is constantly evolving. New and innovative websites are being created every day, pushing the boundaries of HTML in every direction. HTML 4 has been around for nearly a decade now, and publishers seeking new techniques to provide enhanced functionality are being held back by the constraints of the language and browsers.
To give authors more flexibility and interoperability, and enable more interactive and exciting websites and applications, HTML 5 introduces and enhances a wide range of features including form controls, APIs, multimedia, structure, and semantics.
Work on HTML 5, which commenced in 2004, is currently being carried out in a joint effort between the W3C HTML WG and the WHATWG. Many key players are participating in the W3C effort including representatives from the four major browser vendors: Apple, Mozilla, Opera, and Microsoft; and a range of other organisations and individuals with many diverse interests and expertise.
Note that the specification is still a work in progress and quite a long way from completion. As such, it is possible that any feature discussed in this article may change in the future. This article is intended to provide a brief introduction to some of the major features as they are in the current draft.
HTML 5 introduces a whole set of new elements that make it much easier to structure pages. Most HTML 4 pages include a variety of common structures, such as headers, footers, and columns, and today it is fairly common to mark them up using div elements, giving each a descriptive id or class.
The use of div elements is largely because current versions of HTML 4 lack the necessary semantics for describing these parts more specifically. HTML 5 addresses this issue by introducing new elements for representing each of these different sections.
The markup for that document could look like the following:
<body>
  <header>...</header>
  <nav>...</nav>
  <article>
    <section>
      ...
    </section>
  </article>
  <aside>...</aside>
  <footer>...</footer>
</body>
There are several advantages to using these elements. When used in conjunction with the heading elements (h1 to h6), all of these provide a way to mark up nested sections with heading levels, beyond the six levels possible with previous versions of HTML. The specification includes a detailed algorithm for generating an outline that takes the structure of these elements into account and remains backwards compatible with previous versions. This can be used by both authoring tools and browsers to generate tables of contents to assist users with navigating the document.
For example, the following structure is marked up with nested section and h1 elements:
<section>
  <h1>Level 1</h1>
  <section>
    <h1>Level 2</h1>
    <section>
      <h1>Level 3</h1>
    </section>
  </section>
</section>
Note that for better compatibility with current browsers, it is also possible to make use of the other heading elements (h2 to h6) appropriately in place of the h1 elements.
By identifying the purpose of sections in the page using specific sectioning elements, assistive technology can help the user to more easily navigate the page. For example, they can easily skip over the navigation section or quickly jump from one article to the next without the need for authors to provide skip links. Authors also benefit because replacing many of the divs in the document with one of several distinct elements can help make the source code clearer and easier to author.
The header element represents the header of a section. Headers may contain more than just the section's heading; for example, it would be reasonable for the header to include subheadings, version history information, or bylines.
<header> <h1>A Preview of HTML 5</h1> <p>By Lachlan Hunt</p> </header>
<header> <h1>Example Blog</h1> <h2>Insert tag line here.</h2> </header>
The footer element represents the footer for the section it applies to. A footer typically contains information about its section such as who wrote it, links to related documents, copyright data, and the like.
<footer>© 2007 Example Inc.</footer>
The nav element represents a section of navigation links. It is suitable for either site navigation or a table of contents.
<nav>
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/products">Products</a></li>
    <li><a href="/services">Services</a></li>
    <li><a href="/about">About</a></li>
  </ul>
</nav>
The aside element is for content that is tangentially related to the content around it, and is typically useful for marking up sidebars.
<aside>
  <h1>Archives</h1>
  <ul>
    <li><a href="/2007/09/">September 2007</a></li>
    <li><a href="/2007/08/">August 2007</a></li>
    <li><a href="/2007/07/">July 2007</a></li>
  </ul>
</aside>
The section element represents a generic section of a document or application, such as a chapter.
<section> <h1>Chapter 1: The Period</h1> <p>It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, ...</p> </section>
(Excerpt from A Tale of Two Cities)
The article element represents an independent section of a document, page or site. It is suitable for content like news or blog articles, forum posts or individual comments.
<article id="comment-2">
  <header>
    <h4><a href="#comment-2" rel="bookmark">Comment #2</a> by <a href="http://example.com/">Jack O'Niell</a></h4>
    <p><time datetime="2007-08-29T13:58Z">August 29th, 2007 at 13:58</time></p>
  </header>
  <p>That's another great article!</p>
</article>
Video and Audio
In recent years, video and audio on the web have become increasingly viable, and sites like YouTube, Viddler, Revver, MySpace, and dozens of others are making it easy for anyone to publish video and audio. However, since HTML currently lacks the necessary means to successfully embed and control multimedia itself, many sites are relying on Flash to provide that functionality. Although it is possible to embed multimedia using various plug-ins (such as QuickTime, Windows Media, etc.), Flash is currently the only widely deployed plug-in that provides a cross-browser compatible solution with the desired APIs for developers.
As evidenced by the various Flash-based media players, authors are interested in providing their own custom-designed user interfaces, which generally allow users to play, pause, stop, seek, and adjust volume. The plan is to provide this functionality in browsers by adding native support for embedding video and audio and providing DOM APIs for scripts to control the playback.
The new video and audio elements make this really easy. Most of the APIs are shared between the two elements, with the only differences being related to the inherent differences between visual and non-visual media.
Both Opera and WebKit have released builds with partial support for the video element. You may download the experimental build of Opera or a recent nightly build of WebKit to try out these examples. Opera includes support for Ogg Theora and WebKit supports all the formats that are supported by QuickTime, including third party codecs.
The simplest way to embed a video is to use a video element and allow the browser to provide a default user interface. The controls attribute is a boolean attribute that indicates whether the author wants this UI shown by default.
<video src="video.ogv" controls poster="poster.jpg" width="320" height="240"> <a href="video.ogv">Download movie</a> </video>
The poster attribute can be used to specify an image which will be displayed in place of the video before the video has begun playing. Although there are some video formats that support their own poster frame feature, such as MPEG-4, this provides an alternative solution that can work independently of the video format.
It is just as simple to embed audio into a page using the audio element. Most of the attributes are common between the video and audio elements, although for obvious reasons, the audio element lacks the width, height, and poster attributes.
<audio src="music.oga" controls> <a href="music.oga">Download song</a> </audio>
HTML 5 provides the source element for specifying alternative video and audio files which the browser may choose from based on its media type or codec support. The media attribute can be used to specify a media query for selection based on the device limitations, and the type attribute can be used to specify the media type and codecs. Note that when using source elements, the src attribute needs to be omitted from their parent video or audio element, or the alternatives given by the source elements will be ignored.
<video poster="poster.jpg">
  <source src="video.3gp" type="video/3gpp" media="handheld">
  <source src="video.ogv" type="video/ogg; codecs=theora, vorbis">
  <source src="video.mp4" type="video/mp4">
</video>
<audio>
  <source src="music.oga" type="audio/ogg">
  <source src="music.mp3" type="audio/mpeg">
</audio>
For authors who want a little more control over the user interface so that they can make it fit the overall design of the web page, the extensive API provides several methods and events to let scripts control the playback of the media. The simplest to use are the play() and pause() methods, along with setting currentTime to 0 to rewind to the beginning. The following example illustrates the use of these.
<video src="video.ogg" id="video"></video>
<script>
  // Keep a reference to the video element for the button handlers below.
  var video = document.getElementById("video");
</script>
<p>
  <button type="button" onclick="video.play();">Play</button>
  <button type="button" onclick="video.pause();">Pause</button>
  <button type="button" onclick="video.currentTime = 0;">&lt;&lt; Rewind</button>
</p>
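The same approach extends to the seeking and volume controls mentioned earlier. The following is only a sketch, assuming the currentTime, duration, and volume properties behave as they do in current browser implementations; the id "player" and the helper functions skip() and changeVolume() are hypothetical names chosen for this illustration.

```html
<video src="video.ogv" id="player"></video>
<p>
  <button type="button" onclick="skip(-10);">Back 10s</button>
  <button type="button" onclick="skip(10);">Forward 10s</button>
  <button type="button" onclick="changeVolume(-0.1);">Volume down</button>
  <button type="button" onclick="changeVolume(0.1);">Volume up</button>
</p>
<script>
  var player = document.getElementById("player");

  // Hypothetical helper: move the playback position by a number of seconds.
  function skip(seconds) {
    var target = player.currentTime + seconds;
    // duration is NaN until the media metadata has loaded, so only
    // clamp against it once it is known.
    if (!isNaN(player.duration)) {
      target = Math.min(target, player.duration);
    }
    player.currentTime = Math.max(0, target);
  }

  // Hypothetical helper: adjust the volume, which ranges from 0.0 to 1.0.
  function changeVolume(delta) {
    // Clamp the result, since assigning a value outside 0.0-1.0 is an error.
    player.volume = Math.min(1, Math.max(0, player.volume + delta));
  }
</script>
```

Clamping in both helpers keeps the buttons safe to press repeatedly at either end of the range.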
There are many other attributes and APIs available for the video and audio elements that have not been discussed here. For more information, you should consult the current draft specification.
Unlike previous versions of HTML and XHTML, which are defined in terms of their syntax, HTML 5 is being defined in terms of the Document Object Model (DOM)—the tree representation used internally by browsers to represent the document. For example, consider a very simple document consisting of a title, heading and paragraph. The DOM tree could look something like this:
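One way to inspect such a tree directly is to let the browser print it. The page below is a rough sketch rather than anything defined by the specification: the describe() function and the "tree" id are hypothetical names, and whitespace-only text nodes and the script's own source text are skipped to keep the listing short.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Sample</title>
  </head>
  <body>
    <h1>Heading</h1>
    <p>Paragraph text.</p>
    <pre id="tree"></pre>
    <script>
      // Return a text listing of a node and its descendants, one line per
      // node, indented by depth. Element names are reported in upper case
      // in HTML documents.
      function describe(node, depth) {
        var indent = new Array(depth + 1).join("  ");
        var out;
        if (node.nodeType === Node.TEXT_NODE) {
          // Skip whitespace-only text and the source text of scripts.
          if (!/\S/.test(node.nodeValue) ||
              node.parentNode.nodeName === "SCRIPT") {
            return "";
          }
          out = indent + '#text: "' + node.nodeValue + '"\n';
        } else {
          out = indent + node.nodeName + "\n";
        }
        for (var i = 0; i < node.childNodes.length; i++) {
          out += describe(node.childNodes[i], depth + 1);
        }
        return out;
      }

      // Start from the document node so the doctype is listed as well.
      document.getElementById("tree").textContent = describe(document, 0);
    </script>
  </body>
</html>
```

Because the script runs while the page is still being parsed, the listing covers only the nodes created up to and including the script element itself.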
The advantage of defining HTML 5 in terms of the DOM is that the language itself can be defined independently of the syntax. There are primarily two syntaxes that can be used to represent HTML documents: the HTML serialisation (known as HTML 5) and the XML serialisation (known as XHTML 5).
The HTML serialisation refers to the syntax that is inspired by the SGML syntax from earlier versions of HTML, but defined to be more compatible with the way browsers actually handle HTML in practice.
<!DOCTYPE html>
<html>
  <head>
    <title>An HTML Document</title>
  </head>
  <body>
    <h1>Example</h1>
    <p>This is an example HTML document.
  </body>
</html>
Note that like previous versions of HTML, some tags are optional and are automatically implied.
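To illustrate how far this can be taken, the start and end tags of the html, head, and body elements are themselves optional, as is the end tag of a p element in this position. The following sketch should therefore produce the same elements as the example above, with the parser implying the missing tags:

```html
<!DOCTYPE html>
<title>An HTML Document</title>
<h1>Example</h1>
<p>This is an example HTML document.
```

Whether to rely on implied tags is a matter of style; many authors prefer to write them out for readability.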
The XML serialisation refers to the syntax using XML 1.0 and namespaces, just like XHTML 1.0.
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>An HTML Document</title>
  </head>
  <body>
    <h1>Example</h1>
    <p>This is an example HTML document.</p>
  </body>
</html>
Excluding differences in whitespace and the presence of the xmlns attribute, those two examples are equivalent.
Browsers use the MIME type to distinguish between the two. Any document served as text/html must conform to the requirements for the HTML serialisation and any document served with an XML MIME type such as application/xhtml+xml must conform to the requirements for the XML serialisation.
Authors should make an informed choice about which serialisation to use, which may be dependent on a number of different factors. Authors should not be unconditionally forced to use one or the other; each one is optimised for different situations.
Benefits of Using HTML
- Backwards compatible with existing browsers
- Authors are already familiar with the syntax
- The lenient and forgiving syntax means there will be no user-hostile “Yellow Screen of Death” if a mistake accidentally slips through
- Convenient shorthand syntax, e.g. authors can omit some tags and attribute values
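As a small illustration of the shorthand for attribute values, the sketch below shows the same checkbox written both ways. The element and attributes are chosen only for the example.

```html
<!-- HTML serialisation: boolean attributes may be written without a value,
     and quotes may be omitted around simple attribute values. -->
<input type=checkbox checked disabled>

<!-- XML serialisation: every attribute needs a quoted value, and the empty
     element must be explicitly closed. -->
<input type="checkbox" checked="checked" disabled="disabled" />
```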
Benefits of Using XHTML
- Strict XML syntax encourages authors to write well-formed markup, which some authors may find easier to maintain
- Integrates directly with other XML vocabularies, such as SVG and MathML
- Allows the use of XML Processing, which some authors use as part of their editing and/or publishing processes
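As a sketch of what integration with other XML vocabularies can look like, the following XHTML document embeds a small SVG image directly in the page by switching namespaces. The drawing itself is made up for the example, and the document would need to be served with an XML MIME type such as application/xhtml+xml.

```html
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>XHTML with inline SVG</title>
  </head>
  <body>
    <h1>Example</h1>
    <!-- The xmlns attribute places svg and its children in the SVG namespace. -->
    <svg xmlns="http://www.w3.org/2000/svg" width="100" height="100" viewBox="0 0 100 100">
      <circle cx="50" cy="50" r="40" fill="teal" />
    </svg>
  </body>
</html>
```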