







In this issue's Web Technologies column we ask
Brian Kelly to tell us more about XHTML.
The XHTML Interview
- What Is XHTML?
- The answer to the Webmaster's nightmares. One of the technical highlights
of the recent WWW 9 conference.
-
- Can you be slightly more explicit, please! What does it stand for?
How is it different from HTML? Who developed it? Why are you so excited about it?
- XHTML stands for "Extensible HyperText Markup Language". It was developed by
the World Wide Web Consortium (W3C) and is now a W3C Recommendation
[1].
- XHTML is a reformulation of HTML 4 in XML 1.0. This means that the benefits
provided by XML will be available to XHTML.
-
- But how does HTML differ from XHTML?
- XHTML has a small number of differences. The most noticeable are the
requirement for element names to be lowercase (e.g. <p> and not
<P>) and for elements to be closed (e.g. paragraphs must end
with a </p>).
-
- That's a pain. I prefer to type my tags in uppercase, and I never bother
closing my paragraphs. Why do I have to do this?
- For reasons of internationalisation, XML element names are case sensitive. A choice
had to be made, and lowercase won on the day.
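
The effect of case sensitivity can be seen with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the helper name `parses` is invented for this illustration and is not part of any XHTML tool.

```python
import xml.etree.ElementTree as ET


def parses(markup: str) -> bool:
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Matching lowercase start and end tags are fine.
print(parses("<p>Hello</p>"))   # True

# To an XML parser <P> and </p> are two different names,
# so the end tag does not match the start tag.
print(parses("<P>Hello</p>"))   # False
```

An HTML browser would treat both fragments identically; an XML parser rejects the second.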
-
- What about the need for end tags?
- Remember that XHTML is an XML application.
-
- So?
- Have a look at the markup fragments in the following table.
| Markup | Comments |
|---|---|
| <part-number>273</part-number> wheel | Invalid XML |
| <part-number>273</part-number> <part-type>wheel</part-type> | Well-formed XML |
| <h1>Introduction</h1> Welcome to this document on XHTML. | Valid HTML but invalid XHTML |
| <h1>Introduction</h1> <p>Welcome to this document on XHTML.</p> | Valid HTML and well-formed XHTML |
- Since XML documents can use arbitrary elements an XML application cannot
know how the document is structured. Web browsers, however, do
know something about the document structure. For example, text
that occurs immediately after a heading is normally assumed to be part of a paragraph,
and a <p> element is assumed. XML applications can't make such
assumptions, so more rigorous markup is required.
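
As a rough illustration of the point about assumptions, the sketch below feeds similar fragments to Python's standard XML parser. The `<body>` wrapper and the helper names `child_tags` and `is_well_formed` are assumptions made for this example. Note that the parser checks well-formedness only; it does not validate against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def child_tags(markup: str) -> list:
    """Return the element names found directly under the root element."""
    return [child.tag for child in ET.fromstring(markup)]


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Text after a heading: the XML parser does not invent a <p> element.
loose = "<body><h1>Introduction</h1>Welcome to this document on XHTML.</body>"
print(child_tags(loose))     # ['h1']

# Explicit paragraph markup: the structure is now visible to the parser.
strict = "<body><h1>Introduction</h1><p>Welcome to this document on XHTML.</p></body>"
print(child_tags(strict))    # ['h1', 'p']

# A paragraph that is opened but never closed is rejected outright.
print(is_well_formed("<body><p>Welcome to this document on XHTML.</body>"))  # False
```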
-
- OK. But what about elements that don't have a close tag, such as
<IMG> (sorry I mean <img>!) and <hr>?
- There are two solutions. You could use a close tag
(e.g. <img src="logo.gif" ...></img>). However the best solution is
to simply include a forward slash in the element:
<img src="logo.gif" ... />
-
- Will this work?
- As long as you include a space before the slash it will cause no problems
in most Web browsers - although there have been reports of problems with some
embedded HTML viewers such as Java's Swing HTML editor.
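
Software that writes markup through an XML library usually gets this form automatically. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper names `empty_element` and `equivalent` are invented for the example.

```python
import xml.etree.ElementTree as ET


def empty_element(tag: str, attributes: dict) -> str:
    """Serialise an element with no content; ElementTree uses the short form."""
    return ET.tostring(ET.Element(tag, attributes), encoding="unicode")


def equivalent(markup_a: str, markup_b: str) -> bool:
    """Compare two single-element fragments by name, attributes and content."""
    a, b = ET.fromstring(markup_a), ET.fromstring(markup_b)
    return (a.tag, a.attrib, a.text, len(a)) == (b.tag, b.attrib, b.text, len(b))


# The serialiser writes the slash with a space before it.
print(empty_element("img", {"alt": "University logo", "src": "logo.gif"}))
# <img alt="University logo" src="logo.gif" />

# To an XML parser the two solutions mean exactly the same thing.
print(equivalent('<img src="logo.gif"></img>', '<img src="logo.gif" />'))  # True
```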
-
- Are there any other differences between HTML and XHTML?
- Attribute values must be in quotes
(e.g. <img src="logo.gif" alt="University logo" height="50" width="75" />).
-
- Sorry for pestering you, but why?
- Remember that XML applications don't know what the tags mean.
Do you know what <jnh tsd=logo.gif bmu=University logo ifjhiu=50 xjeui=75>
means? To save confusion and ambiguity all attribute values must be quoted.
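
If you generate markup from a program, the quoting can be left to a library routine. This is a small sketch using `quoteattr` from Python's standard `xml.sax.saxutils` module; the function name `img_tag` is an assumption made for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def img_tag(src: str, alt: str, height: int, width: int) -> str:
    """Build an <img /> element with every attribute value quoted and escaped."""
    return (
        f"<img src={quoteattr(src)} alt={quoteattr(alt)} "
        f"height={quoteattr(str(height))} width={quoteattr(str(width))} />"
    )


tag = img_tag("logo.gif", "University logo", 50, 75)
print(tag)
# <img src="logo.gif" alt="University logo" height="50" width="75" />

# The quoted version parses, and the two-word value survives intact.
print(ET.fromstring(tag).get("alt"))   # University logo

# The unquoted version is rejected by an XML parser.
try:
    ET.fromstring("<img src=logo.gif alt=University logo />")
except ET.ParseError:
    print("not well-formed")
```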
-
- Any other differences?
- Some, but I've covered the main ones. I should also point out that the XHTML
document should begin with an XML declaration, followed by a DOCTYPE declaration
which references the XHTML DTD. It will normally look something like this:
-
- <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
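
For pages generated by software, the opening lines can be produced by a small template. The sketch below is one possible approach, not a definitive one: it uses the namespace given in the XHTML 1.0 Recommendation (`http://www.w3.org/1999/xhtml`), and the function name `xhtml_page` is invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XHTML_NS = "http://www.w3.org/1999/xhtml"


def xhtml_page(title: str, body_markup: str) -> str:
    """Wrap already well-formed body markup in a minimal XHTML 1.0 document."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n'
        '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
        f'<html xmlns="{XHTML_NS}">\n'
        f"<head><title>{escape(title)}</title></head>\n"
        f"<body>{body_markup}</body>\n"
        "</html>\n"
    )


page = xhtml_page("Example", "<p>Hello</p>")

# Parsing the result confirms that it is well-formed and that the
# root element is in the XHTML namespace.
root = ET.fromstring(page.encode("utf-8"))
print(root.tag)   # {http://www.w3.org/1999/xhtml}html
```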
-
- You still haven't explained what the benefits of XHTML are.
- As XHTML is an XML application, you will benefit from developments in the XML world.
For example XML tools such as editors, converters, browsers, etc. can be used
with XHTML resources. In addition there are developments to the XML family of
protocols and formats which will provide additional functionality for XHTML.
-
- Go on.
- XLink [2] [3],
for example, will provide richer hyperlinking functionality and
XML Namespaces [4]
will support the deployment of modular XML DTDs. XHTML, for example, consists
of a series of modular DTDs.
-
- Why do I need modular DTDs?
- An application may wish to support only a subset of XHTML. For example a mobile
phone, an Internet TV or even a Web-aware cooker may only require a subset of XHTML.
Also, modularity makes it easier to deploy new developments. Once XForms
[5] [6], for example, has been finalised it will
be much easier to deploy documents which make use of the enhanced forms
capabilities which this proposal will bring.
-
- Any other important new developments?
- Yes: XSL Transformations (XSLT) [7].
XSLT provides a transformation language which can be used to transform XML documents
into other formats. XSLT can be used to transform documents from one XML DTD to another,
or even to transform an XML document to an alternative format such as RTF or PDF.
-
- Why is this important?
- You've heard all the hype about mobile phones and WAP, haven't you? How do you
think the WAP world, which expects documents to be in WML format, is going to be populated?
Rather than manually creating WML markup, XSLT will enable XHTML documents to be
automatically converted to WML.
-
- So XHTML should be the master storage format for my resources?
- NO! XHTML still lacks semantics. Ideally your resources should be stored in
an appropriate XML format. XSLT can then be used to convert the resources to XHTML
(for Web browsers), WML (for mobile phones), etc. XHTML is a useful intermediate stage.
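
The idea of one master format feeding several delivery formats can be sketched without an XSLT processor. The example below is written in plain Python with the standard `xml.etree.ElementTree` module as a stand-in for XSLT; in practice the two conversions would be XSLT stylesheets. The `<article>`, `<title>` and `<para>` element names and the card id `main` are hypothetical, invented for this illustration.

```python
import xml.etree.ElementTree as ET

# A hypothetical master document in a simple, semantically named XML format.
SOURCE = """<article>
  <title>Opening hours</title>
  <para>The library is open from 9 to 5.</para>
  <para>It is closed on Sundays.</para>
</article>"""


def to_xhtml(article: ET.Element) -> ET.Element:
    """Build an XHTML tree for Web browsers from the master document."""
    html = ET.Element("html", {"xmlns": "http://www.w3.org/1999/xhtml"})
    head = ET.SubElement(html, "head")
    ET.SubElement(head, "title").text = article.findtext("title")
    body = ET.SubElement(html, "body")
    ET.SubElement(body, "h1").text = article.findtext("title")
    for para in article.findall("para"):
        ET.SubElement(body, "p").text = para.text
    return html


def to_wml(article: ET.Element) -> ET.Element:
    """Build a single-card WML deck for mobile phones from the same source."""
    wml = ET.Element("wml")
    card = ET.SubElement(
        wml, "card", {"id": "main", "title": article.findtext("title")}
    )
    for para in article.findall("para"):
        ET.SubElement(card, "p").text = para.text
    return wml


article = ET.fromstring(SOURCE)
print(ET.tostring(to_xhtml(article), encoding="unicode"))
print(ET.tostring(to_wml(article), encoding="unicode"))
```

Both outputs come from the same source, so a correction to the master document reaches the Web and WAP versions together.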
-
- Can we get down to practicalities? How do I create XHTML pages?
- The eGroups XHTML-L Web site provides links to XHTML tools, including conversion
tools and editors [8]. A couple of free tools are available
(HTML-Kit, 1st Page 2000).
Mozquito Factory appears to be the first licensed package on the
market.
-
- Hmm. So there's not many authoring tools, and none I've heard of.
- That's true. But you can expect the usual suspects (Microsoft, Macromedia, etc.)
to bring out new versions of their products with XHTML support.
-
- What about conversion of existing HTML pages - especially bulk conversion,
as I have many thousands of HTML files!
- Dave Raggett of W3C has written a utility program called Tidy
[9] which can be used to convert HTML pages to XHTML.
Tidy can be used in batch mode to bulk-convert documents. Tidy is an open source
program, which has been incorporated into a number of authoring tools,
most notably HTML-Kit [10], which is
illustrated below.
-

Figure 1: HTML-Kit
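
A bulk conversion along these lines could be scripted as follows. This is a sketch only, assuming a current HTML Tidy release is installed and available on the path as `tidy`, and that it accepts the `-asxhtml`, `-quiet` and `-modify` options (older releases used different option names). The function names and the `*.html` file pattern are choices made for the example.

```python
import subprocess
from pathlib import Path


def tidy_command(path) -> list:
    """Build the command line that converts one file to XHTML in place."""
    # -asxhtml: write XHTML; -quiet: suppress the banner; -modify: overwrite the input
    return ["tidy", "-asxhtml", "-quiet", "-modify", str(path)]


def convert_tree(root) -> list:
    """Run Tidy on every .html file below root and return the files that failed."""
    failed = []
    for path in sorted(Path(root).rglob("*.html")):
        result = subprocess.run(tidy_command(path))
        # Tidy exits with 0 for success, 1 for warnings and 2 for errors.
        if result.returncode >= 2:
            failed.append(path)
    return failed
```

Because the files are overwritten, it would be prudent to run something like `convert_tree("site")` on a copy of the Web site first and to inspect the files it reports as failed.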
- Are there any problems you haven't mentioned?
- XHTML documents should start with an XML declaration:
<?xml version="1.0" encoding="UTF-8"?>.
It should be noted that some browsers (e.g. Netscape versions 1-3, Mosaic 3
[11]) will display the declaration in the browser.
-
- Is this a problem?
- Probably not. If you are concerned you could use "user-agent negotiation" so that
the declaration is not sent to those browsers.
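
A server-side script could make this decision from the User-Agent header. The sketch below is illustrative only: the pattern used to recognise Netscape 1-3 and Mosaic 3 is an assumption about how those browsers identify themselves and would need checking against real server logs, and the function names are invented for the example.

```python
import re

XML_DECLARATION = '<?xml version="1.0" encoding="UTF-8"?>\n'

# Assumed User-Agent patterns: Netscape 1-3 announce themselves as
# "Mozilla/1." to "Mozilla/3." without the word "compatible" (which other
# browsers borrowing the Mozilla name include), and Mosaic 3 as "NCSA_Mosaic/3".
OLD_BROWSERS = re.compile(r"^(?:Mozilla/[1-3]\.(?!.*compatible)|NCSA[_ ]Mosaic/3)")


def shows_declaration(user_agent: str) -> bool:
    """Guess whether this browser would display the XML declaration as text."""
    return OLD_BROWSERS.match(user_agent) is not None


def render(user_agent: str, document: str) -> str:
    """Send the XML declaration only to browsers expected to hide it."""
    if shows_declaration(user_agent):
        return document
    return XML_DECLARATION + document


print(shows_declaration("Mozilla/3.01 (Win95; I)"))                         # True
print(shows_declaration("Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"))  # False
```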
-
- The 64 thousand dollar question: Should I be using XHTML?
- It is the approved W3C Recommendation, so if you are committed to support
for standards you should be using it. However telling your users that they should
stop using FrontPage, HoTMetaL and Dreamweaver and start using HTML-Kit is
probably not a sensible idea. I would say that XHTML should be recommended for
use if your users do not depend on current HTML authoring tools. It should
definitely be used by software developers who generate HTML on the fly.
-
- How do I find out more?
- XHTML books are being written. One of the first to be published is
"Beginning XHTML" [12].
The book is available from Amazon for £21.74 [13].
Note that one of the authors is Dave Raggett, a W3C employee who has been
involved in HTML developments since the early days.
-
- Another very useful resource is the eGroups XHTML-L mailing list
and accompanying Web site [14]. Although the
mailing list is active and provides a useful source of advice,
the best feature of this resource is the accompanying Web site
which provides many links to additional resources, as shown below.

Figure 2: eGroups XHTML Web Site
- Another useful resource is W3Schools, which provides useful
information not only about XHTML [15] but also about technologies
such as XML, WML, etc.
-
- Thank you
- You're welcome.
References
[1] XHTML 1.0: The Extensible HyperText Markup Language, W3C
URL: <http://www.w3.org/TR/xhtml1/>
[2] XML Linking Language (XLink), W3C
URL: <http://www.w3.org/TR/xlink/>
[3] What Are .. XLink and XPointer?, Ariadne issue 16
URL: <http://www.ariadne.ac.uk/issue16/what-is/>
[4] Namespaces in XML, W3C
URL: <http://www.w3.org/TR/REC-xml-names/>
[5] XForms 1.0: Data Model, W3C
URL: <http://www.w3.org/TR/xforms-datamodel/>
[6] XForms Requirements, W3C
URL: <http://www.w3.org/TR/xhtml-forms-req>
[7] XSL Transformations (XSLT) Version 1.0, W3C
URL: <http://www.w3.org/TR/xslt>
[8] XHTML - Links : Tools, eGroups
URL: <http://www.egroups.com/links/XHTML-L/Tools_000957360438/>
[9] Tidy, W3C
URL: <http://www.w3.org/People/Raggett/tidy/>
[10] HTML-Kit, Chami
URL: <http://www.chami.com/html-kit/>
[11] XML Declaration test results, Robin Lionheart, posting to XHTML-L list, 3 June 2000
URL: <http://www.egroups.com/message/XHTML-L/288?&start=266>
[12] Beginning XHTML, Boumphrey, Greer, Raggett,
Raggett, Schnitzenbaumer and Wugofski, Wrox Press Ltd
[13] A Glance: Beginning XHTML, Amazon.co.uk
URL: <http://www.amazon.co.uk/exec/obidos/ASIN/1861003439/o/qid=961063119/sr=8-1/026-2492660-4333201>
[14] XHTML-L, eGroups
URL: <http://www.egroups.com/group/XHTML-L>
[15] Welcome to XHTML School, W3Schools
URL: <http://www.w3schools.com/xhtml/>
Author Details
Brian Kelly
UK Web Focus
UKOLN
University of Bath
Bath
England
BA2 7AY
URL: <http://www.ukoln.ac.uk>
Email: b.kelly@ukoln.ac.uk
For citation purposes:
Brian Kelly, "The XHTML Interview",
Exploit Interactive, issue 6, 26th June 2000
URL: <http://www.exploit-lib.org/issue6/xhtml/>