Whether you're ... ... ... ... or the Internet in general, you've likely seen or heard ... to ... Markup Language (XML). XML is, without a doubt, one of the m
To comprehend XML, it's beneficial to draw a comparison with HyperText Markup Language (HTML), a technology that many are familiar with. HTML was designed to format and display information on the Web using a fixed set of tags, elements, and attributes. However, if HTML doesn't support a specific functionality, users are left with limited options. This is where XML comes into play.
XML, like HTML, is text-based and consists of tags, elements, and attributes. However, XML allows users to structure and define the information in their documents, making it a meta-language. It enables users to create their own collection of tags, elements, and attributes to accurately describe the physical contents of a document, thus offering extensibility.
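As a rough illustration of this extensibility, the sketch below uses Python's standard xml.etree.ElementTree module to build a tiny document from an invented vocabulary. The names recipe, ingredient, name, and unit are hypothetical, chosen only for this example; XML predefines none of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: XML itself defines none of these names.
recipe = ET.Element("recipe", {"name": "Pancakes"})

# A child element with its own attribute and text content.
ingredient = ET.SubElement(recipe, "ingredient", {"unit": "cups"})
ingredient.text = "2"

# Serialize the element tree back to XML text.
xml_text = ET.tostring(recipe, encoding="unicode")
print(xml_text)
```

The point is only that the author of the document, not the language, decides which element and attribute names exist.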
An XML document is composed of tags, elements, and attributes. Consider the following HTML fragment:
<table>
  <tr>
    <td>Here is the first group of text</td>
    <td>Here is the second group of text</td>
  </tr>
</table>
This HTML fragment contains a table element (<table>), a table row element (<tr>), and two table cell elements (<td>). Each of these elements has both an opening tag (for example, <table>) and a closing tag (</table>).
In XML, users can replace HTML's fixed element and attribute names with names of their own choosing. For instance, a document describing a company's employee roster might look like this:
<?xml version="1.0"?>
<company name="Information Strategies">
  <employees>
    <employee id="1">Hank Aaron</employee>
    <employee id="2">Babe Ruth</employee>
  </employees>
</company>
Let's break down the components of the XML document:
The header, <?xml version="1.0"?>, indicates that this is an XML document using version 1.0 of the XML specification.
Just like in HTML, tags are used to indicate the opening and closing of an element.
Elements are the basic building blocks of XML. They may contain text, comments, or other elements, and consist of a start tag and an end tag.
Attributes, such as name="Information Strategies" and id="1", act as adjectives that describe their elements. For the document to be well-formed, every attribute value must be enclosed in quotation marks.
The content of an element, such as the text Hank Aaron, carries the actual information that the element describes.
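To make these components concrete, here is a minimal sketch that parses the employee roster with Python's standard xml.etree.ElementTree module and reads back its elements, attributes, and contents. The variable names are illustrative, and this is only one of several ways to read XML.

```python
import xml.etree.ElementTree as ET

# The employee roster from the article, held in a string.
ROSTER = """<?xml version="1.0"?>
<company name="Information Strategies">
  <employees>
    <employee id="1">Hank Aaron</employee>
    <employee id="2">Babe Ruth</employee>
  </employees>
</company>"""

# Parsing returns the root element, here <company>.
root = ET.fromstring(ROSTER)

# Attributes are read by name from the element that carries them.
company_name = root.get("name")

# Map each employee's id attribute to the element's text content.
employees = {e.get("id"): e.text for e in root.iter("employee")}

print(root.tag, company_name)
for employee_id, full_name in employees.items():
    print(employee_id, full_name)
```

Each piece of the breakdown maps onto one call: the element name comes from tag, an attribute from get(), and the content from text.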
The World Wide Web Consortium (W3C), the main standards body for the Web, has been working on a proposal to rewrite the HTML 4 language in XML 1.0. This proposal, known as XHTML, will require all HTML documents to be well-formed. As of the time this article was written, XHTML had been endorsed by the director of the W3C as a recommendation. The W3C's full support of XML indicates its potential to revolutionize the way we program applications for the Web.
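What well-formedness means in practice can be sketched with a small check. The helper is_well_formed below is a hypothetical name, not part of any library; it relies on Python's standard XML parser rejecting markup that is not well-formed. It tests well-formedness only and does not validate a document against the XHTML specification.

```python
import xml.etree.ElementTree as ET

def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True

# Typical lenient HTML: unclosed <p> and <br> tags.
sloppy = "<div><p>First paragraph<br><p>Second paragraph</div>"

# The same content with every element closed and properly nested.
strict = "<div><p>First paragraph<br/></p><p>Second paragraph</p></div>"

print(is_well_formed(sloppy))
print(is_well_formed(strict))
```

A browser will usually render the first string anyway, while an XML parser rejects it outright. That strictness is what a requirement of well-formedness adds.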