Now that you have a basic understanding of XML, it makes sense to get a high-level overview of the various XML-related acronyms and what they mean. There is a lot of work going on around XML, so there is a lot to learn.
The current APIs for accessing XML documents either serially or in random access mode are, respectively, SAX and DOM. The specifications for ensuring the validity of XML documents are DTD (the original mechanism, defined as part of the XML specification) and various schema proposals (newer mechanisms that use XML syntax to do the job of describing validation criteria). Other future standards that are nearing completion include the XSL standard -- a mechanism for setting up translations of XML documents (for example to HTML or other XML) and for dictating how the document is rendered. Another effort nearing completion is the XML Link Language specification (XLL), which enables links between XML documents.
Those are the major initiatives you will want to be familiar with. This section also surveys a number of other interesting proposals, including the HTML-lookalike standard, XHTML, and the meta-standard for describing the information an XML document contains, RDF. It also covers the XML Namespaces initiative that promotes modular reuse of XML documents by avoiding naming collisions.
Several of the XML schema proposals are covered here as well, along with a quick survey of the standards efforts that are using XML for remote control of desktops (DMTF) and document servers (WebDAV).
Finally, there are a number of interesting standards and standards-proposals that build on XML, including Synchronized Multimedia Integration Language (SMIL), Mathematical Markup Language (MathML), Scalable Vector Graphics (SVG), and DrawML.
The remainder of this section gives you a more detailed description of these initiatives.
Skim the terms once, so you know what's here, and keep a copy of this document handy so you can refer to it whenever you see one of these terms in something you're reading. Pretty soon, you'll have them all committed to memory, and you'll be at least "conversant" with XML!
SAX
Simple API for XML
This API was actually a product of collaboration on the XML-DEV mailing list, rather than a product of the W3C. It's included here because it has the same "final" characteristics as a W3C recommendation.
You can also think of this standard as the "serial access" protocol for XML. This is the fast-to-execute mechanism you would use to read and write XML data in a server, for example. This is also called an event-driven protocol, because the technique is to register your handler with a SAX parser, after which the parser invokes your callback methods whenever it sees a new XML tag (or encounters an error, or wants to tell you anything else).
For more information on the SAX protocol, see Serial Access with the Simple API for XML.
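The event-driven style described above can be sketched in a few lines. This is a minimal illustration that assumes the standard JAXP classes shipped with the JDK; the class name SaxTagLister and the sample document are hypothetical and chosen only for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the name of each start tag in document order.
class SaxTagLister {

    static List<String> startTags(String xml) {
        final List<String> names = new ArrayList<>();
        // The parser invokes this callback each time it sees a new start tag.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                names.add(qName);
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // Register the handler with the parser and let it drive the callbacks.
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Configuration, I/O, and parse errors are wrapped to keep the sketch short.
            throw new IllegalStateException("SAX parse failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(startTags("<order><item/><item/></order>"));
    }
}
```

Nothing is kept in memory except what the handler chooses to store, which is why this style suits large documents read serially.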
DOM
Document Object Model
The Document Object Model protocol converts an XML document into a collection of objects in your program. You can then manipulate the object model in any way that makes sense. This mechanism is also known as the "random access" protocol, because you can visit any part of the data at any time. You can then modify the data, remove it, or insert new data. For more information on the DOM specification, see Manipulating Document Contents with the Document Object Model.
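As a rough illustration of random access, the sketch below parses a small document into a DOM tree, changes one node, and inserts another. It assumes the JAXP DocumentBuilder API from the JDK; the class name DomEditDemo and the list/item element names are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: builds a DOM tree and edits it in place.
class DomEditDemo {

    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    // Appends a new <item> to the root element and returns the new item count.
    static int addItem(Document doc, String text) {
        Element item = doc.createElement("item");
        item.setTextContent(text);
        doc.getDocumentElement().appendChild(item);
        return doc.getElementsByTagName("item").getLength();
    }

    public static void main(String[] args) {
        Document doc = parse("<list><item>first</item></list>");
        // Random access: go straight to the first <item> and modify its text.
        doc.getElementsByTagName("item").item(0).setTextContent("changed");
        System.out.println(addItem(doc, "second"));
    }
}
```

The whole document lives in memory here, which is the price paid for being able to visit and change any node at any time.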
DTD
Document Type Definition
The DTD specification is actually part of the XML specification, rather than a separate entity. On the other hand, it is optional -- you can write an XML document without it. And there are a number of schema proposals that offer more flexible alternatives. So it is treated here as though it were a separate specification.
A DTD specifies the kinds of tags that can be included in your XML document, and the valid arrangements of those tags. You can use the DTD to make sure you don't create an invalid XML structure. You can also use it to make sure that the XML structure you are reading (or that got sent over the net) is indeed valid.
Unfortunately, it is difficult to specify a DTD for a complex document in such a way that it prevents all invalid combinations and allows all the valid ones. So constructing a DTD is something of an art. The DTD can exist at the front of the document, as part of the prolog. It can also exist as a separate entity, or it can be split between the document prolog and one or more additional entities.
However, while the DTD mechanism was the first method defined for specifying valid document structure, it was not the last. Several newer schema specifications have been devised. You'll learn about those momentarily.
For more information, see Defining a Document Type.
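To make the idea of checking a document against a DTD concrete, here is a hedged sketch that places a small DTD in the document prolog and asks a validating JAXP parser to report violations. The note/to/body vocabulary and the class name DtdValidationDemo are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class: collects the validity errors a DTD-validating parser reports.
class DtdValidationDemo {

    // A DTD held in the prolog (internal subset); the vocabulary is invented.
    static final String DOCTYPE =
          "<!DOCTYPE note [\n"
        + "  <!ELEMENT note (to, body)>\n"
        + "  <!ELEMENT to (#PCDATA)>\n"
        + "  <!ELEMENT body (#PCDATA)>\n"
        + "]>\n";

    static List<String> validationErrors(String xmlWithDoctype) {
        final List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                // Validity violations arrive here; record them and keep parsing.
                @Override public void error(SAXParseException e) {
                    errors.add(e.getMessage());
                }
                // Well-formedness problems are fatal; let them propagate.
                @Override public void fatalError(SAXParseException e)
                        throws SAXParseException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xmlWithDoctype)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        // Valid: <to> followed by <body>, as the DTD requires.
        System.out.println(validationErrors(
            DOCTYPE + "<note><to>Ann</to><body>Hi</body></note>"));
        // Invalid: the required <to> element is missing.
        System.out.println(validationErrors(
            DOCTYPE + "<note><body>Hi</body></note>"));
    }
}
```

The same approach works whether you are checking a document you produced yourself or one that arrived over the network.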
RDF
Resource Description Framework
RDF is a proposed standard for defining data about data. Used in conjunction with the XHTML specification, for example, or with HTML pages, RDF could be used to describe the content of the pages. For example, if your browser stored your ID information as FIRSTNAME, LASTNAME, and EMAIL, an RDF description could make it possible to transfer data to an application that wanted NAME and EMAILADDRESS. Just think! Some day you may not need to type your name and address at every web site you visit!
For the latest information on RDF, see http://www.w3.org/TR/PR-rdf-syntax/.
Namespaces
The namespace standard lets you write an XML document that uses two or more sets of XML tags in modular fashion. Suppose for example that you created an XML-based parts list that uses XML descriptions of parts supplied by other manufacturers (online!). The "price" data supplied by the subcomponents would be amounts you want to total up, while the "price" data for the structure as a whole would be something you want to display. The namespace specification defines mechanisms for qualifying the names so as to eliminate ambiguity. That lets you write programs that use information from other sources and do the right things with it.
The latest information on namespaces can be found at http://www.w3.org/TR/REC-xml-names.
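The parts-list scenario can be sketched as follows. The two namespace URIs, the element names, and the class name NamespacePriceDemo are all hypothetical; the point is only that a namespace-aware parser lets a program tell the supplier's "price" elements apart from the assembly's own "price".

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: sums <price> elements from one namespace only.
class NamespacePriceDemo {

    // Invented namespace URIs, used purely as unique names.
    static final String PARTS_NS = "http://example.com/parts";
    static final String SUPPLIER_NS = "http://example.com/supplier";

    static final String XML =
          "<assembly xmlns='http://example.com/parts'"
        + " xmlns:s='http://example.com/supplier'>"
        + "<price>99.00</price>"
        + "<s:component><s:price>10.50</s:price></s:component>"
        + "<s:component><s:price>4.25</s:price></s:component>"
        + "</assembly>";

    static double sumPrices(String xml, String namespaceUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookup below
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // Only <price> elements in the requested namespace are returned.
            NodeList prices = doc.getElementsByTagNameNS(namespaceUri, "price");
            double total = 0;
            for (int i = 0; i < prices.getLength(); i++) {
                total += Double.parseDouble(prices.item(i).getTextContent());
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Supplier total: " + sumPrices(XML, SUPPLIER_NS));
        System.out.println("Display price:  " + sumPrices(XML, PARTS_NS));
    }
}
```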
A W3C "proposed recommendation" is a not-quite-final-but-probably-really-close proposal for a W3C recommendation. It is still open for review, and may see some change if the harsh light of reality forces it. But a lot of thought has been given to the proposal by many gifted people, so it's a pretty good bet that a standard in this category will go forward without much change.
RDF Schema
The RDF Schema proposal allows the specification of consistency rules and additional information that describe how the statements in a Resource Description Framework (RDF) should be interpreted.
For more information on the RDF Schema proposed recommendation, see http://www.w3.org/TR/PR-rdf-schema.
XSL
Extensible Stylesheet Language
The XML standard specifies how to identify data, not how to display it. HTML, on the other hand, told how things should be displayed without identifying what they were. The coalescing XSL standard is essentially a translation mechanism that lets you specify what to convert an XML tag into so that it can be displayed -- for example, in HTML. Different XSL formats can then be used to display the same data in different ways, for different uses.
The translation part of XSL is pretty complete, and a number of implementations exist. The second part of XSL is a bit more tenuous, however. That part covers formatting objects, also known as flow objects, which give you the ability to define multiple areas on a page and then link them together. When a text stream is directed at the collection, it fills the first area and then "flows" into the second when the first area is filled. Such objects are used by newsletters, catalogs, and periodical publications.
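The translation part of XSL can be illustrated with a short sketch that turns one XML tag into HTML. It assumes the javax.xml.transform API bundled with the JDK and uses XSLT 1.0 stylesheet syntax, which may differ from the working draft this section refers to. The greeting element and the class name XslTransformDemo are invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: applies a tiny stylesheet that maps <greeting> to <h1>.
class XslTransformDemo {

    // XSLT 1.0 syntax (an assumption; the cited draft predates it).
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html' indent='no'/>"
        + "<xsl:template match='/greeting'>"
        + "<h1><xsl:value-of select='.'/></h1>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String toHtml(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml("<greeting>Hello</greeting>"));
    }
}
```

Swapping in a different stylesheet would render the same data differently, which is the separation of data and display described above.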
The latest W3C work on XSL is at http://www.w3.org/TR/WD-xsl.
XLL
XML Link Language
The XLL protocol consists of two proposed specifications to handle links between XML documents: XLink and XPointer, discussed next. These specifications are still in their preliminary stages, but are sure to have a big impact on how XML documents are used.
XLink: The XLink protocol is a proposed specification to handle links between XML documents. This specification allows for some pretty sophisticated linking, including two-way links, links to multiple documents, "expanding" links that insert the linked information into your document rather than replacing your document with a new page, links between two documents that are created in a third, independent document, and indirect links (so you can point to an "address book" rather than directly to the target document -- updating the address book then automatically changes any links that use it). For more information on the XLink specification, see http://www.w3.org/TR/WD-xml-link.
XPointer: In general, the XLink specification targets a document or document-segment using its ID. The XPointer specification defines mechanisms for "addressing into the internal structures of XML documents", without requiring the author of the document to have defined an ID for that segment. To quote the spec, it provides for "reference to elements, character strings, and other parts of XML documents, whether or not they bear an explicit ID attribute". For the latest XPointer specification, see http://www.w3.org/TR/WD-xptr.
XHTML
The XHTML specification is a way of making XML documents that look and act like HTML documents. Since an XML document can contain any tags you care to define, why not define a set of tags that look like HTML? That's the thinking behind the XHTML specification, at any rate. The result of this specification is a document that can be displayed in browsers and also treated as XML data. The data may not be quite as identifiable as "pure" XML, but it will be a heck of a lot easier to manipulate than standard HTML, because XML specifies a good deal more regularity and consistency.
For example, every tag in a well-formed XML document must either have an end-tag associated with it or it must end in />. So you might see <p>...</p>, or you might see <p/>, but you will never see <p> standing by itself. The upshot of that requirement is that you never have to program for the weird kinds of cases you see in HTML where, for example, a <dt> tag might be terminated by </dt>, by another <dt>, by <dd>, or by </dl>. That makes it a lot easier to write code!
The XHTML specification is a reformulation of HTML 4.0 into XML. The latest information is at http://www.w3.org/TR/WD-html-in-xml/.
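The regularity that XML demands is easy to check mechanically. The sketch below, which assumes the JAXP parser in the JDK and a hypothetical class name WellFormedCheck, feeds a parser one definition list with every tag closed and one written in the looser HTML style.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: reports whether a string is well-formed XML.
class WellFormedCheck {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the input, for example an unclosed tag.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("parser unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Every tag closed: acceptable as XML.
        System.out.println(isWellFormed("<dl><dt>term</dt><dd>def</dd></dl>"));
        // HTML-style omitted end tags: rejected by an XML parser.
        System.out.println(isWellFormed("<dl><dt>term<dd>def</dl>"));
    }
}
```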
XML Schema
This specification is built on the schema proposals described below. It defines the types of elements a document can contain, their relationships, and the data they can contain in ways that go far beyond what the current DTD specification provides. See the "Schema Proposals" section below for more insight into the limitations of DTDs. For more information on the XML Schema proposal, see the W3C specs XML Schema (Structures) and XML Schema (Datatypes).
"Notes" are not W3C standards at all. Instead, they are proposals made by various individuals and groups that cover topics that are under consideration. The W3C publishes them so that people who are busy working on the standards and reviewing them have some ideas to get started. One "note" is no more likely to reflect the eventual standard than any other -- each will be judged on its merits and, hopefully, the best features of all will be combined in the W3C draft. Most of the schema proposals to date [Mar 1999] fall into this category.
Although DTDs let you validate XML documents, they suffer from a number of shortcomings. Many of the issues stem from the fact that a DTD specification is not hierarchical. For a mailing address that contained several "parsed character data" (PCDATA) elements, for example, the DTD might look something like this:
<!ELEMENT mailAddress (name, address, zipcode)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT zipcode (#PCDATA)>
As you can see, the specifications are linear. There is no sense of containment, which can pollute the namespace, forcing you to come up with new names for similar elements in different settings. So if you wanted to add another "name" element to the DTD that contained the elements firstName, middleInitial, and lastName, then you would have to come up with another identifier. You could not simply call it "name" without conflicting with the name element defined for use in a mailAddress.
Another problem with the nonhierarchical nature of DTD specifications is that it is not clear what comments are meant to explain. A comment at the top like <!-- Address used for mailing via the postal system --> would apply to all of the elements that constitute a mailing address. But a comment like <!-- Addressee --> would apply to the name element only. On the other hand, a comment like <!-- A 5-digit string --> would apply specifically to the #PCDATA part of the zipcode element, to describe the valid formats. Finally, DTDs do not allow you to formally specify field-validation criteria, such as the 5-digit (or 5 and 4) limitation for the zipcode field.
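The field-validation gap can be demonstrated with a short sketch. It validates a document against the mailAddress DTD shown earlier, which accepts any text as a zipcode, and then applies the format check in program code instead. The class name ZipcodeCheckDemo is hypothetical, and the pattern assumes that "5 and 4" means five digits, a hyphen, and four digits.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: DTD validation plus a hand-written field check.
class ZipcodeCheckDemo {

    static final String DOCTYPE =
          "<!DOCTYPE mailAddress [\n"
        + "  <!ELEMENT mailAddress (name, address, zipcode)>\n"
        + "  <!ELEMENT name (#PCDATA)>\n"
        + "  <!ELEMENT address (#PCDATA)>\n"
        + "  <!ELEMENT zipcode (#PCDATA)>\n"
        + "]>\n";

    // Assumed format: 5 digits, optionally followed by a hyphen and 4 digits.
    static final Pattern ZIP = Pattern.compile("\\d{5}(-\\d{4})?");

    // Returns the zipcode text if the document is valid per the DTD; throws otherwise.
    static String zipcodeOfValidDocument(String body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                // Treat any validity violation as a failure.
                @Override public void error(SAXParseException e)
                        throws SAXParseException {
                    throw e;
                }
            });
            Document doc = builder.parse(
                new InputSource(new StringReader(DOCTYPE + body)));
            return doc.getElementsByTagName("zipcode").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("document is not valid per the DTD", e);
        }
    }

    static boolean isPlausibleZip(String zip) {
        return ZIP.matcher(zip).matches();
    }

    public static void main(String[] args) {
        String zip = zipcodeOfValidDocument(
            "<mailAddress><name>A</name><address>B</address>"
            + "<zipcode>hello</zipcode></mailAddress>");
        // The DTD accepted the document; only the extra check catches the bad value.
        System.out.println(zip + " plausible? " + isPlausibleZip(zip));
    }
}
```

A schema language with data types would let the document definition itself carry this rule, rather than leaving it to each program.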
To remedy these shortcomings, a number of proposals have been made for a more database-like, hierarchical "schema" that specifies validation criteria. Some of the major proposals are shown below.
DDML / XSchema
Document Definition Markup Language / XSchema
Document definitions like DTD are good to have, but a DTD has a somewhat strange syntax. DDML is the new name for the older XSchema proposal, which specifies validity constraints for an XML document using XML. DDML is one of several proposals that aim to be the successor to DTD. It is not yet clear what the final validation standard will be.
For more information on DDML, see http://www.w3.org/TR/NOTE-ddml.
DCD
Document Content Description
The DCD proposal is a mechanism for defining a standard XML front end for databases.
For more information on DCD, see http://www.w3.org/TR/NOTE-dcd.
SOX
Schema for Object-oriented XML
SOX is a schema proposal that includes extensible data types, namespaces, and embedded documentation.
For more information on SOX, see http://www.w3.org/TR/NOTE-SOX.
ICE
Information and Content Exchange
ICE is a protocol for use by content syndicators and their subscribers. It focuses on "automating content exchange and reuse, both in traditional publishing contexts and in business-to-business relationships".
For more information on ICE, see http://www.w3.org/TR/NOTE-ice.
The following standards and proposals build on XML. Since XML is basically a language-definition tool, these specifications use it to define standardized languages for specialized purposes.
SMIL
Synchronized Multimedia Integration Language
SMIL is a W3C recommendation that covers audio, video, and animations. It also addresses the difficult issue of synchronizing the playback of such elements.
For more information on SMIL, see http://www.w3.org/TR/REC-smil.
MathML
Mathematical Markup Language
MathML is a W3C recommendation that deals with the representation of mathematical formulas.
For more information on MathML, see http://www.w3.org/TR/REC-MathML.
SVG
Scalable Vector Graphics
SVG is a W3C working draft that covers the representation of vector graphic images. (Vector graphic images are built from commands that say things like "draw a line (square, circle) from point x,y to point m,n" rather than encoding the image as a series of bits. Such images are more easily scalable, although they typically require more processing time to render.)
For more information on SVG, see http://www.w3.org/TR/WD-SVG.
DrawML
Drawing Meta Language
DrawML is a W3C note that covers 2D images for technical illustrations. It also addresses the problem of updating and refining such images.
For more information on DrawML, see http://www.w3.org/TR/1998/NOTE-drawml-19981203.
cXML
Commerce XML
cXML is a RosettaNet (www.rosettanet.org) standard for setting up interactive online catalogs for different buyers, where the pricing and product offerings are company specific. It includes mechanisms to handle purchase orders, change orders, status updates, and shipping notifications.
For more information on cXML, see http://corp.ariba.com/News/AribaArchive/cxml.htm.
CBL
Common Business Library
CBL is a library of element and attribute definitions maintained by CommerceNet (www.commerce.net).
For more information on CBL and a variety of other initiatives that work together to enable eCommerce applications, see http://www.commerce.net/projects/currentprojects/eco/wg/eCo_Framework_Specifications.html.
DMTF
Distributed Management Task Force
The DMTF is a group that is coming up with standards to remotely administer desktop equipment. They are planning to use XML to maintain catalogs of devices and their descriptions, and for other remote-management tasks. This group is not part of the W3C, but their activities appear to have progressed to the draft stage, so they are listed here.
For more information on this organization, see http://www.dmtf.org/.
WebDAV
Web Distributed Authoring and Versioning
WebDAV is an effort from the IETF that uses XML to maintain web servers. It allows a server's content to be created, modified, and changed over an HTTP connection. (The IETF is not affiliated with the W3C, but their "draft standard" is approximately the equivalent of a W3C "recommendation", so it is included here.)
For more information, see the "webdav" working group at http://www.ietf.org.