Let's start out by writing up a simple version of the kind of XML data you could use for a slide presentation. In this exercise, you'll use your text editor to create the data in order to become comfortable with the basic format of an XML file. You'll be using this file and extending it in later exercises.
Using a standard text editor, create a file called slideSample.xml.
Note: Here is a version of it that already exists: slideSample01.xml. You can use this version to compare your work, or just review it as you read this guide.
Next, write the declaration, which identifies the file as an XML document. The declaration starts with the characters "<?", which is the standard XML identifier for a processing instruction. (You'll see other processing instructions later on in this tutorial.)
<?xml version='1.0' encoding='us-ascii'?>
This line identifies the document as an XML document that conforms to version 1.0 of the XML specification, and says that it uses the 7-bit US-ASCII character-encoding scheme. Since it has not been specified as a "standalone" document, the parser assumes that it may contain references to other documents. (To see how to specify a document as "standalone", see A Quick Introduction to XML, The XML Prolog.)
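One practical consequence of declaring us-ascii is that characters outside the ASCII range cannot appear as raw bytes in the file; they have to be written as character references. The following is a minimal sketch of that behavior, assuming Python and its standard xml.etree.ElementTree module as a convenient way to try it out. The note element is made up for illustration and is not part of the slide show.

```python
import xml.etree.ElementTree as ET

DECLARATION = b"<?xml version='1.0' encoding='us-ascii'?>"


def parses(document: bytes) -> bool:
    """Return True if the parser accepts the document as well formed."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# A character reference keeps the file itself pure ASCII.
with_reference = DECLARATION + b"<note>caf&#233;</note>"

# The same character as a raw Latin-1 byte (0xE9) is not valid US-ASCII.
with_raw_byte = DECLARATION + b"<note>caf\xe9</note>"

print(parses(with_reference))
print(parses(with_raw_byte))
```

If you need such characters as literal text, declare an encoding that can represent them instead.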
Comments are ignored by XML parsers. In fact, you never see them unless you activate special settings in the parser. You'll see how to do that later in the tutorial, when we discuss Using a LexicalEventListener. For now, add the text highlighted below to put a comment into the file.
<?xml version='1.0' encoding='us-ascii'?>
<!-- A SAMPLE set of slides -->
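To see what "ignored unless you activate special settings" can look like in practice, here is a rough sketch using Python's standard xml.etree.ElementTree (Python 3.8 or later). It is an analogue only, not the mechanism this tutorial covers later. The sample places the comment inside the root element, because this particular tree builder attaches only comments that appear there.

```python
import xml.etree.ElementTree as ET

DOC = b"""<?xml version='1.0' encoding='us-ascii'?>
<slideshow>
  <!-- A comment inside the root element -->
  <slide/>
</slideshow>"""

# Default parsing: the comment is dropped, only <slide> remains.
default_root = ET.fromstring(DOC)
default_tags = [child.tag for child in default_root]

# Opting in: ask the tree builder to keep comments in the tree.
keeping_parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
kept_root = ET.fromstring(DOC, parser=keeping_parser)
kept_comments = [c.text for c in kept_root if c.tag is ET.Comment]

print(default_tags)   # ['slide']
print(kept_comments)  # [' A comment inside the root element ']
```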
Next, add the text highlighted below to define the root element for the file, slideshow:
<?xml version='1.0' encoding='us-ascii'?>
<!-- A SAMPLE set of slides -->
<slideshow>
</slideshow>
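An XML document has exactly one top-level element, and every other element lives inside it. Here that element is slideshow. As a quick illustration, again assuming Python's standard xml.etree.ElementTree, a second top-level element makes the document unparseable:

```python
import xml.etree.ElementTree as ET

one_root = b"<slideshow></slideshow>"
two_roots = b"<slideshow></slideshow><slideshow></slideshow>"


def root_tag(document: bytes):
    """Return the root element's name, or None if the parser rejects the input."""
    try:
        return ET.fromstring(document).tag
    except ET.ParseError:
        return None


print(root_tag(one_root))   # slideshow
print(root_tag(two_roots))  # None
```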
A slide presentation has a number of associated data items, none of which require any structure. So it is natural to define them as attributes of the slideshow element. Add the text highlighted below to set up some attributes:
...
<slideshow
    title="Sample Slide Show"
    date="Date of publication"
    author="Yours Truly"
    >
</slideshow>
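Because attributes are flat name/value pairs, a program can read them as a simple lookup on the element. A small sketch, assuming Python's standard xml.etree.ElementTree; the subtitle attribute is deliberately one that the file does not define:

```python
import xml.etree.ElementTree as ET

DOC = b"""<?xml version='1.0' encoding='us-ascii'?>
<!-- A SAMPLE set of slides -->
<slideshow
    title="Sample Slide Show"
    date="Date of publication"
    author="Yours Truly"
    >
</slideshow>"""

root = ET.fromstring(DOC)

# Attributes behave like a dictionary attached to the element.
print(root.attrib["title"])  # Sample Slide Show
print(root.get("author"))    # Yours Truly

# get() returns None for an attribute that was never set.
print(root.get("subtitle"))  # None
```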
When you create a name for a tag or an attribute, you can use hyphens ("-"), underscores ("_"), colons (":"), and periods (".") in addition to letters and digits. A name cannot begin with a hyphen, a period, or a digit.
Note: Colons should be used with care or avoided altogether, because they are used to separate a namespace prefix from the rest of a name.
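You can try the naming rules against a parser directly. The sketch below assumes Python's standard xml.etree.ElementTree, which is namespace-aware, so it also shows why a stray colon causes trouble. All element and attribute names, and the urn:example:talk namespace, are invented for the experiment.

```python
import xml.etree.ElementTree as ET


def accepted(document: str) -> bool:
    """Return True if the parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Hyphens, underscores, periods, and digits are fine after the first character.
print(accepted('<slide-show_v1.0 last.modified-by="me"/>'))  # True

# A name may not start with a digit or a hyphen.
print(accepted("<1slide/>"))  # False
print(accepted("<-slide/>"))  # False

# A colon is read as a namespace prefix, which must be declared.
print(accepted("<talk:slide/>"))                                # False
print(accepted('<talk:slide xmlns:talk="urn:example:talk"/>'))  # True
```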
XML allows for hierarchically structured data, which means that an element can contain other elements. Add the text highlighted below to define a slide element and a title element contained within it:
...
    <!-- TITLE SLIDE -->
    <slide type="all">
      <title>Wake up to WonderWidgets!</title>
    </slide>
</slideshow>
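The nesting in the file becomes a parent/child relationship once the document is parsed. A brief sketch of walking from slideshow to slide to title, assuming Python's standard xml.etree.ElementTree:

```python
import xml.etree.ElementTree as ET

DOC = b"""<slideshow
    title="Sample Slide Show"
    date="Date of publication"
    author="Yours Truly"
    >
    <!-- TITLE SLIDE -->
    <slide type="all">
      <title>Wake up to WonderWidgets!</title>
    </slide>
</slideshow>"""

root = ET.fromstring(DOC)

# Step down one level at a time...
slide = root.find("slide")
title = slide.find("title")
print(title.text)  # Wake up to WonderWidgets!

# ...or use a path that follows the containment hierarchy.
print(root.find("slide/title").text)  # Wake up to WonderWidgets!

# The slide element contains exactly one child element.
print([child.tag for child in slide])  # ['title']
```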
Here you have also added a type attribute to the slide. The idea of this attribute is that slides could be earmarked for a mostly technical or mostly executive audience with type="tech" or type="exec", or identified as suitable for both with type="all".
More importantly, though, this example illustrates the difference between things that are more usefully defined as elements (the title element) and things that are more suitable as attributes (the type attribute). The visibility heuristic is primarily at work here. The title is something the audience will see. So it is an element. The type, on the other hand, is something that never gets presented, so it is an attribute. Another way to think about that distinction is that an element is a container, like a bottle. The type is a characteristic of the container (is it tall or short, wide or narrow). The title is a characteristic of the contents (water, milk, or tea). These are not hard and fast rules, of course, but they can help when you design your own XML structures.
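The split between the two roles shows up naturally in code: the type attribute drives selection, while the title element supplies what gets shown. The following is a sketch only, assuming Python's standard xml.etree.ElementTree. The two extra slides and the titles_for helper are hypothetical additions used to make the filtering visible.

```python
import xml.etree.ElementTree as ET

DOC = b"""<slideshow>
  <slide type="all"><title>Wake up to WonderWidgets!</title></slide>
  <slide type="tech"><title>How it works</title></slide>
  <slide type="exec"><title>Financial forecast</title></slide>
</slideshow>"""


def titles_for(root, audience):
    """Titles of the slides aimed at `audience` or marked for everyone."""
    return [
        slide.findtext("title")                      # element: presented content
        for slide in root.iter("slide")
        if slide.get("type") in (audience, "all")    # attribute: selection only
    ]


root = ET.fromstring(DOC)
print(titles_for(root, "tech"))  # ['Wake up to WonderWidgets!', 'How it works']
print(titles_for(root, "exec"))  # ['Wake up to WonderWidgets!', 'Financial forecast']
```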
Since XML lets you define any tags you want, it makes sense to define a set of tags that look like HTML. The XHTML standard does exactly that, in fact. You'll see more about that towards the end of the SAX tutorial. For now, type the text highlighted below to define a slide with a couple of list item entries that use an HTML-style <em> tag for emphasis (usually rendered as italicized text):
...
    <!-- TITLE SLIDE -->
    <slide type="all">
      <title>Wake up to WonderWidgets!</title>
    </slide>
    <!-- OVERVIEW -->
    <slide type="all">
      <title>Overview</title>
      <item>Why <em>WonderWidgets</em> are great</item>
      <item>Who <em>buys</em> WonderWidgets</item>
    </slide>
</slideshow>
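An item that mixes plain text with an em element is stored as separate pieces: the text before the nested element, the nested element itself, and the text that follows it. A short sketch of how one tree API exposes those pieces, assuming Python's standard xml.etree.ElementTree (other APIs model mixed content differently):

```python
import xml.etree.ElementTree as ET

item = ET.fromstring("<item>Why <em>WonderWidgets</em> are great</item>")
em = item.find("em")

print(repr(item.text))  # 'Why '           text before the nested element
print(repr(em.text))    # 'WonderWidgets'  text inside the nested element
print(repr(em.tail))    # ' are great'     text after the nested element

# itertext() walks all of the pieces in document order.
print("".join(item.itertext()))  # Why WonderWidgets are great
```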
We'll see later that defining a title element conflicts with the XHTML element that uses the same name. We'll discuss the mechanism that produces the conflict (the DTD) and several possible solutions when we cover Parsing the Parameterized DTD.
One major difference between HTML and XML, though, is that all XML must be well formed -- which means that every tag must have an ending tag or be an empty tag. You're getting pretty comfortable with ending tags, by now. Add the text highlighted below to define an empty list item element with no contents:
...
    <!-- OVERVIEW -->
    <slide type="all">
      <title>Overview</title>
      <item>Why <em>WonderWidgets</em> are great</item>
      <item/>
      <item>Who <em>buys</em> WonderWidgets</item>
    </slide>
</slideshow>
Note that any element can be an empty element. All it takes is ending the tag with "/>" instead of ">". You could do the same thing by entering <item></item>, which is equivalent.
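Both points can be checked quickly: the two spellings of an empty element parse to the same thing, and a tag that is never closed is rejected as not well formed. A minimal sketch, assuming Python's standard xml.etree.ElementTree; the describe helper is just for comparison:

```python
import xml.etree.ElementTree as ET


def describe(element):
    """Summarize an element as (name, text, number of children, attributes)."""
    return (element.tag, element.text, len(element), dict(element.attrib))


short_form = ET.fromstring("<item/>")
long_form = ET.fromstring("<item></item>")

print(describe(short_form))                         # ('item', None, 0, {})
print(describe(short_form) == describe(long_form))  # True

# Leaving out the ending tag breaks well-formedness.
try:
    ET.fromstring("<slide><item></slide>")
    unclosed_accepted = True
except ET.ParseError:
    unclosed_accepted = False

print(unclosed_accepted)  # False
```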
Here is the completed version of the XML file:
<?xml version='1.0' encoding='us-ascii'?>
<!-- A SAMPLE set of slides -->
<slideshow
    title="Sample Slide Show"
    date="Date of publication"
    author="Yours Truly"
    >
    <!-- TITLE SLIDE -->
    <slide type="all">
      <title>Wake up to WonderWidgets!</title>
    </slide>
    <!-- OVERVIEW -->
    <slide type="all">
      <title>Overview</title>
      <item>Why <em>WonderWidgets</em> are great</item>
      <item/>
      <item>Who <em>buys</em> WonderWidgets</item>
    </slide>
</slideshow>
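If you want a quick sanity check that the file you typed is well formed and has the expected shape, something along these lines may help. It is a sketch only, assuming Python's standard xml.etree.ElementTree. To stay self-contained it writes the finished document to a temporary slideSample.xml first, and the summarize helper is hypothetical; point it at your own copy instead.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

SLIDE_SAMPLE = b"""<?xml version='1.0' encoding='us-ascii'?>
<!-- A SAMPLE set of slides -->
<slideshow
    title="Sample Slide Show"
    date="Date of publication"
    author="Yours Truly"
    >
    <!-- TITLE SLIDE -->
    <slide type="all">
      <title>Wake up to WonderWidgets!</title>
    </slide>
    <!-- OVERVIEW -->
    <slide type="all">
      <title>Overview</title>
      <item>Why <em>WonderWidgets</em> are great</item>
      <item/>
      <item>Who <em>buys</em> WonderWidgets</item>
    </slide>
</slideshow>
"""


def summarize(path):
    """Parse the file (raises ParseError if it is not well formed) and count parts."""
    root = ET.parse(path).getroot()
    return {
        "root": root.tag,
        "slides": len(root.findall("slide")),
        "items": len(root.findall("slide/item")),
    }


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "slideSample.xml")
    with open(path, "wb") as handle:
        handle.write(SLIDE_SAMPLE)
    summary = summarize(path)

print(summary)  # {'root': 'slideshow', 'slides': 2, 'items': 3}
```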
Now that you've created a file to work with, you're ready to write a program to echo it using the SAX parser. You'll do that in the next section.