Testing Code
We start off with a demonstration of parsing the same XML document with a variety of tools. Seeing the code makes it easier to determine which style you’re most comfortable with. These parsing examples are necessarily academic to facilitate comparison.
The examples often reference order.xml, an XML document we made up for the purpose of testing the parsers. This document needs to be available in the same directory as the sample code. In some examples, order.xml is generated automatically by the code. Either way, the contents of order.xml should stay the same.
The contents of order.xml are:
<?xml version="1.0" encoding="UTF-8"?>
<order>
  <line_item name="T-shirt"
             count="1"
             color="red"
             size="XL">
    <notes>Please ship before next week.</notes>
  </line_item>
</order>
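As one point of comparison, here is a minimal sketch of reading this document with the DOM parser that ships with the JDK (JAXP). The class name OrderDomSketch and the summarize method are illustrative names, not part of any library, and the document is inlined as a string so the sketch runs without order.xml on disk.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class OrderDomSketch {
    // Same content as order.xml, inlined so the sketch is self-contained.
    static final String ORDER_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<order>\n"
        + "  <line_item name=\"T-shirt\" count=\"1\" color=\"red\" size=\"XL\">\n"
        + "    <notes>Please ship before next week.</notes>\n"
        + "  </line_item>\n"
        + "</order>\n";

    // Builds one summary line per line_item element (hypothetical helper).
    static String summarize(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            NodeList items = doc.getElementsByTagName("line_item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                // The notes child is treated as optional here.
                NodeList notes = item.getElementsByTagName("notes");
                String note = notes.getLength() > 0
                    ? notes.item(0).getTextContent() : "";
                out.append(item.getAttribute("count")).append(" x ")
                   .append(item.getAttribute("color")).append(' ')
                   .append(item.getAttribute("name")).append(" (")
                   .append(item.getAttribute("size")).append("): ")
                   .append(note).append('\n');
            }
            return out.toString();
        } catch (Exception e) {
            // Parser configuration, I/O, and well-formedness errors end up here.
            throw new IllegalStateException("could not parse order document", e);
        }
    }

    public static void main(String[] args) {
        // prints: 1 x red T-shirt (XL): Please ship before next week.
        System.out.print(summarize(ORDER_XML));
    }
}
```

To read the real file instead, builder.parse(new java.io.File("order.xml")) can replace the InputSource call.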
The ability to validate documents is one of XML’s great strengths. The DTD standard was popular in the past, but DTDs are being phased out because they can only enforce structural validity. If you’ve worked with HTML, you may recall seeing DTDs before. The following is an example of a DTD reference in an HTML document:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
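To make the structural-only limitation concrete, the sketch below validates an order document against a DTD with the JDK's validating DOM parser. The DTD itself is hypothetical (the text does not define one for order.xml), as are the class and method names. Because DTD attribute values here are plain character data, a count of "lots" passes validation, while a missing required attribute does not.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {
    // Hypothetical DTD for the order document, written as an internal subset.
    static final String DTD =
        "<!DOCTYPE order [\n"
        + "  <!ELEMENT order (line_item+)>\n"
        + "  <!ELEMENT line_item (notes?)>\n"
        + "  <!ELEMENT notes (#PCDATA)>\n"
        + "  <!ATTLIST line_item\n"
        + "    name  CDATA #REQUIRED\n"
        + "    count CDATA #REQUIRED\n"
        + "    color CDATA #IMPLIED\n"
        + "    size  CDATA #IMPLIED>\n"
        + "]>\n";

    // Returns the validation problems found in the given document body.
    static List<String> validate(String body) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) {
                    problems.add(e.getMessage()); // validity errors arrive here
                }
                @Override public void fatalError(SAXParseException e)
                        throws SAXParseException {
                    throw e; // document is not well-formed
                }
            });
            builder.parse(new InputSource(new StringReader(DTD + body)));
        } catch (Exception e) {
            problems.add("could not parse: " + e.getMessage());
        }
        return problems;
    }

    public static void main(String[] args) {
        // Structurally fine, so the DTD accepts it even though count is not a number.
        System.out.println(validate(
            "<order><line_item name='T-shirt' count='lots'/></order>"));
        // Missing the required count attribute, so the DTD rejects it.
        System.out.println(validate(
            "<order><line_item name='T-shirt'/></order>"));
    }
}
```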
The most likely new standard for validating XML documents is XML Schema. This is a W3C recommendation for defining the schema and rules governing a valid XML document. With XML Schema, we can define what is and is not permissible data, what format data should be in, and how it should be structured. As an added bonus, XML schemas are themselves written in XML.
The XML schema for our simple order.xml is contained in the order.xsd file.
Following are the contents of order.xsd:
<schema
    xmlns="http://www.w3.org/2001/XMLSchema"
    xmlns:soml="http://www.theculprit.com/test/simple_order_ml">
  <element name="order">
    <complexType>
      <element name="line_item">
        <attribute name="name" type="string"/>
        <attribute name="count" type="integer">
          <minInclusive value="0"/>
        </attribute>
        <attribute name="color" type="string">
          <maxLength value="14"/>
        </attribute>
        <attribute name="size" type="soml:ItemSize"/>
        <element name="notes" type="string"/>
      </element>
    </complexType>
  </element>
  <simpleType name="ItemSize" base="string">
    <enumeration value="XS"/>
    <enumeration value="S"/>
    <enumeration value="M"/>
    <enumeration value="L"/>
    <enumeration value="XL"/>
    <enumeration value="XXL"/>
  </simpleType>
</schema>
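As a rough illustration of putting such rules to work, the sketch below validates order documents with the javax.xml.validation API from the JDK. One loud assumption: the schema embedded in the code is a re-expression of the order.xsd rules in XML Schema 1.0 syntax, where facets such as minInclusive and maxLength sit inside restriction elements and no target namespace is used. It is not the file shown above, and the class and method names are illustrative.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class OrderSchemaSketch {
    // Assumed XML Schema 1.0 rendering of the order.xsd rules (not the original file).
    static final String ORDER_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='order'><xs:complexType><xs:sequence>"
        + "<xs:element name='line_item' maxOccurs='unbounded'><xs:complexType>"
        + "<xs:sequence>"
        + "<xs:element name='notes' type='xs:string' minOccurs='0'/>"
        + "</xs:sequence>"
        + "<xs:attribute name='name' type='xs:string'/>"
        + "<xs:attribute name='count'><xs:simpleType>"
        + "<xs:restriction base='xs:integer'><xs:minInclusive value='0'/>"
        + "</xs:restriction></xs:simpleType></xs:attribute>"
        + "<xs:attribute name='color'><xs:simpleType>"
        + "<xs:restriction base='xs:string'><xs:maxLength value='14'/>"
        + "</xs:restriction></xs:simpleType></xs:attribute>"
        + "<xs:attribute name='size' type='ItemSize'/>"
        + "</xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "<xs:simpleType name='ItemSize'><xs:restriction base='xs:string'>"
        + "<xs:enumeration value='XS'/><xs:enumeration value='S'/>"
        + "<xs:enumeration value='M'/><xs:enumeration value='L'/>"
        + "<xs:enumeration value='XL'/><xs:enumeration value='XXL'/>"
        + "</xs:restriction></xs:simpleType>"
        + "</xs:schema>";

    // Builds a one-item order document with the given count and size.
    static String order(String count, String size) {
        return "<order><line_item name='T-shirt' count='" + count
            + "' color='red' size='" + size + "'>"
            + "<notes>Please ship before next week.</notes>"
            + "</line_item></order>";
    }

    // Returns null when the document is valid, else the first error message.
    static String firstProblem(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(ORDER_XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
        try {
            // With no ErrorHandler set, the first validity error is thrown.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstProblem(order("1", "XL")));    // valid order
        System.out.println(firstProblem(order("-1", "XL")));   // breaks minInclusive
        System.out.println(firstProblem(order("1", "XXXL")));  // not an ItemSize value
    }
}
```

The first call reports no problem, while the other two return a message from the parser describing the violated facet.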
The schema is fairly self-explanatory. The structure of the schema document looks similar to the document it is describing. There are rules and restrictions, known as data facets, sprinkled throughout it. For example, the maxLength element is a data facet. We can also create new types, as in the case of the ItemSize type. The XML Schema specification provides a number of standard types we can use to enforce type on our XML data.
There are also some alternate methods of validating XML available. These include Regular Language Description for XML (RELAX), RELAX Next Generation (RELAX NG), and Tree Regular Expression for XML (TREX). RELAX NG is basically the combination of RELAX and TREX, so moving forward, it is the most likely of the three to achieve popularity. More information on RELAX NG can be found at http://www.relaxng.org/.
XML Schema represents a great way to perform validation and is a huge improvement over the DTD standard.
Validation is a relatively expensive operation, so use it accordingly. Because there is still not a definitive standard for validation, we’re going to avoid getting into the topic in great depth. What we can tell you is that XML validation plays a vital role in enterprise systems.
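One common way to keep validation costs in check is sketched below, assuming the javax.xml.validation API from the JDK: compile the schema once, reuse it for every document, and let callers switch validation off for input they already trust. The class name, the enabled flag, and the tiny schema are all hypothetical and exist only to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class ReusableOrderValidator {
    // Deliberately tiny schema, made up for this sketch.
    static final String TINY_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='order'><xs:complexType><xs:sequence>"
        + "<xs:element name='line_item' maxOccurs='unbounded'><xs:complexType>"
        + "<xs:attribute name='count' type='xs:nonNegativeInteger' use='required'/>"
        + "</xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    private final Schema schema;   // compiled once; Schema objects are thread-safe
    private final boolean enabled; // false skips validation entirely

    ReusableOrderValidator(String xsd, boolean enabled) {
        this.enabled = enabled;
        try {
            this.schema = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("schema did not compile", e);
        }
    }

    // Returns true when the document is valid or validation is switched off.
    boolean accepts(String xml) {
        if (!enabled) {
            return true;
        }
        // Validators are cheap to create but not thread-safe, so make one per call.
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ReusableOrderValidator strict = new ReusableOrderValidator(TINY_XSD, true);
        System.out.println(strict.accepts("<order><line_item count='1'/></order>"));
        System.out.println(strict.accepts("<order><line_item count='-1'/></order>"));
    }
}
```

Whether skipping validation is acceptable depends on where the documents come from; the flag only shows where such a decision could live.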