Processing XML with XOM
The Serializer class in nu.xom offers control over how an XML document is formatted when it is displayed or stored serially. Indentation, character encoding, line breaks, and other formatting are established by objects of this class. A Serializer object can be created by specifying an output stream and character encoding as arguments to the constructor:
File inFile = new File(arguments[0]);
FileOutputStream fos = new FileOutputStream("new_" + inFile.getName());
Serializer output = new Serializer(fos, "ISO-8859-1");
These statements create a serializer that writes to a file using the ISO-8859-1 character encoding. The file is given a name based on a command-line argument. Serializer supports 20 encodings, including ISO-10646-UCS-2, ISO-8859-1 through ISO-8859-10, ISO-8859-13 through ISO-8859-16, UTF-8, and UTF-16. There’s also a Serializer() constructor that takes only an output stream as an argument; this uses the UTF-8 encoding by default.
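The choice of encoding changes the bytes that end up in the output file. As a rough illustration that uses only the JDK's java.nio.charset classes (not XOM itself; the EncodingSizes class and its method names are made up for this sketch), the following code compares how many bytes the same text needs in ISO-8859-1 and UTF-8 and checks whether a character can be represented at all:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; it is not part of XOM.
class EncodingSizes {
    // Number of bytes the text occupies once encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // True if every character of the text has a representation in the charset.
    static boolean representable(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String word = "caf\u00e9"; // "cafe" with an acute accent, escaped to keep the source ASCII
        String euro = "\u20ac";    // the euro sign, which ISO-8859-1 does not contain

        System.out.println(encodedLength(word, StandardCharsets.ISO_8859_1)); // 4
        System.out.println(encodedLength(word, StandardCharsets.UTF_8));      // 5
        System.out.println(representable(euro, StandardCharsets.ISO_8859_1)); // false
        System.out.println(representable(euro, StandardCharsets.UTF_8));      // true
    }
}
```

When a document contains characters that the chosen encoding lacks, an XML serializer generally has to fall back on numeric character references such as &#x20AC;, so UTF-8 is often the simpler choice for mixed-language content.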
Indentation is set by calling the serializer’s setIndent() method with an integer argument specifying the number of spaces:
output.setIndent(2);
An entire XML document is written to the serializer destination by calling the serializer’s write() method with the document as an argument:
output.write(doc);
The DomainWriter application inserts a comment atop the XML document instead of appending it at the end of a parent node’s children. This requires another method of the parent node, insertChild(), which takes two arguments, the node to add and the integer position of the insertion:
Builder builder = new Builder();
Document doc = builder.build(arguments[0]);
Comment timestamp = new Comment("File created " + new java.util.Date());
doc.insertChild(timestamp, 0);
The comment is placed at position 0, at the top of the document. This moves the domains tag down one line, but the comment still appears below the XML declaration.
The DomainWriter application takes an XML filename as a command-line argument when run:
java DomainWriter feeds2.rss
This command produces a file called new_feeds2.rss that contains an indented copy of the XML document with a time stamp inserted as a comment.
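For readers who want to try the same insert-then-serialize flow without adding the XOM library to the classpath, here is a minimal sketch that swaps in the JDK's built-in DOM and Transformer classes. It is an analogue, not XOM's API and not the DomainWriter listing: the CommentStamper class, its stamp() method, and the sample domains markup are assumptions made for illustration, and the sketch works on strings rather than files.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical JDK-only analogue of the steps described above (not XOM code).
class CommentStamper {
    // Parses the XML text, puts a comment ahead of the root element,
    // and returns the document serialized as indented ISO-8859-1 text.
    static String stamp(String xml, String commentText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Counterpart of doc.insertChild(timestamp, 0) in XOM:
            // the comment becomes the first child of the document node.
            doc.insertBefore(doc.createComment(commentText), doc.getDocumentElement());

            // Counterpart of configuring a Serializer with an encoding and indentation.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");

            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            transformer.transform(new DOMSource(doc), new StreamResult(bytes));
            return bytes.toString("ISO-8859-1");
        } catch (Exception e) {
            throw new IllegalStateException("Could not stamp the document", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document for demonstration.
        String xml = "<domains><domain>example.com</domain></domains>";
        System.out.println(stamp(xml, "File created " + new java.util.Date()));
    }
}
```

The exact whitespace of the result depends on the JDK version, but the comment should land between the XML declaration and the root element, which is the same placement the XOM version produces.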
Evaluating XOM
These three sample applications cover the core features of the main XOM package and are representative of its straightforward approach to XML processing. There also are smaller nu.xom.canonical, nu.xom.converters, nu.xom.xinclude, and nu.xom.xslt packages to support XInclude, XSLT, canonical XML serialization, and conversions between the XOM model for XML and the one used by DOM and SAX.
Listing 19.7 contains an application that works with XML from a dynamic source: RSS feeds of recently updated web content from the producer of the feed. The RssFilter application searches the feed for specified text in headlines, producing a new XML document that contains only the matching items and shorter indentation. It also modifies the feed’s title and adds an RSS 0.91 document type declaration if one is needed in an RSS 0.91 format feed.
One feed that can be used to test the application is the one from the Toronto Star newspaper. The following command searches it for items with titles that mention the word “snow”:
java RssFilter http://www.thestar.com/rss/000-082-672?searchMode=Lineup snow
Comments in the application’s source code describe its functionality.
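The filtering step at the heart of such an application can be sketched briefly. The code below is not Listing 19.7 and does not use XOM: it is a simplified stand-in built on the JDK's DOM classes so that it runs without extra libraries. The HeadlineFilter class is hypothetical, the case-insensitive match is an assumption, and the sketch leaves out fetching the feed, changing its title, and adding a document type declaration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical JDK-only sketch of headline filtering (not the RssFilter listing).
class HeadlineFilter {
    // Parses RSS text held in a string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse the feed", e);
        }
    }

    // Removes every item whose title lacks the term and
    // returns the titles of the items that remain in the document.
    static List<String> filter(Document feed, String term) {
        String needle = term.toLowerCase(Locale.ROOT);

        // The NodeList is live, so copy it before removing anything.
        NodeList live = feed.getElementsByTagName("item");
        List<Element> items = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            items.add((Element) live.item(i));
        }

        List<String> kept = new ArrayList<>();
        for (Element item : items) {
            NodeList titles = item.getElementsByTagName("title");
            String title = titles.getLength() > 0 ? titles.item(0).getTextContent() : "";
            if (title.toLowerCase(Locale.ROOT).contains(needle)) {
                kept.add(title);
            } else {
                item.getParentNode().removeChild(item);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up two-item feed for demonstration.
        Document feed = parse("<rss version=\"2.0\"><channel><title>News</title>"
                + "<item><title>Snow expected tonight</title></item>"
                + "<item><title>Sunny weekend ahead</title></item>"
                + "</channel></rss>");
        System.out.println(filter(feed, "snow")); // [Snow expected tonight]
    }
}
```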
XOM’s design is strongly informed by one overriding principle: enforced simplicity. On the website for the class library, Harold states that XOM “should help inexperienced developers do the right thing and keep them from doing the wrong thing. The learning curve needs to be really shallow, and that includes not relying on best practices that are known in the community but are not obvious at first glance.” The new class library is useful for Java programmers whose Java programs require a steady diet of XML.
