Before you can load anything, you must create a new instance of the class URL that represents the address of the resource you want to load. URL is an acronym for uniform resource locator, and it refers to the unique address of any document or other resource accessible on the Internet. URL is part of the java.net package, so you must import the package or refer to the class by its full name in your programs. To create a new URL object, use one of four constructors:
· URL(String)—Creates a URL object from a full web address such as http://www.java21days.com or ftp://ftp.netscape.com.
· URL(URL, String)—Creates a URL object with a base address provided by the specified URL and a relative path provided by the String.
· URL(String, String, int, String)—Creates a new URL object from a protocol (such as “http” or “ftp”), hostname (such as “www.cnn.com” or “web.archive.org”), port number (80 for HTTP), and a filename or pathname.
· URL(String, String, String)—The same as the previous constructor minus the port number.
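As a rough illustration of the four constructors listed above, the following sketch builds one URL with each and returns their text forms. The paths (/pages/index.html, about.html, /index.html) and the class name UrlConstructorsSketch are made up for the example. Recent JDKs (20 and later) mark these constructors as deprecated, so expect a compiler warning there.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlConstructorsSketch {
    // Builds a URL with each of the four constructors and returns their text forms.
    static String[] buildAll() {
        try {
            // URL(String): a complete address (the path is hypothetical)
            URL full = new URL("http://www.java21days.com/pages/index.html");

            // URL(URL, String): a relative path resolved against a base address
            URL base = new URL("http://www.java21days.com/pages/");
            URL relative = new URL(base, "about.html");

            // URL(String, String, int, String): protocol, hostname, port, filename
            URL withPort = new URL("http", "www.cnn.com", 80, "/index.html");

            // URL(String, String, String): the same, minus the port number
            URL noPort = new URL("http", "www.cnn.com", "/index.html");

            return new String[] {
                full.toString(), relative.toString(),
                withPort.toString(), noPort.toString()
            };
        } catch (MalformedURLException e) {
            // Every one of these constructors can throw this checked exception.
            throw new IllegalStateException("Unexpected bad URL", e);
        }
    }

    public static void main(String[] args) {
        for (String address : buildAll()) {
            System.out.println(address);
        }
    }
}
```

Note that a port passed explicitly to the constructor is kept in the URL's text form, while the three-argument version leaves the port unset and relies on the protocol's default.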
When you use the URL(String) constructor, you must deal with MalformedURLException objects, which are thrown if the String does not appear to be a valid URL (the other three constructors can throw the same exception). These objects can be handled in a try-catch block:
// URL and MalformedURLException both come from java.net
try {
    URL load = new URL("http://www.samspublishing.com");
} catch (MalformedURLException e) {
    System.out.println("Bad URL");
}
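The same try-catch pattern can be wrapped in a small helper so the rest of a program only has to check for null. This is a minimal sketch, not part of the WebReader listing; the class name UrlCheckSketch and the method parseOrNull() are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlCheckSketch {
    // Returns a URL object for the address, or null if it is malformed.
    static URL parseOrNull(String address) {
        try {
            return new URL(address);
        } catch (MalformedURLException e) {
            System.out.println("Bad URL: " + address);
            return null;
        }
    }

    public static void main(String[] args) {
        // A full address with a protocol is accepted.
        URL good = parseOrNull("http://www.samspublishing.com");
        // An address without a protocol is rejected.
        URL bad = parseOrNull("www.samspublishing.com");
        System.out.println("good is null? " + (good == null));
        System.out.println("bad is null? " + (bad == null));
    }
}
```

The constructor checks only the form of the address, such as whether it names a known protocol. It does not contact the server, so a well-formed URL can still point to a page that does not exist.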
The WebReader application in Listing 17.1 uses the four-step technique to open a connection to a website and read a text document from it. When the document is fully loaded, it is displayed in a text area.
To run the WebReader application, specify a URL as the only command-line argument. For example:
java WebReader http://www.rssboard.org/rss-feed
Any URL can be chosen; try http://tycho.usno.navy.mil/cgi-bin/timer.pl for the U.S. Naval Observatory timekeeping site or http://random.yahoo.com/bin/ryl for a random link from the Yahoo! directory. The preceding example loads a page from an RSS file, as shown in Figure 17.1.
Two-thirds of the WebReader class is devoted to running the application, creating the user interface, and creating a valid URL object. The web document is loaded over a stream and displayed in a text area in the getData() method.
Four objects are used: a URL, an HttpURLConnection, an InputStreamReader, and a BufferedReader. These objects work together to pull the data from the Internet to the Java application. In addition, two objects are created to hold the data when it arrives: a String and a StringBuffer.
Lines 24–26 open an HTTP URL connection, which is necessary to get an input stream from that connection.
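Opening the connection might look something like the sketch below. It is not the code from Listing 17.1: the class name OpenConnectionSketch, the open() helper, and the 5-second timeouts are assumptions added for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

class OpenConnectionSketch {
    // Creates an HTTP connection object for the address without reading from it.
    static HttpURLConnection open(String address) {
        try {
            URL page = new URL(address);
            // openConnection() only builds the connection object; no data is
            // exchanged with the server until connect() is called or a read begins.
            // The cast works for http and https addresses only.
            HttpURLConnection conn = (HttpURLConnection) page.openConnection();
            // Hypothetical limits so a slow server cannot stall the program.
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            return conn;
        } catch (IOException e) {
            // MalformedURLException is a kind of IOException, so it lands here too.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = open("http://www.rssboard.org/rss-feed");
        System.out.println(conn.getRequestMethod() + " " + conn.getURL());
    }
}
```

Wrapping the checked exception in UncheckedIOException is a design choice made to keep the example short; a full application would more likely report the problem to the user.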
Lines 27–28 use the connection’s getContent() method to create a new input stream reader. The method is declared to return an Object; here that object is an input stream representing the connection to the URL, which is why it can be wrapped in a reader. Line 29 uses that input stream reader to create a new buffered input stream reader, a BufferedReader object called buff. After you have this buffered reader, you can use its readLine() method to read a line of text from the input stream. The buffered reader puts characters in a buffer as they arrive and pulls them out of the buffer when requested.
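The reader chain described above can be tried without a network connection by feeding it any input stream. The sketch below is an assumption-laden stand-in rather than the listing's code: it uses a ByteArrayInputStream in place of the connection's stream, a while loop that stops when readLine() returns null, and a StringBuilder to collect the text. The class name ReadLinesSketch and the method readAll() are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ReadLinesSketch {
    // Reads every line from the stream and returns the text,
    // with a newline after each line.
    static String readAll(InputStream in) {
        StringBuilder text = new StringBuilder();
        // InputStreamReader turns bytes into characters;
        // BufferedReader adds buffering and the readLine() method.
        try (BufferedReader buff = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            // readLine() returns null when the end of the stream is reached.
            while ((line = buff.readLine()) != null) {
                text.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the data a web server would send.
        byte[] data = "<rss>\n<channel/>\n</rss>".getBytes(StandardCharsets.UTF_8);
        System.out.print(readAll(new ByteArrayInputStream(data)));
    }
}
```

Because readLine() strips the line terminator, the sketch adds a newline back after each line so the text keeps its line breaks when displayed.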
The do-while loop in lines 32–35 reads the web document line by line, appending each line to the StringBuffer object created to hold the page’s text. After all the data has been read, line 36 converts the string buffer into a string with the toString() method and then puts that result in the program’s text area by calling the component’s setText(String) method. The HttpURLConnection class includes several methods that affect the HTTP request or provide more information: