Internet programming with Java

1.    The Role of Form Data

If you’ve ever used a search engine, visited an on-line bookstore, tracked stocks on the Web, or asked a Web-based site for quotes on plane tickets, you’ve probably seen funny-looking URLs like http://host/path?user=Marty+Hall&origin=bwi&dest=lax. The part after the question mark (i.e., user=Marty+Hall&origin=bwi&dest=lax) is known as form data (or query data) and is the most common way to get information from a Web page to a server-side program. Form data can be attached to the end of the URL after a question mark (as above), for GET requests, or sent to the server on a separate line in the body of the request, for POST requests. If you’re not familiar with HTML forms, Chapter 16 (Using HTML Forms) gives details on how to build forms that collect and transmit data of this sort.

Extracting the needed information from this form data is traditionally one of the most tedious parts of CGI programming. First of all, you have to read the data one way for GET requests (in traditional CGI, this is usually through the QUERY_STRING environment variable) and a different way for POST requests (by reading the standard input in traditional CGI). Second, you have to chop the pairs at the ampersands, then separate the parameter names (left of the equal signs) from the parameter values (right of the equal signs).

Third, you have to URL-decode the values. Alphanumeric characters are sent unchanged, but spaces are converted to plus signs and other characters are converted to %XX where XX is the ASCII (or ISO Latin-1) value of the character, in hex. The server-side program has to reverse this process. For example, if someone enters a value of “~hall, ~gates, and ~mcnealy” into a textfield with the name users in an HTML form, the data is sent as “users=%7Ehall%2C+%7Egates%2C+and+%7Emcnealy”, and the server-side program has to reconstitute the original string.
Finally, the fourth reason that parsing form data is tedious is that values can be omitted (e.g., “param1=val1&param2=&param3=val3”) or a parameter can have more than one value (e.g., “param1=val1&param2=val2&param1=val3”), so your parsing code needs special cases for these situations.
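
To make the amount of work concrete, here is a minimal sketch of what parsing form data by hand might look like in plain Java. The class name FormDataParser and its methods are hypothetical and are not part of CGI or the servlet API; the sketch also assumes the data was encoded as ISO Latin-1, matching the description above. It chops the pairs at the ampersands, splits each pair at the first equal sign, URL-decodes both halves, and keeps every value for a repeated name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of any servlet or CGI API.
final class FormDataParser {

    // Maps each parameter name to all of its values, in order of appearance.
    static Map<String, List<String>> parse(String formData) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (formData == null || formData.isEmpty()) {
            return params;
        }
        // Chop the pairs at the ampersands.
        for (String pair : formData.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            // Name is left of the first '=', value is right of it.
            // A pair without '=' is treated as a name with an empty value.
            int eq = pair.indexOf('=');
            String rawName = (eq < 0) ? pair : pair.substring(0, eq);
            String rawValue = (eq < 0) ? "" : pair.substring(eq + 1);
            // URL-decode both halves and record the value under its name.
            params.computeIfAbsent(decode(rawName), k -> new ArrayList<>())
                  .add(decode(rawValue));
        }
        return params;
    }

    // Turns '+' back into a space and %XX back into the character it stands for.
    // Assumption: ISO Latin-1 encoding, as described in the text.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support ISO-8859-1.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("users=%7Ehall%2C+%7Egates%2C+and+%7Emcnealy"));
        System.out.println(parse("param1=val1&param2=&param3=val3"));
        System.out.println(parse("param1=val1&param2=val2&param1=val3"));
    }
}
```

Even this short version has to special-case omitted values and repeated names, and it still does not cover reading the data differently for GET and POST.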

 

2.    Reading Form Data from Servlets

One of the nice features of servlets is that all of this form parsing is handled automatically. You simply call the getParameter method of the HttpServletRequest, supplying the case-sensitive parameter name as an argument. You use getParameter exactly the same way when the data is sent by GET as you do when it is sent by POST; the servlet knows which request method was used and automatically does the right thing behind the scenes. The return value is a String corresponding to the URL-decoded value of the first occurrence of that parameter name. An empty String is returned if the parameter exists but has no value, and null is returned if there was no such parameter.

If the parameter could potentially have more than one value, you should call getParameterValues (which returns an array of strings) instead of getParameter (which returns a single string). The return value of getParameterValues is null for nonexistent parameter names and is a one-element array when the parameter has only a single value.

The names supplied to getParameter and getParameterValues are case sensitive so, for example, request.getParameter("Param1") and request.getParameter("param1") are not interchangeable.

Finally, although most real servlets look for a specific set of parameter names, for debugging purposes it is sometimes useful to get a full list. Use getParameterNames to get this list in the form of an Enumeration, each entry of which can be cast to a String and used in a getParameter or getParameterValues call. Just note that the HttpServletRequest API does not specify the order in which the names appear within that Enumeration.
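
Because getParameter can return null, an empty String, or a real value, code that displays parameters usually needs to tell the three cases apart. The following is a small sketch of one way to do that. The class ParamDisplay and its placeholder strings are hypothetical; the methods take the values that getParameter and getParameterValues would have returned, so the logic can be tried out without a running server.

```java
// Hypothetical helper for illustration only; not part of the servlet API.
final class ParamDisplay {

    // 'value' is whatever request.getParameter(name) returned.
    static String describe(String value) {
        if (value == null) {
            return "(no such parameter)";   // name was not in the form data at all
        }
        if (value.isEmpty()) {
            return "(no value)";            // name was present, e.g. "param2="
        }
        return value;
    }

    // 'values' is whatever request.getParameterValues(name) returned.
    static String describeAll(String[] values) {
        if (values == null) {
            return "(no such parameter)";
        }
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                result.append(", ");
            }
            result.append(describe(values[i]));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // In a servlet these arguments would come from the request, for example
        // describe(request.getParameter("param1")).
        System.out.println(describe(null));
        System.out.println(describe(""));
        System.out.println(describeAll(new String[] {"val1", "val3"}));
    }
}
```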

3.    Example: Reading Three Explicit Parameters

Listing 3.1 presents a simple servlet called ThreeParams that reads form data parameters named param1, param2, and param3 and places their values in a bulleted list. Listing 3.2 shows an HTML form that collects user input and sends it to this servlet. By use of an ACTION of /servlet/coreservlets.ThreeParams, the form can be installed anywhere on the system running the servlet; there need not be any particular association between the directory containing the form and the servlet installation directory. Recall that the specific locations for installing HTML files vary from server to server. With the JSWDK 1.0.1 and Tomcat 3.0, HTML pages are placed somewhere in install_dir/webpages and are accessed via http://host/path/file.html. For example, if the form shown in Listing 3.2 is placed in install_dir/webpages/forms/ThreeParamsForm.html and the server is accessed from the same host that it is running on, the form would be accessed by a URL of http://localhost/forms/ThreeParamsForm.html.

 

Listing 3.1 - ThreeParams.java

package coreservlets;

import java.io.*;
import javax.servlet.*;
import javax.servlet.http.*;

/** Simple servlet that reads three form parameters and shows them in a bulleted list. */
public class ThreeParams extends HttpServlet {
    public void doGet(HttpServletRequest request,
            HttpServletResponse response)
            throws ServletException, IOException {
        // Response settings must be specified before any content is generated.
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        String title = "Reading Three Request Parameters";
        // ServletUtilities is a helper class in the same coreservlets package.
        // getParameter returns null for a missing parameter, so "null" is printed in that case.
        out.println(ServletUtilities.headWithTitle(title) +
        "<BODY BGCOLOR=\"#FDF5E6\">\n" +
        "<H1 ALIGN=CENTER>" + title + "</H1>\n" +
        "<UL>\n" +
        " <LI><B>param1</B>: "
        + request.getParameter("param1") + "\n" +
        " <LI><B>param2</B>: "
        + request.getParameter("param2") + "\n" +
        " <LI><B>param3</B>: "
        + request.getParameter("param3") + "\n" +
        "</UL>\n" +
        "</BODY></HTML>");
    }
}

Although you are required to specify response settings before beginning to generate the content, there is no requirement that you read the request parameters at any particular time. If you’re accustomed to the traditional CGI approach where you read POST data through the standard input, you should note that you can do the same thing with servlets by calling getReader or getInputStream on the HttpServletRequest and then using that stream to obtain the raw input. This is a bad idea for regular parameters since the input is neither parsed (separated into entries specific to each parameter) nor URL-decoded (translated so that plus signs become spaces and %XX gets replaced by the ASCII or ISO Latin-1 character corresponding to the hex value XX). However, reading the raw input might be of use for uploaded files or POST data being sent by custom clients rather than by HTML forms. Note, however, that if you read the POST data in this manner, it might no longer be found by getParameter.
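
For the cases where raw input is appropriate, the reading itself is ordinary stream handling. The sketch below shows one possible way to collect everything from a Reader into a String. The class RawBodyReader is hypothetical, and a StringReader stands in for the reader that getReader would supply, so the example runs without a server.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper for illustration only; not part of the servlet API.
final class RawBodyReader {

    // Reads characters until end of stream and returns them unchanged:
    // nothing is split into parameters and nothing is URL-decoded.
    static String readAll(Reader in) throws IOException {
        StringBuilder body = new StringBuilder();
        char[] buffer = new char[4096];
        int count;
        while ((count = in.read(buffer)) != -1) {
            body.append(buffer, 0, count);
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        // In a servlet the argument would be request.getReader().
        Reader standIn = new StringReader("param1=a+b&param2=%7Ehall");
        System.out.println(readAll(standIn));
    }
}
```

The result is exactly the text the client sent, plus signs and %XX sequences included, which is why this route suits custom clients and uploads rather than ordinary form parameters.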

 

Listing 3.2 - ThreeParamsForm.html

 

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
    <TITLE>Collecting Three Parameters</TITLE>
</HEAD>
<BODY BGCOLOR="#FDF5E6">
    <H1 ALIGN="CENTER">Collecting Three Parameters</H1>
    <FORM ACTION="/servlet/coreservlets.ThreeParams">
        First Parameter: <INPUT TYPE="TEXT" NAME="param1"><BR>
        Second Parameter: <INPUT TYPE="TEXT" NAME="param2"><BR>
        Third Parameter: <INPUT TYPE="TEXT" NAME="param3"><BR>
        <CENTER>
            <INPUT TYPE="SUBMIT">
        </CENTER>
    </FORM>
</BODY>
</HTML>
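
Since the form does not specify a METHOD, it is submitted with GET, and the browser attaches the three fields to the ACTION URL as form data. As a rough illustration of the URL that results, the sketch below builds it by hand. The class ThreeParamsUrl is hypothetical, and the sketch assumes ISO Latin-1 encoding; a real browser performs this step itself.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration only; a browser does this automatically.
final class ThreeParamsUrl {

    // Builds the relative URL the form would request for the given field values.
    static String build(String param1, String param2, String param3) {
        return "/servlet/coreservlets.ThreeParams"
            + "?param1=" + encode(param1)
            + "&param2=" + encode(param2)
            + "&param3=" + encode(param3);
    }

    // Spaces become '+', and other non-alphanumeric characters mostly become %XX.
    // Assumption: ISO Latin-1 encoding.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support ISO-8859-1.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("~hall, ~gates, and ~mcnealy", "", "lax"));
    }
}
```

A field left blank still appears in the URL with an empty value, which is the case where getParameter returns an empty String rather than null.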