When Sun released J2EE to capture the growing e-business market, it
changed Java from a language to an enterprise platform.
Several key players such as BEA and Oracle have pledged their
support and endorse J2EE standards in their application server
products. Several other companies are either already using the Java
application server or are thinking of using it in the near future.
These companies are scrambling to come up with a scalable enterprise
architecture that works with existing technology and also grows with
future changes. However, developing scalable and adaptable
enterprise architecture is a difficult and time-consuming task.
Several software-engineering principles such as OO methodology and
software patterns need to be used when constructing the enterprise
architecture so it will last.
To address this problem, Cybelink created Jlink, a vendor-neutral
framework based on Sun's J2EE Blueprint recommendations and
guidelines. Architected at a high level, Jlink can be easily adapted
to work with any vendor-specific, Java-based application server such
as WebLogic and WebSphere. In this three-part series we'll discuss
the Jlink framework; in Part 1 we'll discuss the Servlet/JSP portion
of Jlink and describe the problems associated with the Servlet/JSP
technology. In Part 2 we'll describe the Jlink architecture and how
it solves the Servlet/JSP problems. In Part 3 we'll describe the
workings of Jlink and provide an example.
Web Programming Model and Servlet/JSP Container
Before going into the details of Jlink, we'll discuss the Web
programming model and explore how the Servlet/JSP container supports
it. Any Servlet/JSP-based Web application can be modeled using the
following simple actions: a user submits a request by clicking a
button or a URL link in a Web browser. The HTTP request first goes to
the Web server; if the request is for plain HTML, the Web server
sends the HTML page to the browser. If the request needs to execute a
Servlet/JSP, the Web server passes the request to the Servlet/JSP
container, which processes the request by invoking the appropriate
Servlet/JSP and sending a response back to the browser. The
Servlet/JSP might process the request itself or send it to a JavaBean
or EJB, extract the result, and transfer it to the browser via the
Web server.
J2EE defines a Web container as a place where servlets and
JSPs live, and it acts as a bridge between the client and EJB
containers. The communication between the Web and EJB containers is
carried out by JavaBean components. A crucial aspect of Web
development is to architect the middle tier, which processes the HTTP
request and sends the results back to the browser. Jlink provides an
easy mechanism to handle HTTP requests from the browser and send them
to the appropriate JavaBean component, which processes the request
and sends the results back to the browser. An HTTP request can be
processed in one of the following ways:
1. Within Servlet/JSP components
2. Within JavaBean components
3. Accessing the EJB business components using local calls or
through RMI/IIOP
4. Accessing the Enterprise Information System (EIS) tier using
a connector
5. Accessing databases using JDBC
Jlink currently supports options 1 and 2 and has been architected in
such a way that future support for options 3, 4, and 5 can be
integrated into it.
Servlet/JSP Container Runtime Environment
Before going into the details of our Jlink architecture,
we'll discuss the Servlet and JSP working model in detail. Once we
understand the inner workings of that, we can discuss the Jlink
architecture.
When an HTTP request with a Servlet/JSP comes from a browser,
the Web server routes the request to the Servlet/JSP container (see
Figure 1). If the HTTP request comes for the first time, the
Servlet/JSP container loads the Servlet (a JSP is first translated
into a Servlet and compiled) and processes the HTTP request by
creating the following objects:
- javax.servlet.ServletContext
- javax.servlet.http.HttpServletRequest
- javax.servlet.http.HttpServletResponse
- javax.servlet.http.HttpSession
The second time an HTTP request comes for the same Servlet/JSP from
the same browser, the container creates only the following objects:
- javax.servlet.http.HttpServletRequest
- javax.servlet.http.HttpServletResponse
When a later HTTP request comes for the same Servlet/JSP but from a
different browser, the container creates only the following objects:
- javax.servlet.http.HttpServletRequest
- javax.servlet.http.HttpServletResponse
- javax.servlet.http.HttpSession
The Servlet/JSP container creates one
javax.servlet.ServletContext object per Web application. This object
provides application-wide services for all Servlet/JSPs within that
Web application. The Servlet/JSP container creates one
javax.servlet.http.HttpSession object for each browser connection.
Each HttpSession object provides services to the corresponding
browser connection and is responsible for maintaining the individual
browser information. Each time an HTTP request for a Servlet/JSP
comes from a browser, the container creates new
javax.servlet.http.HttpServletRequest and
javax.servlet.http.HttpServletResponse objects. The
HttpServletRequest provides browser request information to a
Servlet/JSP. It also provides data, including parameter names and
values, that's set in the browser. The Servlet/JSP uses the
HttpServletResponse object to send responses back to the browser.
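The three object scopes described above can be sketched in plain Java. This is only an illustration under stated assumptions: ScopeDemo and its fields are hypothetical names, the maps stand in for ServletContext and HttpSession, and no real servlet API is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a container; it does not use the real servlet API.
class ScopeDemo {
    // Created once per application (plays the role of ServletContext).
    final Map<String, Integer> applicationScope = new HashMap<>();
    // One map per browser, keyed by session ID (plays the role of HttpSession).
    final Map<String, Map<String, Integer>> sessions = new HashMap<>();
    // Counts the request/response pairs created so far.
    int requestObjectsCreated = 0;

    // Handles one request; returns how many requests this browser has made.
    int handle(String sessionId) {
        requestObjectsCreated++; // a new request/response pair on every call
        // The session map is created only the first time a browser is seen.
        Map<String, Integer> session =
                sessions.computeIfAbsent(sessionId, id -> new HashMap<>());
        int visits = session.merge("visits", 1, Integer::sum);
        applicationScope.merge("totalHits", 1, Integer::sum);
        return visits;
    }

    public static void main(String[] args) {
        ScopeDemo container = new ScopeDemo();
        container.handle("browser-A");
        container.handle("browser-A");
        container.handle("browser-B");
        System.out.println("sessions: " + container.sessions.size());
        System.out.println("requests: " + container.requestObjectsCreated);
    }
}
```

Two browsers produce two session maps, while every call produces a new request-scoped count and all calls share the single application-scoped counter.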
Another important concept of the Servlet/JSP container is
that when multiple HTTP requests from different browsers come to a
Servlet/JSP, the container handles them by creating a thread for each
request. This significantly increases the performance of the Web
application. However, this feature also creates a problem that will
be addressed later in this article.
Life Cycle of the Objects
In the previous section we discussed the Servlet/JSP
container's runtime environment. In this section we'll look at the
life cycle of the ServletContext, HttpSession, HttpServletRequest,
and HttpServletResponse objects that make up the runtime environment.
The ServletContext object exists
as long as the Web application is active in the Servlet/JSP
container. When the Web application is either closed or restarted,
the corresponding ServletContext object becomes a potential candidate
for garbage collection by the Java virtual machine.
The HttpSession object that's created for each browser
connection remains within the Servlet/JSP container as long as the
session is active. When the session is inactive for a specific time
period, the Servlet/JSP container removes that object and makes it a
potential candidate for garbage collection.
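The inactivity timeout can be sketched as follows. This is a simplified model, not how any particular container implements it: SessionReaper and its methods are hypothetical names, and the current time is passed in as a parameter so the behavior is easy to follow. In the real API the timeout is configured with HttpSession.setMaxInactiveInterval().

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical model of session expiry; not a container implementation.
class SessionReaper {
    private final long maxInactiveMillis;
    // Session ID -> time of the most recent request for that session.
    private final Map<String, Long> lastAccess = new HashMap<>();

    SessionReaper(long maxInactiveMillis) {
        this.maxInactiveMillis = maxInactiveMillis;
    }

    // Called on every request to record that the session is still in use.
    void touch(String sessionId, long nowMillis) {
        lastAccess.put(sessionId, nowMillis);
    }

    boolean isActive(String sessionId) {
        return lastAccess.containsKey(sessionId);
    }

    // Drops sessions idle longer than the limit; returns how many were dropped.
    int expire(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastAccess.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > maxInactiveMillis) {
                it.remove(); // the session object is now eligible for garbage collection
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        SessionReaper reaper = new SessionReaper(1000);
        reaper.touch("A", 0);
        reaper.touch("B", 900);
        System.out.println("expired: " + reaper.expire(1500));
    }
}
```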
When the Servlet/JSP container receives a new HTTP request
from the browser, it removes the references to the old
HttpServletRequest and HttpServletResponse objects and creates new
ones.
HTTP is a stateless protocol; that is, it doesn't maintain
state (session data) from one request to the next. Each new HTTP
request from the browser arrives with no memory of the previous one.
In most Web applications it's necessary to maintain the user session
so that content is sent to the correct browser. If not, the user must
log in each and every time, which is not an acceptable solution. By
creating the HttpSession, HttpServletRequest, and HttpServletResponse
objects at the appropriate times, the Servlet/JSP container provides
a solution to the above problem. However, this solution is primitive
and has its own problems. We need to extend this concept so we can
have a scalable, maintainable, and adaptable architecture.
Session Data Storage
Session data can be stored at the client- or server-side. In
the client-side approach we can use the following techniques:
cookies, URL rewriting, and hidden fields. In the server-side
approach we can store the session data in a persistence store that
can be retrieved for later use.
Client-Side Cookies
We can use client-side cookies to store all the session data
with the client ID. Once the session data has been written, it can be
retrieved later using this ID. By converting all the session data
into a string, it's possible to store the entire session data in a
cookie. Listing 1 shows how you can use, store, and retrieve cookies.
Once the cookie has been written with the addCookie() method of
HttpServletResponse, you can retrieve it on later requests from the
HttpServletRequest object using getCookies(). Cookies are easy to
implement but they have several limitations: a cookie can only store
up to 4 KB of text, and there are restrictions on the number of
cookies that can be stored for a given domain.
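Converting session data into a single cookie-sized string might look like the sketch below. It is separate from Listing 1 and rests on assumptions: CookieCodec, the key=value&key=value layout, and the 4096-byte constant are illustrative choices, and Java 10 or later is assumed for the Charset overloads of URLEncoder and URLDecoder.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that packs session data into one cookie value.
class CookieCodec {
    static final int MAX_COOKIE_BYTES = 4096; // the roughly 4 KB limit noted above

    // Encodes the map as key=value pairs joined by '&', URL-encoding each part.
    static String encode(Map<String, String> data) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : data.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        // The encoded text is pure ASCII, so characters and bytes are the same count.
        if (sb.length() > MAX_COOKIE_BYTES) {
            throw new IllegalArgumentException("session data too large for one cookie");
        }
        return sb.toString();
    }

    // Reverses encode(); assumes the value was produced by encode().
    static Map<String, String> decode(String value) {
        Map<String, String> data = new LinkedHashMap<>();
        if (value.isEmpty()) {
            return data;
        }
        for (String pair : value.split("&")) {
            int eq = pair.indexOf('=');
            data.put(URLDecoder.decode(pair.substring(0, eq), StandardCharsets.UTF_8),
                     URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8));
        }
        return data;
    }

    public static void main(String[] args) {
        Map<String, String> session = new LinkedHashMap<>();
        session.put("user", "ann lee");
        System.out.println(encode(session));
    }
}
```

The size check makes the cookie limitation explicit: once the encoded session grows past the limit, this technique simply stops working.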
Client-Side HTML Hidden Fields
Another approach to preserving session data is using "hidden
fields," available in HTML INPUT tags. The stored session data can be
retrieved using the getParameter() method that's defined by the
HttpServletRequest class. Although this approach is easy to use and
implement, it has its own problems: we can store only string values
in the hidden fields, and we can't store other objects. Another major
problem with this approach is that the performance of the Web site
decreases significantly. When the first request comes from a browser,
you process that request, store the result in the hidden fields, and
send that HTML page back to the browser for additional user requests.
When the next request comes from the same user, you process the data
from the first request as well as the second one, and send the result
back in the HTML page. With this approach you end up processing the
same data again and again, so it's not suitable for large Web sites.
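A minimal sketch of writing session data into a hidden field is shown here. HiddenFieldWriter is a hypothetical helper, and the escaping covers only the characters that would break an HTML attribute.

```java
// Hypothetical helper that renders one piece of session data as a hidden field.
class HiddenFieldWriter {
    // Escapes the characters that are unsafe inside an HTML attribute value.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Only strings can be stored, which is the limitation described above.
    static String hiddenInput(String name, String value) {
        return "<input type=\"hidden\" name=\"" + escape(name)
                + "\" value=\"" + escape(value) + "\">";
    }

    public static void main(String[] args) {
        System.out.println(hiddenInput("step", "1 & 2"));
    }
}
```

On the next request, the value would come back as an ordinary form parameter and be read with getParameter().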
Client-Side URL Rewriting
Generally this technique is used when the user disables
cookies in the browser. You append a client ID as a parameter to all
URLs that are served by the Web server. This is accomplished by using
the encodeURL() method defined in the HttpServletResponse class. One
problem with this approach is that every URL on every page must be
rewritten dynamically in order to embed the client ID in every
request. Listing 2 provides an example of URL rewriting. In it the
itemID is used as a client ID that identifies the user in case
cookies are disabled in the browser. Note: You need to generate these
URLs for each and every link in all your pages, which is tedious and
difficult to maintain.
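To make the rewriting step concrete, the sketch below shows roughly what happens to a URL. It is independent of Listing 2 and is not the container's implementation: UrlRewriter is a hypothetical class, and it follows the servlet convention of carrying the session ID in a ";jsessionid=" path parameter placed before any query string or fragment.

```java
// Hypothetical illustration of URL rewriting; containers do this inside encodeURL().
class UrlRewriter {
    // Inserts ";jsessionid=<id>" after the path, before any '?' or '#'.
    static String rewrite(String url, String sessionId) {
        int cut = url.length();
        for (char marker : new char[] {'?', '#'}) {
            int index = url.indexOf(marker);
            if (index >= 0 && index < cut) {
                cut = index;
            }
        }
        return url.substring(0, cut) + ";jsessionid=" + sessionId + url.substring(cut);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("/shop/item.jsp?itemID=42", "ABC123"));
    }
}
```

Every link a page emits has to pass through a call like this, which is why the technique is tedious to apply by hand.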
Another problem with the client-side approaches is security. If we
store all the session data in the client, with little effort anyone can
access that data. One way to solve the problem is to encrypt the
session data before storing it on the client-side. However, even if
you encrypt the session data it's still not a good practice to store
sensitive business data at the client-side.
Server-Side Persistence Approach
In this approach we use the client-side to store only the
client ID (using cookies or hidden fields); the session data
associated with this ID is stored on the server. When cookies are
turned off in the browser, we can use URL rewriting, through the
encodeURL() method that's defined in the HttpServletResponse class,
to send the client ID to the browser. Because only the ID travels to
the client, this approach avoids the problems mentioned earlier in
the client-side approach.
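The server-side approach can be sketched as a lookup table keyed by the client ID. ServerSessionStore is a hypothetical class that keeps the data in memory for brevity; a real persistence store would write it to a database or similar.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model of a server-side session store.
class ServerSessionStore {
    // Client ID -> that client's session attributes.
    private final Map<String, Map<String, Object>> store = new ConcurrentHashMap<>();

    // Creates an empty session and returns the ID to hand to the client.
    String create() {
        String id = UUID.randomUUID().toString();
        store.put(id, new ConcurrentHashMap<>());
        return id;
    }

    // Returns the session attributes, or null if the ID is unknown.
    Map<String, Object> lookup(String clientId) {
        return store.get(clientId);
    }

    public static void main(String[] args) {
        ServerSessionStore sessions = new ServerSessionStore();
        String id = sessions.create();
        sessions.lookup(id).put("itemCount", 3); // any object, not just strings
        System.out.println(sessions.lookup(id).get("itemCount"));
    }
}
```

Only the ID returned by create() goes into the cookie, hidden field, or rewritten URL; the attributes never leave the server.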
Problems with the Servlet/JSP Container Programming Model
In this section we'll explore the inherent problems
associated with the Servlet/JSP programming model. As explained
earlier, handling multiple HTTP requests for the same Servlet/JSP is
a complex task. The Servlet/JSP container creates a new thread to
handle each request, so all the threads share the same instance
variables in the Servlet/JSP. This makes it unsafe to store
client-specific information in the instance variables of the
Servlet/JSP. The only option is to store the per-client data in the
HttpSession object, where it can be retrieved or removed as
necessary. But adding and removing data in the HttpSession object
from any Servlet/JSP will create unmanageable spaghetti code.
Similarly, accessing the HttpServletRequest and HttpServletResponse
objects directly from the Servlet/JSP will also create unmanageable
code. So we need a mechanism that helps us use the HttpSession,
HttpServletRequest, and HttpServletResponse objects in a more
controlled and manageable way.
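The instance-variable problem can be shown without any threads by replaying the order in which two interleaved requests would touch a shared field. GreetingHandler is a hypothetical class standing in for a single Servlet instance, and the per-session map stands in for HttpSession.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for one Servlet instance shared by all request threads.
class GreetingHandler {
    // UNSAFE: one field shared by every client that calls this instance.
    private String currentUser;
    // Safer: data kept per session ID, the way HttpSession attributes are.
    private final Map<String, String> userBySession = new ConcurrentHashMap<>();

    void unsafeLogin(String user) {
        currentUser = user;
    }

    String unsafeGreet() {
        return "Hello, " + currentUser;
    }

    void login(String sessionId, String user) {
        userBySession.put(sessionId, user);
    }

    String greet(String sessionId) {
        return "Hello, " + userBySession.get(sessionId);
    }

    public static void main(String[] args) {
        GreetingHandler handler = new GreetingHandler();
        handler.unsafeLogin("ann"); // request from the first browser
        handler.unsafeLogin("bob"); // second browser arrives before the first finishes
        System.out.println(handler.unsafeGreet()); // the first browser sees the wrong name
    }
}
```

With the per-session methods, each browser's data is looked up by its own ID, so one request cannot overwrite another's state.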
When developing Web applications, we need to map the browser
requests to the appropriate Servlet/JSP. The easiest way to map the
browser actions to the back-end Servlet/JSP is to use a query
parameter within the HTML forms or URL links and have an if-then-else
statement to find the appropriate Servlet/JSP. Although this approach
is simple, it creates a maintenance nightmare for Servlet/JSP
developers. If we have to support new Servlet/JSPs, we need to change
the source code, which sends us back through the development life
cycle of compile, test, debug, and deploy.
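One common alternative to the if-then-else chain is a lookup table from the query parameter to a handler. The sketch below illustrates that general idea and is not a description of how Jlink does it: ActionDispatcher and the "action" parameter name are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical dispatcher: maps an "action" parameter to a handler function.
class ActionDispatcher {
    private final Map<String, Function<Map<String, String>, String>> handlers =
            new HashMap<>();

    // Adding a new action is a registration call, not a new if-then-else branch.
    void register(String action, Function<Map<String, String>, String> handler) {
        handlers.put(action, handler);
    }

    // Looks up the handler named by the "action" request parameter and runs it.
    String dispatch(Map<String, String> params) {
        Function<Map<String, String>, String> handler =
                handlers.get(params.get("action"));
        if (handler == null) {
            return "error: unknown action";
        }
        return handler.apply(params);
    }

    public static void main(String[] args) {
        ActionDispatcher dispatcher = new ActionDispatcher();
        dispatcher.register("login", req -> "welcome " + req.get("user"));
        Map<String, String> request = new HashMap<>();
        request.put("action", "login");
        request.put("user", "ann");
        System.out.println(dispatcher.dispatch(request));
    }
}
```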
Another significant problem with developing Web applications
is managing the changes in Web resources such as HTML, Servlet, and
JSP. During development and production time, these resources change
names, directories, paths, and more. If we hard-code these names and
paths in our programs, for each and every change we have to go
through the compile, debug, test, and deploy cycles. An alternative
is to store the names and paths in a text file and access them from
the programs. If there's any change in the Web resources, we can edit
the text file and simply restart the Web application.
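Keeping names and paths in a text file can be sketched with java.util.Properties. ResourceMap and the keys shown are hypothetical; the text is loaded from a string here to keep the example self-contained, whereas an application would read it from a file at startup.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical lookup of Web resource paths by logical name.
class ResourceMap {
    private final Properties paths = new Properties();

    // Parses "name=path" lines in the standard properties format.
    static ResourceMap fromText(String text) {
        ResourceMap map = new ResourceMap();
        try {
            map.paths.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return map;
    }

    // Code refers to the logical name; only the text file knows the real path.
    String pathFor(String logicalName) {
        String path = paths.getProperty(logicalName);
        if (path == null) {
            throw new IllegalArgumentException("No path configured for " + logicalName);
        }
        return path;
    }

    public static void main(String[] args) {
        ResourceMap map = fromText("login.page=/jsp/login.jsp\ncart.page=/jsp/cart.jsp\n");
        System.out.println(map.pathFor("login.page"));
    }
}
```

Moving login.jsp to another directory then means editing one line of text rather than recompiling.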
High-Traffic Web Sites
The standard JSDK implementation stores the HttpSession
object in the memory of the JVM where it was created so an
HttpSession can be retrieved efficiently. Though this approach works
for smaller Web sites or sites that don't need session information
from one request to another, this approach doesn't scale well in
larger, high-traffic e-business sites.
In those sites where multiple Web servers are used to serve
Web pages, a load balancer (either hardware or software) is used to
distribute the HTTP traffic to different Web servers. When the load
balancer assigns a Web server to a particular HTTP request that comes
from a browser, all subsequent requests from that browser are
assigned to the same Web server. The load balancer accomplishes this
by examining the IP address of the incoming packets and assigning a
particular Web server to that packet, as well as assigning the same
Web server to all the packets that have the same IP address. This
method of assigning the packets with the same IP address to the same
Web server is called server affinity. Server affinity may not be
guaranteed, however, because many companies now assign random IP
addresses to outgoing packets. In this case the load balancer can't
assign the same Web server to packets that come from the same
browser, and we run into problems maintaining the HttpSession
objects. Because the reference implementation of JSDK keeps each
HttpSession object in the memory of the JVM in which it was created,
when the load balancer assigns the packets from the same browser to
different Web servers, the HttpSession object created for the first
packet is lost to subsequent packets. To avoid
this problem we need to store the HttpSession object in the
persistence store, which must be accessible to all Web servers, so
the right HttpSession object can be retrieved using the session ID
stored in the cookie.
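The difference between per-server memory and a shared store can be modeled in a few lines. Cluster is a hypothetical class: the list of maps stands in for the JVM memory of each Web server, a single map stands in for the shared persistence store, and the routing rule (hashing the IP address) is just one simple way a load balancer might implement server affinity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of several Web servers behind a load balancer.
class Cluster {
    // One in-memory session map per Web server (per JVM).
    private final List<Map<String, String>> localSessions = new ArrayList<>();
    // One store reachable from every Web server.
    private final Map<String, String> sharedStore = new HashMap<>();

    Cluster(int serverCount) {
        for (int i = 0; i < serverCount; i++) {
            localSessions.add(new HashMap<>());
        }
    }

    // Server affinity: the same IP address always maps to the same server index.
    int route(String clientIp) {
        return Math.floorMod(clientIp.hashCode(), localSessions.size());
    }

    void saveLocal(int server, String sessionId, String data) {
        localSessions.get(server).put(sessionId, data);
    }

    // Returns null when the session lives in a different server's memory.
    String loadLocal(int server, String sessionId) {
        return localSessions.get(server).get(sessionId);
    }

    void saveShared(String sessionId, String data) {
        sharedStore.put(sessionId, data);
    }

    String loadShared(String sessionId) {
        return sharedStore.get(sessionId);
    }

    public static void main(String[] args) {
        Cluster cluster = new Cluster(2);
        cluster.saveLocal(0, "S1", "cart=book");
        System.out.println("found on server 1: " + (cluster.loadLocal(1, "S1") != null));
        cluster.saveShared("S1", "cart=book");
        System.out.println("found in shared store: " + (cluster.loadShared("S1") != null));
    }
}
```

When the client's IP address changes between requests, route() may pick a different server, and only the shared store can still return the session for the ID carried in the cookie.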
Summary
In Part 1 we described the inherent problems associated with
J2EE's Servlet/JSP container and how they affect the creation of
enterprise-wide, scalable Web architecture. In Part 2, we'll look at
Jlink's architecture and how it solves these problems.
References
- Sun's J2EE Blueprint: http://java.sun.com/j2ee/blueprints
- Buschmann, F., et al. (1996). Pattern-Oriented Software
Architecture. John Wiley & Sons.
- Fields, D.K., and Kolb, M.A. (2000). Web Development with
JavaServer Pages. Manning Publications.
- Servlet/JSP API:
http://java.sun.com/products/servlet/2.2/javadoc/index.html