Sep 8, 2009

WAS Scalability and Performance

Overview and key concepts
This chapter provides a conceptual overview of the goals and issues associated with scaling applications in the WebSphere environment. It also presents the various techniques that can be used to implement scaling. A short section highlights the performance improvements of WebSphere Application Server V6 over previous versions (see 1.5, “Performance improvements over previous versions” on page 31). A prerequisite for all WebSphere scalability and workload management concepts discussed in this redbook is the use of IBM WebSphere Application Server Network Deployment V6 or IBM WebSphere Business Integration Server Foundation V6. We assume the use of IBM WebSphere Application Server Network Deployment throughout this book.
WebSphere Application Server V6 Scalability and Performance
1.1 Objectives
As outlined by the product documentation, a typical minimal configuration for a WebSphere Application Server V6 installation is illustrated in Figure 1-1. There is a single Web (HTTP) server and a single application server, and both are co-resident on a single machine. Please note that such a configuration is not recommended for security reasons; there is no DMZ established. This is true for both internal (intranet) and external (Internet) applications. Naturally, the performance characteristics of this setup are limited by the power of the machine on which it is running, by various constraints inherent in the configuration itself, and by the implementation of WebSphere Application Server.

We are interested in defining system configurations that exhibit the following properties:
 Scalability
 Workload management
 Availability
 Maintainability
 Session management
 Performance impacts of WebSphere Application Server security
Figure 1-1 (diagram labels: Web client, HTTP Web server, WebSphere Application Server with servlet, EJB and admin service, configuration repository (XML files), application database)
Note that scaling the application server environment does not help if your
application has an unscalable design. For Web application and EJB development
best practices, refer to Chapter 16, “Application development: best practices for
application design, performance and scalability” on page 895 and to the white
paper WebSphere Application Server Development Best Practices for
Performance and Scalability, found at:
http://www.ibm.com/software/webservers/appserv/ws_bestpractices.pdf
1.1.1 Scalability
Scalability defines how easily a site will expand. Web sites must expand,
sometimes with little warning, and grow to support an increased load. The
increased load may come from many sources:
 New markets
 Normal growth
 Extreme peaks
An application and infrastructure that is architected for good scalability makes site growth possible and easy.
Most often, one achieves scalability by adding hardware resources to improve throughput. A more complex configuration, employing additional hardware, should allow one to service a higher client load than the simple basic configuration. Ideally, it should be possible to service any given load simply by adding servers or machines (or upgrading existing resources).
However, adding new systems or processing power does not always provide a linear increase in throughput. For example, doubling the number of processors in your system will not necessarily result in twice the processing capacity. Nor will adding a horizontal server in the Application Server tier necessarily result in twice the request serving capacity. Additional resources introduce additional overhead for resource management and request distribution. While the overhead and corresponding degradation may be small, remember that adding n machines does not always result in n times the throughput.
Also, you should not simply add hardware without first doing some investigation and possibly some software tuning to identify potential bottlenecks in your application or in any other performance-related software configuration. Adding more hardware may not improve performance if the software is badly designed or not tuned correctly. Once the software has been optimized, consider hardware resources as the next step for improving performance. See also section 14.2, “Scalability” of the redbook IBM WebSphere V6 Planning and Design Handbook, SG24-6446 for information about this topic.
There are two different ways to improve performance when adding hardware:
vertical scaling and horizontal scaling. See “Scalability” on page 21 for more
information about these concepts.
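The point that n machines do not yield n times the throughput can be made concrete with a small sketch. The model below is purely illustrative and is not taken from WebSphere: it assumes that every additional server costs a fixed fraction (`overhead`, a hypothetical parameter) of coordination work for resource management and request distribution.

```python
def estimated_throughput(servers: int, single_server_rps: float, overhead: float) -> float:
    """Estimate cluster throughput under a simple, assumed contention model.

    Hypothetical model (not a WebSphere formula): each extra server adds a
    fixed fraction `overhead` of coordination cost, so capacity grows
    sublinearly with the number of servers.
    """
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return servers * single_server_rps / (1.0 + overhead * (servers - 1))


if __name__ == "__main__":
    # Compare the estimate with ideal linear scaling for a few cluster sizes.
    for n in (1, 2, 4, 8):
        ideal = n * 100.0
        estimate = estimated_throughput(n, 100.0, 0.05)
        print(f"{n} servers: ideal {ideal:.0f} req/s, estimated {estimate:.1f} req/s")
```

With an assumed overhead of 5 percent, two servers that each handle 100 requests per second are estimated at roughly 190 requests per second rather than 200. Real overhead has to be measured, not assumed.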
Performance
Performance involves minimizing the response time for a given request or transaction.
The response time is the time from the moment the user initiates a request at the browser to the moment the resulting HTML page returns to the browser. At one time, there was an unwritten rule of the Internet known as the “8 second rule.” This rule stated that any page that did not respond within eight seconds would be abandoned. Many enterprises still use this as the response time benchmark threshold for Web applications.

Throughput
Throughput, while related to performance, more precisely defines the number of concurrent transactions that can be accommodated. For example, if an application can handle 10 customer requests simultaneously and each request takes one second to process, this site has a potential
throughput of 10 requests per second.
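The arithmetic behind this example can be written as a one-line calculation. The function name below is hypothetical; the sketch simply divides the number of requests processed at the same time by the time each request takes.

```python
def potential_throughput(concurrent_requests: int, seconds_per_request: float) -> float:
    """Return the potential throughput in requests per second.

    Assumes every request takes the same time and that all concurrent
    slots are kept busy, which is an idealization.
    """
    if seconds_per_request <= 0:
        raise ValueError("seconds_per_request must be positive")
    return concurrent_requests / seconds_per_request


if __name__ == "__main__":
    # The example from the text: 10 simultaneous requests, one second each.
    print(potential_throughput(10, 1.0))
```

Under the same assumptions, halving the processing time to 0.5 seconds would double the potential throughput to 20 requests per second.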

1.1.2 Workload management
Workload management (WLM) is a WebSphere facility to provide load balancing and affinity between application servers in a WebSphere clustered environment. Workload management can be an important facet of performance. WebSphere uses workload management to send requests to alternate members of the cluster. WebSphere also routes concurrent requests from a user to the application server that serviced the first request, because the state of EJB calls and the session state are held in the memory of that application server. The proposed configuration should ensure that each machine or server in the configuration processes a fair share of the overall client load that is being processed by the system as a whole. In other words, it is not efficient to have one machine overloaded while another machine is mostly idle. If all machines have roughly the same capacity (for example, CPU power), each should process a roughly equal share of the load. Otherwise, there likely needs to be a provision for workload to be distributed in proportion to the processing power available on each machine.
Furthermore, if the total load changes over time, the system should automatically adapt itself; for example, all machines may use 50% of their capacity, or all machines may use 100% of their capacity. It should not happen that one machine uses 100% of its capacity while the rest use 15% of their capacity.
In this redbook, we discuss both Web server load balancing using WebSphere
Edge Components and WebSphere workload management techniques.
1.1.3 Availability
Also known as resiliency, availability is the description of the system’s ability to
respond to requests no matter the circumstances. Availability requires that the
topology provide some degree of process redundancy in order to eliminate single
points of failure. While vertical scalability can provide this by creating multiple
processes, the physical machine then becomes a single point of failure. For this
reason, a high availability topology typically involves horizontal scaling across
multiple machines.
Hardware-based high availability
Using a WebSphere Application Server multiple machine configuration
eliminates a given application server process as a single point of failure. In
WebSphere Application Server V5.0 and higher, the removal of the application
dependencies on the administrative server process for security, naming and
transactions further reduces the potential that a single process failure can disrupt
processing on a given node. In fact, the only single point of failure in a
WebSphere cell is the Deployment Manager, where all central administration is
performed. However, a failure at the Deployment Manager only impacts the
ability to change the cell configuration and to run the Tivoli Performance Viewer
which is now included in the Administrative Console; application servers are
more self-sufficient in WebSphere Application Server V5.0 and higher compared
to WebSphere Application Server V4.x. A number of alternatives exist to provide
high availability for the Deployment Manager, including the possible use of an
external high availability solution, though the minimal impact of a Deployment
Manager outage typically does not require the use of such a solution. In many
cases, manual solutions such as the one outlined in the article “Implementing a
Highly Available Infrastructure for WebSphere Application Server Network
Deployment, Version 5.0 without Clustering” provide suitable availability. This
article is available at
http://www-128.ibm.com/developerworks/websphere/library/techarticles/0304_alcott/alcott.html
See the redbook WebSphere Application Server Network Deployment V6: High
availability solutions, SG24-6688 for further details on external HA solutions.
Failover
The proposition to have multiple servers (potentially on multiple independent
machines) naturally leads to the potential for the system to provide failover. That
is, if any one machine or server in the system were to fail for any reason, the
system should continue to operate with the remaining servers. The load
balancing property should ensure that the client load gets redistributed to the
remaining servers, each of which will take on a proportionately slightly higher
percentage of the total load. Of course, such an arrangement assumes that the
system is designed with some degree of overcapacity, so that the remaining
servers are indeed sufficient to process the total expected client load.
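The overcapacity requirement can be checked with simple arithmetic. The following sketch uses hypothetical function names and assumes that the load is redistributed evenly across the remaining servers and that all servers have the same capacity.

```python
def load_per_server(total_load: float, servers: int, failed: int = 0) -> float:
    """Return the load each remaining server carries after `failed` servers drop out.

    Assumes the load balancer spreads the total load evenly.
    """
    remaining = servers - failed
    if remaining < 1:
        raise ValueError("at least one server must remain available")
    return total_load / remaining


def survives_failure(total_load: float, per_server_capacity: float,
                     servers: int, failed: int = 1) -> bool:
    """Check whether the remaining servers can still carry the total load."""
    return load_per_server(total_load, servers, failed) <= per_server_capacity


if __name__ == "__main__":
    # Four servers with a capacity of 100 requests/s each, total load 300 requests/s.
    print(load_per_server(300.0, 4))            # load per server in normal operation
    print(load_per_server(300.0, 4, failed=1))  # load per server after one failure
    print(survives_failure(300.0, 100.0, 4))    # is there enough overcapacity?
```

In this example each of the four servers normally carries 75 requests per second. After one failure the three remaining servers carry 100 each, which is exactly their assumed capacity, so any higher total load would require more headroom.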
Ideally, the failover aspect should be totally transparent to clients of the system.
When a server fails, any client that is currently interacting with that server should
be automatically redirected to one of the remaining servers, without any
interruption of service and without requiring any special action on the part of that
client. In practice, however, most failover solutions may not be completely
transparent. For example, a client that is currently in the middle of an operation
when a server fails may receive an error from that operation, and may be
required to retry (at which point the client would be connected to another, still
available server). Or the client may observe a pause or delay in processing,
before the processing of its requests resumes automatically with a different
server. The important point in failover is that each client, and the set of clients as
a whole, is able to eventually continue to take advantage of the system and
receive service, even if some of the servers fail and become unavailable.
Conversely, when a previously failed server is repaired and again becomes
available, the system may transparently start using that server again to process a
portion of the total client load.
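The retry behavior described above, where a client receives an error and is then connected to another available server, can be sketched as follows. This is a minimal illustration, not how WebSphere implements failover: the function name, the server names, and the use of `ConnectionError` to signal a failed server are all assumptions.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def call_with_failover(servers: Sequence[str], operation: Callable[[str], T]) -> T:
    """Try the operation on each server in turn until one succeeds.

    A ConnectionError from the operation is treated as "this server is
    unavailable" (an assumption made for this sketch).
    """
    last_error: Optional[Exception] = None
    for server in servers:
        try:
            return operation(server)
        except ConnectionError as error:
            last_error = error  # remember the failure and try the next server
    raise ConnectionError("all servers failed") from last_error


if __name__ == "__main__":
    def handle(server: str) -> str:
        # Hypothetical operation: server1 is down, every other server answers.
        if server == "server1":
            raise ConnectionError("server1 is down")
        return "handled by " + server

    print(call_with_failover(["server1", "server2"], handle))
```

The client sees a short delay while the first attempt fails, then receives service from the second server, which matches the "not completely transparent" failover described in the text.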
The failover aspect is also sometimes called fault tolerance, in that it allows the
system to survive a variety of failures or faults. It should be noted, however, that
failover is only one technique in the much broader field of fault tolerance, and that
no such technique can make a system 100 percent safe against every possible
failure. The goal is to minimize the probability of system failure, but keep
in mind that the possibility of system failure cannot be completely eliminated.
Note that in the context of discussions on failover, the term server most often
refers to a physical machine (which is typically the type of component that fails).
However, we will see that WebSphere also allows for the possibility of one server
process on a given machine to fail independently, while other processes on that
same machine continue to operate normally.
HAManager
WebSphere V6 introduces a new concept for advanced failover and thus higher
availability, called the High Availability Manager (HAManager). The HAManager
enhances the availability of WebSphere singleton services like transaction
services or JMS message services. It runs as a service within each application
server process that monitors the health of WebSphere clusters. In the event of a
server failure, the HAManager fails over the singleton service and recovers any
in-flight transactions. Please see Chapter 9, “WebSphere HAManager” on
page 465 for details.
1.1.4 Maintainability
Maintainability is the ability to keep the system running before, during, and after
scheduled maintenance. When considering maintainability in performance and
scalability, remember that maintenance periodically needs to be performed on
hardware components, operating systems, and software products in addition to
the application components.
While maintainability is somewhat related to availability, there are specific issues
that need to be considered when deploying a topology that is maintainable. In
fact, some maintainability factors are at cross purposes to availability. For
instance, ease of maintainability would dictate that one should minimize the
number of application server instances in order to facilitate online software
upgrades. Taken to the extreme, this would result in a single application server
instance, which of course would not provide a high availability solution. In many
cases, it is also possible that a single application server instance would not
provide the required throughput or performance.
Some of the maintainability aspects that we consider are:
 Dynamic changes to configuration
 Mixed configuration
 Fault isolation
Dynamic changes to configuration
In certain configurations, it may be possible to modify the configuration on the fly
without interrupting the operation of the system and its service to clients. For
example, it may be possible to add or remove servers to adjust to variations in
the total client load. Or it may be possible to temporarily stop one server to
change some operational or tuning parameters, then restart it and continue to
serve client requests. Such characteristics, when possible, are highly desirable,
since they enhance the overall manageability and flexibility of the system.
Mixed configuration
In some configurations, it may be possible to mix multiple versions of a server or
application, so as to provide for staged deployment and a smooth upgrade of the
overall system from one software or hardware version to another. Coupled with
the ability to make dynamic changes to the configuration, this property may be
used to effect upgrades without any interruption of service.
Fault isolation
In the simplest application of failover, we are only concerned with clean failures of
an individual server, in which a server simply ceases to function completely, but
this failure has no effect on the health of other servers. However, there are
sometimes situations where one malfunctioning server may in turn create
problems for the operation of other, otherwise healthy servers. For example, one
malfunctioning server may hoard system resources, or database resources, or
hold critical shared objects for extended periods of time, and prevent other
servers from getting their fair share of access to these resources or objects. In
this context, some configurations may provide a degree of fault isolation, in that
they reduce the potential for the failure of one server to affect other servers.
1.1.5 Session state
Unless you have only a single application server or your application is completely
stateless, maintaining state between HTTP client requests is also a factor in
determining your configuration. Using session information, however, means walking
a fine line between convenience for the developer and performance and scalability of
the system. It is not practical to eliminate session data altogether, but care
should be taken to minimize the amount of session data passed. Persistence
mechanisms decrease the capacity of the overall system, or incur additional
costs to increase the capacity or even the number of servers. Therefore, when
designing your WebSphere environment, you need to take session needs into
account as early as possible.
In WebSphere V6, there are two methods for sharing of sessions between
multiple application server processes (cluster members). One method is to
persist the session to a database. An alternate approach is to use
memory-to-memory session replication functionality, which was added to
WebSphere V5 and is implemented using WebSphere internal messaging. The
memory-to-memory replication (sometimes also referred to as “in-memory
replication”) eliminates a single point of failure found in the session database (if
the database itself has not been made highly available using clustering
software). See 1.4.1, “HTTP sessions and the session management facility” on
page 25 for additional information.
Tip: Refer to the redbook A Practical Guide to DB2 UDB Data Replication V8,
SG24-6828, for details about DB2 replication.
1.1.6 Performance impact of WebSphere Application Server security
The potential to distribute the processing responsibilities between multiple
servers and, in particular, multiple machines, also introduces a number of
opportunities for meeting special security constraints in the system. Different
servers or types of servers may be assigned to manipulate different classes of
data or perform different types of operations. The interactions between the
various servers may be controlled, for example through the use of firewalls, to
prevent undesired accesses to data.
SSL
Building up SSL communication causes extra HTTP requests and responses
between the machines and every SSL message is encrypted on one side and
decrypted on the other side.
SSL at the front end or Web tier is a common implementation. This allows for
secured purchases, secure viewing of bank records, etc. SSL handshaking,
however, is expensive from a performance perspective. The Web server is
responsible for encrypting data sent to the user, and decrypting the request from
the user. If a site will be operated using SSL more often than not and the server
is operating at close to maximum CPU, then using some form of SSL accelerator
hardware in the server, or even an external device that offloads all SSL traffic
prior to communication with the Web server, may help improve performance.
Performance related security settings
The following settings can help to fine-tune the security-related configurations to
enhance performance:
 Security cache timeout
This setting determines how long WebSphere should cache information related
to permission and security credentials. When the cache timeout expires, all
cached information becomes invalid. Subsequent requests for the information
result in a database lookup. Sometimes, acquiring the information requires
invoking an LDAP-bind or native authentication, both of which are relatively
costly operations in terms of performance.
 HTTP session timeout
This parameter specifies how long a session will be considered active when it
is unused. After the timeout, the session expires and another session object
will be created. With high-volume Web sites, this may influence the
performance of the server.
 Registry and database performance
Databases and registries used by an application influence the WebSphere
Application Server performance. In a distributed environment, it is highly
recommended that you put the authorization server onto a separate machine
in order to offload application processing. When planning for a secured
environment, special attention should be given to the sizing and
implementation of the directory server. It is a best practice to first tune the
directory server and make sure that it performs well before looking at the
WebSphere environment. Also, make sure that this directory does not
introduce a new single point of failure.
Note: WebSphere security is out of the scope of this redbook. However, an
entire redbook is dedicated to this topic. See WebSphere Application Server
V6: Security Handbook, SG24-6316 for details.
1.2 WebSphere Application Server architecture
This section introduces the major components within WebSphere Application
Server. Refer to the Redpaper entitled WebSphere Application Server V6
architecture, REDP3918 and to the redbook WebSphere Application Server V6
System Management and Configuration Handbook, SG24-6451 for details about
the WebSphere Application Server architecture and components.
1.2.1 WebSphere Application Server Network Deployment components
The following is a short introduction to each WebSphere Network Deployment
runtime component and its function:
 Deployment Manager
The Deployment Manager is part of an IBM WebSphere Application Server
Network Deployment V5.x and higher installation. It provides a central point of
administrative control for all elements in a distributed WebSphere cell. The
Deployment Manager is a required component for any vertical or horizontal
scaling, and only one instance of it exists in a cell at a time.
 Node Agent
The Node Agent communicates directly with the Deployment Manager, and is
used for configuration synchronization and administrative tasks such as
stopping/starting of individual application servers and performance
monitoring on the application server node.
 Application server (instance)
An application server in WebSphere is the process that is used to run your
servlet and/or EJB-based applications, providing both Web container and EJB
container.
 Web server and Web server plug-in
While the Web server is not strictly part of the WebSphere runtime,
WebSphere communicates via a plug-in with your Web server of choice: IBM
HTTP Server powered by Apache, Lotus Domino, Microsoft® Internet Information
Server, Apache, Sun ONE Web server/Sun Java System Web Server, and Covalent
Enterprise Ready Server. This plug-in communicates requests
from the Web server to the WebSphere runtime. In WebSphere Application
Server V6, the Web server and plug-in XML file can also be managed from
the Deployment Manager. For more information about this new function, see
3.4, “Web server topology in a Network Deployment cell” on page 78.
 WebContainer Inbound Chain (formerly embedded HTTP transport)
The WebContainer Inbound Chain is a service of the Web container. An
HTTP client connects to a Web server and the HTTP plug-in forwards the
requests to the application server through the WebContainer Inbound Chain.
The communication type is either HTTP or HTTPS.
Although the WebContainer Inbound Chain is available in any WebSphere
Application Server to allow testing or development of the application
functionality prior to deployment, it should not be used for production
environments. See 6.2.1, “WebContainer Inbound Chains” on page 231 for
more details.
 Default messaging provider
An embedded JMS server was introduced with WebSphere Application
Server V5 but has been totally rewritten for WebSphere Application Server V6
and is now called the default messaging provider. This infrastructure supports
both message-oriented and service-oriented applications, and is fully
integrated within WebSphere Application Server V6 (all versions). The default
messaging provider is a JMS 1.1 compliant JMS provider that supports
point-to-point and publish/subscribe styles of messaging.
 Administrative service
The administrative service runs within each WebSphere Application Server
Java virtual machine (JVM). Depending on the version installed, there can be
administrative services installed in many locations. In WebSphere Application
Server V6 and in WebSphere Application Server - Express V6, the
administrative service runs in each application server. In a Network
Deployment configuration, the Deployment Manager, Node Agent, and
application server(s), all host an administrative service. This service provides
the ability to update configuration data for the application server and its
related components.
 Administrative Console
The WebSphere Administrative Console provides an easy-to-use, graphical
“window” to a WebSphere cell and runs in a browser. In a Network
Deployment environment, it connects directly to the Deployment Manager,
which allows management of all nodes in the cell.
 Configuration repository
The configuration repository stores the configuration for all WebSphere
Application Server components inside a WebSphere cell. The Deployment
Manager communicates changes in configuration and runtime state to all cell
members through the administrative service and the Node Agent. Each
application server maintains a subset of the cell’s configuration repository,
which holds this application server’s configuration information.
 Administrative cell
A cell is a set of application servers managed by the same Deployment
Manager. Some application server instances may reside on the same node,
others on different nodes. Cell members do not necessarily run the same
application.
 Server cluster and cluster members
A server cluster is a set of application server processes (the cluster
members). They run the same application, but usually on different nodes. The
main purpose of clusters is load balancing and failover.
Figure 1-2 on page 15 shows the relationship between these components.
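The containment relationships described in the component list above (a cell contains application servers on nodes, and a cluster groups some of those servers) can be sketched as a small data model. The class and field names are hypothetical and exist only for illustration; they are not part of any WebSphere API.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class AppServer:
    name: str
    node: str  # name of the node that hosts this application server


@dataclass
class Cluster:
    name: str
    members: List[AppServer] = field(default_factory=list)

    def nodes(self) -> Set[str]:
        """Return the nodes that host at least one cluster member."""
        return {member.node for member in self.members}


@dataclass
class Cell:
    name: str
    servers: List[AppServer] = field(default_factory=list)
    clusters: List[Cluster] = field(default_factory=list)

    def unclustered_servers(self) -> List[AppServer]:
        """Return cell members that do not belong to any cluster."""
        clustered = {id(m) for c in self.clusters for m in c.members}
        return [s for s in self.servers if id(s) not in clustered]


if __name__ == "__main__":
    a1 = AppServer("server1", node="nodeA")
    b1 = AppServer("server2", node="nodeB")
    standalone = AppServer("server3", node="nodeA")
    cluster = Cluster("cluster1", members=[a1, b1])
    cell = Cell("cell1", servers=[a1, b1, standalone], clusters=[cluster])
    print(sorted(cluster.nodes()))
    print([s.name for s in cell.unclustered_servers()])
```

The sketch reflects two statements from the list: cluster members usually run on different nodes, and cell members do not all have to belong to a cluster or run the same application.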
Figure 1-2 WebSphere configuration with Java clients and Web clients
(Diagram labels: cell; Deployment Manager with admin application, admin service, name server (JNDI) and master repository (file); node with Node Agent and admin service; cluster of two application servers, each with Web container, WebContainer Inbound channel, EJB container and config repository (subset); HTTP server with WebSphere plug-in; Web browser client over HTTP(S); Java client with client container over RMI/IIOP; admin UI over HTTP(S); scripting client over SOAP or RMI/IIOP; application database.)
1.2.2 Web clients
The basic configuration shown in Figure 1-1 on page 4 refers to a particular type
of client for WebSphere characterized as a Web client. Such a client uses a Web
browser to interact with the WebSphere Application Servers, through the
intermediary of a Web server (and plug-in), which in turn invokes a servlet within
WebSphere (you can also access static HTML pages).
1.2.3 Java clients
Although Web clients constitute the majority of users of WebSphere today,
WebSphere also supports another type of client characterized as a Java client.
Such a client is a stand-alone Java program (GUI-based or not), which uses the
Java RMI/IIOP facilities to make direct method invocations on various EJB
objects within an application server, without going through an intervening Web
server and servlet, as shown in Figure 1-2 on page 15.
1.3 Workload management
Workload Management (WLM) is the process of spreading multiple requests for
work over the resources that can do the work. It optimizes the distribution of
processing tasks in the WebSphere Application Server environment. Incoming
work requests are distributed to the application servers and other objects that
can most effectively process the requests.
Workload management is also a procedure for improving performance,
scalability, and reliability of an application. It provides failover when servers are
not available.
Workload management is most effective when the deployment topology is
comprised of application servers on multiple machines, since such a topology
provides both failover and improved scalability. It can also be used to improve
scalability in topologies where a system is comprised of multiple servers on a
single, high-capacity machine. In either case, it enables the system to make the
most effective use of the available computing resources.
Workload management provides the following benefits to WebSphere
applications:
 It balances client processing requests, allowing incoming work requests to be
distributed according to a configured WLM selection policy.
 It provides failover capability by redirecting client requests to a running server
when one or more servers are unavailable. This improves the availability of
applications and administrative services.
 It enables systems to be scaled up to serve a higher client load than provided
by the basic configuration. With clusters and cluster members, additional
instances of servers can easily be added to the configuration. See 1.3.3,
“Workload management using WebSphere clustering” on page 19 for details.
 It enables servers to be transparently maintained and upgraded while
applications remain available for users.
 It centralizes administration of application servers and other objects.
Two types of requests can be workload-managed in IBM WebSphere Application
Server Network Deployment V6:
 HTTP requests can be distributed across multiple Web containers, as
described in 1.3.2, “Plug-in workload management” on page 17. We refer to
this as Plug-in WLM.
 EJB requests can be distributed across multiple EJB containers, as described
in 1.3.4, “Enterprise Java Services workload management” on page 23. We
refer to this as EJS WLM.
1.3.1 Web server workload management
If your environment uses multiple Web servers, a mechanism is needed to allow
servers to share the load. The HTTP traffic must be spread among a group of
servers. These servers must appear as one server to the Web client (browser),
making up a cluster, which is a group of independent nodes interconnected and
working together as a single system. This cluster concept should not be
confused with a WebSphere cluster as described in 1.3.3, “Workload
management using WebSphere clustering” on page 19.
As shown in Figure 1-3, a load balancing mechanism called IP spraying can be
used to intercept the HTTP requests and redirect them to the appropriate
machine on the cluster, providing scalability, load balancing, and failover.
See Chapter 4, “Introduction to WebSphere Edge Components” on page 99 for
details.
Figure 1-3 Web server workload management
(Diagram labels: Web client, HTTP requests, caching proxy, IP sprayer, two HTTP servers each with a plug-in.)
1.3.2 Plug-in workload management
This is a short introduction to plug-in workload management, where servlet
requests are distributed to the Web container in clustered application servers, as
shown in Figure 1-4 on page 18. This configuration is also referred to as the
servlet clustering architecture.
Figure 1-4 Plug-in (Web container) workload management
Clustering application servers that host Web containers automatically enables
plug-in workload management for the application servers and the servlets they
host. In the simplest case, the cluster is configured on a single machine, where
the Web server process also runs.
Routing of servlet requests occurs between the Web server plug-in and the
clustered application servers using HTTP or HTTPS. This routing is based purely
on the weights associated with the cluster members. If all cluster members have
identical weights, the plug-in sends an equal number of requests to all members
of the cluster, when assuming no session affinity. If the weights are scaled in the
range from zero to 20, the plug-in routes requests to those cluster members with
the higher weight value more often. A rule of thumb formula for determining
routing preference would be:
% routed to Server1 = weight1 / (weight1+weight2+...+weightn)
where n is the number of cluster members in the cluster. The Web server plug-in
routes requests around cluster members that are not available.
See Chapter 6, “Plug-in workload management and failover” on page 227 for
details.
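The rule of thumb above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function name routing_share and the member names are our own assumptions, and the code models just the weight ratio, not the plug-in's actual routing algorithm or any session affinity.

```python
from fractions import Fraction


def routing_share(weights):
    """Return the expected fraction of requests for each cluster member.

    `weights` maps a (hypothetical) member name to its configured weight.
    This applies only the rule-of-thumb ratio weight_i / sum(weights).
    """
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one cluster member needs a non-zero weight")
    return {name: Fraction(w, total) for name, w in weights.items()}


# Three hypothetical members with weights 2, 3, and 5.
shares = routing_share({"Server1": 2, "Server2": 3, "Server3": 5})
for name, share in shares.items():
    print(f"{name}: {float(share):.0%}")
```

With these example weights, Server1 is expected to receive one fifth of the requests, Server2 three tenths, and Server3 one half.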
1.3.3 Workload management using WebSphere clustering
This section describes how workload management is implemented in IBM
WebSphere Application Server Network Deployment V6 by using application
server clusters and cluster members. How to create clusters and cluster
members is detailed in Chapter 8, “Implementing the sample topology” on
page 387.
A cluster is a set of application servers that are managed together and
participate in workload management. Application servers participating in a
cluster can be on the same node or on different nodes. A Network Deployment
cell can contain no clusters or many clusters, depending on the administrative
needs of the cell.
The cluster is a logical representation of the application servers. It is not
necessarily associated with any node, and does not correspond to any real
server process running on any node. A cluster contains only application servers,
and the weighted workload capacity associated with those servers.
When creating a cluster, it is possible to select an existing application server as
the template for the cluster without adding that application server into the new
cluster (the chosen application server is used only as a template, and is not
affected in any way by the cluster creation). All other cluster members are then
created based on the configuration of the first cluster member.
Cluster members can be added to a cluster either during cluster creation or
afterwards. During cluster creation, one existing application server can be
added to the cluster, and one or more new application servers can be created
and added to the cluster. Additional members can also be added to an existing
cluster later on. Depending on the capacity of your systems, you can define
different weights for the various cluster members.
It may be a good idea to create the cluster with a single member first, adjust the
member's configuration and then add the other members. This process
guarantees that all cluster members are created with the same settings. Cluster
members are required to have identical application components, but they can be
sized differently in terms of weight, heap size, and other environmental factors.
You must be careful though not to change anything that might result in different
application behavior on each cluster member. This concept allows large
enterprise machines to belong to a cluster that also contains smaller machines
such as Intel® based Windows servers.
Starting or stopping the cluster automatically starts or stops all cluster members,
and changes to the application are propagated to all application servers in the
cluster.
Figure 1-5 shows an example of a possible configuration that includes server
clusters. Server cluster 1 has two cluster members on node B only. Server
cluster 2, which is completely independent of server cluster 1, has two cluster
members on node A and three cluster members on node B. Finally, node A also
contains a free-standing application server that is not a member of any cluster.
Figure 1-5 Server clusters and cluster members
Clusters and cluster members provide the necessary support for workload
management, failover, and scalability. For details, please refer to Chapter 6,
“Plug-in workload management and failover” on page 227.
Distributing workloads
The ability to route a request to any server in a group of clustered application
servers allows the servers to share work and improves the throughput of client
requests. Requests can be evenly distributed to servers to prevent workload
imbalances in which one or more servers are idle or lightly loaded while others
are overburdened. This load balancing activity is a benefit of workload management.
Using weighted definitions of cluster members allows nodes to have different
hardware resources and still participate in a cluster. A higher weight indicates
that the application server is expected to serve requests faster, so workload
management sends more requests to that server.
Failover
With several cluster members available to handle requests, it is more likely that
failures will not negatively affect throughput and reliability. With cluster members
distributed to various nodes, an entire machine can fail without any application
downtime. Requests can be routed to other nodes if one node fails. Clustering
also allows for maintenance of nodes without stopping application functionality.
Scalability
Clustering is an effective way to perform vertical and horizontal scaling of
application servers.
 In vertical scaling, shown in Figure 1-6, multiple cluster members for an
application server are defined on the same physical machine, or node, which
may allow the machine’s processing power to be more efficiently allocated.
Even if a single JVM can fully utilize the processing power of the machine, you
may still want to have more than one cluster member on the machine for other
reasons, such as using vertical clustering for software failover. If a JVM
reaches a table/memory limit (or if there is some similar problem), then the
presence of another process provides for failover.
Figure 1-6 Vertical scaling
We recommend that you avoid using “rules of thumb” when determining the
number of cluster members needed for a given machine. The only way to
determine what is correct for your environment and application(s) is to tune a
single instance of an application server for throughput and performance, then
add it to a cluster, and incrementally add additional cluster members. Test
performance and throughput as each member is added to the cluster. Always
monitor memory usage when you are configuring a vertical scaling topology
and do not exceed the available physical memory on a machine.
In general, 85% (or more) utilization of the CPU on a large server shows that
there is little, if any, performance benefit to be realized from adding additional
cluster members.
 In horizontal scaling, shown in Figure 1-7, cluster members are created on
multiple physical machines. This allows a single WebSphere application to
run on several machines while still presenting a single system image, making
the most effective use of the resources of a distributed computing
environment. Horizontal scaling is especially effective in environments that
contain many smaller, less powerful machines. Client requests that
overwhelm a single machine can be distributed over several machines in the
system. Failover is another benefit of horizontal scaling. If a machine
becomes unavailable, its workload can be routed to other machines
containing cluster members.
Figure 1-7 Horizontal scaling
Horizontal scaling can handle application server process failures and
hardware failures (or maintenance) without significant interruption to client
service.
 WebSphere applications can combine horizontal and vertical scaling to reap
the benefits of both scaling techniques, as shown in Figure 1-8 on page 23.
Note: WebSphere Application Server V5.0 and higher supports horizontal
clustering across different platforms and operating systems. Horizontal
cloning on different platforms was not supported in WebSphere Application
Server V4.
Figure 1-8 Vertical and horizontal scaling
Secure application cluster members
The workload management service has its own built-in security, which works with
the WebSphere Application Server security service to protect cluster member
resources. If security is needed for your production environment, enable security
before you create a cluster for the application server. This enables security for all
of the members in that cluster.
The EJB method permissions, Web resource security constraints, and security
roles defined in an enterprise application are used to protect EJBs and servlets
in the application server cluster. Refer to WebSphere Application Server V6:
Security Handbook, SG24-6316 for more information.
1.3.4 Enterprise Java Services workload management
In this section, we discuss Enterprise Java Services workload management (EJS
WLM), in which the Web container and EJB container are on different application
servers. The application server hosting the EJB container is clustered.
Configuring the Web container in a separate application server from the
Enterprise JavaBean container (an EJB container handles requests for both
session and entity beans) enables distribution of EJB requests between the EJB
container clusters, as seen in Figure 1-9 on page 24.
Figure 1-9 EJB workload management
In this configuration, EJB client requests are routed to available EJB containers
based on the workload management EJB selection policy (Server-weighted
round robin routing or Prefer local).
The EJB clients can be servlets operating within a Web container, stand-alone
Java programs using RMI/IIOP, or other EJBs.
EJS workload management is covered in detail in Chapter 7, “EJB workload
management” on page 341.
Important: Although it is possible to split the Web container and EJB
container, it is not recommended because of the negative performance
impact. In addition, application maintenance also becomes more complex
when the application runs in different application servers.
1.4 Managing session state among servers
All the load distribution techniques discussed in this book rely, on one level or
another, on using multiple copies of an application server and arranging for
multiple consecutive requests from various clients to be serviced by different
servers.
If each client request is completely independent of every other client request,
then it does not matter if two requests are processed on the same server or not.
However, in practice, there are many situations where the requests from an
individual client are not totally independent. In many usage scenarios, a client
makes one request, waits for the result, then makes one or more subsequent
requests that depend upon the results received from the earlier requests.
Such a sequence of operations on behalf of one client falls into one of two
categories:
 Stateless: the server that processes each request does so based solely on
information provided with that request itself, and not on information that it
“remembers” from earlier requests. In other words, the server does not need
to maintain state information between requests.
 Stateful: the server that processes a request does need to access and
maintain state information generated during the processing of an earlier
request.
Again, in the case of stateless interactions, it does not matter if different requests
are being processed by different servers. But in the case of stateful interactions,
we must ensure that whichever server is processing a request has access to the
state information necessary to service that request. This can be ensured either
by arranging for the same server to process all the client requests associated
with the same state information, or by arranging for that state information to be
shared and equally accessible by all servers that may require it. In the latter case,
it is often advantageous to arrange for most accesses to the shared state to be
performed from the same server, so as to minimize the communications
overhead associated with accessing the shared state from multiple servers.
This consideration for the management of stateful interactions is a factor in the
discussions of various load distribution techniques throughout this book. It also
requires the discussion of a specific facility, the session management facility, and
of server affinity.
1.4.1 HTTP sessions and the session management facility
In the case of an HTTP client interacting with a servlet, the state information
associated with a series of client requests is represented as an HTTP session,
and identified by a session ID. The session manager module that is part of each
Web container is responsible for managing HTTP sessions, providing storage for
session data, allocating session IDs, and tracking the session ID associated with
each client request, through the use of cookies, URL rewriting, or SSL ID
techniques.
The session manager can store session-related information in one of three ways:
in memory within the application server, in which case it cannot be shared with
other application servers; in a back-end database shared by all application
servers; or by using memory-to-memory replication.
The second option, sometimes referred to as persistent sessions or session
clustering, is one way to use HTTP sessions in a load-distribution
configuration. With this option, whenever an application server receives a
request associated with a session ID whose session is not in memory, it can
obtain the session from the back-end database and then serve the request. When
this option is not enabled, and no other clustering mechanism is used, a load
distribution mechanism may route an HTTP request to an application server other
than the one where the session was originally created. That server is then
unable to access the session and cannot produce correct results in response to
that request. One drawback of the database solution, just as with application
data, is that the database is a single point of failure (SPOF), so it
should be implemented in conjunction with hardware clustering products such as
HACMP or solutions such as database replication. Another drawback is the
performance hit caused by database disk I/O operations and network
communications.
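The lookup behavior described for persistent sessions can be sketched as follows. This is a toy Python model, not WebSphere code: the SessionManager class, its method names, and the dictionary that stands in for the back-end database are all assumptions made for illustration.

```python
import copy


class SessionManager:
    """Toy model of the session manager in one application server."""

    def __init__(self, shared_store):
        self.local = {}                   # sessions held in this server's memory
        self.shared_store = shared_store  # stands in for the back-end database

    def save_session(self, session_id, data):
        # Keep a local copy and persist one so other servers can read it.
        self.local[session_id] = copy.deepcopy(data)
        self.shared_store[session_id] = copy.deepcopy(data)

    def get_session(self, session_id):
        # Serve from memory when this server already holds the session.
        if session_id in self.local:
            return self.local[session_id]
        # Otherwise fall back to the shared store (the "database" read).
        if session_id not in self.shared_store:
            raise KeyError(f"unknown session {session_id}")
        self.local[session_id] = copy.deepcopy(self.shared_store[session_id])
        return self.local[session_id]


database = {}
server_a = SessionManager(database)
server_b = SessionManager(database)

# The session is created on server A, but the next request lands on server B.
server_a.save_session("abc123", {"cart": ["book"]})
print(server_b.get_session("abc123"))
```

If server_b were given its own empty store instead of the shared one, the lookup would fail, which corresponds to the case where persistence is not enabled and the request reaches the wrong server.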
The last option, memory-to-memory replication, provides a mechanism for
session clustering within IBM WebSphere Application Server Network
Deployment. It uses the built-in Data Replication Service (DRS) of WebSphere to
replicate session information stored in memory to other members of the cluster.
Using this functionality removes the single point of failure that is present in
persistent sessions through a database solution that has not been made highly
available using clustering software. The sharing of session state is handled by
creating a replication domain and then configuring the Web container to use that
replication domain to replicate session state information to the specified number
of application servers. DRS, like database replication, also incurs a performance
hit, primarily because of the overhead of network communications. Additionally,
because copies of the session object reside in application server memory, the
heap available for application requests is reduced, which usually results in more
frequent garbage collection cycles by the application server JVM.
Storing session states in a persistent database or using memory-to-memory
replication provides a degree of fault tolerance to the system. If an application
server crashes or stops, any session state that it may have been working on
would normally still be available either in the back-end database or in another still
running application server’s memory, so that other application servers can take
over and continue processing subsequent client requests associated with that
session.
More information can be found in 6.8, “HTTP session management” on
page 279.
Note: DRS, and thus the session replication mechanism, has changed in
WebSphere V6. Configuration has been greatly simplified. The default
configuration setting is now to have a single replica. You can also replicate to
the entire domain (which corresponds to the peer-to-peer configuration in
WebSphere V5.x) or specify the number of replicas.
Performance measurements showed that the overhead associated with
replicating the session to every other cluster member is more significant than
contention introduced by using the database session repository (the more
application servers in the cluster, the more overhead for memory-to-memory
replication). Therefore, it is not recommended that you use the Entire Domain
setting for larger replication domains.
For larger session sizes, the overhead of session replication increases.
Database replication has a lower overall throughput than memory-to-memory
replication because of database I/O limitations (the database becomes the
bottleneck). However, although database replication with large sessions is
slower, it uses less CPU power than memory-to-memory replication. The
unused processor power can be used by other tasks on the system.
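To see why replicating to the entire domain scales poorly, it helps to count the copies sent for each session update. The following Python sketch is a deliberately simplified model: the function copies_per_update and its parameters are hypothetical, and it ignores session size, batching, and transport details.

```python
def copies_per_update(domain_size, replicas=1, entire_domain=False):
    """Count session copies sent for one session update (simplified model).

    domain_size   -- number of application servers in the replication domain
    replicas      -- configured number of replicas (a single replica by default)
    entire_domain -- replicate to every other member of the domain
    """
    if domain_size < 1:
        raise ValueError("a replication domain needs at least one member")
    others = domain_size - 1  # members other than the one holding the session
    if entire_domain:
        return others
    return min(replicas, others)


# Compare the single-replica setting with the entire-domain setting.
for size in (2, 4, 8):
    single = copies_per_update(size)
    everyone = copies_per_update(size, entire_domain=True)
    print(f"{size} members: single replica={single}, entire domain={everyone}")
```

In this model the single-replica setting stays at one copy per update, while the entire-domain setting grows linearly with the number of members.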
Important: Neither mechanism provides a 100% guarantee that a session
state will be preserved in case of a server crash. If a server happens to crash
while it is literally in the middle of modifying the state of a session, some
updates may be lost and subsequent processing using that session may be
affected. Also, in the case of memory replication, if the node crashes during a
replication, or during the interval between scheduled replications of session
information, then some session information may be lost.
However, such a situation represents only a very small window of vulnerability
and a very small percentage of all occurrences throughout the life of a system
in production.
1.4.2 EJB sessions or transactions
In the case of an EJB client interacting with one or more EJBs, the management
of state information associated with a series of client requests is governed by the
EJB specification and implemented by the WebSphere EJS container, and
depends on the types of EJBs that are the targets of these requests.
Stateless session bean
By definition, when interacting with a stateless session bean, there is no
client-visible state associated with the bean. Every client request directed to a
stateless session bean is independent of any previous request that was directed
to the same bean. The container maintains a pool of instances of stateless
session beans of each type, and will provide an arbitrary instance of the
appropriate bean type whenever a client request is received. It does not matter if
the same actual bean instance is used for consecutive requests, or even if two
consecutive requests are serviced by bean instances in the same application
server.
Stateful session bean
In contrast, a stateful session bean is used precisely to capture state information
that must be shared across multiple consecutive client requests that are part of
one logical sequence of operations. The client must take special care to ensure
that it is always accessing the same instance of the stateful session bean, by
obtaining and keeping an EJB object reference to that bean. The various
load-distribution techniques available in WebSphere make special provisions to
support this characteristic of stateful session beans.
A new feature in WebSphere V6 is the failover support for stateful session beans;
the state information is now replicated to other application servers in the cluster
using the Data Replication Service (DRS). If an application server fails, a new
instance of the bean is created on a different server, the state information is
recovered, requests are directed to the recovered instance and processing
continues.
Entity bean
Finally, we must consider the case of an entity bean. Most external clients access
WebSphere services through session beans, but it is possible for an external
client to access an entity bean directly. Furthermore, a session bean inside
WebSphere is itself often acting as a client to one or more entity beans also
inside WebSphere; if load distribution features are used between that session
bean and its target entity bean, then the same questions arise as with plain
external clients.
Strictly speaking, the information contained in an entity bean is not usually
associated with a “session” or with the handling of one client request or series of
client requests. However, it is common for one client to make a succession of
requests targeted at the same entity bean instance. Unlike all the previous
cases, it is possible for multiple independent clients to access the same entity
bean instance more or less concurrently. Therefore, it is important that the state
contained in that entity bean be kept consistent across the multiple client
requests.
For entity beans, the notion of a session is more or less replaced by the notion of
a transaction. For the duration of one client transaction in which it participates, the
entity bean is instantiated in one container (normally the container where the first
operation within that transaction was initiated). All subsequent accesses to that
same bean, within that same transaction, must be performed against that same
instance in that same container.
In between transactions, the handling of the entity bean is specified by the EJB
specification, in the form of a number of caching options:
 With option A caching, WebSphere Application Server assumes that the
entity bean is used within a single container. Clients of that bean must direct
their requests to the bean instance within that container. The entity bean has
exclusive access to the underlying database, which means that the bean
cannot be clustered or participate in workload management if option A
caching is used.
 With option B caching, the bean instance remains active (so it is not
guaranteed to be made passive at the end of each transaction), but it is
always reloaded from the database at the start of each transaction. A client
can attempt to access the bean and start a new transaction on any container
that has been configured to host that bean.
 With option C caching (which is the default), the entity bean is always
reloaded from the database at the start of each transaction and made passive
at the end of each transaction. A client can attempt to access the bean and
start a new transaction on any container that has been configured to host that
bean. This is effectively similar to the session clustering facility described for
HTTP sessions: the shared state is maintained in a shared database, and can
be accessed from any server when required.
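The three caching options can be summarized as a small lookup table. The Python sketch below is only a reading aid: the CachingOption class and its field names are our own, and the values for option A (no reload and no passivation between transactions) are an assumption based on the exclusive database access described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachingOption:
    reload_at_tx_start: bool   # bean state re-read from the database?
    passivate_at_tx_end: bool  # bean made passive after each transaction?
    any_container: bool        # may a new transaction start on any hosting container?


CACHING_OPTIONS = {
    # Option A values are assumed from the exclusive-access description.
    "A": CachingOption(reload_at_tx_start=False, passivate_at_tx_end=False, any_container=False),
    "B": CachingOption(reload_at_tx_start=True, passivate_at_tx_end=False, any_container=True),
    "C": CachingOption(reload_at_tx_start=True, passivate_at_tx_end=True, any_container=True),
}


def can_be_workload_managed(option):
    """An entity bean can be clustered only if any container may serve it."""
    return CACHING_OPTIONS[option].any_container


for name in "ABC":
    print(name, can_be_workload_managed(name))
```

Only options B and C allow the bean to participate in workload management, because both reload the bean from the shared database at the start of each transaction.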
Message-driven beans
The message-driven bean (MDB) was introduced with WebSphere V5. Support
for MDBs is a requirement of a J2EE 1.3 compliant application server. In
WebSphere V4.0, the Enterprise Edition offered a similar functionality called
message beans that leveraged stateless session EJBs and a message listener
service. That container, however, did not implement the EJB 2.0 specification.
The MDB is a stateless component that is invoked by a J2EE container when a
JMS message arrives at a particular JMS destination (either a queue or topic).
Loosely, the MDB is triggered by the arrival of a message.
Messages are normally anonymous. If some degree of security is desired, the
listener will assume the credentials of the application server process during the
invocation of the MDB.
MDBs handle messages from a JMS provider within the scope of a transaction. If
transaction handling is specified for a JMS destination, the listener will start a
global transaction before reading incoming messages from that destination. Java
Transaction API (JTA) transaction control for commit or rollback is invoked when
the MDB processing has finished.
1.4.3 Server affinity
The discussion above implies that any load-distribution facility, when it chooses a
server to direct a request, is not entirely free to select any available server:
 In the case of stateful session beans or entity beans within the context of a
transaction, there is only one valid server. WebSphere WLM will always direct
a client's access of a stateful session bean to the single server instance
containing the bean (there is no possibility of choosing the wrong server
here). If the request is directed to the wrong server (for example because of a
configuration error), it will either fail, or that server itself will be forced to
forward the request to the correct server, at great performance cost.
 In the case of clustered HTTP sessions or entity beans between transactions,
the underlying shared database ensures that any server can correctly
process each request. However, accesses to this underlying database may
be expensive, and it may be possible to improve performance by caching the
database data at the server level. In such a case, if multiple consecutive
requests are directed to the same server, they may find the required data still
in the cache, and thereby reduce the overhead of access to the underlying
database.
The characteristics of each load-distribution facility, which take these constraints
into account, are generally referred to as server affinity. In effect, the load
distribution facility recognizes that multiple servers may be acceptable targets for
a given request, but it also recognizes that each request may have a particular
affinity for being directed to a particular server where it may be handled better or
faster.
We will encounter this notion of server affinity throughout the discussion of the
various load-distribution facilities in Chapter 6, “Plug-in workload management
and failover” on page 227 and Chapter 7, “EJB workload management” on
page 341. In particular, we will encounter the notion of session affinity, where the
load distribution facility recognizes the existence of a session and attempts to
direct all requests within that session to the same server, and we will also discuss
the notion of transaction affinity, in which the load distribution facility recognizes
the existence of a transaction, and behaves similarly.
Finally, we will also see that a particular server affinity mechanism can be weak
or strong. In a weak affinity mechanism, the system attempts to enforce the
desired affinity for the majority of requests most of the time, but may not always
be able to provide a total guarantee that this affinity will be respected. In a strong
affinity mechanism, the system guarantees that affinity will always be strictly
respected, and generates an error when it cannot.
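The difference between weak and strong affinity can be illustrated with a small routing function. This Python sketch is a simplification under our own assumptions: the names choose_server and AffinityError are hypothetical, and the fallback rule (first available server in sorted order) is chosen only to keep the example deterministic.

```python
class AffinityError(Exception):
    """Raised when strong affinity cannot be honored."""


def choose_server(preferred, available, strong):
    """Pick a target server for a request that has affinity to `preferred`.

    available -- set of server names currently able to take requests
    strong    -- True for strong affinity, False for weak affinity
    """
    if preferred in available:
        return preferred
    if strong:
        # Strong affinity: never route elsewhere, report an error instead.
        raise AffinityError(f"{preferred} is unavailable")
    if not available:
        raise AffinityError("no server is available")
    # Weak affinity: fall back to some other available server.
    return sorted(available)[0]


print(choose_server("server1", {"server1", "server2"}, strong=True))
print(choose_server("server1", {"server2", "server3"}, strong=False))
```

In the first call the preferred server is available and is selected. In the second call it is not, and the weak mechanism falls back to another server; with strong=True the same call would raise AffinityError.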
1.5 Performance improvements over previous versions
WebSphere Application Server V6 contains a significant set of functional and
performance improvements over the initial release of WebSphere Version 5.0
and the incremental releases that followed. Most of the improvements listed
below are as compared to WebSphere V5.1.
Improved performance with Java V1.4.2
 Java Virtual Machine (JVM) enhancements.
 Various class-level enhancements for logging, BigDecimal, NumberFormat,
date formatting and XML parsing.
 Reduced lock contention for improved ORB scalability.
Improved Web container performance/scalability
 Channel Framework
The Channel Framework with NIO (a new Java framework for input/output)
supports non-blocking I/O, which allows a large number of concurrently
connected clients to be supported. This helps client scalability by enabling
the WebSphere Application Server containers to handle more client requests
with fewer threads than were required previously.
 Code path improvements
The Web container is redesigned and reimplemented for significant
reductions in code path and memory utilization.
 Caching enhancements
The Web container redesign also contains internal improvements for caching
various types of objects inside the Web container. These caching
improvements provide better performance and scalability.
 Improved HTTP session replication using a new high-performing transport
The Data Replication Service (DRS) is rebased on the HAManager framework,
which uses a multicast/unicast based transport (DCS), which itself uses the
channel framework.
See Chapter 6, “Plug-in workload management and failover” on page 227 and
Chapter 9, “WebSphere HAManager” on page 465 for more information about
this topic.
EJB improvements
 Cached data verification (functional improvement)
The new cached data verification in the EJB container allows for aggressive
caching of data while maintaining consistency with the database by
performing an optimistic versioning field check at transaction commit time.
This increases performance of transactions and maintains data consistency
required in high transaction rate environments.
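The idea of an optimistic versioning field check at commit time can be sketched generically. The Python example below illustrates the general technique only and is not the EJB container's implementation: the names StaleDataError and commit_update, and the dictionary that stands in for a database table, are assumptions.

```python
class StaleDataError(Exception):
    """Raised when the stored version no longer matches the cached one."""


def commit_update(database, key, cached_version, new_value):
    """Write `new_value` only if the row still has the version we cached."""
    stored = database[key]
    if stored["version"] != cached_version:
        # Another transaction changed the row after our data was cached.
        raise StaleDataError(
            f"{key}: cached version {cached_version}, stored version {stored['version']}"
        )
    database[key] = {"value": new_value, "version": cached_version + 1}


database = {"account-1": {"value": 100, "version": 7}}
cached = dict(database["account-1"])  # data cached by a long-running client

# A different transaction commits first and bumps the version to 8.
commit_update(database, "account-1", 7, 120)

# The commit based on the stale cached copy is rejected.
try:
    commit_update(database, "account-1", cached["version"], 90)
except StaleDataError as err:
    print("rolled back:", err)
```

Because the check happens only at commit time, cached data can be used without holding database locks, and a conflicting update is detected instead of being silently overwritten.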
All EJB applications can benefit from improved EJB performance due to:
 Reduced code path lengths throughout the runtime.
 Application profiling
The application profiling technology (which was previously only available in
IBM WebSphere Application Server Enterprise) improves overall entity EJB
performance and throughput by fine-tuning the run-time behavior. It enables
EJB optimizations to be customized for multiple user access patterns without
resorting to "worst case" choices (such as Pessimistic Update on
findByPrimaryKey, regardless of whether the client needs it for read or for
actual update).
 Higher-performance Access Intent settings
An automatic analysis tool is provided that determines the most performant
access intent setting on each entity bean in the application, for each
application scenario in which that entity bean may be accessed.
 Stateful session bean replication
The new stateful session bean replication uses a new high-performance
transport.
 Improved persistence manager
The improved persistence manager allows for more caching options and
better database access strategies.
Applications with compatible design patterns can exploit the following EJB
improvements:
 Keep-update-locks optimization for DB2 UDB V8.2
This enables applications to use a row blocking protocol for data fetching and
also avoid deadlocks by enforcing update locks on affected rows. This feature
enhances performance for applications that use search updates, such as EJB
CMP Entity Applications.
 Read-only beans
The new "Read Only Beans" optimization provides enhanced EJB caching
performance.
 Improved read-ahead
Improved read-ahead functionality enables data for multiple CMP entity beans
to be read with a single SQL statement.
See Chapter 7, “EJB workload management” on page 341 for more information
about this topic.
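To make the read-ahead idea more concrete, the sketch below contrasts loading related rows one statement at a time with fetching them in a single joined statement. It uses Python with an in-memory SQLite database; the customer and orders tables and the function names are hypothetical, and the code only illustrates the principle, not how the WebSphere persistence manager generates its SQL.

```python
import sqlite3


def setup():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
        INSERT INTO customer VALUES (1, 'Ann');
        INSERT INTO customer VALUES (2, 'Bob');
        INSERT INTO orders VALUES (10, 1, 25);
        INSERT INTO orders VALUES (11, 1, 40);
        INSERT INTO orders VALUES (12, 2, 15);
        """
    )
    return conn


def load_lazily(conn):
    # One statement for the customers, then one more per customer.
    statements = 1
    customers = conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
    result = {}
    for customer_id, name in customers:
        rows = conn.execute(
            "SELECT total FROM orders WHERE customer_id = ? ORDER BY id",
            (customer_id,),
        ).fetchall()
        statements += 1
        result[name] = [total for (total,) in rows]
    return result, statements


def load_with_read_ahead(conn):
    # A single joined statement returns the customers and their orders together.
    rows = conn.execute(
        "SELECT c.name, o.total FROM customer c "
        "LEFT JOIN orders o ON o.customer_id = c.id "
        "ORDER BY c.id, o.id"
    ).fetchall()
    result = {}
    for name, total in rows:
        result.setdefault(name, [])
        if total is not None:
            result[name].append(total)
    return result, 1
```

Both functions return the same data, but the lazy variant issues one statement per customer in addition to the initial query, which is the overhead that read-ahead avoids.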
Improved Web Services performance
Web Services performance improvements can be seen in several areas:
 Improved deserialization
Improvements in V6.0 tooling generate higher-performing deserializers for
all JAX-RPC beans. Because of this, redeploying a V5.x application into V6.0
may significantly decrease the processing time for large messages.
 Web Services caching
Additional capabilities are added for Web Services caching. The V6.0
serialization/deserialization runtime is enhanced to cache frequently used
serializers/deserializers, significantly decreasing the processing time for large
messages. Web Services caching also includes JAX-RPC client-side Web
Services caching (first introduced in V5.1.1).
 Other Web Services improvements come in the areas of in-process
performance enhancements.
Refer to the redbook WebSphere Version 6 Web Services Handbook
Development and Deployment, SG24-6461 for more information about Web
Services.
Default messaging provider
The embedded JMS server from V5.x has been replaced by a new, in-process
Java messaging engine that makes nonpersistent messaging primitives up to 5x
faster.
The improved default messaging provider for V6.0 is a new all-Java
implementation that runs within the WebSphere Application Server JVM process
and uses a relational database for its persistent store. It provides improved
function and performance over V5.1, particularly in the nonpersistent messaging
scenarios. Nonpersistent point-to-point in-process messaging (EJB to MDB) is
up to 5 times faster than V5.1. See Chapters 10 and 11 of the redbook
WebSphere Application Server V6 System Management and Configuration
Handbook, SG24-6451 for more information.
Improved Dynamic Fragment Caching
DMap caching (or DistributedMap caching), which was previously only available
in IBM WebSphere Application Server Enterprise, performs better than
command caching because of how the caching mechanisms are implemented.
See Chapter 10, “Dynamic caching” on page 501 for more information.
Type 4 JDBC driver
The Universal JDBC Driver performs better than the legacy/CLI JDBC driver, due
mainly to significant code path reductions, a reduced dependency on JNI, and
optimizations not available to the legacy/CLI JDBC driver architecture.
Tip: The IBM WebSphere Performance team publishes a performance report
for each release of WebSphere Application Server. This performance report
provides performance information using the Trade benchmark, information
about the new default messaging provider, memory considerations, and much
more. To obtain a copy of the WebSphere performance report, please contact
your IBM sales or technical support contact.
1.6 The structure of this redbook
This redbook is divided into six parts containing the following chapters and
appendixes:
Part 1: Getting started
Chapter 2, “Infrastructure planning and design” on page 39 introduces scalability
and scaling techniques, and helps you to evaluate the components of your
e-business infrastructure for their scalability. Basic considerations regarding
infrastructure deployment planning, sizing, benchmarking, and performance
tuning are also covered in this chapter.
Chapter 3, “Introduction to topologies” on page 71 discusses topologies
supported by WebSphere Application Server V6 and the variety of factors that
come into play when considering the appropriate topology choice.
Part 2: Distributing the workload
Chapter 4, “Introduction to WebSphere Edge Components” on page 99 gives an
overview of the capabilities and options of WebSphere Edge Components.
Chapter 5, “Using IBM WebSphere Edge Components” on page 127 looks at
Web server load balancing with the WebSphere Edge Components of IBM
WebSphere Application Server Network Deployment V6, describing some
important configurations and their use with IBM WebSphere Application Server
Network Deployment V6. An introduction to the Caching Proxy is also included in
this chapter.
Chapter 6, “Plug-in workload management and failover” on page 227 covers Web
container workload management. It discusses components, configuration
settings, and resulting behaviors. It also covers session management in a
workload-managed environment.
Chapter 7, “EJB workload management” on page 341 examines how EJB
requests can be distributed across multiple EJB containers.
Part 3: Implementing the solution
Chapter 8, “Implementing the sample topology” on page 387 provides
step-by-step instructions for implementing a sample multi-machine environment.
This environment is used to illustrate most of IBM WebSphere Application Server
Network Deployment V6’s workload management and scalability features
throughout this book, including a Caching Proxy and Load Balancer.
Part 4: High availability and caching
Chapter 9, “WebSphere HAManager” on page 465 introduces the new
HAManager feature of WebSphere Application Server V6 and shows some basic
usage and configuration examples.
Chapter 10, “Dynamic caching” on page 501 describes the use of caching
technologies in three-tier and four-tier application environments involving
browser-based clients and applications using WebSphere Application Server.
Part 5: Messaging
Chapter 11, “Using asynchronous messaging for scalability and performance” on
page 621 gives an introduction to the JMS 1.1 API.
Chapter 12, “Using and optimizing the default messaging provider” on page 643
introduces the new default messaging provider that is part of WebSphere
Application Server V6. It also explains configuration options in a clustered
application server environment. This chapter extends Chapters 10 and 11 of the
redbook WebSphere Application Server V6 System Management and
Configuration Handbook, SG24-6451 and should thus be read in conjunction
with these chapters.
Chapter 13, “Understanding and optimizing the use of WebSphere MQ” on
page 697 describes how the various components for WebSphere MQ interact,
takes you through manually configuring the components for the Trade 6
application, and demonstrates running this with WebSphere MQ and WebSphere
Business Integration Event Broker.
Part 6: Performance monitoring, tuning and coding practices
Chapter 14, “Server-side performance and analysis tools” on page 769 explains
the tools that can be run on the application server side to analyze and monitor
the performance of the environment. The topics covered include the Performance
Monitoring Infrastructure, Request Metrics, Tivoli Performance Viewer, and the
Performance Advisors.
Chapter 15, “Development-side performance and analysis tools” on page 839
offers an introduction to the tools that come with IBM Rational Application
Developer V6.0, such as the profiling tools or the separately downloadable Page
Detailer.
Chapter 16, “Application development: best practices for application design,
performance and scalability” on page 895 gives suggestions for developing
WebSphere Application Server based applications.
Chapter 17, “Performance tuning” on page 939 discusses tuning an existing
environment for performance and provides a guide for testing application servers
and components of the total serving environment that should be examined and
potentially tuned to increase the performance of the application.
Appendixes
Appendix A, “Sample URL rewrite servlet” on page 1033 explains how to set up
an example to test session management using URL rewrites.
Appendix B, “Additional material” on page 1037 gives directions for downloading
the Web material associated with this redbook.
© Copyright IBM Corp. 2005. All rights reserved.
Chapter 2. Infrastructure planning and design
Are you wondering about how to plan and design an infrastructure deployment
based on WebSphere middleware? This chapter covers the WebSphere-specific
components that you need to be familiar with to run a successful WebSphere
infrastructure project.
This chapter is organized in the following sections:
 Infrastructure deployment planning
 Design for scalability
 Sizing
 Benchmarking
 Performance tuning
2.1 Infrastructure deployment planning
This section gives a general overview of which phases you have to go through
during a project, how you gather requirements, and how to apply them to a
WebSphere project.
Typically, a new project starts with only a concept. Very little is known about
specific implementation details, especially as they relate to the infrastructure.
Hopefully, your development team and infrastructure team work closely together
to help bring some scope to the needs of the overall application environment.
In some cases, the involvement of an IBM Design Center for e-business on
demand™ might be useful. For more information about this service, please see
2.1.1, “IBM Design Centers for e-business on demand” on page 41.
Bringing together a large team of people can create an environment that helps
hone the environment requirements. If unfocused, however, a large team can be
prone to wandering aimlessly and creating more confusion than resolving issues.
For this reason, pay close attention to the size of the requirements team, and
keep the meetings focused. Provide template documents to be completed by the
developers, the application business owners, and the user experience team. Try
to gather information that falls into the following categories:
 Functional requirements, which are usually determined by the business use
of the application and are related to function.
 Non-functional requirements, which start to describe the properties of the
underlying architecture and infrastructure like reliability, availability, or
security.
 Capacity requirements, which include traffic estimates, traffic patterns, and
expected audience size.
Requirement gathering is an iterative process. There is no way, especially in the
case of Web-based applications, to have absolutes for every item. The best you
can do is create an environment that serves your best estimates, and then
monitor closely to adjust as necessary after launch. Make sure that all your plans
are flexible enough to deal with future changes in requirements and always keep
in mind that they may have an impact on every other part of your project.
With this list of requirements, you can start to create the first drafts of your
designs. You should target developing at least the following designs:
 Application design
To create your application design, you use your functional and non-functional
requirements to create guidelines for your application developers about how
your application is built.
This redbook does not attempt to cover the specifics of a software
development cycle. There are multiple methodologies for application design,
and volumes dedicated to best practices. However, Chapter 16, “Application
development: best practices for application design, performance and
scalability” on page 895 will give you a quick start to this topic.
 Implementation design
This design defines your target deployment infrastructure on which your
application will be deployed.
The final version of this implementation design will contain every detail about
hardware, processors, and software installed on the different components, but
you do not start out with all these details in the beginning. Initially, your
implementation design will simply list component requirements such as a
database, a set of application servers, a set of Web servers, and whatever
other components are defined in the requirements phase.
This design might need to be extended during your project, whenever a
requirement for change occurs, or when you get new sizing information. Too
often, however, the reality is that a project may require new hardware, and
therefore be constrained by capital acquisition requirements. Try to not go
back too often for additional resources once the project has been accepted.
Information that helps you to create this design is found in 2.2, “Design for
scalability” on page 42.
With the two draft designs in hand, you can now begin the process of formulating
counts of servers, network requirements, and the other infrastructure related
items. Basic information about sizing can be found in 2.3, “Sizing” on page 58.
In some cases, it might be appropriate to perform benchmark tests. There are
many ways to perform benchmarking tests, and some of these methods are
described in 2.4, “Benchmarking” on page 59.
The last step in every deployment is to tune your system and measure if it can
handle the projected load specified in your non-functional requirements. Section
2.5, “Performance tuning” on page 62 covers in more detail how to plan for load
tests.
2.1.1 IBM Design Centers for e-business on demand
The IBM Design Centers for e-business on demand are state-of-the-art facilities
where certified IT architects work with customers from around the world to
design, architect, and prototype advanced IT infrastructure solutions to meet the
challenges of on demand computing. Customers are nominated by their IBM
representative based upon their current e-business plans and are usually
designing a first-of-a-kind e-business infrastructure to solve a unique business
problem. After a technical and business review, qualified customers come to a
Design Center for a Design Workshop which typically lasts three to five days. If
appropriate, a proof-of-concept session ranging from two to eight weeks may
follow.
The centers are located in:
 Poughkeepsie, USA
 Montpellier, France
 Makuhari, Japan
For more information about this service, please go to
http://www-1.ibm.com/servers/eserver/design_center/
2.2 Design for scalability
Understanding the scalability of the components in your e-business infrastructure
and applying appropriate scaling techniques can greatly improve availability and
performance. Scaling techniques are especially useful in multi-tier architectures,
when you want to evaluate components associated with IP load balancers such
as dispatchers or edge servers, Web presentation servers, Web application
servers, data servers and transaction servers.
You can use the following high-level steps to classify your Web site and identify
scaling techniques that are applicable to your environment:
1. Understanding the application environment
2. Categorizing your workload
3. Determining the most affected components
4. Selecting the scaling techniques to apply
5. Applying the technique(s)
6. Re-evaluating
This systematic approach needs to be adapted to your situation. We look at each
step in detail in the following sections.
2.2.1 Understanding the application environment
For existing environments, the first step is to identify all components and
understand how they relate to each other. The most important task is to
understand the requirements and flow of the existing application(s) and what can
or cannot be altered. The application is a significant contributor to the scalability
of any infrastructure, so a detailed understanding is crucial to scale effectively. At
a minimum, application analysis must include a breakdown of transaction types
and volumes as well as a graphic view of the components in each tier.
Figure 2-1 can aid in determining where to focus scalability planning and tuning
efforts. The figure shows the greatest latency for representative customers in
three workload patterns, and on which tier the effort should be concentrated.
Figure 2-1 How latency varies based on workload pattern and tier
For example, for online banking, most of the latency typically occurs in the
database server, whereas the application server typically experiences the
greatest latency for online shopping and trading sites. Infrastructure planning
to resolve those latencies differs considerably depending on the tier in which
the latency occurs.
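The banking example can be expressed as a small calculation that picks the tier with the largest share of latency. The Python sketch below uses the banking percentages given in Figure 2-1 (5, 11, 23, and 61 for the edge, Web, application, and database tiers); the function name and data layout are assumptions for illustration, and real values would come from your own measurements.

```python
# Tiers in the order used by Figure 2-1 (ES, WS, AS, DB).
TIERS = ("edge server", "Web server", "application server", "database server")

# Percent of latency per tier for the banking example in Figure 2-1.
BANKING = (5, 11, 23, 61)


def dominant_tier(latency_percent):
    """Return the tier with the largest latency share and that share."""
    if len(latency_percent) != len(TIERS):
        raise ValueError("expected one value per tier")
    share, tier = max(zip(latency_percent, TIERS))
    return tier, share


def ranked_tiers(latency_percent):
    """Return (tier, share) pairs, largest share first."""
    pairs = sorted(zip(latency_percent, TIERS), reverse=True)
    return [(tier, share) for share, tier in pairs]
```

For the banking figures, dominant_tier(BANKING) points at the database server, which is where scalability planning would start for that workload.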
The way applications manage traffic between tiers significantly affects the
distribution of latencies between the tiers, which suggests that careful analysis of
application architectures is an important part of this step and could lead to
reduced resource utilization and faster response times. Collect metrics for each
tier, and make user-behavior predictions for each change implemented.
WebSphere offers various facilities for monitoring the performance of an
application, as discussed in Chapter 14, “Server-side performance and analysis
tools” on page 769 and Chapter 15, “Development-side performance and
analysis tools” on page 839.
[Figure 2-1 content: end-to-end response time from the users across the Internet
(network latency) and through the edge server (ES), Web server (WS),
application server (AS), and database server and legacy systems (DB) (Web site
latency, which is the scope of performance management). Examples of percent of
latency per tier (ES/WS/AS/DB): Banking 5/11/23/61. The values for Shopping and
Trading are incomplete in this copy (26 54 19 and 5 27 53 15).]
As requirements are analyzed for a new application, strive to build scaling
techniques into the infrastructure. Implementation of new applications offers the
opportunity to consider each component type, such as open interfaces and new
devices, and the potential to achieve unprecedented transaction rates. Each
scaling technique affects application design. Similarly, application design impacts
the effectiveness of the scaling technique. To achieve proper scale, application
design must consider potential architectural scaling effects. An iterative,
incremental approach will need to be followed in the absence of known workload
patterns.
2.2.2 Categorizing your workload
Knowing the workload pattern for a site determines where to focus scalability
efforts and which scaling techniques to apply. For example, a customer
self-service site such as an online bank needs to focus on transaction
performance and the scalability of databases that contain customer information
used across sessions. These considerations would not typically be significant for
a publish/subscribe site, where a user signs up for data to be sent to them,
usually via a mail message.
Workload patterns and Web site classifications
Web sites with similar workload patterns can be classified into site types.
Consider five distinct workload patterns and corresponding Web site
classifications:
 Publish/subscribe (user to data)
 Online shopping (user to online buying)
 Customer self-service (user to business)
 Online trading (user to business)
 Business to business
Publish/subscribe (user to data)
Sample publish/subscribe sites include search engines, media sites such as
newspapers and magazines, as well as event sites such as sports
championships (for example the site for the Olympics).
Site content changes frequently, driving changes to page layouts. While search
traffic is low in volume, the number of unique items sought is high, resulting in the
largest number of page views of all site types. Security considerations are minor
compared to other site types. Data volatility is low. This site type processes the
fewest transactions and has little or no connection to legacy systems. Refer to
Table 2-1 on page 45 for details on these workload patterns.
Table 2-1 Publish/subscribe site workload pattern
Workload pattern                 Publish/subscribe
Categories/examples               Search engines
                                  Media
                                  Events
Content                           Dynamic change of the layout of a page, based
                                   on changes in content, or need
                                  Many page authors; page layout changes
                                   frequently
                                  High volume, non-user-specific access
                                  Fairly static information sources
Security                         Low
Percent secure pages             Low
Cross-session info               No
Searches                          Structured by category
                                  Totally dynamic
                                  Low volume
Unique items                     High
Data volatility                  Low
Volume of transactions           Low
Legacy integration/complexity    Low
Page views                       High to very high
Online shopping (user to online buying)
Sample sites include typical retail sites where users buy books, clothes, or even
cars. Site content can be relatively static, such as a parts catalog, or dynamic,
where items are frequently added and deleted, such as for promotions and
special discounts that come and go. Search traffic is heavier than on a
publish/subscribe site, although the number of unique items sought is not as
large. Data volatility is low. Transaction traffic is moderate to high, and almost
always grows.
When users buy, security requirements become significant and include privacy,
non-repudiation, integrity, authentication, and regulations. Shopping sites have
more connections to legacy systems, such as fulfillment systems, than
publish/subscribe sites, but generally fewer than the other site types. This is
detailed in the following table.
Table 2-2 Online shopping site workload pattern
Workload pattern                 Online shopping
Categories/examples               Exact inventory
                                  Inexact inventory
Content                           Catalog either flat (parts catalog) or dynamic
                                   (items change frequently, near real time)
                                  Few page authors and page layout changes
                                   less frequently
                                  User-specific information: user profiles with
                                   data mining
Security                          Privacy
                                  Non-repudiation
                                  Integrity
                                  Authentication
                                  Regulations
Percent secure pages             Medium
Cross-session info               High
Searches                          Structured by category
                                  Totally dynamic
                                  High volume
Unique items                     Low to medium
Data volatility                  Low
Volume of transactions           Moderate to high
Legacy integration/complexity    Medium
Page views                       Moderate to high
Customer self-service (user to business)
Sample sites include those used for home banking, tracking packages, and
making travel arrangements. Home banking customers typically review their
balances, transfer funds, and pay bills. Data comes largely from legacy
applications and often comes from multiple sources, which exposes data
consistency issues.
Security considerations are significant for home banking and purchasing travel
services, less so for other uses. Search traffic is low volume; transaction traffic is
moderate, but growing rapidly. Refer to Table 2-3 for details on these workload
patterns.
Table 2-3 Customer self-service site workload pattern
Workload pattern                 Customer self-service
Categories/examples               Home banking
                                  Package tracking
                                  Travel arrangements
Content                           Data is in legacy applications
                                  Multiple data sources, requirement for
                                   consistency
Security                          Privacy
                                  Non-repudiation
                                  Integrity
                                  Authentication
                                  Regulations
Percent secure pages             Medium
Cross-session info               Yes
Searches                          Structured by category
                                  Low volume
Unique items                     Low
Data volatility                  Low
Volume of transactions           Moderate and growing
Legacy integration/complexity    High
Page views                       Moderate to low
Online trading (user to business)
Of all site types, trading sites have the most volatile content, the potential for the
highest transaction volumes (with significant swing), the most complex
transactions, and are extremely time sensitive. Auction sites are characterized by
highly dynamic bidding against items with predictable life times.
Trading sites are tightly connected to the legacy systems. Nearly all transactions
interact with the back-end servers. Security considerations are high, equivalent
to online shopping, with an even larger number of secure pages. Search traffic is
low volume. Refer to Table 2-4 for details on these workload patterns.
Table 2-4 Online trading site workload pattern
Workload pattern                 Online trading
Categories/examples               Online stock trading
                                  Auctions
Content                           Extremely time sensitive
                                  High volatility
                                  Multiple suppliers, multiple consumers
                                  Transactions are complex and interact with
                                   back end
Security                          Privacy
                                  Non-repudiation
                                  Integrity
                                  Authentication
                                  Regulations
Percent secure pages             High
Cross-session info               Yes
Searches                          Structured by category
                                  Low volume
Unique items                     Low to medium
Data volatility                  High
Volume of transactions           High to very high (very large swings in volume)
Legacy integration/complexity    High
Page views                       Moderate to high
Business to business
These sites include dynamic programmatic links between arm’s length
businesses (where a trading partner agreement might be appropriate). One
business is able to discover another business with which it may want to initiate
transactions.
In supply chain management, for example, data comes largely from legacy
applications and often from multiple sources, which exposes data consistency
issues. Security requirements are equivalent to online shopping. Transaction
volume is moderate, but growing; transactions are typically complex, connecting
multiple suppliers and distributors. Refer to Table 2-5 for details on these
workload patterns.
Table 2-5 Business-to-business site workload pattern
Workload pattern                 Business to business
Categories/examples              e-Procurement
Content                           Data is in legacy applications
                                  Multiple data sources, requirement for
                                   consistency
                                  Transactions are complex
Security                          Privacy
                                  Non-repudiation
                                  Integrity
                                  Authentication
                                  Regulations
Percent secure pages             Medium
Cross-session info               Yes
Searches                          Structured by category
                                  Low to moderate volume
Unique items                     Moderate
Data volatility                  Moderate
Volume of transactions           Moderate to low
Legacy integration/complexity    High
Page views                       Moderate
Workload characteristics
Your site type will become clear as it is evaluated using Table 2-6 on page 50 for
other characteristics related to transaction complexity, volume swings, data
volatility, security, and so on.
If the application has further characteristics that could potentially affect its
scalability, then add the extra characteristics to Table 2-6 on page 50.
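One way to use Table 2-6 is to treat each site-type column as a profile and compare your own ratings against it. The Python sketch below does that with a simple distance measure. The ratings are taken from Table 2-6, but the Low/Medium/High to 0/1/2 scoring, the distance function, and all names are assumptions introduced for this illustration and are not part of the redbook's method.

```python
# Ordinal scoring of the ratings (an assumption made for this sketch).
LEVELS = {"L": 0, "M": 1, "H": 2}

# Rows of Table 2-6, in order.
CHARACTERISTICS = (
    "volume of user-specific responses",
    "amount of cross-session information",
    "volume of dynamic searches",
    "transaction complexity",
    "transaction volume swing",
    "data volatility",
    "number of unique items",
    "number of page views",
    "percent secure pages (privacy)",
    "use of security",
)

# Columns of Table 2-6, one letter per characteristic (L=Low, M=Medium, H=High).
SITE_TYPES = {
    "Publish/subscribe": "LLLLLLHHLL",
    "Online shopping": "LHHMMLMMMH",
    "Customer self-service": "MHLHMLLLMH",
    "Online trading": "HHLHHHMMHH",
    "Business to business": "MHMHHMMMHH",
}


def distance(profile, reference):
    """Sum of rating differences between two profiles."""
    return sum(abs(LEVELS[a] - LEVELS[b]) for a, b in zip(profile, reference))


def closest_site_type(profile):
    """Return the site type whose Table 2-6 column is nearest to the profile."""
    if len(profile) != len(CHARACTERISTICS) or set(profile) - set(LEVELS):
        raise ValueError("profile needs one of L, M, H per characteristic")
    return min(SITE_TYPES, key=lambda name: distance(profile, SITE_TYPES[name]))
```

A profile such as "HHLHHHMMHM" (the online trading column with a lower security rating) still maps to online trading. In practice the result is only a starting point, because a real site may combine several patterns.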
Table 2-6 Workload characteristics and Web site classifications
Workload characteristics         Publish/   Online    Customer      Online   Business to  Your
                                 subscribe  shopping  self-service  trading  business     workload
Volume of user-specific          Low        Low       Medium        High     Medium
 responses
Amount of cross-session          Low        High      High          High     High
 information
Volume of dynamic searches       Low        High      Low           Low      Medium
Transaction complexity           Low        Medium    High          High     High
Transaction volume swing         Low        Medium    Medium        High     High
Data volatility                  Low        Low       Low           High     Medium
Number of unique items           High       Medium    Low           Medium   Medium
Number of page views             High       Medium    Low           Medium   Medium
Percent secure pages (privacy)   Low        Medium    Medium        High     High
Use of security                  Low        High      High          High     High
 (authentication, integrity,
 non-repudiation)
Other characteristics                                                                     High
Note: All site types are considered to have high volumes of dynamic
transactions. This may or may not be a standard consideration when
performing sizing and site classification.
2.2.3 Determining the most affected components
This step involves mapping the most important site characteristics to each
component. Once again, from a scalability viewpoint, the key components of the
infrastructure are the load balancers, the Web application servers, security
services, transaction and data servers, and the network. Table 2-7 on page 51
specifies the significance of each workload characteristic to each component. As
seen in the table, the effect on each component is different for each workload
characteristic.
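The mapping from workload characteristics to components in Table 2-7 can be treated as a lookup table. The Python sketch below combines the rows for the characteristics that are high for a site and reports which components are rated High by at least one of them. The ratings come from Table 2-7; taking the maximum rating per component, the shortened row names, and the expansion of "Web present. server" to "Web presentation server" are assumptions of this sketch.

```python
# Columns of Table 2-7, in order.
COMPONENTS = (
    "Web presentation server",
    "Web application server",
    "Network",
    "Security servers",
    "Firewalls",
    "Existing business servers",
    "Database server",
)

# Rows of Table 2-7, one letter per component (L=Low, M=Med, H=High).
IMPACT = {
    "user-specific responses": "LHLLLHH",
    "cross-session information": "MHLLLLM",
    "dynamic searches": "HHMLMMH",
    "transaction complexity": "LHLMLHH",
    "transaction volume swing": "MHLLLHH",
    "data volatility": "HHLLLMH",
    "unique items": "LMLLLHH",
    "page views": "HLHLHLL",
    "secure pages": "HLLHLLL",
    "security": "HHLHLHL",
}

RANK = {"L": 0, "M": 1, "H": 2}


def combined_impact(high_characteristics):
    """Return the highest rating per component over the given characteristics."""
    worst = ["L"] * len(COMPONENTS)
    for name in high_characteristics:
        for index, level in enumerate(IMPACT[name]):
            if RANK[level] > RANK[worst[index]]:
                worst[index] = level
    return dict(zip(COMPONENTS, worst))


def most_affected(high_characteristics):
    """Return the components rated High by at least one characteristic."""
    impact = combined_impact(high_characteristics)
    return [component for component in COMPONENTS if impact[component] == "H"]
```

For a site with high data volatility and high transaction volume swing, most_affected(["data volatility", "transaction volume swing"]) lists the Web presentation server, Web application server, existing business servers, and database server as the components to examine first.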
Table 2-7 Load impacts on components
Workload characteristics         Web        Web      Network  Security  Firewalls  Existing  Database
                                 present.   app.              servers              business  server
                                 server     server                                 servers
High % user-specific responses   Low        High     Low      Low       Low        High      High
High % cross-session information Med        High     Low      Low       Low        Low       Med
High volume of dynamic searches  High       High     Med      Low       Med        Med       High
High transaction complexity      Low        High     Low      Med       Low        High      High
High transaction volume swing    Med        High     Low      Low       Low        High      High
High data volatility             High       High     Low      Low       Low        Med       High
High # unique items              Low        Med      Low      Low       Low        High      High
High # page views                High       Low      High     Low       High       Low       Low
High % secure pages (privacy)    High       Low      Low      High      Low        Low       Low
High security                    High       High     Low      High      Low        High      Low
2.2.4 Selecting the scaling techniques to apply
Thorough information gathering is worthwhile because it enables the best
possible scaling decisions. Only when the information gathering is as complete
as it can be is it time to consider matching scaling techniques to components.
Manageability, security, and availability are critical factors in all design decisions.
Techniques that provide scalability but compromise any of these critical factors
cannot be used.
Here is a list of the eight scaling techniques:
 Using a faster machine
 Creating a cluster of machines
 Using appliance servers
 Segmenting the workload
